Go back to All Articles

Azure Event Hubs Silently Truncates JSON Payloads

Azure Event Hubs truncates JSON logs that exceed undocumented size limits, appending '...' and producing invalid JSON with no error or warning. Here is how we found it and fixed it.

AlphaSOC5 min read
Azure Event Hubs silently truncates JSON payloads

Azure Event Hubs truncates JSON payloads that exceed an undocumented size limit, appends ... to the cut-off line, produces syntactically invalid JSON, and does all of this without emitting an error, a warning, or any indication that data was lost. We discovered this while testing our log ingestion pipeline and fixed it with a single json.Valid() check. This post documents what we found, why it matters, and what the fix looks like.

How We Found It

During testing of the AlphaSOC Event Hub consumer against Azure Activity logs, our pipeline began producing consistent JSON parse errors. The payloads were failing validation downstream, and the failures were reproducible.

The error logs pointed to malformed JSON arriving from the Event Hub consumer. Pulling a sample payload showed the problem immediately: lines were being cut mid-structure, with ... appended at the truncation point.

{"time":"2026-01-14T09:23:11Z","resourceId":"/SUBSCRIPTIONS/...","operationName":"Microsoft.Compute/virtualMachines/write","properties":{"statusCode":"Created","serviceRequestId":"a3f1b2c4...

That is not a JSON object. No downstream parser will accept it. The original event is unrecoverable.

What Azure Is Doing

Azure Event Hubs documents a maximum message size of 1 MB. The truncation occurs well below this limit. An independent report on the Microsoft Q&A forum confirmed events of 45-80 KB being truncated to approximately 16 KB using the latest Azure SDK — the same threshold we observed. The problem is not isolated to Activity logs: Logic Apps diagnostic logs and other Azure services exhibit the same behavior. There is no documented threshold that explains it, no configuration option to reject instead of truncate, and no error code emitted when truncation occurs.

The choice to truncate rather than reject creates a worse outcome than a hard failure would. A rejected message is detectable and actionable. A truncated message with ... appended looks superficially like a complete payload until you attempt to parse it.

This issue was first reported in July 2020. A Microsoft moderator responded that same month confirming the engineering team planned to fix it "this semester." That was nearly six years ago. The thread has collected reports from 2020, 2022, 2024, and 2025 — including a comment from a member of the AlphaSOC engineering team after we encountered it ourselves. As of this writing, it remains unfixed.

The Impact on Security Pipelines

For general-purpose data pipelines, silent truncation is a data quality problem. For security pipelines, it is a visibility gap.

A truncated Azure Activity log entry means a security event that was generated, transmitted, and received by your pipeline was silently dropped before it reached your detection engine. No parse error surfaces in the SIEM. No alert fires. The event simply does not exist in your data.

The failure mode is particularly difficult to detect because small payloads pass through without truncation. A test environment with representative but compact log events will show no issues. The problem only manifests at realistic payload sizes, which means it can go undetected through development and into production.

The Fix

The solution is to validate JSON at the point of ingestion and discard any payload that fails. There is no recovery path for a truncated event: the original data is gone and the ... marker cannot be reversed. Attempting to parse or partially process a truncated payload will produce incorrect results.

Our Event Hub consumer in Go now runs json.Valid() against each event body before passing it to the write pipeline:

for _, event := range receivedData {
    // There's a known issue with Azure Event Hub where large messages
    // are truncated in a way that they are no longer valid JSON.
    if !json.Valid(event.Body) {
        w.InvalidCount++
        continue
    }

    _, err := w.gw.Write(event.Body)
    if err != nil {
        return fmt.Errorf("write data: %w", err)
    }

    w.ValidCount++
    w.EventsSize += int64(len(event.Body))
}

The stats struct was updated alongside the fix to track valid and invalid counts separately:

type stats struct {
    ValidCount   int64 `json:"validCount"`
    InvalidCount int64 `json:"invalidCount"`
    EventsSize   int64 `json:"eventsSize"`
}

The InvalidCount field is the key addition. Without it, truncated events were being silently skipped with no record of how many were lost. Now the pipeline surfaces the drop rate explicitly, which makes the Azure truncation behavior visible rather than hidden.

What This Means for Your Pipeline

If you are ingesting Azure Activity logs, Diagnostic logs, or any other Azure-sourced telemetry via Event Hubs, you may be silently losing events today. The problem is not specific to AlphaSOC's pipeline: any consumer that does not validate JSON before processing is exposed to this.

The checks to add are straightforward:

  • Validate JSON structure before parsing. In Go, json.Valid(). In Python, a try/except around json.loads(). In most languages, the equivalent is one or two lines.
  • Track invalid event counts separately from valid ones. If your metrics only count processed events, you have no visibility into what was dropped.
  • Test with realistic payload sizes. Small synthetic test events will not trigger the truncation threshold.

If your current pipeline does not have these checks and you are ingesting from Azure Event Hubs, the first thing worth doing is adding an invalid event counter and watching it for 24 hours. The number it reports is the size of your current visibility gap.

The underlying issue is Azure's. The fix is yours to deploy.

AlphaSOCAzureData IntegrityBlog Article