
Observability Data Explosion

T020 · Technology · MEDIUM (79% confidence) · February 2026 · 4 sources

What people believe

More observability data means better debugging and faster incident resolution.

What actually happens
  • Monthly observability cost: +500-1000%
  • Observability cost as % of infra spend: +200%
  • Dashboards created vs actively used: 80% waste
  • Mean time to resolve (MTTR): Minimal
4 sources · 3 falsifiability criteria
Context

Teams adopt observability platforms (Datadog, New Relic, Splunk) to understand their distributed systems. The pitch: instrument everything, collect all the data, and you'll be able to debug any issue. So they do. Every service emits metrics, traces, and logs. The data volume explodes. The observability bill becomes one of the largest line items in the infrastructure budget — sometimes exceeding the cost of the infrastructure being observed. And despite all this data, teams still can't find the needle in the haystack when production breaks.

Hypothesis

What people believe

More observability data means better debugging and faster incident resolution.

Actual Chain
  • Data volume and cost grow exponentially (observability costs: $10-50K/month for mid-size companies)
      • Every new service, endpoint, and metric adds to the bill
      • Log volume grows 30-50% annually without intervention
      • Vendor pricing is designed to scale with data volume, not with value
  • More data doesn't mean better insights (signal-to-noise ratio degrades with volume)
      • Engineers spend more time querying dashboards than fixing issues
      • With too many metrics, nobody knows which ones matter
      • Dashboard sprawl: 50-200 dashboards, most never viewed
  • Vendor lock-in through proprietary query languages and integrations (migration cost: months of engineering time)
      • Custom dashboards, alerts, and queries are tied to vendor-specific syntax
      • Switching vendors means rebuilding all observability from scratch
      • The vendor can raise prices knowing you can't easily leave
  • Cardinality explosions create surprise bills (a single high-cardinality metric can cost $10K+/month)
      • A developer adds a user_id tag to a metric and the bill doubles overnight
      • Cardinality limits force teams to drop useful dimensions
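To see why the 30-50% annual log growth cited above compounds into an exponential curve, here is a minimal arithmetic sketch. The function name `projected_volume`, the 1.0 starting volume, and the five-year horizon are illustrative assumptions, not figures from the sources.

```python
def projected_volume(initial: float, annual_growth: float, years: int) -> float:
    """Volume after compounding `annual_growth` (0.30 means 30%) for `years` years."""
    return initial * (1.0 + annual_growth) ** years


# Illustrative only: normalize today's volume to 1.0 and project five years out
# at the low and high ends of the 30-50% range.
for rate in (0.30, 0.50):
    multiple = projected_volume(1.0, rate, 5)
    print(f"{rate:.0%} annual growth -> {multiple:.2f}x after 5 years")
# 30% annual growth -> 3.71x after 5 years
# 50% annual growth -> 7.59x after 5 years
```

Under these assumptions, a bill that tracks data volume would grow roughly four- to eightfold in five years even if nobody adds a single new service.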
Impact
| Metric | Before | After | Delta |
|---|---|---|---|
| Monthly observability cost | $1-5K | $10-50K+ | +500-1000% |
| Observability cost as % of infra spend | 5-10% | 20-40% | +200% |
| Dashboards created vs actively used | Most used | 80% never viewed | 80% waste |
| Mean time to resolve (MTTR) | Expected to decrease | Flat or marginal improvement | Minimal |
Navigation

Don't If

  • Your observability bill exceeds 20% of your infrastructure cost
  • Your team has more dashboards than they can review in a week

If You Must

  1. Define SLOs first, then instrument only what's needed to measure them
  2. Implement sampling for high-volume traces and logs; you don't need 100% of everything
  3. Set cardinality budgets and enforce them in CI
  4. Audit dashboards and alerts quarterly, and delete what nobody uses
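One way the cardinality budget from the list above might be checked in CI is sketched below. It assumes each metric declares its tags along with an estimated number of distinct values per tag; the budget value, the metric names, and the function names are all hypothetical, and the worst-case series count is taken as the product of tag cardinalities.

```python
from math import prod

# Hypothetical budget: maximum distinct time series allowed per metric.
SERIES_BUDGET = 10_000


def estimated_series(tag_cardinalities: dict[str, int]) -> int:
    """Worst-case series count: product of the distinct values of each tag."""
    return prod(tag_cardinalities.values())


def over_budget(
    metrics: dict[str, dict[str, int]], budget: int = SERIES_BUDGET
) -> dict[str, int]:
    """Return the metrics whose worst-case series count exceeds the budget."""
    violations = {}
    for name, tags in metrics.items():
        count = estimated_series(tags)
        if count > budget:
            violations[name] = count
    return violations


# Hypothetical declarations; the user_id tag is what blows the budget.
declared = {
    "http_requests_total": {"service": 20, "endpoint": 50, "status": 5},
    "checkout_latency": {"service": 20, "user_id": 100_000},
}

for name, count in over_budget(declared).items():
    print(f"{name}: {count} series exceeds budget of {SERIES_BUDGET}")
# A CI wrapper would exit non-zero here whenever any violation is reported.
```

The product is an upper bound, since not every tag combination occurs in practice, so a real check would probably want a per-metric override for reviewed exceptions.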

Alternatives

  • OpenTelemetry + open-source backends: Vendor-neutral instrumentation with Grafana/Prometheus/Jaeger; control your data and costs
  • SLO-driven observability: Instrument for SLO measurement, not for 'collect everything'; focused and cost-effective
  • Adaptive sampling: Sample more during incidents, less during normal operation; right data at the right time
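The adaptive sampling idea above can be sketched in a few lines. This is a simplified illustration, not any vendor's or OpenTelemetry's sampler: the rates, the error-rate threshold used as an incident signal, and the function names are all assumptions.

```python
import random

BASELINE_RATE = 0.01          # hypothetical: keep 1% of traces in normal operation
INCIDENT_RATE = 1.0           # hypothetical: keep every trace during an incident
ERROR_RATE_THRESHOLD = 0.05   # hypothetical signal that an incident is under way


def sample_rate(error_rate: float) -> float:
    """Pick the sampling rate from the currently observed error rate."""
    if error_rate >= ERROR_RATE_THRESHOLD:
        return INCIDENT_RATE
    return BASELINE_RATE


def should_keep(error_rate: float, rng: random.Random) -> bool:
    """Decide whether to keep one trace; rng.random() is in [0.0, 1.0)."""
    return rng.random() < sample_rate(error_rate)


# Compare how many of 10,000 traces are kept in each mode.
rng = random.Random(0)
normal = sum(should_keep(0.001, rng) for _ in range(10_000))
incident = sum(should_keep(0.20, rng) for _ in range(10_000))
print(f"kept in normal operation: {normal}, kept during incident: {incident}")
```

A production sampler would likely ramp the rate gradually and decide per trace ID so that all spans of a trace are kept or dropped together; this sketch only shows the switch between the two regimes.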
Falsifiability

This analysis is wrong if:

  • Collecting more observability data consistently reduces MTTR proportionally
  • Observability costs grow slower than infrastructure costs as systems scale
  • Teams with more dashboards and metrics resolve incidents faster than teams with fewer
Sources
  1. Datadog S-1 Filing
     Datadog's revenue growth demonstrates how observability costs scale with customer infrastructure

  2. Chronosphere: Observability Cost Report
     Analysis showing observability costs growing 30-50% annually, often faster than infrastructure costs

  3. Honeycomb: Observability Engineering
     Framework for effective observability that avoids the 'collect everything' trap

  4. CNCF: OpenTelemetry Project
     Vendor-neutral observability standard that reduces lock-in and enables cost control
