The Renewal Email That Started This Post
A friend running platform engineering at a Series C startup forwarded me his Datadog renewal quote last month. The number had a comma in a place he wasn’t expecting. His infra hadn’t grown — they’d added a few services, turned on APM for two more apps, and let some custom metrics sneak in through a contractor’s PR. The renewal came back roughly 2.4x the previous year.
That conversation isn’t unusual in 2026. It’s the conversation. Average annual Datadog spend is sitting north of $30K per customer, enterprise contracts routinely cross seven figures, and the lock-in pain is real because most teams instrumented in vendor-specific SDKs years ago. Three things changed the picture this year: OpenTelemetry hit production maturity across all three big vendors, Grafana Cloud kept eating the open-source-friendly segment, and New Relic’s user-based pricing finally has enough mileage on it to evaluate honestly.
So here’s the comparison I wish I’d had when we were re-evaluating last quarter. Not vendor-marketing fluff, not a “they all have their merits” cop-out. Real numbers, real trade-offs, and where each one will burn you.
What’s Actually Different in 2026
Before pricing, the lay of the land. Three forces are reshaping observability buying decisions:
OpenTelemetry is no longer aspirational. OTel’s metrics, logs, and traces are GA across every major vendor. You can instrument your code once with vendor-neutral SDKs and switch backends without rewriting application code. That doesn’t make migration trivial — dashboards, alerts, and runbooks are still vendor-specific — but it removes the deepest layer of lock-in.
Datadog’s renewal cycle is biting. A lot of teams signed 3-year deals during 2022-2023, when budgets and growth forecasts were richer. Those contracts are coming up for renewal now, and the price-up at renewal is hitting at the same time finance is asking sharper questions about cloud spend.
AI-workload observability is a legit category now. GPU utilization, token cost tracking, prompt/response inspection, vector DB health — every vendor has a story here, and the stories are surprisingly different.
Now, the platforms.
Datadog: The Best UX, the Worst Bill
Datadog is the BMW of observability. The product is excellent. Dashboards are fast, the query language is sane, APM has more depth than anything else on the market, the AI features (Bits AI, Watchdog) actually find things, and integrations exist for every piece of obscure infrastructure you have. If you walk in cold and just want monitoring to work, this is the thing that will work fastest.
How the bill is built
The pricing model is also where most teams get cooked. Datadog charges along multiple axes simultaneously, and they all add up:
- Per host for infrastructure monitoring (around $15-23/host/month for Pro/Enterprise tiers)
- Per GB ingested for logs, plus a separate charge per million log events indexed and retained
- Per custom metric above a small included allowance — this is where bills explode
- Per indexed span for APM traces, with sampling controls that aren’t always intuitive
- Per user for some products like CI Visibility and DBM
- Per session for RUM, per check for synthetics, per device for Network Device Monitoring
A Datadog bill is not a number. It’s a budget across 8-12 line items, each scaling on different metrics. The “just turn it on for that service” reflex is what kills you. Custom metrics in particular have a rough edge: high-cardinality tags (user IDs, request IDs, anything user-generated) can multiply a single metric into millions of unique time series, and you pay per series.
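To make that multiplication concrete, here is a back-of-the-envelope sketch. The tag names and cardinalities are invented for illustration; the only real idea is that the worst-case series count for one metric is the product of the distinct values each tag can take.

```python
from math import prod


def worst_case_series(tag_cardinalities: dict[str, int]) -> int:
    """Upper bound on unique time series for a single metric:
    the product of the number of distinct values per tag."""
    return prod(tag_cardinalities.values())


# Hypothetical tag sets for one request-latency metric.
bounded = {"service": 10, "endpoint": 20, "status_code": 5}
with_user_id = {**bounded, "user_id": 50_000}

print(worst_case_series(bounded))       # 1000
print(worst_case_series(with_user_id))  # 50000000
```

Real counts usually land below this bound because not every tag combination occurs, but the shape holds: one unbounded tag turns a thousand series into tens of millions.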
When the price is defensible
What you actually get for the money is the most polished operational experience in the category. Watchdog detects anomalies you didn’t write rules for. Service maps build themselves. The single-pane-of-glass story isn’t just marketing; it’s mostly real, and those operational savings are the only reason the price tag is defensible at scale. If you’re a 10-person startup, Datadog’s price tag will look insane. If you’re a 200-person engineering org with a dedicated SRE function, the math starts to tilt in Datadog’s favor because the alternative is more humans.
Grafana Cloud: Open Source With a Bill
Grafana Cloud is the LGTM stack — Loki for logs, Grafana for dashboards, Tempo for traces, Mimir for metrics — sold as a managed SaaS. It’s the closest thing to a vendor-neutral default in 2026. The data model is Prometheus and OpenTelemetry-native, the visualization layer is the de facto industry standard, and the pricing is comparatively simple.
Tiers and what they cost
Pricing as of April 2026 (verify the current rates on grafana.com — these change):
- Free tier: 10K series for metrics, 50GB logs, 50GB traces, 3 users, 14-day retention. Generous enough to actually run small production workloads.
- Pro: $8/active user/month plus usage-based ingest (per-series for metrics, per-GB for logs and traces). Per-series pricing for Mimir is roughly $8 per 1K series per month at low volumes, scaling down with commitments.
- Advanced/Enterprise: Custom pricing, includes things like enterprise plugins, Grafana Cloud k6 load testing, advanced RBAC.
Portability is the real product
The Grafana pitch is portability. Your dashboards are Grafana JSON — you can host them yourself if you ever need to. Your data is Prometheus and OTel — you can scrape and forward elsewhere. Your queries are PromQL and LogQL, which are also what you’d use against self-hosted Prometheus and Loki. The escape hatch is real, and it changes negotiation dynamics with the vendor.
The catch is what I’ll call the high-cardinality trap. Mimir scales beautifully on time series count, but per-series pricing means a runaway label dimension hits the bill the same way it would in Datadog. The pricing is more transparent because there are fewer line items, but a metric with user_id as a label will still bankrupt you. Loki’s log-volume pricing rewards low-cardinality logs and punishes you for emitting structured JSON with too many unique fields.
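Here is a rough sketch of what that trap costs, using the approximate low-volume rate quoted above ($8 per 1K series per month). The series counts are hypothetical, and the function ignores free-tier allowances and commitment discounts, so treat the output as an order of magnitude and not a quote.

```python
RATE_PER_1K_SERIES = 8.00  # USD/month, the rough low-volume figure from this post


def monthly_metrics_cost(active_series: int) -> float:
    """Estimated monthly metrics cost at a flat per-1K-series rate.
    Assumption: no included allowance and no volume discount."""
    return active_series / 1000 * RATE_PER_1K_SERIES


# Hypothetical: a metric with 1,000 series, then the same metric
# after someone adds a user_id label with 2,000 distinct values.
before = monthly_metrics_cost(1_000)
after = monthly_metrics_cost(1_000 * 2_000)

print(f"${before:,.2f}")  # $8.00
print(f"${after:,.2f}")   # $16,000.00
```

One label on one metric is the difference between a rounding error and a line item finance will ask about.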
Grafana Cloud is also less polished than Datadog on the day-to-day. The auto-instrumentation story isn’t as rich. Anomaly detection exists but it’s not Watchdog. AI assistance (“Grafana Assistant”) shipped in 2025 but hasn’t caught up to Bits AI yet. You’re trading polish for portability and a substantially smaller bill — which is the right trade for most teams under 200 engineers, in my opinion.
New Relic: Per-User Pricing, Per-GB Pain
New Relic re-architected their pricing in 2020 around a per-user model, and 2026 is the year that’s finally settled into something predictable. There are now four user types — Basic (free), Core, Full Platform, and Full Platform Compute — and you pay per Full Platform user (the engineers actively using APM, distributed tracing, and the deep stuff) plus per GB of data ingested.
Headline numbers
The headline numbers as of April 2026:
- Original Data ingest: $0.40/GB
- Data Plus ingest: $0.60/GB (longer retention, FedRAMP Moderate, HIPAA, more entitlements)
- Full Platform user: roughly $549/user/month at list, with significant discounts at commit
- Core user: roughly $49/user/month
- Free tier: 100GB ingest/month, one Full Platform user, unlimited Basic users
For small teams the free tier is unusually useful — you can actually run real production observability on it for a while. For mid-size teams the per-user math is where it gets interesting. Five Full Platform users at list is $2,745/month before any data costs. That’s wildly cheaper than equivalent Datadog at low data volumes and roughly comparable at moderate ones.
Where it stings
Where New Relic falls down is the per-GB ingest math at high log volume. If you’re shipping 5TB of logs a month, that’s $2,000-3,000/month in ingest alone (5,000GB at $0.40-0.60/GB) before user fees. Datadog’s log pricing is no cheaper at that volume; the two are similar. Grafana Cloud’s Loki, with its index-light architecture, can be materially cheaper for high-volume low-cardinality logs.
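If you want to rerun this arithmetic with your own headcount and volume, a minimal calculator looks something like the sketch below. The rates are the list figures from this post. Treating 1TB as 1,000GB and defaulting the free allowance to zero (to match the round numbers above) are my simplifications; pass free_gb=100 if the allowance applies to your account.

```python
FULL_PLATFORM_USER = 549.00  # USD/user/month at list (figure from this post)
CORE_USER = 49.00            # USD/user/month (figure from this post)
INGEST_RATE = {"original": 0.40, "data_plus": 0.60}  # USD per GB


def new_relic_monthly(full_users: int, core_users: int, ingest_gb: float,
                      plan: str = "original", free_gb: float = 0) -> float:
    """List-price estimate: user fees plus billable ingest.
    Assumption: no commit discounts are applied."""
    user_fees = full_users * FULL_PLATFORM_USER + core_users * CORE_USER
    billable_gb = max(ingest_gb - free_gb, 0)
    return user_fees + billable_gb * INGEST_RATE[plan]


print(f"{new_relic_monthly(5, 0, 0):,.2f}")                  # 2,745.00
print(f"{new_relic_monthly(0, 0, 5000):,.2f}")               # 2,000.00
print(f"{new_relic_monthly(0, 0, 5000, 'data_plus'):,.2f}")  # 3,000.00
```

The useful part is seeing which term dominates: at five users and light data the user fees are the bill, and at multi-terabyte log volume the ingest term takes over.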
The other thing to know about New Relic is the product breadth is genuinely competitive — APM, infra, logs, traces, RUM, synthetics, browser, mobile, AI monitoring, vulnerability management, all in one bill. The UI is a half-step behind Datadog and a half-step ahead of Grafana for operations work. The AI features (New Relic AI, error inbox auto-grouping) are good without being obviously better than the alternatives.
Real-World Cost Scenarios
Forget marketing tables. Here’s what bills actually look like at three sizes, based on real conversations with teams I trust. Numbers are list price approximations as of early 2026 — the point is the relative shape, not the exact line items.
Startup: 10 services, 5 engineers, ~200GB logs/month
- Datadog: ~$800-1,400/month if you keep the feature set tight. Easily double that if APM and custom metrics get out of hand.
- Grafana Cloud Pro: $40/month in user fees plus maybe $200-400 in usage. Free tier covers a lot of this if you’re disciplined.
- New Relic: Free tier likely covers most of it. Paid: 1-2 Full Platform users plus minimal data overage = ~$600-1,200/month.
Winner at this size: Grafana Cloud or New Relic free tier. Datadog is a luxury you’ll regret.
Mid-Size: 50 services, 40 engineers, ~2TB logs/month, ~3M custom metrics
- Datadog: $8,000-15,000/month, with a wide variance depending on retention, custom metric discipline, and which features are toggled.
- Grafana Cloud Pro: $1,500-4,500/month. The user count starts to matter and log volume drives most of the variance.
- New Relic: $5,000-10,000/month. Per-user pricing is a tax at this size, but log ingest is similar to Datadog.
Winner at this size: Grafana Cloud, usually. New Relic if you really want the polished APM and don’t have the in-house Grafana skill.
Enterprise: 500 services, 300 engineers, 20TB logs/month
At this size, list prices stop being meaningful. Everyone is in custom-deal territory. Real renewal numbers I’ve heard about:
- Datadog: $80K-300K/month range. Big variance based on what’s enabled.
- Grafana Cloud Advanced: $30K-150K/month, often paired with self-hosted Prometheus for the highest-volume pieces.
- New Relic: $50K-200K/month, with the per-user fee actually working in your favor if you have a lot of engineers who only need Core access.
Winner at this size: It depends entirely on what you’ve already invested in. Migration cost is the dominant variable, not list price.
Where The Money Actually Goes
When teams audit their observability bills, the same line items keep showing up at the top. Custom metrics with high-cardinality tags are the single biggest source of bill surprise — a label with a request ID or a UUID generates one time series per unique value, and those add up fast. Logs with debug statements left in production are second. APM trace retention defaults are third — most platforms default to 15 days, and almost nobody actually queries traces older than a week.
The fix for all three is unglamorous: enforce cardinality limits in your instrumentation, treat log-level changes as production deploys, and tune your trace sampling rates. This work typically trims 20-40% off an observability bill without changing vendors. I’d do this before any migration, because if you don’t, the new vendor’s bill will have the same shape.
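As an illustration of the first fix, here is a minimal sketch of a cardinality guard that sits in front of whatever metrics client you use. LabelGuard is a hypothetical helper written for this post, not part of any vendor SDK: it lets the first N distinct values per label through and collapses everything after that into a single overflow bucket.

```python
class LabelGuard:
    """Caps distinct values per label; overflow collapses into one bucket.
    Hypothetical helper for illustration; not thread-safe."""

    def __init__(self, max_values_per_label: int = 100, overflow: str = "other"):
        self.max_values = max_values_per_label
        self.overflow = overflow
        self._seen: dict[str, set[str]] = {}

    def sanitize(self, labels: dict[str, str]) -> dict[str, str]:
        clean = {}
        for key, value in labels.items():
            seen = self._seen.setdefault(key, set())
            if value in seen or len(seen) < self.max_values:
                seen.add(value)           # known value, or still under the cap
                clean[key] = value
            else:
                clean[key] = self.overflow  # cap reached: collapse new values
        return clean


guard = LabelGuard(max_values_per_label=2)
print(guard.sanitize({"endpoint": "/a"}))  # {'endpoint': '/a'}
print(guard.sanitize({"endpoint": "/b"}))  # {'endpoint': '/b'}
print(guard.sanitize({"endpoint": "/c"}))  # {'endpoint': 'other'}
print(guard.sanitize({"endpoint": "/a"}))  # {'endpoint': '/a'}
```

A cap like this bounds the series count per metric no matter what a contractor’s PR adds. The trade-off is that values arriving after the cap lose their identity, so set the limit per label with some thought.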
OpenTelemetry as the Escape Hatch
The strategic move in 2026 is to instrument with OpenTelemetry SDKs and treat the backend as swappable. All three platforms ingest OTLP natively. That means your application code looks the same whether you’re shipping to Datadog, Grafana, or New Relic, and switching becomes a config change in your collector — not an SDK rewrite.
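To show how small that config change can be, the sketch below builds an OpenTelemetry Collector configuration as a Python dict that mirrors the collector’s YAML layout (an otlp receiver, an otlphttp exporter, three pipelines). The backend names, endpoints, and header names are placeholders I made up; the real values come from each vendor’s OTLP documentation.

```python
# Placeholder endpoints and header names; real ones come from vendor docs.
BACKENDS = {
    "backend_a": {
        "endpoint": "https://otlp.backend-a.example",
        "headers": {"x-api-key": "${env:BACKEND_A_KEY}"},
    },
    "backend_b": {
        "endpoint": "https://otlp.backend-b.example",
        "headers": {"authorization": "${env:BACKEND_B_TOKEN}"},
    },
}


def collector_config(backend: str) -> dict:
    """Collector config as a dict; serialize to YAML to use it for real."""
    def pipeline() -> dict:
        return {"receivers": ["otlp"], "exporters": ["otlphttp"]}

    return {
        "receivers": {"otlp": {"protocols": {"grpc": {}, "http": {}}}},
        "exporters": {"otlphttp": dict(BACKENDS[backend])},
        "service": {
            "pipelines": {
                "traces": pipeline(),
                "metrics": pipeline(),
                "logs": pipeline(),
            }
        },
    }


a = collector_config("backend_a")
b = collector_config("backend_b")
changed = [section for section in a if a[section] != b[section]]
print(changed)  # ['exporters']
```

Receivers and pipelines are identical across backends; only the exporter block moves. That is the whole point of putting a collector between your services and the vendor.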
What doesn’t transfer automatically: dashboards, saved queries, alerting rules, runbooks that reference specific UI flows, any custom processors built on a vendor’s SDK extensions. These are real costs in a migration but they’re capped — typically 1-2 SRE-quarters of work for a 50-service org, more for enterprise. The OTel instrumentation work is the much bigger lift, and once it’s done it doesn’t have to be re-done.
If you’re starting fresh in 2026, instrument with OTel from day one. If you have existing Datadog or New Relic SDK instrumentation, the migration to OTel is its own project worth doing on its own merits: vendor lock-in deepens every quarter you delay, and the ecosystem around OTel-native tooling (Tempo, Jaeger, Honeycomb, SigNoz) keeps getting stronger.
My Actual Recommendation
For most teams I talk to: instrument with OpenTelemetry, run Grafana Cloud as the primary backend, and put New Relic or Datadog on the shortlist only if you have a specific feature need (RUM at scale, deep AWS-specific integrations, SAP/database-deep APM) that those platforms genuinely do better.
For startups under 30 engineers: New Relic’s free tier is the most generous starting point in the category. Use it until it doesn’t fit, then evaluate.
For platform teams at large orgs already on Datadog: don’t migrate just to save money on paper. Audit your custom metrics and log retention first — the savings from cleanup typically exceed the savings from migration, and the migration risk is real. If you’ve cleaned up and the bill is still painful, then negotiate hard at renewal with a credible OTel-based exit plan in your back pocket.
Pick one thing to do this week: pull last month’s invoice, sort line items by cost, and find the one that’s growing fastest. That’s the conversation to have with your team — not which vendor logo you want on the dashboard.
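If your invoices export to a spreadsheet, that exercise is a few lines of code. The sketch below assumes you have two months of line items as name-to-cost mappings; the line-item names and dollar amounts are invented for illustration.

```python
def invoice_review(previous: dict[str, float], current: dict[str, float]):
    """Return line items sorted by current cost, plus the fastest grower.
    Items absent last month are treated as infinite growth."""
    rows = []
    for item, cost in current.items():
        before = previous.get(item, 0.0)
        growth = (cost - before) / before if before else float("inf")
        rows.append((item, cost, growth))
    by_cost = sorted(rows, key=lambda row: row[1], reverse=True)
    fastest = max(rows, key=lambda row: row[2])
    return by_cost, fastest


# Made-up invoice figures (USD/month).
last_month = {"infra_hosts": 1800, "log_ingest": 1200,
              "custom_metrics": 600, "apm_spans": 900}
this_month = {"infra_hosts": 1900, "log_ingest": 1500,
              "custom_metrics": 1450, "apm_spans": 950}

by_cost, fastest = invoice_review(last_month, this_month)
print(by_cost[0][0])                    # infra_hosts
print(fastest[0], f"{fastest[2]:.0%}")  # custom_metrics 142%
```

Notice that the biggest line item and the fastest-growing one are different. The second one is usually where next year’s renewal surprise is hiding.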