CubeAPM

DynamoDB Monitoring: On-Demand vs Provisioned Capacity Cost Optimization

DynamoDB offers two capacity modes, On-Demand and Provisioned, and the pricing gap between them can exceed 6x for workloads with steady traffic. At the rates used in this guide, a table processing 10 million writes per month costs about $1.90 on Provisioned mode vs $12.50 on On-Demand mode at constant utilization. That gap narrows or reverses for spiky workloads, but most teams discover the real cost structure only after their first bill arrives.

The choice between On-Demand and Provisioned is not a one-time decision — workload patterns change, auto-scaling breaks, and sudden traffic spikes expose hidden cost drivers most teams never planned for.

This guide compares both capacity modes on pricing structure, real cost scenarios for small and mid-market teams, monitoring requirements, and when to switch between them. It also covers how infrastructure monitoring platforms track DynamoDB utilization to prevent cost overruns before they hit your bill.

Quick Comparison: On-Demand vs Provisioned Capacity

Feature | On-Demand Mode | Provisioned Mode
Pricing model | Per-request ($1.25/million WRUs, $0.25/million RRUs) | Per-hour capacity ($0.00065/WCU-hour, $0.00013/RCU-hour)
Best for | Unpredictable traffic, new apps, spiky workloads | Steady predictable workloads, cost-sensitive at scale
Capacity planning | None; auto-scales instantly | Required; you define WCUs and RCUs upfront
Idle cost | Zero; pay only for requests made | Full hourly cost even if unused
Auto-scaling | Built-in, instant | Optional via Application Auto Scaling
Reserved capacity discounts | Not available | Up to 75% savings with 1-3 year commitments
Cost at low utilization | Lower; no idle charges | Higher; you pay for provisioned capacity
Cost at high constant utilization | Higher; per-request fees add up | Lower; fixed hourly rate amortizes
Mode switching | Once per 24 hours per table | Once per 24 hours per table
Monitoring needs | Track request volume spikes | Track consumed vs provisioned capacity, throttling

Pricing based on US East (N. Virginia) as of early 2026. Other regions may vary slightly.

What Is DynamoDB Capacity Mode?

DynamoDB capacity mode defines how AWS bills you for read and write throughput on your tables. You choose between two models when creating a table: On-Demand and Provisioned. Both modes charge separately for storage ($0.25/GB-month), backups, and data transfer, but the core difference is how they handle throughput billing.

On-Demand mode charges per request. You pay for every read and write your application makes, measured in Request Units. One Write Request Unit (WRU) covers a write up to 1 KB. One Read Request Unit (RRU) covers a strongly consistent read up to 4 KB or two eventually consistent reads up to 4 KB each. DynamoDB scales instantly to accommodate any request volume with no capacity planning required.

Provisioned mode charges per hour for the read and write capacity you reserve. You specify how many Read Capacity Units (RCUs) and Write Capacity Units (WCUs) your table needs. One RCU provides one strongly consistent read per second for items up to 4 KB. One WCU provides one write per second for items up to 1 KB. You pay for the reserved capacity whether you use it or not.

The AWS DynamoDB documentation states that On-Demand mode is the default and recommended option for most workloads, but the actual cost-optimal choice depends entirely on your traffic pattern.

DynamoDB On-Demand Mode: How It Works and When to Use It

On-Demand mode removes capacity planning entirely. You create a table, start reading and writing, and AWS bills you at the end of the month based on request volume. There is no provisioning, no manual scaling, and no throttling unless you exceed service quotas or traffic jumps to more than double the table's previous peak within 30 minutes.

Pricing structure

On-Demand charges per million requests:

  • Write Request Units: $1.25 per million WRUs
  • Read Request Units: $0.25 per million RRUs

Each WRU covers a write up to 1 KB. Writes larger than 1 KB consume multiple WRUs. For example, a 3.5 KB write consumes 4 WRUs. Each RRU covers a strongly consistent read up to 4 KB. Eventually consistent reads consume half the RRUs of strongly consistent reads for the same item size.
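The rounding rules above can be sketched in a few lines of Python. This is a minimal illustration for standard (non-transactional) requests; the function names are hypothetical and not part of any AWS SDK.

```python
import math

WRITE_UNIT_BYTES = 1024       # one WRU covers a write of up to 1 KB
READ_UNIT_BYTES = 4 * 1024    # one RRU covers a strongly consistent read of up to 4 KB


def write_request_units(item_bytes: int) -> int:
    """WRUs consumed by one standard write; size is rounded up to the next 1 KB."""
    return max(1, math.ceil(item_bytes / WRITE_UNIT_BYTES))


def read_request_units(item_bytes: int, strongly_consistent: bool = True) -> float:
    """RRUs consumed by one standard read; size is rounded up to the next 4 KB.

    Eventually consistent reads consume half the units of strongly consistent ones.
    """
    units = max(1, math.ceil(item_bytes / READ_UNIT_BYTES))
    return float(units) if strongly_consistent else units / 2


if __name__ == "__main__":
    print(write_request_units(3584))        # a 3.5 KB write
    print(read_request_units(4096, False))  # a 4 KB eventually consistent read
```

Running the same helpers over a sample of real item sizes is a quick way to see how many writes cross a 1 KB boundary.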

Storage costs $0.25/GB-month regardless of capacity mode. Data transfer, backups, and DynamoDB Streams add to the bill but are not part of the core capacity pricing.

When On-Demand makes sense

On-Demand fits workloads where request volume is unpredictable, spiky, or too low to justify provisioning capacity. Examples include:

  • New applications where usage patterns are unknown
  • Dev and test environments that see irregular traffic
  • Seasonal workloads with rare but intense traffic bursts
  • Event-driven systems where requests cluster around specific triggers
  • Small apps processing fewer than 1 million requests per month

A startup launching a new SaaS product with 10 users in week one and 1,000 users in week four cannot forecast capacity accurately. On-Demand charges only for actual usage, avoiding wasted spend on over-provisioned capacity during the ramp-up phase.

The hidden cost of constant high utilization

On-Demand becomes expensive when request volume is high and constant. A table processing 10 million writes per month (333,333 writes per day) on On-Demand costs:

10,000,000 writes / 1,000,000 × $1.25 = $12.50/month

The same workload on Provisioned mode with 4 WCUs (enough for ~10.4 million writes per month at constant utilization) costs:

4 WCUs × $0.00065/WCU-hour × 730 hours = $1.90/month

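That is roughly a 6.6x cost difference. The gap narrows as utilization drops or spikes become more frequent. At these rates, one fully used WCU-hour ($0.00065) serves 3,600 writes that would cost $0.0045 on On-Demand, so the break-even point for constant workloads sits around 14% average utilization of what you would provision.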
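The comparison above can be reproduced with a short calculation. The sketch below assumes the US East rates quoted in this guide and a 730-hour month; the function names are illustrative only. It also derives the utilization at which the two modes cost the same, by comparing the hourly price of one capacity unit with the On-Demand price of the 3,600 requests that unit could serve in an hour.

```python
ON_DEMAND_PER_WRITE = 1.25 / 1_000_000   # USD per WRU (rate quoted in this guide)
WCU_HOUR_PRICE = 0.00065                 # USD per WCU-hour (rate quoted in this guide)
SECONDS_PER_HOUR = 3600
HOURS_PER_MONTH = 730


def monthly_write_costs(writes_per_month: int, provisioned_wcus: int):
    """Return (on_demand_usd, provisioned_usd) for one month of writes."""
    on_demand = writes_per_month * ON_DEMAND_PER_WRITE
    provisioned = provisioned_wcus * WCU_HOUR_PRICE * HOURS_PER_MONTH
    return on_demand, provisioned


def break_even_utilization(per_request_price: float, unit_hour_price: float) -> float:
    """Average utilization at which provisioned capacity costs the same as On-Demand."""
    # A fully used capacity unit serves one request per second, 3,600 per hour.
    on_demand_price_of_full_hour = per_request_price * SECONDS_PER_HOUR
    return unit_hour_price / on_demand_price_of_full_hour


if __name__ == "__main__":
    od, prov = monthly_write_costs(10_000_000, provisioned_wcus=4)
    print(f"On-Demand ${od:.2f} vs Provisioned ${prov:.2f}")
    util = break_even_utilization(ON_DEMAND_PER_WRITE, WCU_HOUR_PRICE)
    print(f"Break-even utilization: {util:.1%}")
```

Because the read rates in this guide have the same ratio as the write rates, the same break-even applies to RCUs.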

On-Demand monitoring requirements

Without provisioned limits, On-Demand tables do not throttle under normal conditions. Monitoring focuses on cost control rather than capacity exhaustion:

  • Request volume trends to detect unexpected traffic spikes
  • Cost per request to validate application efficiency
  • Request size distribution to catch oversized writes consuming multiple WRUs
  • Eventually consistent vs strongly consistent read ratio since the former costs half as much

AWS CloudWatch provides basic request count metrics at no extra charge, but detailed analysis requires either custom CloudWatch dashboards or third-party observability platforms.

DynamoDB Provisioned Mode: How It Works and When to Use It

Provisioned mode requires you to define upfront how much read and write capacity your table needs. AWS reserves that capacity and charges you hourly whether you use it or not. If your application exceeds the reserved capacity, DynamoDB throttles requests and returns ProvisionedThroughputExceededException errors.

Pricing structure

Provisioned mode charges per capacity unit per hour:

  • Write Capacity Units (WCUs): $0.00065 per WCU-hour
  • Read Capacity Units (RCUs): $0.00013 per RCU-hour

One WCU supports one write per second for items up to 1 KB. One RCU supports one strongly consistent read per second for items up to 4 KB. Eventually consistent reads consume half an RCU for the same item size.

A table with 10 WCUs and 20 RCUs costs:

  • Writes: 10 × $0.00065 × 730 hours = $4.75/month
  • Reads: 20 × $0.00013 × 730 hours = $1.90/month
  • Total: $6.65/month (plus storage, backups, data transfer)

Reserved capacity discounts

DynamoDB reserved capacity offers discounts up to 75% for teams that commit to a one-year or three-year reservation. Reserved capacity applies only to Provisioned mode and requires estimating future capacity needs accurately. Reserved capacity locks you into paying for the reserved amount even if your workload shrinks.

When Provisioned mode makes sense

Provisioned mode fits workloads with predictable, steady traffic where you can estimate capacity needs reliably. Examples include:

  • Production apps with stable traffic where request volume varies less than 2x daily
  • Cost-sensitive workloads where minimizing per-request cost matters more than simplicity
  • High-throughput systems processing millions of requests daily where On-Demand costs would spiral
  • Long-running apps where reserved capacity discounts justify the commitment

A fintech platform processing 50 million reads and 10 million writes per month with minimal variance can provision capacity precisely and pay a fraction of On-Demand costs.

Auto-scaling in Provisioned mode

Provisioned mode supports auto-scaling via AWS Application Auto Scaling. You set target utilization (typically 70%), minimum capacity, and maximum capacity. DynamoDB adjusts provisioned capacity automatically as traffic changes.

Auto-scaling adds complexity. It reacts to utilization after consumption has already increased, not before. A sudden traffic spike can still cause throttling during the scale-up delay. Auto-scaling also increases cost unpredictably if not bounded properly — a misconfigured max capacity setting can allow DynamoDB to provision far more capacity than your budget allows.
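A simplified model makes the bounds and the lag easier to reason about. The sketch below is not the Application Auto Scaling algorithm; it only assumes that a target-tracking policy aims for the capacity that would put current consumption at the target utilization, clamped to the configured minimum and maximum. It also ignores burst capacity, and all names are hypothetical.

```python
import math


def desired_capacity(consumed_per_sec: float, target_pct: int,
                     min_capacity: int, max_capacity: int) -> int:
    """Capacity that places consumption at the target utilization, within bounds."""
    raw = math.ceil(consumed_per_sec * 100 / target_pct)
    return max(min_capacity, min(max_capacity, raw))


def unserved_per_sec(consumed_per_sec: float, provisioned: int) -> float:
    """Requests per second that exceed provisioned capacity before scale-up lands."""
    return max(0.0, consumed_per_sec - provisioned)


if __name__ == "__main__":
    # Steady state: 84 writes/s at a 70% target asks for 120 WCUs.
    print(desired_capacity(84, 70, min_capacity=5, max_capacity=500))
    # A spike to 200 writes/s against 120 provisioned WCUs leaves 80 writes/s
    # at risk of throttling until the scale-up takes effect.
    print(unserved_per_sec(200, 120))
    # The max bound caps spend: a spike to 1,000 writes/s still stops at 500 WCUs.
    print(desired_capacity(1000, 70, min_capacity=5, max_capacity=500))
```

The maximum bound is the budget control: whatever the spike, the hourly bill cannot exceed the price of `max_capacity` units.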

Provisioned mode monitoring requirements

Provisioned mode requires active monitoring to avoid throttling and cost waste:

  • Consumed vs provisioned capacity to detect over-provisioning or under-provisioning
  • Throttled request rate to catch capacity limits before user impact
  • Auto-scaling events to understand when and why scaling occurs
  • Utilization percentage to validate whether reserved capacity matches workload

Synthetic monitoring platforms can simulate DynamoDB requests from multiple regions to validate capacity sufficiency before production traffic hits the table.

Cost Comparison: Real Scenarios for Small and Mid-Market Teams

Abstract pricing tables hide the real question: what will this actually cost me? Below are two scenarios modeled for teams at different scales.

Scenario 1: Small team — 3 million writes, 6 million reads per month

Assumptions:

  • 3 million writes per month (100,000 writes/day, ~1.16 writes/second average)
  • 6 million reads per month (200,000 reads/day, ~2.31 reads/second average)
  • All items under 1 KB
  • 50% eventually consistent reads
  • 10 GB storage
  • No reserved capacity

On-Demand mode:

  • Writes: 3,000,000 / 1,000,000 × $1.25 = $3.75
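  • Reads: 6,000,000 / 1,000,000 × $0.25 = $1.50 (priced conservatively at one RRU per read; with 50% eventually consistent reads the bill would cover 4.5 million RRUs, about $1.13)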
  • Storage: 10 GB × $0.25 = $2.50
  • Total: $7.75/month

Provisioned mode:

  • Provision 2 WCUs (supports ~5.2M writes/month at 100% utilization)
  • Provision 3 RCUs (supports ~7.8M reads/month at 100% utilization, accounting for eventually consistent reads)
  • Writes: 2 × $0.00065 × 730 = $0.95
  • Reads: 3 × $0.00013 × 730 = $0.28
  • Storage: 10 GB × $0.25 = $2.50
  • Total: $3.73/month

Winner: Provisioned mode saves 52% ($4.02/month).

This estimate models a production-ready setup with steady utilization. Actual costs vary based on item size, read consistency, and traffic variance.

Scenario 2: Mid-market team — 30 million writes, 60 million reads per month

Assumptions:

  • 30 million writes per month (1,000,000 writes/day, ~11.6 writes/second average)
  • 60 million reads per month (2,000,000 reads/day, ~23.1 reads/second average)
  • All items under 1 KB
  • 50% eventually consistent reads
  • 100 GB storage
  • No reserved capacity

On-Demand mode:

  • Writes: 30,000,000 / 1,000,000 × $1.25 = $37.50
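  • Reads: 60,000,000 / 1,000,000 × $0.25 = $15.00 (priced conservatively at one RRU per read; with 50% eventually consistent reads the bill would cover 45 million RRUs, about $11.25)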
  • Storage: 100 GB × $0.25 = $25.00
  • Total: $77.50/month

Provisioned mode:

  • Provision 15 WCUs (supports ~39M writes/month at 100% utilization)
  • Provision 30 RCUs (supports ~78M reads/month at 100% utilization, accounting for eventually consistent reads)
  • Writes: 15 × $0.00065 × 730 = $7.12
  • Reads: 30 × $0.00013 × 730 = $2.85
  • Storage: 100 GB × $0.25 = $25.00
  • Total: $34.97/month

Winner: Provisioned mode saves 55% ($42.53/month).
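Both scenarios follow the same arithmetic, so they are easy to re-run with your own numbers. The sketch below assumes the rates quoted in this guide, a 730-hour month, and, like the figures above, one RRU per read. The `Workload` class and function names are illustrative. Unrounded totals can differ by a cent from the per-line figures, which were rounded before summing.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730
WRU_PER_MILLION = 1.25     # USD, On-Demand writes
RRU_PER_MILLION = 0.25     # USD, On-Demand reads
WCU_HOUR = 0.00065         # USD, Provisioned writes
RCU_HOUR = 0.00013         # USD, Provisioned reads
STORAGE_GB_MONTH = 0.25    # USD, both modes


@dataclass
class Workload:
    writes: int        # writes per month
    reads: int         # reads per month
    storage_gb: float
    wcus: int          # capacity you would provision
    rcus: int


def on_demand_cost(w: Workload) -> float:
    requests = w.writes / 1e6 * WRU_PER_MILLION + w.reads / 1e6 * RRU_PER_MILLION
    return requests + w.storage_gb * STORAGE_GB_MONTH


def provisioned_cost(w: Workload) -> float:
    capacity = (w.wcus * WCU_HOUR + w.rcus * RCU_HOUR) * HOURS_PER_MONTH
    return capacity + w.storage_gb * STORAGE_GB_MONTH


def savings_pct(w: Workload) -> float:
    """Percentage saved by Provisioned relative to On-Demand (negative if dearer)."""
    od = on_demand_cost(w)
    return (od - provisioned_cost(w)) / od * 100


if __name__ == "__main__":
    small = Workload(3_000_000, 6_000_000, 10, wcus=2, rcus=3)
    mid = Workload(30_000_000, 60_000_000, 100, wcus=15, rcus=30)
    for name, w in (("small", small), ("mid-market", mid)):
        print(f"{name}: on-demand ${on_demand_cost(w):.2f}, "
              f"provisioned ${provisioned_cost(w):.2f}, "
              f"savings {savings_pct(w):.0f}%")
```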

Pricing based on publicly available information as of April 2026. Enterprise discounts, custom contracts, and negotiated rates are not reflected here.

The pattern holds: Provisioned mode wins decisively for steady high-volume workloads. On-Demand wins for low-volume, spiky, or unpredictable traffic.

Cost Optimization Strategies for Both Capacity Modes

For On-Demand mode

1. Use eventually consistent reads wherever possible

Eventually consistent reads cost half as much as strongly consistent reads. If your application tolerates slight staleness (typically under one second), switch all GetItem and Query operations to eventually consistent. Most read-heavy apps can use eventually consistent reads for 80% of requests without user impact.

2. Batch operations reduce request count

BatchWriteItem lets you write up to 25 items per request, and BatchGetItem lets you read up to 100. Batching does not lower the unit cost: a single batch write of 10 items costs 10 WRUs, the same as 10 individual writes. What it reduces is network round trips and client overhead.

3. Monitor request size distribution

A 1.1 KB item consumes 2 WRUs, not 1. Track your average item size and consider schema changes to keep items under 1 KB where possible. Offload large attributes to S3 and store only references in DynamoDB.

4. Use DynamoDB Streams instead of polling

Polling a table repeatedly to detect changes wastes read capacity. Enable DynamoDB Streams and use Lambda triggers to react to changes instead. Streams charge separately but eliminate unnecessary read requests.

For Provisioned mode

1. Enable auto-scaling with tight bounds

Set auto-scaling target utilization to 70%, minimum capacity to your baseline, and maximum capacity to a hard budget ceiling. This prevents runaway scaling during traffic spikes while maintaining headroom for growth.

2. Monitor consumed vs provisioned capacity daily

Over-provisioning wastes money. Under-provisioning causes throttling. Use CloudWatch ConsumedReadCapacityUnits and ConsumedWriteCapacityUnits metrics to validate whether your provisioned capacity matches actual consumption. If average utilization stays below 40%, reduce provisioned capacity.
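One detail that is easy to miss: with the Sum statistic, the consumed-capacity metrics report the total units used during the period, not a per-second rate, so the sum has to be divided by the period length before comparing it with provisioned capacity. The sketch below shows that conversion with the 40% floor from this section and the 70% auto-scaling target as thresholds. The function names are hypothetical, and fetching the datapoints from CloudWatch is left out.

```python
def average_utilization(consumed_sum: float, period_seconds: int,
                        provisioned_units: float) -> float:
    """Fraction of provisioned capacity used, from a CloudWatch Sum datapoint."""
    if provisioned_units <= 0:
        raise ValueError("provisioned_units must be positive")
    consumed_per_second = consumed_sum / period_seconds
    return consumed_per_second / provisioned_units


def sizing_hint(utilization: float, low: float = 0.40, high: float = 0.70) -> str:
    """Rough right-sizing signal; thresholds are the rules of thumb from this guide."""
    if utilization < low:
        return "reduce"
    if utilization > high:
        return "increase"
    return "ok"


if __name__ == "__main__":
    # 6,000 WCUs consumed in a 5-minute period against 100 provisioned WCUs.
    util = average_utilization(6000, period_seconds=300, provisioned_units=100)
    print(f"{util:.0%} -> {sizing_hint(util)}")
```

A single low datapoint is not a reason to downsize; apply the hint to an average over days or weeks.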

3. Use reserved capacity for stable workloads

If a table has run at consistent capacity for three months and is projected to remain stable, purchase reserved capacity. A one-year reservation at 50 WCUs saves $1,424/year compared to on-demand pricing.

4. Split hot partitions

DynamoDB distributes data across partitions based on partition key. A single hot partition can exhaust its share of provisioned capacity while other partitions sit idle. Use composite partition keys or add randomness to distribute writes evenly.
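One common way to add that spread is write sharding: append a shard suffix to a hot partition key so writes land on several key values. The sketch below derives the suffix from a hash of the item ID, which keeps the mapping deterministic so a single item can still be read back directly. The key format, shard count, and function names are assumptions for illustration.

```python
import hashlib
from typing import List


def sharded_partition_key(base_key: str, item_id: str, shard_count: int = 10) -> str:
    """Spread writes for one logical key across shard_count partition key values."""
    digest = hashlib.sha256(item_id.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % shard_count
    return f"{base_key}#{shard}"


def all_shard_keys(base_key: str, shard_count: int = 10) -> List[str]:
    """Every key value to query when reading the whole logical partition."""
    return [f"{base_key}#{n}" for n in range(shard_count)]


if __name__ == "__main__":
    print(sharded_partition_key("orders-2026-04-01", "order-12345"))
    print(all_shard_keys("orders-2026-04-01", shard_count=4))
```

The trade-off is on the read side: fetching the whole logical partition now takes one Query per shard, so keep the shard count as small as the write rate allows.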

5. Use Global Secondary Indexes (GSIs) carefully

Each GSI consumes its own provisioned capacity. A table with five GSIs requires capacity provisioning for the base table plus all five indexes. Remove unused indexes and consolidate query patterns where possible.

Monitoring DynamoDB with CubeAPM

CubeAPM provides unified monitoring for DynamoDB tables across both On-Demand and Provisioned capacity modes, surfacing cost drivers, throttling events, and utilization patterns in real time. It runs inside your VPC or on-prem, so DynamoDB telemetry never leaves your cloud.

What CubeAPM tracks for DynamoDB

Capacity utilization and throttling

CubeAPM collects DynamoDB CloudWatch metrics via OpenTelemetry and surfaces consumed vs provisioned capacity for Provisioned tables, throttled request rates, and read/write request volume. Alerts fire when throttling exceeds thresholds or when utilization drops below 30% (indicating over-provisioning waste).

Cost analysis by table and operation

CubeAPM correlates request volume with capacity mode pricing to estimate per-table costs. For On-Demand tables, it tracks WRU and RRU consumption and projects monthly spend based on current request rates. For Provisioned tables, it highlights unused capacity and calculates potential savings from downsizing or switching modes.

Request latency and error rates

Beyond capacity, CubeAPM tracks DynamoDB API call latency and error rates (including throttling errors) across all operations — PutItem, GetItem, Query, Scan, BatchWriteItem. Slow queries or high error rates get traced back to the originating service in your application stack.

Auto-scaling event correlation

For Provisioned tables with auto-scaling enabled, CubeAPM logs every scaling event and correlates it with traffic spikes, deployment changes, or external events. This helps validate whether auto-scaling is responding appropriately or over-reacting to transient load.

How to set up DynamoDB monitoring in CubeAPM

CubeAPM supports multiple ingestion paths for DynamoDB metrics:

  1. CloudWatch Metric Streams — Stream DynamoDB metrics to CubeAPM via OpenTelemetry Collector using CloudWatch Metric Streams.
  2. AWS SDK instrumentation — Instrument your application with OpenTelemetry SDKs to capture DynamoDB API call traces, including operation type, item size, and response time.
  3. Prometheus CloudWatch Exporter — Use Prometheus to scrape DynamoDB CloudWatch metrics and forward them to CubeAPM’s Prometheus-compatible ingest endpoint.

All three paths feed into CubeAPM’s unified dashboard where DynamoDB metrics appear alongside application traces, logs, and infrastructure metrics. This makes it possible to correlate a DynamoDB throttling event with the exact API endpoint that triggered the spike.

CubeAPM pricing is $0.15/GB for all ingested telemetry with no per-host or per-metric surcharges. DynamoDB metrics typically add 50-100 MB/month per table depending on request volume.

When to Switch Between On-Demand and Provisioned Mode

DynamoDB allows you to switch capacity mode once per 24 hours per table. This gives teams flexibility to adapt to changing workload patterns, but frequent switching adds operational complexity.

Switch from On-Demand to Provisioned when:

  • Traffic stabilizes — Your app has been live for three months and request volume variance is less than 2x daily.
  • Cost projections show savings — Your monthly On-Demand bill exceeds what Provisioned mode would cost by 30% or more.
  • Reserved capacity makes sense — You are confident the workload will remain stable for a year and want to lock in discounted pricing.

Switch from Provisioned to On-Demand when:

  • Traffic becomes unpredictable — Workload variance exceeds 5x daily, making capacity planning impossible.
  • Throttling is frequent — Auto-scaling cannot keep up with sudden spikes and user experience is degrading.
  • Capacity planning overhead is too high — Your team spends more time adjusting provisioned capacity than the cost savings justify.
  • You want to eliminate idle costs — Your table sees bursts of activity separated by long idle periods where provisioned capacity sits unused.

A common pattern: start new tables on On-Demand, collect usage data for 30-90 days, then switch to Provisioned mode with auto-scaling once traffic patterns stabilize.
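The switching criteria above can be encoded as a simple rule of thumb. The sketch below is only an illustration of those thresholds (under 2x daily variance plus at least a 30% cost gap to move to Provisioned, over 5x daily variance to move to On-Demand). It is not an AWS recommendation engine, and it ignores the softer signals such as throttling frequency and team overhead.

```python
def recommend_mode(current_mode: str, daily_peak_to_trough: float,
                   on_demand_monthly: float, provisioned_monthly: float) -> str:
    """Return 'on_demand' or 'provisioned' using the thresholds from this guide."""
    if current_mode == "on_demand":
        stable = daily_peak_to_trough < 2.0
        worth_it = on_demand_monthly >= provisioned_monthly * 1.30
        return "provisioned" if stable and worth_it else "on_demand"
    if current_mode == "provisioned":
        # Variance above 5x makes capacity planning unreliable.
        return "on_demand" if daily_peak_to_trough > 5.0 else "provisioned"
    raise ValueError(f"unknown mode: {current_mode!r}")


if __name__ == "__main__":
    # Stable traffic and a large cost gap: move to Provisioned.
    print(recommend_mode("on_demand", 1.5, 77.50, 34.97))
    # Traffic swings 6x within a day: move to On-Demand.
    print(recommend_mode("provisioned", 6.0, 10.00, 5.00))
```

The cost inputs are the projected monthly bills for the same workload in each mode, taken from at least 30 days of usage data.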

Common DynamoDB Cost Pitfalls and How to Avoid Them

Over-provisioning capacity “just in case”

Teams often provision 2x or 3x the capacity they need to avoid throttling. This works but wastes money. Instead, use auto-scaling with appropriate bounds and monitor consumed capacity weekly to right-size provisioning.

Ignoring eventually consistent reads

Strongly consistent reads cost twice as much as eventually consistent reads. Most read-heavy workloads tolerate eventual consistency. Audit your application’s read operations and switch to eventually consistent wherever staleness under one second is acceptable.

Running full table scans

Scan operations consume read capacity proportional to the amount of data read, not the number of items returned. Scanning a full 100 GB table reads roughly 25 million 4 KB units with strongly consistent reads, or about half that with eventually consistent reads, even if you only need 10 items. Use Query with partition keys and indexes instead.
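A rough estimate of what a full scan consumes takes one division. The sketch below approximates the charge as the table size in 4 KB units and ignores per-page rounding. The rate is the On-Demand read price quoted in this guide, and the function name is hypothetical.

```python
import math

READ_UNIT_BYTES = 4 * 1024
RRU_PRICE = 0.25 / 1_000_000   # USD per RRU (rate quoted in this guide)


def scan_read_units(bytes_scanned: int, strongly_consistent: bool = False) -> float:
    """Approximate read units for scanning bytes_scanned of table data.

    Scans are eventually consistent by default, which halves the units.
    """
    units = math.ceil(bytes_scanned / READ_UNIT_BYTES)
    return float(units) if strongly_consistent else units / 2


if __name__ == "__main__":
    table_bytes = 100 * 1024 ** 3   # 100 GiB
    for consistent in (True, False):
        units = scan_read_units(table_bytes, consistent)
        print(f"strongly_consistent={consistent}: {units:,.0f} units, "
              f"about ${units * RRU_PRICE:.2f} on On-Demand")
```

On a Provisioned table the same scan draws those units from the table's RCUs, which is why a single large scan can throttle regular traffic.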

Forgetting data transfer costs

DynamoDB charges for data transfer out to the internet and across AWS regions. A table in us-east-1 queried from an EC2 instance in eu-west-1 incurs cross-region transfer fees at $0.02/GB. Use Real User Monitoring to track where requests originate and consider replicating tables closer to users with Global Tables.

Not using DynamoDB Accelerator (DAX) for read-heavy workloads

DAX is an in-memory cache for DynamoDB that reduces read latency to microseconds and offloads read capacity from the base table. For read-heavy apps, DAX can reduce provisioned RCUs by 80%, offsetting its $0.12/node-hour cost.
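Whether the offset works depends on how many RCUs the cache removes. The sketch below uses the node price above and the Provisioned read rate quoted in this guide to estimate the break-even. The cluster size is an assumption, and the function names are illustrative.

```python
DAX_NODE_HOUR = 0.12    # USD per node-hour (figure quoted in this guide)
RCU_HOUR = 0.00013      # USD per RCU-hour (rate quoted in this guide)


def rcus_to_offset_dax(nodes: int = 1) -> float:
    """RCUs that must be removed for the saved capacity to pay for the cluster."""
    return nodes * DAX_NODE_HOUR / RCU_HOUR


def baseline_rcus_needed(nodes: int = 1, reduction: float = 0.80) -> float:
    """Provisioned RCUs a table needs before a given reduction covers DAX."""
    return rcus_to_offset_dax(nodes) / reduction


if __name__ == "__main__":
    print(f"{rcus_to_offset_dax(1):.0f} RCUs must go to pay for one node")
    print(f"{baseline_rcus_needed(1):.0f} baseline RCUs at an 80% reduction")
    print(f"{baseline_rcus_needed(3):.0f} baseline RCUs for a three-node cluster")
```

Under these assumptions, DAX pays for itself on capacity alone only for tables provisioned well above 1,000 RCUs. For smaller tables the case rests on latency, not cost.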

Disclaimer: The information in this article reflects the latest details available at the time of publication and may change as technologies and products evolve. Features, pricing, and plan limits can change over time. Always verify the latest information directly with the vendor before making purchasing or deployment decisions.

Frequently Asked Questions

What is the difference between On-Demand and Provisioned capacity in DynamoDB?

On-Demand charges per request with instant auto-scaling. Provisioned charges per hour for reserved capacity you define upfront. On-Demand fits unpredictable workloads; Provisioned fits steady high-volume workloads where cost predictability matters.

When should I use DynamoDB On-Demand mode?

Use On-Demand for new apps with unknown traffic patterns, spiky workloads, low-volume tables, or dev/test environments. It eliminates capacity planning and charges only for actual requests.

When should I use DynamoDB Provisioned mode?

Use Provisioned for production apps with steady predictable traffic, high request volumes, or workloads where you want to lock in reserved capacity discounts. Provisioned mode costs less at scale when utilization is consistent.

Can I switch between On-Demand and Provisioned mode?

Yes, you can switch once per 24 hours per table. This lets you adapt to changing workload patterns, but frequent switching adds operational complexity.

What is DynamoDB auto-scaling and how does it work?

Auto-scaling adjusts provisioned capacity automatically based on utilization. You set a target utilization percentage, minimum capacity, and maximum capacity. DynamoDB increases or decreases capacity as traffic changes, but scaling is reactive and may lag sudden spikes.

How much does DynamoDB cost for a small app?

A small app processing 3 million writes and 6 million reads per month with 10 GB storage costs around $7.75/month on On-Demand or $3.73/month on Provisioned mode. Costs scale linearly with request volume and storage size.

What is the break-even point between On-Demand and Provisioned pricing?

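The break-even point depends on traffic consistency. At the rates used in this guide ($0.00065 per WCU-hour vs $1.25 per million WRUs), Provisioned becomes cheaper for steady workloads once average utilization exceeds roughly 14% of what you provision. For spiky workloads, On-Demand often wins even at higher request volumes, because provisioned capacity has to be sized for the peak.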
