AWS Lambda Cost Monitoring: Concurrency, Duration, and Provisioned Concurrency Waste

Author: Vineet Chirania
Category: Monitoring
Published Date: June 30, 2026

AWS Lambda’s serverless model promises pay-per-use simplicity, but in practice, three billing dimensions interact in ways that create cost surprises: request count, execution duration, and provisioned concurrency uptime. According to the 2024 State of Lambda report, 60% of Lambda users now monitor costs at least weekly, up from 35% in 2022, suggesting widespread concern about billing predictability. The gap between expected and actual Lambda bills often comes from provisioned concurrency running 24/7 to avoid cold starts, billed duration rounding up to the nearest millisecond, or concurrency spikes during traffic bursts where every parallel execution counts as a separate billable event.

This guide covers how Lambda billing works across all three dimensions, which metrics expose waste, and how to monitor Lambda cost drivers in real time to prevent overruns before they compound.

What Is AWS Lambda Cost Monitoring

AWS Lambda cost monitoring is the practice of tracking function invocations, execution duration, memory allocation, and provisioned concurrency usage to measure actual spend against expected costs and identify inefficiencies before they scale. Unlike traditional compute where you pay for uptime, Lambda charges per request and per GB-second of execution time, meaning cost scales directly with traffic volume and code efficiency.

Lambda billing has four core components:

Request charges — $0.20 per million invocations. Each invocation counts as one request, regardless of execution time. For asynchronous events from S3, SNS, or EventBridge, the first 256 KB is counted as one request, then every additional 64 KB chunk up to 1 MB adds another request charge.

Duration charges — billed per GB-second based on allocated memory and actual runtime. For x86 architecture in US East Ohio, pricing tiers start at $0.0000166667 per GB-second for the first 6 billion GB-seconds per month. If you allocate 1024 MB (1 GB) and your function runs for 200 ms (0.2 seconds), you pay for 0.2 GB-seconds. Duration is rounded up to the nearest millisecond.

Provisioned concurrency charges — calculated from the time you enable it on your function until it is disabled, rounded up to the nearest 5 minutes. Unlike on-demand execution, which is charged only when a function is invoked, provisioned concurrency is charged continuously for keeping execution environments warm. AWS Lambda pricing lists this at $0.0000041667 per GB-second for x86 in US East Ohio.

Data transfer charges — standard AWS data transfer rates apply when Lambda functions send data out to the internet or across regions. Egress to the public internet costs approximately $0.09 per GB after the first 100 GB per month.

The free tier includes 1 million requests and 400,000 GB-seconds of compute per month, but provisioned concurrency and data transfer are not covered by the free tier.

Cost monitoring for Lambda means tracking all four components in real time and correlating spikes with function-level behavior to understand which functions drive the most cost and why. Without monitoring, a misconfigured provisioned concurrency setting can run up $500/month on a single function that handles 10 requests per day.

How AWS Lambda Billing Works Across Concurrency, Duration, and Provisioned Concurrency

Lambda billing is not a flat per-invocation fee. Three independent dimensions combine to produce your monthly bill, and understanding how they interact is the only way to control costs at scale.

Request count and concurrency

Every Lambda invocation counts as one billable request. Concurrency is the number of in-flight requests your function is handling at the same time. If 100 requests arrive simultaneously and each takes 1 second to complete, your concurrency at that moment is 100. AWS does not charge separately for concurrency itself, but high concurrency requires Lambda to provision multiple execution environments in parallel, and each environment consumes duration charges while it runs.

The formula for concurrency is:

Concurrency = Requests per second × Average duration in seconds

If your function receives 500 requests per second and each request takes 200 ms (0.2 seconds) to complete, your concurrency is 100. This means Lambda provisions 100 execution environments to handle the load without throttling.
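The concurrency formula above can be sketched in a few lines. This is a minimal illustration, not an AWS API: the function names are made up for this example, and the 1,000 default limit is the account default cited in this article.

```python
import math


def estimated_concurrency(requests_per_second: float, avg_duration_s: float) -> float:
    """Steady-state in-flight requests: arrival rate x average duration."""
    return requests_per_second * avg_duration_s


def concurrency_headroom(requests_per_second: float,
                         avg_duration_s: float,
                         concurrency_limit: int = 1000) -> int:
    """Concurrent executions left before throttling (negative means over the limit)."""
    # Round before ceil so float noise cannot push 100.0 up to 101.
    needed = math.ceil(round(estimated_concurrency(requests_per_second, avg_duration_s), 6))
    return concurrency_limit - needed


# The example from the text: 500 requests/second at 200 ms each.
print(estimated_concurrency(500, 0.2))
print(concurrency_headroom(500, 0.2))
```

Running the same calculation against peak rather than average traffic gives a more useful picture of how close a function gets to throttling.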

By default, your account has a concurrency limit of 1,000 concurrent executions across all functions in a region. If you hit this limit, additional requests are throttled and return a 429 error. You can request a quota increase through AWS Support, but monitoring concurrency usage is critical to avoid throttling during traffic spikes.

Duration and memory allocation

Duration charges depend on how long your function runs and how much memory you allocate. You choose memory between 128 MB and 10,240 MB in 1 MB increments. CPU allocation scales proportionally with memory.

If you allocate 512 MB and your function runs for 300 ms, you are billed for 0.3 seconds × 0.5 GB = 0.15 GB-seconds. At the first pricing tier ($0.0000166667 per GB-second in US East Ohio for x86), this costs $0.0000025 per invocation. Multiply by 1 million invocations per month, and duration charges alone come to $2.50.
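The arithmetic above can be reproduced with a short helper. This is a sketch under the article's stated rate for x86 in US East Ohio; the function name and constant are illustrative, and the free tier and higher-volume pricing tiers are ignored.

```python
import math

# USD per GB-second, first pricing tier, as quoted in this article.
X86_ON_DEMAND_RATE = 0.0000166667


def duration_cost(runtime_ms: float, memory_mb: int,
                  rate: float = X86_ON_DEMAND_RATE) -> float:
    """Duration charge for one invocation, with runtime rounded up to the next ms."""
    billed_ms = math.ceil(runtime_ms)
    gb_seconds = (billed_ms / 1000) * (memory_mb / 1024)
    return gb_seconds * rate


per_invocation = duration_cost(300, 512)
print(f"per invocation: ${per_invocation:.7f}")
print(f"per million invocations: ${per_invocation * 1_000_000:.2f}")
```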

Duration is always rounded up to the nearest millisecond. A function that completes in 1.2 ms is billed for 2 ms. This rounding has minimal impact on short functions but adds up for high-volume invocations.

The pricing tier structure means the first 6 billion GB-seconds per month in a region are billed at the highest rate. If your monthly aggregate duration exceeds 6 billion GB-seconds, the next tier applies at a slightly lower rate. Few workloads reach this volume, but the discount matters at enterprise scale.

Provisioned concurrency and cold start elimination

Provisioned concurrency keeps a specified number of execution environments initialized and ready to respond instantly. This eliminates cold starts, the initialization delay that occurs when Lambda provisions a new environment for the first time or after a function has been idle.

Provisioned concurrency is billed continuously from the time you enable it until you disable it, rounded up to the nearest 5 minutes. If you configure 10 units of provisioned concurrency on a function with 1024 MB allocated memory, you are billed for 10 × 1 GB × every second the configuration is active.

In US East Ohio, x86 provisioned concurrency costs $0.0000041667 per GB-second. For 10 units running 24/7 for a month (2,592,000 seconds), the cost is 10 × 2,592,000 × $0.0000041667 = $108 per month, before any invocation or duration charges.
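A small helper makes the uptime charge easy to recompute for other configurations. This is a sketch using the rate quoted above; the function name is illustrative, and applying the 5-minute rounding once to the whole enabled period is an assumption about how the rounding rule works.

```python
import math

# USD per GB-second for x86 provisioned concurrency, as quoted in this article.
PC_UPTIME_RATE = 0.0000041667


def pc_uptime_cost(units: int, memory_mb: int, enabled_seconds: float,
                   rate: float = PC_UPTIME_RATE) -> float:
    """Uptime charge for provisioned concurrency over one enabled period."""
    # Assumption: the enabled period is rounded up to the next 5-minute boundary.
    billed_seconds = math.ceil(enabled_seconds / 300) * 300
    return units * (memory_mb / 1024) * billed_seconds * rate


# 10 units at 1024 MB for a 30-day month (2,592,000 seconds).
print(f"${pc_uptime_cost(10, 1024, 2_592_000):.2f}")
```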

Invocations that use provisioned concurrency still incur duration charges, but at a lower rate: $0.0000097222 per GB-second for provisioned concurrency duration vs. $0.0000166667 for on-demand duration in US East Ohio x86. The trade-off is that you pay uptime charges even when no requests are being served.

Provisioned concurrency makes sense for latency-sensitive APIs where cold starts are unacceptable. It rarely makes sense for batch jobs, scheduled tasks, or low-traffic functions where cold starts have no user-facing impact.

Cost monitoring must track all three dimensions together

A function with low request count but high provisioned concurrency will cost more than a high-traffic function running on-demand. A function with high concurrency but short duration may cost less than a low-concurrency function with long runtime. Lambda cost monitoring tools must correlate request count, duration, and provisioned concurrency usage per function to show which functions drive the bill and whether provisioned concurrency is justified by actual traffic patterns.

Key Lambda Cost Metrics to Track

Monitoring Lambda costs requires tracking metrics that surface inefficiencies and cost drivers before they compound. The following metrics are the minimum required for effective cost visibility.

Invocations per function

Total invocations per function over a time window shows which functions drive request volume. A function invoked 10 million times per month costs $2 in request charges alone, before duration is considered. Invocations should be segmented by function name, version, and alias to identify whether old versions still receive traffic after deployments.

Billed duration vs. actual duration

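Lambda bills in 1 ms increments. Billed duration is the measured duration rounded up to the next millisecond, and both values appear in the REPORT line that Lambda writes to CloudWatch Logs for every invocation. Initialization code outside your handler runs during the Init phase, which is reported separately as Init Duration. Historically, Init time was not billed for on-demand functions on managed runtimes, but AWS announced standardized Init billing effective August 1, 2025, so verify current behavior on the AWS Lambda pricing page. If billed duration rounds up to 2 ms when actual runtime is 1.2 ms, the extra 0.8 ms costs $0.0000166667 × 0.0008 s × 1 GB ≈ $0.000000013 per invocation. At 10 million invocations per month that is about $0.13, so rounding alone is rarely a meaningful cost driver.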

Tracking billed duration per function exposes which functions run longest and consume the most GB-seconds.

Memory allocation vs. memory used

You allocate memory in advance, but your function may use far less. A function configured with 1024 MB that consistently uses 300 MB wastes 70% of allocated memory. Reducing allocation to 512 MB cuts duration charges in half, provided runtime does not increase. Because CPU scales with memory, confirm with a load test that the smaller setting does not lengthen execution. Lambda reports actual memory usage in the REPORT line written to CloudWatch Logs for each invocation. Monitoring max memory used per function over a week identifies over-provisioned functions.
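As a rough illustration, the memory figures can be pulled out of REPORT log lines with a regular expression. The sample lines below are hypothetical and shortened, and the helper name is made up for this sketch; in practice the lines would come from a CloudWatch Logs export or query.

```python
import re

# Matches the memory fields of a Lambda REPORT log line.
REPORT_MEMORY = re.compile(
    r"Memory Size: (?P<size>\d+) MB\s+Max Memory Used: (?P<used>\d+) MB"
)


def memory_headroom(log_lines):
    """Return (configured_mb, peak_used_mb, unused_fraction), or None if no REPORT lines."""
    size = peak = None
    for line in log_lines:
        match = REPORT_MEMORY.search(line)
        if match:
            size = int(match["size"])
            peak = max(peak or 0, int(match["used"]))
    if size is None:
        return None
    return size, peak, 1 - peak / size


# Hypothetical sample lines (request IDs shortened for readability).
sample = [
    "REPORT RequestId: aaa\tDuration: 102.25 ms\tBilled Duration: 103 ms\tMemory Size: 1024 MB\tMax Memory Used: 280 MB",
    "REPORT RequestId: bbb\tDuration: 98.10 ms\tBilled Duration: 99 ms\tMemory Size: 1024 MB\tMax Memory Used: 300 MB",
]
print(memory_headroom(sample))
```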

Provisioned concurrency utilization

Provisioned concurrency is wasted if it sits idle. Utilization is the percentage of provisioned concurrency units actively serving requests. If you provision 20 units and peak usage is 5 concurrent requests, utilization is 25%. You are paying for 20 units 24/7 but using only 5 at peak.

Monitoring provisioned concurrency invocations vs. on-demand invocations shows whether provisioned concurrency is necessary. If 95% of invocations use on-demand execution environments, provisioned concurrency is likely wasted spend.

Throttles and concurrency limit breaches

Throttles occur when concurrent executions exceed your account limit or reserved concurrency quota. A throttled request returns a 429 error and the client must retry. High throttle counts indicate under-provisioned concurrency limits, but increasing limits without monitoring can lead to runaway costs during traffic spikes.

Tracking concurrent executions per function and comparing against account-level or reserved concurrency limits shows how close you are to throttling.

Cost per invocation and cost per function

Cost per invocation is total monthly cost for a function divided by total invocations. A function with 1 million invocations costing $50 per month has a cost per invocation of $0.00005. Tracking this metric over time surfaces functions where cost per invocation increases due to longer runtime or higher memory allocation.

Cost per function aggregates request charges, duration charges, and provisioned concurrency charges to show which functions contribute most to your Lambda bill. A function with low invocations but high provisioned concurrency can be the single largest cost driver.

Why Provisioned Concurrency Creates the Most Lambda Waste

Provisioned concurrency is the billing dimension most likely to create waste because it charges continuously whether or not your function receives traffic. AWS Lambda documentation describes provisioned concurrency as “calculated from the time you enable it on your function until it is disabled, rounded up to the nearest five minutes.” This means every function with provisioned concurrency enabled is billing 24/7, and if traffic patterns do not justify it, you are paying for idle capacity.

The provisioned concurrency trap for spiky workloads

Provisioned concurrency is designed for predictable, steady-state traffic where cold starts degrade user experience. But many teams enable it on functions with spiky traffic: a weekly batch job, an hourly scheduled task, or an API that sees traffic only during business hours. In these cases, provisioned concurrency sits idle most of the time.

A function configured with 10 units of provisioned concurrency at 1024 MB costs $108 per month in US East Ohio for uptime charges alone. If the function handles 50,000 invocations per month, mostly during a 2-hour daily window, provisioned concurrency is idle 22 hours per day. The $108 provisioned concurrency bill supports 2 hours of actual traffic. On-demand execution for the same workload would cost request charges ($0.01) plus duration charges based on 50,000 invocations, likely under $10 total.

Provisioned concurrency vs. on-demand cost comparison

On-demand execution charges per invocation and per GB-second. Provisioned concurrency charges uptime continuously plus a lower per-invocation duration rate. The break-even point depends on traffic volume and distribution.

For a function with 1024 MB allocated memory running 200 ms per invocation:

On-demand cost per invocation:

Request charge: $0.20 / 1,000,000 = $0.0000002
Duration charge: 0.2 seconds × 1 GB × $0.0000166667 = $0.0000033
Total: $0.00000353 per invocation

Provisioned concurrency cost per invocation (assuming 10 units provisioned):

Uptime charge: 10 units × 1 GB × 2,592,000 seconds/month × $0.0000041667 = $108/month
Duration charge per invocation: 0.2 seconds × 1 GB × $0.0000097222 = $0.0000019
Request charge: $0.0000002
Total per invocation: $0.0000021 + ($108 / total invocations)

If the function receives 1 million invocations per month, provisioned concurrency total cost is $108 (uptime) + $2.14 (duration + requests) = $110.14. On-demand cost is $3.53. Provisioned concurrency is 31× more expensive.

If the function receives 10 million invocations per month, provisioned concurrency total cost is $108 + $21.44 = $129.44. On-demand cost is $35.33. Provisioned concurrency is still 3.7× more expensive.

Provisioned concurrency only makes economic sense when cold start elimination justifies the uptime premium and traffic is sustained, not spiky.
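The comparison above can be turned into a break-even estimate. This is a sketch using only the rates quoted in this article; the function names are illustrative, and it assumes every invocation lands on a provisioned environment and ignores the free tier.

```python
ON_DEMAND_RATE = 0.0000166667     # USD per GB-second, on-demand duration
PC_DURATION_RATE = 0.0000097222   # USD per GB-second, duration on provisioned concurrency
PC_UPTIME_RATE = 0.0000041667     # USD per GB-second, provisioned concurrency uptime
REQUEST_PRICE = 0.20 / 1_000_000  # USD per request
SECONDS_PER_MONTH = 2_592_000     # 30-day month, as in the text


def monthly_costs(invocations, runtime_s=0.2, memory_gb=1.0, pc_units=10):
    """Return (on_demand_cost, provisioned_cost) in USD for one month."""
    on_demand = invocations * (REQUEST_PRICE + runtime_s * memory_gb * ON_DEMAND_RATE)
    uptime = pc_units * memory_gb * SECONDS_PER_MONTH * PC_UPTIME_RATE
    provisioned = uptime + invocations * (
        REQUEST_PRICE + runtime_s * memory_gb * PC_DURATION_RATE
    )
    return on_demand, provisioned


def break_even_invocations(runtime_s=0.2, memory_gb=1.0, pc_units=10):
    """Monthly invocations at which provisioned concurrency matches on-demand cost."""
    uptime = pc_units * memory_gb * SECONDS_PER_MONTH * PC_UPTIME_RATE
    saving_per_invocation = runtime_s * memory_gb * (ON_DEMAND_RATE - PC_DURATION_RATE)
    return uptime / saving_per_invocation


print(monthly_costs(1_000_000))
print(round(break_even_invocations()))
```

With these defaults the break-even lands at roughly 78 million invocations per month, which corresponds to the 10 provisioned environments being busy about 60% of the time. Below that level of sustained use, the uptime charge outweighs the cheaper duration rate.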

How to identify wasted provisioned concurrency

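Monitor ProvisionedConcurrencyInvocations vs. Invocations in CloudWatch. If total invocations are 10 million per month but only 100,000 use provisioned concurrency, 99% of traffic runs on on-demand environments. Either callers are invoking a version or alias without provisioned concurrency, or the provisioned pool is far too small for the load. In both cases the current configuration is not earning its uptime charge.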

Track ProvisionedConcurrencyUtilization to see what percentage of provisioned units are actively serving requests. If utilization stays below 50%, you are paying for double the capacity you need.

Compare cost per invocation with provisioned concurrency enabled vs. disabled. If disabling provisioned concurrency reduces cost per invocation by 80% with acceptable cold start frequency, the provisioned concurrency configuration is waste.
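One way to run the utilization check is to query CloudWatch for the peak value over the past week. The sketch below assumes a boto3 CloudWatch client is passed in (for example `boto3.client("cloudwatch")`); the function and alias names are placeholders, and the helper names are made up for this example.

```python
from datetime import datetime, timedelta, timezone


def utilization_query(function_name: str, alias: str, days: int = 7) -> dict:
    """Build get_metric_statistics arguments for hourly peak utilization."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/Lambda",
        "MetricName": "ProvisionedConcurrencyUtilization",
        "Dimensions": [
            {"Name": "FunctionName", "Value": function_name},
            # Provisioned concurrency metrics are reported per version or alias.
            {"Name": "Resource", "Value": f"{function_name}:{alias}"},
        ],
        "StartTime": end - timedelta(days=days),
        "EndTime": end,
        "Period": 3600,
        "Statistics": ["Maximum"],
    }


def peak_utilization(cloudwatch_client, function_name: str, alias: str, days: int = 7):
    """Highest hourly utilization in the window, or None if there are no datapoints."""
    response = cloudwatch_client.get_metric_statistics(
        **utilization_query(function_name, alias, days)
    )
    points = response.get("Datapoints", [])
    # Values are fractions: 0.25 means 25% of provisioned units were in use.
    return max((p["Maximum"] for p in points), default=None)
```

A call such as `peak_utilization(boto3.client("cloudwatch"), "checkout-api", "live")` (both names hypothetical) that returns a value well under 0.5 for a full week suggests the provisioned pool could be reduced.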

How to Monitor AWS Lambda Costs with CloudWatch and Cost Explorer

AWS provides native tools for Lambda cost monitoring: CloudWatch for real-time function metrics and AWS Cost Explorer for billing visibility. Neither is purpose-built for Lambda cost optimization, but both surface the data needed to identify waste.

CloudWatch metrics for Lambda cost drivers

CloudWatch automatically publishes Lambda metrics at no additional charge. The following metrics map directly to cost:

Invocations — total number of times a function is invoked. Multiply by $0.0000002 to get request charges.

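Duration — elapsed time the function code spends processing an event, in milliseconds. Billed duration is this value rounded up to the next millisecond. Multiply by allocated memory in GB and by the duration pricing tier rate to estimate duration charges.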

ConcurrentExecutions — number of function instances processing requests simultaneously. High concurrency during traffic spikes increases total duration charges because more environments run in parallel.

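ProvisionedConcurrentExecutions — number of execution environments processing events on provisioned concurrency at a given moment. Uptime charges are based on the configured amount rather than on this metric: multiply the configured units by allocated memory, seconds in the billing period, and the provisioned concurrency rate.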

ProvisionedConcurrencyInvocations — number of invocations served by provisioned concurrency environments. Compare to total invocations to see if provisioned concurrency is actually used.

ProvisionedConcurrencyUtilization — percentage of provisioned concurrency units actively serving requests. Low utilization means you are paying for idle capacity.

Throttles — number of invocations rejected due to concurrency limits. High throttle counts suggest under-provisioned limits, but increasing limits without cost monitoring can lead to runaway bills.

CloudWatch metrics are free but require manual analysis to translate into cost impact. You must calculate cost per invocation by pulling duration, memory allocation, and invocation count, then applying AWS pricing formulas.
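The manual calculation described above might look like the following once the metric sums are in hand. This is a sketch with illustrative names; it uses the on-demand rates quoted in this article and ignores the free tier. Because the CloudWatch Duration sum is not rounded up per invocation, the result slightly understates billed duration.

```python
def estimate_on_demand_cost(invocations_sum: float, duration_sum_ms: float,
                            memory_mb: int,
                            rate: float = 0.0000166667,
                            request_price: float = 0.0000002) -> dict:
    """Estimate monthly on-demand cost from CloudWatch Sum statistics."""
    gb_seconds = (duration_sum_ms / 1000) * (memory_mb / 1024)
    requests = invocations_sum * request_price
    duration = gb_seconds * rate
    total = requests + duration
    return {
        "requests": requests,
        "duration": duration,
        "total": total,
        "per_invocation": total / invocations_sum if invocations_sum else 0.0,
    }


# 10 million invocations averaging 200 ms at 1024 MB.
print(estimate_on_demand_cost(10_000_000, 10_000_000 * 200, 1024))
```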

AWS Cost Explorer for Lambda billing breakdowns

Cost Explorer shows actual spend by service, broken down by linked account, region, and usage type. For Lambda, usage types include:

Request — total request charges
Duration — total GB-second charges for on-demand execution
Provisioned Concurrency — uptime charges for provisioned concurrency
Data Transfer — egress charges for data sent out of Lambda functions

Cost Explorer updates daily with a 24-hour lag, meaning you cannot see real time cost spikes. It shows what you already paid, not what you are about to pay. For functions with provisioned concurrency enabled on a Friday evening and forgotten over the weekend, Cost Explorer will not flag the waste until Monday when $15 in unnecessary uptime charges have already accrued.

Cost Explorer does not correlate spend with function-level behavior. If your Lambda bill increased 40% month over month, Cost Explorer shows the total increase but not which functions or workloads drove it. You must cross-reference CloudWatch metrics manually to identify the cause.
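The same usage-type breakdown is available programmatically through the Cost Explorer API. The sketch below assumes a boto3 Cost Explorer client is passed in (for example `boto3.client("ce")`); the helper names are illustrative, dates are ISO strings such as "2026-06-01", and pagination is ignored for brevity.

```python
def lambda_cost_query(start: str, end: str) -> dict:
    """Build get_cost_and_usage arguments: daily Lambda cost grouped by usage type."""
    return {
        "TimePeriod": {"Start": start, "End": end},
        "Granularity": "DAILY",
        "Metrics": ["UnblendedCost"],
        "Filter": {"Dimensions": {"Key": "SERVICE", "Values": ["AWS Lambda"]}},
        "GroupBy": [{"Type": "DIMENSION", "Key": "USAGE_TYPE"}],
    }


def cost_by_usage_type(ce_client, start: str, end: str) -> dict:
    """Sum cost per usage type across the period (first page of results only)."""
    totals = {}
    response = ce_client.get_cost_and_usage(**lambda_cost_query(start, end))
    for day in response["ResultsByTime"]:
        for group in day["Groups"]:
            usage_type = group["Keys"][0]
            amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
            totals[usage_type] = totals.get(usage_type, 0.0) + amount
    return totals
```

The result still stops at usage type. Attributing spend to individual functions requires joining it with per-function CloudWatch metrics or cost allocation tags.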

Limitations of native AWS tools for Lambda cost monitoring

CloudWatch and Cost Explorer were not built for proactive cost optimization. Neither tool automatically correlates function-level metrics with cost, alerts when cost per invocation trends upward, or flags provisioned concurrency waste in real time. For teams running hundreds of Lambda functions, manual correlation across CloudWatch and Cost Explorer is slow and incomplete.

Third-party tools purpose-built for Lambda cost monitoring track cost per function, per invocation, and per concurrency type in real time, alerting when waste patterns emerge. For teams with tight cost controls or high Lambda spend, native AWS tools alone are insufficient.

Tools for Monitoring Lambda Costs

Beyond CloudWatch and Cost Explorer, several platforms provide Lambda-specific cost monitoring with real-time visibility and waste detection.

CubeAPM

CubeAPM provides infrastructure monitoring with native AWS Lambda support, tracking invocations, duration, memory usage, and provisioned concurrency utilization per function in real time. It correlates Lambda metrics with application traces and logs, showing which functions contribute to overall application latency and cost. CubeAPM runs on your own infrastructure, so telemetry data never leaves your cloud and there are no AWS data egress charges for sending metrics to an external SaaS platform.

CubeAPM’s flat $0.15/GB pricing model covers all telemetry ingested, including Lambda metrics, traces, and logs. There are no per-host, per-user, or per-function charges. This makes cost forecasting straightforward for teams monitoring hundreds or thousands of Lambda functions across multiple AWS accounts.

For Lambda cost monitoring, CubeAPM tracks cost per invocation, flags over-provisioned memory allocations, and identifies provisioned concurrency waste by comparing provisioned vs. on-demand invocations. Alerts fire when cost per function trends upward or when provisioned concurrency utilization drops below a defined threshold.

Deployment is self-hosted in your AWS VPC or on-premises data center. CubeAPM manages upgrades and support, removing the Day 2 ops burden of self-hosted observability stacks. Full OpenTelemetry compatibility means Lambda functions instrumented with OTel SDKs send telemetry directly to CubeAPM without agent changes.

Pricing: $0.15/GB data ingested, unlimited retention, no per-function or per-user fees. Verify current pricing at CubeAPM pricing page.

Datadog

Datadog Lambda monitoring tracks invocations, duration, errors, and cold starts per function with real-time dashboards. It integrates with AWS Cost and Usage Reports to correlate Lambda metrics with actual spend, showing cost per function and cost per invocation in the Datadog UI.

Datadog Serverless Monitoring charges $5 per million trace invocations for functions instrumented with Datadog’s tracing library. For a function handling 10 million invocations per month, Datadog charges $50 for trace visibility alone, before log ingestion or infrastructure monitoring costs.

Datadog’s per-host pricing model for infrastructure monitoring adds $18/host/month for 15-month retention. While Lambda functions are not billed per host, any EC2 instances, containers, or RDS databases in the same environment are. For teams with hybrid architectures, Datadog’s pricing compounds quickly across multiple billing dimensions.

Pricing: $5 per million Lambda trace invocations, $18/host/month for infra monitoring, log ingestion billed separately. Verify current rates at Datadog pricing page.

New Relic

New Relic Serverless Monitoring provides Lambda telemetry with out-of-box dashboards for invocations, duration, errors, and cold starts. It correlates Lambda metrics with application traces and logs for full-stack visibility.

New Relic uses a consumption-based pricing model where Lambda telemetry counts toward your monthly data ingest limit. The free tier includes 100 GB per month, after which pricing is $0.30/GB for additional data. A high-traffic Lambda workload generating 500 GB of telemetry per month costs $120 in New Relic data ingest fees.

New Relic charges per user for full platform access. The Standard plan is $99/user/month. For a team of 10 engineers, user fees alone are $990/month before data ingest charges. This makes New Relic expensive for small teams monitoring Lambda workloads.

Pricing: $0.30/GB beyond 100 GB free tier, $99/user/month for Standard plan. Verify current pricing at New Relic pricing page.

Lumigo

Lumigo is purpose-built for serverless observability, focusing exclusively on AWS Lambda, Step Functions, and event-driven architectures. It automatically instruments Lambda functions without code changes, capturing invocations, errors, and transaction traces end to end.

Lumigo pricing starts at $39/month for up to 200,000 function invocations. The Developer plan covers 2 million invocations per month for $195/month. For high-traffic workloads exceeding 10 million invocations per month, contact Lumigo for custom enterprise pricing.

Lumigo does not track Lambda costs directly but surfaces the metrics needed to calculate cost per function: invocations, duration, memory usage, and cold starts. It flags functions with frequent cold starts and suggests memory optimizations to reduce duration charges.

Pricing: starts at $39/month for up to 200,000 invocations, scales with usage. Verify current plans at Lumigo pricing page.

AWS Compute Optimizer

AWS Compute Optimizer analyzes Lambda function metrics and recommends optimal memory configurations to reduce cost. It uses machine learning to identify over-provisioned functions and suggests memory reductions that lower duration charges without degrading performance.

Compute Optimizer is free for Lambda recommendations. It integrates with Cost Explorer to show projected savings if recommendations are applied. For example, for a function configured with 2048 MB that consistently uses 512 MB, Compute Optimizer might recommend reducing allocation to 1024 MB, which would cut duration charges by 50% if runtime stays the same.

Compute Optimizer does not monitor real-time Lambda costs or alert on provisioned concurrency waste. It provides periodic recommendations that must be reviewed and applied manually.

Pricing: Free for Lambda function recommendations.

How to Optimize Lambda Costs Based on Monitoring Data

Lambda cost monitoring exposes waste, but reducing spend requires action. The following optimizations have the highest ROI for most workloads.

Right-size memory allocation

Lambda charges for allocated memory, not memory used. A function configured with 3008 MB that uses 800 MB wastes 73% of allocated memory. Reducing allocation to 1024 MB cuts duration charges by 66%.

Review max memory used per function over a 7-day window in CloudWatch Logs. Add 20% headroom for spikes, then reduce allocation. Test under load to confirm performance is not degraded.
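The headroom rule above is simple enough to script. This is a minimal sketch with illustrative function names; the saving estimate assumes runtime does not change at the lower memory setting, which should be confirmed under load.

```python
import math

MIN_MEMORY_MB = 128
MAX_MEMORY_MB = 10_240


def recommended_memory_mb(peak_used_mb: int, headroom_percent: int = 20) -> int:
    """Peak usage plus headroom, clamped to Lambda's configurable range."""
    target = math.ceil(peak_used_mb * (100 + headroom_percent) / 100)
    return min(max(target, MIN_MEMORY_MB), MAX_MEMORY_MB)


def duration_saving(current_mb: int, new_mb: int) -> float:
    """Fractional cut in duration charges, assuming runtime is unchanged."""
    return 1 - new_mb / current_mb


print(recommended_memory_mb(800))
print(f"{duration_saving(3008, 1024):.0%}")
```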

For functions with variable memory usage, consider using Lambda Power Tuning, an open-source tool that tests multiple memory configurations and recommends the optimal setting for cost and performance.

Disable or schedule provisioned concurrency

Provisioned concurrency sitting idle during off-peak hours is waste. If traffic is predictable, use AWS Application Auto Scaling to schedule provisioned concurrency only during peak hours.

For a function with traffic only during 8am–6pm weekdays, configure scheduled scaling to enable 10 units of provisioned concurrency at 7:55am and disable it at 6:05pm. This cuts provisioned concurrency uptime from 168 hours per week to about 51 hours, reducing monthly provisioned concurrency charges by roughly 70%.
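The schedule described above could be expressed as two Application Auto Scaling scheduled actions. The sketch below only builds the request arguments; the function name, alias, and action names are placeholders. It assumes the alias has already been registered as a scalable target with `register_scalable_target`, after which each dictionary can be passed to `put_scheduled_action` on a boto3 `application-autoscaling` client.

```python
def scheduled_actions(function_name: str, alias: str, units: int,
                      timezone_name: str = "UTC") -> list:
    """Build put_scheduled_action arguments for a weekday warm-up and wind-down."""
    common = {
        "ServiceNamespace": "lambda",
        "ResourceId": f"function:{function_name}:{alias}",
        "ScalableDimension": "lambda:function:ProvisionedConcurrency",
        "Timezone": timezone_name,
    }
    warm_up = {
        **common,
        "ScheduledActionName": "weekday-warm-up",  # hypothetical name
        "Schedule": "cron(55 7 ? * MON-FRI *)",    # 7:55am, Monday to Friday
        "ScalableTargetAction": {"MinCapacity": units, "MaxCapacity": units},
    }
    wind_down = {
        **common,
        "ScheduledActionName": "weekday-wind-down",  # hypothetical name
        "Schedule": "cron(5 18 ? * MON-FRI *)",      # 6:05pm, Monday to Friday
        "ScalableTargetAction": {"MinCapacity": 0, "MaxCapacity": 0},
    }
    return [warm_up, wind_down]


for action in scheduled_actions("checkout-api", "live", 10, "America/New_York"):
    print(action["ScheduledActionName"], action["Schedule"])
```

Setting the time zone explicitly matters here, because cron expressions are otherwise evaluated in UTC and the warm-up would drift away from local business hours.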

For functions with unpredictable traffic, disable provisioned concurrency entirely and accept cold starts. Most cold starts last 100–300 ms, acceptable for non-user-facing workloads like batch processing or webhook handlers.

Reduce function duration

Shorter duration means lower GB-second charges. The fastest optimizations:

Move initialization code outside the handler function so it runs once per execution environment during the Init phase instead of on every invocation.
Use connection pooling for database clients to avoid opening new connections per invocation.
Reduce dependency bundle size by removing unused libraries. Smaller bundles reduce Init phase duration.
Use Lambda Extensions to cache configuration or secrets locally instead of calling external APIs on every invocation.

A function running 500 ms with 1024 MB allocated costs $0.00000833 per invocation in duration charges. Reducing runtime to 200 ms drops cost to $0.00000333, a 60% reduction.
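The first optimization in the list above, moving setup out of the handler, looks like this in a Python function. The `build_client` helper is a stand-in for whatever expensive setup a real function performs (an SDK client, a database pool); it is not a real library call.

```python
import time


def build_client() -> dict:
    """Stand-in for expensive setup such as creating an SDK client or DB pool."""
    time.sleep(0.05)  # simulate slow initialization
    return {"created_at": time.time()}


# Module scope: runs once per execution environment, not once per invocation.
CLIENT = build_client()


def handler(event, context):
    # Warm invocations reuse CLIENT instead of paying the setup cost again.
    return {"client_created_at": CLIENT["created_at"], "echo": event}
```

Every warm invocation returns the same `client_created_at` value, which shows the setup ran only once for that environment.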

Set reserved concurrency limits on low-priority functions

Reserved concurrency caps the maximum concurrent executions for a function, preventing runaway costs during traffic spikes. If a misconfigured API suddenly receives 10,000 requests per second and each request runs for one second, Lambda would try to scale toward 10,000 concurrent execution environments (bounded by your account concurrency limit), each billing for duration. A reserved concurrency limit of 100 throttles excess requests, capping cost impact.

Set reserved concurrency on batch jobs, scheduled tasks, and non-critical APIs where throttling is acceptable. Leave user-facing APIs unrestricted to avoid impacting customer experience.
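A reserved concurrency limit also puts a ceiling on duration spend, which can be estimated before applying it. This is a sketch with illustrative helper names, using the on-demand rate quoted in this article; the request dictionary matches the arguments of the Lambda `put_function_concurrency` call in boto3, and the function name is a placeholder.

```python
def max_hourly_duration_cost(reserved_concurrency: int, memory_mb: int,
                             rate: float = 0.0000166667) -> float:
    """Worst-case duration charge per hour: every reserved slot busy all hour."""
    return reserved_concurrency * (memory_mb / 1024) * 3600 * rate


def reserved_concurrency_request(function_name: str, limit: int) -> dict:
    """Arguments for lambda_client.put_function_concurrency(**request)."""
    return {"FunctionName": function_name, "ReservedConcurrentExecutions": limit}


print(f"${max_hourly_duration_cost(100, 1024):.2f} per hour at most")
print(reserved_concurrency_request("nightly-report", 100))
```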

Track cost per invocation over time

Cost per invocation trending upward signals inefficiency: longer runtime, higher memory allocation, or provisioned concurrency added without traffic justification. Monitor cost per invocation per function weekly and investigate any function where cost increases more than 20% month over month.

If cost per invocation increases but invocations remain flat, the cause is usually memory over-allocation or provisioned concurrency waste.
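The month-over-month check can be automated once per-function cost and invocation totals are available. This is a sketch; the input shape (function name mapped to a cost and invocation count) and the function names in the sample data are assumptions made for illustration.

```python
def flag_cost_increases(previous: dict, current: dict, threshold: float = 0.20) -> dict:
    """Return {function_name: fractional_increase} where cost per invocation rose past threshold.

    Both inputs map function name -> (monthly_cost_usd, invocations).
    """
    flagged = {}
    for name, (cost, calls) in current.items():
        if name not in previous or calls == 0:
            continue
        prev_cost, prev_calls = previous[name]
        if prev_calls == 0 or prev_cost == 0:
            continue
        before = prev_cost / prev_calls
        after = cost / calls
        change = (after - before) / before
        if change > threshold:
            flagged[name] = change
    return flagged


last_month = {"checkout-api": (50.0, 1_000_000), "thumbnailer": (10.0, 1_000_000)}
this_month = {"checkout-api": (65.0, 1_000_000), "thumbnailer": (10.5, 1_000_000)}
print(flag_cost_increases(last_month, this_month))
```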

Conclusion

AWS Lambda’s billing model charges per request, per GB-second of execution duration, and continuously for provisioned concurrency uptime. Without monitoring all three dimensions together, teams pay for over-allocated memory, idle provisioned concurrency, and functions with inefficient code that consumes more duration than necessary. Native AWS tools, CloudWatch and Cost Explorer, surface the raw data but require manual correlation to identify waste. Purpose-built Lambda cost monitoring tools track cost per function, per invocation, and per concurrency type in real time, flagging waste before it compounds.

The highest-ROI optimizations are right-sizing memory allocation, disabling or scheduling provisioned concurrency during off-peak hours, and reducing function duration through code efficiency. For teams running serverless workloads at scale, Lambda cost monitoring is not optional. It is the only way to keep billing predictable as traffic grows.

Disclaimer: The information in this article reflects the latest details available at the time of publication and may change as technologies and products evolve. Features, pricing, and plan limits can change over time. Always verify the latest information directly with the vendor before making purchasing or deployment decisions.

Frequently Asked Questions

What causes AWS Lambda bills to spike unexpectedly?

Provisioned concurrency running 24/7 on functions with low traffic is the most common cause. A single function with 10 units of provisioned concurrency at 1024 MB costs $108/month in uptime charges, regardless of invocations. The second cause is memory over-allocation. A function configured with 3008 MB that peaks at 800 MB and runs for one second pays about $0.00003 more per invocation than it would at 1024 MB, assuming the runtime stays the same.

How do I calculate cost per Lambda invocation?

Add request charges ($0.0000002 per invocation) plus duration charges (billed duration in seconds × memory in GB × pricing tier rate). For provisioned concurrency, add the per-invocation duration charge plus uptime charges divided by total invocations. If monthly invocations are 1 million and provisioned concurrency uptime is $108, cost per invocation includes $0.000108 in uptime allocation.

Should I use provisioned concurrency for all Lambda functions?

No. Provisioned concurrency only makes sense for latency-sensitive user-facing APIs where cold starts degrade experience. Functions handling batch jobs, scheduled tasks, or webhook handlers rarely justify provisioned concurrency costs. Monitor actual cold start frequency and user impact before enabling provisioned concurrency.

How do I track which Lambda functions cost the most?

Use AWS Cost Explorer to view Lambda spend by usage type and CloudWatch metrics to track invocations and duration per function. Multiply invocations by average duration and memory allocation to calculate GB-seconds per function. The functions with the highest GB-seconds drive the most duration charges.

What is the difference between reserved concurrency and provisioned concurrency?

Reserved concurrency sets a maximum limit on concurrent executions for a function, throttling requests beyond that limit. It does not eliminate cold starts or incur continuous charges. Provisioned concurrency keeps execution environments warm and ready, eliminating cold starts but billing continuously for uptime. Reserved concurrency controls cost during spikes. Provisioned concurrency reduces latency but increases cost.

How much does provisioned concurrency cost compared to on-demand execution?

Provisioned concurrency charges continuously at $0.0000041667 per GB-second in US East Ohio for x86 architecture. For 10 units at 1024 MB running 24/7, monthly uptime charges are $108 before invocation costs. On-demand execution charges only when invoked at $0.0000166667 per GB-second. Provisioned concurrency is cost-effective only for high-traffic functions where continuous uptime charges are offset by high invocation volume.

Can CubeAPM monitor AWS Lambda costs in real time?

Yes. CubeAPM tracks Lambda invocations, duration, memory usage, and provisioned concurrency utilization per function in real time. It correlates Lambda metrics with application traces and logs to show which functions drive cost and latency. CubeAPM runs on your infrastructure with flat $0.15/GB pricing and no per-function fees.
