CloudWatch is the default monitoring layer for Lambda – it collects invocation metrics automatically, at no extra charge, from the moment your function is first invoked. Lambda Enhanced Monitoring (CloudWatch Lambda Insights) is an optional add-on that runs as a Lambda extension and captures a second layer of data: system-level metrics like actual memory consumption, CPU time, disk I/O, and network I/O per invocation. The two work together, not in place of each other.
The distinction that matters in practice: CloudWatch tells you what your function did. Lambda Insights tells you what it consumed while doing it.
Key Takeaways
- CloudWatch standard metrics – Invocations, Errors, Duration, Throttles, ConcurrentExecutions – are free and require zero configuration.
- Lambda Insights adds CPU time, memory utilization, disk usage, network I/O, cold start duration, and worker shutdown events – none of which appear in standard CloudWatch.
- Lambda Insights is not free – it is billed per invocation and adds ~1KB of log data per invocation to the /aws/lambda-insights/ log group; enable it selectively, not on every function.
- Lambda Insights adds 10-20ms of overhead per invocation because the extension writes metrics after your handler returns – this counts toward billed duration.
- Standard CloudWatch cannot tell you whether your function is memory-constrained or CPU-bound; Lambda Insights can.
What Standard CloudWatch Gives You (Free, Always On)
Lambda publishes these metrics to the AWS/Lambda namespace automatically, in 1-minute intervals, with no setup required:
| Metric | What it tells you |
| --- | --- |
| Invocations | Total calls – success and failure, excluding throttles |
| Errors | Failed invocations – exceptions, OOM kills, timeouts |
| Duration | Execution time per invocation (avg, p99, max) |
| Throttles | Invocations rejected at the concurrency limit |
| ConcurrentExecutions | Parallel instances running at a point in time |
| DeadLetterErrors | Failed async event delivery to DLQ |
| IteratorAge | Stream lag for Kinesis/DynamoDB Streams triggers |
These metrics answer the operational question: Is my function working, and is it keeping up?
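If you want these numbers outside the console, the standard metrics are queryable through the CloudWatch API. A minimal sketch, assuming a function named your-function-name and GNU date for the timestamps:

```bash
# Fetch p99 Duration for the last hour from the free AWS/Lambda namespace
aws cloudwatch get-metric-statistics \
  --namespace AWS/Lambda \
  --metric-name Duration \
  --dimensions Name=FunctionName,Value=your-function-name \
  --start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)" \
  --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  --period 60 \
  --extended-statistics p99
```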
What they don’t answer: Why a function is slow. Whether it’s using 90% of its allocated memory. Whether it’s CPU-bound or waiting on I/O. Whether cold starts are lasting 800ms or 80ms. For those questions, you need Lambda Insights.
What Lambda Insights Adds (Opt-In, Per-Invocation Cost)
Lambda Insights runs as a Lambda layer – a lightweight extension that executes after your handler returns and writes a single structured log event per invocation to /aws/lambda-insights/. CloudWatch parses those events into metrics in the LambdaInsights namespace.
The additional metrics you get:
| Metric | What it tells you |
| --- | --- |
| memory_utilization | Actual memory used as a percentage of allocated memory |
| used_memory_max | Peak memory consumed in MB during the invocation |
| cpu_total_time | Total CPU time consumed during the invocation |
| init_duration | Cold start initialization time – as a proper metric, not just a log line |
| tmp_used | /tmp ephemeral storage used – relevant if your function writes to disk |
| rx_bytes / tx_bytes | Network data received and sent |
| billed_duration | Actual billed duration including extension overhead |
These metrics answer the resource question: is my function sized correctly, and where is time going?
The practical value of memory_utilization: Lambda allocates CPU proportionally to memory. If your function consistently uses 30% of its allocated memory, you’re likely over-provisioned – and paying for CPU you’re not using. If it’s regularly hitting 85-90%, the next spike will cause an OOM kill. Neither scenario is visible in standard CloudWatch.
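One practical way to act on memory_utilization is an alarm. A hedged sketch – the function name, threshold, evaluation window, and SNS topic ARN are placeholders, and it assumes Lambda Insights is already enabled so the LambdaInsights namespace is populated:

```bash
# Alarm when peak memory utilization stays above 85% (placeholder names and ARNs)
aws cloudwatch put-metric-alarm \
  --alarm-name your-function-name-memory-high \
  --namespace LambdaInsights \
  --metric-name memory_utilization \
  --dimensions Name=function_name,Value=your-function-name \
  --statistic Maximum \
  --period 300 \
  --evaluation-periods 3 \
  --threshold 85 \
  --comparison-operator GreaterThanThreshold \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:lambda-alerts
```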
How Lambda Insights Works Under the Hood
Lambda Insights is implemented as a Lambda layer. When you enable it, AWS attaches the LambdaInsightsExtension layer to your function. After each invocation, the extension:
- Collects CPU, memory, disk, and network data from the execution environment
- Writes a single structured log event in Embedded Metric Format (EMF) to /aws/lambda-insights/
CloudWatch then processes those EMF events and surfaces them as time-series metrics in the LambdaInsights namespace.
This is why Lambda Insights adds 10-20ms of overhead per invocation – the data collection and write happen synchronously after your handler exits, before the execution environment freezes.
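You can inspect those raw EMF events yourself with a CloudWatch Logs Insights query. A minimal sketch, assuming the default /aws/lambda-insights log group, the field names from the table above, and GNU date:

```bash
# Inspect per-invocation memory and CPU from the raw Lambda Insights EMF events
aws logs start-query \
  --log-group-name /aws/lambda-insights \
  --start-time "$(date -u -d '1 hour ago' +%s)" \
  --end-time "$(date -u +%s)" \
  --query-string 'fields @timestamp, used_memory_max, cpu_total_time
    | filter function_name = "your-function-name"
    | sort @timestamp desc
    | limit 20'
# Retrieve results once the query finishes:
# aws logs get-query-results --query-id <query-id-from-the-previous-command>
```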
How to Enable Lambda Insights
AWS Console: Lambda function → Configuration → Monitoring and operations tools → Edit → Enable Enhanced monitoring
AWS CLI:

# Step 1: Attach the layer (replace region and version as needed)
aws lambda update-function-configuration \
  --function-name your-function-name \
  --layers arn:aws:lambda:us-east-1:580247275435:layer:LambdaInsightsExtension:49

# Step 2: Attach the required IAM policy to your function's execution role
aws iam attach-role-policy \
  --role-name your-function-execution-role \
  --policy-arn arn:aws:iam::aws:policy/CloudWatchLambdaInsightsExecutionRolePolicy

Terraform:
resource "aws_lambda_function" "example" {
# ... your existing config
layers = ["arn:aws:lambda:us-east-1:580247275435:layer:LambdaInsightsExtension:49"]
}
resource "aws_iam_role_policy_attachment" "insights" {
role = aws_iam_role.lambda_exec.name
policy_arn = "arn:aws:iam::aws:policy/CloudWatchLambdaInsightsExecutionRolePolicy"
}Find the current layer ARN for your region in the Lambda Insights documentation.
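Whichever path you use, it is worth verifying the result before relying on the new metrics. A quick check – function and role names are placeholders:

```bash
# Confirm the Insights layer is attached to the function
aws lambda get-function-configuration \
  --function-name your-function-name \
  --query 'Layers[].Arn'

# Confirm the execution role carries the Insights policy
aws iam list-attached-role-policies \
  --role-name your-function-execution-role
```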
Cost Comparison
| Parameter | Standard CloudWatch | Lambda Insights |
| --- | --- | --- |
| Metrics | Free | ~$0.20 per 1M invocations |
| Logs | You pay for log ingestion | ~1KB per invocation added to log costs |
| Setup | Zero – automatic | Layer + IAM policy per function |
| Invocation overhead | None | 10-20ms per invocation |
| Namespace | AWS/Lambda | LambdaInsights |
At 100 million monthly invocations, Lambda Insights adds roughly $20 in metric costs plus ~100GB of additional log data. Under CloudWatch’s 2025 tiered pricing starting at $0.50/GB, that’s approximately $50 in additional log ingestion – around $70/month total for the Insights layer alone, on top of your standard CloudWatch spend.
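The arithmetic behind that estimate, as a back-of-the-envelope sketch using the assumed prices above (not a quote – verify against current CloudWatch pricing):

```bash
awk 'BEGIN {
  invocations = 100000000                   # 100M invocations per month
  metrics = invocations / 1000000 * 0.20    # ~$20 in Insights metric charges
  log_gb  = invocations * 1 / 1000000       # ~100 GB at ~1KB per invocation
  logs    = log_gb * 0.50                   # ~$50 in log ingestion
  printf "metrics $%.0f + logs $%.0f = ~$%.0f/month\n", metrics, logs, metrics + logs
}'
```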
The practical implication: Enable Lambda Insights selectively on your most critical or resource-intensive functions, not uniformly across your entire Lambda fleet. For functions you rarely investigate, standard CloudWatch is sufficient.
Side-by-Side: Which Questions Each Tool Answers
| Question | Standard CloudWatch | Lambda Insights |
| --- | --- | --- |
| Is my function erroring? | ✅ | ✅ |
| How long are invocations taking? | ✅ | ✅ |
| Am I being throttled? | ✅ | ✅ |
| How much memory is my function actually using? | ❌ | ✅ |
| Is my function about to OOM? | ❌ | ✅ |
| How long are cold starts taking (as a metric)? | ❌ (log line only) | ✅ |
| Is my function CPU-bound or I/O-bound? | ❌ | ✅ |
| Is /tmp storage filling up? | ❌ | ✅ |
| Is downstream I/O causing slowness? | ❌ | Partially (rx/tx bytes) |
| Which service caused a slow invocation? | ❌ | ❌ (needs distributed tracing) |
The Gap Neither Fills
Both standard CloudWatch and Lambda Insights are function-scoped. They tell you what happened inside a Lambda invocation – not what happened across the request chain that triggered it.
If a Lambda function took 3 seconds, CloudWatch tells you the duration. Lambda Insights tells you memory and CPU during those 3 seconds. Neither tells you that 2.6 of those seconds were spent waiting on a slow DynamoDB query – or that this Lambda was invoked by another Lambda that was itself responding to an API Gateway request.
That gap – the one between “my function was slow” and “here’s exactly why, and here’s the full request chain” – is where distributed tracing lives. It’s also the gap that neither CloudWatch nor Lambda Insights was designed to fill.
CubeAPM connects these dots. It instruments Lambda via the OpenTelemetry layer – the same layer you may already have for metrics – and gives you the full request trace: from the upstream trigger, through your function's execution, to every downstream service it called. When a function shows high duration in CloudWatch and high memory in Lambda Insights but no obvious cause, the trace in CubeAPM is usually where the answer is. And because CubeAPM is self-hosted in your own AWS account, no telemetry data leaves your environment.
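For illustration only, attaching an OpenTelemetry Lambda layer and pointing it at a self-hosted collector usually looks something like the sketch below. The layer ARN, wrapper path, and endpoint are placeholders, not CubeAPM's documented setup – use the values from the OpenTelemetry Lambda and CubeAPM documentation for your runtime and region:

```bash
# Attach an OpenTelemetry Lambda layer and point the exporter at a self-hosted collector.
# Note: --layers replaces the full layer list, so include any existing layers (e.g. Insights).
aws lambda update-function-configuration \
  --function-name your-function-name \
  --layers arn:aws:lambda:us-east-1:111111111111:layer:your-otel-lambda-layer:1 \
  --environment "Variables={AWS_LAMBDA_EXEC_WRAPPER=/opt/otel-handler,OTEL_EXPORTER_OTLP_ENDPOINT=http://your-collector:4318}"
```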
Summary
| Parameter | Standard CloudWatch | Lambda Insights |
| --- | --- | --- |
| Cost | Free | Per-invocation charge + log costs |
| Setup | None | Layer + IAM policy per function |
| Best for | Invocation health, errors, throttles, concurrency | Memory sizing, CPU profiling, cold start duration as a metric |
| Enable on | Every function – it's automatic | Critical and resource-intensive functions only |
| Overhead | None | 10-20ms per invocation |
Start with standard CloudWatch. Add Lambda Insights to functions where you need to answer resource questions – memory sizing, CPU utilization, or cold start duration tracking – rather than enabling it everywhere by default.
Disclaimer: Configurations, thresholds, and code examples are for guidance only. Verify against the current AWS and OpenTelemetry documentation before applying to production. AWS service details change frequently. CubeAPM references reflect genuine use cases; evaluate all tools against your own requirements.





