AWS Fargate is a serverless compute engine for containers – AWS manages the underlying infrastructure, which means you cannot install agents on the host OS or run DaemonSet-style collectors on the node. The only supported pattern for OpenTelemetry collection on Fargate is the sidecar pattern: an OTel Collector container running inside the same ECS task as your application, sharing the task’s network namespace so your application can send telemetry to localhost.
This guide covers how to set up the AWS Distro for OpenTelemetry (ADOT) Collector as a sidecar to monitor AWS Fargate containers with OpenTelemetry, which container and task metrics it can collect, which IAM permissions are required, and the key architectural constraints Fargate imposes that differ from EC2-backed ECS.
Key Takeaways
- Fargate does not support daemon-mode collectors. The sidecar pattern is the only option. You cannot run an ADOT Collector as a DaemonSet or ECS daemon service to collect host-level metrics on Fargate – that pattern is EC2-only
- The awsecscontainermetrics receiver in the OTel Collector reads task and container metrics from the ECS Task Metadata Endpoint V4. This requires Fargate platform version 1.4.0 or later
- Your application containers send telemetry to the sidecar collector at localhost:4317 (gRPC) or localhost:4318 (HTTP) – containers in the same ECS task share a network namespace
- You need two IAM roles: a task execution role (for ECR image pulls and Secrets Manager/SSM access) and a task role (for the collector to call CloudWatch, X-Ray, AMP, or other exporters)
- ADOT Collector configuration should be stored in AWS Systems Manager Parameter Store and referenced in the task definition – not baked into the container image
- Fargate cannot provide host OS-level metrics (CPU steal, disk I/O per node, network interfaces at the instance level) – only task and container-scoped metrics are available
The Two ECS Collector Deployment Patterns (and Why Fargate Only Gets One)
Sidecar pattern: The collector runs as an additional container definition inside each ECS task. It shares the task’s network namespace, so application containers reach it at localhost. Ideal for per-task telemetry with low latency and no inter-task networking.
Central collector / daemon pattern: A single ADOT Collector instance (or a small pool) runs as a separate ECS service, and application tasks send telemetry to it over the network. On EC2-backed ECS, a daemon service ensures one collector per EC2 instance, enabling host-level metrics collection. This pattern does not apply to Fargate: the ECS daemon scheduling strategy is EC2-only, and AWS explicitly states that EC2 instance-level metrics collection via the ADOT Collector daemon is not supported on Fargate clusters.
For Fargate, use the sidecar pattern for everything.
Step 1: IAM Roles Required
You need two separate IAM roles before deploying the collector.
Task Execution Role (used by ECS to start the task):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ecr:GetAuthorizationToken",
        "ecr:BatchCheckLayerAvailability",
        "ecr:GetDownloadUrlForLayer",
        "ecr:BatchGetImage",
        "logs:CreateLogStream",
        "logs:PutLogEvents",
        "ssm:GetParameters",
        "secretsmanager:GetSecretValue"
      ],
      "Resource": "*"
    }
  ]
}
The ssm:GetParameters permission lets ECS pull the collector configuration from Parameter Store at task startup. The secretsmanager:GetSecretValue permission is only needed if the task also references Secrets Manager secrets.
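The policy above uses "Resource": "*" for brevity. As a minimal sketch, assuming you want to limit the SSM read to the single parameter that holds the collector configuration, the following Python builds a narrower statement. The function name, the account ID, and the region are illustrative placeholders, not values from AWS documentation.

```python
import json


def scoped_ssm_statement(region: str, account_id: str, parameter_name: str) -> dict:
    """Build an IAM statement allowing ssm:GetParameters on one parameter only."""
    # In a parameter ARN the name follows "parameter/", so a leading slash
    # in the parameter name must not be doubled.
    name = parameter_name.lstrip("/")
    return {
        "Effect": "Allow",
        "Action": ["ssm:GetParameters"],
        "Resource": f"arn:aws:ssm:{region}:{account_id}:parameter/{name}",
    }


if __name__ == "__main__":
    # Placeholder account ID and region; substitute your own.
    statement = scoped_ssm_statement(
        "us-east-1", "123456789012", "/fargate/otel-collector-config"
    )
    print(json.dumps(statement, indent=2))
```

The resulting statement can replace the ssm:GetParameters entry in the broad policy, while the ECR and CloudWatch Logs actions stay in their own statement.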
Task Role (used by the running collector to export telemetry):
Permissions depend on your chosen exporter. For CloudWatch and X-Ray:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "cloudwatch:PutMetricData",
        "logs:PutLogEvents",
        "logs:DescribeLogStreams",
        "logs:DescribeLogGroups",
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "xray:PutTraceSegments",
        "xray:PutTelemetryRecords",
        "xray:GetSamplingRules",
        "xray:GetSamplingTargets",
        "xray:GetSamplingStatisticSummaries"
      ],
      "Resource": "*"
    }
  ]
}
For Amazon Managed Service for Prometheus (AMP), add:
{
  "Effect": "Allow",
  "Action": [
    "aps:RemoteWrite",
    "aps:GetSeries",
    "aps:GetLabels",
    "aps:GetMetricMetadata"
  ],
  "Resource": "*"
}
Step 2: Store Collector Configuration in SSM Parameter Store
Store your ADOT Collector configuration as a String parameter in AWS Systems Manager Parameter Store. This avoids baking configuration into container images and allows updates without rebuilding.
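Parameter Store caps the size of a parameter value (4 KB for the standard tier, 8 KB for the advanced tier), so it can help to check the configuration file before uploading it. The sketch below is one way to do that. The function name is hypothetical, and measuring the limit in UTF-8 bytes is an assumption.

```python
STANDARD_LIMIT_BYTES = 4 * 1024  # standard-tier parameter value limit
ADVANCED_LIMIT_BYTES = 8 * 1024  # advanced-tier parameter value limit


def required_parameter_tier(config_text: str) -> str:
    """Return the smallest Parameter Store tier that can hold config_text."""
    size = len(config_text.encode("utf-8"))
    if size <= STANDARD_LIMIT_BYTES:
        return "Standard"
    if size <= ADVANCED_LIMIT_BYTES:
        return "Advanced"
    raise ValueError(f"config is {size} bytes, above the 8 KB advanced-tier limit")


if __name__ == "__main__":
    # Assumes the collector config was saved as collector-config.yaml.
    with open("collector-config.yaml", encoding="utf-8") as handle:
        print(required_parameter_tier(handle.read()))
```

A configuration that outgrows both tiers is usually a sign that it should be trimmed or split rather than stored as a single parameter.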
Example collector configuration for collecting ECS container metrics, application traces, and exporting to CloudWatch and X-Ray:
extensions:
  health_check:
receivers:
  awsecscontainermetrics:
    collection_interval: 20s
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
processors:
  batch:
    timeout: 60s
  filter:
    metrics:
      include:
        match_type: strict
        metric_names:
          - ecs.task.memory.utilized
          - ecs.task.memory.reserved
          - ecs.task.cpu.utilized
          - ecs.task.cpu.reserved
          - ecs.task.network.rate.rx
          - ecs.task.network.rate.tx
exporters:
  awsemf:
    namespace: ECS/ContainerMetrics
    region: us-east-1
    dimension_rollup_option: "ZeroAndSingleDimensionRollup"
  awsxray:
    region: us-east-1
service:
  extensions: [health_check]
  pipelines:
    metrics:
      receivers: [awsecscontainermetrics]
      processors: [filter, batch]
      exporters: [awsemf]
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [awsxray]
Store in SSM:
aws ssm put-parameter \
  --name "/fargate/otel-collector-config" \
  --type "String" \
  --value file://collector-config.yaml \
  --region us-east-1
Step 3: Add the ADOT Collector Sidecar to Your Task Definition
Add the collector as a second container definition in your existing ECS task definition. The application container sends telemetry to localhost – no networking configuration needed because containers in the same task share a network namespace.
{
  "family": "my-fargate-task",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "512",
  "memory": "1024",
  "executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
  "taskRoleArn": "arn:aws:iam::123456789012:role/ecsTaskRole",
  "containerDefinitions": [
    {
      "name": "my-application",
      "image": "my-app-image:latest",
      "essential": true,
      "environment": [
        {
          "name": "OTEL_EXPORTER_OTLP_ENDPOINT",
          "value": "http://localhost:4318"
        },
        {
          "name": "OTEL_RESOURCE_ATTRIBUTES",
          "value": "service.name=my-service,deployment.environment=production"
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/my-application",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      }
    },
    {
      "name": "adot-collector",
      "image": "public.ecr.aws/aws-observability/aws-otel-collector:latest",
      "essential": false,
      "command": [
        "--config=env:AOT_CONFIG_CONTENT"
      ],
      "secrets": [
        {
          "name": "AOT_CONFIG_CONTENT",
          "valueFrom": "/fargate/otel-collector-config"
        }
      ],
      "portMappings": [
        {
          "containerPort": 4317,
          "protocol": "tcp"
        },
        {
          "containerPort": 4318,
          "protocol": "tcp"
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/adot-collector",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      }
    }
  ]
}
Key decisions in this task definition:
- essential: false on the collector – if the collector crashes, the application container should not be stopped. The application can still run (and degrade gracefully) without the collector. Set this to true only if you require telemetry collection for the task to be considered healthy.
- AOT_CONFIG_CONTENT as a secret reference to SSM Parameter Store – this pulls the configuration at task startup without it being visible in the task definition itself.
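These decisions are easy to lose when a task definition is edited later. As a rough sketch, assuming the task definition is available as a parsed JSON dict, a small check like the one below could run in CI. The function name and the warning texts are hypothetical. The container name adot-collector matches the example above.

```python
def lint_sidecar(task_def: dict, collector_name: str = "adot-collector") -> list[str]:
    """Return human-readable warnings about the collector sidecar."""
    containers = {c["name"]: c for c in task_def.get("containerDefinitions", [])}
    collector = containers.get(collector_name)
    if collector is None:
        return [f"no container named {collector_name!r}"]

    warnings = []
    # ECS treats a container as essential when the field is omitted.
    if collector.get("essential", True):
        warnings.append("collector is essential: a collector crash stops the whole task")
    if collector.get("image", "").endswith(":latest"):
        warnings.append("collector image uses the latest tag; pin a version")
    secret_names = {s["name"] for s in collector.get("secrets", [])}
    if "AOT_CONFIG_CONTENT" not in secret_names:
        warnings.append("AOT_CONFIG_CONTENT is not injected as a secret")
    return warnings
```

Run against the example task definition, this sketch would flag only the latest image tag, which is used there for illustration.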
Step 4: What the awsecscontainermetrics Receiver Collects
The awsecscontainermetrics receiver reads the ECS Task Metadata Endpoint V4 – a local HTTP endpoint that ECS exposes to each container in the task, with its base URL injected through the ECS_CONTAINER_METADATA_URI_V4 environment variable. It extracts the resource utilization metrics that Docker stats reports for each container in the task.
Fargate platform version requirement: The Task Metadata Endpoint V4 is available on Fargate tasks using platform version 1.4.0 or later. Tasks on older platform versions cannot use this receiver.
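To see what the receiver works from, you can query the endpoint yourself from inside a running task. The sketch below is a minimal illustration, not the receiver's implementation. It assumes the /task/stats path under the V4 base URL returns a JSON object keyed by container ID, with Docker-stats-style memory_stats.usage fields, and the function names are hypothetical.

```python
import json
import os
import urllib.request


def task_stats_url(environ=os.environ) -> str:
    """Build the task-level stats URL from the V4 metadata base URL."""
    base = environ.get("ECS_CONTAINER_METADATA_URI_V4")
    if not base:
        raise RuntimeError(
            "ECS_CONTAINER_METADATA_URI_V4 is not set; "
            "is the task on Fargate platform version 1.4.0 or later?"
        )
    return base.rstrip("/") + "/task/stats"


def total_memory_usage_bytes(task_stats: dict) -> int:
    """Sum memory_stats.usage over all containers in a /task/stats payload."""
    total = 0
    for stats in task_stats.values():
        if stats:  # a container that has not started yet may have no stats
            total += stats.get("memory_stats", {}).get("usage", 0)
    return total


def fetch_task_stats(timeout: float = 2.0) -> dict:
    """Fetch /task/stats; this only works from inside a running ECS task."""
    with urllib.request.urlopen(task_stats_url(), timeout=timeout) as resp:
        return json.load(resp)
```

If task_stats_url raises inside a Fargate task, the platform version is the first thing to check.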
Metrics available from the receiver:
| Metric | What it measures |
| --- | --- |
| ecs.task.memory.utilized | Memory currently in use by the task, in megabytes |
| ecs.task.memory.reserved | Memory reserved for the task, in megabytes |
| ecs.task.cpu.utilized | CPU utilized by the task |
| ecs.task.cpu.reserved | CPU reserved for the task |
| ecs.task.network.rate.rx | Inbound network bytes per second for the task |
| ecs.task.network.rate.tx | Outbound network bytes per second for the task |
| ecs.task.storage.read_bytes | Storage bytes read by the task |
| ecs.task.storage.write_bytes | Storage bytes written by the task |
Container-level equivalents of these metrics are also available with a container.* prefix (for example, container.memory.utilized) and carry the container.name attribute for per-container breakdown.
What is not available on Fargate: Host OS-level metrics are not available. You cannot collect node CPU steal time, disk IOPS at the instance level, or physical network interface statistics – because Fargate abstracts the underlying EC2 instance from you entirely. This is a Fargate architectural constraint, not a collector limitation.
Application Instrumentation: Sending Telemetry to the Sidecar
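Your application sends traces, metrics, and logs to the OTel Collector sidecar at localhost. Note that the example configuration in Step 2 only wires the otlp receiver into a traces pipeline, so add matching metrics and logs pipelines before sending those signals. The two standard OTLP endpoints are: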
- localhost:4317 for gRPC (lower overhead, preferred for high-throughput)
- localhost:4318 for HTTP (easier firewall traversal in some configurations)
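Configure your OTel SDK to export to one of these endpoints. The environment variables OTEL_EXPORTER_OTLP_ENDPOINT and OTEL_RESOURCE_ATTRIBUTES are the standard way to configure this without code changes. If your SDK's default transport does not match the port you chose, set OTEL_EXPORTER_OTLP_PROTOCOL (grpc or http/protobuf) to match.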
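When telemetry does not arrive, a quick first check is whether the sidecar's OTLP port accepts connections from the application container. The following is a minimal sketch using only the standard library. The function name is hypothetical, and a successful TCP connect only shows that something is listening, not that the pipeline behind it is healthy.

```python
import socket


def collector_reachable(host: str = "localhost", port: int = 4317, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the collector port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    for port in (4317, 4318):
        print(port, collector_reachable(port=port))
```

Because the collector may start slightly after the application, a check like this is more useful in a retry loop at startup than as a one-off assertion.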
Resource attributes to set for Fargate:
service.name=my-service
service.version=1.0.0
deployment.environment=production
aws.ecs.task.family=my-fargate-task
aws.ecs.cluster.arn=arn:aws:ecs:us-east-1:123456789012:cluster/my-cluster
The awsecscontainermetrics receiver automatically adds ECS-specific resource attributes (cluster ARN, task ARN, task family, container name) to all metrics it collects. For application traces, set these as OTEL_RESOURCE_ATTRIBUTES in the container environment so they appear on every span.
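OTEL_RESOURCE_ATTRIBUTES is a single comma-separated string of key=value pairs, which is awkward to maintain by hand once it includes ARNs. The sketch below builds the string from a dict. The function name is hypothetical, and percent-encoding separator characters inside values follows the OpenTelemetry environment variable specification as I understand it.

```python
from urllib.parse import quote


def format_resource_attributes(attrs: dict) -> str:
    """Render a dict as an OTEL_RESOURCE_ATTRIBUTES value (key=value,key=value)."""
    # Percent-encode commas, equals signs, and spaces inside values so they
    # cannot be mistaken for separators; ':' and '/' stay readable for ARNs.
    return ",".join(
        f"{key}={quote(str(value), safe=':/')}" for key, value in attrs.items()
    )


if __name__ == "__main__":
    print(format_resource_attributes({
        "service.name": "my-service",
        "service.version": "1.0.0",
        "deployment.environment": "production",
        "aws.ecs.task.family": "my-fargate-task",
    }))
```

The output can be pasted into the environment section of the application container definition.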
Log Collection: FireLens with Fluent Bit
OpenTelemetry does not replace ECS log routing – it complements it. For structured log collection from Fargate, the standard approach is FireLens with the Fluent Bit log driver. You can route logs to the OTel Collector sidecar (via the Fluent Bit plugin for OpenTelemetry) or directly to CloudWatch Logs, OpenSearch, or another destination.
For most Fargate deployments, sending logs directly to CloudWatch Logs via awslogs driver (as shown in the task definition above) is the simplest path. Add OpenTelemetry log collection via FireLens only when you need log-trace correlation in a backend that supports it.
Practical Gotchas
Always pin the ADOT Collector image version in production. The task definition above uses the latest tag for illustration. In production, pin to a specific version tag. The latest tag changes without notice and can cause unexpected behavior during task replacement events. Find current releases at the ADOT Collector GitHub releases page.
The collector needs CPU and memory allocation. The ADOT Collector sidecar consumes task resources. A baseline of 256 CPU units and 512 MB of memory is a reasonable starting point for most workloads; adjust it based on your telemetry volume. If you under-allocate and the collector is CPU-starved, telemetry will be dropped.
essential: false is usually correct for the collector. If the collector is marked essential: true and crashes or fails health checks, ECS will stop the entire task, including your application container. This is rarely the right behavior. The application should degrade gracefully (continue running without telemetry) rather than failing entirely if the collector is unavailable.
Fargate platform version matters. If awsecscontainermetrics is receiving no data, verify the task is running on Fargate platform version 1.4.0 or later. Older platform versions use Task Metadata Endpoint V3, which this receiver does not support.
Containers in the same task share a network namespace automatically. You do not need to configure port exposure between the application and the collector – localhost works because they share the same network namespace by definition in ECS.
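The resource-allocation gotcha above is simple arithmetic, but it is worth doing before picking a task size. The sketch below assumes you reserve the collector's share at the container level and want to know what is left for the application. The function name is hypothetical, and the defaults are the 256 CPU units and 512 MB baseline mentioned above.

```python
def app_budget(task_cpu: int, task_memory_mb: int,
               collector_cpu: int = 256, collector_memory_mb: int = 512) -> tuple[int, int]:
    """Return (cpu_units, memory_mb) left for application containers."""
    cpu_left = task_cpu - collector_cpu
    memory_left = task_memory_mb - collector_memory_mb
    if cpu_left <= 0 or memory_left <= 0:
        raise ValueError("task size leaves nothing for the application")
    return cpu_left, memory_left


if __name__ == "__main__":
    # The example task definition uses cpu 512 and memory 1024.
    print(app_budget(512, 1024))
```

With the example task size of 512 CPU units and 1024 MB, the collector baseline takes half of the task, which may be a reason to choose a larger task size for real workloads.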
How Do I Correlate a Fargate Container Metric Spike with the Application Request That Caused It?
The awsecscontainermetrics receiver tells you which task is consuming high CPU or memory at the container level. It does not tell you which incoming request pattern, which API endpoint, or which background job within your application is responsible for the spike.
When CPU utilization climbs on a Fargate task, the OTel metrics show you the task-level symptom. Without distributed traces, diagnosing whether the cause is a specific API endpoint receiving more traffic, a memory leak in a specific code path, or a downstream service causing retries requires correlating container metrics with application traces manually.
CubeAPM instruments your application via the OpenTelemetry SDK and captures every request as a distributed trace with spans covering the application logic, downstream service calls, and database queries. When a Fargate container metric alarm fires, the trace in CubeAPM shows you which API endpoint was being called at the time of the spike, what each request was waiting on, and whether the pattern repeats across multiple containers in the task. Container metrics tell you that a task is under pressure. Application traces tell you why. Both run self-hosted inside your own AWS account, with no data leaving your environment.
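As a rough illustration of the correlation described in this section, the sketch below counts which endpoints had spans in flight during a metric spike window. It is not tied to any backend's API: the span dict shape (endpoint, start, end as numeric timestamps) and the function name are hypothetical.

```python
from collections import Counter


def endpoints_during_spike(spans: list[dict], spike_start: float,
                           spike_end: float) -> list[tuple[str, int]]:
    """Count spans per endpoint that overlap the [spike_start, spike_end] window."""
    counts = Counter(
        span["endpoint"]
        for span in spans
        # Two intervals overlap when each one starts before the other ends.
        if span["start"] <= spike_end and span["end"] >= spike_start
    )
    return counts.most_common()


if __name__ == "__main__":
    sample = [
        {"endpoint": "/checkout", "start": 10, "end": 15},
        {"endpoint": "/checkout", "start": 12, "end": 20},
        {"endpoint": "/health", "start": 0, "end": 5},
    ]
    print(endpoints_during_spike(sample, spike_start=11, spike_end=18))
```

A trace backend does this kind of time-window join for you; the sketch only shows why having both signals on a shared timeline matters.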
Summary
| Step | What to configure | Why it matters |
| --- | --- | --- |
| IAM task execution role | ECR, CloudWatch Logs, SSM permissions | Required for ECS to start the task and pull the config |
| IAM task role | CloudWatch, X-Ray, AMP permissions | Required for the collector to export telemetry |
| SSM Parameter Store | ADOT Collector YAML configuration | Separates config from image, allows updates without rebuilds |
| Task definition | adot-collector sidecar container definition | The collector runs alongside your app in the same network namespace |
| Fargate platform version | 1.4.0 or later | Required for Task Metadata Endpoint V4 and awsecscontainermetrics |
| Application env vars | OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 | Points SDK to the sidecar collector |
| essential: false on collector | The adot-collector container definition | Prevents task failure if the collector crashes |
Fargate forces the sidecar pattern and limits metrics to task and container scope. Within those constraints, ADOT with the awsecscontainermetrics receiver gives you CPU, memory, network, and storage metrics per task and per container, correlated with application traces via a single collector pipeline.
Disclaimer: Configurations, IAM policies, and task definition examples are for guidance only – verify against current AWS ADOT documentation and AWS ECS documentation before applying to production. ADOT Collector versions, IAM requirements, and ECS platform version support change over time. CubeAPM references reflect genuine use cases; evaluate all tools against your own requirements.