How to Set CPU Alerts: Step-by-Step Guide 2026

Author: Indu Priya
Category: Alerts
Published Date: June 11, 2026
Last updated: June 24th, 2026

High CPU usage can silently degrade application performance, trigger scaling failures, or cause service timeouts before anyone notices. Without proactive CPU alerts, teams discover these problems only after users report slow response times or systems crash under load. CPU alerts detect resource saturation early, giving teams time to investigate, optimize, or scale infrastructure before impact becomes visible.

This guide walks through setting up CPU alerts across AWS CloudWatch, Prometheus, Grafana, Kubernetes environments, and self-hosted observability platforms like CubeAPM. Each step includes real configuration examples, threshold recommendations based on workload type, and troubleshooting advice for common false positive scenarios.

Prerequisites

Before you set up CPU alerts, ensure you have:

Monitoring infrastructure deployed and collecting CPU metrics (CloudWatch, Prometheus, cAdvisor, Telegraf, or equivalent)
Admin or operator access to your monitoring platform and alerting tools
Notification channels configured (Slack, PagerDuty, email, webhooks)
Basic familiarity with your monitoring query language (PromQL for Prometheus, CloudWatch metric filters, or platform-specific query syntax)
Historical CPU baseline data for at least 7 days to understand normal usage patterns and avoid threshold misconfigurations

Step 1: Understand Your Workload and Set the Right Threshold

CPU alert thresholds depend entirely on workload type. A batch processing server hitting 95% CPU during scheduled jobs is normal. A frontend API server hitting 80% CPU during regular traffic is a warning sign.

Workload-specific threshold recommendations

Web servers and API endpoints: Alert at 70–75% sustained CPU. Brief spikes to 90% during traffic bursts are normal, but sustained levels above 70% indicate capacity problems or inefficient code.

Database servers: Alert at 60–70% CPU. Databases hitting sustained CPU pressure often show query latency spikes and connection pool exhaustion shortly after.

Batch processing and data pipelines: Alert at 85–90% CPU only if sustained for longer than expected job duration. Short-lived high CPU is expected during processing windows.

Kubernetes pods and containers: Alert at 80% of the defined CPU limit, not raw host CPU. If a pod has a 2-core limit and is using 1.6 cores, that is 80% utilization and warrants investigation.
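
The recommendations above can be captured as a small lookup table. The following is a minimal sketch, not a definitive policy: the dictionary keys and function names are hypothetical, and the values are simply the lower bound of each range quoted above.

```python
# Illustrative alert thresholds (percent), taken from the lower bound of each
# range recommended above. The workload keys are hypothetical labels.
ALERT_THRESHOLDS_PCT = {
    "web": 70.0,
    "database": 60.0,
    "batch": 85.0,
    "k8s_pod": 80.0,  # percent of the pod's CPU limit, not of host CPU
}


def utilization_pct(cores_used: float, cores_available: float) -> float:
    """Return CPU utilization as a percentage of the available cores."""
    if cores_available <= 0:
        raise ValueError("cores_available must be positive")
    return cores_used / cores_available * 100.0


def breaches_threshold(workload: str, cores_used: float, cores_available: float) -> bool:
    """Return True when utilization reaches the threshold for this workload type."""
    return utilization_pct(cores_used, cores_available) >= ALERT_THRESHOLDS_PCT[workload]


# The pod example from the text: 1.6 cores used against a 2-core limit.
print(f"{utilization_pct(1.6, 2.0):.0f}% of limit")
print(breaches_threshold("web", 1.0, 4.0))  # 25% on a web server: no breach
```

Treat the numbers as starting points and adjust them against your own baseline data.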

Define evaluation period and consecutive breaches

CPU can spike briefly during auto-scaling events, deployments, or cache rebuilds. Alerting on a single data point above threshold creates noise.

Set evaluation rules like:

CloudWatch: “2 out of 3” datapoints to alarm means CPU must breach the threshold in any 2 of the 3 most recent 5-minute evaluation periods before alerting; the breaches do not have to be consecutive
Prometheus: for: 5m in alert rules means the condition must be true continuously for 5 minutes before firing
Grafana: Define alert condition as “When avg() of query is above X for 5 minutes”
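
To see why "2 out of 3" differs from "2 consecutive", the sketch below evaluates both rules over the same series. It is a simplified model with hypothetical function names and made-up sample values; real CloudWatch evaluation also handles missing data, which is ignored here.

```python
from collections import deque
from typing import Iterable, List


def m_of_n_alarm(datapoints: Iterable[float], threshold: float, m: int = 2, n: int = 3) -> List[bool]:
    """For each datapoint, report whether at least m of the last n datapoints breach."""
    window = deque(maxlen=n)  # keeps only the n most recent breach flags
    states = []
    for value in datapoints:
        window.append(value > threshold)
        states.append(sum(window) >= m)
    return states


def consecutive_alarm(datapoints: Iterable[float], threshold: float, required: int = 2) -> List[bool]:
    """For each datapoint, report whether the last `required` datapoints all breach."""
    streak = 0
    states = []
    for value in datapoints:
        streak = streak + 1 if value > threshold else 0
        states.append(streak >= required)
    return states


cpu = [60, 80, 65, 82, 70, 90, 91]  # one illustrative value per 5-minute period
print(m_of_n_alarm(cpu, 75))       # alarms on the 4th, 6th and 7th datapoints
print(consecutive_alarm(cpu, 75))  # alarms only on the 7th datapoint
```

The M-of-N rule reacts to a CPU level that oscillates around the threshold, while the consecutive rule waits for an unbroken run of breaches.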

Step 2: Set Up CPU Alerts in AWS CloudWatch

AWS CloudWatch monitors EC2 instances, ECS tasks, Lambda functions, and RDS databases. CPU metrics appear automatically once CloudWatch agent or default EC2 monitoring is enabled.

Create a CloudWatch alarm for EC2 CPU utilization

Navigate to the CloudWatch console at https://console.aws.amazon.com/cloudwatch/ and select Alarms > All Alarms > Create Alarm.

Choose Select metric > EC2 > Per-Instance Metrics. Find the instance ID you want to monitor and select the CPUUtilization metric.

Under Specify metric and conditions:

Statistic: Choose Average to smooth out short spikes, or Maximum to catch peak usage
Period: Set to 5 minutes for most workloads
Threshold type: Select Static
Condition: Choose Greater than and enter your threshold (e.g., 75)

Under Additional configuration:

Datapoints to alarm: Set 2 out of 3 so that two breaching datapoints within the last three periods are required before alerting
Missing data treatment: Choose Treat missing data as missing to avoid false alerts during instance stops

Choose Next and configure SNS topic for notifications. If you do not have an SNS topic, create one and subscribe your email or Slack webhook endpoint.

Complete the alarm setup and save.

Example CloudWatch CLI command

aws cloudwatch put-metric-alarm \
  --alarm-name high-cpu-web-server \
  --alarm-description "Alert when CPU exceeds 75% for 10 minutes" \
  --metric-name CPUUtilization \
  --namespace AWS/EC2 \
  --statistic Average \
  --period 300 \
  --evaluation-periods 2 \
  --threshold 75 \
  --comparison-operator GreaterThanThreshold \
  --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:high-cpu-alerts

This creates an alarm that triggers when average CPU exceeds 75% for two consecutive 5-minute periods.
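
The same alarm can be created from Python with boto3. This is a sketch under stated assumptions: `build_cpu_alarm` and `create_alarm` are hypothetical helper names, and unlike the CLI command above this variant uses `DatapointsToAlarm` to express the "2 out of 3" rule from the console steps and sets the missing-data treatment explicitly.

```python
def build_cpu_alarm(instance_id: str, sns_topic_arn: str, threshold: float = 75.0) -> dict:
    """Build keyword arguments for CloudWatch put_metric_alarm (2 of 3 datapoints)."""
    return {
        "AlarmName": f"high-cpu-{instance_id}",
        "AlarmDescription": f"Average CPU above {threshold:g}% in 2 of the last 3 five-minute periods",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,            # seconds per datapoint
        "EvaluationPeriods": 3,   # look at the last 3 datapoints
        "DatapointsToAlarm": 2,   # alarm when any 2 of them breach
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "missing",
        "AlarmActions": [sns_topic_arn],
    }


def create_alarm(params: dict, region: str = "us-east-1") -> None:
    """Send the alarm definition to CloudWatch. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so build_cpu_alarm works without the SDK installed

    boto3.client("cloudwatch", region_name=region).put_metric_alarm(**params)


params = build_cpu_alarm(
    "i-0123456789abcdef0",
    "arn:aws:sns:us-east-1:123456789012:high-cpu-alerts",
)
print(params["AlarmName"], params["EvaluationPeriods"], params["DatapointsToAlarm"])
# create_alarm(params)  # uncomment to create the alarm in a real account
```

Keeping the parameter builder separate from the API call makes the alarm definition easy to review and unit test before it touches an AWS account.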

Pricing based on publicly available information as of April 2026. AWS alarm pricing is $0.10 per alarm per month for standard metrics. Verify current rates at the [AWS CloudWatch pricing page](https://aws.amazon.com/cloudwatch/pricing/).

Step 3: Set Up CPU Alerts in Prometheus

Prometheus collects host CPU metrics from node_exporter and container CPU metrics from cAdvisor, while kube-state-metrics exposes Kubernetes object state such as pod resource requests and limits. Alert rules are defined in YAML files and evaluated by the Prometheus server.

Create a Prometheus alert rule for host CPU

Create or edit your Prometheus alert rules file (typically /etc/prometheus/alert_rules.yml):

groups:
  - name: cpu_alerts
    interval: 30s
    rules:
      - alert: HighCPUUsage
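        # rate() averages over the whole 5m window; irate() uses only the last two samples and is better suited to graphs than alerts
        expr: 100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 75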
        for: 5m
        labels:
          severity: warning
          component: infrastructure
        annotations:
          summary: "High CPU usage on {{ $labels.instance }}"
          description: "CPU usage is {{ $value }}% on instance {{ $labels.instance }} for more than 5 minutes."

This rule calculates CPU usage by subtracting idle time from 100%. The for: 5m clause ensures the alert fires only after CPU remains above 75% continuously for 5 minutes.
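
The arithmetic behind that expression can be worked through by hand. The sketch below is a simplification with a hypothetical function name and made-up counter values: it takes two readings of each core's idle counter and ignores the counter-reset handling and extrapolation that Prometheus performs.

```python
from typing import Sequence


def cpu_usage_pct(idle_start: Sequence[float], idle_end: Sequence[float], elapsed_seconds: float) -> float:
    """Compute host CPU usage from per-core idle counters (in seconds).

    Mirrors 100 - avg(per-core idle rate) * 100: each core's idle rate is the
    fraction of the window it spent idle, and usage is the remainder.
    """
    if elapsed_seconds <= 0 or not idle_start or len(idle_start) != len(idle_end):
        raise ValueError("need matching per-core samples and a positive window")
    idle_rates = [(end - start) / elapsed_seconds for start, end in zip(idle_start, idle_end)]
    avg_idle = sum(idle_rates) / len(idle_rates)
    return 100.0 - avg_idle * 100.0


# Two cores over a 300-second window: core 0 was idle for 60 s, core 1 for 90 s.
usage = cpu_usage_pct([1000.0, 2000.0], [1060.0, 2090.0], 300.0)
print(f"{usage:.1f}%")  # average idle fraction is 0.25, so usage is 75.0%
```

Because the result is an average across cores, one saturated core on a large host can be hidden; alert per core if single-threaded workloads matter.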

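Reload Prometheus configuration (the /-/reload endpoint is only available when Prometheus is started with the --web.enable-lifecycle flag):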

curl -X POST http://localhost:9090/-/reload

Verify the alert rule appears in the Prometheus UI under Alerts.

Forward Prometheus alerts to Alertmanager

Prometheus does not send notifications directly. Configure Alertmanager to route alerts to Slack, PagerDuty, or email.

Edit /etc/alertmanager/alertmanager.yml:

route:
  receiver: 'slack-notifications'
  group_by: ['alertname', 'instance']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 3h
receivers:
  - name: 'slack-notifications'
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK'
        channel: '#alerts'
        title: 'CPU Alert'
        text: "{{ range .Alerts }}{{ .Annotations.summary }}\n{{ .Annotations.description }}{{ end }}"

Restart Alertmanager to apply changes.

Step 4: Set Up CPU Alerts in Grafana

Grafana supports alerting directly from dashboard panels when connected to Prometheus, CloudWatch, or other data sources.

Create a Grafana alert from a CPU panel

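Open a Grafana dashboard with a CPU utilization panel. Edit the panel and navigate to the Alert tab. These steps follow Grafana's legacy dashboard alerting workflow; in newer Grafana versions that use unified alerting, create the rule under Alerting > Alert rules and use contact points in place of notification channels.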

Define the alert condition:

Query: Use the same PromQL or CloudWatch query as the panel
Condition: WHEN avg() OF query(A, 5m) IS ABOVE 75
Evaluate every: 1m
For: 5m

This configuration checks CPU every minute and fires the alert only after CPU exceeds 75% continuously for 5 minutes.
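
The interaction between "Evaluate every" and "For" can be sketched as a small state machine. This is a simplified model, not Grafana's implementation: the function name and sample values are hypothetical, and no-data and error states are left out.

```python
from typing import List, Sequence


def alert_states(samples: Sequence[float], threshold: float,
                 evaluate_every_s: int = 60, for_s: int = 300) -> List[str]:
    """Return the alert state after each evaluation.

    samples holds one (already averaged) query value per evaluation. The rule
    stays "pending" while the condition holds but has not yet held for for_s
    seconds, and any non-breaching evaluation resets it to "ok".
    """
    states = []
    breach_started_at = None
    for i, value in enumerate(samples):
        now = i * evaluate_every_s
        if value > threshold:
            if breach_started_at is None:
                breach_started_at = now  # first breaching evaluation starts the timer
            state = "alerting" if now - breach_started_at >= for_s else "pending"
        else:
            breach_started_at = None
            state = "ok"
        states.append(state)
    return states


# One value per minute; CPU crosses 75% at minute 1 and stays there.
print(alert_states([70, 80, 82, 85, 81, 79, 90, 60], threshold=75))
```

In this run the rule is pending for five evaluations and fires on the sixth breaching one, five minutes after the first breach, then returns to ok when CPU drops.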

Under Notifications, select your notification channel (Slack, PagerDuty, email). If you have not configured a channel, go to Alerting > Notification channels and create one.

Save the dashboard. The alert is now active.

Example Grafana alert rule in JSON

{
  "alert": {
    "conditions": [
      {
        "evaluator": {
          "params": [75],
          "type": "gt"
        },
        "operator": {
          "type": "and"
        },
        "query": {
          "params": ["A", "5m", "now"]
        },
        "reducer": {
          "params": [],
          "type": "avg"
        },
        "type": "query"
      }
    ],
    "executionErrorState": "keep_state",
    "for": "5m",
    "frequency": "1m",
    "handler": 1,
    "name": "High CPU Alert",
    "noDataState": "no_data",
    "notifications": []
  }
}

Step 5: Set Up CPU Alerts for Kubernetes Pods

Kubernetes environments require monitoring both node-level CPU and pod-level CPU relative to resource limits. A pod using 100% of its CPU limit may only be using 10% of the node’s total CPU capacity.

Monitor pod CPU with Prometheus and kube-state-metrics

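Deploy kube-state-metrics in your cluster to expose pod and node state metrics (the container_cpu_* metrics used in the rules below come from cAdvisor, which is built into the kubelet):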

kubectl apply -f https://github.com/kubernetes/kube-state-metrics/releases/download/v2.10.0/kube-state-metrics.yaml

Create a Prometheus alert rule for pod CPU usage:

groups:
  - name: kubernetes_cpu_alerts
    interval: 30s
    rules:
      - alert: PodCPUThrottling
        expr: rate(container_cpu_cfs_throttled_seconds_total[5m]) > 0.1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Pod {{ $labels.pod }} is being CPU throttled"
          description: "Pod {{ $labels.pod }} in namespace {{ $labels.namespace }} is experiencing CPU throttling."
      - alert: PodHighCPUUsage
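        # container!="" drops the pod-level cgroup series so usage and limits are not double counted
        expr: (sum(rate(container_cpu_usage_seconds_total{container!=""}[5m])) by (pod, namespace) / sum(container_spec_cpu_quota{container!=""} / container_spec_cpu_period{container!=""}) by (pod, namespace)) * 100 > 80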
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Pod {{ $labels.pod }} CPU usage is high"
          description: "Pod {{ $labels.pod }} in namespace {{ $labels.namespace }} is using {{ $value }}% of its CPU limit."

The first rule detects CPU throttling, which happens when a container exhausts its CPU quota and the Linux CFS scheduler pauses it until the next scheduling period. The second rule alerts when a pod uses more than 80% of its defined CPU limit.
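
Both rules reduce to simple arithmetic on cAdvisor values. The sketch below uses hypothetical function names and made-up sample numbers to show how a CFS quota and period translate into a core limit, and how the throttling counter becomes a per-second rate.

```python
def cpu_limit_cores(cfs_quota_us: float, cfs_period_us: float) -> float:
    """Convert a CFS quota/period pair (microseconds) into a CPU limit in cores."""
    if cfs_period_us <= 0:
        raise ValueError("cfs_period_us must be positive")
    return cfs_quota_us / cfs_period_us


def pct_of_limit(usage_cores: float, cfs_quota_us: float, cfs_period_us: float) -> float:
    """Express CPU usage (in cores) as a percentage of the container's limit."""
    return usage_cores / cpu_limit_cores(cfs_quota_us, cfs_period_us) * 100.0


def throttled_rate(throttled_start_s: float, throttled_end_s: float, window_s: float) -> float:
    """Seconds spent throttled per second of wall time over the window."""
    return (throttled_end_s - throttled_start_s) / window_s


# A 200000 us quota per 100000 us period is a 2-core limit.
print(cpu_limit_cores(200_000, 100_000))
# 1.7 cores of usage against that limit is above the 80% threshold.
print(pct_of_limit(1.7, 200_000, 100_000) > 80)
# The throttled counter grew by 45 s in a 300 s window, above the 0.1 threshold.
print(throttled_rate(120.0, 165.0, 300.0) > 0.1)
```

A container can trip the throttling rule while its average usage stays well under the limit, because short bursts can exhaust the quota within a single period.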

Set CPU alerts for Kubernetes nodes

Kubernetes does not report a CPUPressure node condition (the built-in pressure conditions are MemoryPressure, DiskPressure, and PIDPressure), and because CPU is a compressible resource, pods are throttled rather than evicted when a node runs short of it. Monitor node CPU utilization from node_exporter instead to detect capacity problems before workloads are starved. The 85% threshold below is illustrative; tune it to your baseline:

- alert: NodeHighCPUUsage
  expr: 100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 85
  for: 5m
  labels:
    severity: critical
  annotations:
    summary: "Node {{ $labels.instance }} CPU usage is high"
    description: "Kubernetes node {{ $labels.instance }} has used more than 85% CPU for 5 minutes. Pods on this node may be starved of CPU."

Step 6: Set Up CPU Alerts with CubeAPM

CubeAPM provides infrastructure monitoring with native support for host, container, and Kubernetes CPU metrics. It runs inside your VPC or on-prem, keeping all telemetry data local while delivering managed observability.

Configure CPU alerts in CubeAPM

CubeAPM automatically collects CPU metrics from OpenTelemetry, Prometheus exporters, and native Kubernetes integrations once agents are deployed.

Navigate to Alerts > Create Alert in the CubeAPM dashboard.

Select Infrastructure as the alert type and define the condition:

Metric: host.cpu.utilization or k8s.pod.cpu.utilization
Aggregation: avg
Threshold: > 75
Evaluation window: 5 minutes
Consecutive breaches: 2

Add notification channels (Slack, PagerDuty, email, webhook) and save the alert.

CubeAPM correlates CPU alerts with application traces, logs, and deployment events, giving full context when an alert fires. If CPU spikes correlate with a recent deployment or slow database query, CubeAPM surfaces that connection automatically.

Why CubeAPM simplifies CPU alerting

Unlike Prometheus or Grafana where you build and maintain alert rules manually, CubeAPM provides pre-configured alert templates for common scenarios including CPU, memory, disk, and network thresholds. Alerts auto-populate with contextual data like affected pods, services, and recent changes.

CubeAPM runs on your infrastructure with no data egress, making it suitable for regulated environments where telemetry cannot leave the VPC. Pricing is $0.15/GB of ingested data with unlimited retention and no per-host or per-user fees.

For teams running AWS Lambda monitoring, CubeAPM also tracks Lambda invocation CPU time and memory usage alongside traditional infrastructure metrics.

Step 7: Configure Notification Channels

CPU alerts are only useful if the right people see them in time. Configure notification channels that integrate with your team’s existing workflow.

Slack integration

Most monitoring platforms support Slack webhooks. Create an incoming webhook in your Slack workspace under Apps > Incoming Webhooks.

Copy the webhook URL and paste it into your monitoring platform’s notification settings. Test the integration to ensure alerts appear in the correct channel.

PagerDuty integration

For on-call rotations, integrate with PagerDuty to route critical CPU alerts to the engineer on duty.

Create a PagerDuty integration key for your monitoring platform under Services > Service Directory > New Service. Choose the integration type (Prometheus, CloudWatch, Grafana, or CubeAPM).

Add the integration key to your alerting platform under notification channels.

Email alerts

Email remains the most universal notification method. Configure SMTP settings in your monitoring platform and create an email notification channel.

Route different severity levels to different recipients. Warning-level CPU alerts can go to a monitoring alias. Critical alerts can page on-call engineers directly.

Webhook for custom workflows

Use webhooks to route alerts to custom dashboards, ticketing systems, or automation workflows. Webhook payloads include alert metadata like instance ID, metric value, timestamp, and severity level.

Example webhook payload from Prometheus Alertmanager:

{
  "alerts": [
    {
      "status": "firing",
      "labels": {
        "alertname": "HighCPUUsage",
        "instance": "web-server-01",
        "severity": "warning"
      },
      "annotations": {
        "summary": "High CPU usage on web-server-01",
        "description": "CPU usage is 82% on instance web-server-01 for more than 5 minutes."
      },
      "startsAt": "2026-04-15T10:32:00Z"
    }
  ]
}
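
A receiver for this payload might look like the following sketch. The route targets and the function name are hypothetical, and only the fields shown in the example payload are read.

```python
import json
from typing import Dict, List

# Hypothetical routing table: severity label -> destination name.
ROUTES = {"warning": "monitoring-alias", "critical": "on-call-pager"}

EXAMPLE_PAYLOAD = """
{
  "alerts": [
    {
      "status": "firing",
      "labels": {"alertname": "HighCPUUsage", "instance": "web-server-01", "severity": "warning"},
      "annotations": {"summary": "High CPU usage on web-server-01"},
      "startsAt": "2026-04-15T10:32:00Z"
    }
  ]
}
"""


def route_alerts(payload_json: str) -> List[Dict[str, str]]:
    """Parse a webhook payload and pick a destination for each firing alert."""
    payload = json.loads(payload_json)
    routed = []
    for alert in payload.get("alerts", []):
        if alert.get("status") != "firing":
            continue  # resolved alerts are skipped in this sketch
        labels = alert.get("labels", {})
        severity = labels.get("severity", "warning")
        routed.append({
            "target": ROUTES.get(severity, ROUTES["warning"]),  # unknown severities fall back to warning
            "instance": labels.get("instance", "unknown"),
            "summary": alert.get("annotations", {}).get("summary", ""),
        })
    return routed


for item in route_alerts(EXAMPLE_PAYLOAD):
    print(item["target"], item["instance"], item["summary"])
```

Reading fields with defaults keeps the handler from failing when an alert rule omits a label or annotation.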

Troubleshooting Common Issues

False positives from auto-scaling events

Auto-scaling triggers brief CPU spikes as new instances warm up or existing instances drain connections. These spikes resolve within minutes and should not trigger alerts.

Solution: Set evaluation periods to require sustained threshold breaches. Use for: 5m in Prometheus or 2 out of 3 datapoints in CloudWatch to filter out short-lived spikes.

CPU alerts firing during known batch jobs

Scheduled batch processing jobs intentionally drive CPU to high levels during execution windows. Alerting during these periods creates noise.

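Solution: Use alert suppression windows or maintenance mode during scheduled jobs. In Prometheus Alertmanager, inhibition rules match on alerts rather than raw metrics, so define a companion alert that fires while a job_running metric is active and add an inhibition rule that mutes CPU alerts while that alert is firing. Scheduled silences are an alternative for fixed job windows.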
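
For custom notification handlers, a fixed suppression window can be checked in a few lines. This is a sketch with hypothetical names and a made-up nightly window of 01:00 to 03:00 UTC; it expects timezone-aware timestamps.

```python
from datetime import datetime, time, timezone

# Hypothetical batch schedule: (start, end) pairs in UTC, end exclusive.
BATCH_WINDOWS_UTC = [(time(1, 0), time(3, 0))]


def in_batch_window(ts: datetime) -> bool:
    """Return True when a timezone-aware timestamp falls inside a batch window."""
    current = ts.astimezone(timezone.utc).time()
    return any(start <= current < end for start, end in BATCH_WINDOWS_UTC)


def should_notify(alertname: str, ts: datetime) -> bool:
    """Suppress HighCPUUsage during batch windows; let every other alert through."""
    return not (alertname == "HighCPUUsage" and in_batch_window(ts))


during_job = datetime(2026, 4, 15, 2, 0, tzinfo=timezone.utc)
daytime = datetime(2026, 4, 15, 10, 32, tzinfo=timezone.utc)
print(should_notify("HighCPUUsage", during_job))  # False: suppressed during the job
print(should_notify("HighCPUUsage", daytime))     # True: outside the window
```

Only the CPU alert is suppressed, so unrelated alerts raised during the batch window still reach the team.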

Missing data causing alert state changes

If the monitoring agent stops reporting metrics, some platforms treat missing data as a breach and fire alerts. This creates false positives during network interruptions or agent restarts.

Solution: Configure missing data treatment explicitly. In CloudWatch, choose “Treat missing data as missing” instead of “Treat missing data as breaching.” In Prometheus, use absent() queries to detect metric disappearance separately from threshold breaches.

CPU throttling in containers not triggering alerts

Containers hitting CPU limits experience throttling without necessarily maxing out host CPU. Standard CPU utilization metrics miss this condition.

Solution: Monitor container_cpu_cfs_throttled_seconds_total in Kubernetes environments. This counter accumulates the total time a container has spent throttled after hitting its CPU quota. Alert when the rate of throttling exceeds acceptable levels.

Alert fatigue from overly sensitive thresholds

Setting CPU alert thresholds too low generates constant alerts during normal traffic patterns, leading teams to ignore or mute them.

Solution: Review historical CPU patterns over at least 7 days before setting thresholds. Use percentile-based thresholds (e.g., alert when CPU exceeds the 95th percentile of the past 30 days) to account for normal variability. Adjust thresholds after the first week of alerts to reduce noise.
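
A percentile-based threshold can be derived from historical samples as sketched below. The function name and the sample history are hypothetical, and the nearest-rank method is one of several percentile definitions; monitoring platforms may interpolate differently.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))  # 1-based rank
    return ordered[rank - 1]


# Stand-in for historical CPU readings (percent); real input would span days.
history = [40, 42, 45, 50, 52, 55, 58, 60, 63, 88]
threshold = percentile(history, 95)
print(threshold)                # the 95th percentile of this sample is 88
print(percentile(history, 50))  # the median under nearest-rank is 52
```

With few samples the 95th percentile is just the maximum, so use enough history for the percentile to reflect typical peaks rather than a single outlier.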

Conclusion

CPU alerts protect application performance by detecting resource saturation before it causes user-facing impact. The threshold, evaluation period, and notification routing must match your workload type and team structure. A web API server needs different alert rules than a batch processing pipeline.

Start with conservative thresholds, monitor alert frequency for the first week, and adjust based on false positive rate. Tools like CubeAPM simplify this process by correlating CPU spikes with deployments, slow queries, and pod events automatically, reducing the time spent investigating whether an alert requires action.

Disclaimer: The information in this article reflects the latest details available at the time of publication and may change as technologies and products evolve. Features, pricing, and plan limits can change over time. Always verify the latest information directly with the vendor before making purchasing or deployment decisions.

Frequently Asked Questions

What is a good CPU threshold for alerts?

For web servers and APIs, alert at 70–75% sustained CPU. For batch processing, alert at 85–90% only if sustained beyond expected job duration. Thresholds depend entirely on workload type and historical patterns.

How do I avoid false positive CPU alerts?

Set evaluation periods to require sustained threshold breaches across multiple consecutive data points. Use `for: 5m` in Prometheus or `2 out of 3 datapoints` in CloudWatch to filter out brief spikes from auto-scaling or cache rebuilds.

Should I monitor CPU utilization or CPU throttling?

Monitor both. CPU utilization shows overall resource consumption. CPU throttling in containers shows when workloads hit their CPU limit and are being restricted by the scheduler, which can degrade performance even if host CPU appears low.

What is the difference between average and maximum CPU in alerts?

Average smooths out short spikes and reflects sustained load. Maximum catches peak usage during brief bursts. Use average for most workloads and maximum only when brief spikes cause user-facing impact.

How do I set CPU alerts for Kubernetes pods?

Monitor pod CPU relative to its defined limit, not host CPU. Alert when a pod uses more than 80% of its CPU limit or when CPU throttling rate exceeds acceptable levels. Use kube-state-metrics and cAdvisor for accurate pod-level metrics.

Can I set dynamic CPU thresholds based on traffic patterns?

Yes, some platforms support anomaly detection or percentile-based thresholds. Prometheus can calculate historical percentiles with `quantile_over_time()`. CubeAPM includes built-in anomaly detection that learns normal CPU patterns and alerts on deviations.

What should I do when a CPU alert fires?

Check recent deployments, database query performance, and traffic patterns. Correlate CPU spikes with application traces and logs to identify root cause. Scale infrastructure if CPU pressure is driven by legitimate load growth, or optimize code if caused by inefficient queries or memory leaks.
