CubeAPM

How to Build a RabbitMQ Grafana Dashboard From Scratch 

Setting up a RabbitMQ Grafana dashboard requires three things working together: the built-in rabbitmq_prometheus plugin to expose metrics, Prometheus to scrape and store them, and Grafana to visualize and alert. This guide covers each step end-to-end, from enabling the plugin to building custom panels for per-queue monitoring.

Key Takeaways

  • The rabbitmq_prometheus plugin ships with RabbitMQ 3.8 and above. No third-party exporter is needed. The current RabbitMQ release series is 4.x.
  • Metrics are exposed on port 15692 at /metrics (aggregated) or /metrics/detailed (per-object, higher overhead).
  • The RabbitMQ team publishes six official Grafana dashboards. Start with ID 10991 (Overview), which covers every metric the management UI shows, with node-level cluster hotspot detection.
  • Queue depth, unacknowledged message count, node memory, and disk free space catch the vast majority of production incidents.
  • The official dashboards use hardcoded 60-second query ranges. Keep your Prometheus scrape interval at 15 to 30 seconds, or panels will show no data.
  • For RabbitMQ 3.7 and below, use the community rabbitmq_exporter at github.com/kbudde/rabbitmq_exporter.

Step 1: Enable the Prometheus Plugin

From RabbitMQ 3.8 onward, the plugin is bundled. Enable it without a broker restart:

rabbitmq-plugins enable rabbitmq_prometheus

Verify it is running:

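curl -s http://localhost:15692/metrics | head -20

You should see Prometheus exposition text: # HELP and # TYPE comment lines followed by metric samples. The first lines may be Erlang runtime metrics rather than broker metrics, so search the output for a name such as rabbitmq_queue_messages to confirm that RabbitMQ metrics are present. If the endpoint returns nothing, confirm the plugin is active: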

rabbitmq-plugins list | grep prometheus
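
The same check can be scripted. The sketch below is a minimal illustration in Python, assuming the endpoint serves the standard Prometheus text format. The helper names (fetch_metrics, metric_names, missing_metrics) and the sample text are hypothetical; they are not part of RabbitMQ or Prometheus.

```python
from urllib.request import urlopen


def fetch_metrics(url="http://localhost:15692/metrics", timeout=5):
    """Download the raw exposition text from the plugin endpoint."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def metric_names(exposition_text):
    """Collect metric names declared on '# TYPE <name> <type>' lines."""
    names = set()
    for line in exposition_text.splitlines():
        parts = line.split()
        if len(parts) >= 4 and parts[0] == "#" and parts[1] == "TYPE":
            names.add(parts[2])
    return names


def missing_metrics(exposition_text, required):
    """Return the required metric names that the endpoint did not declare."""
    return sorted(set(required) - metric_names(exposition_text))


# Hand-written sample for illustration; real output is much longer.
SAMPLE = """\
# TYPE rabbitmq_queues gauge
# HELP rabbitmq_queues Queues available
rabbitmq_queues 3
"""
print(missing_metrics(SAMPLE, ["rabbitmq_queues", "rabbitmq_connections"]))
```

Against a live broker, missing_metrics(fetch_metrics(), [...]) would perform the same comparison on real output.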

If you are running RabbitMQ in Docker, mount a config file and an enabled_plugins file into the container:

services:
  rabbitmq:
    image: rabbitmq:4-management
    ports:
      - "5672:5672"    # AMQP
      - "15672:15672"  # management UI
      - "15692:15692"  # Prometheus metrics
    volumes:
      - ./rabbitmq.conf:/etc/rabbitmq/rabbitmq.conf
      - ./enabled_plugins:/etc/rabbitmq/enabled_plugins

Where enabled_plugins contains:

[rabbitmq_management,rabbitmq_prometheus].

If you are using the Kubernetes RabbitMQ Cluster Operator, the plugin is enabled automatically on every node. You only need to expose port 15692 via a Service, which is covered in Step 2.

Step 2: Configure Prometheus to Scrape RabbitMQ

Add a scrape job to your prometheus.yml. List each node separately so Grafana can identify per-node metrics and spot cluster imbalances. The RabbitMQ team recommends a scrape interval of 15 to 30 seconds for production:

scrape_configs:
  - job_name: rabbitmq
    scrape_interval: 15s
    static_configs:
      - targets:
          - rabbitmq-node1:15692
          - rabbitmq-node2:15692
          - rabbitmq-node3:15692
        labels:
          cluster: production
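
The scrape interval matters because rate() only returns a value when at least two samples fall inside its range window. The sketch below is a simplified illustration of that constraint for a 60-second window. It ignores scrape jitter and failed scrapes, and the function names are hypothetical.

```python
def guaranteed_samples(range_seconds, scrape_interval_seconds):
    """Samples that always fall inside a range window, whatever its alignment."""
    return int(range_seconds // scrape_interval_seconds)


def rate_can_compute(range_seconds, scrape_interval_seconds):
    """rate() needs at least two samples inside the range to return a value."""
    return guaranteed_samples(range_seconds, scrape_interval_seconds) >= 2


# Compare candidate scrape intervals against a 60-second query range.
for interval in (15, 30, 45, 60):
    print(interval, rate_can_compute(60, interval))
```

With a 30-second interval the window holds exactly two guaranteed samples, so a single missed scrape can leave a gap. A 15-second interval gives more headroom.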

For Kubernetes with Prometheus Operator, expose the metrics port on your RabbitMQ Service first:

apiVersion: v1
kind: Service
metadata:
  name: rabbitmq-prometheus
  namespace: rabbitmq
  labels:
    app: rabbitmq
spec:
  ports:
    - name: prometheus
      port: 15692
      targetPort: 15692
  selector:
    app: rabbitmq

Then create a ServiceMonitor:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: rabbitmq
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: rabbitmq
  namespaceSelector:
    matchNames:
      - rabbitmq
  endpoints:
    - port: prometheus
      interval: 15s

After applying, navigate to http://prometheus:9090/targets and confirm the RabbitMQ targets show status UP.
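
The targets page is backed by the Prometheus HTTP API endpoint /api/v1/targets, so the same check can be automated. The sketch below works on an already-decoded response body. The function name and the example payload are hypothetical, and the payload is trimmed to the fields the function reads.

```python
import json


def unhealthy_targets(targets_payload, job="rabbitmq"):
    """List (scrape URL, last error) for targets of a job that are not 'up'."""
    problems = []
    for target in targets_payload["data"]["activeTargets"]:
        if target["labels"].get("job") != job:
            continue
        if target["health"] != "up":
            problems.append((target["scrapeUrl"], target.get("lastError", "")))
    return problems


# Trimmed, hand-written example of a /api/v1/targets response body.
EXAMPLE = json.loads("""
{"status": "success", "data": {"activeTargets": [
  {"labels": {"job": "rabbitmq", "instance": "rabbitmq-node1:15692"},
   "scrapeUrl": "http://rabbitmq-node1:15692/metrics",
   "health": "up", "lastError": ""},
  {"labels": {"job": "rabbitmq", "instance": "rabbitmq-node2:15692"},
   "scrapeUrl": "http://rabbitmq-node2:15692/metrics",
   "health": "down", "lastError": "connection refused"}
]}}
""")
print(unhealthy_targets(EXAMPLE))
```

An empty list means every target of the job reported healthy on its last scrape.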

Step 3: Add Prometheus as a Data Source in Grafana

  1. Go to Connections → Data sources → Add new data source
  2. Select Prometheus
  3. Set the URL to your Prometheus server (e.g. http://prometheus:9090)
  4. Click Save & test. You should see a success message such as “Data source is working” (the exact wording varies by Grafana version)

If you are provisioning Grafana via config files, create /etc/grafana/provisioning/datasources/prometheus.yml:

apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
    editable: true

Once the data source is connected, run a quick validation query in Explore:

rabbitmq_queues

If this returns a value, your pipeline from RabbitMQ through Prometheus to Grafana is working end-to-end.
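
The same validation can be run outside Grafana through the Prometheus instant-query endpoint, /api/v1/query. The sketch below builds the request URL and reads values from a decoded response. The helper names and the example payload are hypothetical, and http://prometheus:9090 is the placeholder address used above.

```python
import json
from urllib.parse import urlencode


def instant_query_url(base_url, promql):
    """Build the URL for the Prometheus instant-query endpoint."""
    return base_url.rstrip("/") + "/api/v1/query?" + urlencode({"query": promql})


def vector_values(query_payload):
    """Extract float sample values from an instant-vector query response."""
    if query_payload.get("status") != "success":
        raise ValueError("query failed: %r" % query_payload.get("error"))
    return [float(sample["value"][1]) for sample in query_payload["data"]["result"]]


# Trimmed, hand-written example of a successful response body.
EXAMPLE = json.loads("""
{"status": "success", "data": {"resultType": "vector", "result": [
  {"metric": {"__name__": "rabbitmq_queues", "instance": "rabbitmq-node1:15692"},
   "value": [1700000000, "3"]}
]}}
""")
print(instant_query_url("http://prometheus:9090", "rabbitmq_queues"))
print(vector_values(EXAMPLE))
```

An empty result list would mean Prometheus has no samples for the metric yet, which points back at the scrape job rather than at Grafana.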

Step 4: Import the Official RabbitMQ Dashboards

The RabbitMQ team maintains six official dashboards on Grafana Labs. Start with these before building anything custom.

| Dashboard | Grafana ID | What it covers |
| --- | --- | --- |
| RabbitMQ-Overview | 10991 | Queues, connections, channels, message rates, memory, disk, node health. The primary day-to-day dashboard |
| RabbitMQ-Quorum-Queues-Raft | 11340 | Raft consensus state for quorum queues. Essential if you use quorum queues in production |
| Erlang-Distribution | 11352 | Inter-node communication links and buffers. Use when diagnosing cluster split or connectivity issues |
| Erlang-Memory-Allocators | 11350 | BEAM VM memory subsystem. Use when debugging memory-related performance regressions |
| RabbitMQ-Stream | 14798 | Stream protocol message rates and errors. Use if you are running RabbitMQ Streams |
| RabbitMQ-PerfTest | 6566 | Performance testing metrics. Useful during load testing, not for production monitoring |

To import any dashboard: go to Dashboards → Import, enter the ID, select your Prometheus data source, and click Import.

The Overview dashboard (10991) is what most teams use daily. The Raft, Erlang Distribution, and Erlang Memory Allocators dashboards are diagnostic tools to reach for when something specific is wrong.

Step 5: Key Metrics Reference

Before building custom panels, understand which metrics matter and what a rising value means in production.

| Metric | What it measures | Why it matters |
| --- | --- | --- |
| rabbitmq_queue_messages_ready | Messages waiting to be consumed | A rising trend means consumers are falling behind producers |
| rabbitmq_queue_messages_unacked | Delivered but not yet acknowledged | A plateau points to slow or stuck consumers |
| rabbitmq_channel_messages_published_total | Total messages published (counter) | Use rate() to get publish rate per second |
| rabbitmq_channel_messages_delivered_total | Total messages delivered (counter) | Compare with publish rate to spot a growing backlog |
| rabbitmq_connections | Total open connections | Sudden spikes indicate connection leaks |
| rabbitmq_consumers | Total consumers across all queues | A queue with zero consumers is a silent failure |
| rabbitmq_process_resident_memory_bytes | Node resident memory | RabbitMQ throttles publishers when a memory alarm fires (default at 40% of system RAM) |
| rabbitmq_disk_space_available_bytes | Free disk on each node | Disk alarms block all publishing when free space drops below the configured threshold |

Useful PromQL queries to add as panels:

# Publish rate (5-minute window)
rate(rabbitmq_channel_messages_published_total[5m])

# Deliver rate
rate(rabbitmq_channel_messages_delivered_total[5m])

# Total unacknowledged messages across all queues
sum(rabbitmq_queue_messages_unacked)

# Memory usage as a percentage of limit
rabbitmq_process_resident_memory_bytes / rabbitmq_resident_memory_limit_bytes * 100

# Queue depth for a specific queue
rabbitmq_queue_messages_ready{queue="your-queue-name"}
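
To make the rate() queries above concrete, the sketch below computes a per-second increase from raw counter samples. It is a deliberate simplification and not the Prometheus algorithm: the real rate() also extrapolates to the window boundaries. The function name and sample values are hypothetical.

```python
def simple_rate(samples):
    """Per-second increase of a counter over (timestamp, value) samples.

    A drop in value is treated as a counter reset (restart from zero).
    """
    if len(samples) < 2:
        return None  # like rate(), no result without two samples
    increase = 0.0
    previous = samples[0][1]
    for _, value in samples[1:]:
        # After a reset, the new value is itself the increase since restart.
        increase += value if value < previous else value - previous
        previous = value
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed


# Three scrapes, 15 seconds apart, of a published-messages counter.
print(simple_rate([(0, 100), (15, 130), (30, 160)]))
```

The counter grows by 60 messages over 30 seconds here, so the result is 2.0 messages per second. This is why panels plot rate() of a *_total counter and not the raw counter value.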

Step 6: Build Custom Panels for Per-Queue Monitoring

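The official dashboards show cluster-level data. For per-queue visibility, add Grafana template variables. The queue and vhost labels used here exist only on per-object metrics; the default aggregated /metrics output carries no queue label. Make sure Prometheus scrapes per-object metrics first, for example by setting prometheus.return_per_object_metrics = true in rabbitmq.conf, and expect a larger scrape payload on brokers with many queues.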

Add a queue selector variable:

  1. Go to Dashboard settings → Variables → Add variable
  2. Name: queue
  3. Type: Query
  4. Data source: your Prometheus instance
  5. Query: label_values(rabbitmq_queue_messages_ready, queue)
  6. Enable Multi-value and Include All option
  7. Save

Add a vhost variable using the same steps with this query:

label_values(rabbitmq_queue_messages_ready, vhost)

Then reference both variables in your panel queries:

rabbitmq_queue_messages_ready{queue=~"$queue", vhost=~"$vhost"}

This lets any engineer drill into any queue from a dropdown without duplicating panels across the dashboard.
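
When several queues are selected, the variable is interpolated into the =~ matcher as a regular-expression alternation, and PromQL anchors that expression to the whole label value. The sketch below illustrates that matching behaviour with Python's re module. It is an approximation: Grafana's exact escaping and the RE2 engine used by Prometheus may differ in detail, and the function names and queue names are hypothetical.

```python
import re


def multi_value_pattern(selected):
    """Join selected values into one alternation, escaping regex characters."""
    return "|".join(re.escape(value) for value in selected)


def selector_matches(selected, label_value):
    """PromQL's =~ is fully anchored, so the whole label value must match."""
    return re.fullmatch(multi_value_pattern(selected), label_value) is not None


chosen = ["orders.created", "payments"]
print(selector_matches(chosen, "payments"))        # an exact selection
print(selector_matches(chosen, "payments-retry"))  # only a prefix matches
print(selector_matches(chosen, "ordersXcreated"))  # the dot is literal
```

Because the match is anchored, selecting payments does not also pull in payments-retry. Each queue has to be selected explicitly, or covered by the All option.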

Step 7: Set Up Alerts

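Save the rules below in a Prometheus rules file and reference that file under rule_files in prometheus.yml. The queue-level rules rely on the queue and vhost labels, which are only present on per-object metrics. Node-level rules identify the node through the instance label that Prometheus attaches to every scraped target:

groups:
  - name: rabbitmq
    rules:
      # Backlog: ready messages stay above the threshold for 5 minutes
      - alert: RabbitMQQueueDepthHigh
        expr: rabbitmq_queue_messages_ready > 10000
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "RabbitMQ queue depth is high"
          description: "Queue {{ $labels.queue }} on vhost {{ $labels.vhost }} has {{ $value }} ready messages."
      # Per-queue consumer count (not the broker-wide rabbitmq_consumers total)
      - alert: RabbitMQNoConsumers
        expr: rabbitmq_queue_consumers == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "RabbitMQ queue has no consumers"
          description: "Queue {{ $labels.queue }} has no active consumers."
      # Resident memory as a percentage of the configured memory limit
      - alert: RabbitMQNodeMemoryHigh
        expr: >
          rabbitmq_process_resident_memory_bytes /
          rabbitmq_resident_memory_limit_bytes * 100 > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "RabbitMQ node memory above 80% of limit"
          description: "Node {{ $labels.instance }} is using {{ $value | humanize }}% of its memory limit."
      # 1073741824 bytes = 1 GiB
      - alert: RabbitMQDiskSpaceLow
        expr: rabbitmq_disk_space_available_bytes < 1073741824
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "RabbitMQ node disk space below 1 GB"
          description: "Node {{ $labels.instance }} has {{ $value | humanize }}B free disk space."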

Set the rabbitmq_queue_messages_ready threshold based on your publish rate and SLA. The formula is: acceptable delay in seconds multiplied by publish rate in messages per second. A queue receiving 500 messages per second with a 30-second tolerance gives a threshold of 15,000.
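
The threshold rule above is simple enough to keep next to your alert definitions as a small helper. This is a minimal sketch; the function and parameter names are hypothetical.

```python
def queue_depth_threshold(rate_per_second, acceptable_delay_seconds):
    """Ready-message count that corresponds to the tolerated delay."""
    return rate_per_second * acceptable_delay_seconds


# The worked example from the text: 500 per second, 30-second tolerance.
print(queue_depth_threshold(500, 30))
```

Recompute the value whenever the expected traffic or the SLA changes, then update the alert expression to match.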

Common Setup Problems

| Problem | Likely cause | Fix |
| --- | --- | --- |
| Port 15692 returns nothing | Plugin not enabled | Run rabbitmq-plugins enable rabbitmq_prometheus and verify with rabbitmq-plugins list |
| Dashboard panels show “No data” | Scrape interval too long | Keep scrape interval at 15 to 30 seconds. The official dashboards use hardcoded 60-second query ranges and will return no data if the interval exceeds 60 seconds |
| Grafana shows “No data” after import | Wrong data source selected during import | Edit each panel and confirm the data source is set to your Prometheus instance |
| Per-node panels show all nodes merged | Nodes not listed individually in scrape config | List each node as a separate target in prometheus.yml rather than pointing to a load-balanced endpoint |
| Memory alarm fires unexpectedly | Confusing system RAM with RabbitMQ’s memory limit | RabbitMQ’s memory alarm fires at 40% of system RAM by default, not at total RAM. Query rabbitmq_resident_memory_limit_bytes to see the actual threshold in use |
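
The memory row is easier to reason about with numbers. The sketch below assumes the 40% default stated in the table and a hypothetical node with 16 GiB of RAM. The function names are illustrative, and the percentage mirrors the memory PromQL expression used earlier.

```python
GIB = 1024 ** 3


def memory_alarm_limit_bytes(total_ram_bytes, watermark=0.4):
    """Memory level at which the alarm fires (watermark is an assumption)."""
    return total_ram_bytes * watermark


def memory_used_percent(resident_bytes, limit_bytes):
    """Resident memory as a percentage of the alarm limit."""
    return resident_bytes / limit_bytes * 100


limit = memory_alarm_limit_bytes(16 * GIB)
print(round(limit / GIB, 1))                           # limit in GiB
print(round(memory_used_percent(4 * GIB, limit), 1))   # percent of limit
```

On this hypothetical node, 4 GiB of resident memory is only a quarter of the RAM but already 62.5% of the alarm limit. That gap is why the alert compares against rabbitmq_resident_memory_limit_bytes and not against total RAM.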

How Do You Know Which Consumer Is Slow, Not Just That a Queue Is Growing?

Your Grafana dashboard fires a queue depth alert and shows the rabbitmq_queue_messages_ready graph climbing. That tells you something is wrong. It does not tell you which consumer instance is falling behind, which message type is taking the longest to process, or whether the bottleneck is inside the consumer’s own code or a downstream service it is waiting on.

When a queue depth alert fires, the next question is always the same: is this a throughput problem (not enough consumer instances), a processing problem (each message takes too long), or a downstream dependency problem (the consumer is blocked waiting on a database, an external API, or another service)?

CubeAPM instruments your RabbitMQ consumer application via OpenTelemetry and captures each message processing cycle as a span in the full distributed trace. When an alert fires, CubeAPM shows you which consumer instance is slowest, how long each message takes end-to-end through the system, which downstream service calls are consuming the most time per message, and whether the slowdown is concentrated on specific message types or is evenly spread. The Prometheus alert tells you something is wrong. CubeAPM tells you what and where. It runs self-hosted inside your own infrastructure at $0.15/GB ingestion pricing, so no data leaves your environment.

Summary

Building a RabbitMQ Grafana dashboard from scratch comes down to four steps: enable the rabbitmq_prometheus plugin, point Prometheus at port 15692 on each node, connect Prometheus as a Grafana data source, and import the official Overview dashboard (ID 10991) as your baseline. From there, add template variables for per-queue filtering and build alert rules based on your actual write rate rather than arbitrary numbers.

| Step | What to do | Key detail |
| --- | --- | --- |
| Enable plugin | rabbitmq-plugins enable rabbitmq_prometheus | No broker restart needed. Port 15692, path /metrics |
| Configure Prometheus | Add scrape job targeting each node on port 15692 | Scrape interval: 15 to 30 seconds. List nodes individually |
| Add data source in Grafana | Connections → Data sources → Prometheus | Validate with rabbitmq_queues query in Explore |
| Import official dashboards | Dashboards → Import, start with ID 10991 | Six official dashboards at grafana.com/orgs/rabbitmq |
| Add key metric panels | Queue depth, unacked messages, memory, disk | Use rate() on counter metrics for throughput panels |
| Add per-queue variables | label_values(rabbitmq_queue_messages_ready, queue) | Enables per-queue and per-vhost filtering from dropdowns |
| Set up alerts | Alert on queue depth, no consumers, memory, disk | Threshold = SLA seconds multiplied by publish rate in messages per second |

Disclaimer: Metric names, plugin flags, and dashboard IDs are verified against RabbitMQ 4.3 documentation (rabbitmq.com/docs/prometheus) and the official Grafana dashboard marketplace as of May 2026.
