You have two paths for monitoring GKE clusters with Prometheus and Grafana. The first is self-hosted: deploy the kube-prometheus-stack Helm chart into your cluster, which installs Prometheus, Grafana, Alertmanager, kube-state-metrics, and node-exporter together with pre-built dashboards and alert rules. The second is Google Managed Service for Prometheus (GMP): let Google run the Prometheus collectors and storage, and connect Grafana to GMP as a data source.
Both approaches use PromQL, support the same Grafana dashboards, and give you the same GKE metrics. The choice is an operational one: self-hosted gives you full control and no additional GCP cost for metrics ingestion; GMP removes the operational overhead of running Prometheus at scale but bills per metric sample ingested.
Key Takeaways
- GKE requires an extra RBAC step before deploying kube-prometheus-stack – without it, ClusterRole creation fails with a Forbidden error
- kube-prometheus-stack is the fastest path to a complete self-hosted setup – it deploys Prometheus, Grafana, Alertmanager, node-exporter, kube-state-metrics, and default alert rules in a single Helm command
- Google Managed Service for Prometheus is enabled by default on new GKE clusters (Autopilot from GKE 1.25 and Standard from GKE 1.27, per Google's documentation) – check whether it is already collecting metrics before deploying a second Prometheus stack
- kube-state-metrics and cAdvisor cover different things: kube-state-metrics exposes object state (desired vs actual replica count, pod status), cAdvisor exposes resource usage (CPU, memory per container) – you need both
- Grafana dashboard ID 315 (Kubernetes cluster monitoring) and ID 6417 (Kubernetes Cluster) are the most widely used community dashboards for GKE
- Enable persistent storage for both Prometheus and Grafana in production – by default both use ephemeral emptyDir volumes, so data is lost when the pod is deleted or rescheduled
Option 1: Self-Hosted kube-prometheus-stack
Step 1: Fix GKE RBAC Before Installing
GKE restricts the ability to create ClusterRoles and ClusterRoleBindings unless your Google identity explicitly has cluster-admin. This is a known GKE-specific requirement. Without this step, the kube-prometheus-stack installation fails with a Forbidden error on ClusterRole creation.
# Look up the active gcloud account (set ACCOUNT by hand if you deploy as a different identity)
ACCOUNT=$(gcloud info --format='value(config.account)')
kubectl create clusterrolebinding owner-cluster-admin-binding \
--clusterrole cluster-admin \
--user "$ACCOUNT"
Run this once before any Helm installation that creates ClusterRoles.
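Before moving on, it can help to confirm that the binding took effect. The following is a minimal sketch, assuming kubectl is already pointed at the target cluster; the function name check_cluster_admin is illustrative and not part of any tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: checks that the current identity may create the
# cluster-scoped RBAC objects that kube-prometheus-stack needs.
check_cluster_admin() {
  local resource
  for resource in clusterroles clusterrolebindings; do
    # `kubectl auth can-i` exits non-zero when the answer is "no".
    if ! kubectl auth can-i create "$resource" >/dev/null 2>&1; then
      echo "missing permission: create $resource" >&2
      return 1
    fi
  done
  echo "ok: current identity can create ClusterRoles and ClusterRoleBindings"
}

# Usage (requires cluster access):
#   check_cluster_admin || echo "re-run the clusterrolebinding step"
```

If the check fails, the Helm install in the next step is likely to hit the same Forbidden error, so it is cheaper to find out here.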
Step 2: Install kube-prometheus-stack
# Add the Prometheus community Helm repo
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
# Create a dedicated monitoring namespace
kubectl create namespace monitoring
# Install the stack with persistent storage
helm install prometheus prometheus-community/kube-prometheus-stack \
--namespace monitoring \
--set prometheus.prometheusSpec.retention=15d \
--set prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.resources.requests.storage=50Gi \
--set grafana.persistence.enabled=true \
--set grafana.persistence.size=10Gi \
--set grafana.persistence.storageClassName=standard-rwo
The standard-rwo storage class is the GKE default for ReadWriteOnce volumes. If Prometheus needs higher I/O, the SSD-backed premium-rwo class is available on both Standard and Autopilot clusters.
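Long --set chains are hard to review and easy to lose on the next upgrade. As a sketch, the same settings can live in a values file. The file name monitoring-values.yaml is an arbitrary choice, and the explicit storageClassName and accessModes for Prometheus are additions that the --set flags above leave to cluster defaults.

```shell
#!/usr/bin/env bash
# Write the settings from the --set flags into a values file.
# monitoring-values.yaml is an arbitrary file name.
cat > monitoring-values.yaml <<'EOF'
prometheus:
  prometheusSpec:
    retention: 15d
    storageSpec:
      volumeClaimTemplate:
        spec:
          storageClassName: standard-rwo   # addition: pins the class explicitly
          accessModes: ["ReadWriteOnce"]   # addition: stated rather than defaulted
          resources:
            requests:
              storage: 50Gi
grafana:
  persistence:
    enabled: true
    size: 10Gi
    storageClassName: standard-rwo
EOF

# Then install with (requires helm and cluster access):
#   helm install prometheus prometheus-community/kube-prometheus-stack \
#     --namespace monitoring -f monitoring-values.yaml
```

Keeping the file in version control gives you a record of what the release was installed with.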
What gets installed:
- Prometheus server with 15-day retention and 50Gi persistent storage
- Grafana with 10Gi persistent storage (default credentials: admin/prom-operator)
- Alertmanager
- kube-state-metrics
- prometheus-node-exporter on every node
- A full set of pre-configured dashboards and alert rules via the Kubernetes Monitoring Mixin
Step 3: Verify the Installation
# Check all pods are running
kubectl get pods -n monitoring
# Check Prometheus targets (should show all as UP)
kubectl port-forward -n monitoring svc/prometheus-kube-prometheus-prometheus 9090:9090
# Open http://localhost:9090/targets
# Access Grafana
kubectl port-forward -n monitoring svc/prometheus-grafana 3000:80
# Open http://localhost:3000 - login: admin / prom-operator
Change the default Grafana password immediately after the first login. The default password prom-operator is publicly known, so anyone who can reach your Grafana endpoint can log in with it. Go to Profile > Change Password as soon as you access the UI for the first time.
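Target health can also be checked from the command line instead of the /targets page. This is a rough sketch, assuming the Prometheus port-forward is still running and that the API returns compact JSON (no spaces around the colon); count_down_targets is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: counts targets that Prometheus reports as down.
# Reads the JSON body of /api/v1/targets on stdin.
count_down_targets() {
  # grep -o prints one line per match; wc -l counts them.
  grep -o '"health":"down"' | wc -l | tr -d ' '
}

# Usage, with the Prometheus port-forward running:
#   curl -s http://localhost:9090/api/v1/targets | count_down_targets
```

A result of 0 means no active target is reported as down. For anything more detailed, such as which target failed and why, a JSON-aware tool or the UI is the better choice.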
Step 4: Expose Grafana with a LoadBalancer or Ingress
For persistent access without port-forwarding:
# Upgrade with LoadBalancer service type; --reuse-values keeps the settings from Step 2
helm upgrade prometheus prometheus-community/kube-prometheus-stack \
--namespace monitoring \
--reuse-values \
--set grafana.service.type=LoadBalancer
Or deploy an Ingress if you have an ingress controller configured in the cluster. For internal-only access, annotate the LoadBalancer service with cloud.google.com/load-balancer-type: "Internal".
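For the internal-only case, the service type and annotation can be set together through Helm values. This sketch assumes the Grafana subchart's service.type and service.annotations keys; grafana-internal-lb.yaml is an arbitrary file name.

```shell
#!/usr/bin/env bash
# Values that expose Grafana through an internal load balancer.
cat > grafana-internal-lb.yaml <<'EOF'
grafana:
  service:
    type: LoadBalancer
    annotations:
      cloud.google.com/load-balancer-type: "Internal"
EOF

# Apply with (requires helm and cluster access). --reuse-values keeps
# values from the earlier install, such as the persistence settings:
#   helm upgrade prometheus prometheus-community/kube-prometheus-stack \
#     --namespace monitoring --reuse-values -f grafana-internal-lb.yaml
```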
Option 2: Google Managed Service for Prometheus (GMP)
GMP is enabled by default on new GKE clusters. It replaces the self-hosted Prometheus server with Google-managed collectors that forward metrics to Google’s Monarch backend, queryable via the Prometheus API.
Check if GMP is already enabled on your cluster:
gcloud container clusters describe your-cluster-name \
--zone your-zone \
--format="value(monitoringConfig.managedPrometheusConfig.enabled)"
Enable GMP on an existing cluster:
gcloud container clusters update your-cluster-name \
--zone your-zone \
--enable-managed-prometheus
Connect Grafana to GMP as a data source:
GMP exposes a Prometheus-compatible API endpoint. Create a GCP service account with the Monitoring Viewer role and configure it in Grafana:
# Create a service account for Grafana
gcloud iam service-accounts create grafana-reader \
--display-name="Grafana GMP Reader" \
--project=your-project-id
# Grant Monitoring Viewer role
gcloud projects add-iam-policy-binding your-project-id \
--member="serviceAccount:grafana-reader@your-project-id.iam.gserviceaccount.com" \
--role="roles/monitoring.viewer"
# Create a key file for Grafana authentication
gcloud iam service-accounts keys create grafana-key.json \
--iam-account=grafana-reader@your-project-id.iam.gserviceaccount.com
In Grafana, add a Prometheus data source with the URL: https://monitoring.googleapis.com/v1/projects/your-project-id/location/global/prometheus
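Supply the JSON key file as the authentication credential. Grafana's built-in Prometheus data source does not accept a Google service account key in every version; Google's GMP documentation describes a data source syncer that exchanges the key for short-lived OAuth2 tokens, so check which method applies to your Grafana version.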
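Before wiring up Grafana, a direct query against the endpoint can confirm that GMP is returning data. This is a sketch, assuming your gcloud identity has Monitoring Viewer access on the project; gmp_query_url is a hypothetical helper that appends the standard Prometheus query path to the URL above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: builds the GMP instant-query endpoint for a project.
gmp_query_url() {
  local project_id="$1"
  echo "https://monitoring.googleapis.com/v1/projects/${project_id}/location/global/prometheus/api/v1/query"
}

# Usage (requires gcloud credentials):
#   curl -s "$(gmp_query_url your-project-id)" \
#     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
#     --data-urlencode 'query=up'
```

An empty result set for up usually means no collection is configured yet, which is worth knowing before you start debugging the Grafana data source.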
What GMP gives you over self-hosted:
- 24-month metric retention included in the price
- No Prometheus servers to scale, shard, or maintain
- Global querying across all clusters and regions from a single Grafana instance
- PromQL queries work largely unchanged – most existing dashboards and alerts carry over, though Google documents a small number of PromQL differences worth checking
- Free GKE system metrics (CPU, memory, pod status) are included without sending data to GMP
What GMP costs: GMP bills per metric sample ingested. For large GKE clusters with many pods and custom metrics, this can add up. Run the GCP pricing calculator with your expected sample volume before committing.
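To get a sample volume to feed into the calculator, a back-of-the-envelope estimate is active series multiplied by scrapes per month. The sketch below assumes a 30-day month and a uniform scrape interval; the series count of 10000 is a made-up illustration, not a measurement, and no prices are included because they change.

```shell
#!/usr/bin/env bash
# Hypothetical helper: samples ingested over a 30-day month.
# Args: number of active time series, scrape interval in seconds.
samples_per_month() {
  local series="$1" interval_s="$2"
  local seconds_per_month=$((30 * 24 * 3600))   # 2,592,000
  echo $(( series * seconds_per_month / interval_s ))
}

samples_per_month 10000 30   # prints 864000000
samples_per_month 10000 60   # prints 432000000
```

Doubling the scrape interval halves the ingested samples, which is why the interval is usually the first lever to consider when estimating GMP cost.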
What kube-state-metrics and cAdvisor Each Cover
A common point of confusion when setting up GKE monitoring is what each component actually measures.
| Source | What it measures | Example metrics |
| --- | --- | --- |
| kube-state-metrics | Kubernetes object state – desired vs actual | kube_deployment_spec_replicas, kube_pod_status_phase, kube_node_status_condition |
| cAdvisor (via kubelet) | Container resource usage – actual consumption | container_cpu_usage_seconds_total, container_memory_working_set_bytes |
| node-exporter | Node-level OS and hardware metrics | node_cpu_seconds_total, node_filesystem_avail_bytes |
You need all three for complete GKE visibility. kube-prometheus-stack installs all of them. GMP’s managed collection also covers all three when fully configured.
Key PromQL Queries for GKE
CPU usage by namespace:
sum(rate(container_cpu_usage_seconds_total{
namespace!="kube-system",
container!=""
}[5m])) by (namespace)
Memory usage by pod:
sum(container_memory_working_set_bytes{
container!="",
namespace!="kube-system"
}) by (pod, namespace)
Pods not in Running state:
kube_pod_status_phase{
phase!="Running",
phase!="Succeeded"
} == 1
Node CPU utilization (alert threshold: > 80%):
100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
Deployment replicas available vs desired:
kube_deployment_spec_replicas - kube_deployment_status_replicas_available > 0
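With the self-hosted stack, the deployment query above can be turned into an alert through a PrometheusRule resource. This is a sketch only: the alert name, the 10m hold time, and the severity are illustrative choices, and the release: prometheus label assumes the chart's default rule selector for a release named prometheus.

```shell
#!/usr/bin/env bash
# Write a PrometheusRule that alerts on unavailable deployment replicas.
# The quoted 'EOF' keeps {{ $labels... }} from being expanded by the shell.
cat > deployment-replicas-rule.yaml <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: deployment-replicas
  namespace: monitoring
  labels:
    release: prometheus   # assumption: matches the chart's default rule selector
spec:
  groups:
    - name: deployment-availability
      rules:
        - alert: DeploymentReplicasUnavailable   # illustrative name
          expr: kube_deployment_spec_replicas - kube_deployment_status_replicas_available > 0
          for: 10m                               # illustrative hold time
          labels:
            severity: warning
          annotations:
            summary: "Deployment {{ $labels.namespace }}/{{ $labels.deployment }} has unavailable replicas"
EOF

# Apply with (requires cluster access):
#   kubectl apply -f deployment-replicas-rule.yaml
```

The for clause keeps the alert from firing during normal rollouts, when available replicas briefly lag the desired count.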
Container restart rate (useful for crashloop detection):
rate(kube_pod_container_status_restarts_total[15m]) * 60 > 0
Grafana Dashboard IDs for GKE
Import these directly in Grafana via Dashboards > Import > Enter dashboard ID:
| Dashboard | Grafana ID | What it shows |
| --- | --- | --- |
| Kubernetes Cluster Monitoring | 315 | Cluster-wide CPU, memory, pods, nodes |
| Kubernetes Cluster (kube-prometheus) | 6417 | Nodes, pods, deployments overview |
| Kubernetes Namespace Resources | 7249 | Per-namespace CPU and memory |
| Node Exporter Full | 1860 | Per-node OS metrics |
| kube-state-metrics | 13332 | Kubernetes object states |
Note: Dashboard ID 315 requires node-exporter to be running. kube-prometheus-stack installs node-exporter automatically.
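Dashboards imported through the UI have to be re-imported if Grafana is ever rebuilt from scratch. As an alternative sketch, the Grafana subchart can fetch community dashboards by ID at deploy time. This assumes the subchart's dashboardProviders and dashboards keys and a data source named Prometheus; the provider name community and the file name are arbitrary, and no revision is pinned, so verify which revision your chart version downloads.

```shell
#!/usr/bin/env bash
# Values that provision two of the community dashboards by grafana.com ID.
cat > grafana-dashboards.yaml <<'EOF'
grafana:
  dashboardProviders:
    dashboardproviders.yaml:
      apiVersion: 1
      providers:
        - name: community            # arbitrary provider name
          orgId: 1
          folder: ""
          type: file
          disableDeletion: false
          editable: true
          options:
            path: /var/lib/grafana/dashboards/community
  dashboards:
    community:                       # must match the provider name above
      node-exporter-full:
        gnetId: 1860
        datasource: Prometheus       # assumption: default data source name
      k8s-cluster-monitoring:
        gnetId: 315
        datasource: Prometheus
EOF

# Apply with (requires helm and cluster access):
#   helm upgrade prometheus prometheus-community/kube-prometheus-stack \
#     --namespace monitoring --reuse-values -f grafana-dashboards.yaml
```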
GKE-Specific Gotchas
- Private cluster webhook firewall: If you run a private GKE cluster, the kube-prometheus-stack admission webhooks require a firewall rule allowing the GKE control plane to reach port 8443 on your Prometheus Operator pod (the port is set by prometheusOperator.tls.internalPort and may differ in your chart version). Without this, webhook calls time out and resources the webhook validates, such as PrometheusRules, cannot be created. Either add the firewall rule or disable webhooks with --set prometheusOperator.admissionWebhooks.enabled=false for non-production clusters.
- Do not run GMP and self-hosted Prometheus scraping the same targets: If GMP is already enabled on your cluster and you also deploy kube-prometheus-stack without disabling GMP collection, you will have duplicate metric ingestion. Either disable GMP (--no-enable-managed-prometheus at cluster creation, or --disable-managed-prometheus with gcloud container clusters update on an existing cluster) or configure kube-prometheus-stack to scrape only your application metrics while GMP handles system metrics.
- Grafana storage is ephemeral by default: The default kube-prometheus-stack Grafana installation has no persistent volume. Any dashboards you create or import in the UI are lost when the Grafana pod is deleted or rescheduled. Always set grafana.persistence.enabled=true in production.
- Scrape interval for GKE: The default scrape interval in kube-prometheus-stack is 30 seconds, which is appropriate for most GKE workloads. For high-cardinality environments with many pods, increasing to 60 seconds reduces Prometheus memory pressure significantly.
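The scrape interval change from the last item can be expressed as Helm values. This is a sketch assuming the chart's prometheus.prometheusSpec.scrapeInterval and evaluationInterval keys; individual ServiceMonitors that set their own interval are not affected by the global value, so check those separately.

```shell
#!/usr/bin/env bash
# Values that raise the global scrape and rule-evaluation intervals to 60s.
# scrape-tuning.yaml is an arbitrary file name.
cat > scrape-tuning.yaml <<'EOF'
prometheus:
  prometheusSpec:
    scrapeInterval: 60s
    evaluationInterval: 60s
EOF

# Apply with (requires helm and cluster access):
#   helm upgrade prometheus prometheus-community/kube-prometheus-stack \
#     --namespace monitoring --reuse-values -f scrape-tuning.yaml
```

If you lengthen the interval, review any alert or dashboard that uses a short rate() window: a window needs at least two samples to produce a value, so a 1m window over 60-second scrapes will often come back empty.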
When Prometheus and Grafana Metrics Are Not Enough
Prometheus and Grafana give you exceptional visibility into cluster resource health: CPU saturation, memory pressure, pod restarts, deployment availability, node status. These are the right tools for infrastructure-level alerting and capacity planning.
What they do not show is what is happening inside individual requests. When CPU spikes on a node, Prometheus tells you the node is hot. It does not tell you which service is generating the load, which API endpoint is being hit, or which downstream database call is taking 3 seconds on every request. Metrics tell you that something is wrong. Traces tell you why.
Teams running GKE typically already have a Prometheus and Grafana stack for infrastructure metrics. CubeAPM layers distributed tracing on top of that existing setup via the OpenTelemetry SDK – no agents to deploy on nodes, no changes to your Prometheus configuration. When a CPU alert fires in Grafana, engineers switch to CubeAPM to navigate directly to the service generating the load, the specific endpoint, and the trace that shows the slow downstream call responsible. The two tools answer different questions from the same incident. CubeAPM can be self-hosted inside your GKE cluster, keeping all trace data within your GCP project.
Summary
| Approach | Best for | Operational overhead |
| --- | --- | --- |
| kube-prometheus-stack (self-hosted) | Full control, no extra GCP cost | You manage Prometheus scaling and storage |
| Google Managed Prometheus + Grafana | Large clusters, multi-cluster, long retention | Minimal – Google manages collection and storage |
| Both combined | App metrics in self-hosted, system metrics in GMP | Medium – requires careful target scoping to avoid duplicates |
Start with kube-prometheus-stack if you want a self-contained setup in under 10 minutes. Use GMP if you are managing multiple GKE clusters and want unified querying across all of them without running Prometheus servers per cluster. In either case, the GKE-specific RBAC step and persistent storage configuration are non-negotiable for a production-ready deployment.
Disclaimer: Commands, Helm values, and configuration examples are for guidance only – verify against current kube-prometheus-stack documentation and Google Managed Service for Prometheus documentation before applying to production. GKE behavior and GCP pricing change over time. CubeAPM references reflect genuine use cases; evaluate all tools against your own requirements.