CubeAPM

What Is Java Application Performance Monitoring (APM)? 

Java Application Performance Monitoring (APM) is the practice of continuously measuring, collecting, and analyzing the behavior of a Java application in production. For Java specifically, APM covers two distinct layers: the application layer (request latency, error rates, throughput, distributed traces) and the JVM layer (heap memory, garbage collection, thread pools, class loading). Both layers need monitoring because JVM problems often manifest as application symptoms. A long GC pause looks like a slow request, and heap exhaustion looks like a timeout.

APM is distinct from general infrastructure monitoring. Infrastructure monitoring tells you a server’s CPU is high. Java APM tells you which method call or GC event caused the CPU spike, which request was slow because of it, and which downstream service that request was waiting on.

Key Takeaways

  • Java APM covers two layers: the application layer (latency, errors, throughput, traces) and the JVM layer (heap, GC, threads). Missing either layer creates blind spots
  • The Java agent is the primary instrumentation mechanism for APM. It attaches to the JVM at startup via the -javaagent flag and uses bytecode manipulation to inject telemetry without code changes
  • The OpenTelemetry Java agent (v2.27.0, May 2026, targeting OTel SDK 1.61.0) is the standard open-source instrumentation option. It supports Java 8+ and auto-instruments hundreds of libraries and frameworks, including Spring Boot, JDBC, gRPC, Kafka, and Hibernate
  • Since OTel Java agent v2.0.0, the default export protocol is HTTP/protobuf, not gRPC. This aligns with the OTel specification default
  • GC pause time is the most commonly overlooked Java performance signal. A GC pause freezes all application threads simultaneously. A 200ms pause is invisible in most infrastructure dashboards but causes request timeouts in latency-sensitive APIs
  • Distributed tracing is the feature that separates modern Java APM from JVM monitoring. Traces follow a single request across services, databases, and message queues, and show exactly where time is spent

What Java APM Monitors

Java APM data falls into three categories, each answering different questions.

Application-Level Signals

These answer: is the application working correctly and fast enough for users?

| Signal | What it measures | Why it matters |
| --- | --- | --- |
| Request latency | Time from request received to response sent | The most direct measure of user experience. Alert on p95 and p99, not just the average |
| Throughput | Requests per second | Baseline for capacity planning and anomaly detection |
| Error rate | Percentage of requests returning 5xx or exceptions | A rising error rate on a specific endpoint pinpoints a regression |
| Distributed traces | End-to-end request path across services and databases | Shows exactly where time is spent in a slow request |
| Database query time | Time spent in JDBC, JPA, Hibernate, or R2DBC calls | Database queries are the most common cause of Java service latency spikes |
| External HTTP call duration | Time spent calling downstream services | A slow dependency shows up here before it shows up in your own latency metrics |
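
To illustrate why the table recommends alerting on p95 and p99 rather than the average, here is a minimal sketch that computes nearest-rank percentiles over a made-up set of latencies. The class name, the sample values, and the choice of the nearest-rank method are assumptions for illustration; APM backends typically use histogram-based approximations instead.

```java
import java.util.Arrays;

class LatencyPercentiles {
    // Nearest-rank percentile: the smallest sample such that at least
    // p percent of all samples are less than or equal to it.
    static long percentile(long[] samplesMs, double p) {
        long[] sorted = samplesMs.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    static double mean(long[] samplesMs) {
        return Arrays.stream(samplesMs).average().orElse(0.0);
    }

    public static void main(String[] args) {
        // Hypothetical data: 94 fast requests, 5 slow ones, 1 very slow one.
        long[] latencies = new long[100];
        Arrays.fill(latencies, 20);
        for (int i = 94; i < 99; i++) {
            latencies[i] = 400;
        }
        latencies[99] = 2000;

        System.out.printf("mean=%.1f ms, p50=%d ms, p95=%d ms, p99=%d ms%n",
                mean(latencies),
                percentile(latencies, 50),
                percentile(latencies, 95),
                percentile(latencies, 99));
    }
}
```

In this sample the mean stays far below the latency the slowest few percent of requests experience, which is the gap that percentile-based alerts are meant to expose.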

JVM-Level Signals

These answer: is the runtime environment healthy?

| Signal | What it measures | Why it matters |
| --- | --- | --- |
| Heap memory used vs max | Current heap usage as a percentage of the configured maximum | Alert well before the limit, for example at 80%. When the heap is full and GC cannot reclaim enough space, the JVM throws OutOfMemoryError |
| GC pause time | Duration of stop-the-world GC events | Pauses freeze all application threads. Even 100ms pauses can cause timeouts in latency-sensitive APIs |
| GC frequency | Number of GC cycles per minute | Frequent cycles that reclaim little memory indicate a memory leak or an undersized heap |
| Live thread count | Number of currently live JVM threads | Unexpected growth indicates a thread leak. A deadlock usually appears as threads stuck in the BLOCKED state rather than as a change in the count |
| Thread pool queue depth | Pending tasks in executor thread pools | A growing queue means threads are not keeping up with incoming work |
| Non-heap memory | Memory used for class metadata (Metaspace) and JIT-compiled code (code cache) | Can grow unboundedly in some deployment configurations |
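
Several of these signals can be read from inside the JVM through the standard java.lang.management API, which is a useful way to see what an agent is reporting. The sketch below is illustrative only: the class and method names are made up, and the GC figure is the cumulative collection time reported by each collector, which for concurrent collectors is not the same thing as stop-the-world pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

class JvmSignalsSnapshot {
    // Heap used as a percentage of the maximum; -1 if the JVM reports no maximum.
    static double heapUsedPercent() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long max = heap.getMax();
        return max > 0 ? 100.0 * heap.getUsed() / max : -1.0;
    }

    // Cumulative collection time across all collectors since JVM start.
    // getCollectionTime() returns -1 when the value is undefined, so clamp it.
    static long totalGcTimeMs() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(gc.getCollectionTime(), 0);
        }
        return total;
    }

    // Number of live threads, daemon and non-daemon.
    static int liveThreadCount() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    public static void main(String[] args) {
        MemoryUsage nonHeap = ManagementFactory.getMemoryMXBean().getNonHeapMemoryUsage();
        System.out.printf("heap used: %.1f%% of max%n", heapUsedPercent());
        System.out.printf("non-heap used: %d MiB%n", nonHeap.getUsed() / (1024 * 1024));
        System.out.printf("cumulative GC time: %d ms%n", totalGcTimeMs());
        System.out.printf("live threads: %d%n", liveThreadCount());
    }
}
```

A snapshot like this is a point-in-time reading. An APM agent samples the same kind of data continuously and exports it as time series so it can be lined up with traces.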

Infrastructure Correlation

Java APM becomes most useful when application and JVM signals are correlated with the infrastructure they run on: CPU utilization, network I/O, and disk I/O. A full GC that coincides with a CPU spike is a different problem from a GC that coincides with a pod being throttled.

How Java APM Agents Work

The most practical way to instrument a Java application for APM is the Java agent. It requires no changes to application code and no modifications to build files.

Bytecode manipulation at class load time. When a Java application starts with a -javaagent flag, the agent registers itself with the JVM’s instrumentation API. When the JVM loads a class, the agent intercepts the loading process and modifies the bytecode before the class is used. This modification injects telemetry collection into method calls such as HTTP handlers, database drivers, and messaging clients, without the application developer doing anything.

What this means in practice:

java -javaagent:opentelemetry-javaagent.jar \
  -Dotel.service.name=order-service \
  -Dotel.exporter.otlp.endpoint=http://otel-collector:4318 \
  -jar order-service.jar

This single command, with no code changes, gives you:

  • A span for every incoming HTTP request with method, route, status code, and duration
  • A span for every outgoing HTTP call with the target host and status code
  • A span for every database query with the SQL statement and duration
  • A span for every Kafka producer and consumer operation
  • JVM metrics: heap usage, GC pause time, thread counts, class loading
  • W3C TraceContext propagation on all outgoing HTTP calls
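
To make the class-load hook described above more concrete, here is a minimal sketch of a Java agent built on the standard java.lang.instrument API. It only observes class loading and leaves bytecode untouched; a real APM agent such as the OTel Java agent rewrites the bytes at this point. The class names and the com/example/ package filter are hypothetical. Packaging it as a working agent would also require a JAR manifest with a Premain-Class entry.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.concurrent.atomic.AtomicLong;

class ClassLoadObserverAgent {
    // Counts how many classes from the (hypothetical) application package were loaded.
    static final AtomicLong seen = new AtomicLong();

    static class ObservingTransformer implements ClassFileTransformer {
        @Override
        public byte[] transform(ClassLoader loader,
                                String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            // Class names arrive in internal form, with '/' instead of '.'.
            if (className != null && className.startsWith("com/example/")) {
                seen.incrementAndGet();
                // A real agent would return modified bytecode here.
            }
            // Returning null tells the JVM to keep the original bytecode.
            return null;
        }
    }

    // The JVM calls premain before the application's main method
    // when started with -javaagent:<path-to-agent-jar>.
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new ObservingTransformer());
    }
}
```

The transformer sees every class before it is defined, which is why an agent can add spans around HTTP handlers or JDBC calls without the application source changing.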

The OpenTelemetry Java Agent

The OTel Java agent (opentelemetry-javaagent.jar) is the official open-source instrumentation agent maintained by the OpenTelemetry project. It is the standard starting point for Java APM in 2026 for teams that are not using a commercial APM vendor.

Current version: v2.27.0 (May 2026), targeting OTel SDK 1.61.0. Requires Java 8 or above.

Download:

curl -L -o opentelemetry-javaagent.jar \
  https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases/latest/download/opentelemetry-javaagent.jar

Key facts about the agent:

  • Single JAR file containing the agent and all auto-instrumentation libraries (typically 50-60MB)
  • Default export protocol is HTTP/protobuf to http://localhost:4318. Changed from gRPC to HTTP/protobuf in v2.0.0 to align with the OTel specification
  • Supports hundreds of libraries and frameworks out of the box
  • Configuration is via -D system properties or environment variables. When the same setting is given both ways, the system property takes precedence
  • Declarative YAML-based configuration is supported from v2.26.0 onward via -Dotel.config.file=/path/to/otel-config.yaml

Supported frameworks and libraries (selection):

| Category | Supported |
| --- | --- |
| Web frameworks | Spring MVC, Spring WebFlux, Jakarta EE Servlets, Quarkus, Micronaut, Vert.x |
| HTTP clients | Apache HttpClient, OkHttp, java.net.HttpURLConnection, Jetty client |
| Databases | JDBC (all drivers), Hibernate, Spring Data, R2DBC, MongoDB, Redis (Jedis, Lettuce) |
| Messaging | Kafka, RabbitMQ, ActiveMQ, JMS |
| RPC | gRPC, Thrift |
| Caching | Ehcache, Hazelcast |
| Logging | Log4j 2, Logback, java.util.logging (trace ID injection into log records) |

JVM Garbage Collection: The Most Overlooked APM Signal

GC pauses deserve specific attention because they are a common source of Java performance problems that standard infrastructure monitoring does not show.

Stop-the-world events. When the JVM enters a stop-the-world GC phase, it pauses all application threads simultaneously. During that pause, no requests are processed and no responses are sent. From a user’s perspective, the application freezes. From an infrastructure monitor’s perspective, nothing unusual happened. CPU usage may have been high during the GC, but the server was not down.

GC pause impact on request latency. A 300ms GC pause adds up to 300ms to the response time of every request that was in flight during the pause, even if the request itself needs only 5ms of work. This shows up as a latency spike in APM traces but is invisible in CPU or memory dashboards.
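
The arithmetic behind that claim can be sketched with a small model. It assumes a single pause during which no application thread runs, ignores queuing effects after the pause ends, and uses made-up names and numbers, so treat it as a way to reason about the effect rather than a measurement tool.

```java
class GcPauseLatencyModel {
    // Observed latency for a request that arrives at arrivalMs and needs
    // serviceMs of execution, when every thread is frozen during
    // the half-open interval [pauseStartMs, pauseEndMs).
    static long observedLatencyMs(long arrivalMs, long serviceMs,
                                  long pauseStartMs, long pauseEndMs) {
        long now = arrivalMs;
        long remaining = serviceMs;

        // A request that arrives mid-pause cannot start until the pause ends.
        if (now >= pauseStartMs && now < pauseEndMs) {
            now = pauseEndMs;
        }

        // A request that starts before the pause runs until the pause begins,
        // then waits out the whole pause if it still has work left.
        if (now < pauseStartMs) {
            long runnable = Math.min(remaining, pauseStartMs - now);
            now += runnable;
            remaining -= runnable;
            if (remaining > 0) {
                now = pauseEndMs;
            }
        }

        now += remaining;
        return now - arrivalMs;
    }

    public static void main(String[] args) {
        long pauseStart = 2;
        long pauseEnd = 302; // a 300ms pause
        System.out.println("in flight when pause starts: "
                + observedLatencyMs(0, 5, pauseStart, pauseEnd) + " ms");
        System.out.println("arrives mid-pause:           "
                + observedLatencyMs(100, 5, pauseStart, pauseEnd) + " ms");
        System.out.println("arrives after pause:         "
                + observedLatencyMs(400, 5, pauseStart, pauseEnd) + " ms");
    }
}
```

A request caught at the start of the pause pays the full pause on top of its own work, one that arrives mid-pause pays only the remainder, and one that arrives afterwards is unaffected.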

Modern GC collectors and their trade-offs:

| Collector | JVM flag | Best for | Pause behavior |
| --- | --- | --- | --- |
| G1GC (default since Java 9) | -XX:+UseG1GC | General-purpose workloads | Predictable, configurable pause targets. Pauses in the tens to hundreds of milliseconds range |
| ZGC | -XX:+UseZGC | Latency-sensitive services. Production-ready since Java 15. Generational ZGC (recommended mode) available since Java 21 (opt-in via -XX:+ZGenerational on Java 21 and 22, default from Java 23) | Sub-millisecond pauses regardless of heap size. Requires 15-30% more memory than G1GC |
| Shenandoah | -XX:+UseShenandoahGC | Low-latency with large heaps | Sub-millisecond pauses. Available in OpenJDK distributions |
| Parallel GC | -XX:+UseParallelGC | Batch processing, throughput-focused | Longer stop-the-world pauses acceptable in exchange for higher throughput |

For latency-sensitive Java services on Java 21 or above, Generational ZGC is the recommended collector. On Java 21 and 22 it must be enabled explicitly with -XX:+UseZGC -XX:+ZGenerational; from Java 23 onward, -XX:+UseZGC selects generational mode by default. It delivers consistent sub-millisecond pause times regardless of heap size, which largely removes GC pauses as a source of request latency spikes. The trade-off is 15 to 30% higher memory usage and 8 to 20% additional CPU overhead from concurrent GC threads.

APM vs Logging vs Infrastructure Monitoring

Java teams often have logging (via Logback or Log4j 2) and infrastructure monitoring (via Prometheus node exporter or cloud provider metrics) already in place. APM adds a third layer that neither of the others can replace.

| What you need to know | Logging | Infrastructure monitoring | APM |
| --- | --- | --- | --- |
| This request took 800ms, where? | No (logs show events, not spans) | No (infra shows aggregate CPU/memory) | Yes (distributed trace shows breakdown) |
| Error rate is rising on /checkout | Possible (if errors are logged with URL) | No | Yes (per-endpoint error rate) |
| Memory is growing, is it a leak? | No | Partially (heap total) | Yes (heap breakdown with GC correlation) |
| GC pause caused this latency spike | No | No | Yes (GC pause timeline overlaid on traces) |
| Which SQL query is slow? | No (unless explicitly logged) | No | Yes (JDBC span with SQL text and duration) |
| Downstream service is slow | No (unless you log it) | No | Yes (outbound HTTP span with target and latency) |

OpenTelemetry vs Commercial Java APM Agents

The OTel Java agent and commercial APM agents (Datadog, Dynatrace, New Relic, AppDynamics) instrument Java applications using the same underlying mechanism: bytecode manipulation at class load time. The instrumentation approach is identical. What differs is where the data goes and what the backend does with it.

|  | OTel Java agent | Commercial APM agent |
| --- | --- | --- |
| Vendor lock-in | None. Data goes to any OTLP-compatible backend | Proprietary format. Data goes to that vendor’s platform |
| Backend cost | Your choice. Open-source (Jaeger, Tempo) or commercial | Included in vendor pricing, often per-host or per-user |
| Library coverage | Hundreds of libraries, community-maintained | Comparable coverage, vendor-maintained |
| Configuration | Environment variables or -D properties | Vendor-specific config files |
| Custom instrumentation | OTel API (stable, vendor-neutral) | Vendor-specific SDK |
| Data portability | Full. Switch backends without re-instrumenting | None. Switching requires re-instrumentation |

The standard recommendation in 2026 for new Java projects is to instrument with the OTel agent and choose a backend separately. This decouples the instrumentation decision from the vendor decision and preserves the ability to switch backends without touching application code.

How Java APM Works in Practice

A complete Java APM setup has four parts working together.

1. Instrumentation: The OTel Java agent attaches at startup and emits OTLP telemetry.

2. Collection: The OTel Collector receives OTLP, applies sampling and filtering, and routes to backends.

3. Storage: Traces go to Jaeger or Tempo, metrics go to Prometheus, logs go to Loki or Elasticsearch.

4. Analysis: Grafana queries all backends via their native query languages, correlating signals from the same request using the shared trace ID.

The OTel trace ID is the linking mechanism. When the application writes a log record during a traced request, the Java agent adds the active trace ID to the log entry. When Grafana displays a slow trace span, it can use that trace ID to fetch the logs from that same request. This is the practical value of unified OTel instrumentation: the same trace ID ties together the trace and the log lines from that request, and the trace's timestamps let you line both up with the JVM metrics recorded at that moment.
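
As a rough illustration of how one ID can link signals, the sketch below extracts the trace ID from a W3C traceparent header (version 00: a 32-hex-digit trace ID, a 16-hex-digit parent span ID, and 2 hex digits of flags) and stamps it onto a log line. The class name and the trace_id= log format are assumptions made for this example; the agent does this through the logging framework's context rather than by string concatenation.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TraceContextSketch {
    // traceparent, version 00: 00-<trace-id>-<parent-id>-<flags>, lowercase hex.
    private static final Pattern TRACEPARENT =
            Pattern.compile("^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$");

    // Returns the trace ID, or null if the header is missing or malformed.
    static String traceIdFrom(String traceparent) {
        if (traceparent == null) {
            return null;
        }
        Matcher m = TRACEPARENT.matcher(traceparent);
        if (!m.matches()) {
            return null;
        }
        // All-zero trace IDs and parent IDs are invalid in W3C Trace Context.
        if (m.group(1).matches("0+") || m.group(2).matches("0+")) {
            return null;
        }
        return m.group(1);
    }

    // Hypothetical log format: a backend can later search logs by this key.
    static String logLine(String traceId, String message) {
        return "trace_id=" + traceId + " " + message;
    }

    public static void main(String[] args) {
        String header = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01";
        String traceId = traceIdFrom(header);
        System.out.println(logLine(traceId, "order lookup took 812 ms"));
    }
}
```

Because every service on the request path receives the same trace ID in this header, a search for that ID in the log store returns the log lines for exactly the request shown in the trace view.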

Correlating JVM Internals with Request Traces: Where CubeAPM Fits

A GC pause, a thread pool queue backup, or a memory pressure event in JVM metrics tells you something is wrong at the runtime level. It does not tell you which in-flight requests were affected, which endpoints were most impacted, or whether the slowdown was isolated to one service or cascaded across a distributed call chain.

CubeAPM is purpose-built for Java teams and auto-instruments Spring Boot, Hibernate, Tomcat, and Kafka via the OTel Java agent with no additional configuration. It continuously tracks JVM internals, including heap usage, GC pause duration, and thread activity, and correlates them directly with distributed request traces. When an elevated error rate appears, CubeAPM links it to the specific SQL query, GC pause, or downstream call responsible. Its smart sampling preserves slow, error-prone, and unusual traces while cutting ingestion volume by up to 80%, which keeps costs manageable at scale. It runs self-hosted inside your own infrastructure at $0.15/GB ingestion with no per-user fees.
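
The general idea of keeping slow and failed traces while thinning out routine ones can be sketched as a simple decision function. This is a generic illustration, not CubeAPM's algorithm or the OTel Collector's sampling processor; the class name, the threshold, and the hash-based baseline rule are all assumptions.

```java
class TailSamplingSketch {
    // Decide whether to keep a completed trace.
    // - Traces with errors or with duration at or above the threshold are always kept.
    // - Other traces are kept for roughly baselinePercent of trace IDs.
    static boolean keep(String traceId, long durationMs, boolean hasError,
                        long slowThresholdMs, int baselinePercent) {
        if (hasError || durationMs >= slowThresholdMs) {
            return true;
        }
        // Deterministic bucket in [0, 100) derived from the trace ID, so every
        // component that sees the same trace ID makes the same decision.
        int bucket = Math.floorMod(traceId.hashCode(), 100);
        return bucket < baselinePercent;
    }

    public static void main(String[] args) {
        String traceId = "4bf92f3577b34da6a3ce929d0e0e4736";
        System.out.println("failed trace kept:  " + keep(traceId, 12, true, 500, 10));
        System.out.println("slow trace kept:    " + keep(traceId, 900, false, 500, 10));
        System.out.println("routine trace kept: " + keep(traceId, 12, false, 500, 10));
    }
}
```

The decision needs the whole trace (final duration and error status), which is why this style of sampling runs after the request completes rather than in the agent at the start of the request.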

Summary

Java APM monitors two distinct layers simultaneously: application behavior (latency, errors, throughput, distributed traces) and JVM health (heap memory, GC pauses, thread activity). The OpenTelemetry Java agent is the standard open-source instrumentation mechanism, attaching to any Java 8+ application via the -javaagent flag with no code changes.

GC pause time is the most commonly missed signal in Java monitoring. It causes real user-facing latency but is invisible to infrastructure dashboards. Distributed tracing is the signal that ties everything together, showing exactly where time is spent across a request’s journey through services, databases, and message queues.

| Layer | What to monitor | Why it matters |
| --- | --- | --- |
| Application | Request latency (p95, p99), error rate, throughput | Direct measure of user experience |
| Distributed traces | Span breakdown per request, database query times, external call durations | Pinpoints where time is spent in a slow request |
| JVM: heap | Used vs max, allocation rate | High usage causes GC pressure and eventual OutOfMemoryError |
| JVM: GC | Pause duration, pause frequency, GC throughput | Pauses freeze all threads and spike user-facing latency |
| JVM: threads | Live count, thread pool queue depth | Thread leaks and pool saturation cause request queuing |
| JVM: non-heap | Metaspace, code cache | Can grow unboundedly in some container configurations |

Disclaimer: OTel Java agent version (v2.27.0), supported frameworks, and JVM GC details are verified against the OpenTelemetry Java instrumentation GitHub repository (github.com/open-telemetry/opentelemetry-java-instrumentation/releases), OpenTelemetry official documentation (opentelemetry.io/docs/languages/java, last modified May 20, 2026), Java platform release information (Java 26 current, Java 25 LTS), and CubeAPM Java APM documentation (cubeapm.com/blog/top-apm-tools-for-java) as of May 2026.
