Stop-the-World Events: Why Your Application Pauses

By Ashwani Kumar. Last updated: 09 Sep 2025

If you’ve ever noticed sudden pauses in your Java application, even when CPU usage seemed normal, you’ve likely encountered a Stop-the-World (STW) event. These pauses are moments when the JVM halts all application threads so it can perform critical internal tasks such as Garbage Collection (GC), deoptimization of JIT-compiled code, or class redefinition.

In this tutorial, we’ll explore what Stop-the-World events are, why they happen, their impact on performance, and strategies to minimize them in production systems.

Why Stop-the-World Events Matter

Directly impact application latency.
Critical in low-latency systems like trading platforms and real-time services.
Understanding STW is essential for GC tuning and troubleshooting.

Analogy: Imagine a busy highway where all traffic is stopped so workers can repair the road. Once repairs finish, cars move again. JVM STW pauses work in much the same way: application threads wait while the JVM does its maintenance.

What is a Stop-the-World Event?

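A Stop-the-World (STW) event occurs when the JVM brings all application threads to a safepoint and pauses them so it can safely perform memory management or other internal operations. No Java application code executes during this pause; only JVM-internal threads, such as GC workers, keep running.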

Common Triggers of STW Events

Garbage Collection (GC) → Most frequent cause.
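JIT Deoptimization → Compilation itself runs on background threads, but discarding compiled code that is no longer valid requires a safepoint.
Heap Dump Creation → For debugging.
Class Redefinition → During dynamic changes such as agent instrumentation or hot swapping.
Biased Lock Revocation → Synchronization-related; relevant mainly on JDK 14 and earlier, since biased locking was deprecated and disabled by default in JDK 15.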

Garbage Collection and STW

Most garbage collectors in the JVM rely on STW events:

Minor GC → Brief pause to clean the Young Generation.
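Major/Full GC → Longer pause; a Full GC collects the entire heap, including the Old Generation.
Compact Phase → Moves objects to reduce fragmentation.

Even concurrent collectors still need STW phases: G1 marks concurrently but evacuates objects during pauses, while ZGC and Shenandoah confine their pauses to short synchronization phases.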

Example: Observing STW in Action

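public class STWExample {
    public static void main(String[] args) {
        // Allocate many short-lived objects to create garbage in the Young Generation.
        for (int i = 0; i < 1_000_000; i++) {
            String s = new String("STW-" + i);
        }
        // A hint only: the JVM may ignore it (for example with -XX:+DisableExplicitGC).
        // When honored, Serial, Parallel, and G1 run a stop-the-world full collection by default.
        System.gc();
        System.out.println("Finished");
    }
}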

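Running this with -Xlog:gc* (JDK 9 and later) or -XX:+PrintGCDetails (JDK 8) prints each GC pause and its duration. On JDK 9 and later, adding -Xlog:safepoint also reports pauses that are not caused by GC.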
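GC activity can also be read from inside the process through the standard java.lang.management API. The following is a minimal sketch, not a definitive measurement tool: the class name GcTotals and its helper methods are made up for illustration. Note that for concurrent collectors the reported time may include concurrent work, so it is only an upper bound on STW time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: sums the counters exposed by every collector bean.
class GcTotals {
    /** Total number of collections reported so far (undefined values are skipped). */
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 when the collector does not report it
            if (count > 0) total += count;
        }
        return total;
    }

    /** Approximate accumulated collection time in milliseconds. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not report it
            if (millis > 0) total += millis;
        }
        return total;
    }

    public static void main(String[] args) {
        long countBefore = totalGcCount();
        long millisBefore = totalGcMillis();

        // Generate garbage; the small array of references keeps the allocations observable.
        byte[][] keep = new byte[64][];
        for (int i = 0; i < 200_000; i++) {
            keep[i % keep.length] = new byte[1024];
        }
        System.gc(); // a hint; may be ignored

        System.out.println("Collections: " + (totalGcCount() - countBefore));
        System.out.println("GC time (ms): " + (totalGcMillis() - millisBefore));
    }
}
```

Comparing the counters before and after a workload gives a quick sanity check without parsing log files.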

Impact of STW Events

Latency spikes in user-facing systems.
Throughput reduction in batch systems.
Unpredictable performance in microservices.
Jitter in real-time applications.
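
One rough way to observe this kind of jitter from inside an application is a "hiccup" detector: a thread that repeatedly sleeps for a fixed interval and records how much longer than requested it was actually stalled. The sketch below assumes a hypothetical class named PauseDetector. The overshoot it reports includes OS scheduling delays as well as STW pauses, so treat it as an indicator rather than an exact pause measurement.

```java
import java.util.concurrent.locks.LockSupport;

// Hypothetical helper: measures how far short sleeps overshoot their requested duration.
class PauseDetector {
    /**
     * Parks for intervalNanos, iterations times, and returns the largest
     * overshoot in nanoseconds (0 if the thread never woke up late).
     */
    static long maxOvershootNanos(long intervalNanos, int iterations) {
        long worst = 0;
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            LockSupport.parkNanos(intervalNanos);
            long overshoot = System.nanoTime() - start - intervalNanos;
            if (overshoot > worst) {
                worst = overshoot;
            }
        }
        return worst;
    }

    public static void main(String[] args) {
        // Sample for roughly two seconds at a 1 ms interval.
        long worst = maxOvershootNanos(1_000_000L, 2_000);
        System.out.printf("Worst observed stall: %.3f ms%n", worst / 1_000_000.0);
    }
}
```

Running the detector on its own thread next to the real workload makes large stalls visible even when GC logging is not enabled.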

GC Algorithms and STW Duration

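Serial GC → Single-threaded and fully STW; pauses grow with heap size, so it is not suited to large heaps.
Parallel GC → Still fully STW, but uses multiple GC threads to shorten each pause; optimized for throughput.
CMS → Concurrent phases but still has STW; deprecated in Java 9 and removed in Java 14.
G1 GC → Region-based, with a configurable pause-time goal.
ZGC & Shenandoah → Mostly concurrent, with very short STW phases (typically <10ms).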
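
To check which of these collectors a running JVM is actually using, the names of its GarbageCollectorMXBeans give a reasonable hint. This is a sketch under stated assumptions: the class ActiveCollector is hypothetical, and the name prefixes it matches are the ones commonly reported by HotSpot. Bean names are not a specified API and vary between JDK versions, so an "unknown" result is possible.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: guesses the active collector from GC bean names.
class ActiveCollector {
    /** Names of all collector beans registered in this JVM. */
    static List<String> collectorBeanNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    /** Maps commonly seen HotSpot bean names to a collector label (assumed, not guaranteed). */
    static String guessCollector(List<String> names) {
        for (String name : names) {
            if (name.startsWith("G1")) return "G1";
            if (name.startsWith("ZGC")) return "ZGC";
            if (name.startsWith("Shenandoah")) return "Shenandoah";
            if (name.startsWith("PS ")) return "Parallel";
            if (name.equals("ParNew") || name.equals("ConcurrentMarkSweep")) return "CMS";
            if (name.equals("Copy") || name.equals("MarkSweepCompact")) return "Serial";
        }
        return "unknown";
    }

    public static void main(String[] args) {
        List<String> names = collectorBeanNames();
        System.out.println("GC beans: " + names);
        System.out.println("Collector: " + guessCollector(names));
    }
}
```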

Monitoring STW Events

Java Flight Recorder (JFR) → Low-overhead monitoring.
Java Mission Control (JMC) → Visualize pauses and GC activity.
VisualVM → Heap and GC graphs.
jstat → Command-line GC statistics.
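
Alongside these tools, an application can subscribe to GC notifications and log each collection as it completes. The sketch below uses the com.sun.management.GarbageCollectionNotificationInfo API that ships with HotSpot-based JDKs; the class name GcEventLogger is hypothetical. For concurrent collectors the reported duration may cover a whole GC cycle rather than only its STW portion.

```java
import com.sun.management.GarbageCollectionNotificationInfo;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import javax.management.NotificationEmitter;
import javax.management.NotificationListener;
import javax.management.openmbean.CompositeData;

// Hypothetical helper: prints one line per completed garbage collection.
class GcEventLogger {
    /** Registers a listener on every collector bean; returns how many beans were subscribed. */
    static int install() {
        NotificationListener listener = (notification, handback) -> {
            if (!GarbageCollectionNotificationInfo.GARBAGE_COLLECTION_NOTIFICATION
                    .equals(notification.getType())) {
                return; // not a GC event
            }
            GarbageCollectionNotificationInfo info =
                    GarbageCollectionNotificationInfo.from((CompositeData) notification.getUserData());
            System.out.printf("%s | %s | cause=%s | %d ms%n",
                    info.getGcName(), info.getGcAction(), info.getGcCause(),
                    info.getGcInfo().getDuration());
        };

        int subscribed = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            if (gc instanceof NotificationEmitter) {
                ((NotificationEmitter) gc).addNotificationListener(listener, null, null);
                subscribed++;
            }
        }
        return subscribed;
    }

    public static void main(String[] args) throws InterruptedException {
        install();
        byte[][] keep = new byte[64][];
        for (int i = 0; i < 200_000; i++) {
            keep[i % keep.length] = new byte[1024];
        }
        System.gc();       // a hint; may be ignored
        Thread.sleep(500); // notifications are delivered asynchronously
    }
}
```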

JVM Tuning to Reduce STW Pauses

-XX:+UseG1GC → Use G1 for balanced pauses.
-XX:+UseZGC or -XX:+UseShenandoahGC → For ultra-low latency.
-XX:MaxGCPauseMillis=<n> → Sets a pause-time goal (G1 defaults to 200 ms); the collector treats it as a soft target, not a guarantee.
-Xms / -Xmx → Proper heap sizing to reduce Full GCs.
Profile workloads before tuning.
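
Before changing any of these flags, it helps to confirm what the running JVM was actually started with. The following sketch, with a hypothetical class name HeapSettings, reads the initial and maximum heap sizes and the JVM arguments through the standard management beans. A value of -1 means the JVM does not define that limit.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.util.List;

// Hypothetical helper: reports the heap limits and JVM flags of the current process.
class HeapSettings {
    static String describe() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        List<String> flags = ManagementFactory.getRuntimeMXBean().getInputArguments();
        // Shifting by 20 converts bytes to MiB; -1 (undefined) stays -1.
        return String.format("init=%d MiB, max=%d MiB, flags=%s",
                heap.getInit() >> 20, heap.getMax() >> 20, flags);
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Started with, for example, -Xms512m -Xmx512m -XX:+UseG1GC, the output should list those arguments, which makes it easy to spot a container or launcher script that silently overrides them.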

Real-World Case Study

A high-frequency trading application suffered 500ms pauses due to Full GCs. By switching from CMS to ZGC and tuning heap size, pauses dropped to <5ms, enabling stable, predictable latency.

JVM Version Tracker

Java 8 → Parallel GC default, CMS widely used.
Java 9 → G1 became default.
Java 11 → ZGC introduced (experimental).
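Java 17 → ZGC & Shenandoah stable (production-ready since Java 15).
Java 21+ → Generational ZGC introduced in Java 21 (enabled with -XX:+ZGenerational); Project Lilliput's compact object headers followed as an experimental option in JDK 24 and mainly reduce heap footprint.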

Best Practices

Use modern collectors (G1, ZGC, Shenandoah).
Avoid unnecessary System.gc() calls.
Tune Young vs Old Gen sizes to balance collections.
Monitor GC logs regularly.
For microservices, prioritize predictable low-latency GC.

Conclusion & Key Takeaways

Stop-the-World events pause all application threads for JVM internal work.
GC is the primary cause, but other JVM activities also trigger STW.
Modern GCs minimize pause duration, but STW cannot be eliminated entirely.
Monitoring and tuning are essential for reducing STW impact in production.

FAQs

1. What is the JVM memory model and why does it matter?
It defines memory interaction rules across threads, ensuring safe concurrency.

2. How does G1 GC differ from CMS?
G1 is region-based with predictable pauses; CMS was prone to fragmentation.

3. When should I use ZGC or Shenandoah?
For latency-sensitive applications needing ultra-low pause times.

4. What are JVM safepoints?
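Points in execution where a thread's state is well defined, so the JVM can stop all application threads there to run operations such as GC pauses, deoptimization, or class redefinition.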

5. How do I solve long STW pauses?
Switch to concurrent collectors, tune heap, and monitor GC logs.

6. What are the trade-offs of throughput vs latency in GC?
Throughput maximizes work done, latency minimizes pause times.

7. How do I read STW duration from logs?
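On JDK 9 and later, enable unified logging with -Xlog:gc*,safepoint; on JDK 8, use -XX:+PrintGCDetails with -XX:+PrintGCApplicationStoppedTime. Analyze the log with a tool such as GCViewer, or capture a JFR recording and inspect it in JMC.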

8. How does JIT compilation affect STW?
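JIT compilation itself runs on background compiler threads and does not stop the application. Related operations such as deoptimization can require a safepoint, but these pauses are usually short.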

9. What’s new in Java 21 for STW reduction?
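Java 21 adds Generational ZGC (enabled with -XX:+UseZGC -XX:+ZGenerational), which collects young objects more frequently and lowers GC overhead. Project Lilliput's compact object headers are not part of Java 21; they arrived as an experimental feature in JDK 24.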

10. How does GC differ in microservices vs monoliths?
Microservices focus on predictable latency; monoliths optimize throughput.
