Senior 19 min · March 05, 2026

Garbage Collection in Java

Java GC — Unbounded Cache Full GC Spiral

Q: What is the difference between a young GC and a full GC?

Young GC collects only the young generation (eden and survivor) and is fast because most objects die young. Full GC collects the entire heap, compacting all regions, and can take seconds or tens of seconds. Full GC should be a rare event in a well-tuned service.

Q: How do I enable GC logging for a running JVM without restarting?

Use `jcmd <pid> VM.log output=gc.log what=gc*` on JDK 9 or later, where unified logging is available. JDK 8 has no unified logging, so restart with `-Xloggc:<file> -XX:+PrintGCDetails -XX:+PrintGCDateStamps`. To make the change persistent, add the logging flag to the JVM startup command.

Q: What is the recommended heap size for a containerized Java application?

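Set -XX:MaxRAMPercentage=75.0 so the JVM sizes its maximum heap at 75% of container memory. The older -XX:MaxRAMFraction flag is deprecated and only accepts integer divisors, so it cannot express 75%; -XX:MaxRAMFraction=1 would hand the entire container to the heap. The remaining 25% is for native memory (threads, metaspace, GC overhead). Never set -Xmx equal to the container memory limit.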
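To check that the percentage took effect, you can compare the heap the JVM actually chose with the figure you expected. The sketch below is illustrative only: the class and method names are hypothetical, the 2 GiB limit is an assumed example, and Runtime.maxMemory() can report slightly less than the configured maximum with some collectors.

```java
/** Sketch: compare the expected heap for a container limit with what this JVM reports. */
class HeapBudgetCheck {

    /** Max heap implied by a container limit and a MaxRAMPercentage value. */
    static long expectedMaxHeapBytes(long containerLimitBytes, double maxRamPercentage) {
        return (long) (containerLimitBytes * (maxRamPercentage / 100.0));
    }

    public static void main(String[] args) {
        long containerLimit = 2L * 1024 * 1024 * 1024; // assumed 2 GiB container limit
        long expected = expectedMaxHeapBytes(containerLimit, 75.0);
        long actual = Runtime.getRuntime().maxMemory();

        System.out.printf("expected max heap at 75%%: %d MiB%n", expected >> 20);
        System.out.printf("max heap reported by JVM: %d MiB%n", actual >> 20);
    }
}
```

Run it inside the container with the same flags as the service; a large gap between the two numbers usually means the flag was not picked up or an explicit -Xmx overrides it.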

G1 Full GC from unbounded cache spikes p99 latency to 30s+ and kills Kubernetes pods.

Naren Founder & Principal Engineer

20+ years shipping production Java in banking & fintech. Drawn from code that ran under real load.

Last updated: May 23, 2026

⚡Quick Answer

Java GC automatically reclaims memory from unreachable objects by tracing from GC roots
The heap is divided into young (Eden + Survivor) and old generations — most objects die young
G1 is the default collector, using region-based evacuation with tunable pause targets
ZGC and Shenandoah provide sub-10ms pauses at the cost of higher CPU and native memory overhead
Biggest production mistake: treating GC as a black box without enabling GC logging or measuring allocation rate

✦ Definition~90s read

What is Garbage Collection in Java?

Garbage Collection in Java is the JVM's automatic memory management mechanism. The GC periodically identifies objects that are no longer reachable from any GC root (thread stacks, static fields, JNI references) and reclaims their heap memory. This eliminates manual memory management but introduces pauses and CPU overhead that must be managed in production.

The JVM determines object reachability through a reachability analysis starting from GC roots. An object is considered dead, and eligible for collection, when no chain of references from any root can reach it. This is fundamentally different from reference counting (the primary mechanism in CPython and PHP), which cannot reclaim cyclic references on its own and needs a separate cycle detector.

Java's tracing GC handles cycles naturally because it only cares about reachability, not reference count.

The key production insight: GC is not triggered by the heap as a whole running low. A young collection runs when eden fills up, so allocation pressure is the trigger. This means a service with a large heap but low allocation rate may run GC infrequently, while a service with a small heap and high allocation rate may run GC constantly. Allocation rate, not heap size, is the primary driver of GC frequency.
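A back-of-the-envelope model makes the relationship concrete: the time between young collections is roughly eden capacity divided by allocation rate. This is a simplified sketch with hypothetical names and made-up sizes; it ignores adaptive eden resizing and per-thread allocation buffers.

```java
/** Sketch: estimate how often young GC runs from eden size and allocation rate. */
class YoungGcInterval {

    /** Rough seconds between young collections: eden capacity / allocation rate. */
    static double secondsBetweenYoungGcs(long edenBytes, long allocationBytesPerSecond) {
        if (allocationBytesPerSecond <= 0) {
            throw new IllegalArgumentException("allocation rate must be positive");
        }
        return (double) edenBytes / allocationBytesPerSecond;
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024;

        // Large eden, low allocation rate: young GC is rare.
        System.out.println(secondsBetweenYoungGcs(4096 * mib, 64 * mib));  // 64.0

        // Small eden, high allocation rate: young GC runs twice a second.
        System.out.println(secondsBetweenYoungGcs(256 * mib, 512 * mib));  // 0.5
    }
}
```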

Plain-English First

Imagine you're at a big party and everyone keeps leaving empty cups on tables. You hired a cleaner (the Garbage Collector) whose only job is to walk around, spot cups nobody is holding anymore, and throw them away so there's room for fresh drinks. The cleaner doesn't interrupt the party every second — they work in bursts, and sometimes they have to pause everything to do a deep clean. That pause is what Java developers are always trying to shrink. Java's GC is exactly that cleaner: it automatically finds objects your program no longer references and reclaims their memory so you never have to call free() yourself.

Every Java application runs a second program inside the JVM — the Garbage Collector. It decides when memory gets freed, how long your threads pause, and whether your latency SLAs hold up under load. Most developers treat it like a black box and then wonder why their microservice spikes to 500ms every few seconds in production.

Before automatic memory management, C and C++ developers had to manually allocate and free every byte. Java solved this with a managed heap and a runtime that tracks object reachability — if nothing in your program can reach an object, its memory can be reclaimed. That single idea eliminated an entire class of bugs but introduced a new challenge: the collector itself consumes CPU and introduces pauses.

The core misconception: GC pauses are inevitable and unfixable. They are not. Modern concurrent collectors keep pause times largely independent of heap size, but only if you understand the trade-offs and tune correctly for your workload.

What is Garbage Collection in Java?

io/thecodeforge/gc/ReachabilityDemo.java (Java)

package io.thecodeforge.gc;

/**
 * Demonstrates how Java GC determines object reachability.
 *
 * Key concept: An object is reachable if any GC root can access it
 * through a chain of references. When the chain breaks, the object
 * becomes eligible for collection.
 */
public class ReachabilityDemo {

    public static void main(String[] args) {
        // Object created on the heap — referenced by local variable 'order'
        // 'order' is a GC root (stack reference)
        Order order = new Order("ORD-001", 149.99);
        System.out.println("Order created: " + order.getId());

        // After this reassignment, the original Order object has no
        // reachable references. It becomes eligible for GC.
        order = new Order("ORD-002", 299.99);
        // The first Order("ORD-001") is now unreachable — GC will reclaim it

        // Demonstrating cyclic references — GC handles this correctly
        OrderNode nodeA = new OrderNode("A");
        OrderNode nodeB = new OrderNode("B");
        nodeA.next = nodeB;
        nodeB.next = nodeA; // cycle: A -> B -> A

        // Even though A and B reference each other, if we null out
        // our stack references, both become unreachable and are collected
        nodeA = null;
        nodeB = null;
        // The cycle A -> B -> A is still intact in memory, but no GC root
        // can reach either node. Both are eligible for collection.
    }

    static class Order {
        private final String id;
        private final double amount;

        Order(String id, double amount) {
            this.id = id;
            this.amount = amount;
        }

        String getId() { return id; }
    }

    static class OrderNode {
        final String name;
        OrderNode next;

        OrderNode(String name) {
            this.name = name;
        }
    }
}

Output

Order created: ORD-001

GC Roots — What Counts as a Root

Local variables on thread stacks — every active method frame holds references to objects it is using
Static fields of loaded classes — ClassLoader roots keep static objects alive for the lifetime of the class
JNI references — native code can hold references that the JVM must respect
Active monitors — objects currently locked by a thread are temporarily rooted during GC

Production Insight

The most common cause of memory leaks in production Java services is unintentional GC root retention. A static Map that accumulates entries, a ThreadLocal that is never cleaned, or a listener that is never deregistered creates a chain of references from a root that the GC cannot break. Take a heap dump (jmap -dump:live,format=b,file=heap.hprof <pid>) and analyze it with Eclipse MAT to find the dominator tree: the objects keeping the most memory alive through root chains.
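One common remedy for the accumulating static Map is to put a hard bound on it. The sketch below uses LinkedHashMap's removeEldestEntry hook for least-recently-used eviction. BoundedCache is a hypothetical name, and the class is not thread-safe: wrap it with Collections.synchronizedMap or use a dedicated caching library for concurrent access.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: a size-bounded LRU cache, so the map can never grow without limit. */
class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true: iteration order is least-recently-used first
        this.maxEntries = maxEntries;
    }

    /** Called by LinkedHashMap after each insertion; returning true evicts the eldest entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        BoundedCache<String, Integer> cache = new BoundedCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // touch "a" so that "b" becomes the eldest entry
        cache.put("c", 3); // exceeds the bound, so "b" is evicted
        System.out.println(cache.keySet());
    }
}
```

Evicted entries lose their last strong reference from the root, which makes them eligible for collection like any other unreachable object.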

Key Takeaway

GC reclaims memory from objects that no GC root can reach. Cyclic references are handled correctly by tracing GC. The #1 production memory leak pattern is objects retained through static fields, ThreadLocals, or unremoved listeners — not missing free() calls.

The Generational Heap — Why Most Objects Die Young

The JVM heap is divided into generations based on the weak generational hypothesis: most objects die young, and objects that survive one collection are likely to survive many more. This observation drives the generational heap design that every modern JVM collector uses.

The young generation consists of eden space (where new objects are allocated) and two survivor spaces (S0 and S1). New objects are allocated in eden. When eden fills up, a minor GC (young collection) runs: live objects in eden are copied to one survivor space, and live objects in the other survivor space are also copied and aged. Objects that survive enough young collections (controlled by -XX:MaxTenuringThreshold) are promoted to the old generation.

The old generation holds long-lived objects. When the old generation fills up or a collection threshold is reached, a major GC runs. In G1, this is a mixed GC that collects both young and old regions. In extreme cases, a full GC (stop-the-world compaction of the entire heap) is triggered — this is the catastrophic failure mode you must avoid.

The critical production insight: the tenuring threshold determines how quickly objects move to old generation. Too low, and short-lived objects pollute old generation, increasing old gen GC frequency. Too high, and survivor spaces overflow, forcing premature promotion. Both paths degrade performance.
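The effect of the threshold can be seen in a deliberately simplified model: an object is promoted once it has survived the threshold number of young collections, and survivor space is assumed never to overflow. The names and the sample lifetimes are hypothetical; real collectors also adjust the effective threshold dynamically.

```java
/** Sketch: how the tenuring threshold changes the number of promoted objects. */
class TenuringSketch {

    /**
     * Counts objects that reach old generation.
     * collectionsSurvived[i] is how many young collections object i stays reachable for.
     */
    static int promotedCount(int[] collectionsSurvived, int tenuringThreshold) {
        int promoted = 0;
        for (int survived : collectionsSurvived) {
            if (survived >= tenuringThreshold) {
                promoted++;
            }
        }
        return promoted;
    }

    public static void main(String[] args) {
        // Three objects die in eden, three are medium-lived, one is a long-lived cache entry.
        int[] lifetimes = {0, 0, 0, 1, 2, 3, 40};

        System.out.println(promotedCount(lifetimes, 1));  // 4: medium-lived objects pollute old gen
        System.out.println(promotedCount(lifetimes, 15)); // 1: only the long-lived object is promoted
    }
}
```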

io/thecodeforge/gc/GenerationalBehaviorDemo.java (Java)

package io.thecodeforge.gc;

import java.util.ArrayList;
import java.util.List;

/**
 * Demonstrates how allocation patterns interact with the generational heap.
 *
 * Objects that survive young collections are promoted to old generation.
 * Understanding this promotion mechanism is critical for tuning.
 */
public class GenerationalBehaviorDemo {

    /**
     * Pattern 1: Short-lived objects — ideal for generational GC.
     * These objects die in eden and never reach old generation.
     * GC can reclaim them with a fast young collection.
     */
    public void processRequest() {
        // These objects are created, used, and become unreachable
        // within a single method call. They die in eden.
        String requestId = java.util.UUID.randomUUID().toString();
        byte[] payload = new byte[4096];
        List<String> validationErrors = new ArrayList<>();

        // After this method returns, all three objects become unreachable
        // because they are only referenced by local variables (stack roots).
    }

    /**
     * Pattern 2: Long-lived cached objects — promoted to old gen.
     * These objects survive young collections and get promoted.
     * They occupy old generation permanently (or until eviction).
     *
     * Production risk: If this cache grows unbounded, old generation
     * fills up and triggers full GC or OOM.
     */
    private final List<byte[]> longLivedCache = new ArrayList<>();

    public void cacheData(byte[] data) {
        // This reference keeps the byte array alive indefinitely.
        // After surviving MaxTenuringThreshold young collections,
        // it is promoted to old generation.
        longLivedCache.add(data);
    }

    /**
     * Pattern 3: Premature promotion — objects that should die young
     * but get promoted because survivor space is full.
     *
     * If allocation rate exceeds survivor space capacity, objects
     * are promoted directly to old generation even if they are short-lived.
     * This is called premature promotion and it pollutes old generation.
     *
     * Fix: Increase survivor space ratio (-XX:SurvivorRatio)
     *       or reduce allocation rate.
     */
    public void burstAllocation() {
        // If this loop runs fast enough to fill eden AND overflow
        // survivor space, these temporary objects get promoted to
        // old generation even though they die after each iteration.
        for (int i = 0; i < 100_000; i++) {
            byte[] temp = new byte[256];
            // temp is short-lived, but under pressure it may be
            // prematurely promoted to old generation
        }
    }

    /**
     * Production tuning flags for generational behavior:
     *
     * -XX:NewRatio=2              // old:young = 2:1 (default for most collectors)
     * -XX:SurvivorRatio=8         // eden:survivor = 8:1 (default)
     * -XX:MaxTenuringThreshold=15 // objects survive 15 young GCs before promotion
     * -XX:+AlwaysTenure            // promote immediately (dangerous — avoid)
     * -XX:+NeverTenure             // never promote (survivor overflow → old gen)
     *
     * Monitor promotion rate with:
     *   jstat -gcutil <pid> 1000
     *   Watch 'O' column (old gen utilization) for steady growth.
     *   Steady growth with low live data = premature promotion.
     */
}

The Weak Generational Hypothesis — The Foundation of All Modern GC

If 90% of objects die in eden, collecting eden reclaims 90% of garbage with minimal work
Young collection only scans eden + survivor spaces — not the entire heap. This is fast.
Old generation collection is expensive because it must handle long-lived object graphs
The hypothesis fails for workloads with uniform object lifetimes — batch processing, data pipelines
When the hypothesis fails, you see high promotion rates and frequent old gen collections

Production Insight

Monitor promotion rate as a leading indicator of GC health. jstat -gcutil reports only utilization percentages, so to see how much is promoted per cycle use jstat -gc <pid> 1000 and track how far OU (old-gen used, in KB) grows after each young collection, or compare old-generation occupancy before and after each young collection in the GC log. A healthy service promotes < 5% of young gen per cycle. If promotion exceeds 20%, too many objects are surviving young collections: increase survivor space (for example -XX:SurvivorRatio=6), raise -XX:MaxTenuringThreshold if it has been lowered from its default of 15, or investigate why short-lived objects are escaping young gen (common cause: objects stored in thread-local caches or request-scoped maps that persist across requests).
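The same signal can be sampled from inside the process through the standard java.lang.management API. The sketch below is a heuristic, not a definitive implementation: it assumes old-generation pool names contain "Old" or "Tenured" (true for names such as "G1 Old Gen" or "Tenured Gen", but not for every collector), and OldGenSampler is a hypothetical name.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

/** Sketch: sample old-generation occupancy to watch for steady growth. */
class OldGenSampler {

    /** Used bytes across heap pools that look like an old generation, or -1 if none match. */
    static long oldGenUsedBytes() {
        long total = 0;
        boolean found = false;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            boolean looksOld = name.contains("Old") || name.contains("Tenured");
            if (pool.getType() == MemoryType.HEAP && looksOld) {
                MemoryUsage usage = pool.getUsage(); // may be null if the pool is no longer valid
                if (usage != null) {
                    total += usage.getUsed();
                    found = true;
                }
            }
        }
        return found ? total : -1;
    }

    public static void main(String[] args) throws InterruptedException {
        long before = oldGenUsedBytes();
        Thread.sleep(1000);
        long after = oldGenUsedBytes();
        System.out.printf("old gen used: %d -> %d bytes (delta %d)%n", before, after, after - before);
    }
}
```

Exporting this value on a timer and graphing it gives the same trend as the jstat old-gen columns without attaching a tool to the process.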

Key Takeaway

The generational heap exploits the statistical fact that most objects die young. Young collection is fast because it only scans eden + survivor. Old generation is expensive to collect. Premature promotion — short-lived objects reaching old gen — is a silent performance killer. Monitor promotion rate with jstat -gcutil.

GC Algorithms — Mark-Sweep, Copying, and Compaction

All GC algorithms are built on three fundamental operations: marking (identifying live objects), sweeping (reclaiming dead objects' memory), and compacting (defragmenting live objects to create contiguous free space). Different collectors combine these operations differently to optimize for pause time, throughput, or memory efficiency.

Mark-and-sweep identifies live objects (mark phase) then reclaims unmarked memory (sweep phase). The problem: it creates fragmentation. After many allocation-deallocation cycles, free memory is scattered in small chunks. Large object allocations may fail even when total free memory is sufficient — this is external fragmentation.
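The two phases can be sketched over a toy object graph. This is an illustration of the idea only, not how HotSpot implements it; the Node class and the graph are hypothetical. Note that the unreachable cycle b <-> c is reclaimed, because marking only follows references from roots.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

/** Sketch: mark everything reachable from the roots, then sweep the rest. */
class ToyMarkSweep {

    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    /** Mark phase: traverse the graph from the roots and collect every reachable node. */
    static Set<Node> mark(List<Node> roots) {
        Set<Node> marked = new HashSet<>();
        Deque<Node> worklist = new ArrayDeque<>(roots);
        while (!worklist.isEmpty()) {
            Node current = worklist.pop();
            if (marked.add(current)) {          // first visit: follow outgoing references
                worklist.addAll(current.refs);
            }
        }
        return marked;
    }

    /** Sweep phase: remove unmarked nodes from the heap and return their names. */
    static List<String> sweep(List<Node> heap, Set<Node> marked) {
        List<String> reclaimed = new ArrayList<>();
        for (Iterator<Node> it = heap.iterator(); it.hasNext(); ) {
            Node node = it.next();
            if (!marked.contains(node)) {
                reclaimed.add(node.name);
                it.remove();
            }
        }
        return reclaimed;
    }

    /** Builds root -> a plus an unreachable cycle b <-> c, then runs one collection. */
    static List<String> demo() {
        Node root = new Node("root");
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c");
        root.refs.add(a);
        b.refs.add(c);
        c.refs.add(b);
        List<Node> heap = new ArrayList<>(List.of(root, a, b, c));
        return sweep(heap, mark(List.of(root)));
    }

    public static void main(String[] args) {
        System.out.println("reclaimed: " + demo()); // reclaimed: [b, c]
    }
}
```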

Copying collectors solve fragmentation by copying live objects to a fresh region and discarding the old region entirely. This is inherently compacting — live objects end up contiguous. The cost: copying live objects takes time proportional to the live data set, and you need double the memory (from-space and to-space). The generational heap reduces this cost by only copying in young generation.

Mark-and-compact identifies live objects then slides them to one end of the heap, creating one contiguous free region. This avoids the double-memory cost of copying but requires updating every reference to moved objects — a potentially expensive operation that must be done during a stop-the-world pause or with complex concurrent mechanisms.
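A toy model shows what sliding compaction buys. The heap is modeled as an array of slots (true = occupied), and the names are hypothetical. A real collector must also update every reference to a moved object, which this sketch leaves out.

```java
import java.util.Arrays;

/** Sketch: sliding compaction turns scattered free slots into one contiguous run. */
class ToyCompaction {

    /** Length of the longest run of free (false) slots. */
    static int largestFreeRun(boolean[] heap) {
        int best = 0;
        int run = 0;
        for (boolean occupied : heap) {
            run = occupied ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }

    /** Slides all occupied slots to the low end of a new heap array. */
    static boolean[] compact(boolean[] heap) {
        int live = 0;
        for (boolean occupied : heap) {
            if (occupied) {
                live++;
            }
        }
        boolean[] compacted = new boolean[heap.length];
        Arrays.fill(compacted, 0, live, true);
        return compacted;
    }

    public static void main(String[] args) {
        boolean[] heap = new boolean[100];
        for (int i = 1; i < heap.length; i += 2) {
            heap[i] = true; // every other slot is live, so the heap is 50% free but fragmented
        }
        System.out.println(largestFreeRun(heap));          // 1
        System.out.println(largestFreeRun(compact(heap))); // 50
    }
}
```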

io/thecodeforge/gc/GCAlgorithmDemo.java (Java)

package io.thecodeforge.gc;

import java.util.ArrayList;
import java.util.List;

/**
 * Demonstrates how different GC algorithm characteristics
 * affect production behavior.
 *
 * This is not a GC implementation — it illustrates the concepts
 * that drive real collector design decisions.
 */
public class GCAlgorithmDemo {

    /**
     * MARK-AND-SWEEP characteristic:
     * - Fast reclaim but creates fragmentation
     * - External fragmentation: total free > requested, but not contiguous
     *
     * Production impact: After hours of operation, allocation of large
     * objects fails even though 40% of heap is free — it's fragmented.
     * This triggers unnecessary full GC or OOM.
     */
    public void demonstrateFragmentation() {
        // Imagine this array is the heap, each index is a memory block
        // true = occupied, false = free
        boolean[] heap = new boolean[100];

        // Simulate allocation pattern: allocate and free alternating blocks
        for (int i = 0; i < 100; i++) {
            heap[i] = true; // allocate
        }
        for (int i = 0; i < 100; i += 2) {
            heap[i] = false; // free every other block
        }
        // Result: 50% free, but no contiguous block of size > 1
        // A request for 3 contiguous blocks fails despite 50 free blocks
        // This is external fragmentation — the problem compaction solves
    }

    /**
     * COPYING COLLECTOR characteristic:
     * - Copies live objects to to-space, discards from-space
     * - Inherently compacting — no fragmentation
     * - Cost: proportional to live data, not dead data
     * - Requires double the memory (from + to spaces)
     *
     * Production insight: Copying cost is why large live data sets
     * cause longer young collection pauses. If your service has
     * 2GB of live objects in young gen, copying takes measurable time.
     */
    public void demonstrateCopyingCost() {
        // Simulating live data that must be copied during young GC
        List<byte[]> liveObjects = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            liveObjects.add(new byte[1024]); // 1KB each = ~10MB live data
        }

        // During young GC, all 10MB must be copied to survivor space.
        // If only 1MB were live, the cost would be 10x lower.
        // This is why reducing live data in young gen reduces pause time.
        //
        // Real production fix: avoid holding references to temporary
        // objects across request boundaries. Let them die in eden.
    }
}

The Three Fundamental GC Operations

Serial GC: mark-sweep-compact, all stop-the-world. Simple but pauses grow with heap.
Parallel GC: same algorithm as Serial but uses multiple threads. Faster but same pause characteristics.
G1: concurrent mark plus stop-the-world evacuation (copying) of selected regions. Compaction happens per region, not whole-heap.
ZGC: concurrent mark + concurrent compact via colored pointers. Only brief synchronization pauses (mark start, mark end, relocate start) stop the world.
Shenandoah: concurrent mark + concurrent compact via forwarding pointers (Brooks pointers in the original design, the object header since JDK 13). Similar goals to ZGC with a different implementation.

Production Insight

Fragmentation is the silent killer of long-running services. After days of operation, a heap with 40% free memory may fail to allocate a 10MB object because no contiguous 10MB block exists. This triggers a full GC to compact the heap. Monitor fragmentation with jcmd <pid> GC.heap_info and look at free region distribution. G1 handles fragmentation well through region-based evacuation. If you see increasing full GC frequency over time without increasing live data, fragmentation is the cause.

Key Takeaway

All GC algorithms are built on mark, sweep, and compact. Fragmentation is the primary failure mode of mark-and-sweep. Copying collectors solve fragmentation but cost proportional to live data. Modern collectors (G1, ZGC, Shenandoah) do as much work concurrently as possible to minimize stop-the-world pauses.

G1 GC — The Default Workhorse

G1 (Garbage-First) has been the default JVM collector since Java 9. It divides the heap into equal-sized regions (1MB to 32MB) and prioritizes collecting regions with the most garbage — hence 'garbage-first'. G1 maintains a remembered set per region tracking incoming references, enabling independent region collection without scanning the entire heap.

G1 operates in young-only and mixed collection cycles. Young GC collects survivor and eden regions. When the heap occupancy exceeds the Initiating Heap Occupancy Percent (IHOP), G1 triggers a concurrent marking cycle. After marking completes, subsequent mixed GCs collect both young and old regions identified as mostly garbage.

The critical production insight: G1's pause time is primarily driven by the number of regions it must collect in a single pause, not heap size. A 64GB heap with aggressive evacuation can pause longer than a 4GB heap with conservative settings. This is the opposite of what most engineers assume.

io/thecodeforge/gc/G1TuningExample.java (Java)

package io.thecodeforge.gc;

import java.util.concurrent.ConcurrentHashMap;
import java.util.Map;

/**
 * Demonstrates allocation patterns that stress G1 differently.
 *
 * Key insight: G1 humongous objects (>50% region size) bypass normal
 * allocation and can trigger to-space exhausted failures.
 */
public class G1TuningExample {

    // Cache with large value objects — common source of humongous allocations
    private final Map<String, byte[]> payloadCache = new ConcurrentHashMap<>();

    /**
     * BAD: Allocates objects that may exceed humongous threshold.
     * With 1MB regions (the minimum; G1 derives the default from heap size),
     * objects > 512KB are humongous.
     * With 32MB regions, threshold is 16MB, much safer for large payloads.
     *
     * Tuning: -XX:G1HeapRegionSize=32M
     *         -XX:G1ReservePercent=15
     *         -XX:InitiatingHeapOccupancyPercent=35
     */
    public void cacheLargePayload(String key, int sizeBytes) {
        byte[] payload = new byte[sizeBytes];
        for (int i = 0; i < Math.min(sizeBytes, 1024); i++) {
            payload[i] = (byte) (i & 0xFF);
        }
        payloadCache.put(key, payload);
    }

    /**
     * BETTER: Chunk large payloads to stay below humongous threshold.
     * Each chunk is independently collectible as a regular object.
     */
    public void cacheChunkedPayload(String key, byte[] fullPayload) {
        int chunkSize = 256 * 1024; // 256KB chunks
        int numChunks = (fullPayload.length + chunkSize - 1) / chunkSize;

        for (int i = 0; i < numChunks; i++) {
            int offset = i * chunkSize;
            int length = Math.min(chunkSize, fullPayload.length - offset);
            byte[] chunk = new byte[length];
            System.arraycopy(fullPayload, offset, chunk, 0, length);
            payloadCache.put(key + ":chunk:" + i, chunk);
        }
    }

    /**
     * Production G1 flags for a 16GB heap with mixed allocation profile:
     *
     * -XX:+UseG1GC
     * -Xms16g -Xmx16g
     * -XX:G1HeapRegionSize=16m
     * -XX:MaxGCPauseMillis=200
     * -XX:G1ReservePercent=15
     * -XX:InitiatingHeapOccupancyPercent=35
     * -XX:G1MixedGCCountTarget=8
     * -XX:G1MixedGCLiveThresholdPercent=85   // experimental: also needs -XX:+UnlockExperimentalVMOptions
     * -Xlog:gc*,gc+humongous=debug:file=/var/log/gc.log:time,uptime,level,tags
     */
}

G1's Core Mental Model: Region-Based Evacuation

Pause time scales with live data in collected regions, not total heap size
Humongous objects break this model — they span multiple regions and cannot be partially evacuated
Remembered sets consume 5-10% of heap as off-heap overhead — budget for this when setting -Xmx
To-space exhausted means G1 literally ran out of regions to evacuate into — this is a full GC fallback
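The humongous rule reduces to simple arithmetic, sketched below using the more-than-half-a-region threshold described in this section. The names are hypothetical, and objectBytes is assumed to be the full object size including its header.

```java
/** Sketch: will an allocation be humongous, and how many regions will it occupy? */
class HumongousCheck {

    /** An object larger than half a region is treated as humongous. */
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes > regionBytes / 2;
    }

    /** Contiguous regions needed to hold an object (rounded up). */
    static long regionsNeeded(long objectBytes, long regionBytes) {
        return (objectBytes + regionBytes - 1) / regionBytes;
    }

    public static void main(String[] args) {
        long kib = 1024L;
        long mib = 1024L * kib;

        System.out.println(isHumongous(600 * kib, 1 * mib));   // true: above the 512KB threshold
        System.out.println(isHumongous(600 * kib, 32 * mib));  // false: threshold is now 16MB
        System.out.println(regionsNeeded(5 * mib + 16, 1 * mib)); // 6: the tail of the last region is wasted
    }
}
```

The last line is why humongous allocations waste space: the unused remainder of the final region cannot hold other objects.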

Production Insight

G1's -XX:MaxGCPauseMillis is a soft target, not a hard guarantee. G1 will attempt to meet this by adjusting how many regions to collect per cycle, but allocation rate spikes can violate it. If you need hard latency guarantees, G1 is the wrong collector. Monitor actual pause times against your SLA — if G1 violates MaxGCPauseMillis more than 5% of the time, the workload demands ZGC or Shenandoah.
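To put a number on how often the target is missed, you can scan the GC log for pause lines and count those above the target. The sketch below assumes unified-logging pause lines that end in a duration such as "45.210ms"; the sample lines in main are illustrative, not captured output, and PauseBudget is a hypothetical name.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: fraction of GC pauses in a log that exceed the pause-time target. */
class PauseBudget {

    // Matches lines containing "Pause ..." that end with a duration in milliseconds.
    private static final Pattern PAUSE = Pattern.compile("Pause .*?(\\d+(?:\\.\\d+)?)ms\\s*$");

    static double violationRate(List<String> logLines, double targetMillis) {
        int pauses = 0;
        int violations = 0;
        for (String line : logLines) {
            Matcher m = PAUSE.matcher(line);
            if (m.find()) {
                pauses++;
                if (Double.parseDouble(m.group(1)) > targetMillis) {
                    violations++;
                }
            }
        }
        return pauses == 0 ? 0.0 : (double) violations / pauses;
    }

    public static void main(String[] args) {
        List<String> sample = List.of(
            "[12.345s][info][gc] GC(7) Pause Young (Normal) (G1 Evacuation Pause) 512M->128M(1024M) 45.210ms",
            "[13.001s][info][gc] GC(8) Concurrent Mark Cycle 120.500ms",
            "[14.876s][info][gc] GC(9) Pause Young (Mixed) (G1 Evacuation Pause) 600M->200M(1024M) 250.500ms",
            "[16.020s][info][gc] GC(10) Pause Young (Normal) (G1 Evacuation Pause) 480M->130M(1024M) 12.000ms");

        System.out.printf("violation rate: %.2f%n", violationRate(sample, 200.0));
    }
}
```

Concurrent phases are skipped on purpose: they do not stop application threads, so they do not count against the pause budget.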

Key Takeaway

G1 is the right default for most workloads, but it has a hard ceiling on pause-time predictability. Once your latency budget drops below ~100ms p99, evaluate ZGC or Shenandoah. Never tune G1 without GC logs enabled — the default logging is insufficient for production diagnosis.

G1 Tuning Decision Tree

If: Humongous allocations appear in GC logs

→

Use: Increase -XX:G1HeapRegionSize to raise the humongous threshold (half a region), so fewer objects count as humongous. The maximum region size is 32MB on older JDKs (JDK 18 raised the limit to 512MB). Chunk large objects at the application level if possible.

If: Mixed GCs are too frequent, causing throughput loss

→

Use: Increase -XX:G1MixedGCCountTarget (default 8) to spread collection over more cycles. Lower -XX:G1MixedGCLiveThresholdPercent to collect only regions with more garbage.

If: Full GC appears despite adequate heap

→

Use: IHOP is miscalibrated. Adaptive IHOP (-XX:+G1UseAdaptiveIHOP) is on by default since JDK 9 and lets G1 self-tune; if Full GCs persist, set -XX:InitiatingHeapOccupancyPercent lower (try 35) so concurrent marking starts earlier.

If: Pause times exceed MaxGCPauseMillis consistently

→

Use: Live data set is too large for G1's evacuation budget. Either reduce live data (caching strategy) or migrate to ZGC/Shenandoah where pause times are independent of live data size.

ZGC — Sub-Millisecond Pause Collector

ZGC (Z Garbage Collector) was introduced as experimental in JDK 11 and became production-ready in JDK 15. Its defining characteristic: pause times stay below 10ms regardless of heap size — tested up to 16TB heaps. ZGC achieves this through concurrent everything: marking, relocation, and reference processing all happen while application threads run.

ZGC uses load barriers with colored pointers. Every object reference carries metadata bits (Marked0, Marked1, Remapped, Finalizable) embedded in the pointer itself. The load barrier intercepts every reference load from the heap to check whether the reference needs remapping. This is the fundamental trade-off: ZGC replaces long GC pauses with a small overhead on every reference load.
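The idea can be illustrated with plain bit arithmetic on a 64-bit value. This is a toy model: the 42-bit address width and the bit positions are assumptions for illustration, the real layout varies between ZGC versions, and the real barrier is emitted by the JIT rather than written in Java.

```java
/** Toy model of colored pointers: metadata bits stored above the address bits. */
class ColoredPointerToy {
    static final int ADDRESS_BITS = 42; // assumed width, for illustration only
    static final long ADDRESS_MASK = (1L << ADDRESS_BITS) - 1;

    static final long MARKED0     = 1L << ADDRESS_BITS;
    static final long MARKED1     = 1L << (ADDRESS_BITS + 1);
    static final long REMAPPED    = 1L << (ADDRESS_BITS + 2);
    static final long FINALIZABLE = 1L << (ADDRESS_BITS + 3);
    static final long COLOR_MASK  = MARKED0 | MARKED1 | REMAPPED | FINALIZABLE;

    /** Combines an address with one color bit. */
    static long color(long address, long colorBit) {
        return (address & ADDRESS_MASK) | colorBit;
    }

    /** Strips the metadata bits to recover the address. */
    static long address(long pointer) {
        return pointer & ADDRESS_MASK;
    }

    /** The check a load barrier makes: a pointer without the current good color takes the slow path. */
    static boolean needsSlowPath(long pointer, long goodColor) {
        return (pointer & COLOR_MASK) != goodColor;
    }

    public static void main(String[] args) {
        long pointer = color(0x1234L, REMAPPED);
        System.out.println(Long.toHexString(address(pointer)));   // 1234
        System.out.println(needsSlowPath(pointer, REMAPPED));     // false: fast path
        System.out.println(needsSlowPath(pointer, MARKED0));      // true: a new GC phase changed the good color
    }
}
```

The fast path is a mask and a compare, which is why the per-load cost is small but never zero.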

As of JDK 21, ZGC supports generational mode (-XX:+ZGenerational), which dramatically improves throughput by focusing collection on young objects; it became the default in JDK 23. Non-generational ZGC collects the entire heap every cycle, which limits throughput on allocation-heavy workloads.

io/thecodeforge/gc/ZGCTuningExample.java (Java)

package io.thecodeforge.gc;

import java.util.concurrent.atomic.AtomicLong;

/**
 * ZGC-specific considerations for production workloads.
 *
 * ZGC trades per-access overhead for near-zero pause times.
 * The load barrier adds ~4-8% overhead on pointer-heavy workloads.
 */
public class ZGCTuningExample {

    private final AtomicLong allocationCounter = new AtomicLong(0);

    /**
     * Production ZGC flags for a 32GB heap, latency-sensitive service:
     *
     * -XX:+UseZGC
     * -XX:+ZGenerational              // JDK 21+ — critical for throughput
     * -Xms32g -Xmx32g                // Always set Xms=Xmx for ZGC
     * -XX:SoftMaxHeapSize=28g         // ZGC-specific: target heap occupancy
     * -XX:ZCollectionInterval=5       // Suggest GC cycle every 5 seconds
     * -XX:ConcGCThreads=4             // Concurrent GC threads
     * -Xlog:gc*:file=/var/log/zgc.log:time,uptime,level,tags
     *
     * CRITICAL: ZGC uses ~20% native memory overhead beyond -Xmx.
     * Container memory limit must be heap * 1.25 minimum.
     */

    /**
     * ZGC SoftMaxHeapSize is unique — it tells ZGC to try to stay below
     * this threshold but can exceed it under allocation pressure.
     *
     * Use case: Set heap to 32GB, SoftMaxHeapSize to 28GB.
     * ZGC will trigger cycles aggressively to stay under 28GB.
     * Only allocates into the remaining 4GB under extreme pressure.
     */
    public void demonstrateSoftMaxHeapConcept() {
        // With SoftMaxHeapSize=28g and Xmx=32g:
        // - ZGC targets 28GB occupancy
        // - If allocation pressure pushes past 28GB, ZGC cycles more aggressively
        // - If it hits 32GB, allocation stalls (not OOM, but backpressure)
    }
}

ZGC's Core Mental Model: Colored Pointers + Load Barriers

Pause times are truly independent of heap size and live data size — tested to 16TB
The trade-off is per-access CPU overhead, not pause time — you pay on every object load
ZGC cannot use compressed object pointers (UseCompressedOops) — increases memory usage by ~15% on heaps < 32GB
Generational ZGC (JDK 21+) reduces overhead dramatically by focusing on young generation

Production Insight

ZGC's biggest production risk is native memory consumption. Non-generational ZGC multi-maps the heap, mapping the same physical memory at several virtual addresses for colored pointer management. This consumes virtual address space and can make resident-memory figures look inflated in some tools. Budget container memory as heap × 1.25 for ZGC versus heap × 1.15 for G1. Also, ZGC requires a 64-bit system; it does not run on 32-bit.
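The budgeting rule is plain multiplication, sketched here so the numbers can be dropped into a deployment script. The 1.25 and 1.15 factors are this article's rules of thumb, not JVM constants, and the names are hypothetical.

```java
/** Sketch: container memory limit implied by a heap size and an overhead factor. */
class ContainerBudget {

    static long containerLimitBytes(long heapBytes, double overheadFactor) {
        return (long) Math.ceil(heapBytes * overheadFactor);
    }

    public static void main(String[] args) {
        long gib = 1L << 30;
        long heap = 32 * gib;

        System.out.printf("ZGC (x1.25): %.1f GiB%n", containerLimitBytes(heap, 1.25) / (double) gib);
        System.out.printf("G1  (x1.15): %.1f GiB%n", containerLimitBytes(heap, 1.15) / (double) gib);
    }
}
```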

Key Takeaway

ZGC is the correct choice when p99 latency must be below 10ms and you can afford 10-15% throughput overhead. Enable generational mode on JDK 21+. Budget 25% extra native memory beyond heap size. ZGC's SoftMaxHeapSize is the most underrated production feature for containerized deployments.

Shenandoah — Red Hat's Low-Pause Contender

Shenandoah is Red Hat's concurrent compacting collector. It arrived as an experimental feature in JDK 12 and became a production feature in JDK 15. It achieves low pause times through concurrent evacuation, moving live objects while application threads run, using a forwarding pointer for every object. The original design stored that pointer in a dedicated extra word, the Brooks pointer.

Shenandoah differs from ZGC in a critical way: it uses per-object forwarding pointers instead of colored pointers. This means Shenandoah does not require specific pointer bit layouts and works with compressed oops, reducing memory overhead compared to ZGC on heaps under 32GB. Since JDK 13 the forwarding pointer lives in the object's mark word, so the extra Brooks word, and the per-object cost discussed in this section, applies only to older Shenandoah builds.

Shenandoah operates in three concurrent phases: concurrent mark, concurrent evacuate, and concurrent update-refs. The initial mark and final mark phases are short stop-the-world pauses, typically under 10ms. Shenandoah's pacing mechanism backpressures allocation threads proportionally when the collector falls behind, creating smoother degradation than ZGC's hard allocation stalls.
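The forwarding-pointer indirection behind concurrent evacuation can be modeled in a few lines. This is a toy sketch with hypothetical names: every access resolves through the forwarding pointer, so a stale reference still reaches the moved copy. The real collector installs the pointer atomically and does the resolving in JIT-emitted barriers.

```java
/** Toy model of a forwarding pointer: stale references are redirected to the moved copy. */
class BrooksPointerToy {

    static final class Obj {
        Obj forwardee = this; // points to itself until the object is evacuated
        int value;
        Obj(int value) { this.value = value; }
    }

    /** What a barrier does on access: follow the forwarding pointer to the current copy. */
    static Obj resolve(Obj ref) {
        return ref.forwardee;
    }

    /** Evacuation: copy the object, then redirect the old copy to the new one. */
    static Obj evacuate(Obj old) {
        Obj copy = new Obj(old.value);
        old.forwardee = copy;
        return copy;
    }

    public static void main(String[] args) {
        Obj stale = new Obj(7);      // the application holds this reference
        evacuate(stale);             // the collector moves the object
        resolve(stale).value = 9;    // a write through the stale reference lands in the new copy

        System.out.println(resolve(stale).value); // 9
        System.out.println(stale.value);          // 7: the old copy is left untouched
    }
}
```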

io/thecodeforge/gc/ShenandoahTuningExample.java (Java)

package io.thecodeforge.gc;

import java.util.ArrayList;
import java.util.List;

/**
 * Shenandoah-specific production considerations.
 *
 * The original Shenandoah design used Brooks pointers: every object had an
 * extra forwarding pointer field, adding 8 bytes per object on 64-bit systems.
 * Since JDK 13 the forwarding pointer is kept in the object header instead.
 */
public class ShenandoahTuningExample {

    /**
     * Brooks pointer overhead calculation:
     *
     * Object with 2 fields (16 bytes header + 16 bytes data = 32 bytes)
     * + 8 bytes Brooks pointer = 40 bytes per object
     * Overhead: 25% increase per object
     *
     * For 10 million small objects: ~80MB additional memory
     * For 100 million small objects: ~800MB additional memory
     */
    public long estimateBrooksOverhead(int objectCount) {
        return (long) objectCount * 8;
    }

    /**
     * Production Shenandoah flags for a 16GB heap:
     *
     * -XX:+UseShenandoahGC
     * -Xms16g -Xmx16g
     * -XX:ShenandoahGCHeuristics=adaptive
     * -XX:ShenandoahAllocationThreshold=10
     * -XX:+UseCompressedOops               // works with Shenandoah (unlike ZGC)
     * -Xlog:gc*:file=/var/log/shenandoah.log:time,uptime,level,tags
     */

    /**
     * Shenandoah pacing is a unique feature that backpressures allocation
     * threads when the collector falls behind.
     *
     * Unlike ZGC which stalls allocation entirely, Shenandoah slows down
     * allocating threads proportionally. This creates smoother latency
     * degradation under load rather than sharp spikes.
     */
    public void demonstratePacingBehavior() {
        List<byte[]> allocations = new ArrayList<>();

        // Under heavy allocation, Shenandoah will pace this loop
        // by adding small delays to each allocation.
        // The delay is proportional to how far behind the collector is.
        for (int i = 0; i < 100_000; i++) {
            allocations.add(new byte[1024]);
        }
    }
}

Shenandoah's Core Mental Model: Forwarding Pointers + Concurrent Evacuation

Objects are relocated concurrently: barriers on reference loads (JDK 13+) redirect application threads to the moved copy while evacuation is in progress
Works with compressed oops, which saves ~15% memory compared to ZGC on heaps under 32GB
Per-object overhead of 8 bytes applies only to pre-JDK 13 builds that use a separate Brooks pointer word; it is significant for workloads with many small objects
Pacing mechanism creates graceful degradation instead of hard allocation stalls

Production Insight

Shenandoah's historical production risk is the Brooks pointer overhead on small-object-heavy workloads. On builds older than JDK 13, a service with 100M+ objects under 64 bytes pays roughly 800MB for the extra 8-byte word per object. JDK 13 and later store the forwarding pointer in the object header, so check your JDK version before budgeting for it. Additionally, Shenandoah's pacing can create subtle latency degradation that is hard to distinguish from application-level slowness, so always correlate pacing delays in the GC log with latency metrics.

Key Takeaway

Shenandoah is the right choice when you need low-pause GC on moderate heaps (< 32GB) and want compressed oops support. Its pacing mechanism creates smoother degradation than ZGC's allocation stalls. The Brooks pointer overhead is the hidden cost on pre-JDK 13 builds: budget 8 bytes per object there. Later releases removed that extra word.

JVM Flags That Actually Matter

Most JVM GC flags have sensible defaults. A small subset moves the needle in production. Understanding which flags to adjust — and when — prevents the common anti-pattern of blindly copying flags from blog posts without understanding their impact on your specific workload.

Flags fall into three categories: heap sizing, collector behavior, and logging. Heap sizing flags (-Xms, -Xmx, -XX:NewRatio) control memory layout. Collector behavior flags (-XX:MaxGCPauseMillis, -XX:InitiatingHeapOccupancyPercent) control collection strategy. Logging flags (-Xlog:gc) enable observability. The third category is the most important — you cannot tune what you cannot measure.


io/thecodeforge/gc/ProductionJVMFlags.javaJAVA

package io.thecodeforge.gc;

/**
 * Production JVM flag configurations organized by collector.
 * These are starting points — tune based on measured workload characteristics.
 */
public class ProductionJVMFlags {

    /**
     * UNIVERSAL FLAGS (apply to all collectors):
     *
     * -Xms<size> -Xmx<size>         // Set min=max to avoid resize overhead
     * -XX:+AlwaysPreTouch             // Pre-zero heap pages at startup
     * -XX:+DisableExplicitGC          // Ignore System.gc() calls
     * -XX:+HeapDumpOnOutOfMemoryError // Auto heap dump on OOM
     * -XX:HeapDumpPath=/var/log/      // Where to write heap dumps
     * -XX:+UseContainerSupport        // Respect cgroup limits (default JDK 10+)
     * -XX:MaxRAMPercentage=75.0       // Set heap as % of container memory
     * -XX:NativeMemoryTracking=detail // Track off-heap memory usage
     *
     * LOGGING FLAGS (always enable in production):
     * -Xlog:gc*:file=/var/log/gc.log:time,uptime,level
     */
}

The Flag Hierarchy — What to Tune First

First: Set -Xms = -Xmx to prevent resize overhead. Size heap based on container limits, not guesswork.
Second: Enable GC logging. You cannot tune what you cannot measure. This alone solves 50% of debugging issues.
Third: Adjust collector-specific flags only after measuring with logging enabled.
Never: Copy flags from blog posts without understanding your workload's allocation profile.

Production Insight

The most impactful single flag change is enabling GC logging. Most production services run with default or minimal GC logging, making post-incident diagnosis impossible. A single line -Xlog:gc*:file=/var/log/gc.log:time,uptime,level,tags:filecount=5,filesize=50m provides pause time breakdowns, heap occupancy trends, and humongous allocation detection. Enable it before you need it — GC logs are retroactive only if they were already enabled.
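One way to make "enable it before you need it" routine is to have the service report at startup whether GC logging was requested. The sketch below is a heuristic under stated assumptions: it assumes JDK 9+ unified logging, it only inspects startup arguments (logging switched on later through jcmd is not visible here), and the class and method names are illustrative rather than part of any library.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

class GcLoggingCheck {

    /** Returns true if any JVM argument starts a unified GC logging config (-Xlog:gc...). */
    static boolean hasGcLogging(List<String> jvmArgs) {
        return jvmArgs.stream().anyMatch(arg -> arg.startsWith("-Xlog:gc"));
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself (not to main), e.g. -Xmx4g or -Xlog:gc*
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        if (hasGcLogging(jvmArgs)) {
            System.out.println("GC logging flag found");
        } else {
            System.out.println("WARNING: no -Xlog:gc flag; GC pauses will not be recorded");
        }
    }
}
```

A check like this could run in a startup hook and feed a warning into the service's normal logs, so a missing flag is noticed before the first incident.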

Key Takeaway

Most JVM GC flags have sensible defaults. The three settings that matter most: (1) -Xms equal to -Xmx to prevent resize, (2) GC logging flags for observability, (3) collector-specific flags only after measuring. Never copy-paste JVM flags from the internet without profiling your own workload.

GC Algorithm Comparison: Serial, Parallel, G1, ZGC, Shenandoah

Choosing the right garbage collector depends on your workload's pause-time sensitivity, heap size, and throughput requirements. The table below summarizes the key characteristics of each major collector available in the JVM.

Collector	Pause Model	Heap Size	Primary Use Case	Java Version
Serial	Stop-the-world (STW) single-thread	<1GB	Small applications, client-side, embedded	Since JDK 1.2
Parallel	STW multi-thread	1-8GB	Throughput-oriented batch jobs, analytics	Since J2SE 1.4 (default JDK 5-8 on server-class machines)
G1	Region-based STW + concurrent marking	1GB-64GB+	General-purpose server applications	Since JDK 7 (default JDK 9+)
ZGC	Concurrent (STW < 10ms)	4GB-16TB	Ultra-low latency, large heaps	Experimental JDK 11, prod JDK 15+
Shenandoah	Concurrent (STW < 10ms)	1GB-64GB	Low latency with memory efficiency	Since JDK 12 (backported to 8, 11)

Key takeaway: For most web services, start with G1. Only move to ZGC or Shenandoah when your measured p99 latency exceeds 100ms after tuning G1. Serial and Parallel are legacy choices for resource-constrained or batch workloads.

Production Insight

The comparison above is based on default configurations. Actual production behavior depends on allocation rate, live data size, and object distribution. Always profile with your workload before making a collector switch. The most common mistake is switching to ZGC for a 2GB heap service — the native memory overhead and lack of compressed oops can increase memory consumption by 30%, leading to OOM kills.

Key Takeaway

G1 is the default for a reason — it balances throughput and pause time for most workloads. ZGC and Shenandoah offer sub-10ms pauses but cost throughput and memory. Serial and Parallel are specialized tools for batch processing or tiny heaps.

System.gc() and finalize() — Patterns to Avoid

Two legacy Java mechanisms should be avoided in production: System.gc() and finalize(). Both degrade GC performance and predictability.

System.gc() — An explicit request to run the garbage collector. It is a hint, not a command, but HotSpot usually honors it with a full stop-the-world collection unless -XX:+DisableExplicitGC is set. Calling it frequently causes unnecessary full GC pauses, wrecking latency. Some frameworks such as RMI, NIO, and JNDI also call it internally. Set -XX:+DisableExplicitGC in production to neutralize accidental calls, after confirming that nothing in your stack depends on them.

finalize() — The finalize() method, defined in Object, runs before an object is reclaimed. It's unpredictable — the JVM may never call it before exit, and GC threads can finalize objects out of order. Additionally, finalize() can resurrect objects by assigning this to a reachable reference. The method also introduces latency as the JVM must finalize objects in a separate pass. Since Java 9, finalize() is deprecated. Use Cleaner (JDK 9+), PhantomReference with a cleanup thread, or AutoCloseable / try-with-resources instead.

io/thecodeforge/gc/AvoidSystemGCAndFinalize.javaJAVA

package io.thecodeforge.gc;

import java.lang.ref.Cleaner;

/**
 * Demonstrates how to avoid System.gc() and finalize().
 * 
 * BAD PRACTICES:
 * 1. Calling System.gc() - triggers unnecessary full GC
 * 2. Overriding finalize() - unpredictable, deprecated.
 *
 * GOOD: Use Cleaner (JDK 9+) or PhantomReference with reference queue.
 */
public class AvoidSystemGCAndFinalize {

    // BAD - Avoid this
    @Override
    @Deprecated(since = "9")
    protected void finalize() throws Throwable {
        try {
            // Cleanup logic here - but this may never run!
            close();
        } finally {
            super.finalize();
        }
    }

    private void close() {
        System.out.println("Cleanup (if finalize runs)");
    }

    // GOOD - Use Cleaner (JDK 9+)
    private static final Cleaner CLEANER = Cleaner.create();

    // State that needs cleaning
    private final Cleaner.Cleanable cleanable;

    public AvoidSystemGCAndFinalize() {
        // Register a cleaning action. The action must not capture `this`:
        // a captured reference would keep the object reachable forever.
        this.cleanable = CLEANER.register(this, () -> {
            // This runs when the object becomes phantom-reachable
            System.out.println("Cleanup via Cleaner");
        });
    }

    public static void main(String[] args) {
        // NEVER do this:
        // System.gc(); // tells JVM to run GC - pauses, unpredictable

        // Better: let GC decide.
        // Disable explicit calls with -XX:+DisableExplicitGC
        
        // Use Cleaner or try-with-resources for cleanup.
    }
}

Production Risk: System.gc() in Libraries

Some libraries and JDK subsystems (RMI, JNDI, direct buffer management) call System.gc() internally. Without -XX:+DisableExplicitGC, these calls trigger full GC in your application, causing latency spikes. Disable explicit GC in production, but test thoroughly first: some code relies on it for cleanup, for example NIO, which requests a GC to free direct buffers when direct memory runs out.

Production Insight

With -XX:+DisableExplicitGC set, System.gc() becomes a silent no-op. Best practice: set this flag in production once you have verified that nothing depends on explicit GC. For resource cleanup (file handles, sockets), use try-with-resources or Cleaner. Never rely on finalize(): it has been deprecated since JDK 9 and deprecated for removal since JDK 18 (JEP 421).

Key Takeaway

Avoid System.gc() and finalize() at all costs in production code. Use -XX:+DisableExplicitGC to ignore explicit GC calls. Prefer try-with-resources for deterministic cleanup, and Cleaner for native resource cleanup.
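The two recommended mechanisms combine naturally: close() gives deterministic cleanup, and a Cleaner acts as a safety net if a caller forgets to close. The following is a minimal sketch of that pattern. NativeBuffer and its State class are hypothetical names, and the boolean flag merely stands in for a real native handle.

```java
import java.lang.ref.Cleaner;

class NativeBuffer implements AutoCloseable {
    private static final Cleaner CLEANER = Cleaner.create();

    /** Cleanup state is a static nested class so it never references NativeBuffer. */
    private static final class State implements Runnable {
        private volatile boolean released; // stands in for a native handle

        @Override
        public void run() {
            released = true; // release the (hypothetical) native resource
        }
    }

    private final State state = new State();
    private final Cleaner.Cleanable cleanable = CLEANER.register(this, state);

    boolean isReleased() {
        return state.released;
    }

    @Override
    public void close() {
        cleanable.clean(); // runs State.run() at most once, even if called repeatedly
    }

    public static void main(String[] args) {
        NativeBuffer buffer = new NativeBuffer();
        try (buffer) { // Java 9+ syntax for an existing effectively final variable
            System.out.println("released inside try: " + buffer.isReleased()); // false
        }
        System.out.println("released after try: " + buffer.isReleased()); // true
    }
}
```

Keeping the cleanup state in a separate static class is the important design choice: if the cleaning action held a reference to the NativeBuffer itself, the object could never become phantom-reachable and the safety net would never fire.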

Advantages and Disadvantages of Garbage Collection

Garbage Collection is a mixed blessing. It eliminates manual memory management bugs but introduces new operational challenges. The table below summarizes the trade-offs.

Advantages	Disadvantages
Eliminates memory leaks caused by forgotten `free()` calls	Introduces pauses (stop-the-world) that affect latency
Prevents dangling pointer bugs - objects are only reused after being unreachable	CPU overhead – GC threads consume processor time
Handles cyclic references automatically (unlike reference counting)	Memory overhead – additional per-object metadata (mark bits, forwarding pointers)
Reduces developer cognitive load – no manual memory management	Performance unpredictability – pauses vary with allocation pattern
Enables memory-safe concurrent programming with bounded overhead	Full GC occasionally compacts the entire heap, causing multi-second pauses
Provides tools for analysis (heap dumps, GC logs) to diagnose issues	Tuning requires deep understanding of collector algorithms and application behavior
Monitored at runtime – GC logs give insight into object lifetimes	Cannot control exactly when memory is reclaimed – objects may linger in old gen

Key takeaway: The disadvantages can be mitigated with proper collector selection and tuning. For most production services, the benefits far outweigh the costs, but ignore the downsides at your peril.

Production Insight

The biggest hidden disadvantage is the 'death by a thousand cuts' effect: a service with 50ms young GC pauses every second spends 5% of its time in GC. Combined with mixed GCs, remark pauses, and occasional full GCs, the total GC overhead can exceed 10% without any single pause being catastrophic. Track total GC time as a percentage of wall-clock time using GC logs – alert if it exceeds 5% for latency-sensitive services.
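The arithmetic above (50ms of pause per 1000ms of wall-clock time is 5%) can also be tracked from inside the JVM. The sketch below uses the standard GarbageCollectorMXBean API; the class and method names are illustrative. One caveat to treat as an assumption: for concurrent collectors, some beans report cycle time rather than pause time, so this figure is an upper bound on pause overhead, and GC logs remain the authoritative source.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcOverhead {

    /** Percentage of an interval spent in GC, e.g. 50 ms out of 1000 ms is 5.0. */
    static double overheadPercent(long gcMillis, long intervalMillis) {
        if (intervalMillis <= 0) {
            throw new IllegalArgumentException("intervalMillis must be positive");
        }
        return 100.0 * gcMillis / intervalMillis;
    }

    /** Sums accumulated collection time across every collector bean the JVM exposes. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptimeMillis = Math.max(1, ManagementFactory.getRuntimeMXBean().getUptime());
        double percent = overheadPercent(totalGcMillis(), uptimeMillis);
        System.out.printf("GC time since JVM start: %.2f%% of wall-clock time%n", percent);
        if (percent > 5.0) {
            System.out.println("ALERT: GC overhead above 5%");
        }
    }
}
```

For alerting, sampling totalGcMillis() periodically and computing the percentage over each interval gives a more useful signal than the since-startup average shown in main.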

Key Takeaway

GC removes an entire class of programming errors but introduces pause and CPU overhead. Modern collectors minimize pauses but cannot eliminate them entirely. The key is to choose the right collector for your latency and throughput budget.

GC Tuning Flags Reference Table

This table lists the most important GC tuning flags along with their purpose and typical values. Use it as a quick reference when configuring JVM options for production.

Flag	Affects	Purpose	Typical Value / Range
`-Xms`, `-Xmx`	Heap size	Set initial and maximum heap	Equal values, e.g., `-Xms4g -Xmx4g`
`-XX:MaxGCPauseMillis`	G1	Soft target for maximum pause time	50–200ms (default 200)
`-XX:G1HeapRegionSize`	G1	Size of each region (humongous threshold = 50% of region)	1–32MB, power of 2
`-XX:InitiatingHeapOccupancyPercent`	G1	Heap occupancy % to trigger concurrent marking	30–45 (default 45)
`-XX:G1ReservePercent`	G1	Reserve % of heap for evacuation failures	10–20 (default 10)
`-XX:ConcGCThreads`	All concurrent	Number of threads for concurrent GC work	Auto-detected; for G1 roughly one quarter of ParallelGCThreads
`-XX:+DisableExplicitGC`	All	Ignore `System.gc()` calls	Always enable in production
`-XX:+UseContainerSupport`	All	Respect container memory limits	Enabled by default JDK 10+
`-XX:MaxRAMPercentage`	All	Set max heap as % of container memory	75–85 (default 25 if not set!)
`-XX:+AlwaysPreTouch`	All	Commit heap pages at startup to reduce runtime latency	Enable for large heaps
`-XX:NativeMemoryTracking`	All	Track off-heap memory usage	summary or detail
`-XX:+HeapDumpOnOutOfMemoryError`	All	Generate heap dump on OOM	Enable for diagnosis
`-XX:+ZGenerational`	ZGC	Enable generational mode (JDK 21+)	Enable on JDK 21 and 22; generational mode is the default from JDK 23
`-XX:SoftMaxHeapSize`	ZGC	Target heap occupancy for ZGC (hints GC to cycle earlier)	75–90% of Xmx
`-XX:ShenandoahGCHeuristics`	Shenandoah	Collection policy: adaptive, compact, or static	adaptive
`-Xlog:gc*`	Logging	Enable GC logging with details	`-Xlog:gc*,gc+phases=debug:file=gc.log:time,uptime`

Key takeaway: The most impactful flags are GC logging (for observability) and heap sizing. Tuning collector-specific flags without enabling logs is like fixing a car engine blindfolded – possible but wasteful.
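Before changing any flag in the table, it helps to see the value the JVM is actually using and where it came from. The sketch below reads effective values through HotSpotDiagnosticMXBean. It assumes a HotSpot-based JVM (the bean is HotSpot-specific), and the class name EffectiveFlags and the choice of flags are illustrative.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;

class EffectiveFlags {

    private static final HotSpotDiagnosticMXBean BEAN =
        ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);

    /** Returns the current value of a HotSpot flag, e.g. "MaxHeapSize" (no -XX: prefix). */
    static String flagValue(String name) {
        return BEAN.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        String[] names = {"InitialHeapSize", "MaxHeapSize", "MaxGCPauseMillis", "DisableExplicitGC"};
        for (String name : names) {
            VMOption option = BEAN.getVMOption(name);
            // Origin shows whether the value is a default, ergonomic, or set at startup
            System.out.println(name + " = " + option.getValue() + " (" + option.getOrigin() + ")");
        }
    }
}
```

The origin column is the useful part in practice: it distinguishes a value you set on the command line from one the JVM picked ergonomically for the container it found itself in.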

Production Insight

A common mistake is leaving -XX:MaxRAMPercentage unset. Many container images keep the default (25%), so the JVM uses only a quarter of container memory for heap. Set -XX:MaxRAMPercentage=75.0 explicitly to use the available memory. Avoid the older -XX:MaxRAMFraction flag: it is deprecated, and MaxRAMFraction=1 sizes the heap at 100% of container memory, leaving nothing for native overhead. For the same reason, never set -Xmx equal to the container memory limit.
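To make the sizing concrete, the arithmetic for a hypothetical 4 GiB container can be sketched as follows. The helper name is illustrative, and the JVM's own calculation also applies alignment, so treat the figures as approximations of what -XX:MaxRAMPercentage produces.

```java
class ContainerHeapSizing {

    /** Approximate max heap for a container limit and a MaxRAMPercentage value. */
    static long heapBytes(long containerBytes, double maxRamPercentage) {
        return (long) (containerBytes * maxRamPercentage / 100.0);
    }

    public static void main(String[] args) {
        long container = 4L * 1024 * 1024 * 1024; // hypothetical 4 GiB container limit
        long mib = 1024L * 1024;

        // Default of 25% leaves most of the container unused by the heap
        System.out.println("25% -> " + heapBytes(container, 25.0) / mib + " MiB"); // 1024 MiB
        // 75% leaves a quarter for threads, metaspace, and GC structures
        System.out.println("75% -> " + heapBytes(container, 75.0) / mib + " MiB"); // 3072 MiB

        // What this JVM actually chose for its own max heap
        System.out.println("this JVM: " + Runtime.getRuntime().maxMemory() / mib + " MiB");
    }
}
```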

Key Takeaway

Use this reference table when configuring JVM flags for a new service. Start with logging and heap sizing, then add collector-specific flags based on observed behavior. Test flag changes in staging before applying to production.

Practice Problems: GC Diagnosis and Tuning

Test your understanding of GC concepts with these five practical problems. Each problem presents a real-world scenario; identify the issue and propose a fix or tuning change.

Problem 1: Unbounded Cache
Scenario: A user service caches profile objects in a HashMap. Over a weekend spike, GC logs show rising old-gen occupancy followed by frequent full GCs. P99 latency jumps from 50ms to 5s.
Question: What is the likely cause and the immediate fix?
Answer: The unbounded cache retains all entries, filling old gen. Immediate fix: apply size and time-based eviction (e.g., Caffeine with maximumSize and expireAfterWrite).

Problem 2: Large Object Allocation
Scenario: A service using G1 with a 1MB region size allocates many 800KB byte arrays. GC logs show numerous humongous allocation warnings and to-space exhaustion.
Question: What tuning change can reduce humongous objects?
Answer: Increase G1HeapRegionSize (e.g., -XX:G1HeapRegionSize=4m) so 800KB objects are under the 50% humongous threshold. Alternatively, chunk large allocations.

Problem 3: Metaspace OOM
Scenario: After deploying a new microservice, pods restart every few hours with OutOfMemoryError. Heap is not full; metaspace shows steady growth. Thread count is stable.
Question: What is the likely root cause and how do you diagnose it?
Answer: A class loader leak (e.g., from repeated dynamic class generation or redeployment). Start the JVM with -XX:NativeMemoryTracking=detail and use jcmd <pid> VM.native_memory summary to monitor metaspace. Consider -XX:MaxMetaspaceSize to cap growth, but fix the leak.

Problem 4: Long Pauses on 64GB Heap
Scenario: A data processing service uses Parallel GC on a 64GB heap. Full GC pauses exceed 60 seconds. Changing to G1 reduces pauses but they are still >2s.
Question: What should be the next step?
Answer: G1 pauses scale with live data. If live data is >30GB, G1 is unlikely to meet sub-second pauses. Consider switching to ZGC or Shenandoah, whose pause times are largely independent of heap size.

Problem 5: Allocation Rate Spike
Scenario: During a flash sale, the order service's allocation rate spikes from 100 MB/s to 2 GB/s. GC is triggered every few hundred milliseconds, and GC threads consume 80% of CPU.
Question: What is the best approach to reduce GC pressure?
Answer: Optimize application code to reduce allocation (reuse buffers, pool expensive objects, avoid String concatenation in loops). If spikes are unavoidable, adjust heap sizing and consider ZGC for concurrent collection. A larger young generation (NewSize) also helps absorb allocation spikes.
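The threshold reasoning in Problem 2 can be checked with a few lines of arithmetic. This sketch assumes the rule as the article states it (an object larger than half a region is humongous); the exact boundary handling and the small array header are implementation details, and the class name is illustrative.

```java
class HumongousCheck {

    /** True if an allocation of this size exceeds half of a G1 region. */
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes > regionBytes / 2;
    }

    public static void main(String[] args) {
        long arrayBytes = 800L * 1024; // payload only; the array header adds a few bytes
        for (int regionMb = 1; regionMb <= 4; regionMb *= 2) {
            long regionBytes = regionMb * 1024L * 1024L;
            System.out.println(regionMb + "MB region -> humongous: "
                + isHumongous(arrayBytes, regionBytes));
        }
        // prints: 1MB -> true, 2MB -> false, 4MB -> false
    }
}
```

By this arithmetic a 2MB region would already clear the threshold for 800KB arrays; choosing 4MB as in the answer leaves headroom if payload sizes grow.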

gc.logLOG

[2026-03-05T14:23:45.123+0000] GC(52) Pause Young (Normal) (G1 Evacuation Pause) 2048M->512M(8192M) 48.123ms
[2026-03-05T14:23:45.171+0000] GC(53) Pause Young (Normal) (G1 Evacuation Pause) 2560M->1024M(8192M) 51.789ms
[2026-03-05T14:23:45.223+0000] GC(54) Pause Full (Allocation Failure) 4096M->2048M(8192M) 12345.678ms  # <-- Problem 1: full GC due to old gen exhaustion
[2026-03-05T14:23:57.568+0000] GC(55) Pause Young (Normal) (G1 Evacuation Pause) 2048M->1024M(8192M) 52.345ms
[2026-03-05T14:24:01.234+0000] GC(56) Humongous allocation of size 819200 bytes (800KB) detected. Region size 1MB, threshold 512KB.  # <-- Problem 2: humongous allocation warning

Approach to These Problems

Start by asking: what is the allocation pattern? Is the leak in heap or non-heap? Use GC logs, jstat, and heap dumps to gather data before proposing a tuning change.

Production Insight

These practice problems are distilled from real incidents. The unbounded cache problem alone accounts for 30% of GC-related production outages. Practicing diagnosis in a controlled setting trains the instincts needed during a live incident. Internalizing these five patterns covers 80% of common GC failures.

Key Takeaway

GC problems almost always stem from application code (caching, allocation rate) rather than default JVM settings. Tune the application first, then the collector. Use GC logs to confirm your hypothesis before making changes.

Why Objects Become Unreachable (And Why That Matters)

Every production outage I've debugged that boiled down to a GC problem started with one thing: an object that should have died but didn't. Or worse, an object that died too late.

Unreachable means zero active references. Not "I think it's done." Not "nobody should need it." Zero references on the stack or from any GC root (static fields, JNI handles, active threads). The JVM doesn't care about your intentions. It traces live references from roots outward. Everything not reached during that trace is dead.

Here's the kicker: objects can become unreachable faster than you expect. A local reference inside a method block? Gone after the method returns. An object passed to a collection that gets cleared? Eligible immediately. But the reverse is also true — a single stray reference keeps an entire object graph alive. That's how "small" memory leaks bring down production services.

Understanding reachability isn't academic. It's the difference between writing code that the GC can efficiently reclaim and code that forces full GCs every hour.

ReachabilityTrap.javaJAVA

// io.thecodeforge — java tutorial

import java.util.ArrayList;
import java.util.List;

public class ReachabilityTrap {
    private static List<byte[]> leakList = new ArrayList<>();

    public static void main(String[] args) {
        // This object is local — becomes unreachable after method exit
        processData();

        // This object is held by a static reference — stays alive forever
        while (true) {
            leakList.add(new byte[1024 * 1024]); // 1 MB each
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop
                return;
            }
        }
    }

    static void processData() {
        byte[] temp = new byte[10 * 1024 * 1024]; // 10 MB
        // temp is the ONLY reference to this 10 MB array
        // After this method returns, temp goes out of scope
        System.out.println("Data processed");
    }
}

Output

Data processed

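(with a small heap such as -Xmx64m, the program dies with OutOfMemoryError within seconds; at roughly 10 MB/s of retained data, a default-sized heap takes minutes to fill)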

Production Trap:

A static collection that accumulates objects but never clears is the single most common memory leak pattern I see in Java services. Use WeakHashMap or explicit size limits if the data isn't truly immortal.
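An explicit size limit needs very little code. The sketch below uses only the standard library: LinkedHashMap in access order with removeEldestEntry gives a simple LRU bound. BoundedCache is an illustrative name, the class is not thread-safe, and a production service would more likely use a dedicated caching library; this is only meant to show the shape of the fix.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder=true: iteration order is least-recently-used first
        this.maxEntries = maxEntries;
    }

    /** Called by LinkedHashMap after each insert; returning true evicts the eldest entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        BoundedCache<String, String> cache = new BoundedCache<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");      // touch "a" so "b" becomes the eldest entry
        cache.put("c", "3"); // exceeds the limit, so "b" is evicted
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```

Because evicted entries lose their only reference chain from the static root, they become unreachable and the next collection can reclaim them.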

Key Takeaway

An object stays alive as long as any active reference chain exists. One forgotten reference = infinite lifetime.
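A WeakReference makes this rule observable, because it lets you watch an object without keeping it alive. In the sketch below (names are illustrative), the first check is guaranteed: the collector cannot clear the referent while a strong reference still reaches it. The second check depends on System.gc() being honored, so its result is likely but not guaranteed.

```java
import java.lang.ref.Reference;
import java.lang.ref.WeakReference;

class ReachabilityDemo {

    /** A weakly referenced object cannot be cleared while a strong reference reaches it. */
    static boolean aliveWhileStronglyHeld() {
        byte[] strong = new byte[1024];
        WeakReference<byte[]> weak = new WeakReference<>(strong);
        System.gc(); // request a collection; the array must survive it
        boolean alive = weak.get() != null;
        Reference.reachabilityFence(strong); // keep `strong` live up to this point
        return alive;
    }

    public static void main(String[] args) {
        System.out.println("alive while strongly held: " + aliveWhileStronglyHeld()); // true

        // No strong reference is kept, so the array is unreachable immediately
        WeakReference<byte[]> weak = new WeakReference<>(new byte[1024]);
        System.gc(); // only a hint: clearing is likely here but not guaranteed
        System.out.println("cleared once unreachable: " + (weak.get() == null));
    }
}
```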

The Two Types of GC Activity: Minor vs. Major

You can't tune GC properly if you don't understand that garbage collection runs in two distinct modes: minor and major. They're not the same thing, and confusing them gets you fired.

Minor GC happens in the Young Generation. It's fast. The JVM stops the world, copies all live objects from Eden to a survivor space, clears Eden, and resumes. Typical pause: 1-10 milliseconds. This is your friend. A healthy application should survive on minor GCs alone for 99% of its lifetime.

Major GC (or Full GC) hits the Old Generation. This is where the JVM does mark-sweep-compact across the entire heap. Pause times balloon: 100ms, 500ms, even seconds. A full GC every few hours? Fine. Every few minutes? You have a problem — either your survivor space sizing is wrong, or you're creating long-lived objects that should be short-lived.

The critical insight: you want to avoid promoting objects to Old Generation prematurely. Each object that survives a minor GC gets an age increment. When it exceeds tenuring threshold (default 15 for G1), it's promoted. If your survivor spaces are too small, objects get promoted early, fill Old Gen, and trigger frequent full GCs.

Monitor your promotion rate. If it's higher than expected, your objects are living too long.

PromotionAnalysis.javaJAVA

// io.thecodeforge — java tutorial
// Simulate different object lifetimes to see GC impact

import java.util.ArrayList;
import java.util.List;

public class PromotionAnalysis {
    private static final int OBJECT_COUNT = 1_000_000;

    public static void main(String[] args) {
        // Short-lived objects: die in Young Gen
        for (int i = 0; i < OBJECT_COUNT; i++) {
            byte[] temp = new byte[100];
        }
        System.out.println("Short-lived done — minor GC only");

        // Long-lived objects: promoted to Old Gen
        List<byte[]> holders = new ArrayList<>();
        for (int i = 0; i < OBJECT_COUNT / 10; i++) {
            holders.add(new byte[100]);
        }
        // Holders survive — these get promoted
        System.out.println("Long-lived done — full GC coming");
    }
}

Output

Short-lived done — minor GC only

Long-lived done — full GC coming

(At this scale no full GC actually occurs: the retained set is only about 12 MB. Run with -Xlog:gc to see which collections happen.)

Senior Shortcut:

On JDK 9+, add -Xlog:gc*,gc+age=trace to your JVM flags (on JDK 8, use -XX:+PrintGCDetails -XX:+PrintTenuringDistribution). Watch the 'Desired survivor size' line. If survivor spaces fill above 50%, lower -XX:SurvivorRatio from the default 8 to 4; a lower ratio means larger survivor spaces.

Key Takeaway

Minor GCs are cheap and should dominate. Full GCs are expensive — minimize them by keeping objects short-lived and survivor spaces properly sized.
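A rough way to watch promotion from inside the JVM is to sample heap pool usage over time: steady growth in the old-generation pool between young collections indicates objects being promoted. The sketch below only lists the pools. Pool names vary by collector (for example "G1 Old Gen" under G1), so any name matching is an assumption to verify on your JVM, and the class name is illustrative.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

class HeapPoolReport {

    /** Names of the heap memory pools the running collector exposes. */
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip metaspace, code cache, and other non-heap pools
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // pool is no longer valid
            }
            System.out.printf("%-24s used=%d KiB%n", pool.getName(), usage.getUsed() / 1024);
        }
    }
}
```

Sampling these figures on a timer and dividing old-generation growth by elapsed time gives an approximate promotion rate; GC logs remain the more precise source.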

Requesting GC: System.gc() Is a Hint, Not a Command

I've seen junior devs sprinkle System.gc() like seasoning. "The app's memory is high, I'll tell GC to run." Stop. Please.

System.gc() is a suggestion, and the specification lets the JVM ignore it. HotSpot usually does not: with G1's default settings the call triggers a stop-the-world full GC, and it is skipped only when -XX:+DisableExplicitGC is set. So you pay for a full collection, and you also disturb the collector's adaptive sizing. The JVM has been monitoring allocation rates, promotion patterns, and pause times to optimize future GCs, and a forced full collection skews the data those heuristics rely on. Your app may run slower for a while afterward.

There are three legitimate reasons to call System.gc():
1. Right before a heap dump (to minimize garbage in the dump)
2. During testing, to verify GC behavior under controlled conditions
3. After a known burst of short-lived object creation that the collector hasn't processed yet

That's it. If you think you need it in production, you almost certainly have a different problem: a memory leak, oversized heap, or wrong collector choice. Fix the real problem, don't call System.gc().

And for heaven's sake, never call System.gc() in a loop, in a request handler, or inside a timer thread. I've seen all three. Each time it caused a production incident.

DontDoThis.javaJAVA

// io.thecodeforge — java tutorial
// Demonstrates why System.gc() hurts performance

import java.util.ArrayList;
import java.util.List;

public class DontDoThis {
    public static void main(String[] args) {
        long start = System.nanoTime();

        // Real work
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 10_000_000; i++) {
            data.add(i);
        }

        long mid = System.nanoTime();
        System.out.println("Work took: " + (mid - start) / 1_000_000 + " ms");

        // Production antipattern: force GC
        for (int i = 0; i < 5; i++) {
            System.gc();
        }

        long end = System.nanoTime();
        System.out.println("After GC spam: " + (end - mid) / 1_000_000 + " ms wasted");
    }
}

Output

Work took: 143 ms

After GC spam: 890 ms wasted

(Your request latency just went up 6x for nothing)

Production Trap:

If you hot-deploy code that calls System.gc(), you'll see a latency spike immediately. The GC trigger is synchronous — it blocks the calling thread until GC completes. Don't. Just don't.

Key Takeaway

System.gc() is a hint, but HotSpot usually honors it with a full collection unless -XX:+DisableExplicitGC is set. If you think you need it, you have a real problem elsewhere. Fix that instead.

Customizing GC Settings in Jelastic PaaS

Jelastic PaaS exposes heap and collector flags via topology manifests and environment variables, not raw JVM arguments. You set JAVA_OPTS or use the Cloud Scripting env block to override GC strategies per node. For example, switching from G1 to Shenandoah on a production layer requires adding JAVA_OPTS=-XX:+UseShenandoahGC to the manifest. Memory limits are tied to cloudlet quotas: the heap maximum defaults to 85% of the container's RAM, which you can cap with -Xmx inside the same variable. The trap is that Jelastic's auto-tuning may silently revert your flags on node restart if you edit them via the admin panel instead of the jelastic.env file. Always commit GC changes through version-controlled manifest sections — every cloudlet restart will read from there. This prevents production surprises when horizontal scaling spawns new nodes that inherit the wrong collector.

JelasticGCConfig.javaJAVA

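// io.thecodeforge — java tutorial
// The flags below are set through the Jelastic JAVA_OPTS variable; this class only verifies them

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class JelasticGCConfig {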
    // Exposed via Jelastic env variable JAVA_OPTS
    // Switch default G1 to Shenandoah:
    //      -XX:+UseShenandoahGC
    //      -Xmx2048m
    //      -XX:MaxGCPauseMillis=100   (a G1/Parallel pause goal; Shenandoah does not use it)
    // This is injected before main() by the platform
    public static void main(String[] args) {
        // Verify active GC at runtime:
        for (GarbageCollectorMXBean gc : ManagementFactory
            .getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName());
            // Expected with Shenandoah: "Shenandoah Cycles" and "Shenandoah Pauses"
        }
    }
}

Output

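Shenandoah Cycles

Shenandoah Pauses

(Shenandoah exposes one bean for concurrent cycles and one for pauses; the order may vary)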

Production Trap:

Jelastic automatically recycles dead containers. If you set GC flags via the admin dashboard instead of the manifest, flags disappear after a node restart. Always version your GC settings in the manifest file.

Key Takeaway

In PaaS environments, GC customization lives in deployment manifests, not inside application code.

GC Implementations: HotSpot's Historical Lineage

HotSpot isn't one GC. Over its history it has shipped seven collectors: Serial, Parallel, CMS (removed in JDK 14), G1, ZGC, Shenandoah, and Epsilon (a no-op collector for testing). The Serial collector uses a single thread for both minor and full GCs, suitable for single-core or client machines. Parallel (Throughput) GC employs multiple threads but stops the world for both young and old collection, which suits batch jobs. G1 splits the heap into regions and works toward a pause-time target; it became the default in JDK 9. ZGC uses load barriers and colored pointers to keep pauses short regardless of heap size, doing marking and relocation concurrently with the application. Shenandoah evolved differently: it relocates objects concurrently through forwarding pointers (originally Brooks pointers), even during the compaction phase. Each implementation reflects a different trade-off: Serial sacrifices throughput for footprint, Parallel sacrifices latency for throughput, and ZGC and Shenandoah sacrifice throughput for latency. Choose based on your application's tolerance for pause time versus raw processing speed.

ListGCImplementations.javaJAVA

// io.thecodeforge — java tutorial
import java.lang.management.ManagementFactory;

public class ListGCImplementations {
    public static void main(String[] args) {
        System.out.println("Active GCs:");
        ManagementFactory.getGarbageCollectorMXBeans()
            .forEach(gc -> System.out.println("  " + gc.getName()));
        // Run with: -XX:+UseZGC
        // Output depends on the JDK: "ZGC" on JDK 11 to 16, separate cycle and
        // pause beans (for example "ZGC Cycles" and "ZGC Pauses") on JDK 17+
    }
}

Output

Active GCs:

  ZGC

(JDK 11 to 16; JDK 17+ lists separate cycle and pause beans such as "ZGC Cycles" and "ZGC Pauses")

Hidden Variation:

Each GC implementation manages the heap differently — Serial uses a contiguous old space, G1 uses regions, ZGC uses a sparse heap with forwarding tables. The same -Xmx flag changes behavior across collectors.

Key Takeaway

There is no single 'Java GC' — the JVM hosts at least seven distinct implementations, each with a unique pause/throughput profile.

Overview

Garbage collection in Java is an automatic memory management process that reclaims heap space occupied by objects no longer referenced by the application. It frees developers from manual memory deallocation, preventing dangling pointers and the leaks caused by forgotten deallocation (leaks through lingering references are still possible). However, GC is not free: it consumes CPU cycles and introduces pauses (stop-the-world events). The JVM’s heap is divided into young generation (Eden, Survivor spaces) and old generation. Most objects die young; a Minor GC in Eden is cheap. Objects that survive multiple cycles get promoted to the old generation, where a Major GC (full collection) is more expensive. Understanding when and why objects become unreachable is essential: the usual causes are losing all strong references, being part of a cycle that no root can reach, or being held only through weak or soft references. The choice of GC algorithm (Serial, Parallel, G1, ZGC, or Shenandoah) depends on latency vs. throughput trade-offs. Java’s GC has evolved from a simple mark-sweep to ultra-low-pause collectors that handle terabytes of heap without freezing applications.

GCOverviewDemo.javaJAVA

// io.thecodeforge — java tutorial
public class GCOverviewDemo {
    public static void main(String[] args) {
        // Object becomes unreachable after scope ends
        for (int i = 0; i < 100_000; i++) {
            String s = new String("temp");
        } // s eligible for GC here
        
        // Explicitly nulling a reference (unnecessary usually)
        Object leak = new Object();
        leak = null; // eligible now
        
        // Circular reference still collectable
        class Node { Node next; }
        Node a = new Node();
        Node b = new Node();
        a.next = b; b.next = a;
        a = null; b = null; // both collectable
        System.gc(); // hint, not guarantee
    }
}

Output

No output (the program prints nothing, and System.gc() is only a request)

Production Trap:

Calling System.gc() can trigger full GCs that pause all threads. Collectors like G1 ignore it only when -XX:+DisableExplicitGC is set; that flag is off by default.

Key Takeaway

GC automatically frees unreachable objects; understanding reachability is key to avoiding performance pitfalls.

Conclusion

Java’s garbage collection is a powerful abstraction that eliminates manual memory management, but it requires thoughtful tuning to avoid latency spikes and throughput degradation. The key insight is that GC behavior is determined by object reachability: as long as references exist from live roots (stack, static fields, JNI handles), objects remain alive. Understanding why objects become unreachable (scope exit, null assignment, weak reference clearing) lets you predict GC load. Minor and Major collections have drastically different pause profiles: lowering the allocation rate reduces Minor GC frequency, while avoiding accidental retention prevents expensive Major collections. Low-pause collectors such as ZGC and Shenandoah keep pauses in the sub-millisecond to single-digit-millisecond range even on very large heaps by doing most of their work concurrently. No collector is a silver bullet, though: low-latency collectors trade CPU and memory overhead for responsiveness. Generational ZGC shipped in JDK 21, and G1 continues to improve with each release. Effective GC tuning starts with monitoring (GC logs, JFR), identifying pause patterns, and only then adjusting flags such as heap size, survivor ratio, and concurrent thread count. Always test changes under realistic production load, and favor default settings from Java 17+ unless metrics prove otherwise.

GCTuningCheck.java (Java)

// io.thecodeforge — java tutorial
import java.util.ArrayList;
import java.util.List;

public class GCTuningCheck {
    public static void main(String[] args) {
        List<byte[]> holder = new ArrayList<>();
        // Simulate accidental retention (bad)
        for (int i = 0; i < 100; i++) {
            holder.add(new byte[1024 * 1024]); // 1MB each
        }
        // Without clearing holder, objects stay reachable
        // This forces Major GC if memory tight
        System.out.println("Allocated 100 MB; clearing reference");
        holder.clear(); // now eligible for GC
        System.gc(); // hint only
    }
}

Output

Allocated 100 MB; clearing reference

Production Trap:

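Accidentally holding references in long-lived (often static) collections is one of the most common causes of memory leaks. Remove entries when they are no longer needed, or bound the collection with an eviction policy. WeakHashMap only helps when entries should disappear once their keys are unreachable elsewhere; it is not a size-bounded cache.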

Key Takeaway

GC tuning is about measuring, not guessing. Monitor GC logs and start with defaults before tweaking flags.
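Besides GC logs and JFR, the standard java.lang.management API exposes cumulative collection counts and times from inside the process. The following is a minimal sketch, assuming you only want a quick snapshot; the class name GcStatsSnapshot and the helper totalGcTimeMillis are hypothetical, and collector names in the output depend on which GC the JVM is running.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class; names are illustrative only.
final class GcStatsSnapshot {

    // Sums accumulated collection time (ms) across all collectors.
    // A bean reports -1 when the value is undefined, so non-positive values are skipped.
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%-28s count=%d timeMs=%d%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        System.out.println("total GC time ms: " + totalGcTimeMillis());
    }
}
```

Sampling these counters periodically and exporting the deltas gives a rough view of GC frequency and cost, though it is no substitute for the per-pause detail in GC logs.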

● Production incident · POST-MORTEM · severity: high

Full GC Spiral Crashes Order Processing Service During Flash Sale

Symptom

Order API p99 latency spiked from 80ms to 30+ seconds. Kubernetes liveness probes failed, triggering pod restarts. After restart, the pattern repeated within 10 minutes. GC logs showed 'Pause Full (Allocation Failure)' with increasing frequency.

Assumption

Team assumed the heap was too small and doubled -Xmx from 4GB to 8GB. The problem persisted — full GC pauses were longer because the live data set was larger.

Root cause

The service cached order objects in a ConcurrentHashMap with no eviction policy. Under flash sale traffic, the cache grew unbounded until old generation was 98% full. G1 could not reclaim enough space during mixed GCs because most old regions contained live cached data. Concurrent marking kept running but found almost nothing collectible. Eventually, young generation allocation failed and G1 fell back to a full GC stop-the-world pause. Doubling the heap only delayed the inevitable — the cache still grew unbounded.

Fix

Three-part fix: (1) Added size-bounded eviction to the order cache using Caffeine with maximumSize(50000) and expireAfterWrite(Duration.ofMinutes(30)). (2) Enabled GC logging with -Xlog:gc*,gc+humongous=debug:file=/var/log/gc.log to monitor heap pressure proactively. (3) Set -XX:InitiatingHeapOccupancyPercent=35 to trigger concurrent marking earlier, giving mixed GCs more cycles to reclaim space before allocation pressure hit.

Key lesson

Unbounded caches are the #1 cause of GC-related production incidents in Java services
Full GC 'Allocation Failure' means the collector cannot free enough space — it is not a tuning problem, it is an application memory management problem
Doubling heap without fixing the allocation pattern just delays the same failure with a longer full GC pause
Every production service must have a bounded eviction strategy for any in-memory data structure
Monitor old generation utilization sustained above 85% as a leading indicator of full GC risk
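The incident fix used Caffeine. To show the idea of a size bound without an external dependency, the sketch below swaps in a plain LinkedHashMap in access order with removeEldestEntry. It is a stand-in under stated assumptions, not the service's actual code: the class name BoundedCache and the method lru are hypothetical, it has no time-based expiry, and a single synchronized map will not match Caffeine's concurrency.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a bounded cache; this is not Caffeine.
final class BoundedCache {

    // Returns a synchronized, access-ordered map that evicts the least recently
    // used entry as soon as it holds more than maxEntries.
    static <K, V> Map<K, V> lru(int maxEntries) {
        return Collections.synchronizedMap(new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        });
    }

    public static void main(String[] args) {
        Map<Integer, String> orders = lru(3); // tiny bound for illustration; the incident fix used 50000
        for (int id = 1; id <= 5; id++) {
            orders.put(id, "order-" + id);
        }
        System.out.println(orders.keySet()); // prints [3, 4, 5]
    }
}
```

The important property is that the live set has a hard ceiling, so the old generation cannot fill with cached data no matter how much traffic arrives.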

Production debug guide

Follow this path when GC is suspected as the root cause of latency or availability issues.

5 entries

Symptom · 01

Latency spikes correlate with GC pauses in application logs

→

Fix

Enable GC logging with -Xlog:gc*,gc+phases=debug:file=gc.log:time,uptime,level,tags and correlate pause timestamps with latency metrics. Check if pauses are young GC, mixed GC, or full GC.

Symptom · 02

Full GC appearing frequently in steady-state traffic

→

Fix

Full GC signals the collector cannot keep up. Check for unbounded caches, humongous allocation rate, heap fragmentation, or metaspace exhaustion. Use jmap -histo to identify which object types dominate the heap.

Symptom · 03

Throughput drops but pause times are acceptable

→

Fix

Collector is consuming too much CPU. Check concurrent GC thread count (-XX:ConcGCThreads). Reduce if GC CPU usage exceeds 15-20% of total. Profile allocation rate — if > 2GB/sec, reduce allocation pressure at the application level.

Symptom · 04

OOM kill with no heap exhaustion visible in metrics

→

Fix

Check native memory: metaspace, thread stacks, direct byte buffers, mmap regions. Use -XX:NativeMemoryTracking=detail and jcmd <pid> VM.native_memory summary.

Symptom · 05

GC pause time increases linearly with heap size

→

Fix

G1 pauses scale with live data set, not heap size. If pauses scale with heap, evaluate switching to ZGC or Shenandoah where pauses are independent of heap size.
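For the native-memory case in Symptom 04, part of the picture is also available in-process: memory pool beans report heap and non-heap pools, and buffer pool beans report direct and mapped buffers. The sketch below is an assumption-laden illustration (MemorySnapshot and directBufferBytes are hypothetical names, and the pool name "direct" is what HotSpot reports). It does not cover thread stacks or other native allocations, which still need Native Memory Tracking.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.nio.ByteBuffer;

// Hypothetical helper class; names are illustrative only.
final class MemorySnapshot {

    // Bytes currently used by the "direct" buffer pool, or -1 if that pool is not exposed.
    static long directBufferBytes() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Heap and non-heap pools (eden, old gen, metaspace, code cache, ...).
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // pool is no longer valid
            }
            System.out.printf("%-16s %-34s used=%d KB%n",
                    pool.getType(), pool.getName(), usage.getUsed() / 1024);
        }
        // Off-heap allocation that never shows up in heap metrics.
        ByteBuffer offHeap = ByteBuffer.allocateDirect(8 * 1024 * 1024);
        System.out.println("direct buffer bytes: " + directBufferBytes()
                + " (just allocated " + offHeap.capacity() + ")");
    }
}
```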

★ GC Triage Cheat Sheet: First 60 Seconds

Fast diagnostic commands when GC is suspected. Run these before diving into GC logs.

Application unresponsive, suspected full GC

Immediate action

Check if JVM is in a GC stop-the-world pause

Commands

jcmd <pid> GC.heap_info

jstat -gcutil <pid> 1000 10

Fix now

If Full GC count is incrementing, check for unbounded caches and heap fragmentation immediately. Restart with -Xlog:gc+humongous=debug

High CPU with low application throughput

Latency spikes at regular intervals

OOM kill by container orchestrator (k8s)

Allocation failure in logs, to-space exhausted

Key takeaways

GC reclaims memory from unreachable objects. The primary driver of GC frequency is allocation rate, not heap size.

The generational heap exploits that most objects die young. Premature promotion pollutes old gen and increases full GC risk.

G1 is the default collector for good reason, but its pause time scales with live data, not heap size.

ZGC and Shenandoah achieve sub-10ms pauses at the cost of throughput or memory overhead.

Always enable GC logging in production. You cannot tune what you cannot measure.

Unbounded caches are the #1 cause of GC-related production incidents. Apply eviction policies to every in-memory cache.
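Since allocation rate drives GC frequency, it helps to measure how much a code path allocates. A rough sketch follows, assuming a HotSpot-based JDK where the com.sun.management.ThreadMXBean extension is available; AllocationProbe and allocatedBytes are hypothetical names, and the helper returns -1 when the JVM does not support the measurement.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; relies on a HotSpot-specific management extension.
final class AllocationProbe {

    // Bytes allocated by the current thread while running the task, or -1 if unsupported.
    static long allocatedBytes(Runnable task) {
        java.lang.management.ThreadMXBean base = ManagementFactory.getThreadMXBean();
        if (!(base instanceof com.sun.management.ThreadMXBean)) {
            return -1;
        }
        com.sun.management.ThreadMXBean bean = (com.sun.management.ThreadMXBean) base;
        if (!bean.isThreadAllocatedMemorySupported() || !bean.isThreadAllocatedMemoryEnabled()) {
            return -1;
        }
        long threadId = Thread.currentThread().getId();
        long before = bean.getThreadAllocatedBytes(threadId);
        task.run();
        return bean.getThreadAllocatedBytes(threadId) - before;
    }

    public static void main(String[] args) {
        List<byte[]> sink = new ArrayList<>(); // keeps the arrays reachable so they are really allocated
        long bytes = allocatedBytes(() -> {
            for (int i = 0; i < 1_000; i++) {
                sink.add(new byte[1024]); // about 1 MB of payload plus object headers
            }
        });
        System.out.println("allocated bytes: " + bytes + ", arrays kept: " + sink.size());
    }
}
```

Dividing the result by elapsed time gives a per-thread allocation rate for that code path; for whole-process numbers, GC logs or JFR remain the better source.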

Common mistakes to avoid

5 patterns

Using System.gc() in application code or relying on finalize() for cleanup

Symptom

Unexpected full GC pauses or objects never reclaimed before OOM

Fix

Set -XX:+DisableExplicitGC in production. Replace finalize() with Cleaner or try-with-resources.

Setting -Xmx equal to container memory limit without considering native overhead

Symptom

OOM kills by container orchestrator even though heap is under 100%

Fix

Budget container memory as at least heap × 1.15 for G1 and heap × 1.25 for ZGC. Use -XX:MaxRAMPercentage=75.

Tuning collector-specific flags without first enabling GC logging

Symptom

Random latency spikes with no diagnostic data to correlate

Fix

Enable GC logging first: -Xlog:gc*:file=gc.log:time,uptime. Only then adjust flags.

Blindly copying JVM flags from blog posts or other services

Symptom

Poor GC performance or unexpected full GC behavior in production

Fix

Profile your own application's allocation rate and live data set before tuning. Test in staging.

Ignoring humongous allocations in G1 until to-space exhaustion occurs

Symptom

Full GC triggered by to-space exhaustion, causing multi-second pauses

Fix

Monitor GC logs for 'humongous allocation' warnings. Increase -XX:G1HeapRegionSize or chunk large objects.
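Chunking large objects, the second option in the last fix, can look like the sketch below. G1 treats an object as humongous when it is about half a region or larger, so the example keeps each array at a quarter of an assumed 4 MB region. The class ChunkedBuffer, the method allocate, and the region size are all illustrative assumptions rather than values taken from a real service.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; names and sizes are illustrative only.
final class ChunkedBuffer {

    // Represents a logical payload of totalBytes as arrays of at most chunkBytes each.
    static List<byte[]> allocate(long totalBytes, int chunkBytes) {
        if (totalBytes < 0 || chunkBytes <= 0) {
            throw new IllegalArgumentException("totalBytes must be >= 0 and chunkBytes > 0");
        }
        List<byte[]> chunks = new ArrayList<>();
        for (long remaining = totalBytes; remaining > 0; remaining -= chunkBytes) {
            chunks.add(new byte[(int) Math.min(chunkBytes, remaining)]);
        }
        return chunks;
    }

    public static void main(String[] args) {
        int regionBytes = 4 * 1024 * 1024; // assumed -XX:G1HeapRegionSize=4m
        int chunkBytes = regionBytes / 4;  // well below the half-region humongous threshold
        List<byte[]> chunks = allocate(10L * 1024 * 1024, chunkBytes);
        System.out.println(chunks.size() + " chunks of at most " + chunkBytes + " bytes");
        // prints "10 chunks of at most 1048576 bytes"
    }
}
```

A single 10 MB array would span several regions as one humongous object; ten 1 MB arrays are allocated like any other object, through the young generation.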

INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01 · SENIOR

Explain the difference between a young GC, mixed GC, and full GC in G1. ...

Q02 · SENIOR

A service using G1 with 8GB heap experiences full GC every 15 minutes un...

Q03 · SENIOR

What are the trade-offs between ZGC and Shenandoah? When would you choos...

Q01 of 03 · SENIOR

Explain the difference between a young GC, mixed GC, and full GC in G1. When does each occur?

ANSWER

Young GC evacuates eden and survivor regions and runs when eden fills. Mixed GC runs after a concurrent marking cycle and collects the young regions plus a selection of old regions that marking found to be mostly garbage. Full GC is a stop-the-world collection and compaction of the entire heap, used as a fallback when G1 cannot free space fast enough (for example, to-space exhaustion during evacuation, or concurrent marking that does not finish before the heap fills). Full GC is the catastrophic failure mode you want to avoid.