Java Concurrent Programming Essentials
Mastering the art of writing efficient, thread-safe Java applications in multi-core environments
What is Concurrent Programming?
Java concurrent programming is the discipline of writing programs that execute
multiple tasks simultaneously while preserving correctness and performance.
This paradigm has become essential in modern software development as
applications need to maximize the capabilities of multi-core processors.
Multi-Core: Exploits modern CPU architectures
Coordinated: Manages shared state safely
Concurrency enables multiple tasks to make progress simultaneously, whether that's handling thousands of web requests on a server,
keeping a desktop application responsive during intensive operations, or processing large datasets in parallel.
ExecutorService ex = [Link](2);
[Link](() -> taskA());
[Link](() -> taskB());
[Link]();
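A minimal runnable sketch of the same idea, assuming a two-thread pool created with Executors.newFixedThreadPool. The class name TwoTaskDemo and the task bodies are hypothetical stand-ins for taskA() and taskB().

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class TwoTaskDemo {
    // Runs two hypothetical tasks on a two-thread pool and returns what they recorded.
    static List<String> runBoth() throws InterruptedException {
        List<String> log = new CopyOnWriteArrayList<>();
        ExecutorService ex = Executors.newFixedThreadPool(2);
        try {
            ex.submit(() -> { log.add("A"); }); // stand-in for taskA()
            ex.submit(() -> { log.add("B"); }); // stand-in for taskB()
        } finally {
            ex.shutdown(); // stop accepting new work; submitted tasks still run
        }
        if (!ex.awaitTermination(5, TimeUnit.SECONDS)) {
            throw new IllegalStateException("tasks did not finish in time");
        }
        return log;
    }

    public static void main(String[] args) throws InterruptedException {
        // Both entries are present, but their order depends on scheduling.
        System.out.println(runBoth());
    }
}
```

Calling shutdown() in a finally block is one way to make sure the pool's threads do not keep the JVM alive if submission fails.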
The Java Memory Model (JMM)
The Java Memory Model is the foundation of concurrent programming in Java. It defines the rules that govern how threads interact through
memory, establishing crucial guarantees about visibility and ordering of operations across threads.
Visibility ensures that updates made by one thread are eventually seen by other threads. Without proper synchronization, threads may
cache variables locally, leading to one thread never observing changes made by another.
Instruction reordering, performed by both compilers and CPUs for optimization, can cause operations to execute in a different order than
written in source code. This reordering is invisible in single-threaded programs but can break concurrent algorithms.
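As a small illustration of the visibility guarantee, the sketch below uses a volatile flag to stop a spinning worker. The class and method names are hypothetical. Without volatile, the JMM would permit the worker to keep reading a stale value and never terminate.

```java
class VisibilityDemo {
    // volatile: a write by one thread is guaranteed visible to later reads by others
    private static volatile boolean stop;

    // Returns true if the worker observed the flag and exited within the timeout.
    static boolean workerStopsAfterSignal() throws InterruptedException {
        stop = false;
        Thread worker = new Thread(() -> {
            while (!stop) {
                // busy-wait until the write to 'stop' becomes visible
            }
        });
        worker.start();
        Thread.sleep(50);   // let the worker start spinning
        stop = true;        // volatile write, ordered before the worker's next read
        worker.join(2000);  // bounded wait so the demo cannot hang
        return !worker.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("worker stopped: " + workerStopsAfterSignal());
    }
}
```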
The Runnable interface represents a task that can be executed by a thread but doesn't return a result. This separation of task from thread is
fundamental to modern concurrent programming, as it allows the same task logic to be executed by different threads or thread pools.
Callable<Integer> c = () -> {
    [Link](1000);
    return 10;
};
ExecutorService executor =
    [Link]();
Future<Integer> f = [Link](c);
Integer result = [Link](); // blocks until ready
Key Methods
• get(): blocks until result available
• isDone(): checks completion status
• cancel(): attempts to cancel execution
• get(timeout): bounded waiting
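The Future methods listed above can be exercised with a sketch like the following. It assumes a single-thread executor and a simulated 100 ms task; FutureDemo and fetchWithTimeout are hypothetical names.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class FutureDemo {
    // Submits a Callable and waits at most two seconds for its result.
    static int fetchWithTimeout() throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Callable<Integer> c = () -> {
                Thread.sleep(100); // simulated work
                return 10;
            };
            Future<Integer> f = executor.submit(c);
            try {
                return f.get(2, TimeUnit.SECONDS); // bounded wait
            } catch (TimeoutException e) {
                f.cancel(true); // interrupt the task if it is still running
                throw e;
            }
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(fetchWithTimeout());
    }
}
```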
Executor Framework
The Executor framework provides a high-level abstraction for managing thread pools and task execution. Instead of manually creating and
managing threads, you submit tasks to an executor service that handles the threading details.
This decoupling of task submission from execution policy enables you to easily change how tasks are executed—sequentially, in parallel
with a fixed pool, or with dynamically adjusted concurrency—without modifying the task code itself.
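To illustrate that decoupling, the sketch below runs the same Runnable under two policies: directly in the calling thread, and on a fixed pool. All names are hypothetical, and the pool size of 4 is an arbitrary assumption.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ExecutionPolicyDemo {
    // Submits the same task n times; the caller decides how it is executed.
    static void runTasks(Executor executor, Runnable task, int n) {
        for (int i = 0; i < n; i++) {
            executor.execute(task);
        }
    }

    static int demo() throws InterruptedException {
        AtomicInteger counter = new AtomicInteger();
        Runnable task = counter::incrementAndGet;

        Executor sameThread = Runnable::run; // sequential, in the caller's thread
        runTasks(sameThread, task, 10);

        ExecutorService pool = Executors.newFixedThreadPool(4); // parallel, fixed pool
        runTasks(pool, task, 10);
        pool.shutdown();
        if (!pool.awaitTermination(5, TimeUnit.SECONDS)) {
            throw new IllegalStateException("pool did not drain in time");
        }
        return counter.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo());
    }
}
```

The task code is identical in both runs; only the Executor passed to runTasks changes.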
Overhead Reduction: 90% (eliminating thread creation costs)
Task Queue: 5000 (pending tasks in typical pool)
Throughput Gain: 4x (typical improvement vs new thread per task)
Synchronized instance method: locks on the object instance (this)
Synchronized block: explicit lock object for fine-grained control
Static synchronized method: locks on the Class object
How It Works
Every Java object has an associated monitor lock. When a thread enters a synchronized
method or block, it must acquire the monitor. If another thread already holds it, the
requesting thread blocks until the lock becomes available.
The lock is automatically released when the thread exits the synchronized region, even if
an exception is thrown, preventing deadlocks from forgotten unlocks.
Best practice: Keep synchronized regions as small as possible to minimize contention and
maximize concurrency.
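The three ways of taking a monitor lock can be sketched as follows. SyncCounter and its members are hypothetical names chosen for illustration.

```java
class SyncCounter {
    private final Object lock = new Object(); // private lock object for the block form
    private int hits;
    private int misses;
    private static int instances;

    SyncCounter() { countInstance(); }

    // Static synchronized method: locks on SyncCounter.class
    private static synchronized void countInstance() { instances++; }
    static synchronized int instances() { return instances; }

    // Synchronized instance methods: lock on this
    synchronized void hit() { hits++; }
    synchronized int hits() { return hits; }

    // Synchronized blocks: lock on an explicit object, kept as small as possible
    void miss() { synchronized (lock) { misses++; } }
    int misses() { synchronized (lock) { return misses; } }

    // Several threads update one counter; returns it once they have all finished.
    static SyncCounter hammer(int threads, int perThread) throws InterruptedException {
        SyncCounter c = new SyncCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    c.hit();
                    c.miss();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) t.join();
        return c;
    }

    public static void main(String[] args) throws InterruptedException {
        SyncCounter c = hammer(4, 1000);
        System.out.println(c.hits() + " " + c.misses());
    }
}
```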
ReentrantLock: Explicit Locking
While synchronized is convenient, ReentrantLock provides more sophisticated locking capabilities. It offers the same mutual exclusion guarantees but
with additional features like timed lock attempts, interruptible locking, and multiple condition variables.
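A minimal sketch of a timed lock attempt, assuming a hypothetical TimedLockDemo account class. The try/finally around the critical section matters because, unlike synchronized, an explicit lock is not released automatically.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class TimedLockDemo {
    private final ReentrantLock lock = new ReentrantLock();
    private int balance;

    // Gives up and returns false if the lock cannot be acquired within the timeout.
    boolean deposit(int amount, long timeoutMillis) throws InterruptedException {
        if (!lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
            return false;
        }
        try {
            balance += amount;
            return true;
        } finally {
            lock.unlock(); // always release in finally
        }
    }

    int balance() {
        lock.lock();
        try {
            return balance;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        TimedLockDemo account = new TimedLockDemo();
        System.out.println(account.deposit(50, 100) + " " + account.balance());
    }
}
```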
Use case: Caches, configuration stores, and other data structures with infrequent updates but frequent reads benefit enormously from read-write locks.
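A configuration store of that kind might look like the sketch below, assuming ReentrantReadWriteLock as the read-write lock implementation. ConfigStore is a hypothetical class.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class ConfigStore {
    private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
    private final Map<String, String> values = new HashMap<>();

    String get(String key) {
        rw.readLock().lock(); // many readers may hold the read lock at once
        try {
            return values.get(key);
        } finally {
            rw.readLock().unlock();
        }
    }

    void put(String key, String value) {
        rw.writeLock().lock(); // exclusive: blocks readers and other writers
        try {
            values.put(key, value);
        } finally {
            rw.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        ConfigStore store = new ConfigStore();
        store.put("mode", "fast");
        System.out.println(store.get("mode"));
    }
}
```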
StampedLock: Optimistic Concurrency
Introduced in Java 8, StampedLock offers an even more sophisticated locking mechanism with three modes: writing, reading, and
optimistic reading. The optimistic mode allows reads to proceed without any locking overhead, validating afterward that no write
occurred.
Optimistic Read Pattern
StampedLock sl = new StampedLock();
long stamp = [Link]();
// read variables
if () {
    // write occurred, need real lock
    stamp = [Link]();
    try {
        // re-read variables
    } finally {
        [Link](stamp);
    }
}
When to Use
StampedLock excels in scenarios with very low write contention where optimistic reads
rarely need to retry. This makes it ideal for read-dominated data structures in
low-contention environments.
Performance: Can outperform ReadWriteLock by 2-3x in read-heavy workloads with minimal writes.
Read Throughput: 10x (improvement over synchronized in read-heavy scenarios)
Lock Overhead: 0 (optimistic reads acquire no lock; they only validate a stamp afterward)
Lock Modes: 3 (write, read, and optimistic read)
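The optimistic read pattern can be sketched in full as follows, using StampedLock's tryOptimisticRead(), validate(), readLock(), and unlockRead(). The Point class is a hypothetical example modeled on the usual two-field illustration.

```java
import java.util.concurrent.locks.StampedLock;

class Point {
    private final StampedLock sl = new StampedLock();
    private double x;
    private double y;

    void move(double dx, double dy) {
        long stamp = sl.writeLock(); // exclusive write mode
        try {
            x += dx;
            y += dy;
        } finally {
            sl.unlockWrite(stamp);
        }
    }

    double distanceFromOrigin() {
        long stamp = sl.tryOptimisticRead(); // no lock is acquired here
        double cx = x;
        double cy = y;
        if (!sl.validate(stamp)) {
            // A write occurred since the stamp was issued: fall back to a real read lock.
            stamp = sl.readLock();
            try {
                cx = x;
                cy = y;
            } finally {
                sl.unlockRead(stamp);
            }
        }
        return Math.hypot(cx, cy);
    }

    public static void main(String[] args) {
        Point p = new Point();
        p.move(3, 4);
        System.out.println(p.distanceFromOrigin());
    }
}
```

The fields are copied into locals before validation so that the result is computed only from values known to be consistent.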
Atomic Variables: Lock-Free Concurrency
Atomic variables provide thread-safe operations on single variables without using locks. Built on top of compare-and-swap (CAS) CPU instructions, they
offer a lightweight, lock-free alternative to synchronization for simple operations.
The [Link] package provides atomic classes for integers, longs, booleans, references, and arrays. These classes form the foundation for building
high-performance concurrent data structures.
AtomicInteger in Action
AtomicInteger provides thread-safe operations on integer values without requiring explicit synchronization. It's particularly useful for counters, sequence
generators, and other scenarios requiring atomic numeric updates.
incrementAndGet(): Atomically increment and return new value
getAndIncrement(): Return current value, then increment
addAndGet(delta): Add delta and return new value
compareAndSet(): Update if value matches expectation
Attempt CAS: Atomically compare and swap if value hasn't changed
Retry on Failure: If CAS fails due to concurrent modification, loop and retry
Success Rate: 99% (first-try success in low contention)
Retry Average: 2/10 (iterations needed under moderate contention)
Key insight: CAS loops work well when contention is low to moderate. Under extreme contention, locks may actually perform better due to less wasted retry work.
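A CAS retry loop can be sketched with AtomicInteger.compareAndSet. The example below maintains a running maximum; CasMax and updateMax are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasMax {
    // Atomically raises 'target' to 'candidate' if candidate is larger; returns the stored value.
    static int updateMax(AtomicInteger target, int candidate) {
        while (true) {
            int current = target.get();
            if (candidate <= current) {
                return current; // nothing to update
            }
            if (target.compareAndSet(current, candidate)) {
                return candidate; // CAS succeeded
            }
            // CAS failed: another thread changed the value, so re-read and retry
        }
    }

    public static void main(String[] args) {
        AtomicInteger max = new AtomicInteger(5);
        updateMax(max, 3);
        updateMax(max, 9);
        System.out.println(max.get());
    }
}
```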
AtomicReference: Lock-Free Object Updates
While AtomicInteger handles numeric values, AtomicReference enables atomic operations on object references. This allows you to build lock-free data
structures by atomically swapping entire objects rather than protecting field updates with locks.
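One way to sketch this is a copy-on-write list whose whole contents are replaced atomically. SnapshotList is a hypothetical class. The update function may be re-applied if another thread wins the race, so it must be free of side effects.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

class SnapshotList {
    // The reference always points at an immutable list.
    private final AtomicReference<List<String>> ref =
        new AtomicReference<>(Collections.emptyList());

    void add(String item) {
        // updateAndGet retries the function until its CAS succeeds.
        ref.updateAndGet(old -> {
            List<String> copy = new ArrayList<>(old);
            copy.add(item);
            return Collections.unmodifiableList(copy);
        });
    }

    // Readers get a consistent snapshot without taking any lock.
    List<String> snapshot() {
        return ref.get();
    }

    public static void main(String[] args) {
        SnapshotList list = new SnapshotList();
        list.add("a");
        list.add("b");
        System.out.println(list.snapshot());
    }
}
```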
Simple data types: For plain integers and simple references without reuse, ABA is typically not a concern.
AtomicStampedReference: Solving ABA
AtomicStampedReference solves the ABA problem by pairing each reference with an integer stamp (version number). Now CAS operations require both
the reference and the stamp to match, detecting any intermediate changes even if the reference value cycles back.
AtomicMarkableReference
Simpler alternative using a boolean mark instead of an integer stamp when you only need to detect change, not count versions
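The stamp check can be demonstrated single-threaded by simulating the interfering updates inline. StampedAbaDemo is a hypothetical class, and the A to B to A sequence stands in for what another thread might do.

```java
import java.util.concurrent.atomic.AtomicStampedReference;

class StampedAbaDemo {
    // Returns whether a CAS based on a stale (reference, stamp) pair succeeds.
    static boolean staleCasSucceeds() {
        String a = "A";
        String b = "B";
        AtomicStampedReference<String> ref = new AtomicStampedReference<>(a, 0);

        // A reader records the current reference and stamp.
        int[] stampHolder = new int[1];
        String seen = ref.get(stampHolder);
        int seenStamp = stampHolder[0];

        // Meanwhile the value goes A -> B -> A, bumping the stamp each time.
        ref.compareAndSet(a, b, 0, 1);
        ref.compareAndSet(b, a, 1, 2);

        // The reference matches again, but the stamp (now 2) does not.
        return ref.compareAndSet(seen, "C", seenStamp, seenStamp + 1);
    }

    public static void main(String[] args) {
        System.out.println(staleCasSucceeds());
    }
}
```

A plain AtomicReference would have accepted the final CAS, because it compares only the reference.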
LongAdder: High-Throughput Counters
LongAdder is a specialized class designed for high-contention scenarios where many threads frequently update a counter. Instead of all threads
competing for a single atomic value, it maintains multiple cells that threads can update independently, summing them only when the total is needed.
LongAdder adder = new LongAdder();
// Increment from many threads
[Link]();
[Link](5);
// Get current sum
long total = [Link]();
// Reset to zero
[Link]();
When to Use
• High-frequency updates from many threads
• Reads are infrequent compared to writes
• Perfect for metrics, statistics, request counters
Tradeoff: sum() is not an atomic snapshot and may not reflect updates made while it runs; use it when approximate counts are acceptable.
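A request counter along these lines might be sketched as below. RequestCounter and countRequests are hypothetical names. The sum is exact here only because it is read after every writer has finished.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

class RequestCounter {
    // Many threads increment one LongAdder; the total is read once at the end.
    static long countRequests(int threads, int perThread) throws InterruptedException {
        LongAdder adder = new LongAdder();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < perThread; i++) {
                    adder.increment(); // contended updates are spread across cells
                }
            });
        }
        pool.shutdown();
        if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
            throw new IllegalStateException("writers did not finish in time");
        }
        return adder.sum(); // no concurrent updates remain, so this is exact
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(countRequests(8, 1000));
    }
}
```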
Concurrent Collections
The [Link] package provides thread-safe collection implementations optimized for concurrent access. Unlike
synchronized wrappers, these collections use sophisticated lock-free or fine-grained locking algorithms to maximize parallelism.
// Consumer
Task task = [Link]();
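The consumer fragment above looks like one half of a producer-consumer pair. Assuming it refers to a BlockingQueue, the full pattern might be sketched as follows, with Integer items instead of Task and a hypothetical -1 sentinel that tells the consumer to stop.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class ProducerConsumerDemo {
    private static final int DONE = -1; // hypothetical end-of-stream marker

    // Produces 1..items on one thread, consumes on the caller's thread, returns the sum.
    static int sumConsumed(int items) throws InterruptedException {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(4);

        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= items; i++) {
                    queue.put(i); // blocks while the queue is full
                }
                queue.put(DONE);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        int sum = 0;
        while (true) {
            int value = queue.take(); // blocks while the queue is empty
            if (value == DONE) {
                break;
            }
            sum += value;
        }
        producer.join();
        return sum;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(sumConsumed(10));
    }
}
```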
ConcurrentHashMap: Scalable Thread-Safe Map
ConcurrentHashMap is one of the most important concurrent collections. Unlike Hashtable or [Link](), it allows concurrent reads
and uses fine-grained locking for writes, achieving excellent scalability.
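Default Segments: 16 (partitions for parallel writes in Java 7 and earlier; since Java 8 the map locks individual hash bins instead of fixed segments)
Read Concurrency: 100% (reads proceed without locking)
Throughput Gain: 10x (vs synchronized HashMap under load)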
// Atomic remove
[Link]("key", 42);
Java 8+: ConcurrentHashMap offers rich atomic operations like compute(), merge(), and parallel bulk operations for functional-style concurrent programming.
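The atomic operations mentioned above can be sketched with a concurrent word count. WordCount is a hypothetical class. merge() performs the read-modify-write for a key as one atomic step, and the two-argument remove() deletes an entry only if it still maps to the expected value.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

class WordCount {
    // Counts words from a parallel stream; merge() makes each update atomic.
    static ConcurrentHashMap<String, Integer> count(List<String> words) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        words.parallelStream().forEach(w -> counts.merge(w, 1, Integer::sum));
        return counts;
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> counts = count(Arrays.asList("a", "b", "a"));
        System.out.println(counts.get("a") + " " + counts.get("b"));
        // Conditional remove: succeeds only if "a" is still mapped to 2.
        System.out.println(counts.remove("a", 2));
    }
}
```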
Coordination Utilities
Beyond locks and atomic variables, Java provides high-level utilities for coordinating the execution phases of multiple threads. These synchronizers handle common patterns
like waiting for multiple tasks to complete or implementing phased parallel algorithms.
CountDownLatch: Makes threads wait until a count reaches zero
CyclicBarrier: Threads wait for each other at a barrier point
Exchanger: Two threads exchange objects at a synchronization point
Semaphore: Controls access to a resource pool with permits
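Two of these synchronizers are combined in the sketch below: a Semaphore caps how many workers run a guarded section at once, and a CountDownLatch lets the caller wait for all of them. The class name, worker count, and 10 ms sleep are hypothetical choices.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

class CoordinationDemo {
    // Returns the largest number of workers seen inside the guarded section at once.
    static int maxConcurrent(int workers, int permits) throws InterruptedException {
        Semaphore semaphore = new Semaphore(permits);
        CountDownLatch done = new CountDownLatch(workers);
        AtomicInteger inside = new AtomicInteger();
        AtomicInteger max = new AtomicInteger();

        for (int i = 0; i < workers; i++) {
            new Thread(() -> {
                try {
                    semaphore.acquire(); // wait for a free permit
                    try {
                        int now = inside.incrementAndGet();
                        max.accumulateAndGet(now, Math::max);
                        Thread.sleep(10); // simulated use of the resource
                        inside.decrementAndGet();
                    } finally {
                        semaphore.release();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown(); // signal completion even on failure
                }
            }).start();
        }

        done.await(); // block until every worker has counted down
        return max.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(maxConcurrent(8, 3) <= 3);
    }
}
```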
Fork: Split task into independent subtasks
Distribute: Worker threads steal subtasks from queue
Compute: Process subtasks in parallel
Join: Combine results from subtasks
ForkJoinPool pool = [Link]();
// Submit a recursive task
Integer result = [Link](
    new RecursiveTaskImpl(data)
);
Work Stealing
The framework uses work-stealing: idle worker threads steal tasks from busy threads' queues, ensuring balanced
load distribution and maximum CPU utilization.
Best for: Recursive algorithms like merge sort, tree traversals, parallel array operations, and other divide-and-conquer problems.
RecursiveTask: Fork/Join with Results
RecursiveTask<V> is the primary abstraction for tasks that return values in the Fork/Join framework. You extend this class and implement the compute()
method to define your parallel algorithm.
RecursiveAction: Use RecursiveAction instead of RecursiveTask when your parallel computation doesn't need to return a value.
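A divide-and-conquer array sum is a common way to sketch RecursiveTask. SumTask and its THRESHOLD of 1,000 elements are assumptions made for illustration; a real cutoff should be tuned by measurement.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000; // assumed cutoff for sequential work
    private final long[] data;
    private final int from;
    private final int to;

    SumTask(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum; // small enough: compute directly
        }
        int mid = (from + to) >>> 1;
        SumTask left = new SumTask(data, from, mid);
        SumTask right = new SumTask(data, mid, to);
        left.fork();                        // schedule the left half asynchronously
        long rightResult = right.compute(); // compute the right half in this thread
        return left.join() + rightResult;   // wait for the left half and combine
    }

    static long parallelSum(long[] data) {
        return ForkJoinPool.commonPool().invoke(new SumTask(data, 0, data.length));
    }

    public static void main(String[] args) {
        long[] data = new long[10_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(parallelSum(data));
    }
}
```

Forking one half and computing the other in the current thread keeps the calling worker busy instead of parking it while both halves run elsewhere.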
CompletableFuture: Asynchronous Programming
CompletableFuture revolutionizes asynchronous programming in Java by providing a rich API for composing, combining, and transforming asynchronous
operations. It represents a value that will be available in the future and supports non-blocking callbacks.
thenAccept: Consume final result
[Link](() -> {
    return fetchUserId();
})
.thenApply(id -> {
    return "User-" + id;
})
.thenCompose(username -> {
    return fetchUserDetails(username);
})
.thenAccept(details -> {
    [Link](details);
})
.exceptionally(ex -> {
    handleError(ex);
    return null;
});
Stage Variants
• thenApply: sync transformation
• thenApplyAsync: async transformation
• thenAccept: consume result
• thenRun: side effect action
• thenCompose: flatten nested futures
• thenCombine: merge two futures
Non-blocking: The calling thread never waits for a result. Each stage runs as a callback once the previous stage completes, which keeps threads free for other work and supports highly scalable reactive applications.
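A self-contained version of such a pipeline might look like the sketch below. It assumes the first stage is started with CompletableFuture.supplyAsync; fetchUserId and fetchUserDetails are hypothetical stubs that return canned values.

```java
import java.util.concurrent.CompletableFuture;

class UserPipeline {
    // Hypothetical stand-ins for remote calls.
    static int fetchUserId() {
        return 42;
    }

    static CompletableFuture<String> fetchUserDetails(String username) {
        return CompletableFuture.supplyAsync(() -> username + " (details)");
    }

    static CompletableFuture<String> describeUser() {
        return CompletableFuture.supplyAsync(UserPipeline::fetchUserId)
            .thenApply(id -> "User-" + id)               // transform the result
            .thenCompose(UserPipeline::fetchUserDetails) // chain a second async call
            .exceptionally(ex -> "unknown user");        // fallback if any stage fails
    }

    public static void main(String[] args) {
        // join() blocks only here, at the edge of the program, to print the result.
        System.out.println(describeUser().join());
    }
}
```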
ThreadLocal: Thread-Confined State
ThreadLocal provides each thread with its own independent copy of a variable. This enables thread-safe code without synchronization by eliminating
shared mutable state—each thread accesses only its own copy.
Warning: ThreadLocal can cause memory leaks in thread pools if not properly cleaned up. Always call remove() when done.
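The cleanup advice can be sketched as follows, using a hypothetical RequestContext class. A single-thread pool is used so that both tasks are guaranteed to run on the same reused thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class RequestContext {
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    static String handle(String user) {
        CURRENT_USER.set(user);
        try {
            return greet(); // code further down the call stack reads the thread's own copy
        } finally {
            CURRENT_USER.remove(); // pooled threads are reused, so always clean up
        }
    }

    private static String greet() {
        return "hello " + CURRENT_USER.get();
    }

    // Runs a request on a pooled thread, then checks that the same thread holds no stale value.
    static boolean cleanAfterReuse() throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<String> request = () -> handle("ann");
            Callable<Boolean> check = () -> CURRENT_USER.get() == null;
            pool.submit(request).get();
            return pool.submit(check).get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(handle("ann") + ", clean: " + cleanAfterReuse());
    }
}
```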
Common Concurrency Pitfalls
Concurrent programming introduces subtle failure modes that don't exist in sequential code. Understanding these common pitfalls helps you write more robust concurrent
applications and debug issues when they arise.
// Monitor contention
java -XX:+PrintConcurrentLocks