JMM: Synchronized and Volatile Deep Dive
The Java Memory Model (JMM) is the cornerstone of concurrent programming. To master the "Two Titans"—synchronized and volatile—one must look beyond the syntax and into the hardware-level pipelines, CPU buffers, and the intricate contract between the JVM and the hardware.
1. The JMM Abstraction: "The Courier System"
While JVM Runtime Data Areas (Stack, Heap, Metaspace) define where data resides, the JMM defines how threads safely interact with that data.
1.1 Main Memory vs. Working Memory
The JMM abstracts the hardware (Registers, L1/L2/L3 caches) into two logical tiers:
- Main Memory: Shared storage (Heap and Static fields).
- Working Memory: Thread-private storage (CPU registers/cache).
1.2 The 8 Atomic Operations
To ensure data flows correctly between these tiers, the JMM enforces 8 atomic primitives:
- read/load: Fetching from Main to Working memory.
- use/assign: Interacting with the CPU's execution engine.
- store/write: Flushing from Working back to Main memory.
- lock/unlock: The ultimate authority for atomic exclusivity, which also flushes buffers upon release.
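As a sketch, the journey of a plain increment through these primitives can be annotated like this (a conceptual mapping for illustration, not actual JVM output; `lock`/`unlock` only come into play under `synchronized`):

```java
public class AtomicOpsWalkthrough {
    static int counter = 0; // lives in Main Memory (static field)

    public static void main(String[] args) {
        // counter = counter + 1 decomposes (conceptually) into:
        // 1. read/load:   copy counter from Main Memory into Working Memory
        // 2. use:         hand the value to the execution engine (for the +1)
        // 3. assign:      write the result back into Working Memory
        // 4. store/write: flush the new value from Working Memory to Main Memory
        counter = counter + 1;
        System.out.println(counter); // prints 1
    }
}
```

Note that because this is four-plus separate steps, two threads can interleave between them, which is exactly why `count++` is not atomic.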
2. Hardware Reality: The Root of Corruption
Why is concurrency difficult? Because hardware engineers prioritized speed over simplicity.
2.1 The Speed Gap & Store Buffers
CPUs are rockets; RAM is a bicycle. To bridge this, CPUs use Store Buffers (outbox) and Invalidate Queues (drafts). A CPU writes to its Store Buffer and continues executing without waiting for the RAM to acknowledge. This causes Visibility Lag: Core A updated a value, but Core B still sees the old one from its own cache.
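A minimal sketch of how visibility lag bites in practice: without `volatile`, the worker below may never observe the flag flip (the JIT is free to hoist the read out of the loop). The version shown here uses `volatile`, so the writer's store becomes visible to the reader:

```java
public class VisibilityDemo {
    // volatile: the writer's store must become visible to the reader's loads
    static volatile boolean stopRequested = false;

    // Returns true if the worker observed the flag and terminated.
    static boolean runDemo() throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (!stopRequested) { /* spin until the volatile write is seen */ }
        });
        worker.start();
        Thread.sleep(100);    // let the worker enter its loop
        stopRequested = true; // volatile write: visible to the worker
        worker.join(1000);    // with volatile, returns well before the timeout
        return !worker.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runDemo() ? "stopped" : "stuck");
    }
}
```

Remove the `volatile` keyword and, on many JITs, the worker spins forever on a stale cached value.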
2.2 Reordering: as-if-serial
The JVM and CPU will reorder instructions to maximize pipeline throughput, as long as the result in a single-threaded context remains the same. In multi-threaded environments, this reordering is a catastrophe (e.g., partial object exposure).
3. The Defense: Memory Barriers
To fight hardware lag, the JMM utilizes Memory Barriers (Fences):
- LoadLoad / StoreStore: Ensures previous ops complete before subsequent ones.
- StoreLoad: The "Heavy Hammer." It forces a full flush of all store buffers and waits for all invalidation signals to be processed. This is usually implemented via the `lock` prefix in x86 assembly.
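Since Java 9, these barriers are exposed directly as static methods on `VarHandle` (the fence names are the JDK's; the barrier mapping in the comments is the conventional reading, sketched here for illustration):

```java
import java.lang.invoke.VarHandle;

public class FenceDemo {
    static int data = 0;
    static int ready = 0;

    // Publish data, then a ready flag, with explicit fences between the stores.
    static int publish() {
        data = 42;
        VarHandle.releaseFence(); // StoreStore|LoadStore: data cannot sink below this line
        ready = 1;
        VarHandle.fullFence();    // the StoreLoad "Heavy Hammer": drains store buffers
        return data + ready;
    }

    public static void main(String[] args) {
        System.out.println(publish()); // prints 43
    }
}
```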
4. Volatile: The Lightweight Shield
4.1 Bytecode vs. Hardware
At the bytecode level, volatile fields are marked with ACC_VOLATILE. When the JIT compiler sees this, it injects memory barriers around the read/write instructions.
4.2 The Magic Grid (Visibility & Ordering)
volatile provides two guarantees:
- Visibility: A write is flushed to main memory immediately; a read always observes the most recent write (conceptually reading from main memory rather than a stale local copy).
- Ordering: It prevents reordering of instructions across the volatile "fence."
4.3 The DCL Singleton Pitfall
Without volatile, the `instance = new Singleton()` line can be reordered into:
1. Allocate memory
2. Assign the reference
3. Initialize the object

A second thread might see a non-null reference (Step 2) and try to use an object that hasn't been initialized yet (Step 3 hasn't happened), reading default or garbage field values and typically crashing with a NullPointerException downstream. Adding `volatile` forces initialization to complete before the reference is published.
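The standard fix is the double-checked locking idiom with a `volatile` instance field:

```java
public class Singleton {
    // volatile forbids the assign-before-initialize reordering described above
    private static volatile Singleton instance;

    private Singleton() { }

    public static Singleton getInstance() {
        if (instance == null) {                  // first check: no lock on the hot path
            synchronized (Singleton.class) {
                if (instance == null) {          // second check: only one thread constructs
                    instance = new Singleton();  // volatile write: safe publication
                }
            }
        }
        return instance;
    }
}
```

The outer check keeps the fast path lock-free; the inner check ensures only one thread ever runs the constructor.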
5. Synchronized: The Heavy Artillery
While volatile handles visibility and ordering, synchronized additionally guarantees Atomicity: the critical section executes as one serialized, indivisible unit.
5.1 The Mark Word: The Battle for the Crown
Every Java object carries a Mark Word in its header (64 bits on a 64-bit JVM). This tag tracks the lock's state:
| State | Bit Tag | Description |
|---|---|---|
| Biased | 01 | Optimistically assigned to the first thread (Solo runner). |
| Lightweight | 00 | Threads "Spin" (Adaptive Spinning) and update a stack-based Lock Record. |
| Heavyweight | 10 | The "Inflation." Threads enter a kernel-level sleep managed by ObjectMonitor. |
5.2 The Lock Inflation Lifecycle
- Biased Lock: The JVM records the `Thread ID` in the Mark Word. Subsequent entries by the same thread cost nearly zero.
- Lightweight Lock: When a second thread arrives, the biased lock is revoked. Both threads attempt to swap the Mark Word with their own stack's Lock Record via CAS.
- Spinning: A failed CAS doesn't cause sleep immediately. The thread "spins" in a `while(true)` loop, hoping the holder finishes soon.
- Heavyweight Lock: If spinning fails (the holder is doing heavy work), the lock inflates. A C++ `ObjectMonitor` is created, and waiting threads are put into a kernel-mode sleep (context switch cost: ~10,000 ns).
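A minimal sketch of the atomicity payoff: two threads hammer a shared counter, and `synchronized` (whatever state the lock inflates to under contention) serializes each read-modify-write, so no increment is lost:

```java
public class SyncCounter {
    private long count = 0;

    // synchronized makes read -> +1 -> write one indivisible unit
    public synchronized void increment() { count++; }
    public synchronized long get() { return count; }

    public static void main(String[] args) throws InterruptedException {
        SyncCounter c = new SyncCounter();
        Runnable task = () -> {
            for (int i = 0; i < 100_000; i++) c.increment();
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(c.get()); // prints 200000: no lost updates
    }
}
```

With the `synchronized` keywords removed, the same run typically prints something well under 200000, because interleaved increments overwrite each other.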
5.3 The Brutal Unfairness
Synchronized is Unfair. When a lock is released, a newly arrived thread might "barge in" and steal the lock via CAS before a sleeping thread can even wake up. This is a deliberate design to maximize System Throughput by avoiding unnecessary context switches.
Summary Decision Matrix
| Feature | volatile | synchronized |
|---|---|---|
| Scope | Variables | Blocks and Methods |
| Visibility | Yes | Yes |
| Ordering | Yes | Yes |
| Atomicity | No | Yes |
| Mechanism | Memory Barriers | Monitor / Lock Inflation |
| Overhead | Minimal | Medium to High (at inflation) |
Golden Rule: Use volatile for status flags and read-heavy indicators. Use synchronized for complex state transitions where multiple steps must appear as a single atomic unit.