Memory Management Overview
What Problems Does Memory Management Solve?
An operating system's memory management subsystem must address four core architectural challenges:
- Address Space Isolation: Each process must believe it has exclusive access to a massive, contiguous block of memory, completely isolated from the data of other processes.
- Allocation and Deallocation: Dynamically assigning physical memory frames to processes when requested and reclaiming them when processes terminate or free memory.
- Memory Expansion (Virtual Memory): Leveraging disk storage to seamlessly extend available memory when physical RAM is exhausted.
- Address Translation: Dynamically mapping the virtual addresses used by application code into the actual physical addresses on the hardware RAM chips.
Virtual Addresses vs. Physical Addresses
The addresses used by user-space programs are Virtual Addresses. Before the CPU can actually fetch data from RAM, these must be translated into Physical Addresses by hardware known as the MMU (Memory Management Unit):
   CPU                MMU               Physical RAM
┌─────────┐    ┌──────────────┐    ┌──────────────┐
│ Virtual │───▶│   Address    │───▶│   Data at    │
│ Address │    │  Translator  │    │   Physical   │
└─────────┘    │ (Page Table) │    │   Address    │
               └──────────────┘    └──────────────┘
0x00400000       Table Lookup        0x7F200000
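To see this in practice, here is a minimal C sketch (the variable names are illustrative) that prints the virtual addresses of its own code, data, stack, and heap. With ASLR enabled, some of these usually differ between runs, underscoring that a program never sees raw physical addresses:

```c
#include <stdio.h>
#include <stdlib.h>

int global_var = 42;                           /* lives in the data segment */

int main(void) {
    int stack_var = 0;                         /* lives on the stack */
    int *heap_var = malloc(sizeof *heap_var);  /* lives on the heap */

    printf("code  (main)  : %p\n", (void *)main);
    printf("data  (global): %p\n", (void *)&global_var);
    printf("stack (local) : %p\n", (void *)&stack_var);
    printf("heap  (malloc): %p\n", (void *)heap_var);

    free(heap_var);
    return 0;
}
```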
Why not just use physical addresses directly?
- Relocation: When a program is compiled, the compiler has no idea which physical RAM addresses will be free when the program eventually runs.
- Contention: Multiple processes would inevitably overwrite each other if they hardcoded the same physical addresses.
- Security: Without a translation layer, malicious code could directly read the kernel's memory or passwords belonging to other users.
The Three Memory Management Architectures
1. Segmentation
Divides a process's address space into contiguous blocks of varying sizes based on logical function—such as the Code Segment, Data Segment, Stack Segment, and Heap Segment. Each segment has a base physical address and a limit (length).
Logical Address = (Segment Number, Offset Within Segment)
Segment Table:
Segment No.   Base Addr   Length/Limit
0 (Code)      0x1000      0x400
1 (Data)      0x2000      0x300
2 (Stack)     0x3000      0x200

Translation: Segment=1, Offset=0x50
→ Physical Address = 0x2000 + 0x50 = 0x2050
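The lookup above fits in a few lines of C. This is a toy model of the hardware's base-and-limit check, not a real MMU interface; the seg_table layout mirrors the table above and the function name is illustrative:

```c
#include <stdint.h>
#include <stdio.h>

struct segment { uint32_t base; uint32_t limit; };

static const struct segment seg_table[] = {
    { 0x1000, 0x400 },   /* 0: Code  */
    { 0x2000, 0x300 },   /* 1: Data  */
    { 0x3000, 0x200 },   /* 2: Stack */
};
#define NSEGS (sizeof seg_table / sizeof seg_table[0])

/* Returns the physical address, or -1 on a limit violation
 * (real hardware would raise a fault instead). */
static int64_t translate(uint32_t seg, uint32_t offset) {
    if (seg >= NSEGS || offset >= seg_table[seg].limit)
        return -1;                         /* protection check failed */
    return (int64_t)seg_table[seg].base + offset;
}

int main(void) {
    int64_t pa = translate(1, 0x50);
    if (pa >= 0)
        printf("0x%llx\n", (unsigned long long)pa);   /* prints 0x2050 */
    return 0;
}
```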
Pros: Directly maps to the program's logical structure, making sharing and protection easy (e.g., marking the entire Code segment as read-only). Cons: Because segments are variable in size, allocating and freeing them quickly leads to External Fragmentation (free memory exists, but it's chopped into pieces too small to fit a new segment).
2. Paging
Divides both physical RAM and virtual address spaces into fixed-size blocks. The physical blocks are called Frames, and the virtual blocks are called Pages; the standard size for both is 4KB.
A virtual address splits into two fields: the high bits form the Virtual Page Number (VPN), and the low 12 bits form the Page Offset.
Page Table (maintained in RAM by the OS):
VPN   Physical Frame No.   Valid Bit
0     3                    1
1     —                    0    ← Unmapped (triggers Page Fault)
2     7                    1
3     1                    1

Virtual Addr: 0x00002050 (VPN=2, Offset=0x050)
→ Physical Frame No. = 7
→ Physical Address = (7 × 4096) + 0x050 = 0x7050
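The same walk in code: a toy single-level translator assuming 4KB pages and the page table above (the constants and names are from this example, not a real kernel API):

```c
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12                        /* 4KB = 2^12 bytes */
#define PAGE_MASK  0xFFF

struct pte { uint32_t frame; int valid; };   /* one page-table entry */

static const struct pte page_table[] = {
    { 3, 1 }, { 0, 0 }, { 7, 1 }, { 1, 1 },
};

int main(void) {
    uint32_t vaddr  = 0x00002050;
    uint32_t vpn    = vaddr >> PAGE_SHIFT;   /* VPN = 2 */
    uint32_t offset = vaddr & PAGE_MASK;     /* offset = 0x050 */

    if (vpn >= 4 || !page_table[vpn].valid) {
        puts("page fault");                  /* would trap into the OS */
        return 1;
    }
    uint32_t paddr = (page_table[vpn].frame << PAGE_SHIFT) | offset;
    printf("0x%04x\n", paddr);               /* prints 0x7050 */
    return 0;
}
```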
Pros: Eliminates external fragmentation because all blocks are identical in size. Cons: Can cause Internal Fragmentation (if a process only needs 1KB, it still gets a 4KB page, wasting 3KB). The Page Table itself can consume massive amounts of RAM.
Multi-Level Page Tables
In a 64-bit architecture, a single-level page table mapping every 4KB page would require $2^{64} / 2^{12} = 2^{52}$ entries per process; at 8 bytes per entry, that is $2^{55}$ bytes (32 PB) of RAM just for the table. The solution is a Multi-Level Page Table (a tree structure):
x86-64 Four-Level Page Table:
Virtual Address (48 bits actively used):
┌────────┬────────┬────────┬────────┬─────────┐
│  PML4  │  PDP   │   PD   │   PT   │ Offset  │
│ 9 bits │ 9 bits │ 9 bits │ 9 bits │ 12 bits │
└───┬────┴───┬────┴───┬────┴───┬────┴─────────┘
    │        │        │        │
    ▼        ▼        ▼        ▼
PML4 Tab → PDP Tab → PD Tab → PT Tab → Frame No. + Offset
The Advantage: Sparse Allocation. If a massive chunk of virtual memory is never used by the process, the OS simply doesn't allocate the lower-level tables for that region, saving immense amounts of RAM.
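A small sketch of the bit-slicing step, assuming the 9/9/9/9/12 split shown above (the shift amounts come from the diagram; the sample address and variable names are illustrative):

```c
#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint64_t vaddr = 0x00007f42a1c83b10ULL;   /* arbitrary sample address */

    unsigned pml4 = (vaddr >> 39) & 0x1FF;    /* bits 47..39 */
    unsigned pdp  = (vaddr >> 30) & 0x1FF;    /* bits 38..30 */
    unsigned pd   = (vaddr >> 21) & 0x1FF;    /* bits 29..21 */
    unsigned pt   = (vaddr >> 12) & 0x1FF;    /* bits 20..12 */
    unsigned off  =  vaddr        & 0xFFF;    /* bits 11..0  */

    printf("PML4=%u PDP=%u PD=%u PT=%u offset=0x%03x\n",
           pml4, pdp, pd, pt, off);
    return 0;
}
```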
3. Segmented Paging
A hybrid approach: First, divide the address space into segments, then divide each segment into fixed-size pages.
Note: Modern Linux structurally retains segmentation due to x86 hardware legacy, but effectively nullifies it by setting all segment base addresses to 0 and limits to Max, relying entirely on Paging.
TLB: The Page Table Cache
Translating an address via a 4-level page table requires 4 separate trips to physical RAM. This is unacceptably slow. The TLB (Translation Lookaside Buffer) is an ultra-fast hardware cache situated directly inside the MMU that stores recently translated VPN-to-Frame mappings.
CPU emits Virtual Address
│
▼
Check TLB ─── Cache Hit (~1 cycle) ───▶ Emit Physical Address
│
Cache Miss
│
▼
Walk Multi-Level Page Table (~100 cycles)
│
▼
Update TLB Cache
│
▼
Emit Physical Address
Because software execution exhibits Locality of Reference (code executes sequentially, and data is often accessed in clusters), TLB hit rates typically exceed 99%.
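To make the flow concrete, here is a toy direct-mapped TLB in C. The entry count and the walk_page_table() stub are illustrative assumptions, not real hardware behavior; the point is the hit/miss/refill structure from the diagram above:

```c
#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

#define TLB_ENTRIES 64
#define PAGE_SHIFT  12

struct tlb_entry { uint64_t vpn; uint64_t frame; bool valid; };
static struct tlb_entry tlb[TLB_ENTRIES];

/* Stand-in for the slow multi-level page-table walk (~100 cycles). */
static uint64_t walk_page_table(uint64_t vpn) { return vpn + 100; }

static uint64_t translate(uint64_t vaddr) {
    uint64_t vpn  = vaddr >> PAGE_SHIFT;
    unsigned slot = vpn % TLB_ENTRIES;        /* direct-mapped index */

    if (tlb[slot].valid && tlb[slot].vpn == vpn) {
        /* TLB hit: ~1 cycle on real hardware */
        return (tlb[slot].frame << PAGE_SHIFT) | (vaddr & 0xFFF);
    }
    /* TLB miss: walk the table, then cache the result */
    uint64_t frame = walk_page_table(vpn);
    tlb[slot] = (struct tlb_entry){ vpn, frame, true };
    return (frame << PAGE_SHIFT) | (vaddr & 0xFFF);
}

int main(void) {
    printf("0x%llx\n", (unsigned long long)translate(0x2050)); /* miss */
    printf("0x%llx\n", (unsigned long long)translate(0x2060)); /* hit  */
    return 0;
}
```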
System Design Audit & Observability
Memory management abstractions dictate the latency and stability of high-performance applications.
1. The TLB Thrashing Phenomenon
If an application randomly accesses massive datasets scattered across gigabytes of memory (e.g., massive hash tables or in-memory databases), the TLB cache constantly misses. This "TLB thrashing" forces the CPU to constantly walk the page table, devastating performance.
- Audit Protocol: Enable HugePages (2MB or 1GB pages instead of 4KB) in the Linux kernel. A single TLB entry can then cover 2MB, or even 1GB, of contiguous memory instead of 4KB, drastically reducing TLB misses for large-heap JVMs or databases like PostgreSQL (see the sketch below).
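As an illustration, a Linux process can request huge pages explicitly via mmap() with the MAP_HUGETLB flag. This sketch assumes huge pages have been reserved beforehand (e.g., via the vm.nr_hugepages sysctl); if none are available, the call simply fails:

```c
#define _GNU_SOURCE
#include <sys/mman.h>
#include <stdio.h>

#define HUGE_2MB (2UL * 1024 * 1024)

int main(void) {
    /* Ask the kernel for one default-size (typically 2MB) huge page. */
    void *buf = mmap(NULL, HUGE_2MB,
                     PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB,
                     -1, 0);
    if (buf == MAP_FAILED) {
        perror("mmap(MAP_HUGETLB)");   /* likely no huge pages reserved */
        return 1;
    }
    /* The whole 2MB region is now covered by a single TLB entry. */
    munmap(buf, HUGE_2MB);
    return 0;
}
```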
2. Monitoring Page Faults
When an application allocates memory via malloc(), the kernel doesn't immediately assign physical RAM. It assigns a frame only when the application first touches that address, triggering a minor "Page Fault" (demonstrated in the sketch below).
- Audit Command: Use `pidstat -r 1` or `sar -B`. Monitor `minflt/s` (Minor Faults: assigning RAM) and `majflt/s` (Major Faults: reading swapped memory back from disk). A high Major Fault rate means your application is thrashing against the disk, and a catastrophic latency spike is imminent.
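The lazy-allocation behavior is easy to observe with getrusage(2), which exposes the process's fault counters. In this sketch (the 64MB size is arbitrary), the fault count barely moves when malloc() returns and jumps only when the pages are first touched:

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>

static long minor_faults(void) {
    struct rusage ru;
    getrusage(RUSAGE_SELF, &ru);
    return ru.ru_minflt;          /* cumulative minor faults so far */
}

int main(void) {
    size_t size = 64UL * 1024 * 1024;        /* 64MB */
    long before = minor_faults();

    volatile char *buf = malloc(size);       /* virtual only: few faults */
    long after_alloc = minor_faults();

    for (size_t i = 0; i < size; i += 4096)  /* touch one byte per page  */
        buf[i] = 1;
    long after_touch = minor_faults();

    printf("alloc: +%ld faults, touch: +%ld faults\n",
           after_alloc - before, after_touch - after_alloc);
    free((void *)buf);
    return 0;
}
```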
3. The Cost of Context Switching on Memory
When the OS switches from Process A to Process B, Process B has an entirely different page table.
- Audit Protocol: On hardware without tagged TLB entries (ASIDs, or PCIDs on x86), the OS must flush the TLB during a process context switch, because the cached VPN-to-Frame mappings are no longer valid for the new process. This makes process switching computationally expensive. Thread switching within the same process, conversely, keeps the same page table, so the TLB remains hot and valid.