The Log Trio: Redo, Undo, and Binlog
The Three Pillars of MySQL Logging
| Log | Layer | Format | Primary Purpose |
|---|---|---|---|
| Redo Log | InnoDB Engine | Physical | Crash Recovery (Durability) |
| Undo Log | InnoDB Engine | Logical | Transaction Rollback + MVCC |
| Binlog | MySQL Server | Logical | Replication + Point-in-time Recovery |
Redo Log vs. Binlog: The Essential Difference
| Feature | Redo Log | Binlog |
|---|---|---|
| Ownership | Exclusive to InnoDB | Server-wide (All engines) |
| Level | Physical (Page offsets) | Logical (Statement or Row change) |
| Lifespan | Circular. Overwrites old logs as checkpoints advance. | Persistent. New files are created as old ones fill up. |
| Role | Ensures the database is structurally sound after a crash. | Ensures other servers (replicas) have the same data. |
The Two-Phase Commit (2PC)
Because Redo Logs and Binlogs are two separate systems, we face a "Distributed Transaction" problem. If one is written but the other fails (due to a crash), the Master and its Replicas will have different data.
The Solution: 2-Phase Commit (2PC).
- Prepare Phase: InnoDB writes the change to the Redo Log and marks it as
PREPARE. - Commit Phase Part 1: The change is written to the Binlog.
- Commit Phase Part 2: InnoDB marks the Redo Log as
COMMIT.
Recovery Logic:
- If
PREPAREis found but noBinlogentry $\rightarrow$ Rollback (Master and Slave stay consistent at "old" state). - If
PREPAREis found AND aBinlogentry exists $\rightarrow$ Commit (Master and Slave stay consistent at "new" state).
Technical Deep Dive: The Circular Redo Log
┌────────┬────────┬────────┬────────┐
│ file 0 │ file 1 │ file 2 │ file 3 │
└────────┴────────┴────────┴────────┘
↑ checkpoint ↑ write pos
(Dirty pages flushed) (Current writing tail)
The Redo Log has a fixed size. As the write pos moves forward, it eventually wraps around. If it catches up to the checkpoint, MySQL must stop all user traffic to "flush" memory pages to disk, freeing up space in the log. This is known as a "Redo Log Checkpoint Pause" and can cause massive latency spikes in high-concurrency systems.
Implementation Insight: Group Commit
Writing to disk is slow. To optimize, MySQL doesn't disk-sync after every single transaction. Instead, it waits a few milliseconds to "group" multiple committed transactions into a single physical write to the log. This technique, Group Commit, is what allows MySQL to handle thousands of transactions per second even on standard hardware.