HTTP Protocol
HTTP Fundamentals
The HyperText Transfer Protocol (HTTP) is the foundation of the World Wide Web. It is an application-layer protocol based on a request-response model, typically running over TCP (and, since HTTP/3, over QUIC on top of UDP).
Request Message Anatomy
GET /index.html HTTP/1.1 ← Request Line (Method + URL + Version)
Host: www.example.com ← Headers
User-Agent: Mozilla/5.0
Accept: text/html
Connection: keep-alive
← Empty Line
[Request Body] ← Optional (usually empty for GET)
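The layout above can be checked with a few lines of Python. This is a minimal sketch, not a production parser: `parse_request` is a hypothetical helper, and it assumes a well-formed message with CRLF line endings and no repeated or folded headers.

```python
def parse_request(raw: bytes):
    """Split a raw HTTP/1.1 request into request line, headers, and body."""
    # The first empty line (CRLF CRLF) separates the header section from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Request line: method, request target, protocol version.
    method, target, version = lines[0].split(" ", 2)

    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalize them to lowercase.
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers, body


raw = (
    b"GET /index.html HTTP/1.1\r\n"
    b"Host: www.example.com\r\n"
    b"User-Agent: Mozilla/5.0\r\n"
    b"Accept: text/html\r\n"
    b"Connection: keep-alive\r\n"
    b"\r\n"
)
method, target, version, headers, body = parse_request(raw)
print(method, target, version, headers["host"], body)
```

For this GET request the body comes back empty, which matches the "usually empty for GET" note above.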
Response Message Anatomy
HTTP/1.1 200 OK ← Status Line (Version + Code + Phrase)
Content-Type: text/html ← Headers
Content-Length: 1234
Set-Cookie: session=abc123
← Empty Line
<html>...</html> ← Response Body
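Going the other way, a response can be assembled from the same three parts. The sketch below uses a hypothetical `build_response` helper and assumes a UTF-8 HTML body; the point it illustrates is that Content-Length counts bytes of the encoded body, not characters.

```python
def build_response(status: int, reason: str, body: str, extra_headers=None) -> bytes:
    """Serialize a status line, headers, an empty line, and a body."""
    payload = body.encode("utf-8")
    headers = {
        "Content-Type": "text/html; charset=utf-8",
        # Content-Length is the size of the encoded body in bytes.
        "Content-Length": str(len(payload)),
    }
    headers.update(extra_headers or {})

    head = f"HTTP/1.1 {status} {reason}\r\n"
    head += "".join(f"{name}: {value}\r\n" for name, value in headers.items())
    # The bare CRLF is the empty line that ends the header section.
    return head.encode("ascii") + b"\r\n" + payload


response = build_response(
    200, "OK", "<html>é</html>", {"Set-Cookie": "session=abc123"}
)
print(response.decode("utf-8"))
```

The body here is 14 characters but 15 bytes, because "é" takes two bytes in UTF-8, so the header reads Content-Length: 15.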
HTTP Methods
| Method | Purpose | Idempotent | Safe |
|---|---|---|---|
| GET | Retrieve a resource | ✅ | ✅ |
| POST | Create a new resource / Submit data | ❌ | ❌ |
| PUT | Replace a resource | ✅ | ❌ |
| DELETE | Remove a resource | ✅ | ❌ |
| PATCH | Partial update | ❌ | ❌ |
| OPTIONS | Describe communication options (CORS preflight) | ✅ | ✅ |
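GET vs POST: GET parameters are appended to the URL as a query string, so they are visible in browser history and server logs, and their length is bounded in practice by browser and server limits (the protocol itself sets no maximum). POST parameters travel in the request body, which is not recorded in history and is limited only by whatever size the server chooses to accept. Semantically, GET is for "reading" and POST is for "writing/submitting."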
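The difference in where the parameters live can be seen by building both requests with Python's standard library. Nothing is sent over the network here; the search URL and parameter names are made up for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request

params = {"q": "http basics", "page": "2"}
encoded = urlencode(params)  # form-encoded string: q=http+basics&page=2

# GET: the parameters become part of the URL itself.
get_req = Request("https://www.example.com/search?" + encoded)

# POST: the same parameters go into the request body instead.
# urllib switches the method to POST when a body (data) is supplied.
post_req = Request(
    "https://www.example.com/search",
    data=encoded.encode("ascii"),
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)

print(get_req.get_method(), get_req.full_url, get_req.data)
print(post_req.get_method(), post_req.full_url, post_req.data)
```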
Status Codes
| Range | Category | Common Examples |
|---|---|---|
| 1xx | Informational | 101 Switching Protocols |
| 2xx | Success | 200 OK, 201 Created, 204 No Content |
| 3xx | Redirection | 301 Moved Permanently, 302 Found, 304 Not Modified (Cache) |
| 4xx | Client Error | 400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found |
| 5xx | Server Error | 500 Internal Server Error, 503 Service Unavailable |
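Because the category is carried by the first digit, a client can classify a code it has never seen before. A small sketch using the standard `http.HTTPStatus` enum; the `describe` helper and the `CATEGORIES` mapping are illustrative names, not part of any library.

```python
from http import HTTPStatus

CATEGORIES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def describe(code: int) -> str:
    """Return the code, its standard reason phrase, and its category."""
    category = CATEGORIES.get(code // 100, "Unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # Not a code the standard library knows; the range still tells us the class.
        phrase = "Unrecognized"
    return f"{code} {phrase} ({category})"


for code in (200, 301, 404, 503, 299):
    print(describe(code))
```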
Evolution of HTTP
| Version | Year | Key Features |
|---|---|---|
| HTTP/1.0 | 1996 | One TCP connection per request (Very slow) |
| HTTP/1.1 | 1997 | Keep-Alive (Persistent connections), Chunked Transfer, Host Header |
| HTTP/2 | 2015 | Multiplexing, Header Compression (HPACK), Binary Frames, Server Push |
| HTTP/3 | 2022 | Built on QUIC (UDP), eliminates Transport-layer Head-of-line Blocking |
Multiplexing: The HTTP/2 Revolution
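HTTP/1.1 suffered from Head-of-Line (HoL) Blocking: responses on a connection must come back in request order, so if one large request was slow, all subsequent requests behind it were blocked. HTTP/2 solves this at the application layer by splitting messages into Binary Frames tagged with Stream IDs. Frames from multiple requests and responses can be interleaved on a single TCP connection, allowing the browser to download images, CSS, and JS in parallel without waiting. A lost TCP packet can still stall every stream on that connection, which is the transport-layer blocking that HTTP/3 removes.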
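The framing idea can be simulated without any networking. The sketch below is a toy model, not the real HTTP/2 frame format: frames are plain `(stream_id, chunk)` tuples, the 4-byte frame size is chosen only to make the interleaving visible, and the round-robin scheduling is an assumption made for illustration.

```python
from collections import defaultdict
from itertools import zip_longest

FRAME_SIZE = 4  # deliberately tiny so the interleaving is easy to see


def to_frames(stream_id: int, payload: bytes):
    """Cut one message into (stream_id, chunk) frames."""
    return [
        (stream_id, payload[i:i + FRAME_SIZE])
        for i in range(0, len(payload), FRAME_SIZE)
    ]


def interleave(*streams):
    """Round-robin frames from several streams onto one shared 'connection'."""
    wire = []
    for group in zip_longest(*streams):
        wire.extend(frame for frame in group if frame is not None)
    return wire


def reassemble(wire):
    """Receiver side: use the stream ID to rebuild each message."""
    messages = defaultdict(bytes)
    for stream_id, chunk in wire:
        messages[stream_id] += chunk
    return dict(messages)


# Client-initiated HTTP/2 streams use odd IDs; the payloads are made up.
responses = {1: b"<html>big page</html>", 3: b"body{}", 5: b"alert(1)"}
wire = interleave(*(to_frames(sid, data) for sid, data in responses.items()))
received = reassemble(wire)

print([sid for sid, _ in wire])
print(received == responses)
```

In this model the two small responses finish after two frames each, long before the large page on stream 1 is complete, which is exactly what in-order HTTP/1.1 responses could not do.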
State Management: Cookie vs. Session
Since HTTP is inherently stateless, we use Cookies and Sessions to "remember" users:
- Cookie: A small piece of data (a name-value pair plus attributes) that the server asks the browser to store and send back with later requests.
  - Risk: Can be stolen by injected scripts if not marked HttpOnly, or intercepted over plain HTTP if not marked Secure.
- Session: Data stored on the server (RAM or Database).
  - Mechanism: The server sends a Set-Cookie: session_id=XYZ header. The browser includes this ID in subsequent requests. The server uses the ID to look up the user's data.
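The session mechanism can be sketched with an in-memory dictionary standing in for the server-side store. The `login` and `current_user` functions and the `SESSIONS` dict are hypothetical names; a real application would also need expiry, logout, and persistent storage.

```python
import secrets
from http.cookies import SimpleCookie

SESSIONS = {}  # server-side store: session ID -> user data (RAM in this sketch)


def login(username: str) -> str:
    """Create a session and return the Set-Cookie header line for the response."""
    session_id = secrets.token_urlsafe(16)  # unguessable random ID
    SESSIONS[session_id] = {"user": username}

    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    cookie["session_id"]["httponly"] = True  # hide the cookie from page scripts
    cookie["session_id"]["secure"] = True    # only send it over HTTPS
    return cookie.output(header="Set-Cookie:")


def current_user(cookie_header: str):
    """Look up the user from the Cookie header the browser sends back."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session_id")
    if morsel is None:
        return None
    session = SESSIONS.get(morsel.value)
    return session["user"] if session else None


set_cookie_line = login("alice")
print(set_cookie_line)

# The browser echoes back only the name=value pair, not the attributes.
cookie_header = set_cookie_line.split(": ", 1)[1].split(";")[0]
print(current_user(cookie_header))
```

Only the opaque ID crosses the network; the user data itself never leaves the server, which is the practical difference from storing state directly in a cookie.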
Deep Technical Insights
The "Cost" of Connection
In HTTP/1.0, every image on a page required a full TCP 3-way handshake. For 50 images, that's 50 handshakes + 50 slow-starts. HTTP/1.1 fixed this with persistent connections, but browser limits (usually 6 parallel connections per domain) still constrained speed. HTTP/2’s single-connection multiplexing is the true performance unlock for modern web apps.
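The arithmetic behind that comparison can be written out as a rough model. This is a back-of-the-envelope sketch under loud simplifications: it counts only TCP handshakes and "rounds" of requests that can be in flight together, it assumes the same 6-connection cap for HTTP/1.0 and HTTP/1.1, and it ignores TLS, bandwidth, slow-start, and server-side stream limits. `connection_cost` is a hypothetical helper.

```python
import math


def connection_cost(resources: int, version: str, max_parallel: int = 6) -> dict:
    """Estimate handshakes and request rounds for loading `resources` items."""
    if version == "HTTP/1.0":
        handshakes = resources                        # one connection per request
        rounds = math.ceil(resources / max_parallel)
    elif version == "HTTP/1.1":
        handshakes = min(resources, max_parallel)     # connections are reused
        rounds = math.ceil(resources / max_parallel)  # one request at a time on each
    elif version == "HTTP/2":
        handshakes = 1                                # single multiplexed connection
        rounds = 1                                    # all requests in flight at once
    else:
        raise ValueError(f"unknown version: {version}")
    return {"handshakes": handshakes, "request_rounds": rounds}


for version in ("HTTP/1.0", "HTTP/1.1", "HTTP/2"):
    print(version, connection_cost(50, version))
```

For 50 images the model gives 50 handshakes under HTTP/1.0, 6 under HTTP/1.1, and 1 under HTTP/2, with the number of request rounds dropping from 9 to 1 once multiplexing is available.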
Idempotency and API Design
Idempotency means that making the same request multiple times has the same effect on server state as making it once. GET, PUT, and DELETE should be idempotent. If a DELETE request is retried after a network glitch, the retry must not remove anything else; it should simply return a success or 404, and the resource stays deleted either way. Designing APIs with these semantics ensures that network retries don't cause unintended side effects.
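These semantics can be illustrated with a toy in-memory resource store. `ItemStore` is a hypothetical class whose methods return `(status_code, item_id)` tuples; it is a sketch of the behavior, not a web framework handler.

```python
class ItemStore:
    """Toy resource collection that mimics POST, PUT, and DELETE semantics."""

    def __init__(self):
        self.items = {}
        self.next_id = 1

    def post(self, data):
        # Not idempotent: every call creates a new resource with a fresh ID.
        item_id = self.next_id
        self.next_id += 1
        self.items[item_id] = data
        return 201, item_id

    def put(self, item_id, data):
        # Idempotent: repeating the call leaves the same final state.
        created = item_id not in self.items
        self.items[item_id] = data
        return (201 if created else 200), item_id

    def delete(self, item_id):
        # Idempotent: a retry finds nothing to remove and touches nothing else.
        if item_id in self.items:
            del self.items[item_id]
            return 204, item_id
        return 404, item_id


store = ItemStore()
print(store.put(7, "draft"), store.put(7, "draft"))  # same state after both calls
print(store.delete(7), store.delete(7))              # second call is a harmless 404
print(store.post("order"), store.post("order"))      # retry created a duplicate
```

The status code of the retried DELETE differs (204, then 404), but the server state is identical after both calls, and that is what idempotency requires. The duplicated POST shows why retrying non-idempotent requests needs extra care.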