Java I/O Architecture Overview
Java's I/O architecture has undergone three major evolutions: BIO (Blocking I/O) → NIO (Non-blocking I/O) → AIO (Asynchronous I/O). This article establishes a high-level conceptual understanding first, with subsequent articles diving deep into each individually.
What is I/O? (Clarifying the Reference Perspective)
Many beginners frequently confuse the directions of Input and Output: for instance, if data is written to a computer's hard drive, is that considered 'In' or 'Out'?
This requires establishing a core, unbreakable rule: When discussing programmatic I/O, the central perspective and reference point must ALWAYS be the "currently executing program" itself (i.e., the JVM memory).
- I (Input): Refers to the process of external data (files on disk, requests from another server over the network, user keystrokes) entering our JVM memory. Thus "reading a file" (Read) is Input (e.g., `InputStream`).
- O (Output): Refers to the process of sending processed data from our JVM memory out to the external world (saving to a file on disk, sending to a browser over the network, printing to a monitor). Thus "writing to a file" (Write) is Output (e.g., `OutputStream`).
Layman's Analogy: Imagine your "brain" is the current JVM memory.
- Using your eyes to read a book and recognize words (bringing external text into your brain) is Input.
- Using a pen to write an article or speaking to someone (expressing the knowledge from your brain to the external world) is Output.
Java I/O Evolutionary History
| JDK Version | Package (API) | Characteristics |
|---|---|---|
| JDK 1.0 | java.io (BIO) | Blocking, operates on Streams, simple and intuitive. |
| JDK 1.4 | java.nio (NIO) | Non-blocking, operates on Blocks (Buffers), Selector for multiplexing. |
| JDK 1.7 | java.nio.file (NIO.2) | New File APIs (Path, Files), Asynchronous I/O (AIO). |
Stream vs Block
| Dimension | Stream (BIO) | Block (NIO) |
|---|---|---|
| Unit of Operation | Byte/Character by Byte/Character | Block of data (Buffer) |
| Direction | Unidirectional (InputStream or OutputStream) | Bidirectional (Channel) |
| Efficiency | Relatively Low | High (Batch transmission) |
| Programming Complexity | Simple | Relatively Complex |
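To make the table concrete, here is a minimal sketch contrasting the two styles: a stream pulls one byte per call, while a channel fills a whole Buffer per call. The file name demo.txt and its contents are assumptions for illustration.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class StreamVsBlock {
    public static void main(String[] args) throws IOException {
        Path path = Path.of("demo.txt"); // hypothetical sample file
        Files.writeString(path, "hello, i/o");

        // Stream (BIO): pull the data one byte at a time, one direction only
        try (InputStream in = new FileInputStream(path.toFile())) {
            int b;
            while ((b = in.read()) != -1) {   // each call returns a single byte
                System.out.print((char) b);
            }
        }
        System.out.println();

        // Block (NIO): pull a whole block of data into a Buffer in one call
        try (FileChannel ch = FileChannel.open(path, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(64);
            ch.read(buf);      // fills the buffer in bulk
            buf.flip();        // switch the buffer from write mode to read mode
            while (buf.hasRemaining()) {
                System.out.print((char) buf.get());
            }
        }
        System.out.println();
    }
}
```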
Core Class Hierarchy of java.io
java.io
├── Byte Streams (Ending with Stream)
│ ├── InputStream (Abstract Base Class)
│ │ ├── FileInputStream ← File read
│ │ ├── BufferedInputStream ← Buffered (Performance boost)
│ │ ├── DataInputStream ← Read primitives
│ │ └── ObjectInputStream ← Deserialization
│ └── OutputStream (Abstract Base Class)
│ ├── FileOutputStream ← File write
│ ├── BufferedOutputStream ← Buffered
│ ├── DataOutputStream ← Write primitives
│ └── ObjectOutputStream ← Serialization
│
└── Character Streams (Ending with Reader/Writer)
├── Reader (Abstract Base Class)
│ ├── FileReader ← Text file read
│ ├── BufferedReader ← Buffered, supports readLine()
│ └── InputStreamReader ← Byte stream to Char stream (Specify encoding)
└── Writer (Abstract Base Class)
├── FileWriter ← Text file write
├── BufferedWriter ← Buffered, supports newLine()
└── OutputStreamWriter ← Char stream to Byte stream
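The Data streams in the tree above can be illustrated with a short sketch: it writes primitives to an in-memory byte stream and reads them back in the same order and types (the values chosen are arbitrary):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class DataStreamDemo {
    public static void main(String[] args) throws IOException {
        // DataOutputStream decorates any byte stream and writes primitives in a fixed binary layout
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bos)) {
            out.writeInt(42);
            out.writeDouble(3.14);
            out.writeUTF("hello");
        }
        // Read them back in the SAME order and with the SAME types
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
            System.out.println(in.readInt());     // 42
            System.out.println(in.readDouble());  // 3.14
            System.out.println(in.readUTF());     // hello
        }
    }
}
```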
Byte Streams vs Character Streams
| Dimension | Byte Stream | Character Stream |
|---|---|---|
| Processing Unit | 8-bit Bytes | 16-bit Unicode Characters |
| Applicable Scenarios | Images, audio/video, binary files | Text files |
| Base API | InputStream/OutputStream | Reader/Writer |
| Encoding Issues | Requires manual handling | Handled internally (Specify encoding via InputStreamReader) |
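A small sketch of the encoding point in the last row: InputStreamReader is the byte-to-character bridge, and the charset passed to it must match how the bytes were produced (the sample string is an assumption):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    public static void main(String[] args) throws IOException {
        String text = "héllo";                        // contains one non-ASCII character
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        System.out.println(utf8.length);              // 6 bytes: 'é' occupies 2 bytes in UTF-8

        // InputStreamReader decodes the raw bytes back into characters using the given charset
        try (Reader r = new InputStreamReader(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8)) {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = r.read()) != -1) sb.append((char) c);
            System.out.println(sb);                   // héllo: 5 characters recovered from 6 bytes
        }
    }
}
```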
Decorator Pattern
java.io extensively utilizes the Decorator Pattern, allowing streams to be wrapped layer by layer:
// Read a file, with buffering, supporting reading line by line
BufferedReader reader = new BufferedReader(
        new InputStreamReader(
                new FileInputStream("file.txt"),
                StandardCharsets.UTF_8));

// Write to a file, with buffering; the second PrintWriter argument enables auto-flush on println()
PrintWriter writer = new PrintWriter(
        new BufferedWriter(
                new FileWriter("out.txt")),
        true);
Each layer adds new functionality while the core entity (FileInputStream) remains unchanged; this is the essence of the Decorator Pattern.
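To see the pattern itself rather than the JDK's ready-made wrappers, here is a hand-rolled decorator in the same spirit: a hypothetical CountingInputStream that wraps any InputStream and adds byte counting, leaving the wrapped class untouched:

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// FilterInputStream is the JDK's decorator base class: it forwards every call to the wrapped stream
class CountingInputStream extends FilterInputStream {
    private long count = 0;
    CountingInputStream(InputStream in) { super(in); }

    @Override public int read() throws IOException {
        int b = super.read();
        if (b != -1) count++;          // added behavior: count each byte that passes through
        return b;
    }
    @Override public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) count += n;
        return n;
    }
    long getCount() { return count; }
}

public class DecoratorDemo {
    public static void main(String[] args) throws IOException {
        byte[] data = "hello decorator".getBytes();
        // Decorators stack freely: counting on top of buffering on top of the raw source
        try (CountingInputStream in = new CountingInputStream(
                new BufferedInputStream(new ByteArrayInputStream(data)))) {
            while (in.read() != -1) { /* drain the stream */ }
            System.out.println(in.getCount()); // 15
        }
    }
}
```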
Core Components of java.nio
NIO rests on three pillars: Buffer, Channel, and Selector.
Selector (Multiplexer)
│
├── Channel 1 (Non-blocking) ←→ Buffer (Data container)
├── Channel 2 (Non-blocking) ←→ Buffer
└── Channel N (Non-blocking) ←→ Buffer
| Component | Role |
|---|---|
| Buffer | Data container, possessing three pointers: position/limit/capacity. |
| Channel | Bidirectional conduit for data, can be configured as non-blocking. |
| Selector | Monitors multiple Channels, processing whichever is ready (Multiplexing). |
NIO permits a single thread to manage multiple connections via a Selector, drastically reducing thread overhead on the server side.
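The position/limit/capacity pointers from the table can be observed directly; this minimal sketch writes three bytes, flips the buffer for reading, then clears it for the next write:

```java
import java.nio.ByteBuffer;

public class BufferPointersDemo {
    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(8);      // capacity = 8, fixed for the buffer's lifetime
        System.out.println(buf.position() + " " + buf.limit() + " " + buf.capacity()); // 0 8 8

        buf.put((byte) 'a').put((byte) 'b').put((byte) 'c'); // write 3 bytes
        System.out.println(buf.position());           // 3: position advances with each put

        buf.flip();                                   // switch to read mode: limit = old position, position = 0
        System.out.println(buf.position() + " " + buf.limit()); // 0 3

        while (buf.hasRemaining()) System.out.print((char) buf.get()); // abc
        System.out.println();

        buf.clear();                                  // reset for the next write: position = 0, limit = capacity
        System.out.println(buf.position() + " " + buf.limit()); // 0 8
    }
}
```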
AIO (Asynchronous I/O)
AIO is genuinely asynchronous; the operating system proactively notifies the program once the I/O operation is complete:
// NIO (non-blocking, requires polling)
channel.configureBlocking(false);
// Must keep checking the Selector for ready events

// AIO (asynchronous, just register a callback)
AsynchronousFileChannel channel = AsynchronousFileChannel.open(path);
channel.read(buffer, 0, null, new CompletionHandler<Integer, Void>() {
    @Override
    public void completed(Integer result, Void attachment) {
        // Automatically invoked after the I/O completes
    }

    @Override
    public void failed(Throwable exc, Void attachment) {
        // Invoked on failure
    }
});
// read() returns immediately after registering the callback; nothing blocks
Note: On Linux, AIO's underlying implementation is still simulated with thread pools (Linux's native asynchronous I/O semantics are incomplete), so it yields no significant performance gain. In practice, frameworks such as Netty and Vert.x heavily favor the NIO + Reactor pattern over AIO.
Deep Dive Comparison of the Three I/Os (OS Perspective & Layman's Analogies)
To truly comprehend the distinctions among these three, we must dissect them from the underlying mechanisms of the operating system (e.g., blocking/non-blocking, synchronous/asynchronous) and combine this with relatable analogies.
1. Core Concepts: Blocking vs Non-blocking, Synchronous vs Asynchronous
At the operating system level, a complete network read (I/O) operation is primarily divided into two stages:
- Data Preparation Stage (Waiting for data): Waiting for the NIC (Network Interface Card) to receive external data, and copying that data into the OS Kernel Buffer.
- Data Copy Stage (Copying data from kernel to user): Copying data from the Kernel Buffer into the user-space process memory (JVM Memory).
- Blocking vs Non-blocking: The distinction lies in the first stage.
- Blocking: If the kernel data is not yet ready, the current process/thread is suspended and put to sleep (yielding the CPU), and is not awakened until the data arrives.
- Non-blocking: If the kernel data is not ready, the process does not sleep; the call immediately returns an error code (e.g., `EAGAIN`). The process can use this time to do other work, retrying (polling) periodically.
- Synchronous vs Asynchronous: The distinction lies in the second stage, specifically who is responsible for copying the data.
- Synchronous: The data copy stage is proactively invoked and executed by the user process itself. During this copy, the user process must wait for the copy to finish; hence, it inherently involves blocking. Both BIO and NIO are Synchronous I/O.
- Asynchronous: The data copy stage is entirely delegated to the Operating System (Kernel). The kernel prepares the data, copies it from kernel space to user space, and then directly issues a callback notification to the user process. The user process is absolutely unblocked during both stages.
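The "returns immediately instead of sleeping" behavior of the first stage can be observed in plain Java. This sketch uses an in-process Pipe as a stand-in for a socket (the two-byte payload is an assumption):

```java
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;

public class NonBlockingReadDemo {
    public static void main(String[] args) throws Exception {
        Pipe pipe = Pipe.open();                       // in-process channel pair, stands in for a socket
        pipe.source().configureBlocking(false);        // stage 1 becomes non-blocking

        ByteBuffer buf = ByteBuffer.allocate(16);
        // No data has been written yet: a blocking read would sleep here,
        // but a non-blocking read returns 0 immediately (the Java-level analogue of EAGAIN).
        int n = pipe.source().read(buf);
        System.out.println(n);                         // 0: nothing ready, the thread keeps running

        pipe.sink().write(ByteBuffer.wrap("hi".getBytes()));
        n = pipe.source().read(buf);                   // now the data is sitting in the kernel buffer
        System.out.println(n);                         // 2
    }
}
```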
2. BIO (Synchronous Blocking I/O)
- OS Level: After the application issues a `read` system call, it is blocked through both stages described above. The process is suspended; `read` does not return until the kernel has prepared the data AND copied it from kernel space into user space. (Underneath, this corresponds to Linux's default blocking socket.)
- Analogy (queueing for food at a restaurant): You order a freshly cooked steak (issuing a `read`). You can only stand at the pickup window, waiting and unable to do anything else (blocking), until the chef finishes the steak and places it on the counter (the data preparation stage). Then you carry the plate to your seat yourself (the data copy stage, which also takes time), and only then can you eat.
3. NIO (Synchronous Non-blocking I/O and Multiplexing)
- OS Level: Pure "non-blocking I/O" has a fatal flaw: with 10,000 connections, the application must keep issuing 10,000 `read` system calls in a loop, asking the OS "is your data ready?". This constant switching between user mode and kernel mode makes the CPU spin pointlessly and incurs massive overhead. To solve this, operating systems introduced I/O Multiplexing. Simply put: delegate the dirty work of polling 10,000 connections to the kernel, instead of having the application ask frantically from user mode.
- Evolution of the core technology (select → epoll):
  - `select`/`poll` (early multiplexing): The application packages the list of 10,000 sockets (file descriptors, FDs) and sends them all at once to the kernel. The kernel iterates over these 10,000 connections, marks any that are ready, and returns the list. The application then iterates over the list again to find the ready sockets and performs the reads. Disadvantages: 1) all 10,000 FDs must be copied from user space into the kernel on every single call; 2) the kernel must scan all 10,000 FDs every time; 3) `select` on Linux has a default limit of 1024 descriptors.
  - `epoll` (the workhorse of modern Linux, and the high-performance cornerstone of Java NIO / Netty / Redis): To overcome `select`'s bottlenecks, Linux split the one bloated operation into three calls:
    - `epoll_create`: allocates a dedicated area in kernel space, creating an epoll instance (internally a red-black tree plus a ready list).
    - `epoll_ctl`: whenever the server receives a new connection, it registers only that single connection onto the kernel's red-black tree, eliminating `select`'s need to re-upload the entire set of connections. Using NIC hardware interrupt callbacks, the kernel automatically places a connection into the ready list when data arrives for it.
    - `epoll_wait`: the application merely checks whether anything is in the kernel's ready list. If the list is empty, the calling thread blocks and sleeps. The moment data arrives on any connection, the kernel wakes the thread blocked in `epoll_wait`, handing it a list containing ONLY the connections that are actually ready.
  - Advantages: no hard connection limit; no repeated copying of the full descriptor set; no blind scanning (readiness checks drop from O(N) to O(1)), making millions of concurrent connections feasible.
Supplementary FAQ: Do all processes in the operating system share one global epoll? No. An epoll instance is not globally unique to the OS. Every time your application (e.g., Tomcat or Netty inside the JVM) calls `epoll_create`, the kernel allocates a new, private epoll instance (with its own red-black tree and ready list) in kernel space. You can even create multiple epoll instances within the same process. This is the foundation of Netty's multi-Reactor thread model: each processing thread (EventLoop) owns its own epoll instance and continuously calls `epoll_wait` on it.

Practical scenario (why epoll): Imagine a WeChat-style chat server holding millions of online users. Even though 1,000,000 users hold long-lived connections (1,000,000 sockets) simultaneously, the vast majority are idle; in any given second perhaps only 10 are actually sending messages.
- With `select`: the server must send the entire roster of 1,000,000 descriptors to the kernel on every call, and the kernel must scan all 1,000,000 connections only to discover that a mere 10 sent anything. The CPU instantly spikes to 100%, unable to handle the load.
- With `epoll`: each of the 1,000,000 connections calls `epoll_ctl` once, at connection setup, to register itself in the kernel. Thereafter the server thread simply calls `epoll_wait` and sleeps. When those 10 users send messages, the NIC interrupt handler places exactly those 10 connections into the ready list; `epoll_wait` wakes up holding precisely those 10 targets, and the server processes them directly.
Therefore, when Java NIO's Selector is invoked, what happens underneath is: a minimal number of threads use `epoll_wait` to monitor a massive number of connections while blocked. When `epoll_wait` returns, the application calls `read` on the connections that are verifiably ready in order to copy the data. (Note: the second stage, copying data from kernel space into JVM user space, is still synchronous and blocking, but because the data is confirmed to exist, the copy never comes up empty.)
- Analogy (restaurant pagers): You order a steak at a popular restaurant and the waiter hands you a vibrating pager; with it in hand, you are free to go shopping or play on your phone. Under the Java NIO multiplexing model, imagine you are the "big boss" who collected pagers for hundreds of employees (the Selector managing many connections). Your single task is to watch the table full of pagers (blocking in `epoll_wait`). Suddenly you notice pagers #7 and #20 vibrating at the same time (data is ready). You run to the pickup window, show the pagers, and carry the plates back to the office yourself (still the second stage: personally moving data from the kernel back to user space, which blocks). But this is vastly more efficient than hiring a hundred people to stand in line.
4. AIO (Asynchronous I/O)
- OS Level: The application initiates an asynchronous read call (e.g., the IOCP API in Windows), passing in a Buffer container for storing the final data along with a callback function. The process tells the kernel: "When the data is ready on the NIC, help me stuff it into this Buffer I've carved out in user space, and only yell for me when everything is completely finished." From start to finish, the application never waits foolishly, experiencing zero blocking.
- Analogy (Food delivery to your door - Zero hassle): You order food delivery at home (initiating an AIO call) and leave your home address for the delivery driver (passing in a Buffer or callback). After ordering, you go play video games or take a shower (entirely non-blocking). When the delivery driver finishes cooking the food, they proactively open your front door and place the perfectly arranged dishes right onto your dining table (Data Preparation + Data Copy are completely delegated and finished in the OS background). Then they yell: "Food's ready, enjoy!" (Executing the callback function). Throughout this process, all the dirty work of carrying plates has been outsourced.
Summary Table of Core Differences
| Dimension | BIO (Synchronous Blocking) | NIO (Synchronous Non-blocking + Multiplexing) | AIO (True Asynchronous) |
|---|---|---|---|
| Data Preparation Stage | Waits indefinitely (Thread suspended/sleeping) | Multiplexer centralizes the wait/suspend (Monitors aggregated channels; does not block on a specific connection) | Operating system processes in background (Process NEVER waits) |
| Data Copy Stage | Process proactively copies data (Blocks thread during this stage) | Process proactively copies data (Blocks thread during this stage) | Operating system completely handles the copy, then notifies process |
| Thread Model | 1 independent worker thread per connection | A small number of selector threads manage tens of thousands of long-lived connections | Relies on OS callbacks to trigger minimal threads |
| Applicable Scenarios | Legacy systems with very low concurrency and few connections | High-concurrency, massive long-lived connections, high-load network frameworks (e.g., Netty, Redis) | Theoretically ideal, but the Linux AIO implementation is lackluster (simulated internally with epoll plus thread pools); less popular than NIO in production |
Core Implementation Code for the Three I/O Architectures
To viscerally experience their differences from a source code perspective, let's look at what the foundational server-side code looks like for these three network programming models in Java. (Please pay close attention to the comments in the code, as they reveal the critical points causing blocking and system calls).
1. BIO Server Example (Blocking points: accept and read)
A BIO server generally allocates an independent thread for every new connection; otherwise it could not go back and accept connections from other clients.
import java.io.InputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class BioServer {
    public static void main(String[] args) throws Exception {
        // Create a server socket listening on port 8080
        ServerSocket serverSocket = new ServerSocket(8080);
        // Use a thread pool instead of 'new Thread' per connection for a slight optimization
        ExecutorService threadPool = Executors.newCachedThreadPool();
        System.out.println("BIO Server started...");
        while (true) {
            // [Blocking point 1: waiting for a connection]
            // The main thread is suspended here until a client connects
            // (this corresponds to the OS-level blocking accept)
            Socket socket = serverSocket.accept();
            System.out.println("New client connected...");
            // Each successful connection MUST get its own dedicated thread for I/O:
            // the subsequent read() also blocks, so if the main thread read here,
            // it could never return to accept other connections.
            threadPool.execute(() -> {
                try {
                    byte[] bytes = new byte[1024];
                    // Obtain the connection's input stream (data ultimately comes from OS kernel buffers)
                    InputStream inputStream = socket.getInputStream();
                    while (true) {
                        // [Blocking point 2: waiting for data]
                        // This worker thread sleeps here. If the client sends nothing for a day,
                        // the thread waits a day, completely wasted.
                        int readCount = inputStream.read(bytes);
                        // -1 means the client disconnected normally
                        if (readCount != -1) {
                            System.out.println("Received data: " + new String(bytes, 0, readCount));
                        } else {
                            break; // client disconnected: exit the loop and end the worker thread
                        }
                    }
                } catch (Exception e) {
                    e.printStackTrace();
                }
            });
        }
    }
}
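To exercise the server above, a minimal client can connect and send one message; the host, port, and message text mirror the server sketch and are otherwise arbitrary:

```java
import java.io.OutputStream;
import java.net.Socket;

public class BioClient {
    public static void main(String[] args) throws Exception {
        // Connects to the BioServer above on localhost:8080 and sends a single message
        try (Socket socket = new Socket("127.0.0.1", 8080)) {
            OutputStream out = socket.getOutputStream();
            out.write("hello from BIO client".getBytes());
            out.flush();
        } // closing the socket makes the server's read() return -1, ending its worker thread
    }
}
```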
2. NIO Server Example (Core: Selector Multiplexing)
Under the NIO model, a single thread (or a handful of threads) can maintain massive amounts of connections. It gathers all connections inside a Selector and collectively hands them to the underlying operating system (like epoll) for highly efficient management.
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;
import java.util.Set;
public class NioServer {
    public static void main(String[] args) throws Exception {
        // 1. Create the server's ServerSocketChannel (analogous to BIO's ServerSocket)
        ServerSocketChannel serverSocketChannel = ServerSocketChannel.open();
        // 2. Bind to port 8080
        serverSocketChannel.socket().bind(new InetSocketAddress(8080));
        // ★ Core step: put the server channel itself into non-blocking mode, so accept() never dead-waits
        serverSocketChannel.configureBlocking(false);
        // 3. Open the multiplexer (on Linux this triggers epoll_create, creating the red-black tree in the kernel)
        Selector selector = Selector.open();
        // 4. Register the server channel with the Selector for OP_ACCEPT (new-connection events).
        //    From now on, all connections are managed through this single selector.
        serverSocketChannel.register(selector, SelectionKey.OP_ACCEPT);
        System.out.println("NIO Server started...");
        while (true) {
            // [Blocking point: waiting here for ANY registered event to fire]
            // select() is the blocking multiplexer call (epoll_wait on Linux).
            // It sleeps until any monitored channel has activity (a new connection or incoming data).
            if (selector.select(1000) == 0) {
                // No events within 1 second: skip this round. The timeout slot can be used
                // for housekeeping tasks, so this is not a pure dead hang.
                continue;
            }
            // Reaching this point means select() was woken up: an event occurred
            Set<SelectionKey> selectionKeys = selector.selectedKeys();
            Iterator<SelectionKey> iterator = selectionKeys.iterator();
            while (iterator.hasNext()) {
                SelectionKey key = iterator.next();
                // MUST remove the key after retrieving it, or the next loop iteration
                // would process this already-handled event again
                iterator.remove();
                // Event 1: a new client initiated a connection handshake
                if (key.isAcceptable()) {
                    // The selector has already confirmed readiness, so accept() returns
                    // the communication SocketChannel immediately, without blocking
                    SocketChannel socketChannel = serverSocketChannel.accept();
                    // Put the new connection into non-blocking mode as well,
                    // so its subsequent read() never stalls the thread
                    socketChannel.configureBlocking(false);
                    // Register the new channel with the selector for read events (OP_READ),
                    // attaching a buffer dedicated to this connection
                    socketChannel.register(selector, SelectionKey.OP_READ, ByteBuffer.allocate(1024));
                    System.out.println("Accepted a new client connection!");
                }
                // Event 2: an existing connection sent new data
                if (key.isReadable()) {
                    SocketChannel channel = (SocketChannel) key.channel();
                    ByteBuffer buffer = (ByteBuffer) key.attachment(); // the buffer attached at registration
                    // [Escaping blocking]: epoll has guaranteed there is data ready here,
                    // so read() does not hang; it copies whatever sits in the kernel buffer into our Buffer
                    int readLen = channel.read(buffer);
                    if (readLen > 0) {
                        buffer.flip(); // switch to read mode before extracting the bytes
                        System.out.println("Received data from client: "
                                + new String(buffer.array(), 0, buffer.limit()));
                        buffer.clear(); // reset the buffer for the next read
                    } else if (readLen < 0) {
                        // A negative return value means the client closed its socket
                        channel.close();
                    }
                }
            }
        }
    }
}
3. AIO Server Example (Core: True Asynchronous Callback Mechanism)
Under the AIO model, everything involving waiting is completely delegated to the operating system. The application merely tosses out a "read" or "accept connection" command to the OS, attaching an object containing success/failure callbacks (CompletionHandler), and can then rest entirely easy.
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousServerSocketChannel;
import java.nio.channels.AsynchronousSocketChannel;
import java.nio.channels.CompletionHandler;
public class AioServer {
    public static void main(String[] args) throws Exception {
        // Create an asynchronous server channel
        AsynchronousServerSocketChannel serverChannel = AsynchronousServerSocketChannel.open()
                .bind(new InetSocketAddress(8080));
        System.out.println("AIO Server started...");
        // Issue an asynchronous accept request.
        // Unlike NIO's select loop, this accept() does not block at all: it returns immediately.
        // The CompletionHandler tells the OS: "when a client connects, invoke completed() for me."
        serverChannel.accept(null, new CompletionHandler<AsynchronousSocketChannel, Void>() {
            // Invoked automatically once the OS has finished establishing a connection
            @Override
            public void completed(AsynchronousSocketChannel socketChannel, Void attachment) {
                // (Important) Re-register the accept callback, like passing a baton;
                // otherwise the server stops receiving further connections.
                serverChannel.accept(null, this);
                System.out.println("New client connected! Handled by thread: "
                        + Thread.currentThread().getName());
                // Allocate a dedicated ByteBuffer for this connection
                ByteBuffer buffer = ByteBuffer.allocate(1024);
                // [The core of AIO's asynchrony]: the read never dead-waits.
                // We delegate to the OS: "when this client's data arrives in the kernel via the NIC,
                // copy it into 'buffer' for me, and only then invoke the completed() below."
                // This call returns in a flash; it does not stand around waiting.
                socketChannel.read(buffer, buffer, new CompletionHandler<Integer, ByteBuffer>() {
                    // Invoked only AFTER the OS has already deposited the data into the user-space ByteBuffer
                    @Override
                    public void completed(Integer result, ByteBuffer attachment) {
                        if (result > 0) {
                            attachment.flip();
                            System.out.println("The OS delivered the client data directly to us: "
                                    + new String(attachment.array(), 0, result));
                            attachment.clear(); // reset the buffer for reuse
                            // To keep listening to this client, submit another read delegation
                            socketChannel.read(attachment, attachment, this);
                        } else if (result == -1) {
                            // Gracefully handle a normal disconnect
                            try { socketChannel.close(); } catch (Exception e) { /* ignore */ }
                        }
                    }

                    @Override
                    public void failed(Throwable exc, ByteBuffer attachment) {
                        System.out.println("A read error occurred!");
                        try { socketChannel.close(); } catch (Exception e) { /* ignore */ }
                    }
                });
            }

            @Override
            public void failed(Throwable exc, Void attachment) {
                System.out.println("Error while accepting an incoming connection!");
            }
        });
        // The main thread would otherwise fall through and exit; joining itself blocks forever,
        // keeping the process alive so the asynchronous callbacks can run.
        Thread.currentThread().join();
    }
}
Best Practices for File I/O
Traditional BIO File Reading and Kernel Pipeline
BIO can certainly be utilized to read local files (In fact, FileInputStream and FileReader are the quintessential BIO file operation components). In file I/O scenarios, BIO remains "synchronous blocking"—if the hard disk head seek is slow or data hasn't been cached into memory, the read() operation will similarly block the currently executing thread.
Below is an example demonstrating how to utilize pure BIO to read a file. Please focus on the underlying principle comments, which reveal how data traverses step-by-step from the disk into your JVM program:
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
public class BioFileDemo {
    public static void main(String[] args) {
        File file = new File("demo.txt");
        // 1. Open a file handle via a system call (corresponds to a Linux file descriptor)
        try (InputStream is = new FileInputStream(file)) {
            // 2. Allocate a byte array in JVM user-space memory as the "container" that receives data
            //    (a size of 1024 to 8192 bytes, a power of two, is typical)
            byte[] buffer = new byte[1024];
            int readLen;
            // 3. Issue read() system calls to the OS in a loop.
            // [Blocking point]: if the kernel has not yet moved the data from disk into memory,
            // the current thread is suspended and put to sleep to wait.
            while ((readLen = is.read(buffer)) != -1) {
                // [What happens underneath this read(buffer) call?]
                // 1) Hardware stage: the disk controller uses DMA (Direct Memory Access) to copy data
                //    from the disk sectors into the OS [kernel buffer] (page cache), with no main-CPU cycles.
                // 2) Suspend stage: while waiting on the disk, the Java thread loses the CPU and sleeps.
                // 3) Copy stage: once the kernel buffer holds the data, the CPU copies it unaltered
                //    across the boundary into the 'buffer' array above (into [JVM user space]).
                // 4) Wake-up stage: when the copy completes, the OS wakes the thread, read() unblocks,
                //    and it returns the actual number of valid bytes written into the array.
                // The last read of a file may not fill all 1024 bytes, so truncate to readLen
                String chunk = new String(buffer, 0, readLen);
                System.out.println("Read a chunk of data: " + chunk);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Deeper question: Why does everyone say you should wrap file reading with `BufferedInputStream`? Because the plain `FileInputStream` above triggers an underlying system call (an expensive user/kernel mode switch) on every `read()`, even for a handful of bytes. Wrapping it in `BufferedInputStream` builds a "transit station" for the channel: it maintains an internal byte array, 8 KB by default. Even if your code requests a single byte, it asks the OS for up to 8 KB in one go. The vast majority of subsequent reads (within that 8 KB) never touch the kernel again; they are served straight from the in-memory array, eliminating huge numbers of pointless system calls.
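The claim about slashing system calls can be made measurable. The sketch below wraps an in-memory stream (standing in for the real file descriptor) in a hypothetical call-counting decorator, then compares raw single-byte reads against buffered reads over the same 10,000 bytes:

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class BufferedVsRawDemo {
    // Counts how many times the underlying stream is actually asked for data
    // (for a real FileInputStream, each such call would be a system call)
    static class CallCountingStream extends FilterInputStream {
        int calls = 0;
        CallCountingStream(InputStream in) { super(in); }
        @Override public int read() throws IOException { calls++; return super.read(); }
        @Override public int read(byte[] b, int off, int len) throws IOException {
            calls++; return super.read(b, off, len);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10_000];

        // Raw: every single-byte read() hits the underlying stream
        CallCountingStream raw = new CallCountingStream(new ByteArrayInputStream(data));
        while (raw.read() != -1) { }
        System.out.println("raw calls: " + raw.calls);          // 10001 (one per byte, plus the final -1)

        // Buffered: the default 8192-byte internal buffer absorbs almost all of them
        CallCountingStream counted = new CallCountingStream(new ByteArrayInputStream(data));
        InputStream buffered = new BufferedInputStream(counted);
        while (buffered.read() != -1) { }
        System.out.println("buffered calls: " + counted.calls); // 3: two refills (8192 + 1808 bytes), then -1
    }
}
```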
NIO File Reading (The Cornerstone of Channels and Zero-Copy)
Many beginners harbor a misconception, assuming that using NIO to read files means the reads no longer block. In truth, local disk files on the vast majority of operating systems fundamentally DO NOT support "non-blocking mode" (for instance, FileChannel has no configureBlocking(false); the method simply doesn't exist, because FileChannel is not a SelectableChannel).
NIO's absolute advantage in file reading stems not from non-blocking behavior, but from its fine-grained manipulation of memory (e.g., direct physical memory off-heap allocation) and its support for system-level "Zero-Copy" mechanisms.
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
public class NioFileDemo {
public static void main(String[] args) {
// 1. Obtain the underlying File Channel, extractable from traditional FileXXXStream or RandomAccessFile
try (RandomAccessFile file = new RandomAccessFile("demo.txt", "r");
FileChannel channel = file.getChannel()) {
// 2. Allocate a Buffer in user space (the JVM heap) to receive the data
// [Advanced optimization / must-know architectural point]: switching this to ByteBuffer.allocateDirect(1024) allocates the buffer outside the Java heap, in native memory.
// The data read by the kernel can then be placed straight into that native memory, skipping the extra copy from a native staging area into the JVM heap.
ByteBuffer buffer = ByteBuffer.allocate(1024);
// 3. Read from the channel into the buffer.
// [Blocking point]: Attention! As stated above, local file reads are still synchronous and blocking here; the call waits for the disk to deliver the data.
int bytesRead = channel.read(buffer);
while (bytesRead != -1) {
// To read the data just loaded into the Buffer, you must first flip() it.
// In plain terms, flip() moves the internal cursor (position) back to the start and caps the limit at where writing stopped, so you can read the data back out in order.
buffer.flip();
// Check whether the Buffer still holds unread (valid) data
while (buffer.hasRemaining()) {
System.out.print((char) buffer.get());
}
// Once the data has been consumed, clear() the buffer so it is ready for the next channel.read() to fill it again.
buffer.clear();
bytesRead = channel.read(buffer);
}
// 🌟 [The Ultimate Performance Weapon: the Zero-Copy Mechanism]:
// If your goal isn't to pull file contents up for processing, but merely to "read a file and send it, completely unaltered, out to the network" (e.g., to a frontend via a Socket),
// NIO lets you call channel.transferTo(...) directly.
// The OS kernel then never copies the file data up into the application at all; it connects the file to the network card inside the kernel (sendfile on Linux), letting the file flow straight out. This is the real trump card behind high-throughput middleware like Kafka and RocketMQ!
} catch (Exception e) {
e.printStackTrace();
}
}
}
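As a rough sketch of the transferTo(...) idea, the following transfers one local file into another. A real network scenario would pass a SocketChannel as the target instead, but the API shape is identical (the file names here are placeholders invented for the demo):

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ZeroCopyDemo {
    public static void main(String[] args) throws IOException {
        // Create a sample source file so the demo is self-contained.
        Path src = Files.writeString(Path.of("src.txt"), "zero-copy payload");
        Path dst = Path.of("dst.txt");
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            // transferTo hands the copy to the OS: where supported (e.g. Linux
            // sendfile), the bytes move kernel-side and never enter the JVM heap.
            long transferred = in.transferTo(0, in.size(), out);
            System.out.println("Transferred " + transferred + " bytes");
        }
    }
}
```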
AIO Asynchronous File Reading (The Ultimate Hands-off Experience)
JDK 7's NIO.2 completed the final piece of the puzzle, introducing a truly asynchronous class for fully hands-off file reading: AsynchronousFileChannel. In this mode, your thread can issue the assignment and then go off and do something else entirely.
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.CompletionHandler;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
public class AioFileDemo {
public static void main(String[] args) throws Exception {
Path path = Path.of("demo.txt");
// 1. Open a file channel with asynchronous capabilities
try (AsynchronousFileChannel fileChannel = AsynchronousFileChannel.open(path, StandardOpenOption.READ)) {
// 2. As before, prepare an empty buffer to hold the contents
ByteBuffer buffer = ByteBuffer.allocate(1024);
// 3. [The key difference: fire and forget]: issue the read and attach a CompletionHandler (callback object) describing what to do once it finishes.
// The parameters are, in order: the destination buffer, the file offset to start reading from (byte 0), an attachment passed through to the callback (the buffer here), and the completion handler itself.
// Note: this line returns immediately, whether the file is 1 GB or 10 GB!
fileChannel.read(buffer, 0, buffer, new CompletionHandler<Integer, ByteBuffer>() {
// [Invoked automatically once the OS finishes the read in the background]
@Override
public void completed(Integer result, ByteBuffer attachment) {
System.out.println("\n[Callback thread] Mission accomplished! Read " + result + " bytes from disk this time.");
attachment.flip();
byte[] data = new byte[attachment.limit()];
attachment.get(data);
System.out.println("[Callback thread] File contents: " + new String(data));
attachment.clear();
}
@Override
public void failed(Throwable exc, ByteBuffer attachment) {
System.out.println("[Callback thread] Read failed; the disk may be in trouble...");
exc.printStackTrace();
}
});
// 4. [Non-blocking in action]: the main thread carries on immediately!
System.out.println("[Main thread] The disk read has been handed off, so there's no point standing around waiting. Off to do something else in the meantime...");
// (The main thread's message will always print first, because the read call above returned without any delay.)
// Demo-only: give the asynchronous callback some time to run. If main exits immediately, the JVM shuts down and the callback thread is terminated with it.
Thread.sleep(2000);
} catch (Exception e) {
e.printStackTrace();
}
}
}
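Besides the CompletionHandler callback style shown above, AsynchronousFileChannel also offers a Future-based variant of read(): you submit the read, keep working, and block only at the moment you actually need the result. A minimal sketch (the sample file is created by the demo itself):

```java
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.Future;

public class AioFutureDemo {
    public static void main(String[] args) throws Exception {
        // Create a sample file so the demo is self-contained.
        Path path = Files.writeString(Path.of("demo.txt"), "async via future");
        try (AsynchronousFileChannel channel = AsynchronousFileChannel.open(path, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(1024);
            // read() returns immediately with a Future; the main thread is free
            // to do other work while the OS performs the read in the background.
            Future<Integer> pending = channel.read(buffer, 0);
            int bytesRead = pending.get(); // blocks only here, at the point of need
            buffer.flip();
            System.out.println(bytesRead + " bytes: " + new String(buffer.array(), 0, buffer.limit()));
        }
    }
}
```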
Utilize try-with-resources
// ✅ Recommended: Auto-close
try (BufferedReader reader = new BufferedReader(new FileReader("file.txt"))) {
String line;
while ((line = reader.readLine()) != null) {
process(line);
}
}
// ❌ Not Recommended: Manual closing is easily forgotten
BufferedReader reader = null;
try {
reader = new BufferedReader(new FileReader("file.txt"));
// ...
} finally {
if (reader != null) reader.close();
}
Utilize NIO.2 (JDK 7+) Files Utility Class
// Read all lines
List<String> lines = Files.readAllLines(Path.of("file.txt"), StandardCharsets.UTF_8);
// Write to file
Files.write(Path.of("out.txt"), content.getBytes(StandardCharsets.UTF_8));
// Copy file
Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
// Recursively traverse directory
Files.walk(Path.of("/dir"))
.filter(Files::isRegularFile)
.forEach(System.out::println);
The Files and Path APIs of NIO.2 are far cleaner than the traditional File API and are highly recommended.
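One caveat worth knowing: Files.readAllLines pulls the entire file into memory at once. For large files, the stream-based Files.lines reads lazily instead. A small sketch (the file name and its contents are invented for the demo):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Stream;

public class LazyLinesDemo {
    public static void main(String[] args) throws IOException {
        // Create a small sample file so the demo is self-contained.
        Path path = Files.write(Path.of("big.txt"),
                List.of("line1", "line2", "line3"), StandardCharsets.UTF_8);
        // Files.lines returns a lazy Stream: lines are fetched from disk on demand,
        // so memory use stays flat regardless of file size. Close it (try-with-resources)
        // to release the underlying file handle.
        try (Stream<String> lines = Files.lines(path, StandardCharsets.UTF_8)) {
            long count = lines.filter(l -> l.startsWith("line")).count();
            System.out.println(count); // prints: 3
        }
    }
}
```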
High-Frequency Engineering Concepts
| Concept | Key Points |
|---|---|
| Differences between BIO/NIO/AIO? | BIO blocks, one-connection-one-thread; NIO is non-blocking, uses multiplexing; AIO relies on asynchronous callbacks. |
| Difference between Byte Stream and Character Stream? | Byte streams handle raw binary; character streams handle text and internally manage encoding conversions. |
| What design pattern is used in java.io? | Decorator pattern; wraps objects layer by layer to augment functionality. |
| What are the three pillars of NIO? | Buffer (Data container), Channel (Bidirectional conduit), Selector (Multiplexer). |
| Why is NIO actually more popular than AIO in production? | Linux AIO implementation is incomplete; The NIO + Reactor pattern (Netty) is already profoundly efficient. |
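As a postscript to the "three pillars" row in the table above: the Buffer pillar boils down to three cursor fields (position, limit, capacity). A minimal sketch of the write → flip → read → clear cycle discussed earlier:

```java
import java.nio.ByteBuffer;

public class BufferCycleDemo {
    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(8);      // position=0, limit=8, capacity=8
        buf.put((byte) 'h').put((byte) 'i');          // position=2 after writing two bytes
        buf.flip();                                   // limit=2, position=0: ready to read
        System.out.println((char) buf.get());         // prints: h (position -> 1)
        System.out.println((char) buf.get());         // prints: i (position -> 2)
        buf.clear();                                  // position=0, limit=8: ready to refill
        System.out.println(buf.position() + "/" + buf.limit()); // prints: 0/8
    }
}
```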