Polymorphism and the Compilation Lifecycle
Polymorphism is the crown jewel of Object-Oriented Programming, enabling powerful abstractions that separate "what a system does" from "how it does it." This article explores the journey of a method call from the source code, through the javac compiler pipeline, into the binary .class format, and finally into the JVM's memory structures.
1. The Two Faces of Polymorphism
In Java, polymorphism manifests in two distinct stages of the application lifecycle.
1.1 Compile-time Polymorphism: Overloading
Method overloading occurs when multiple methods available in the same class (declared there or inherited) share a name but differ in their parameter lists.
- Mechanism: Static Dispatch.
- Resolution: The compiler determines the exact method signature to call during compilation based on the declared (static) types of the arguments.
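A minimal sketch of static dispatch may help here. The class name OverloadDemo and the describe methods are hypothetical, chosen only for illustration. The same String object is passed twice, and the overload selected depends only on the declared type of the variable:

```java
class OverloadDemo {
    // Two overloads that differ only in parameter type (hypothetical names).
    static String describe(Object o) { return "Object"; }
    static String describe(String s) { return "String"; }

    public static void main(String[] args) {
        Object boxed = "hello"; // static type Object, runtime type String
        String plain = "hello"; // static type String

        // The compiler picks the overload from the declared type of the argument.
        System.out.println(describe(boxed)); // prints "Object"
        System.out.println(describe(plain)); // prints "String"
    }
}
```

The runtime type of boxed is String in both calls, yet the first call still binds to describe(Object), because the choice was fixed at compile time.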
1.2 Runtime Polymorphism: Overriding
Method overriding occurs when a subclass provides a specific implementation for a method declared in its parent class.
- Mechanism: Dynamic Dispatch.
- Resolution: The JVM determines which implementation to invoke at runtime based on the actual (runtime) type of the object.
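Dynamic dispatch can be sketched in the same way. Dog and Cat are hypothetical subclasses introduced for this example. Every call site below is compiled against Animal.speak(), but the implementation that runs depends on the object actually stored in the array:

```java
class OverrideDemo {
    static class Animal {
        String speak() { return "..."; }
    }

    // Hypothetical subclasses, each overriding speak().
    static class Dog extends Animal {
        @Override String speak() { return "Woof"; }
    }

    static class Cat extends Animal {
        @Override String speak() { return "Meow"; }
    }

    public static void main(String[] args) {
        Animal[] animals = { new Dog(), new Cat(), new Animal() };
        for (Animal a : animals) {
            // Static type is Animal; the runtime type selects the implementation.
            System.out.println(a.speak()); // prints "Woof", "Meow", "..." in turn
        }
    }
}
```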
Analogy: Overloading is like ordering at a restaurant—you choose "Medium Pizza" or "Large Pizza" at the counter (Compile-time). Overriding is like a delivery service—you address it to "User X," but the driver only finds out the specific apartment number when they arrive at the building (Runtime).
2. The Runtime Engine: Virtual Method Tables (VTable)
How does the JVM find the right code to execute in $O(1)$ time?
2.1 VTable Structure
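When the JVM loads and links a class, it builds a Virtual Method Table (vtable) for it. The vtable is an implementation technique (used by HotSpot, for example) rather than something the JVM specification mandates. It is essentially an array of pointers to the actual method code.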
- Stability: If a subclass inherits from a parent, its vtable entries for inherited methods remain at the exact same index as in the parent.
- Overriding: If a subclass overrides a method, the pointer at that specific index is updated to point to the subclass's version.
2.2 The Execution Path
When the JVM executes an invokevirtual instruction (e.g., animal.speak()):
- It inspects the object header of the instance to find its class metadata.
- It looks up the method's pre-determined index in that class's vtable.
- It jumps to the memory address stored at that index.
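The lookup steps above can be modeled in plain Java. This is only a toy model under stated assumptions: Klass, Obj, the slot constants, and invokeVirtual are hypothetical names, and a list of lambdas stands in for the array of code pointers a real JVM would keep in native memory.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

class VTableSketch {
    // Slot indices are assigned once, when the parent class is "loaded".
    static final int SPEAK = 0;
    static final int SLEEP = 1;

    /** Hypothetical stand-in for class metadata: a name plus a table of method pointers. */
    static final class Klass {
        final String name;
        final List<Supplier<String>> vtable;
        Klass(String name, List<Supplier<String>> vtable) {
            this.name = name;
            this.vtable = vtable;
        }
    }

    /** Hypothetical stand-in for an instance whose header points at its class metadata. */
    static final class Obj {
        final Klass klass;
        Obj(Klass klass) { this.klass = klass; }
    }

    static Klass animalClass() {
        List<Supplier<String>> slots = new ArrayList<>();
        slots.add(() -> "..."); // slot 0: Animal.speak
        slots.add(() -> "Zzz"); // slot 1: Animal.sleep
        return new Klass("Animal", slots);
    }

    static Klass dogClass(Klass parent) {
        // Copy the parent's table so inherited methods keep the same index...
        List<Supplier<String>> slots = new ArrayList<>(parent.vtable);
        // ...then overwrite only the slot that Dog overrides.
        slots.set(SPEAK, () -> "Woof");
        return new Klass("Dog", slots);
    }

    /** Header -> class metadata -> fixed slot -> jump: no searching involved. */
    static String invokeVirtual(Obj receiver, int slot) {
        return receiver.klass.vtable.get(slot).get();
    }

    public static void main(String[] args) {
        Klass animal = animalClass();
        Obj dog = new Obj(dogClass(animal));
        System.out.println(invokeVirtual(dog, SPEAK)); // prints "Woof" (overridden slot)
        System.out.println(invokeVirtual(dog, SLEEP)); // prints "Zzz" (inherited slot)
    }
}
```

Because the slot index is known in advance, the cost of a call does not grow with the number of methods or the depth of the hierarchy.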
3. Interface Dispatch: The ITable Challenge
Why can't interfaces use the same simple vtable index?
Class inheritance is single, so a method's vtable slot can be fixed once for the whole hierarchy. Interfaces are different: a class can implement multiple interfaces in any order, so InterfaceA.method() might be the first method in one class but the tenth in another.
3.1 ITable (Interface Method Table)
The JVM uses a more complex structure for interfaces:
- Interface Entry Table: A list of (Interface, Offset) pairs.
- Method Array: The actual method pointers for those interfaces.
When executing invokeinterface, the JVM must (in the simplest case) linearly scan the entry table to find the correct interface, then use the offset to find the method. This makes invokeinterface theoretically slower than invokevirtual: $O(K)$ versus $O(1)$, where $K$ is the number of interfaces the class implements.
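The scan-then-offset lookup can be sketched as a toy model. All names here (Entry, Klass, Flyer, Swimmer, invokeInterface) are hypothetical, and the layout is a simplification rather than HotSpot's actual data structure. Note how the same interface sits at a different offset in each class, which is why a fixed slot index is not enough:

```java
import java.util.List;
import java.util.function.Supplier;

class ITableSketch {
    // Hypothetical marker interfaces used only as lookup keys.
    interface Flyer {}
    interface Swimmer {}

    /** One (Interface, Offset) pair from the interface entry table. */
    static final class Entry {
        final Class<?> iface;
        final int offset;
        Entry(Class<?> iface, int offset) {
            this.iface = iface;
            this.offset = offset;
        }
    }

    /** Hypothetical class metadata: entry table plus flat method array. */
    static final class Klass {
        final List<Entry> entries;
        final List<Supplier<String>> methods;
        Klass(List<Entry> entries, List<Supplier<String>> methods) {
            this.entries = entries;
            this.methods = methods;
        }
    }

    /** Linear scan over the K entries, then an indexed read into the method array. */
    static String invokeInterface(Klass klass, Class<?> iface, int methodIndex) {
        for (Entry e : klass.entries) {
            if (e.iface == iface) {
                return klass.methods.get(e.offset + methodIndex).get();
            }
        }
        throw new IncompatibleClassChangeError("class does not implement " + iface.getSimpleName());
    }

    static Klass duck() {
        List<Entry> entries = List.of(new Entry(Flyer.class, 0), new Entry(Swimmer.class, 1));
        List<Supplier<String>> methods = List.of(() -> "duck flies", () -> "duck swims");
        return new Klass(entries, methods);
    }

    static Klass penguin() {
        List<Entry> entries = List.of(new Entry(Swimmer.class, 0));
        List<Supplier<String>> methods = List.of(() -> "penguin swims");
        return new Klass(entries, methods);
    }

    public static void main(String[] args) {
        // Swimmer's first method lives at offset 1 in duck but offset 0 in penguin.
        System.out.println(invokeInterface(duck(), Swimmer.class, 0));    // prints "duck swims"
        System.out.println(invokeInterface(penguin(), Swimmer.class, 0)); // prints "penguin swims"
    }
}
```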
3.2 Optimization: Inline Caches (IC)
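To mitigate the search overhead, modern JVMs use Inline Caches, which remember the class of the receiver object seen at a specific call site. On each later call, the JVM only compares the receiver's class with the cached one. If the next 10,000 calls all see the same class (a monomorphic call site), that check succeeds every time, and the JVM skips the table lookup entirely and jumps directly to the cached code.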
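A monomorphic inline cache can be approximated in a few lines. This is a sketch under stated assumptions: CallSite, methodTable, and slowLookups are hypothetical names, and a HashMap stands in for the slow table search. A real JVM patches machine code instead, and handles call sites that see several classes with more elaborate strategies than the simple re-caching shown here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class InlineCacheSketch {
    /** One call site with a single-entry (monomorphic) cache. */
    static final class CallSite {
        private final Map<Class<?>, Function<Object, String>> methodTable; // stand-in for the slow lookup
        private Class<?> cachedClass;
        private Function<Object, String> cachedTarget;
        int slowLookups; // counts how often the slow path was taken

        CallSite(Map<Class<?>, Function<Object, String>> methodTable) {
            this.methodTable = methodTable;
        }

        String invoke(Object receiver) {
            Class<?> actual = receiver.getClass();
            if (actual != cachedClass) {
                // Cache miss: do the full lookup once and remember the result.
                slowLookups++;
                cachedTarget = methodTable.get(actual);
                cachedClass = actual;
            }
            // Cache hit: a single class comparison, then a direct call.
            return cachedTarget.apply(receiver);
        }
    }

    static CallSite newSite() {
        Map<Class<?>, Function<Object, String>> table = new HashMap<>();
        table.put(String.class, o -> "text:" + o);
        table.put(Integer.class, o -> "number:" + o);
        return new CallSite(table);
    }

    public static void main(String[] args) {
        CallSite site = newSite();
        for (int i = 0; i < 10_000; i++) {
            site.invoke("same class every time");
        }
        System.out.println(site.slowLookups); // prints 1

        site.invoke(42); // a different receiver class invalidates the cache
        System.out.println(site.slowLookups); // prints 2
    }
}
```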
4. The javac Compilation Pipeline
The process of turning .java into .class involves six primary stages:
- Lexing (Scanning): Characters are converted into tokens (keywords, identifiers, operators).
- Parsing: Tokens are organized into an Abstract Syntax Tree (AST), checking for grammatical correctness (e.g., missing semicolons).
- Annotation Processing: Processors such as Lombok or MapStruct run and may generate new code. In javac this happens after parsing and before full semantic analysis; any newly generated source files are parsed and processed in a further round.
- Semantic Analysis: This is the "brain" of the compiler. It checks:
  - Symbol Resolution: Do the variables and methods exist?
  - Type Checking: Can you assign this String to this Integer?
  - Access Control: Is this private field accessible here?
- Desugaring: Removing "syntactic sugar." Generics are erased, foreach loops become Iterator-based loops (or indexed loops for arrays), and String concatenation becomes StringBuilder calls (since JDK 9, javac by default emits an invokedynamic call to StringConcatFactory instead).
- Code Generation: The AST is finally translated into JVM bytecode instructions.
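The desugaring stage can be illustrated by writing the lowered form by hand. The sketch below is an approximation of what the compiler produces, not its literal output, and the class and method names are hypothetical. It pairs a foreach loop with its Iterator-based equivalent and shows one visible consequence of generic erasure:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class DesugarSketch {
    /** Source form: enhanced for loop with implicit unboxing. */
    static int sumForEach(List<Integer> xs) {
        int total = 0;
        for (int x : xs) {
            total += x;
        }
        return total;
    }

    /** Roughly the lowered form: explicit Iterator and explicit unboxing. */
    static int sumDesugared(List<Integer> xs) {
        int total = 0;
        for (Iterator<Integer> it = xs.iterator(); it.hasNext(); ) {
            int x = it.next().intValue();
            total += x;
        }
        return total;
    }

    /** After erasure, both lists share one runtime class. */
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        return strings.getClass() == numbers.getClass();
    }

    public static void main(String[] args) {
        List<Integer> xs = List.of(1, 2, 3);
        System.out.println(sumForEach(xs));     // prints 6
        System.out.println(sumDesugared(xs));   // prints 6
        System.out.println(sameRuntimeClass()); // prints true
    }
}
```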
5. Anatomy of a .class File
A .class file is a rigid, binary "Employee Dossier" for a class.
5.1 The Binary Header
- Magic Number: 0xCAFEBABE. A unique identifier for Java class files.
- Version: A minor and a major version number that specify the minimum Java version required to run this file (e.g., major version 61 = Java 17).
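The header fields can be read directly with a DataInputStream, since the file starts with a 4-byte magic number followed by a 2-byte minor and a 2-byte major version. The sketch below assumes that the class file for java.lang.String is reachable as a system resource, which is the case on typical JDK installations; ClassHeaderReader and readHeader are hypothetical names.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ClassHeaderReader {
    /** Returns {magic, minorVersion, majorVersion} for the given class file resource. */
    static int[] readHeader(String resourcePath) {
        try (InputStream raw = ClassLoader.getSystemResourceAsStream(resourcePath)) {
            if (raw == null) {
                throw new IllegalArgumentException("resource not found: " + resourcePath);
            }
            DataInputStream in = new DataInputStream(raw);
            int magic = in.readInt();           // u4 magic
            int minor = in.readUnsignedShort(); // u2 minor_version
            int major = in.readUnsignedShort(); // u2 major_version
            return new int[] { magic, minor, major };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] header = readHeader("java/lang/String.class");
        // The major version printed depends on the JDK this runs on.
        System.out.printf("magic=0x%08X minor=%d major=%d%n", header[0], header[1], header[2]);
    }
}
```

For modern releases the Java version is the major version minus 44, which is how 61 maps to Java 17.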
5.2 The Constant Pool (The Registry)
The Constant Pool is a centralized registry of all names, strings, and numeric constants used in the class. Instead of repeating the string "java/lang/System" everywhere, the bytecode simply points to an index in the Constant Pool (e.g., #5).
5.3 Type Descriptors
The JVM uses a compact shorthand to represent types in bytecode:
- I: int
- Z: boolean
- V: void
- [I: int array
- Ljava/lang/String;: String object
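These codes combine into full method descriptors of the form (parameters)return-type. One way to see them without a disassembler is the standard java.lang.invoke.MethodType API, as in this short sketch; DescriptorDemo and its descriptor helper are hypothetical names.

```java
import java.lang.invoke.MethodType;

class DescriptorDemo {
    /** Builds the JVM method descriptor for the given return and parameter types. */
    static String descriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // void m(String)
        System.out.println(descriptor(void.class, String.class));              // prints (Ljava/lang/String;)V
        // int m(int[], boolean)
        System.out.println(descriptor(int.class, int[].class, boolean.class)); // prints ([IZ)I
        // Array class names use the same shorthand.
        System.out.println(int[].class.getName());                             // prints [I
    }
}
```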
6. Solving the Polymorphism Puzzle
Consider this common architectural scenario:
```java
class Parent {
    void show(Parent p) { System.out.println("P-P"); }
    void show(Child c)  { System.out.println("P-C"); }
}

class Child extends Parent {
    @Override
    void show(Parent p) { System.out.println("C-P"); }
    @Override
    void show(Child c)  { System.out.println("C-C"); }
}

public class Puzzle {
    public static void main(String[] args) {
        Parent obj = new Child();   // static type Parent, runtime type Child
        obj.show(new Child());      // What is the output?
    }
}
```
Analysis:
- Compilation (Static Dispatch): The compiler looks at the static type of obj (Parent). It sees two available signatures: show(Parent) and show(Child). Since the argument new Child() is a Child, it selects the most specific match: show(Child). The bytecode generated is invokevirtual Parent.show(LChild;)V.
- Runtime (Dynamic Dispatch): The JVM looks at the actual object (Child). It goes to the Child vtable and looks for the implementation of the signature chosen by the compiler: show(Child). It finds Child.show(Child) and executes it.
Result: C-C.
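A useful variation is to change only the declared type of the argument. The sketch below restates the puzzle with methods that return their label instead of printing it (an adaptation made so the results are easy to compare) and wraps the classes in a hypothetical DispatchPuzzle holder. The receiver's runtime type picks the class, while the argument's static type picks the overload:

```java
class DispatchPuzzle {
    static class Parent {
        String show(Parent p) { return "P-P"; }
        String show(Child c)  { return "P-C"; }
    }

    static class Child extends Parent {
        @Override String show(Parent p) { return "C-P"; }
        @Override String show(Child c)  { return "C-C"; }
    }

    public static void main(String[] args) {
        Parent obj = new Child();
        Parent arg = new Child(); // runtime type Child, but declared as Parent

        System.out.println(obj.show(new Child())); // prints "C-C"
        System.out.println(obj.show(arg));         // prints "C-P": overload chosen from the static type
        System.out.println(obj.show((Child) arg)); // prints "C-C": the cast changes the static type
    }
}
```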
Summary
| Component | Role | Cost / Nature |
|---|---|---|
| Compiler | Semantic analysis, Type checking, Overload resolution. | High |
| VTable | O(1) lookup for class-based overrides. | Low |
| ITable | O(K) lookup for interface implementation. | Medium |
| Bytecode | Standardized instruction set (e.g., invokevirtual). | Fixed |
| Class File | Persistent binary dossier of class metadata. | Binary |