Hash Table Principles and Practice (Collisions · Common Patterns)
easy-mediumHash TableDesignHash Functions
Core Principles of Hash Tables
Hash Functions
Maps a key to an array index: index = hashCode(key) % capacity
Requirements for a Good Hash Function:
- Determinism: The same input must consistently produce the same output.
- Uniform Distribution: Minimizes collisions by spreading keys across the address space.
- Efficiency: Calculation must be O(1).
Collision Resolution
Separate Chaining:
- Each bucket in the array stores a linked list (or a balanced tree) of entries.
- Used by Java's
HashMap. - Load Factor: $\alpha = n / \text{capacity}$ (Default is 0.75 in Java).
Open Addressing:
- Probes for empty slots within the array (Linear Probing, Quadratic Probing, or Double Hashing).
- Used by Python's
dict.
Java HashMap Key Features
- Initial Capacity: 16; Load Factor: 0.75.
- Resizing: Doubles capacity when the load factor threshold is hit; all entries are rehashed.
- Java 8+ Optimization: When a bucket's list length reaches $\ge 8$, it transforms into a Red-Black Tree to prevent Hash DoS (Denial of Service) attacks.
- Thread Safety:
HashMapis not thread-safe; useConcurrentHashMapfor concurrent environments.
Manual Hash Table Implementation (Chaining)
class MyHashMap {
private static final int SIZE = 1000;
private LinkedList<int[]>[] table;
@SuppressWarnings("unchecked")
public MyHashMap() {
table = new LinkedList[SIZE];
}
public void put(int key, int value) {
int idx = key % SIZE;
if (table[idx] == null) table[idx] = new LinkedList<>();
for (int[] pair : table[idx]) {
if (pair[0] == key) {
pair[1] = value;
return;
}
}
table[idx].add(new int[]{key, value});
}
public int get(int key) {
int idx = key % SIZE;
if (table[idx] == null) return -1;
for (int[] pair : table[idx]) {
if (pair[0] == key) return pair[1];
}
return -1;
}
public void remove(int key) {
int idx = key % SIZE;
if (table[idx] == null) return;
table[idx].removeIf(pair -> pair[0] == key);
}
}
Common Problem Patterns
Two Sum (LeetCode 1)
int[] twoSum(int[] nums, int target) {
Map<Integer, Integer> map = new HashMap<>();
for (int i = 0; i < nums.length; i++) {
int complement = target - nums[i];
if (map.containsKey(complement)) {
return new int[]{map.get(complement), i};
}
map.put(nums[i], i);
}
return new int[]{};
}
Valid Anagram (LeetCode 242)
boolean isAnagram(String s, String t) {
if (s.length() != t.length()) return false;
int[] count = new int[26];
for (char c : s.toCharArray()) count[c - 'a']++;
for (char c : t.toCharArray()) {
count[c - 'a']--;
if (count[c - 'a'] < 0) return false;
}
return true;
}
Group Anagrams (LeetCode 49)
List<List<String>> groupAnagrams(String[] strs) {
Map<String, List<String>> map = new HashMap<>();
for (String s : strs) {
char[] chars = s.toCharArray();
Arrays.sort(chars);
String key = new String(chars);
map.computeIfAbsent(key, k -> new ArrayList<>()).add(s);
}
return new ArrayList<>(map.values());
}
Top K Frequent Elements (LeetCode 347)
int[] topKFrequent(int[] nums, int k) {
Map<Integer, Integer> freqMap = new HashMap<>();
for (int n : nums) freqMap.merge(n, 1, Integer::sum);
// Bucket Sort: Group numbers by their frequency
List<Integer>[] buckets = new List[nums.length + 1];
freqMap.forEach((num, count) -> {
if (buckets[count] == null) buckets[count] = new ArrayList<>();
buckets[count].add(num);
});
int[] result = new int[k];
int idx = 0;
// Iterate buckets from highest frequency to lowest
for (int i = buckets.length - 1; i >= 0 && idx < k; i--) {
if (buckets[i] != null) {
for (int n : buckets[i]) {
result[idx++] = n;
if (idx == k) break;
}
}
}
return result;
}