正在切换页面...

Hash Table Principles and Practice (Collisions · Common Patterns)

easy-mediumHash TableDesignHash FunctionsUpdated

Core Principles of Hash Tables

Hash Functions

Maps a key to an array index: index = hashCode(key) % capacity

Requirements for a Good Hash Function:

Determinism: The same input must consistently produce the same output.
Uniform Distribution: Minimizes collisions by spreading keys across the address space.
Efficiency: Calculation must be O(1).

Collision Resolution

Separate Chaining:

Each bucket in the array stores a linked list (or a balanced tree) of entries.
Used by Java's HashMap.
Load Factor: $\alpha = n / \text{capacity}$ (Default is 0.75 in Java).

Open Addressing:

Probes for empty slots within the array (Linear Probing, Quadratic Probing, or Double Hashing).
Used by Python's dict.

Java HashMap Key Features

Initial Capacity: 16; Load Factor: 0.75.
Resizing: Doubles capacity when the load factor threshold is hit; all entries are rehashed.
Java 8+ Optimization: When a bucket's list length reaches $\ge 8$, it transforms into a Red-Black Tree to prevent Hash DoS (Denial of Service) attacks.
Thread Safety: HashMap is not thread-safe; use ConcurrentHashMap for concurrent environments.

Manual Hash Table Implementation (Chaining)

class MyHashMap {
    private static final int SIZE = 1000;
    private LinkedList<int[]>[] table;

    @SuppressWarnings("unchecked")
    public MyHashMap() {
        table = new LinkedList[SIZE];
    }

    public void put(int key, int value) {
        int idx = key % SIZE;
        if (table[idx] == null) table[idx] = new LinkedList<>();
        for (int[] pair : table[idx]) {
            if (pair[0] == key) { 
                pair[1] = value; 
                return; 
            }
        }
        table[idx].add(new int[]{key, value});
    }

    public int get(int key) {
        int idx = key % SIZE;
        if (table[idx] == null) return -1;
        for (int[] pair : table[idx]) {
            if (pair[0] == key) return pair[1];
        }
        return -1;
    }

    public void remove(int key) {
        int idx = key % SIZE;
        if (table[idx] == null) return;
        table[idx].removeIf(pair -> pair[0] == key);
    }
}

Common Problem Patterns

Two Sum (LeetCode 1)

int[] twoSum(int[] nums, int target) {
    Map<Integer, Integer> map = new HashMap<>();
    for (int i = 0; i < nums.length; i++) {
        int complement = target - nums[i];
        if (map.containsKey(complement)) {
            return new int[]{map.get(complement), i};
        }
        map.put(nums[i], i);
    }
    return new int[]{};
}

Valid Anagram (LeetCode 242)

boolean isAnagram(String s, String t) {
    if (s.length() != t.length()) return false;
    int[] count = new int[26];
    for (char c : s.toCharArray()) count[c - 'a']++;
    for (char c : t.toCharArray()) {
        count[c - 'a']--;
        if (count[c - 'a'] < 0) return false;
    }
    return true;
}

Group Anagrams (LeetCode 49)

List<List<String>> groupAnagrams(String[] strs) {
    Map<String, List<String>> map = new HashMap<>();
    for (String s : strs) {
        char[] chars = s.toCharArray();
        Arrays.sort(chars);
        String key = new String(chars);
        map.computeIfAbsent(key, k -> new ArrayList<>()).add(s);
    }
    return new ArrayList<>(map.values());
}

Top K Frequent Elements (LeetCode 347)

int[] topKFrequent(int[] nums, int k) {
    Map<Integer, Integer> freqMap = new HashMap<>();
    for (int n : nums) freqMap.merge(n, 1, Integer::sum);
    
    // Bucket Sort: Group numbers by their frequency
    List<Integer>[] buckets = new List[nums.length + 1];
    freqMap.forEach((num, count) -> {
        if (buckets[count] == null) buckets[count] = new ArrayList<>();
        buckets[count].add(num);
    });
    
    int[] result = new int[k];
    int idx = 0;
    // Iterate buckets from highest frequency to lowest
    for (int i = buckets.length - 1; i >= 0 && idx < k; i--) {
        if (buckets[i] != null) {
            for (int n : buckets[i]) { 
                result[idx++] = n; 
                if (idx == k) break; 
            }
        }
    }
    return result;
}