HashMap
is one of the most important and commonly used classes in Java’s Collections Framework. It provides constant-time complexity for basic operations like get()
and put()
— but this efficiency hinges on how well hashing is implemented.
In this tutorial, we'll walk through how hashing works under the hood in HashMap
, explore collision resolution, resizing strategies, and show diagrams to visualize the internals. Whether you're optimizing code or preparing for an interview, understanding hashing is a must.
What is Hashing?
Hashing is the process of converting an input (key) into a fixed-size numeric value called a hash code. Java uses this hash code to index and quickly locate key-value pairs.
Analogy
Imagine a storage cabinet where each drawer is labeled. A hash function takes a key and tells you exactly which drawer to look in.
Purpose of Hashing in HashMap
- Enables O(1) average time complexity for retrieval and insertion
- Organizes data in a predictable, scalable way
- Reduces search space compared to linear scans
Java HashMap Syntax Recap
Map<String, Integer> map = new HashMap<>();
map.put("apple", 5);
int quantity = map.get("apple");
Step-by-Step Breakdown of Hashing in HashMap
Step 1: Compute Hash Code
int hash = key.hashCode(); // e.g., "apple".hashCode() = 93029210
Step 2: Convert Hash Code to Index
int index = (n - 1) & hash; // Bitwise AND to fit in table size
- Table size
n
is always a power of 2 (e.g., 16, 32, 64) - Bitwise AND distributes hash evenly across buckets
Step 3: Insert into Bucket
If bucket is empty, insert. If not, resolve collision.
Internal Data Structure and Memory Layout
Node<K,V>[] table; // array of linked lists or trees
Each bucket can be:
- Empty
- A single node (key-value pair)
- A linked list of nodes (collision)
- A red-black tree of nodes (Java 8+)
How Collision Resolution Works
Linked List Collision Handling (pre-Java 8)
Multiple keys with same index form a linked list.
Bucket[3] → Node1 → Node2 → Node3
Tree-Based Buckets (Java 8+)
If number of nodes in a bucket exceeds 8 and capacity ≥ 64:
- Converts to Red-Black Tree
- Improves search time from O(n) to O(log n)
Bucket[7] → Red-Black Tree
Resizing and Threshold Mechanics
Resizing Trigger
threshold = capacity * loadFactor; // Default loadFactor = 0.75
When size exceeds threshold, capacity doubles and rehashing occurs.
Resizing is Expensive
- All nodes are rehashed and moved to new buckets
- Can slow down performance if not planned
HashMap Performance and Big-O Complexity
Operation | Average Case | Worst Case |
---|---|---|
get/put/remove | O(1) | O(n) |
iteration | O(n) | O(n) |
Worst Case
All keys hash to same bucket → linear scan of linked list
Java 8+ Enhancements: Treeification
- Introduced treeified buckets to reduce worst-case time
- Threshold = 8 elements in a single bucket
- Only enabled if table size ≥ 64
Functional Programming Support
map.forEach((key, value) -> System.out.println(key + ": " + value));
map.entrySet()
.stream()
.filter(e -> e.getKey().startsWith("a"))
.forEach(System.out::println);
Best Practices and Common Pitfalls
✅ Override equals()
and hashCode()
for custom keys
@Override
public boolean equals(Object obj) { ... }
@Override
public int hashCode() { ... }
❌ Never use mutable keys
Key key = new Key("K1");
map.put(key, "value");
key.setName("K2"); // Now retrieval fails
✅ Pre-size HashMap if size is predictable
Map<String, String> map = new HashMap<>(1000);
Refactoring Examples
Before:
if (!map.containsKey(key)) {
map.put(key, new ArrayList<>());
}
map.get(key).add(value);
After:
map.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
📌 What's New in Java [version]?
Java 8
- Treeification of buckets
forEach()
,compute()
,merge()
Java 9
Map.of()
for immutable maps
Java 10
var
keyword for inference
Java 21
- Memory and GC optimizations improve performance under pressure
Conclusion and Key Takeaways
- HashMap uses hashing + array indexing for fast access
- Collisions are handled via linked list or red-black trees
- Resizing impacts performance — pre-size when possible
- Always implement
equals()
andhashCode()
correctly - Java 8+ added performance improvements and new iteration APIs
FAQ – 10 Expert-Level Questions
1. Why is HashMap capacity always a power of 2?
To optimize index calculation using bitwise AND.
2. What happens if hashCode is same for two keys?
HashMap resolves with equals() and stores both in a chain/tree.
3. Can a HashMap store null keys?
Yes. One null key is allowed.
4. What is treeification threshold?
- If a bucket has ≥8 elements and table size ≥64, it becomes a red-black tree.
5. Is HashMap thread-safe?
No. Use ConcurrentHashMap
for concurrency.
6. Can I use mutable keys?
Avoid it. It can break retrieval after mutation.
7. Does resizing rehash all entries?
Yes. Every key is rehashed and moved to new index.
8. What’s the difference between load factor and threshold?
Load factor = how full a map can get before resizing. Threshold = capacity × load factor.
9. What’s the average time for get() in HashMap?
O(1) average, O(n) worst in poor hashing.
10. How do I avoid collisions?
Design hashCode()
to distribute keys uniformly.