How Hashing Works in HashMap – Step-by-Step Explanation with Diagrams and Code

Illustration for How Hashing Works in HashMap – Step-by-Step Explanation with Diagrams and Code
By Last updated:

HashMap is one of the most important and commonly used classes in Java’s Collections Framework. It provides constant-time complexity for basic operations like get() and put() — but this efficiency hinges on how well hashing is implemented.

In this tutorial, we'll walk through how hashing works under the hood in HashMap, explore collision resolution, resizing strategies, and show diagrams to visualize the internals. Whether you're optimizing code or preparing for an interview, understanding hashing is a must.


What is Hashing?

Hashing is the process of converting an input (key) into a fixed-size numeric value called a hash code. Java uses this hash code to index and quickly locate key-value pairs.

Analogy

Imagine a storage cabinet where each drawer is labeled. A hash function takes a key and tells you exactly which drawer to look in.

Purpose of Hashing in HashMap

  • Enables O(1) average time complexity for retrieval and insertion
  • Organizes data in a predictable, scalable way
  • Reduces search space compared to linear scans

Java HashMap Syntax Recap

Map<String, Integer> map = new HashMap<>();
map.put("apple", 5);
int quantity = map.get("apple");

Step-by-Step Breakdown of Hashing in HashMap

Step 1: Compute Hash Code

int hash = key.hashCode(); // e.g., "apple".hashCode() = 93029210

Step 2: Convert Hash Code to Index

int index = (n - 1) & hash; // Bitwise AND to fit in table size
  • Table size n is always a power of 2 (e.g., 16, 32, 64)
  • Bitwise AND distributes hash evenly across buckets

Step 3: Insert into Bucket

If bucket is empty, insert. If not, resolve collision.


Internal Data Structure and Memory Layout

Node<K,V>[] table; // array of linked lists or trees

Each bucket can be:

  • Empty
  • A single node (key-value pair)
  • A linked list of nodes (collision)
  • A red-black tree of nodes (Java 8+)

How Collision Resolution Works

Linked List Collision Handling (pre-Java 8)

Multiple keys with same index form a linked list.

Bucket[3] → Node1 → Node2 → Node3

Tree-Based Buckets (Java 8+)

If number of nodes in a bucket exceeds 8 and capacity ≥ 64:

  • Converts to Red-Black Tree
  • Improves search time from O(n) to O(log n)
Bucket[7] → Red-Black Tree

Resizing and Threshold Mechanics

Resizing Trigger

threshold = capacity * loadFactor; // Default loadFactor = 0.75

When size exceeds threshold, capacity doubles and rehashing occurs.

Resizing is Expensive

  • All nodes are rehashed and moved to new buckets
  • Can slow down performance if not planned

HashMap Performance and Big-O Complexity

Operation Average Case Worst Case
get/put/remove O(1) O(n)
iteration O(n) O(n)

Worst Case

All keys hash to same bucket → linear scan of linked list

Java 8+ Enhancements: Treeification

  • Introduced treeified buckets to reduce worst-case time
  • Threshold = 8 elements in a single bucket
  • Only enabled if table size ≥ 64

Functional Programming Support

map.forEach((key, value) -> System.out.println(key + ": " + value));

map.entrySet()
   .stream()
   .filter(e -> e.getKey().startsWith("a"))
   .forEach(System.out::println);

Best Practices and Common Pitfalls

✅ Override equals() and hashCode() for custom keys

@Override
public boolean equals(Object obj) { ... }

@Override
public int hashCode() { ... }

❌ Never use mutable keys

Key key = new Key("K1");
map.put(key, "value");
key.setName("K2"); // Now retrieval fails

✅ Pre-size HashMap if size is predictable

Map<String, String> map = new HashMap<>(1000);

Refactoring Examples

Before:

if (!map.containsKey(key)) {
    map.put(key, new ArrayList<>());
}
map.get(key).add(value);

After:

map.computeIfAbsent(key, k -> new ArrayList<>()).add(value);

📌 What's New in Java [version]?

Java 8

  • Treeification of buckets
  • forEach(), compute(), merge()

Java 9

  • Map.of() for immutable maps

Java 10

  • var keyword for inference

Java 21

  • Memory and GC optimizations improve performance under pressure

Conclusion and Key Takeaways

  • HashMap uses hashing + array indexing for fast access
  • Collisions are handled via linked list or red-black trees
  • Resizing impacts performance — pre-size when possible
  • Always implement equals() and hashCode() correctly
  • Java 8+ added performance improvements and new iteration APIs

FAQ – 10 Expert-Level Questions

1. Why is HashMap capacity always a power of 2?

To optimize index calculation using bitwise AND.

2. What happens if hashCode is same for two keys?

HashMap resolves with equals() and stores both in a chain/tree.

3. Can a HashMap store null keys?

Yes. One null key is allowed.

4. What is treeification threshold?

  1. If a bucket has ≥8 elements and table size ≥64, it becomes a red-black tree.

5. Is HashMap thread-safe?

No. Use ConcurrentHashMap for concurrency.

6. Can I use mutable keys?

Avoid it. It can break retrieval after mutation.

7. Does resizing rehash all entries?

Yes. Every key is rehashed and moved to new index.

8. What’s the difference between load factor and threshold?

Load factor = how full a map can get before resizing. Threshold = capacity × load factor.

9. What’s the average time for get() in HashMap?

O(1) average, O(n) worst in poor hashing.

10. How do I avoid collisions?

Design hashCode() to distribute keys uniformly.