The `Set` interface in Java represents a collection of unique elements; whether those elements are ordered depends on the implementation. It’s a vital part of the Java Collections Framework and is frequently used in real-world projects for duplicate elimination, fast lookups, and mathematical set operations.
🧠 What is the Set Interface?
- A `Set` is a collection that contains no duplicate elements
- Part of `java.util`
- Extends the `Collection` interface
- Common implementations: `HashSet`, `LinkedHashSet`, `TreeSet`
Set<String> fruits = new HashSet<>();
fruits.add("apple");
fruits.add("banana");
fruits.add("apple"); // Duplicate ignored
System.out.println(fruits); // e.g. [banana, apple]; HashSet iteration order is not guaranteed
🛠️ Core Implementations
HashSet
- Backed by a `HashMap`
- No guaranteed order
- Usually the fastest of the three: O(1) average add, remove, and lookup
LinkedHashSet
- Maintains insertion order
- Slightly slower than `HashSet`, because it also maintains a linked list of its entries
TreeSet
- Sorted order (natural or custom Comparator)
- Backed by a Red-Black Tree (O(log n) operations)
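To make the ordering differences concrete, here is a minimal sketch that feeds the same input into all three implementations. The class name `SetOrderDemo` and the sample data are hypothetical and exist only for illustration.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SetOrderDemo {
    // Hypothetical sample data; note the duplicate "apple".
    static final List<String> INPUT = List.of("pear", "apple", "fig", "apple");

    // Keeps the order in which elements were first added.
    static Set<String> insertionOrdered() {
        return new LinkedHashSet<>(INPUT);
    }

    // Keeps elements in natural (alphabetical) order.
    static Set<String> sorted() {
        return new TreeSet<>(INPUT);
    }

    public static void main(String[] args) {
        System.out.println(new HashSet<>(INPUT)); // three elements, order not guaranteed
        System.out.println(insertionOrdered());   // [pear, apple, fig]
        System.out.println(sorted());             // [apple, fig, pear]
    }
}
```

All three drop the second "apple"; only the iteration order differs.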
⚙️ Internal Working
HashSet
- Internally uses `HashMap<E, Object>`
- Each element is stored as a key with a dummy value
private transient HashMap<E,Object> map;
private static final Object PRESENT = new Object();
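The two fields above are enough to sketch how a map-backed set can work. The class below, `TinyHashSet`, is a hypothetical and heavily simplified illustration, not the actual `java.util.HashSet` source: it only shows how the return value of `HashMap.put` can reveal whether an element was already present.

```java
import java.util.HashMap;

// Simplified illustration only; the real java.util.HashSet does much more.
class TinyHashSet<E> {
    private final HashMap<E, Object> map = new HashMap<>();
    private static final Object PRESENT = new Object();

    // put() returns the previous value for the key, so null means the key was new.
    boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

    boolean contains(E e) {
        return map.containsKey(e);
    }

    // remove() returns the dummy value only if the key was actually stored.
    boolean remove(E e) {
        return map.remove(e) == PRESENT;
    }

    int size() {
        return map.size();
    }

    public static void main(String[] args) {
        TinyHashSet<String> set = new TinyHashSet<>();
        System.out.println(set.add("apple")); // true
        System.out.println(set.add("apple")); // false, already present
        System.out.println(set.size());       // 1
    }
}
```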
TreeSet
- Uses `TreeMap` for ordering
- Insertion, deletion, lookup: O(log n)
⏱️ Performance Comparison
Operation | HashSet | LinkedHashSet | TreeSet |
---|---|---|---|
Add/Remove | O(1) average | O(1) average | O(log n) |
Lookup | O(1) average | O(1) average | O(log n) |
Order | ❌ | ✅ Insertion | ✅ Sorted |
Thread-safe | ❌ | ❌ | ❌ |
💡 When to Use Which?
Scenario | Best Choice |
---|---|
Fastest access and no order | HashSet |
Preserve insertion order | LinkedHashSet |
Sorted elements required | TreeSet |
Thread-safe set | ConcurrentSkipListSet or Collections.synchronizedSet() |
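As a rough sketch of the thread-safe row, the example below fills a set from two threads using both options from the table. The class and method names (`ThreadSafeSetDemo`, `fillFromTwoThreads`) are hypothetical.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentSkipListSet;

class ThreadSafeSetDemo {
    // Adds the numbers 0..999 from two threads at once and returns the final size.
    static int fillFromTwoThreads(Set<Integer> target) {
        Runnable writer = () -> {
            for (int i = 0; i < 1000; i++) {
                target.add(i);
            }
        };
        Thread a = new Thread(writer);
        Thread b = new Thread(writer);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return target.size();
    }

    public static void main(String[] args) {
        Set<Integer> wrapped = Collections.synchronizedSet(new HashSet<>());
        Set<Integer> skipList = new ConcurrentSkipListSet<>();
        System.out.println(fillFromTwoThreads(wrapped));  // 1000
        System.out.println(fillFromTwoThreads(skipList)); // 1000
    }
}
```

One design difference worth knowing: a set wrapped with `Collections.synchronizedSet()` still needs a manual `synchronized` block around iteration, while `ConcurrentSkipListSet` keeps its elements sorted and can be iterated without external locking.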
🧪 Real-World Use Cases
- Removing duplicates from data (e.g., API input)
- Fast lookups (e.g., checking existing usernames)
- Maintaining sorted, unique datasets (e.g., tags, categories)
- Representing mathematical sets (union, intersection)
Set<String> unique = new HashSet<>(list);
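The mathematical set operations mentioned above map onto the bulk methods `addAll`, `retainAll`, and `removeAll`. Here is a minimal sketch; the helper names are hypothetical, and `TreeSet` is used for the results only so that the printed order is predictable.

```java
import java.util.Set;
import java.util.TreeSet;

class SetAlgebraDemo {
    // Each helper copies the first set so the inputs are left untouched.
    static Set<Integer> union(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.addAll(b);
        return result;
    }

    static Set<Integer> intersection(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.retainAll(b);
        return result;
    }

    static Set<Integer> difference(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> a = Set.of(1, 2, 3);
        Set<Integer> b = Set.of(3, 4);
        System.out.println(union(a, b));        // [1, 2, 3, 4]
        System.out.println(intersection(a, b)); // [3]
        System.out.println(difference(a, b));   // [1, 2]
    }
}
```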
🧬 Functional Java and Streams
List<String> items = Arrays.asList("apple", "banana", "apple");
Set<String> unique = items.stream()
.collect(Collectors.toSet());
To preserve order:
Set<String> ordered = items.stream()
.collect(Collectors.toCollection(LinkedHashSet::new));
❌ Common Pitfalls
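- ❌ Assuming order in `HashSet`
- ❌ Modifying a set during iteration without going through the iterator (typically throws `ConcurrentModificationException`)
- ❌ Mutating elements of a `TreeSet` after insertion in a way that changes how the comparator orders them (breaks the contract)
✅ Use `Iterator` for safe removal: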
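Iterator<String> it = set.iterator();
while (it.hasNext()) {
    String s = it.next();          // next() must be called before remove()
    if (s.isEmpty()) it.remove();  // example condition; substitute your own
}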
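The mutable-element pitfall is not limited to `TreeSet`. The sketch below shows the same kind of problem in a `HashSet`, using an `ArrayList` as the element because its `hashCode()` depends on its contents. The class name `MutableElementPitfall` is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class MutableElementPitfall {
    // Returns whether the set still finds the element after it was mutated.
    static boolean foundAfterMutation() {
        Set<List<String>> set = new HashSet<>();
        List<String> element = new ArrayList<>(List.of("a"));
        set.add(element);
        element.add("b"); // changes element.hashCode() while it sits in the set
        return set.contains(element);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterMutation()); // false
    }
}
```

The element is still physically in the set, but the lookup uses the new hash code and searches the wrong bucket. Prefer immutable elements, or remove an element before changing it and add it back afterwards.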
📌 What's New in Java Versions?
Java 8
- Streams, `Collectors.toSet()`, lambdas, and the default methods `removeIf` and `forEach`
Java 9
- Immutable (unmodifiable) `Set.of(...)` factory methods
Java 10
- `var` support for local variables, e.g. `var tags = new HashSet<String>();`
Java 11
- Small performance enhancements
Java 17 & 21
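- Immutable and concurrent collection refinements; Java 21 also adds the `SequencedSet` interface, which `LinkedHashSet` implements and `SortedSet` extends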
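Two behaviours of the Java 9 `Set.of(...)` factory are easy to miss: the returned set rejects modification, and the factory itself rejects duplicate arguments. A small sketch follows; the class and method names are hypothetical.

```java
import java.util.Set;

class ImmutableSetDemo {
    // True if adding to a Set.of(...) result is refused.
    static boolean rejectsAdd() {
        Set<String> fixed = Set.of("A", "B");
        try {
            fixed.add("C");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    // True if Set.of(...) refuses duplicate arguments.
    static boolean rejectsDuplicates() {
        try {
            Set.of("A", "A");
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectsAdd());        // true
        System.out.println(rejectsDuplicates()); // true
    }
}
```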
🔄 Refactoring Examples
Before:
List<String> names = Arrays.asList("John", "Jane", "John");
After (deduplicated):
Set<String> uniqueNames = new HashSet<>(names);
🧠 Real-World Analogy
A `Set` is like a guest list for a party: no duplicate names allowed. If someone tries to sneak in twice, the list politely ignores the second entry.
❓ FAQ – Expert-Level Questions
- Can a Set contain null values?
  - Yes. `HashSet` and `LinkedHashSet` allow one null. `TreeSet` throws `NullPointerException` unless its Comparator handles null.
- Are Sets ordered?
  - `HashSet`: no order; `LinkedHashSet`: insertion order; `TreeSet`: sorted
- How to create an immutable Set?
  - `Set<String> fixed = Set.of("A", "B");`
- Can Sets be serialized?
  - Yes, all standard implementations are Serializable; the elements must be serializable too.
- How to remove elements based on a condition?
  - `set.removeIf(s -> s.length() < 3);`
- How does HashSet avoid duplicates?
  - It uses the `hashCode()` and `equals()` methods: `hashCode()` selects the bucket, and `equals()` confirms whether an equal element is already stored there.
- What happens if two different objects have the same hashCode?
  - A collision occurs. `HashMap` resolves it by chaining: both entries share a bucket and are told apart with `equals()`. Since Java 8, heavily populated buckets are converted to balanced trees.
- Thread-safe alternatives?
  - `Collections.synchronizedSet()`, `ConcurrentSkipListSet`
- Why does TreeSet need Comparable or Comparator?
  - To maintain sorted order during insertion. The same comparison also decides whether two elements count as duplicates.
- How to iterate in insertion or sorted order?
  - Use `LinkedHashSet` (insertion order) or `TreeSet` (sorted order)
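Two of the `TreeSet` answers above can be sketched in code: the comparator decides what counts as a duplicate, and a null-aware comparator lets a `TreeSet` hold null. The class and method names below are hypothetical.

```java
import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;

class TreeSetComparatorDemo {
    // With a case-insensitive comparator, "Java" and "java" compare as equal.
    static Set<String> caseInsensitive() {
        Set<String> tags = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        tags.add("Java");
        tags.add("java");   // treated as a duplicate of "Java", so not added
        tags.add("Kotlin");
        return tags;
    }

    // nullsFirst wraps the natural order so that null sorts before everything else.
    static Set<String> withNull() {
        Set<String> values =
                new TreeSet<>(Comparator.nullsFirst(Comparator.<String>naturalOrder()));
        values.add("b");
        values.add(null);
        values.add("a");
        return values;
    }

    public static void main(String[] args) {
        System.out.println(caseInsensitive()); // [Java, Kotlin]
        System.out.println(withNull());        // [null, a, b]
    }
}
```

`TreeSet` never calls `equals()` to detect duplicates here, so a comparator that is inconsistent with `equals()` changes which elements the set considers unique.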
🏁 Conclusion and Key Takeaways
- `Set` is the go-to for uniqueness constraints
- Choose the right implementation based on ordering, performance, and use case
- Use Streams and `Collectors.toSet()` for modern, expressive logic
- Be cautious about mutable elements and `hashCode()`/`equals()` contracts
🧪 Pro Tip: Always override `hashCode()` and `equals()` properly when using custom types in a Set.
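To see why the tip matters, the sketch below compares two hypothetical value types in a `HashSet`: `RawUser` inherits identity-based `equals()`/`hashCode()` from `Object`, while `User` overrides both based on its `name` field.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class EqualsHashCodeDemo {
    // Hypothetical type WITHOUT overrides: two instances are never equal.
    static final class RawUser {
        final String name;
        RawUser(String name) { this.name = name; }
    }

    // Hypothetical type WITH overrides: equality is based on the name.
    static final class User {
        final String name;
        User(String name) { this.name = name; }

        @Override
        public boolean equals(Object o) {
            return o instanceof User && ((User) o).name.equals(name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name);
        }
    }

    static int rawCount() {
        Set<RawUser> set = new HashSet<>();
        set.add(new RawUser("ann"));
        set.add(new RawUser("ann"));
        return set.size(); // 2: the set sees two distinct objects
    }

    static int userCount() {
        Set<User> set = new HashSet<>();
        set.add(new User("ann"));
        set.add(new User("ann"));
        return set.size(); // 1: the second add is recognised as a duplicate
    }

    public static void main(String[] args) {
        System.out.println(rawCount());  // 2
        System.out.println(userCount()); // 1
    }
}
```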