HashSet in java

A set in java represents an unordered collection which can contain only unique elements or the elements which are not duplicate.

A java set is represented by Set interface. HashSet class implements this interface and hence all its methods.

A HashSet does not maintain the order of insertion,  elements are not retrieved in the same order in which they were inserted.

A HashSet does not support inserting elements using indexes since a set is not ordered. Also, HashSet does not provide any methods to retrieve values.

Java HashSet hierarchy
Below illustration depicts the position of HashSet class in java collection framework
java hashset hierarchy in collection frameworkJava HashSet creation
A HashSet can be created using any of its constructors and new operator.

Following are the different constructors in HashSet.

1. public HashSet()
Creates an empty HashSet with a default capacity of 16.
Example,

// empty hashset implementation 
Set<String> setOne = new HashSet<>();

2. public HashSet(Collection)
Creates a HashSet and populates it with all the elements of the supplied collection. This collection may be a list or another set or any object that implements java.util.Collection interface.
Example,

// hashset with collection
List<String> list = List.of("A", "B"): 
Set<String> setFour = new HashSet<>(list);

3. public HashSet(int capacity, float load)
Creates a HashSet with the supplied capacity and load factor.

  • Capacity
    A HashSet stores elements in buckets. Capacity defines the number of buckets in a HashSet.
  • Load factor
    It is a measure of how full the HashSet should become when its size should be increased.
    When the number of buckets exceed the product of load factor and capacity, then its capacity is doubled(also called rehashing).
    Default load factor for HashSet is 0.75. This means that when HashSet becomes 75% full, it is rehashed.
// hashset with load factor and capacity 
Set<Double> setThree = new HashSet<>(25, 0.6);

4. public HashSet(int capacity)
Creates a HashSet with the supplied capacity. Example,

// hashset with capacity of 25
Set<Integer> setTwo = new HashSet<>(25);

Java HashSet features

  1. Java HashSet is an unordered collection. Elements do not maintain insertion order.
  2. Elements of java HashSet can not be inserted using index.
  3. A java HashSet does not provide any method to get or retrieve its elements.
  4. Java HashSet does not allow duplicate elements.
  5. Java HashSet allows null elements.
  6. A java HashSet instance can be generic. This means that it can hold elements of a particular type.
    This type may be an inbuilt java type such as String, Integer etc., or user defined classes such as Person, Shape etc.
  7. HashSet is not synchronized or not thread safe. This means that if multiple threads are operating on a HashSet  concurrently, then it should be externally synchronized.
  8. HashSet internally stores values as keys in a HashMap.
HashSet internal working
A HashSet is built on a HashMap. That is, it uses a HashMap to store values internally.
So, when a HashSet is created, following is what happens internally

map = new HashMap<>();

Elements of HashSet are added as keys to this HashMap.

When an element is added to a HashSet, it calculates its hash code. To determine hash code, hashCode() method is invoked.
hashCode() method is defined in java.lang.Object class.

If the element overrides hashCode() in its class, then it is invoked, else hashCode() of Object will be called.
Hash code defines the position or bucket where the element is placed.

How HashSet checks for duplicates
It might happen that two different elements have the same hash code, which means, that they should be placed in the same bucket.
This is called hash collision.

In case of a hash collision, the element is compared with the existing element using their equals() method. Again, equals() method is defined in java.lang.Object class.
But, if the class of element overrides equals(), then it will be invoked.

If both the elements are same as per equals(), existing element is replaced with the newer one.
If the elements are not same, then the new element is also placed at the same bucket location, with existing element pointing to the new element.
That is, existing element now contains a reference to the new element.

Now, if a third element with same hash code is added, then the same process is repeated. It replaces an existing element which is same as the new element as per equals() or is added to the same bucket, if it is unique.

Same procedure is performed while removing an element from HashSet.
Important HashSet methods
Following are commonly used methods of java.util.HashSet.

1. add(E)
Adds the element supplied as argument to the set if it is not already present.
That is, there is no element in the set which is same as the argument element according to equals() method.
add() returns true, if the element was added to the set and false otherwise.

This method throws

  • ClassCastException, if the type of added element is incompatible with the type of set.
  • IllegalArgumentException if some property of the supplied argument does not allow adding it to the set.

2. addAll(Collection)
Adds all the elements of the supplied collection argument to the end of this set.
Throws

  • ClassCastException, if the type of added element is incompatible with the type of set.
  • IllegalArgumentException if some property of the supplied argument does not allow adding it to the set.

3. contains(Object)
Returns true if any element of HashSet matches the supplied argument element. Elements are compared and should match as per equals() method.
Throws

  • ClassCastException, if the type of added element is incompatible with the type of set.

4. clear()
Removes all elements from HashSet.

5. isEmpty()
Returns true if HashSet does not contain any elements.

6. iterator()
Returns an iterator to loop over the elements of the set.

7. remove(E)
Removes the first element from the set which matches the supplied argument element. Elements are compared and should match as per equals() method.
Throws

  • ClassCastException, if the type of element removed is incompatible with the type of set.

8. removeAll(c)
It removes all the elements from HashSet that are present in the collection supplied as argument. Argument collection may be a list, set or a type that implements Collections interface.

Returns true, if any element is removed from the set.
Throws

  • ClassCastException, if the type of element removed is incompatible with the type of set.

9. retainAll(c)
Retails all the elements in HashSet that are present in the collection supplied as argument and removes all others.
This method can be considered as the opposite of removeAll().

Returns true, if any element is removed from the set.
Throws,

  • ClassCastException, if the type of added element is incompatible with the type of set.
  • IllegalArgumentException, if some property of the supplied argument does not allow adding it to the set.

10. toArray()
Converts HashSet to a java array. Returns an Object array containing all the elements of the set.
Java HashSet example
Below is a code snippet that uses HashSet class and shows the usage of its methods described above.

 // create a hashset
Set<String> names = new HashSet<>();
System.out.println("HashSet empty? " + names.isEmpty());
// add elements
names.add("Abc");
names.add("Def");
names.add("Ghi");
System.out.println("HashSet size: " + names.size());
// remove element
names.remove("Abc");
System.out.println("HashSet size: " + names.size());
// remove non-existent element
names.remove("Xyz");
System.out.println("HashSet size: " + names.size());
// check if set has value
boolean contains = names.contains("Ghi");
System.out.println("HashSet contains name: " + contains);
contains = names.contains("Xyz");
System.out.println("HashSet contains name: " + contains);

Below is the output

HashSet empty? true
HashSet size: 3
HashSet size: 2
HashSet size: 2
HashSet contains name: true
HashSet contains name: false

Iterating  java HashSet
A HashSet can be iterated

1. Using an iterator
HashSet has an iterator() method which returns a java iterator.

This iterator can be used to loop through the HashSet using its next() and hasNext() methods. Example,

Set<String> names = new HashSet<>();
// add elements
names.add("Abc");
names.add("Def");
names.add("Ghi");
Iterator<String> iterator = names.iterator();
while(iterator.hasNext()) {
  System.out.println("Next set element: " + iterator.next());
}

Output is

Next set element: Def
Next set element: Ghi
Next set element: Abc

Note that the order in which the elements are retrieved is different from the one in which the elements were added.
This shows that HashSet is unordered.
2. Using java 8 forEach()
Java 8 introduced a forEach() method, which can be used to iterate through the elements of a HashSet. Example,

Set<String> names = new HashSet<>(); 
// add elements 
names.add("Abc"); 
names.add("Def"); 
names.add("Ghi"); 
names.forEach( 
      e -> System.out.println("Next set element: " + e
);

Union of HashSets
Union means combining two sets.
HashSet
has a method addAll() which accepts another collection as argument and adds all the elements of this collection to it.

So,  if we use addAll() and pass another HashSet as argument, then this set will contain the elements of both the sets. Example,

Set<String> setOne = new HashSet<>();
setOne.add("A");
setOne.add("C");
setOne.add("E");
Set<String> setTwo = new HashSet<>();
setTwo.add("B");
setTwo.add("D");
setTwo.add("F");
setTwo.addAll(setOne);
System.out.println(setTwo.toString());

Output set is

[A, B, C, D, E, F]

If there are any duplicate elements in both the sets, then they will be overwritten so that the set contains only unique elements.
Intersection of HashSets
Intersection of two HashSet objects means determining the common elements between the two sets.
HashSet has a retainAll() method which removes all the elements from the HashSet on which it is called, that are not present in the argument collection.

So, if we supply a HashSet as argument to retailAll(), then the HashSet calling this method will contain only the elements present in both the sets, or the intersection of sets. Example,

Set<String> setOne = new HashSet<>();
setOne.add("A");
setOne.add("C");
setOne.add("E");
setOne.add("F");
Set<String> setTwo = new HashSet<>();
setTwo.add("B");
setTwo.add("C");
setTwo.add("D");
setTwo.add("F");
setTwo.retainAll(setOne);
System.out.println(setTwo);

Output is

[C, F]

These are the common elements of both the sets.

Hope the article was useful.