Master Java Stream Collectors.groupingBy() With Examples

Want to make your Java code easier and faster?

Learn how to use Collectors.groupingBy() instead of writing long loops!

This guide explains this method and shows ample examples to help you group and analyze data better using Java Streams.

Let’s make your code cleaner and more powerful!

Stream API Fundamentals

To appreciate what the groupingBy() method offers, you first need a basic understanding of the Stream API.

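A Stream is a sequence of elements that supports sequential and parallel aggregate operations written in a functional style.

You can think of it as a lazy, composable pipeline over a data source rather than a data structure: it stores no elements, and nothing runs until a terminal operation such as collect() is called.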

When you create a Stream, you specify the source of the data, and then you can apply various operations to it, such as filtering, mapping, reducing, and, of course, grouping.

Here’s a simple example to get you started:

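// Requires: import java.util.*; import java.util.stream.Collectors;
List<String> fruits = Arrays.asList("apple", "banana", "orange", "apple", "mango");
List<String> result = fruits.stream()
        .filter(fruit -> fruit.startsWith("a")) // keep only fruits starting with "a"
        .collect(Collectors.toList());
System.out.println(result); // [apple, apple]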

In this example, we create a Stream from a list of fruits, keep only the fruits that start with “a”, and then collect the results into a new list.

Collectors.groupingBy()

The Collectors.groupingBy() method is one of the most powerful and versatile operations in Java’s Stream API, allowing developers to easily group elements based on specific criteria.

Introduced in Java 8, this collector transforms a stream of elements into a Map where elements are organized into groups according to a classification function.

Whether you’re analyzing data, generating reports, or organizing complex objects, groupingBy() provides an elegant and efficient solution for grouping operations.

Basic groupingBy() Usage

Suppose you have a list of Person objects, and you want to group them by their age.

You can use the groupingBy() collector like this:

List<Person> people = Arrays.asList(
        new Person("John", 25),
        new Person("Alice", 30),
        new Person("Bob", 25),
        new Person("Eve", 30));
// Key: age, value: everyone with that age
Map<Integer, List<Person>> peopleByAge = people.stream()
        .collect(Collectors.groupingBy(Person::getAge));
System.out.println(peopleByAge);

In this example, we’re using the groupingBy() collector to group the Person objects by their age.

The resulting Map will have the age as the key and a list of Person objects as the value.

Assuming Person overrides toString() accordingly, running this code prints the following:

{25=[Person{name='John', age=25}, Person{name='Bob', age=25}], 
30=[Person{name='Alice', age=30}, Person{name='Eve', age=30}]}

As you can see, the groupingBy() collector has grouped the Person objects by their age, and the resulting Map structure is easy to work with.

You can access the list of people for a specific age by using the age as the key.
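As a rough illustration of that lookup, the sketch below reads groups back out of the map. The GroupLookupExample class, its helper methods, and the small Person class are made up for this example. It uses getOrDefault() because an age that never appeared in the data has no entry at all.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupLookupExample {

    // Minimal stand-in for the article's Person class (assumed shape).
    static class Person {
        private final String name;
        private final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }
        int getAge() { return age; }
    }

    static Map<Integer, List<Person>> groupByAge(List<Person> people) {
        return people.stream().collect(Collectors.groupingBy(Person::getAge));
    }

    // Names of everyone with the given age; an empty list if the age has no group.
    static List<String> namesForAge(Map<Integer, List<Person>> byAge, int age) {
        return byAge.getOrDefault(age, Collections.emptyList())
                .stream()
                .map(Person::getName)
                .collect(Collectors.toList());
    }

    static List<Person> samplePeople() {
        return Arrays.asList(
                new Person("John", 25), new Person("Alice", 30),
                new Person("Bob", 25), new Person("Eve", 30));
    }

    public static void main(String[] args) {
        Map<Integer, List<Person>> byAge = groupByAge(samplePeople());

        System.out.println(namesForAge(byAge, 25)); // [John, Bob]
        System.out.println(namesForAge(byAge, 40)); // []
        System.out.println(byAge.containsKey(40));  // false
    }
}
```

Calling byAge.get(40) directly would return null here, which is why the helper falls back to an empty list.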

Customizing Grouping Criteria

For more complex data sets, you may need to group your data based on custom criteria that go beyond simple properties like age or department.

This is where the power of groupingBy() really shines.

In the previous example, we saw how to group data using a simple property accessor like Person::getAge.

Here, the double colon (::) is the method reference operator: Person::getAge is shorthand for the lambda person -> person.getAge().

However, what if you need to group people based on a more complex condition, like their age range?

You can achieve this by providing a custom function that extracts the grouping key from each element.

Let’s say you want to group people into three categories: “young” (under 25), “adult” (25–49), and “senior” (50 and over).

You can create a custom function that calculates the age range and use it as the grouping key:

Map<String, List<Person>> peopleByAge = people.stream()
        .collect(Collectors.groupingBy(person -> {
            // The returned string becomes the map key
            if (person.getAge() < 25) {
                return "young";
            } else if (person.getAge() < 50) {
                return "adult";
            } else {
                return "senior";
            }
        }));

In this example, the lambda expression inside groupingBy() calculates the age range for each person and returns a string key.

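The resulting map will have one entry for each age range that actually occurs in the data, with the corresponding list of people. With the sample list above, where everyone is 25 or 30, that means a single “adult” entry.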

You can also use method references to make the code more concise and readable.

For instance, you can extract the age range calculation into a separate method:

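// Assumes this helper lives in a class named AgeGroups (hypothetical name)
public static String getAgeRange(Person person) {
    if (person.getAge() < 25) {
        return "young";
    } else if (person.getAge() < 50) {
        return "adult";
    } else {
        return "senior";
    }
}

// A static method is referenced through its class, not through "this"
Map<String, List<Person>> peopleByAgeRange = people.stream()
        .collect(Collectors.groupingBy(AgeGroups::getAgeRange));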
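To see the method-reference version run end to end, here is a minimal, self-contained sketch. The AgeRangeExample class and the small Person class are stand-ins invented for this illustration, since the article does not show the real ones. It also shows that only age ranges present in the data become keys.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class AgeRangeExample {

    // Minimal stand-in for the article's Person class (assumed shape).
    static class Person {
        private final String name;
        private final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }
        int getAge() { return age; }
    }

    // Classifier: under 25, 25 to 49, 50 and over.
    static String getAgeRange(Person person) {
        if (person.getAge() < 25) {
            return "young";
        } else if (person.getAge() < 50) {
            return "adult";
        } else {
            return "senior";
        }
    }

    static Map<String, List<Person>> groupByAgeRange(List<Person> people) {
        // Static method reference: EnclosingClass::methodName
        return people.stream()
                .collect(Collectors.groupingBy(AgeRangeExample::getAgeRange));
    }

    static List<Person> samplePeople() {
        return Arrays.asList(
                new Person("John", 25), new Person("Alice", 30),
                new Person("Bob", 25), new Person("Eve", 30));
    }

    public static void main(String[] args) {
        Map<String, List<Person>> byRange = groupByAgeRange(samplePeople());
        System.out.println(byRange.keySet());            // [adult]
        System.out.println(byRange.get("adult").size()); // 4
        System.out.println(byRange.containsKey("young")); // false
    }
}
```

Because groupingBy() only creates a group when it sees an element for it, code that reads the result should not assume that every category key exists.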

Multi-Level Grouping

Any time you work with hierarchical data, you’ll encounter situations where a single level of grouping just isn’t enough.

That’s where multi-level grouping comes in — and with Java Stream’s groupingBy() collector, you can tackle even the most complex data structures with ease.

Let’s say you have a list of Employee objects, each with a department and a team within that department.

You want to group these employees by department, and then further group them by team within each department.

This is a perfect scenario for multi-level grouping.

Here’s an example to get you started:

List<Employee> employees = Arrays.asList(
        new Employee("John", "Sales", "North"),
        new Employee("Alice", "Sales", "South"),
        new Employee("Bob", "Marketing", "East"),
        new Employee("Charlie", "Marketing", "West"));
// Outer key: department, inner key: team
Map<String, Map<String, List<Employee>>> result = employees.stream()
        .collect(Collectors.groupingBy(
                Employee::getDepartment,
                Collectors.groupingBy(Employee::getTeam)));

In this example, we’re using groupingBy() twice — once to group employees by department, and again to group them by team within each department.

The resulting map has a structure like this:

{ "Sales": { "North": [Employee("John", "Sales", "North")], 
             "South": [Employee("Alice", "Sales", "South")] 
           }, 
  "Marketing": { "East": [Employee("Bob", "Marketing", "East")], 
                  "West": [Employee("Charlie", "Marketing", "West")] 
                }
}

As you can see, multi-level grouping allows you to create a hierarchical representation of your data, making it easier to analyze and work with.
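Working with the nested result usually means two lookups in a row. The sketch below is one hedged way to do that safely; the NestedLookupExample class, its namesIn() helper, and the minimal Employee class are assumptions made for this illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class NestedLookupExample {

    // Minimal stand-in for the article's Employee class (assumed shape).
    static class Employee {
        private final String name;
        private final String department;
        private final String team;

        Employee(String name, String department, String team) {
            this.name = name;
            this.department = department;
            this.team = team;
        }

        String getName() { return name; }
        String getDepartment() { return department; }
        String getTeam() { return team; }
    }

    static Map<String, Map<String, List<Employee>>> groupByDepartmentAndTeam(List<Employee> employees) {
        return employees.stream()
                .collect(Collectors.groupingBy(
                        Employee::getDepartment,
                        Collectors.groupingBy(Employee::getTeam)));
    }

    // Names in one department/team; empty if either level is missing.
    static List<String> namesIn(Map<String, Map<String, List<Employee>>> grouped,
                                String department, String team) {
        return grouped.getOrDefault(department, Collections.emptyMap())
                .getOrDefault(team, Collections.emptyList())
                .stream()
                .map(Employee::getName)
                .collect(Collectors.toList());
    }

    static List<Employee> sampleEmployees() {
        return Arrays.asList(
                new Employee("John", "Sales", "North"),
                new Employee("Alice", "Sales", "South"),
                new Employee("Bob", "Marketing", "East"),
                new Employee("Charlie", "Marketing", "West"));
    }

    public static void main(String[] args) {
        Map<String, Map<String, List<Employee>>> grouped =
                groupByDepartmentAndTeam(sampleEmployees());

        System.out.println(namesIn(grouped, "Sales", "North")); // [John]
        System.out.println(namesIn(grouped, "Sales", "East"));  // []
        System.out.println(namesIn(grouped, "Legal", "North")); // []
    }
}
```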

When working with multi-level grouping, it’s crucial to maintain code readability.

One tip is to use separate variables for each level of grouping, making it clear what’s happening at each step.

For example:

// collector for team grouping
Collector<Employee, ?, Map<String, List<Employee>>> teamCollector =
        Collectors.groupingBy(Employee::getTeam);

// collector for department and team grouping
Collector<Employee, ?, Map<String, Map<String, List<Employee>>>> departmentCollector =
        Collectors.groupingBy(Employee::getDepartment, teamCollector);

Map<String, Map<String, List<Employee>>> result = employees.stream()
        .collect(departmentCollector);

By breaking down the grouping process into smaller, more manageable pieces, you can make your code more readable and easier to understand.

Downstream Collectors

Downstream collectors are a powerful feature of the Stream API that allow you to perform additional operations on the results of your grouping.

Think of them as a way to further process and transform your grouped data.

In this section, we’ll learn about some of the most commonly used downstream collectors and explore how you can use them to transform your code.

Let’s start with one of the simplest yet most useful downstream collectors: counting().

This collector returns the count of elements in each group, which can be incredibly useful for data analysis and aggregation.

Here’s an example:

Map<String, Long> countByDepartment = employees.stream()
        .collect(Collectors.groupingBy(
                Employee::getDepartment,
                Collectors.counting()));

In this example, we’re grouping a list of Employee objects by their department and then counting the number of employees in each department.

The resulting map will have the department as the key and the count of employees as the value.

Another powerful downstream collector is mapping().

This collector lets you transform each element of a group before it is passed on to another collector.

For instance, let’s say you want to calculate the total salary for each department:

// Assumes Employee also has a getSalary() method that returns Double
Map<String, Double> salaryByDepartment = employees.stream()
        .collect(Collectors.groupingBy(
                Employee::getDepartment,
                Collectors.mapping(
                        Employee::getSalary,
                        Collectors.summingDouble(Double::doubleValue))));

In this example, we’re grouping the employees by department and then using the mapping() collector to extract the salaries from each employee.

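Finally, we’re using the summingDouble() collector to calculate the total salary for each department. When the downstream step only sums a single property, Collectors.summingDouble(Employee::getSalary) on its own gives the same result without mapping().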
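To compare a few downstream collectors side by side, here is a small sketch. The DownstreamExample class, the Employee stand-in with a primitive double salary, and the salary figures are all invented for illustration. It shows summingDouble() applied directly, averagingDouble(), and mapping() used to collect names instead of whole objects.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class DownstreamExample {

    // Minimal stand-in for an Employee with a salary (assumed shape).
    static class Employee {
        private final String name;
        private final String department;
        private final double salary;

        Employee(String name, String department, double salary) {
            this.name = name;
            this.department = department;
            this.salary = salary;
        }

        String getName() { return name; }
        String getDepartment() { return department; }
        double getSalary() { return salary; }
    }

    // Total salary per department, without an intermediate mapping() step.
    static Map<String, Double> totalSalaryByDepartment(List<Employee> employees) {
        return employees.stream()
                .collect(Collectors.groupingBy(
                        Employee::getDepartment,
                        Collectors.summingDouble(Employee::getSalary)));
    }

    // Average salary per department.
    static Map<String, Double> averageSalaryByDepartment(List<Employee> employees) {
        return employees.stream()
                .collect(Collectors.groupingBy(
                        Employee::getDepartment,
                        Collectors.averagingDouble(Employee::getSalary)));
    }

    // mapping() turns each Employee into a name before it is collected.
    static Map<String, List<String>> namesByDepartment(List<Employee> employees) {
        return employees.stream()
                .collect(Collectors.groupingBy(
                        Employee::getDepartment,
                        Collectors.mapping(Employee::getName, Collectors.toList())));
    }

    static List<Employee> sampleEmployees() {
        // Hypothetical salary figures.
        return Arrays.asList(
                new Employee("John", "Sales", 50000),
                new Employee("Alice", "Sales", 60000),
                new Employee("Bob", "Marketing", 55000));
    }

    public static void main(String[] args) {
        List<Employee> employees = sampleEmployees();
        System.out.println(totalSalaryByDepartment(employees).get("Sales"));   // 110000.0
        System.out.println(averageSalaryByDepartment(employees).get("Sales")); // 55000.0
        System.out.println(namesByDepartment(employees).get("Sales"));         // [John, Alice]
    }
}
```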

Real-World Applications and Examples

Let’s look into some practical examples that demonstrate the versatility of groupingBy().

Imagine you’re working on an e-commerce platform, and you need to generate a report that shows the total sales amount for each region.

You have a list of Order objects, each containing the region and the sale amount.

Using groupingBy(), you can easily group the orders by region and calculate the total sales amount for each region:

List<Order> orders = Arrays.asList(
        new Order("North", 100),
        new Order("South", 200),
        new Order("North", 150),
        new Order("East", 50));
Map<String, Double> salesByRegion = orders.stream()
        .collect(Collectors.groupingBy(
                Order::getRegion,
                Collectors.summingDouble(Order::getAmount)));
// Entry order may differ; the returned map makes no ordering guarantee
System.out.println(salesByRegion); // e.g. {North=250.0, South=200.0, East=50.0}

In this example, you’re using groupingBy() to group the orders by region, and then applying a downstream collector (summingDouble()) to calculate the total sales amount for each region.

The resulting map shows the total sales amount for each region.
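If a report needs its regions in a predictable order, one option is the three-argument overload of groupingBy(), which accepts a map factory. The sketch below passes TreeMap::new so that keys come out sorted; the SortedReportExample class and the minimal Order class are stand-ins assumed for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class SortedReportExample {

    // Minimal stand-in for the article's Order class (assumed shape).
    static class Order {
        private final String region;
        private final double amount;

        Order(String region, double amount) {
            this.region = region;
            this.amount = amount;
        }

        String getRegion() { return region; }
        double getAmount() { return amount; }
    }

    // groupingBy(classifier, mapFactory, downstream): the factory decides the Map type.
    static Map<String, Double> salesByRegionSorted(List<Order> orders) {
        return orders.stream()
                .collect(Collectors.groupingBy(
                        Order::getRegion,
                        TreeMap::new,
                        Collectors.summingDouble(Order::getAmount)));
    }

    static List<Order> sampleOrders() {
        return Arrays.asList(
                new Order("North", 100),
                new Order("South", 200),
                new Order("North", 150),
                new Order("East", 50));
    }

    public static void main(String[] args) {
        // TreeMap iterates its keys in natural (alphabetical) order.
        System.out.println(salesByRegionSorted(sampleOrders()));
        // {East=50.0, North=250.0, South=200.0}
    }
}
```

Passing LinkedHashMap::new instead would keep the regions in the order they were first encountered in the stream.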

Another common scenario is grouping data by multiple fields.

Suppose you’re working on a student management system, and you need to group students by their department and then by their major.

You can achieve this by nesting one groupingBy() inside another:

List<Student> students = Arrays.asList(
        new Student("John", "CS", "AI"),
        new Student("Jane", "CS", "ML"),
        new Student("Bob", "EE", "Robotics"));
Map<String, Map<String, List<Student>>> stByDepAndMajor = students.stream()
        .collect(Collectors.groupingBy(
                Student::getDepartment,
                Collectors.groupingBy(Student::getMajor)));
System.out.println(stByDepAndMajor);
// Assuming Student.toString() returns the name (entry order may vary):
// {CS={AI=[John], ML=[Jane]}, EE={Robotics=[Bob]}}

In this example, you’re using groupingBy() to group students by department, and then applying another groupingBy() operation to group students by major within each department.

The resulting map shows the students grouped by department and major.
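An alternative to nesting is a single flat map with a composite key. The sketch below uses a two-element List as the key because lists compare by content; the CompositeKeyExample class and the minimal Student class are assumptions made for this illustration, and a dedicated key class would work just as well.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class CompositeKeyExample {

    // Minimal stand-in for the article's Student class (assumed shape).
    static class Student {
        private final String name;
        private final String department;
        private final String major;

        Student(String name, String department, String major) {
            this.name = name;
            this.department = department;
            this.major = major;
        }

        String getName() { return name; }
        String getDepartment() { return department; }
        String getMajor() { return major; }
    }

    // One flat map keyed by the (department, major) pair.
    static Map<List<String>, List<String>> namesByDepartmentAndMajor(List<Student> students) {
        return students.stream()
                .collect(Collectors.groupingBy(
                        s -> Arrays.asList(s.getDepartment(), s.getMajor()),
                        Collectors.mapping(Student::getName, Collectors.toList())));
    }

    static List<Student> sampleStudents() {
        return Arrays.asList(
                new Student("John", "CS", "AI"),
                new Student("Jane", "CS", "ML"),
                new Student("Bob", "EE", "Robotics"));
    }

    public static void main(String[] args) {
        Map<List<String>, List<String>> byKey = namesByDepartmentAndMajor(sampleStudents());

        // Lists with equal contents are equal keys, so a fresh list finds the group.
        System.out.println(byKey.get(Arrays.asList("CS", "AI"))); // [John]
        System.out.println(byKey.size());                          // 3
    }
}
```

The flat form is convenient when you always look up both fields together, while the nested form is easier when you also need everything under one department.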

These examples demonstrate the power and flexibility of groupingBy().

Troubleshooting and Optimization

In this section, we’ll cover some common issues you might encounter when using groupingBy() and how to optimize your code for better performance.

Let’s start with a common error: ConcurrentModificationException.

This occurs when you structurally modify a collection, for example by adding or removing map entries, while iterating over it.

For example, consider the following code:

List<String> list = Arrays.asList("apple", "banana", "cherry");
Map<Character, List<String>> grouped = list.stream()
        .collect(Collectors.groupingBy(s -> s.charAt(0)));
// Removing entries from the map inside its own forEach()
grouped.forEach((k, v) -> grouped.remove(k)); // ConcurrentModificationException!

In this case, you’re removing entries from the map while iterating over it.

Adding elements to the list inside each group is a different matter: it leaves the map’s structure untouched, so it does not trigger this exception. Keep in mind, though, that groupingBy() makes no guarantee about the mutability of the lists it returns.

If you plan to modify the groups afterwards, request a mutable list for each group explicitly:

Map<Character, List<String>> grouped = list.stream()
        .collect(Collectors.groupingBy(
                s -> s.charAt(0),
                Collectors.toCollection(ArrayList::new)));
grouped.forEach((k, v) -> v.add("new fruit")); // No exception!
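The difference between changing a group's contents and changing the map itself can be checked with a short sketch. The ModificationExample class and its helper names are made up for this illustration, and it pins the map to a HashMap and the groups to ArrayList so the behavior does not depend on what groupingBy() happens to return by default.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ModificationExample {

    // Explicit HashMap and ArrayList so both the map and the groups are mutable.
    static Map<Character, List<String>> groupByFirstLetter(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(
                        s -> s.charAt(0),
                        HashMap::new,
                        Collectors.toCollection(ArrayList::new)));
    }

    // Returns true if removing entries inside forEach() raises the exception.
    static boolean removingEntriesThrows(Map<Character, List<String>> grouped) {
        try {
            grouped.forEach((key, values) -> grouped.remove(key));
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<Character, List<String>> grouped =
                groupByFirstLetter(Arrays.asList("apple", "banana", "cherry"));

        // Mutating the value lists does not change the map's structure.
        grouped.forEach((key, values) -> values.add("new fruit"));
        System.out.println(grouped.get('a')); // [apple, new fruit]

        // Safe way to drop entries while traversing: removeIf on the entry set.
        grouped.entrySet().removeIf(entry -> entry.getKey() == 'b');
        System.out.println(grouped.containsKey('b')); // false

        // Removing entries from inside forEach() is a structural modification.
        System.out.println(removingEntriesThrows(grouped)); // true
    }
}
```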

Another common issue is performance.

When working with large datasets, grouping everything on a single thread can become a bottleneck.

One way to optimize this is to use parallel streams:

// large list
List<String> list = Arrays.asList("apple", "banana", "cherry" /* , … */);
Map<Character, List<String>> grouped = list.parallelStream()
        .collect(Collectors.groupingBy(s -> s.charAt(0)));

However, be careful when using parallel streams, as they can also introduce additional overhead.

Make sure to test and measure the performance of your code before making any optimizations.
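When you do go parallel, Collectors.groupingByConcurrent() is worth measuring as well: instead of merging one map per thread, it lets all threads fill a single ConcurrentMap. The sketch below is only an illustration; the GroupingConcurrentExample class, its method names, and the generated word list are assumptions. It counts words per first letter because counting does not depend on element order, which the concurrent collector does not preserve within a group.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collectors;

class GroupingConcurrentExample {

    // Parallel grouping into one shared ConcurrentMap.
    static ConcurrentMap<Character, Long> countByFirstLetter(List<String> words) {
        return words.parallelStream()
                .collect(Collectors.groupingByConcurrent(
                        s -> s.charAt(0),
                        Collectors.counting()));
    }

    // Sequential baseline for comparison.
    static Map<Character, Long> countSequential(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(
                        s -> s.charAt(0),
                        Collectors.counting()));
    }

    // Builds 4,000 words: 2,000 starting with 'a', 1,000 each with 'b' and 'c'.
    static List<String> sampleWords() {
        List<String> words = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            words.addAll(Arrays.asList("apple", "avocado", "banana", "cherry"));
        }
        return words;
    }

    public static void main(String[] args) {
        List<String> words = sampleWords();
        ConcurrentMap<Character, Long> counts = countByFirstLetter(words);

        System.out.println(counts.get('a'));                       // 2000
        System.out.println(counts.equals(countSequential(words))); // true
    }
}
```

Whether this beats a plain sequential groupingBy() depends on the data size and the cost of the classifier, so treat it as something to benchmark rather than a default.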

Finally, let’s talk about debugging.

When working with groupingBy(), it can be difficult to understand what’s going on under the hood.

One way to debug your code is to use the peek() method to print out intermediate results:

List<String> list = Arrays.asList("apple", "banana", "cherry");
Map<Character, List<String>> grouped = list.stream()
        .peek(System.out::println) // prints each element as it flows through
        .collect(Collectors.groupingBy(s -> s.charAt(0)));

This will print out each element as it’s processed, helping you understand how the grouping is being performed.

To wrap up

In this article, you learned about the Java Stream Collectors.groupingBy() method, which can simplify complex data processing tasks.

You’ve learned how to group data by simple and custom keys, perform multi-level grouping, and apply downstream collectors.
