W3docs

Java Streams Introduction

An introduction to the Java Stream API for processing sequences of elements with functional-style operations.

A stream is a pipeline that carries the elements of a source through a sequence of operations and produces a result. It is not a data structure: it stores nothing. It is a declarative recipe for processing data, evaluated lazily and executed once. Streams arrived in Java 8 alongside lambdas, and the two were designed to fit together: most stream operations take a function, and the language gave you a clean way to write one.

The shape you'll write hundreds of times:

double avgAdultAge = people.stream()
    .filter(p -> p.age() >= 18)
    .mapToInt(Person::age)
    .average()
    .orElse(0.0);

Three things to notice. The pipeline reads top-to-bottom as steps describing what you want, not how to iterate. Each step takes a function — a Predicate, a ToIntFunction — exactly the vocabulary the previous chapters set up. And the result drops out of a single terminal operation; there's no loop, no accumulator, no early continue.
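To try the pipeline above in isolation, here is a minimal self-contained sketch. The Person record, the class name, and the sample data are assumptions made for illustration; only the pipeline itself comes from the text.

```java
import java.util.List;

class AverageAdultAge {
    // Hypothetical record standing in for the article's Person type.
    record Person(String name, int age) {}

    static double averageAdultAge(List<Person> people) {
        return people.stream()
            .filter(p -> p.age() >= 18)   // Predicate<Person>
            .mapToInt(Person::age)        // ToIntFunction<Person>
            .average()                    // OptionalDouble, empty when there are no adults
            .orElse(0.0);                 // fallback for the empty case
    }

    public static void main(String[] args) {
        List<Person> people = List.of(
            new Person("Ana", 34),
            new Person("Ben", 12),
            new Person("Cy", 20));
        System.out.println(averageAdultAge(people));      // 27.0
        System.out.println(averageAdultAge(List.of()));   // 0.0
    }
}
```

The orElse(0.0) fallback matters because average() has nothing to return when the filter removes every element.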

The pipeline shape: source → intermediate → terminal

Every stream pipeline has three parts:

  1. A source. Where the elements come from. Usually a collection (coll.stream()), occasionally a literal (Stream.of("a", "b")), an array (Arrays.stream(arr)), an IntStream range (IntStream.range(0, 100)), an I/O source (Files.lines(path)), or a generator (Stream.iterate, Stream.generate). The next chapter is dedicated to all of them.
  2. Zero or more intermediate operations. Each returns another stream, so they chain. Common ones: filter, map, flatMap, distinct, sorted, limit, skip, peek. They are lazy — calling filter does not test anything yet; it just records the predicate.
  3. Exactly one terminal operation. Triggers the pipeline. Examples: forEach, collect, toList (Java 16+), count, min, max, reduce, findFirst, anyMatch, and sum on the primitive streams (IntStream, LongStream, DoubleStream). The terminal produces a value (or a side effect for forEach) and consumes the stream, so you can't reuse it.

list.stream()              // SOURCE
    .filter(...)           // intermediate
    .map(...)              // intermediate
    .sorted()              // intermediate
    .toList();             // TERMINAL — runs the pipeline

Without the terminal, nothing happens. A stream you build and never finish is dead weight — no work is done, no side effects fire, the lambdas don't run.
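One way to observe this is to count how often a predicate runs before and after the terminal operation. The sketch below is illustrative only: the class name, the sample strings, and the counter are assumptions, and the side effect inside the lambda exists purely to make laziness visible.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

class LazyUntilTerminal {
    // Returns the number of predicate calls seen before and after the terminal runs.
    static List<Integer> predicateCallsBeforeAndAfter() {
        AtomicInteger calls = new AtomicInteger();

        Stream<String> pipeline = List.of("a", "bb", "ccc").stream()
            .filter(s -> {
                calls.incrementAndGet();      // side effect used only to observe laziness
                return s.length() > 1;
            })
            .map(String::toUpperCase);

        int before = calls.get();             // pipeline built, nothing executed yet
        pipeline.toList();                    // terminal operation: runs the pipeline
        int after = calls.get();              // predicate ran once per source element
        return List.of(before, after);
    }

    public static void main(String[] args) {
        System.out.println(predicateCallsBeforeAndAfter());   // [0, 3]
    }
}
```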

Lazy by design

Intermediate operations are lazy because the pipeline cannot know which elements are actually needed until the terminal operation asks for them. That enables two important optimisations:

Fusion. Adjacent intermediates run together in one pass, not one pass per operation. stream.filter(p).map(f) doesn't build an intermediate filtered list and then map it; it tests one element, and if it survives, maps it, all in one step.
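The interleaving can be made visible by logging from inside the lambdas. This is a small sketch with assumed names and data; the logging side effects are there only for demonstration and rely on the stream being sequential.

```java
import java.util.ArrayList;
import java.util.List;

class FusionOrder {
    // Records the order in which filter and map see each element.
    static List<String> trace() {
        List<String> log = new ArrayList<>();
        List<Integer> out = List.of(1, 2, 3).stream()
            .filter(n -> { log.add("filter " + n); return n % 2 == 1; })
            .map(n -> { log.add("map " + n); return n * 10; })
            .toList();
        log.add("result " + out);
        return log;
    }

    public static void main(String[] args) {
        trace().forEach(System.out::println);
        // filter 1
        // map 1
        // filter 2
        // filter 3
        // map 3
        // result [10, 30]
    }
}
```

If the operations ran as separate passes, all three "filter" lines would appear before the first "map" line.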

Short-circuiting. A terminal like findFirst or anyMatch, or a short-circuiting intermediate like limit(n), stops pulling elements as soon as it has what it needs. Combined with laziness, this means you can run a "find the first even square greater than 100" pipeline over an infinite stream and still get an answer after testing only a handful of elements:

int answer = Stream.iterate(1, n -> n + 1)         // 1, 2, 3, 4, ...
    .map(n -> n * n)                                // 1, 4, 9, 16, ...
    .filter(n -> n % 2 == 0 && n > 100)             // first match wins
    .findFirst()
    .orElseThrow();
// answer = 144

Stream.iterate(1, n -> n + 1) is infinite, but findFirst only requested elements until one matched. The pipeline tested 12 squares (1, 4, 9, ..., 144) and stopped.
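The count of 12 can be checked by placing a counter in front of the filter. The sketch below also shows limit, a short-circuiting intermediate operation, truncating an infinite source. Class and method names are assumptions for illustration.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

class ShortCircuit {
    // Counts how many squares reach the filter before findFirst is satisfied.
    static int squaresTested() {
        AtomicInteger tested = new AtomicInteger();
        Stream.iterate(1, n -> n + 1)
            .map(n -> n * n)
            .peek(n -> tested.incrementAndGet())      // observation only
            .filter(n -> n % 2 == 0 && n > 100)
            .findFirst()
            .orElseThrow();
        return tested.get();
    }

    // limit(count) makes an infinite source finite.
    static List<Integer> firstPowersOfTwo(int count) {
        return Stream.iterate(1, n -> n * 2)
            .limit(count)
            .toList();
    }

    public static void main(String[] args) {
        System.out.println(squaresTested());       // 12
        System.out.println(firstPowersOfTwo(5));   // [1, 2, 4, 8, 16]
    }
}
```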

Single-use, like an Iterator

A Stream can be traversed once. The terminal consumes it, and after that the stream object cannot be used again; calling another operation on it throws IllegalStateException:

Stream<String> s = list.stream();
long c1 = s.count();             // ok
long c2 = s.count();             // throws IllegalStateException — stream has already been operated upon

If you need to process the same data twice, build the stream twice:

long c1 = list.stream().count();
long c2 = list.stream().count();

This matches the way Iterator works. The stream object is the moving cursor, not the data. The data lives in the source, and for an in-memory collection creating a fresh stream is cheap.
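When a method needs to run more than one pass, one common way to express "build the stream again" is to accept a Supplier of streams instead of a stream. This is a sketch of that idea, not something the chapter prescribes; the class and method names are hypothetical.

```java
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Stream;

class SingleUse {
    // Each call to source.get() yields a fresh stream, so two terminals are safe.
    static long[] countTwice(Supplier<Stream<String>> source) {
        long first = source.get().count();
        long second = source.get().count();
        return new long[] { first, second };
    }

    public static void main(String[] args) {
        List<String> list = List.of("a", "b", "c");
        long[] counts = countTwice(list::stream);
        System.out.println(counts[0] + " " + counts[1]);   // 3 3
    }
}
```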

Streams vs collections — different jobs

| Aspect | Collection | Stream |
| --- | --- | --- |
| Stores data? | Yes | No |
| Reusable? | Yes | No (one terminal) |
| Eager or lazy? | Eager | Lazy until terminal |
| Modifies source? | Yes (e.g. list.add) | No, pipelines are read-only |
| Iterates explicitly? | Often (for, iterator()) | No, the pipeline drives iteration |
| Cost model | Bookkeeping per element | One pass through the source |

A collection is a container; a stream is a computation over a container (or another source). They complement each other: you fetch from a collection, run a stream pipeline, and collect back into a (usually different) collection.
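As an illustration of that round trip, the sketch below starts from a List and collects into a sorted Map keyed by word length. The grouping task, the names, and the sample words are assumptions chosen for the example.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class CollectBack {
    // List in, Map out: the source list itself is left untouched.
    static Map<Integer, List<String>> byLength(List<String> words) {
        return words.stream()
            .collect(Collectors.groupingBy(
                String::length,          // key: word length
                TreeMap::new,            // sorted map, so keys print in order
                Collectors.toList()));   // value: words of that length
    }

    public static void main(String[] args) {
        List<String> words = List.of("kiwi", "fig", "plum", "pear", "yam");
        System.out.println(byLength(words));   // {3=[fig, yam], 4=[kiwi, plum, pear]}
        System.out.println(words.size());      // 5
    }
}
```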

Three small examples you'll write all the time

Counting elements that match a predicate:

long adults = people.stream().filter(p -> p.age() >= 18).count();

Building a list of transformed values:

List<String> names = people.stream().map(Person::name).toList();

Reducing to a single value:

int totalAge = people.stream().mapToInt(Person::age).sum();

These three patterns — count, map-to-list, reduce-to-scalar — cover most uses of the API. The rest of the part is a tour of the operations that fill in the how for each.

Three things streams are not

  • Not a replacement for for loops in general. A loop that builds something with non-trivial control flow, that needs break with side effects, or that mutates several variables, is still clearer as a loop. Streams shine when the work is a pipeline of pure operations.
  • Not a performance win on small data. A stream pipeline allocates a few small objects; a 10-element loop will outpace it. The wins come from clarity on any data and from parallelism on big data.
  • Not a substitute for Iterator/Iterable when other code expects them. A stream is consumed once by its terminal operation; if a caller needs to iterate with an enhanced for, or expects a List returned from a method, call toList() first and hand over the list.
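The last point might look like this at a method boundary. The sketch uses a hypothetical shortWords method: the stream stays an implementation detail and callers receive an ordinary List they can loop over as often as they like.

```java
import java.util.List;
import java.util.stream.Stream;

class StreamToIterable {
    // Hypothetical API boundary: materialise the stream before returning.
    static List<String> shortWords(Stream<String> words) {
        return words.filter(w -> w.length() <= 3).toList();
    }

    public static void main(String[] args) {
        List<String> result = shortWords(Stream.of("fig", "banana", "yam"));
        for (String w : result) {          // enhanced for works on the List
            System.out.println(w);
        }
        // fig
        // yam
    }
}
```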

Sequential by default, parallel on request

Every stream you'll write in this chapter is sequential: elements flow through the pipeline one at a time, in order. There's also coll.parallelStream() (and stream.parallel()), which schedules the pipeline across the common ForkJoinPool for multi-core work. Parallel streams are a later chapter. They assume that the functions you pass are stateless and side-effect-free and that any reduction is associative; this chapter's intro pipelines naturally meet those conditions, so the upgrade is usually a one-token change.
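A minimal sketch of that one-token switch, using a sum of squares because the mapping function is stateless and addition is associative. The class name and the boolean flag are assumptions for illustration; whether parallel execution is actually faster depends on data size and hardware, which this sketch does not measure.

```java
import java.util.stream.IntStream;

class ParallelToggle {
    static long sumOfSquares(int n, boolean parallel) {
        IntStream range = IntStream.rangeClosed(1, n);
        if (parallel) {
            range = range.parallel();   // the only difference between the two modes
        }
        return range
            .mapToLong(i -> (long) i * i)   // stateless, no side effects
            .sum();                         // associative reduction
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1000, false));   // 333833500
        System.out.println(sumOfSquares(1000, true));    // 333833500
    }
}
```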

A worked example: a full pipeline, laziness, and the single-use rule

The program below builds a small list of Person records, runs the canonical pipeline shape (filter → map → sorted → collect), proves laziness with peek, demonstrates short-circuiting on an infinite Stream.iterate, and shows the IllegalStateException you get from reusing a stream.
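A sketch of what such a program might look like, pieced together from the description above and the notes that follow. The Person record, the sample data, the class name, and the method names are all assumptions rather than the original listing.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class StreamsIntroDemo {
    // Hypothetical sample type; the fields are assumed from the calls used in the text.
    record Person(String name, int age) {}

    // filter -> map -> sorted -> collect: sorted names of adults.
    static List<String> sortedAdultNames(List<Person> people) {
        return people.stream()
            .filter(p -> p.age() >= 18)
            .map(Person::name)
            .sorted()
            .collect(Collectors.toList());
    }

    // peek logs each element as findFirst pulls it through the pipeline.
    static int firstSquareOver50() {
        return IntStream.iterate(1, n -> n + 1)
            .peek(n -> System.out.println("  pulled " + n))
            .map(n -> n * n)
            .filter(sq -> sq > 50)
            .findFirst()
            .orElseThrow();
    }

    // Short-circuiting over an infinite source.
    static int firstEvenSquareOver100() {
        return Stream.iterate(1, n -> n + 1)
            .map(n -> n * n)
            .filter(n -> n % 2 == 0 && n > 100)
            .findFirst()
            .orElseThrow();
    }

    // Returns true if a second terminal operation on the same stream throws.
    static boolean secondCountThrows(List<String> list) {
        Stream<String> s = list.stream();
        s.count();
        try {
            s.count();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Person> people = List.of(
            new Person("Mia", 31), new Person("Leo", 9),
            new Person("Ada", 45), new Person("Tom", 17));

        System.out.println(sortedAdultNames(people));         // [Ada, Mia]
        System.out.println(firstSquareOver50());              // pulls 1..8, then 64
        System.out.println(firstEvenSquareOver100());         // 144
        System.out.println(secondCountThrows(List.of("a")));  // true

        // No terminal operation: the peek action never runs.
        Stream.of("x", "y").peek(s -> System.out.println("never printed"));
    }
}
```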


What to take from the run:

  • The canonical four-step pipeline (stream → filter → map → toList) produced a sorted list of adult names with no explicit loop, no temporary collection, and no nullability bookkeeping.
  • peek printed once per pulled element. findFirst pulled elements until one satisfied n*n > 50 (which happens at n = 8, square 64) and then stopped. That's laziness and short-circuiting working together: the upstream operations did exactly the work that was needed and no more.
  • The "first even square over 100" pipeline ran over an infinite source. Without short-circuiting that would be an infinite loop; with it the pipeline tested 12 values and produced 144.
  • The second s.count() threw IllegalStateException. Streams are single-use; if you need a second pass, build a fresh stream from the source.
  • The "no-terminal" pipeline at the end printed nothing from inside its peek. Without a terminal, intermediates don't run — the stream is just a recipe that nobody asked to execute.

What's next

You know the pipeline shape, the source/intermediate/terminal split, the laziness contract, and the single-use rule. The next chapter, Creating Java Streams, is the catalogue of sources: Collection.stream(), Stream.of, Arrays.stream, IntStream.range, Stream.iterate, Stream.generate, Files.lines, String.chars(), Stream.empty, and the Stream.Builder API. With the source chapter done you'll have everything you need to start, and the rest of the part will fill in the intermediate and terminal operations.

Practice

You write `list.stream().filter(p).map(f);` and call no terminal operation. What happens when this line executes?