Reading Files in Java
Read text and binary files in Java using FileReader, BufferedReader, Scanner, Files.readString, and streams.
There are five common ways to read a text file in Java, and the right choice depends almost entirely on the file's size and what you want to do with the contents. This chapter walks the five from simplest to most flexible:
- Files.readString(path): whole file as one String.
- Files.readAllLines(path): whole file as a List<String>.
- Files.readAllBytes(path): whole file as a byte[].
- Files.lines(path): file as a Stream<String>, lazy.
- BufferedReader / Scanner: classic decorators, full control.
Pick the smallest tool that fits. Reading a 4 GB log with Files.readString is an OutOfMemoryError; reading a 12-line config with BufferedReader and a while loop is six lines of code where one would do.
Files.readString(path) — whole file, one call
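String text = Files.readString(Path.of("config.json"), StandardCharsets.UTF_8);
Added in Java 11. Returns the full file as a String. The one-argument overload, Files.readString(path), has always decoded as UTF-8; unlike FileReader, it never depended on the platform default charset, so the Java 18 default-charset change did not alter its behaviour. Passing the Charset explicitly is still recommended because it documents the intent. Throws IOException if the file doesn't exist, can't be read, or contains bytes that are invalid in the chosen charset; throws OutOfMemoryError if the file is too large to hold as a single String, for example larger than 2 GB or larger than the available heap.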
Use when: the file is "small enough" — config files, JSON payloads, MDX chapters, anything you'd be willing to read in a single editor window. The classic informal rule is under a few megabytes.
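As a minimal sketch of the failure mode described above, the helper below reads a small file and falls back to a default when the file is missing. The class name ReadStringDemo, the method readOrDefault, and the file names are hypothetical, chosen only for illustration; it assumes the default file system, where a missing file surfaces as NoSuchFileException.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

class ReadStringDemo {

    // Hypothetical helper: returns the file's text, or `fallback` if the file is absent.
    static String readOrDefault(Path path, String fallback) throws IOException {
        try {
            return Files.readString(path, StandardCharsets.UTF_8);
        } catch (NoSuchFileException e) {
            // Only "file not found" is handled here; other I/O failures still propagate.
            return fallback;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("config", ".json");
        try {
            Files.writeString(tmp, "{\"debug\": true}", StandardCharsets.UTF_8);
            System.out.println(readOrDefault(tmp, "{}"));
        } finally {
            Files.deleteIfExists(tmp);
        }
        // The temp file no longer exists, so the fallback is returned.
        System.out.println(readOrDefault(tmp, "{}"));
    }
}
```

Catching the narrow NoSuchFileException rather than IOException keeps permission errors and decoding errors visible instead of silently replacing them with the default.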
Files.readAllLines(path) — list of lines
List<String> lines = Files.readAllLines(Path.of("hosts.txt"), StandardCharsets.UTF_8);
Returns a List<String> of the file's lines, with line terminators stripped. Whether the returned list is modifiable is not specified, so copy it into a new ArrayList before sorting or otherwise mutating it. Same memory profile as readString plus the List overhead: it also holds the whole file in memory.
Use when: you want to index by line number, sort the file, or feed lines into a for (String line : lines) loop without setting up streams.
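The two uses named above, indexing by line number and sorting, might look like the sketch below. The class ReadAllLinesDemo, the method sortedLines, and the host names are made up for the example; the defensive copy is there because the mutability of the returned list is not something to rely on.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ReadAllLinesDemo {

    // Returns the file's lines in natural order, copying first so the result is safely mutable.
    static List<String> sortedLines(Path path) throws IOException {
        List<String> lines = new ArrayList<>(Files.readAllLines(path, StandardCharsets.UTF_8));
        Collections.sort(lines);
        return lines;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("hosts", ".txt");
        try {
            Files.write(tmp, List.of("gamma.example", "alpha.example", "beta.example"),
                    StandardCharsets.UTF_8);
            List<String> lines = Files.readAllLines(tmp, StandardCharsets.UTF_8);
            System.out.println("second line: " + lines.get(1)); // zero-based index
            System.out.println("sorted: " + sortedLines(tmp));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```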
Files.readAllBytes(path) — raw bytes
byte[] raw = Files.readAllBytes(Path.of("photo.png"));
The byte equivalent. No Charset because no decoding happens. Use for binary files (images, archives, executables) or when you need to compute a hash or pipe bytes into a ByteArrayInputStream.
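The hashing use case could be sketched as follows, using the standard MessageDigest API. The class HashDemo and the method sha256Hex are illustrative names, not part of any library, and the approach only suits files small enough to hold in memory.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class HashDemo {

    // Reads the whole file as bytes and returns its SHA-256 digest as lowercase hex.
    static String sha256Hex(Path path) throws IOException, NoSuchAlgorithmException {
        byte[] raw = Files.readAllBytes(path);
        byte[] digest = MessageDigest.getInstance("SHA-256").digest(raw);
        StringBuilder hex = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            hex.append(String.format("%02x", b)); // two hex digits per byte
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("payload", ".bin");
        try {
            Files.write(tmp, "abc".getBytes(StandardCharsets.US_ASCII));
            System.out.println(sha256Hex(tmp));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

For files too large for readAllBytes, the same digest can be fed incrementally from an InputStream instead.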
Files.lines(path) — lazy stream
try (Stream<String> lines = Files.lines(Path.of("app.log"), StandardCharsets.UTF_8)) {
    long errors = lines.filter(l -> l.contains("ERROR")).count();
}
Of the one-call Files helpers, this is the only one that scales to arbitrarily large files (the BufferedReader loop covered later in this chapter scales the same way). The Stream<String> is lazy: lines are read on demand, not all at once, and it connects directly to the Part 12 pipeline vocabulary (filter, map, count, toList).
Two non-negotiables:
- try-with-resources is required. The stream owns an open file handle; without try-with-resources, the file stays open until GC, and you'll exhaust file descriptors on a busy server.
- Don't reuse the stream after a terminal op. Streams are single-use.
Use when: the file is too big for readAllLines, or you want the line-by-line transform to compose with the rest of your stream pipeline.
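Both non-negotiables can be seen in a small sketch. The class FilesLinesDemo, the method countMatching, and the log contents are assumptions made for the example; the second half deliberately triggers the IllegalStateException that a reused stream throws.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Stream;

class FilesLinesDemo {

    // Counts lines containing `needle` without loading the whole file into memory.
    static long countMatching(Path path, String needle) throws IOException {
        try (Stream<String> lines = Files.lines(path, StandardCharsets.UTF_8)) {
            return lines.filter(line -> line.contains(needle)).count();
        } // the file handle is released here, even if the pipeline throws
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("app", ".log");
        try {
            Files.write(log, List.of("INFO start", "ERROR disk full", "ERROR timeout"),
                    StandardCharsets.UTF_8);
            System.out.println(countMatching(log, "ERROR")); // prints 2

            try (Stream<String> lines = Files.lines(log, StandardCharsets.UTF_8)) {
                lines.count();     // first terminal operation consumes the stream
                try {
                    lines.count(); // a second terminal operation is rejected
                } catch (IllegalStateException e) {
                    System.out.println("stream already consumed");
                }
            }
        } finally {
            Files.deleteIfExists(log);
        }
    }
}
```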
BufferedReader.readLine() — the classic
BufferedReader is the workhorse the modern helpers wrap. It buffers underlying reads into a fixed-size in-memory chunk so that readLine() doesn't issue one syscall per character.
try (BufferedReader in = Files.newBufferedReader(Path.of("hosts.txt"), StandardCharsets.UTF_8)) {
    String line;
    while ((line = in.readLine()) != null) {
        System.out.println(line);
    }
}
Files.newBufferedReader(path) is the modern factory, and it decodes as UTF-8 unless you pass a different Charset. The classic version is new BufferedReader(new FileReader("hosts.txt")), which uses the platform default charset on JDKs older than 18; pin it to UTF-8 with the two-argument constructor new FileReader("hosts.txt", StandardCharsets.UTF_8), available since Java 11. The readLine() contract is:
- Returns the next line without its terminator (\n, \r, or \r\n).
- Returns null at end of file. The loop condition (line = readLine()) != null is the established idiom.
BufferedReader is also a Stream<String> producer: reader.lines() returns a Stream<String> backed by the reader. Files.lines gives you the same kind of lazy line stream without creating the reader yourself, and closing that stream closes the file.
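The readLine() contract and the lines() bridge can be checked without touching the disk by wrapping a StringReader. This is a sketch only; the class ReadLineContractDemo and its two helper methods are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class ReadLineContractDemo {

    // Classic idiom: loop until readLine() returns null at end of input.
    static List<String> readAll(Reader source) throws IOException {
        List<String> out = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.add(line);
            }
        }
        return out;
    }

    // Same reader, consumed as a Stream<String> through lines().
    static List<String> upperCased(Reader source) throws IOException {
        try (BufferedReader in = new BufferedReader(source)) {
            return in.lines().map(String::toUpperCase).collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // \n, \r and \r\n all terminate a line and are stripped from the result.
        System.out.println(readAll(new StringReader("a\nb\rc\r\nd"))); // [a, b, c, d]
        System.out.println(upperCased(new StringReader("x\ny")));     // [X, Y]
    }
}
```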
Scanner — token-by-token parsing
Scanner reads text by tokens — words, integers, doubles, lines, even regex matches — and is the right tool for reading structured input where the units aren't whole lines.
try (Scanner sc = new Scanner(Files.newBufferedReader(Path.of("nums.txt")))) {
    while (sc.hasNextInt()) {
        int n = sc.nextInt();
        System.out.println(n * n);
    }
}
Scanner is slower than BufferedReader because it parses; it allocates short strings and runs regex. For line-by-line processing, prefer BufferedReader. For typed tokens out of a small file (numbers, words, CSV-ish input), Scanner saves the parsing layer.
There's a full chapter on Scanner later in this part — this is the read-a-file flavour.
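The loop above stops at the first token that is not an integer. When integers are mixed with other tokens, one possible variation is to skip the non-integer tokens instead, as in this sketch. The class ScannerTokensDemo and the method sumInts are illustrative names, and Locale.ROOT is pinned as an assumption so that number parsing does not depend on the machine's locale.

```java
import java.io.StringReader;
import java.util.Locale;
import java.util.Scanner;

class ScannerTokensDemo {

    // Sums every integer token in the input and ignores all other tokens.
    static int sumInts(Readable source) {
        int sum = 0;
        try (Scanner sc = new Scanner(source)) {
            sc.useLocale(Locale.ROOT);
            while (sc.hasNext()) {
                if (sc.hasNextInt()) {
                    sum += sc.nextInt();
                } else {
                    sc.next(); // consume and discard a non-integer token
                }
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumInts(new StringReader("3 apples 4 pears 5"))); // 12
    }
}
```

Because Scanner accepts any Readable, the same method works with Files.newBufferedReader(path) in place of the StringReader.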
FileReader — the raw character reader
try (FileReader in = new FileReader("notes.txt", StandardCharsets.UTF_8)) {
    int c;
    while ((c = in.read()) != -1) {
        System.out.print((char) c);
    }
}
FileReader reads characters straight from the file: no line awareness, no readLine(), and no caller-facing buffering, so each read() call may go through the decoder to the underlying stream. You pass the Charset, or accept the default (the platform charset on pre-18 JDKs, UTF-8 from Java 18 on). It's the layer the classic BufferedReader idiom sits on top of. You almost never use it directly in application code; you wrap it in a BufferedReader.
It's still useful when you want to read a few hundred characters and stop: small lookups where the extra BufferedReader layer buys little.
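A "read a little and stop" lookup might be sketched like this, reading into a char[] in bulk rather than one character per call. The class FileHeadDemo and the method head are hypothetical; the FileReader(File, Charset) constructor it uses requires Java 11 or later.

```java
import java.io.FileReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class FileHeadDemo {

    // Returns at most the first `maxChars` characters of the file.
    static String head(Path path, int maxChars) throws IOException {
        char[] buf = new char[maxChars];
        int filled = 0;
        try (FileReader in = new FileReader(path.toFile(), StandardCharsets.UTF_8)) {
            while (filled < maxChars) {
                int n = in.read(buf, filled, maxChars - filled);
                if (n == -1) {
                    break; // end of file before the buffer was full
                }
                filled += n;
            }
        }
        return new String(buf, 0, filled);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("notes", ".txt");
        try {
            Files.writeString(tmp, "hello world", StandardCharsets.UTF_8);
            System.out.println(head(tmp, 5)); // hello
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

The loop is needed because a single read call is allowed to return fewer characters than requested.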
Which one to use
| Scenario | Pick |
|---|---|
| Small file you want as a single String | Files.readString |
| Small file you want as a List<String> | Files.readAllLines |
| Binary file (image, archive) | Files.readAllBytes |
| Any file with a stream-style transform | Files.lines (inside try-with-resources) |
| Line-by-line loop, full control | Files.newBufferedReader + readLine |
| Typed tokens (ints, words, regex matches) | Scanner |
| One character at a time, tiny file | FileReader |
The right default for the "I just want to load this small text file" case is Files.readString. The right default for "process this giant log without blowing memory" is Files.lines.
A worked example: same file, five readers
The program below writes a small text file, then reads it five different ways — readString, readAllLines, Files.lines filtered through a Predicate<String> from Part 12's vocabulary, BufferedReader.readLine, and Scanner for tokenised integers. Each block prints what it got so you can see the shapes side by side.
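A minimal sketch of a program with that shape is given here. It is written from the description in this section rather than taken from an original listing, so the class name FiveReaders, the helper method names, and the file contents are all assumptions.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Scanner;
import java.util.function.Predicate;
import java.util.stream.Stream;

class FiveReaders {

    // Files.lines: lazy stream, closed by try-with-resources.
    static long countErrors(Path file) throws IOException {
        Predicate<String> isError = line -> line.startsWith("ERROR");
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            return lines.filter(isError).count();
        }
    }

    // BufferedReader: deliberately stops after at most three lines.
    static List<String> firstThree(Path file) throws IOException {
        List<String> out = new ArrayList<>();
        try (BufferedReader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            for (int i = 0; i < 3; i++) {
                String line = in.readLine();
                if (line == null) break; // end of file reached early
                out.add(line);
            }
        }
        return out;
    }

    // Scanner: skip lines that don't start with an integer, then read ints until they run out.
    static List<Integer> leadingInts(Path file) throws IOException {
        List<Integer> out = new ArrayList<>();
        try (Scanner sc = new Scanner(Files.newBufferedReader(file, StandardCharsets.UTF_8))) {
            sc.useLocale(Locale.ROOT);
            while (sc.hasNextLine() && !sc.hasNextInt()) sc.nextLine();
            while (sc.hasNextInt()) out.add(sc.nextInt());
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("five-readers", ".txt");
        try {
            // Assumed sample contents: four log lines followed by a line of integers.
            Files.writeString(file, String.join("\n",
                    "INFO boot ok", "ERROR disk full", "INFO retry", "ERROR net down", "10 20 30"),
                    StandardCharsets.UTF_8);

            String text = Files.readString(file, StandardCharsets.UTF_8);
            System.out.println("readString: " + text.length() + " chars");

            List<String> lines = Files.readAllLines(file, StandardCharsets.UTF_8);
            System.out.println("readAllLines: " + lines.size() + " lines, first = " + lines.get(0));

            System.out.println("Files.lines: " + countErrors(file) + " error lines");
            System.out.println("BufferedReader: " + firstThree(file));
            System.out.println("Scanner: " + leadingInts(file));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```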
What to take from the run:
- Files.readString returned the whole file as one String: easy and exactly what you want for small configs and templates. For a 4 GB log it would have thrown OutOfMemoryError.
- Files.readAllLines returned an indexable List<String> with terminators stripped. lines.get(0) worked because the list is materialised in memory; you couldn't do that with a stream.
- Files.lines(file) was opened inside try-with-resources because the stream owns the file handle. The pipeline .filter(isError).count() is the same shape as anything from Part 12; only the source changed.
- BufferedReader.readLine() returned null at end of file. The for loop here stopped at three on purpose, but the production idiom is while ((line = in.readLine()) != null).
- Scanner skipped lines that didn't start with an integer, then read tokens with nextInt() until it ran out. The same Scanner could have read doubles (nextDouble), regex matches (findInLine), or BigIntegers; that's why it costs more per token than BufferedReader does per line.
What's next
The next chapter, Writing Files in Java, covers the writing side of the same APIs — Files.writeString, Files.write, BufferedWriter, PrintWriter, and the StandardOpenOption flags (APPEND, CREATE_NEW, TRUNCATE_EXISTING) that decide how an existing file is handled.
Practice
You need to process a 5 GB server log line by line, counting how many lines contain the word `ERROR`. Which reader is the right pick?