Java Regex Flags
Modify Java regex behavior with flags — CASE_INSENSITIVE, MULTILINE, DOTALL, COMMENTS, and inline (?i) syntax.
A flag changes how a regular expression is interpreted without touching the pattern itself. The same expression can match case-sensitively or not, treat a string as one line or many, and let . cross newlines or stop at them — all decided by flags. In Java you set them two ways: as int constants passed to Pattern.compile(pattern, flags), or as inline (?i)-style switches written inside the pattern. This chapter covers the flags you reach for daily and how to combine them.
The two ways to set a flag
Every flag has a constant in the Pattern class. Pass it as the second argument to compile:
Pattern p = Pattern.compile("error", Pattern.CASE_INSENSITIVE);
The same behavior is available inside the pattern as an inline modifier, so a bare string regex can carry its own flags with no second argument:
Pattern p = Pattern.compile("(?i)error"); // whole pattern, case-insensitive
Pattern q = Pattern.compile("(?i:error) CODE"); // only the group is case-insensitive
Inline flags are handy when the pattern travels as a plain string (a config file, a database column, an annotation) where you cannot also pass an int. The constant form is clearer when the flag is part of your code.
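As a rough illustration of a pattern that "travels as a plain string", the sketch below keeps two regexes in a map that stands in for a config file. The RULES map, its keys, and the matches helper are hypothetical names invented for this example; only Pattern.compile and the inline syntax come from the text above.

```java
import java.util.Map;
import java.util.regex.Pattern;

public class Main {
    // Hypothetical rule table standing in for a config file: strings only, no int flags.
    static final Map<String, String> RULES = Map.of(
            "anyCaseError", "(?i)error",        // flag applies to the whole pattern
            "scopedError", "(?i:error) CODE");  // flag applies inside the group only

    // Compiles the named rule with no flags argument and tests the whole input.
    static boolean matches(String ruleName, String input) {
        return Pattern.compile(RULES.get(ruleName)).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("anyCaseError", "ERROR"));      // true
        System.out.println(matches("scopedError", "Error CODE"));  // true
        System.out.println(matches("scopedError", "error code"));  // false: CODE stays case-sensitive
    }
}
```

The caller never sees a flag value; everything needed to interpret the rule is inside the string itself.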
The flags you will actually use
| Constant | Inline | Effect |
|---|---|---|
| CASE_INSENSITIVE | (?i) | Match ASCII letters regardless of case |
| MULTILINE | (?m) | ^ and $ match at every line boundary, not just string ends |
| DOTALL | (?s) | . matches line terminators too (s = "single line") |
| COMMENTS | (?x) | Ignore unescaped whitespace and treat # as a comment |
| UNICODE_CASE | (?u) | Make CASE_INSENSITIVE fold Unicode letters, not just ASCII |
| UNICODE_CHARACTER_CLASS | (?U) | Make \w, \d, \b follow Unicode rules |
| LITERAL | (none) | Treat the whole pattern as plain text, no metacharacters |
A common surprise: CASE_INSENSITIVE alone folds only ASCII. To match accented or non-Latin letters case-insensitively, combine it with UNICODE_CASE.
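Two rows of the table, UNICODE_CHARACTER_CLASS and LITERAL, are easy to misread, so here is a minimal sketch of each. The fullMatch helper and the sample strings are assumptions made for illustration; the accented letter is written as a Unicode escape so the result does not depend on source-file encoding.

```java
import java.util.regex.Pattern;

public class Main {
    // Hypothetical helper: compile with the given flags and test the whole input.
    static boolean fullMatch(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        String cafe = "caf\u00e9"; // "cafe" ending in e-acute

        // By default \w covers only [a-zA-Z_0-9], so the accented letter breaks the match.
        System.out.println(fullMatch("\\w+", 0, cafe));                               // false
        System.out.println(fullMatch("\\w+", Pattern.UNICODE_CHARACTER_CLASS, cafe)); // true

        // LITERAL turns off metacharacters: the dot only matches a real dot.
        System.out.println(fullMatch("a.b", 0, "axb"));               // true
        System.out.println(fullMatch("a.b", Pattern.LITERAL, "axb")); // false
        System.out.println(fullMatch("a.b", Pattern.LITERAL, "a.b")); // true
    }
}
```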
Case insensitivity
By default a regex is case-sensitive, so error does not match ERROR. Add CASE_INSENSITIVE and both match:
Pattern.compile("error").matcher("ERROR").find(); // false
Pattern.compile("error", Pattern.CASE_INSENSITIVE).matcher("ERROR").find(); // true
// For non-ASCII letters, add UNICODE_CASE:
Pattern.compile("é", Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE)
.matcher("É").find(); // true
Line handling: MULTILINE and DOTALL
These two are independent and often confused. MULTILINE changes where the anchors ^ and $ match; DOTALL changes what the dot (.) matches.
String text = "first line\nsecond line";
// Without MULTILINE, ^ matches only the very start of the input.
Pattern.compile("^second").matcher(text).find(); // false
// With MULTILINE, ^ matches the start of every line.
Pattern.compile("^second", Pattern.MULTILINE).matcher(text).find(); // true
// Without DOTALL, . will not cross the newline.
Pattern.compile("first.*second").matcher(text).find(); // false
// With DOTALL, . matches the newline too.
Pattern.compile("first.*second", Pattern.DOTALL).matcher(text).find(); // true
Reach for MULTILINE when scanning multi-line log or document text line by line, and DOTALL when a single match must span several lines (an HTML block, a multi-line record).
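The two use cases can be sketched side by side. The method names errorLines and firstPreBlock, the log format, and the pre tag are all assumptions chosen for illustration, not part of any real API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    // MULTILINE: collect every line that starts with ERROR, one match per line.
    static List<String> errorLines(String log) {
        Matcher m = Pattern.compile("^ERROR.*$", Pattern.MULTILINE).matcher(log);
        List<String> lines = new ArrayList<>();
        while (m.find()) {
            lines.add(m.group());
        }
        return lines;
    }

    // DOTALL: capture one block whose body may contain newlines (null if absent).
    static String firstPreBlock(String doc) {
        Matcher m = Pattern.compile("<pre>(.*?)</pre>", Pattern.DOTALL).matcher(doc);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String log = "INFO boot\nERROR disk full\nINFO retry\nERROR timeout";
        System.out.println(errorLines(log)); // [ERROR disk full, ERROR timeout]

        String doc = "<p>intro</p><pre>line one\nline two</pre>";
        System.out.println(firstPreBlock(doc).contains("\n")); // true: the match spans two lines
    }
}
```

In the first method the dot still stops at each newline, which is what keeps every match confined to a single line; only the anchors changed.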
Combining flags
Flag constants are bit masks, so you combine them with the bitwise OR operator |:
int flags = Pattern.MULTILINE | Pattern.CASE_INSENSITIVE | Pattern.DOTALL;
Pattern p = Pattern.compile("^error.*done$", flags);
The inline equivalent stacks the letters: (?ims) sets all three. A minus turns a flag off: (?-i) disables case-insensitivity from that point to the end of the enclosing group (or to the end of the pattern when written at the top level), and (?-i:...) limits the change to a single group.
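A short sketch of the inline forms, assuming a hypothetical found helper and a made-up three-line input:

```java
import java.util.regex.Pattern;

public class Main {
    // Hypothetical helper: true if the regex matches anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String text = "start\nERROR: retry\nall DONE";

        // (?ims) is the inline form of MULTILINE | CASE_INSENSITIVE | DOTALL.
        System.out.println(found("(?ims)^error.*done$", text)); // true
        // Drop the s and the dot can no longer cross the newline before "all DONE".
        System.out.println(found("(?im)^error.*done$", text));  // false

        // (?-i) switches case-insensitivity back off for what follows it.
        System.out.println(found("(?i)abc(?-i)DEF", "ABCDEF")); // true
        System.out.println(found("(?i)abc(?-i)DEF", "ABCdef")); // false
    }
}
```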
Readable patterns with COMMENTS
The COMMENTS flag (inline (?x)) lets a complex pattern breathe: unescaped whitespace is ignored and # begins a comment to end of line. This turns an unreadable one-liner into something you can maintain:
Pattern phone = Pattern.compile("""
\\d{3} # area code
- # separator
\\d{4} # line number
""", Pattern.COMMENTS);
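phone.matcher("555-1234").matches(); // true
Because real whitespace is ignored, match a literal space with an escaped space (\\ ) or a hex escape (\\x20), or use \\s if any whitespace character is acceptable. A bare [ ] does not help in Java: whitespace inside a character class is ignored in this mode as well, so the space must still be escaped there ([\\ ]).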
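To see the whitespace rule in action, the sketch below compiles a few variations of the same two-word pattern under COMMENTS. The fullMatch helper and the sample strings are hypothetical; the escapes themselves are standard Java regex syntax.

```java
import java.util.regex.Pattern;

public class Main {
    // Hypothetical helper: compile under COMMENTS and test the whole input.
    static boolean fullMatch(String regex, String input) {
        return Pattern.compile(regex, Pattern.COMMENTS).matcher(input).matches();
    }

    public static void main(String[] args) {
        // The unescaped space in the pattern is dropped, so it matches no character at all.
        System.out.println(fullMatch("hello world", "helloworld"));        // true
        System.out.println(fullMatch("hello world", "hello world"));       // false

        // An escaped space or a hex escape survives COMMENTS mode.
        System.out.println(fullMatch("hello\\ world", "hello world"));     // true
        System.out.println(fullMatch("hello \\x20 world", "hello world")); // true

        // \s also works, but it accepts tabs and other whitespace too.
        System.out.println(fullMatch("hello \\s world", "hello\tworld"));  // true
    }
}
```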
A worked example: one expression, many flags
This program runs the same handful of patterns with and without flags so you can see each flag flip the result. It counts matches case-insensitively, anchors lines with MULTILINE, spans newlines with DOTALL, combines flags with |, and uses both global and scoped inline switches.
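A minimal sketch of such a program follows. The three-line LOG string, the helper names (count, firstMatch, found), and the greeting inputs are assumptions chosen so that the output lines up with the results discussed next; treat it as one possible version rather than the definitive listing.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    // Hypothetical three-line log, chosen so every flag has something to change.
    static final String LOG = "warn: ERROR rate high\nerror: timeout\ninfo: recovered";

    // Number of non-overlapping matches of regex in input.
    static int count(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    // Text of the first match, or null if there is none.
    static String firstMatch(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        return m.find() ? m.group() : null;
    }

    static boolean found(String regex, int flags, String input) {
        return firstMatch(regex, flags, input) != null;
    }

    public static void main(String[] args) {
        // Case sensitivity: default versus CASE_INSENSITIVE.
        System.out.println(count("error", 0, LOG));                        // 1
        System.out.println(count("error", Pattern.CASE_INSENSITIVE, LOG)); // 2

        // MULTILINE: anchor to an interior line.
        System.out.println(firstMatch("^error:.*$", 0, LOG));                 // null
        System.out.println(firstMatch("^error:.*$", Pattern.MULTILINE, LOG)); // error: timeout

        // DOTALL: let the dot cross both newlines.
        System.out.println(found("warn.*info", 0, LOG));              // false
        System.out.println(found("warn.*info", Pattern.DOTALL, LOG)); // true

        // Combined mask: uppercase pattern, lowercase line start.
        int both = Pattern.MULTILINE | Pattern.CASE_INSENSITIVE;
        System.out.println(found("^ERROR", both, LOG)); // true

        // Inline switches: global, then scoped to one group.
        System.out.println("HELLO World".matches("(?i)hello world")); // true
        System.out.println("HELLO WORLD".matches("(?i:hello) WORLD")); // true
        System.out.println("HELLO world".matches("(?i:hello) WORLD")); // false
    }
}
```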
What to take from the run:
- CASE_INSENSITIVE found 2 occurrences of error (the uppercase ERROR and the lowercase error) while the default pattern found only 1: proof that case sensitivity is on unless you ask for the flag.
- MULTILINE made ^error:.*$ match the middle line of the log and print error: timeout; without the flag, ^ and $ would only anchor to the whole string's ends, so that interior line would never match.
- DOTALL let warn.*info jump across the two embedded newlines and match (true), whereas the same pattern without the flag returned false because . stops at a line terminator by default.
- The combined MULTILINE | CASE_INSENSITIVE pattern matched ^ERROR against a line that actually begins with lowercase error: (true), confirming that both flags applied at once from a single bitwise-OR mask.
- The scoped (?i:hello) WORLD matched HELLO WORLD (true) but not HELLO world (false): the (?i:...) group folded case only for hello, leaving the trailing WORLD strictly case-sensitive, which is exactly the precision inline scoping gives you.
Practice
You compile a pattern with Pattern.compile("first.*second", Pattern.MULTILINE) and match it against the text "first line\nsecond line". Why does it fail to match, and which flag would fix it?