W3docs

Java Regex Flags

Modify Java regex behavior with flags — CASE_INSENSITIVE, MULTILINE, DOTALL, COMMENTS, and inline (?i) syntax.

A flag changes how a regular expression is interpreted without touching the pattern itself. The same expression can match case-sensitively or not, treat a string as one line or many, and let . cross newlines or stop at them — all decided by flags. In Java you set them two ways: as int constants passed to Pattern.compile(pattern, flags), or as inline (?i)-style switches written inside the pattern. This chapter covers the flags you reach for daily and how to combine them.

The two ways to set a flag

Every flag has a constant in the Pattern class. Pass it as the second argument to compile:

Pattern p = Pattern.compile("error", Pattern.CASE_INSENSITIVE);

The same behavior is available inside the pattern as an inline modifier, so a bare string regex can carry its own flags with no second argument:

Pattern p = Pattern.compile("(?i)error");          // whole pattern, case-insensitive
Pattern q = Pattern.compile("(?i:error) CODE");    // only the group is case-insensitive

Inline flags are handy when the pattern travels as a plain string — a config file, a database column, an annotation — where you cannot also pass an int. The constant form is clearer when the flag is part of your code.
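As a rough illustration of a pattern that travels as a plain string, the sketch below keeps two regexes in a map that stands in for a config file. The class name, the rule names, and the map itself are assumptions made for this example; only the one-argument Pattern.compile overload is used, so any flags have to come from the string.

```java
import java.util.Map;
import java.util.regex.Pattern;

public class InlineFlagConfig {
    // Hypothetical configuration: rule name -> regex stored as a plain string.
    static final Map<String, String> RULES = Map.of(
            "anyCaseError", "(?i)error",
            "scopedError", "(?i:error) CODE");

    // One-argument compile: no int flags are passed, so only inline flags apply.
    static boolean matches(String ruleName, String input) {
        return Pattern.compile(RULES.get(ruleName)).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(matches("anyCaseError", "Fatal ERROR")); // true
        System.out.println(matches("scopedError", "Error CODE"));   // true
        System.out.println(matches("scopedError", "Error code"));   // false
    }
}
```

The last call returns false because the scoped group folds case only for error; the trailing CODE must still match exactly.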

The flags you will actually use

Constant | Inline | Effect
CASE_INSENSITIVE | (?i) | Match ASCII letters regardless of case
MULTILINE | (?m) | ^ and $ match at every line boundary, not just string ends
DOTALL | (?s) | . matches line terminators too (s = "single line")
COMMENTS | (?x) | Ignore unescaped whitespace and treat # as a comment
UNICODE_CASE | (?u) | Make CASE_INSENSITIVE fold Unicode letters, not just ASCII
UNICODE_CHARACTER_CLASS | (?U) | Make \w, \d, \b follow Unicode rules
LITERAL | (none) | Treat the whole pattern as plain text, no metacharacters

A common surprise: CASE_INSENSITIVE alone folds only ASCII. To match accented or non-Latin letters case-insensitively, combine it with UNICODE_CASE.
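Two flags from the table, UNICODE_CHARACTER_CLASS and LITERAL, are not demonstrated elsewhere in this chapter. The following is a minimal sketch of both; the class and helper names are made up for illustration, and the accented letter is written as a Unicode escape so the result does not depend on source-file encoding.

```java
import java.util.regex.Pattern;

public class LessCommonFlags {
    // True when the whole input consists of \w characters under the given flags.
    static boolean wordOnly(String input, int flags) {
        return Pattern.compile("\\w+", flags).matcher(input).matches();
    }

    // True when the pattern is found anywhere in the input under the given flags.
    static boolean contains(String pattern, String input, int flags) {
        return Pattern.compile(pattern, flags).matcher(input).find();
    }

    public static void main(String[] args) {
        String cafe = "caf\u00e9"; // "cafe" with an acute accent on the final e

        // By default \w covers only [a-zA-Z_0-9], so the accented letter fails.
        System.out.println(wordOnly(cafe, 0));                               // false
        System.out.println(wordOnly(cafe, Pattern.UNICODE_CHARACTER_CLASS)); // true

        // As a regex, 1+1 means "one or more 1s, then a 1"; LITERAL makes + plain text.
        System.out.println(contains("1+1", "1+1=2", 0));               // false
        System.out.println(contains("1+1", "1+1=2", Pattern.LITERAL)); // true
    }
}
```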

Case insensitivity

By default a regex is case-sensitive, so error does not match ERROR. Add CASE_INSENSITIVE and both match:

Pattern.compile("error").matcher("ERROR").find();                      // false
Pattern.compile("error", Pattern.CASE_INSENSITIVE).matcher("ERROR").find(); // true

// For non-ASCII letters, add UNICODE_CASE:
Pattern.compile("é", Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE)
       .matcher("É").find();                                           // true

Line handling: MULTILINE and DOTALL

These two are independent and often confused. MULTILINE changes where the anchors ^ and $ match; DOTALL changes what the dot (.) matches.

String text = "first line\nsecond line";

// Without MULTILINE, ^ matches only the very start of the input.
Pattern.compile("^second").matcher(text).find();                  // false
// With MULTILINE, ^ matches the start of every line.
Pattern.compile("^second", Pattern.MULTILINE).matcher(text).find(); // true

// Without DOTALL, . will not cross the newline.
Pattern.compile("first.*second").matcher(text).find();            // false
// With DOTALL, . matches the newline too.
Pattern.compile("first.*second", Pattern.DOTALL).matcher(text).find(); // true

Reach for MULTILINE when scanning multi-line log or document text line by line, and DOTALL when a single match must span several lines (an HTML block, a multi-line record).
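The two use cases above might look like this in practice. This is only a sketch: the sample log, the HTML fragment, and the helper names errorLines and paragraphBody are assumptions chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LineHandlingSketch {
    // MULTILINE: collect every whole line that starts with "ERROR".
    static List<String> errorLines(String log) {
        Matcher m = Pattern.compile("^ERROR.*$", Pattern.MULTILINE).matcher(log);
        List<String> lines = new ArrayList<>();
        while (m.find()) {
            lines.add(m.group());
        }
        return lines;
    }

    // DOTALL: capture the body of a <p> element even when it spans lines.
    static String paragraphBody(String html) {
        Matcher m = Pattern.compile("<p>(.*?)</p>", Pattern.DOTALL).matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String log = "INFO start\nERROR disk full\nINFO ok\nERROR timeout";
        System.out.println(errorLines(log)); // [ERROR disk full, ERROR timeout]

        String html = "<p>one\ntwo</p>";
        System.out.println(paragraphBody(html).equals("one\ntwo")); // true
    }
}
```

Note that errorLines does not need DOTALL: each match is meant to stay on one line, so the default behavior of . stopping at a line terminator is exactly what keeps the matches separate.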

Combining flags

Flag constants are bit masks, so you combine them with the bitwise OR operator |:

int flags = Pattern.MULTILINE | Pattern.CASE_INSENSITIVE | Pattern.DOTALL;
Pattern p = Pattern.compile("^error.*done$", flags);

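The inline equivalent stacks the letters: (?ims) sets all three. You can also turn a flag off with a minus: (?-i) disables case-insensitivity from that point to the end of the enclosing group (or the end of the pattern when it is not inside a group), and (?-i:...) disables it for just that group.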
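A short sketch of stacked inline flags and of switching one off part-way through a pattern. The sample text and the found helper are illustrative assumptions, not part of any API.

```java
import java.util.regex.Pattern;

public class InlineCombos {
    // Compiles with no int flags, so only inline switches take effect.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String text = "ERROR: job 7\nretry\nall DONE";

        // (?ims) is the inline form of MULTILINE | CASE_INSENSITIVE | DOTALL.
        System.out.println(found("(?ims)^error.*done$", text)); // true
        // Without flags, the case-sensitive "error" cannot match "ERROR".
        System.out.println(found("^error.*done$", text));       // false

        // (?-i) switches case-insensitivity back off from that point on.
        System.out.println(found("(?i)abc(?-i)def", "ABCdef")); // true
        System.out.println(found("(?i)abc(?-i)def", "ABCDEF")); // false
    }
}
```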

Readable patterns with COMMENTS

The COMMENTS flag (inline (?x)) lets a complex pattern breathe: unescaped whitespace is ignored and # begins a comment to end of line. This turns an unreadable one-liner into something you can maintain:

Pattern phone = Pattern.compile("""
    \\d{3}   # area code
    -        # separator
    \\d{4}   # line number
    """, Pattern.COMMENTS);
phone.matcher("555-1234").matches();   // true

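Because real whitespace is ignored, match a literal space with an escaped space (\\ followed by a space in a Java string literal), or use \\s if any whitespace character is acceptable. In Java, a bare space inside a character class is ignored under COMMENTS as well, so escape it there too: [\\ ] rather than [ ].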
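To see how a space in the pattern behaves under COMMENTS, here is a small sketch. The class and helper names are hypothetical; the patterns are deliberately tiny so the effect of each spelling is visible.

```java
import java.util.regex.Pattern;

public class CommentsLiteralSpace {
    // Full-string match with the COMMENTS flag always on.
    static boolean fullMatch(String regex, String input) {
        return Pattern.compile(regex, Pattern.COMMENTS).matcher(input).matches();
    }

    public static void main(String[] args) {
        // The bare space in the pattern is dropped, so the regex is effectively "helloworld".
        System.out.println(fullMatch("hello world", "hello world"));    // false
        System.out.println(fullMatch("hello world", "helloworld"));     // true

        // An escaped space (backslash followed by a space) survives COMMENTS mode.
        System.out.println(fullMatch("hello\\ world", "hello world"));  // true

        // The whitespace class also works, but it accepts a tab or newline too.
        System.out.println(fullMatch("hello\\sworld", "hello\tworld")); // true
    }
}
```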

A worked example: one expression, many flags

This program runs the same handful of patterns with and without flags so you can see each flag flip the result. It counts matches case-insensitively, anchors lines with MULTILINE, spans newlines with DOTALL, combines flags with |, and uses both global and scoped inline switches.
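The interactive listing is not reproduced in this text, so what follows is a reconstruction sketch rather than the original program. The class name, the helper methods, and the three-line LOG string are assumptions, chosen so that each pattern named in this section flips its result when the flag is added.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexFlagsDemo {
    // Assumed sample input: an uppercase ERROR on line 1, a lowercase error line in the middle.
    static final String LOG = "warn: saw ERROR earlier\nerror: timeout\ninfo: recovered";

    // Number of non-overlapping matches of regex in input.
    static int count(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    // Whether regex matches anywhere in input.
    static boolean found(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).find();
    }

    // Text of the first match, or null when there is none.
    static String firstMatch(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(count("error", 0, LOG));                        // 1
        System.out.println(count("error", Pattern.CASE_INSENSITIVE, LOG)); // 2

        System.out.println(firstMatch("^error:.*$", 0, LOG));                 // null
        System.out.println(firstMatch("^error:.*$", Pattern.MULTILINE, LOG)); // error: timeout

        System.out.println(found("warn.*info", 0, LOG));              // false
        System.out.println(found("warn.*info", Pattern.DOTALL, LOG)); // true

        int both = Pattern.MULTILINE | Pattern.CASE_INSENSITIVE;
        System.out.println(found("^ERROR", both, LOG)); // true

        System.out.println(found("(?i)hello world", 0, "HELLO WORLD"));   // true (global inline)
        System.out.println(found("(?i:hello) WORLD", 0, "HELLO WORLD"));  // true (scoped inline)
        System.out.println(found("(?i:hello) WORLD", 0, "HELLO world"));  // false
    }
}
```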

What to take from the run:

  • CASE_INSENSITIVE found 2 occurrences of error (the uppercase ERROR and the lowercase error) while the default pattern found only 1 — proof that case sensitivity is on unless you ask for the flag.
  • MULTILINE made ^error:.*$ match the middle line of the log and print error: timeout; without the flag, ^ and $ would only anchor to the whole string's ends, so that interior line would never match.
  • DOTALL let warn.*info jump across the two embedded newlines and match (true), whereas the same pattern without the flag returned false because . stops at a line terminator by default.
  • The combined MULTILINE | CASE_INSENSITIVE pattern matched ^ERROR against a line that actually begins with lowercase error; the true result confirms both flags applied at once from a single bitwise-OR mask.
  • The scoped (?i:hello) WORLD matched HELLO WORLD (true) but not HELLO world (false): the (?i:...) group folded case only for hello, leaving the trailing WORLD strictly case-sensitive — exactly the precision inline scoping gives you.

Practice

You compile a pattern with Pattern.compile("first.*second", Pattern.MULTILINE) and match it against the text "first line\nsecond line". Why does it fail to match, and which flag would fix it?