Capturing Groups

Now we are going to cover another useful feature of JavaScript regular expressions: capturing groups, which let you capture parts of a match and get them as separate items in the result array.

A part of a pattern can be enclosed in parentheses (...), which forms a capturing group. This has two primary effects:

  1. It allows getting a part of the match as a separate item in the result array.
  2. If a quantifier is placed after the parentheses, it applies to the parentheses as a whole.

Examples of Using Parentheses

Now, let’s see how parentheses operate.

Imagine you have the string “dododo”.

Without parentheses, the pattern do+ means the character d followed by o repeated one or more times, for example doooo or dooooooooo.

Parentheses group characters together, so (do)+ matches do, dodo, dododo, and so on, as in the example below:

Javascript regexp parentheses characters
console.log('Dododo'.match(/(do)+/i)); // index 0 holds the full match "Dododo", index 1 holds the group's last repetition "do"
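
To make the difference concrete, here is a small sketch that runs both patterns side by side. The sample strings and variable names are made up for illustration.

```javascript
// Without parentheses, + applies only to the preceding character "o".
const withoutGroup = 'dooodo'.match(/do+/)[0];

// With parentheses, + applies to the whole group "do".
const withGroup = 'dododo!'.match(/(do)+/)[0];

console.log(withoutGroup); // dooo
console.log(withGroup);    // dododo
```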

Domain

Now, let’s try to look for a website domain using a regular expression.

For instance:

email.com

users.email.com

roberts.users.email.com

So, the domain here consists of repeated words, with a dot after each of them except the last one.

In regular expressions, that is (\w+\.)+\w+:

Javascript regexp parentheses characters
let regexp = /(\w+\.)+\w+/g;
console.log("email.com my.email.com".match(regexp)); // email.com,my.email.com

The search works, but the pattern cannot match a domain with a hyphen, because the hyphen does not belong to the \w class.

It can be fixed by replacing \w with [\w-] in each word except for the last one: ([\w-]+\.)+\w+.
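
The effect of the fix can be checked with a short sketch. The sample text and the variable names are assumptions chosen for illustration only.

```javascript
const plainWord = /(\w+\.)+\w+/g;      // original pattern
const withHyphen = /([\w-]+\.)+\w+/g;  // pattern that also allows hyphens
const text = 'my-site.com and email.com'; // hypothetical sample text

const before = text.match(plainWord);
const after = text.match(withHyphen);

console.log(before); // [ 'site.com', 'email.com' ]
console.log(after);  // [ 'my-site.com', 'email.com' ]
```

The original pattern still finds something in the hyphenated domain, but it silently cuts off the part before the hyphen.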

Email

Let’s create a regular expression for emails, based on the previous example. The email format is name@domain. The name can be any word, and hyphens and dots are also allowed. In regexp, it looks like this: [-.\w]+.

The pattern will be as follows:

Javascript regexp parentheses characters
let regexp = /[-.\w]+@([\w-]+\.)+[\w-]+/g;
console.log("[email protected] @ [email protected]".match(regexp)); // [email protected], [email protected]

This regexp is not perfect, but it mostly works and helps to catch accidental mistypes.
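
As a rough illustration, the pattern can be tried on a string that contains two addresses and a stray @ sign. The addresses below are made-up placeholders on example domains, not taken from the original text.

```javascript
const emailRegexp = /[-.\w]+@([\w-]+\.)+[\w-]+/g;

// Hypothetical sample: two placeholder addresses separated by a lone "@".
const sample = 'ann.lee@example.com @ bob@mail.example.org';
const emails = sample.match(emailRegexp);

console.log(emails.length); // 2
console.log(emails[0]);     // ann.lee@example.com
console.log(emails[1]);     // bob@mail.example.org
```

The lone @ is skipped because the pattern requires at least one name character before it.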

Parentheses Contents in the Match

Parentheses are numbered from left to right. The engine remembers the content matched by each of them and makes it available in the result.

If the regexp has no g flag, the str.match(regexp) method searches for the first match and returns it as an array:

  1. At the 0 index: the full match.
  2. At the 1 index: the contents of the first parentheses.
  3. At the 2 index: the contents of the second parentheses.

Let’s consider finding HTML tags and processing them, as an example.

As a first step, wrap the tag content in parentheses, as follows: <(.*?)>.

You will then get both the tag <p> as a whole and its contents p in the resulting array, like this:

Javascript regexp the content into parentheses
let str = '<p>Welcome to W3Docs</p>';
let tag = str.match(/<(.*?)>/);
alert(tag[0]); // <p>
alert(tag[1]); // p
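
The tag example has only one group. A pattern with two groups, sketched below on a made-up key=value string, shows index 2 being filled as well. The string and the variable name are assumptions for illustration.

```javascript
// Two capturing groups: the key and the value around "=".
const pair = 'color=red'.match(/(\w+)=(\w+)/);

console.log(pair[0]); // color=red (full match)
console.log(pair[1]); // color (first parentheses)
console.log(pair[2]); // red (second parentheses)
```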

Nested Groups

Parentheses can be nested. In that case, the numbering also goes from left to right.

When searching for a tag such as <p class="myClass">, you may be interested in the whole tag content (p class="myClass"), the tag name (p), and the tag attributes (class="myClass").

Adding parentheses for each of them gives: <(([a-z]+)\s*([^>]*))>

Here is how it works:

Javascript regexp adding parentheses
let str = '<p class="myClass">';
let regexp = /<(([a-z]+)\s*([^>]*))>/;
let res = str.match(regexp);
alert(res[0]); // <p class="myClass">
alert(res[1]); // p class="myClass"
alert(res[2]); // p
alert(res[3]); // class="myClass"

The zero index of the result always holds the full match.

The first group is returned as res[1]. It encloses the tag content as a whole.

Then res[2] holds the group from the second opening paren, ([a-z]+), which is the name of the tag, and res[3] holds the attributes: ([^>]*).
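
Since the result is an ordinary array, the nested groups can also be unpacked with array destructuring. The sketch below assumes a different sample tag and hypothetical variable names.

```javascript
const tagRegexp = /<(([a-z]+)\s*([^>]*))>/;

// Order follows the opening parens: full match, whole content, name, attributes.
const [fullTag, tagContent, tagName, tagAttrs] = '<a href="/home">'.match(tagRegexp);

console.log(fullTag);    // <a href="/home">
console.log(tagContent); // a href="/home"
console.log(tagName);    // a
console.log(tagAttrs);   // href="/home"
```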

Optional Groups

Even if a group is optional and does not take part in the match, the corresponding result array item is still present and equals undefined.

For example, consider the regular expression a(z)?(c)?. Running it on a string that consists of the single letter a gives this result:

Javascript regexp optional groups
let m = 'a'.match(/a(z)?(c)?/);
console.log(m.length); // 3
console.log(m[0]); // a (whole match)
console.log(m[1]); // undefined
console.log(m[2]); // undefined

The length of the array is 3, but both groups are undefined.
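
A slightly longer input shows a mix of matched and unmatched optional groups. This is a sketch with the same pattern on the string ac, where the variable name is an assumption.

```javascript
// (z)? finds nothing, (c)? matches "c".
const optional = 'ac'.match(/a(z)?(c)?/);

console.log(optional.length); // 3
console.log(optional[0]);     // ac (whole match)
console.log(optional[1]);     // undefined, because there is no z
console.log(optional[2]);     // c
```

The array length stays the same, so group numbers never shift when an optional group is missing.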

Searching for All Matches: matchAll

First of all, note that matchAll is a relatively new method and is not supported by old browsers, so a polyfill may be required.

When searching for all matches (g flag), the match method does not return the contents of the groups.

In the example below, you can see an attempt to find all tags in a string:

Javascript regexp search all matches
let str = '<p> <span>';
let tags = str.match(/<(.*?)>/g);
alert(tags); // <p>,<span>

The result is an array of matches, but without details about each of them.

But usually we need the contents of the capturing groups in the result.

To get them, use the str.matchAll(regexp) method, which was added to JavaScript long after the match method. One important difference is that it returns an iterable object rather than an array. When the g flag is present, it returns every match as an array with groups. If there are no matches, it returns an empty iterable object rather than null.

Here is an example:

Javascript regexp search all matches
let result = '<p> <span>'.matchAll(/<(.*?)>/gi);
// result isn't an array, but an iterable object
console.log(result); // [object RegExp String Iterator]
console.log(result[0]); // undefined
result = Array.from(result); // let's turn it into an array
alert(result[0]); // <p>,p (1st tag)
alert(result[1]); // <span>,span (2nd tag)
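
Converting to an array is not required. Because the result is iterable, it can be consumed directly with for..of. The sketch below also checks the no-match case; the variable names and the second sample string are assumptions.

```javascript
// Collect only the group contents (tag names) while iterating.
const tagNames = [];
for (const match of '<p> <span>'.matchAll(/<(.*?)>/g)) {
  tagNames.push(match[1]);
}
console.log(tagNames); // [ 'p', 'span' ]

// With no matches the iterable is simply empty, not null.
const noMatches = Array.from('plain text'.matchAll(/<(.*?)>/g));
console.log(noMatches.length); // 0
```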

Named Groups

Remembering groups by their numbers is hard. It is doable for simple patterns, but counting parentheses is inconvenient for complex ones. A better option is to give names to the parentheses.

You can do it by putting ?<name> right after the opening paren.

Here is an example of searching for a date:

Javascript regexp named groups
let dateRegexp = /(?<year>[0-9]{4})-(?<month>[0-9]{2})-(?<day>[0-9]{2})/;
let str = "2020-04-20";
let groups = str.match(dateRegexp).groups;
console.log(groups.year); // 2020
console.log(groups.month); // 04
console.log(groups.day); // 20

The groups reside in the .groups property of the match. To search for all dates, add the g flag.

In that case, matchAll is also needed to obtain the full matches together with the groups, like this:

Javascript regexp named groups all matches
let dateRegexp = /(?<year>[0-9]{4})-(?<month>[0-9]{2})-(?<day>[0-9]{2})/g;
let str = "2020-04-30 2020-10-01";
let results = str.matchAll(dateRegexp);
for (let result of results) {
  let { year, month, day } = result.groups;
  console.log(`${day}.${month}.${year}`); // 30.04.2020, then 01.10.2020
}
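
Naming a group does not remove its number. As a small sketch (the variable names are assumptions), the same value can be read either way:

```javascript
const isoDate = /(?<year>[0-9]{4})-(?<month>[0-9]{2})-(?<day>[0-9]{2})/;
const dateMatch = '2020-04-20'.match(isoDate);

// Named groups are still numbered from left to right.
console.log(dateMatch[1], dateMatch.groups.year);  // 2020 2020
console.log(dateMatch[2], dateMatch.groups.month); // 04 04
console.log(dateMatch[3], dateMatch.groups.day);   // 20 20
```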

Capturing Groups in the Replacement

The str.replace(regexp, replacement) method, which replaces matches of regexp in str (all of them when the g flag is set), allows using parentheses contents in the replacement string. This is done with $n, where n is the group number.

For instance:

Javascript regexp capturing groups
let str = "John Smith";
let regexp = /(\w+) (\w+)/;
console.log(str.replace(regexp, '$2, $1')); // Smith, John
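
Whether replace touches one match or every match depends on the g flag. The following sketch, using a made-up list of names, compares the two cases:

```javascript
const names = 'John Smith, Ann Lee'; // hypothetical sample input

// Without g only the first match is replaced.
const firstOnly = names.replace(/(\w+) (\w+)/, '$2 $1');

// With g every match is replaced, and $1/$2 refer to the current match.
const allPairs = names.replace(/(\w+) (\w+)/g, '$2 $1');

console.log(firstOnly); // Smith John, Ann Lee
console.log(allPairs);  // Smith John, Lee Ann
```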

For named parentheses, the reference is $<name>.

Here is an example:

Javascript regexp named parentheses
let regexp = /(?<year>[0-9]{4})-(?<month>[0-9]{2})-(?<day>[0-9]{2})/g;
let str = "2020-03-30, 2020-10-01";
console.log(str.replace(regexp, '$<day>.$<month>.$<year>')); // 30.03.2020, 01.10.2020

Summary

A part of a pattern may be enclosed in parentheses. This is known as a capturing group. Parentheses groups are numbered from left to right, and they can optionally be named with (?<name>...).

The str.match method returns capturing groups only when the g flag is not set. The str.matchAll method always returns capturing groups.

Also, parentheses contents can be used in the replacement strings in str.replace.
