It has two primary effects:
- Allows getting a part of the match as a separate item in the result array.
- If a quantifier is placed after the parentheses, it applies to the parentheses as a whole.
Examples of Using Parentheses
Now, let’s see how parentheses operate.
Imagine you have the string “dododo”.
Without parentheses, the pattern do+ means the character d followed by o repeated one or more times, for example doooo or dooooooooo.
With parentheses, the characters are grouped together, so (do)+ matches do, dodo, dododo, and so on.
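A minimal sketch of the difference, using the sample string from above (the variable names are illustrative only):

```javascript
// With parentheses, + repeats the whole group "do".
const grouped = "dododo".match(/(do)+/);
console.log(grouped[0]); // "dododo"

// Without parentheses, + applies to the letter "o" only.
const ungrouped = "dododo".match(/do+/);
console.log(ungrouped[0]); // "do"
```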
Now, let’s try to look for a website domain using a regular expression.
A domain consists of repeated words, with a dot after each of them except the last one.
As a regular expression, that is (\w+\.)+\w+.
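As a rough illustration, the pattern can be tried on a couple of made-up domain names (the sample string is an assumption chosen for this sketch):

```javascript
// One or more "word." groups, followed by a final word.
const domainPattern = /(\w+\.)+\w+/g;

const domains = "site.com my.site.com".match(domainPattern);
console.log(domains); // ["site.com", "my.site.com"]
```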
The search works, but the pattern cannot match a domain with a hyphen, because the hyphen does not belong to the \w class.
It can be fixed by replacing \w with [\w-] in each word except for the last one: ([\w-]+\.)+\w+.
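The effect of the fix can be sketched by running both patterns on a hyphenated name (my-site.com is a made-up example):

```javascript
const hyphenated = "my-site.com";

// The original pattern skips the part before the hyphen.
const withoutHyphen = hyphenated.match(/(\w+\.)+\w+/g);
console.log(withoutHyphen); // ["site.com"]

// The fixed pattern accepts hyphens in every word except the last one.
const withHyphen = hyphenated.match(/([\w-]+\.)+\w+/g);
console.log(withHyphen); // ["my-site.com"]
```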
Let’s create a regular expression for emails, based on the previous example. The format of an email is name@domain. Any word can be the name, and hyphens and dots are also allowed. As a regexp, the name part looks like this: [-.\w]+.
The pattern will be as follows:
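One plausible way to assemble the full pattern is to join the name part [-.\w]+, an @ sign, and the domain pattern from the previous example. The exact combination and the sample string below are assumptions made for this sketch:

```javascript
// name part + "@" + domain part (hyphens allowed in all but the last word)
const emailPattern = /[-.\w]+@([\w-]+\.)+\w+/g;

const emails = "my@mail.com @ his@site.com.uk".match(emailPattern);
console.log(emails); // ["my@mail.com", "his@site.com.uk"]
```

A lone @ surrounded by spaces is skipped, because the name part requires at least one character.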
This regexp is not perfect, but it mostly works and helps to catch accidental mistypes.
Parentheses Contents in the Match
Parentheses are numbered from left to right. The engine remembers the content matched by each of them, so that it can be retrieved from the result.
The str.match(regexp) method searches for the first match and returns it as an array (when the regexp has no g flag):
- At index 0: the full match.
- At index 1: the contents of the first parentheses.
- At index 2: the contents of the second parentheses.
As an example, let’s consider finding HTML tags and processing them.
As a first step, wrap the tag content in parentheses, as follows: <(.*?)>.
The resulting array then holds both the tag <p> as a whole and its contents p.
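A short sketch of this lookup (the sample HTML string is an assumption):

```javascript
const html = "<p>Hello, world!</p>";

// No g flag: match returns the full match plus the group contents.
const tag = html.match(/<(.*?)>/);

console.log(tag[0]); // "<p>"
console.log(tag[1]); // "p"
```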
Parentheses might be nested. In that case, the numbering goes from left to right, too.
When searching for a tag in <p class="myClass">, you may be interested in the whole tag content (p class="myClass"), the tag name (p), and the tag attributes (class="myClass").
With parentheses added for each of them, the pattern becomes <(([a-z]+)\s*([^>]*))>.
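Applied to the tag above, the nested groups come out as sketched here (res is simply the name chosen for the result array):

```javascript
const tagSource = '<p class="myClass">';

const res = tagSource.match(/<(([a-z]+)\s*([^>]*))>/);

console.log(res[0]); // '<p class="myClass">'  full match
console.log(res[1]); // 'p class="myClass"'    whole tag content
console.log(res[2]); // 'p'                    tag name
console.log(res[3]); // 'class="myClass"'      tag attributes
```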
The zero index of the result always holds the full match.
Then come the groups, numbered from left to right by their opening parenthesis. The first group is returned as res[1]; it encloses the tag content as a whole.
Next, res[2] holds the group from the second opening parenthesis, ([a-z]+), which is the tag name, and res[3] holds the tag attributes: ([^>]*).
Even if an optional group does not take part in the match, the corresponding item is still present in the result array and equals undefined.
For example, consider the regular expression a(z)?(c)?, run on a string that consists of the single letter a.
The length of the resulting array is 3, but both groups are undefined.
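This behavior can be checked with a few lines:

```javascript
const optional = "a".match(/a(z)?(c)?/);

console.log(optional.length); // 3
console.log(optional[0]);     // "a"
console.log(optional[1]);     // undefined
console.log(optional[2]);     // undefined
```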
Searching for All Matches: matchAll
Note that matchAll is a relatively new method and is not supported by old browsers, so a polyfill may be required.
When searching for all matches (g flag), the match method does not return the contents of the groups.
Consider an attempt to find all tags in a string such as <h1> <h2>.
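A minimal sketch of such an attempt, using a made-up string with two tags:

```javascript
// With the g flag, match returns only the full matches.
const tagsOnly = "<h1> <h2>".match(/<(.*?)>/g);

console.log(tagsOnly); // ["<h1>", "<h2>"]
```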
The result is an array of matches, but without any details about each of them.
In practice, however, the contents of the capturing groups are usually needed in the result. The str.matchAll(regexp) method provides them.
Here is an example:
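The sketch below runs matchAll on the same made-up string. Because matchAll returns an iterator rather than an array, Array.from is used here to collect the results:

```javascript
const tagMatches = Array.from("<h1> <h2>".matchAll(/<(.*?)>/g));

// Every item is a full match array, including the group contents.
console.log(tagMatches[0][0]); // "<h1>"
console.log(tagMatches[0][1]); // "h1"
console.log(tagMatches[1][0]); // "<h2>"
console.log(tagMatches[1][1]); // "h2"
```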
It is not easy to remember groups by their numbers. That is feasible for simple patterns, but counting parentheses is inconvenient for complex ones. A better option is to give the groups names.
A group is named by putting ?<name> right after the opening parenthesis.
Here is an example of searching for a date:
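A possible version of the date search, assuming the year-month-day format and the group names year, month, and day:

```javascript
const dateRegexp = /(?<year>[0-9]{4})-(?<month>[0-9]{2})-(?<day>[0-9]{2})/;

const dateGroups = "2019-04-30".match(dateRegexp).groups;

console.log(dateGroups.year);  // "2019"
console.log(dateGroups.month); // "04"
console.log(dateGroups.day);   // "30"
```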
The groups reside in the .groups property of the match. To search for all dates, the g flag can be added.
In that case, matchAll is needed to obtain the full matches along with the groups.
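As a sketch, the loop below collects every date from a made-up string and reformats it with the named groups (the output format day.month.year is an arbitrary choice):

```javascript
const allDatesRegexp = /(?<year>[0-9]{4})-(?<month>[0-9]{2})-(?<day>[0-9]{2})/g;
const dateText = "2019-10-30 2020-01-01";

const formattedDates = [];
for (const match of dateText.matchAll(allDatesRegexp)) {
  // Each match carries its own .groups object.
  const { year, month, day } = match.groups;
  formattedDates.push(`${day}.${month}.${year}`);
}

console.log(formattedDates); // ["30.10.2019", "01.01.2020"]
```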
Capturing Groups in the Replacement
The str.replace(regexp, replacement) method, which replaces matches of regexp in str with replacement (all of them when the g flag is set), lets you use parentheses contents in the replacement string. This is done with $n, where n is the group number.
For named parentheses, the reference is $<name>.
Here is an example:
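Both kinds of references are sketched below; the sample strings and group names are assumptions made for illustration:

```javascript
// Numbered references: swap the first and the last name.
const swapped = "John Bull".replace(/(\w+) (\w+)/, "$2, $1");
console.log(swapped); // "Bull, John"

// Named references: change the date format for every match.
const isoDate = /(?<year>[0-9]{4})-(?<month>[0-9]{2})-(?<day>[0-9]{2})/g;
const reformatted = "2019-10-30, 2020-01-01".replace(isoDate, "$<day>.$<month>.$<year>");
console.log(reformatted); // "30.10.2019, 01.01.2020"
```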
A part of a pattern can be enclosed in parentheses; this is known as a capturing group. Groups are numbered from left to right by their opening parenthesis, and they can optionally be named with (?<name>...).
The str.match method returns capturing groups only when the g flag is absent. The str.matchAll method always returns capturing groups.
Also, parentheses contents can be used in the replacement strings in str.replace.