Backreferences
Reference a previously captured group inside the same pattern to match repeated or mirrored text, and use group references in replacement strings.
- Write a backreference with \1 or \k<name> to match a repeated captured value
- Use backreferences in a replacement string with $1 or $<name> (JavaScript), or \1 or \g<name> (Python)
- Identify practical use cases such as finding duplicate words
A backreference lets you reuse the actual text matched by a capturing group later in the same pattern. Instead of matching a fixed string, it matches whatever group N happened to capture. This enables patterns that are impossible with character classes and quantifiers alone.
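To make the difference concrete, here is a minimal sketch contrasting a second capturing group with a backreference. The names anyTwoDigits, sameDigitTwice, samples, and comparison are illustrative only.

```javascript
// Two independent groups: each \d may match a different digit.
const anyTwoDigits = /^(\d)(\d)$/;
// One group plus a backreference: the second digit must equal the first.
const sameDigitTwice = /^(\d)\1$/;

const samples = ["11", "12", "77"];
const comparison = samples.map((s) => ({
  input: s,
  anyTwoDigits: anyTwoDigits.test(s),
  sameDigitTwice: sameDigitTwice.test(s),
}));

console.log(comparison);
// "11" and "77" pass both patterns; "12" passes only anyTwoDigits.
```

Repeating the group repeats the rule (any digit), while the backreference repeats the captured text itself.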
Backreferences in patterns
Inside a regex pattern, \1 refers to the text matched by group 1, \2 to group
2, and so on:
// Find repeated characters: "aa", "bb", "cc"…
/(.)\1/.test("aardvark"); // true: "aa"
/(.)\1/.test("world"); // false: no adjacent repeated character
/(.)\1/g.exec("bookkeeper")[0]; // "oo"
The (.) captures any single character, and \1 requires the same character
to appear again immediately after.
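Higher-numbered references work the same way. As a small sketch, the pattern below combines \1 and \2 to match four-character mirrored text; the names mirrored, isAbba, isNoon, isAbab, and found are illustrative only.

```javascript
// (\w)(\w) captures two characters; \2\1 requires them again in reverse order.
const mirrored = /(\w)(\w)\2\1/;

const isAbba = mirrored.test("abba"); // true
const isNoon = mirrored.test("noon"); // true
const isAbab = mirrored.test("abab"); // false: the second half is not reversed

// exec exposes the whole match and each captured group.
const found = mirrored.exec("an otto engine");
console.log(found[0], found[1], found[2], found.index);
// otto o t 3
```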
The classic duplicate-word pattern
const dupeWord = /\b(\w+)\s+\1\b/i;
dupeWord.test("the the quick"); // true: "the the"
dupeWord.test("the quick brown"); // false: no duplicate
dupeWord.test("It is is wrong"); // true: "is is"
\b(\w+) captures a whole word, \s+ allows one or more whitespace characters
(spaces, tabs, or line breaks) between, then \1 requires the same word again.
The i flag makes the match case-insensitive, so "The the" is also caught.
Backreferences and HTML-like patterns
A classic use case: match a simple opening and closing tag where the tag names must match:
/<(\w+)>.*?<\/\1>/s.test("<p>Hello</p>"); // true
/<(\w+)>.*?<\/\1>/s.test("<p>Hello</div>"); // false — tag names differ
/<(\w+)>.*?<\/\1>/s.test("<h2>Title</h2>"); // true
Group 1 captures the opening tag name. \1 in the closing tag requires the exact
same name.
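This trick works only for simple tags that have no attributes and do not contain another tag of the same name. With same-name nesting, as in <div><div>text</div></div>, the lazy .*? stops at the first </div> and the match ends too early. Real HTML parsing
requires a proper parser.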
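The limits are easy to observe directly. The sketch below reuses the same pattern under the illustrative name tagPair and prints what it actually matches in three cases; the sample strings are made up for the demonstration.

```javascript
const tagPair = /<(\w+)>.*?<\/\1>/s;

// Different tag names nested: the outer pair is still matched as a whole.
const mixed = "<div><p>text</p></div>".match(tagPair)[0];

// Same tag name nested: the lazy .*? stops at the first </div>,
// so the match ends before the outer closing tag.
const sameName = "<div><div>text</div></div>".match(tagPair)[0];

// Attributes: <(\w+)> expects ">" right after the tag name, so this fails.
const withAttr = tagPair.test('<p class="x">Hi</p>');

console.log(mixed);    // <div><p>text</p></div>
console.log(sameName); // <div><div>text</div>
console.log(withAttr); // false
```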
Named backreferences: \k<name>
When using named groups, reference them with \k<name>:
const dupeNamed = /\b(?<word>\w+)\s+\k<word>\b/i;
dupeNamed.test("the the quick"); // true
Named backreferences are more readable when the pattern is complex and group numbering is hard to track.
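As a rough sketch of how the named form reads in practice, the helper below lists every immediately repeated word in a text. findDuplicateWords is a hypothetical function written for this example, not part of any library, and the sample sentence is invented.

```javascript
// Hypothetical helper: list every immediately repeated word in a text.
function findDuplicateWords(text) {
  // Named group "word" plus \k<word>; matchAll requires the g flag.
  const dupeNamedAll = /\b(?<word>\w+)\s+\k<word>\b/gi;
  return Array.from(text.matchAll(dupeNamedAll), (m) => ({
    word: m.groups.word, // same text as m[1]
    index: m.index, // where the repeated pair starts
  }));
}

const duplicates = findDuplicateWords("It is is wrong that the the cat sat.");
console.log(duplicates);
// [ { word: 'is', index: 3 }, { word: 'the', index: 20 } ]
```

Reading the capture through m.groups.word keeps the callback independent of group numbering, which is the same benefit \k<word> gives inside the pattern.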
In Python: (?P=name) is the syntax for a named backreference:
import re
re.search(r"\b(?P<word>\w+)\s+(?P=word)\b", "the the quick", re.I)
Backreferences in replacement strings
In String.prototype.replace, backreferences in the replacement string let you
reorder or repeat captured content:
// Swap first and last name
"Smith, John".replace(/(\w+), (\w+)/, "$2 $1");
// "John Smith"
// Surround matched words with emphasis
"hello world".replace(/(\w+)/g, "**$1**");
// "**hello** **world**"
// Reformat ISO date to US format
"2024-03-15".replace(/(\d{4})-(\d{2})-(\d{2})/, "$2/$3/$1");
// "03/15/2024"
With named groups, use $<name> in the replacement:
"2024-03-15".replace(
/(?<y>\d{4})-(?<m>\d{2})-(?<d>\d{2})/,
"$<m>/$<d>/$<y>"
);
// "03/15/2024"
In Python, replacements use \1 or \g<1> (numbered) and \g<name> (named):
import re
re.sub(r"(\w+), (\w+)", r"\2 \1", "Smith, John")
# "John Smith"
Backreferences and repeated structure
Backreferences can enforce structural symmetry in patterns:
// Match strings surrounded by the same delimiter (' or ")
const quoted = /^(['"]).*\1$/;
quoted.test('"hello"'); // true: both double quotes
quoted.test("'hello'"); // true: both single quotes
quoted.test('"hello\''); // false: mismatched quotes
Where to go next
Next: the Groups lab — apply capturing groups, named groups, non-capturing groups, and backreferences to realistic text extraction and transformation tasks.