Regex in tooling
Use regex in grep, ripgrep, sed, VS Code, git, and PostgreSQL — concrete one-liners a developer uses every day.
- Use grep and ripgrep with PCRE and extended-regex flags for code and log search
- Write sed substitutions with capture group references
- Apply regex in VS Code find/replace including multi-line replacements
- Query PostgreSQL with the ~ operator and regexp_matches
- Use git log --grep to filter commits by message content
Most developers spend more time in terminals, editors, and databases than writing regex in code. This lesson covers the regex surface area you will encounter in those environments — with concrete one-liners you can adapt immediately.
grep and ripgrep
Basic extended regex (-E)
grep -E enables ERE (Extended Regular Expressions), which gives you +, ?,
|, and () without escaping:
# Find all lines containing an IP-like pattern in a log file
grep -E '\b[0-9]{1,3}(\.[0-9]{1,3}){3}\b' access.log
# Find function definitions in JavaScript files
grep -rE 'function\s+\w+\s*\(' src/
PCRE mode (-P)
GNU grep's -P flag enables PCRE, giving you lookaheads, lookbehinds, \d,
lazy quantifiers, and named groups (\w and \s already work with -E in GNU grep):
# Find lines with a status code of 4xx or 5xx
grep -P '(?<= )[45]\d{2}(?= )' access.log
# Find TODO comments that have a GitHub issue number
grep -rP 'TODO.*#\d+' src/
ripgrep (rg)
ripgrep defaults to its own RE2-like engine (no backreferences or lookarounds).
Use --pcre2 to enable full PCRE2:
# Search recursively, default engine (no lookarounds)
rg '\b\d{4}-\d{2}-\d{2}\b' logs/
# PCRE2 mode — lookahead available
rg --pcre2 '\d+(?= USD)' prices.txt
# Show filenames only (-l), case-insensitive (-i)
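rg -l -i 'password|secret|token' config/
ripgrep respects .gitignore by default and skips hidden and binary files, so it
usually has less to scan than a plain grep -r on a code tree. The --type flag
limits the search to a named file type: rg --type js 'import.*react' searches
only files that ripgrep classifies as JavaScript (.js, .jsx, and related
extensions; rg --type-list shows the full mapping).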
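GNU grep's -P is not present in every grep build (BSD grep on macOS lacks it, for instance). As a rough sketch, the 4xx/5xx status-code search from the PCRE mode examples can be approximated in plain ERE by matching the surrounding spaces instead of asserting them with lookarounds. The log lines and variable names below are invented for illustration.

```shell
# Invented sample lines in a common access-log shape (not from a real server).
sample_log='10.0.0.1 - - "GET /index.html HTTP/1.1" 200 5120
10.0.0.2 - - "GET /missing HTTP/1.1" 404 312
10.0.0.3 - - "POST /login HTTP/1.1" 500 87
10.0.0.4 - - "GET /style.css HTTP/1.1" 304 0'

# ERE version: the spaces are part of the match rather than lookaround assertions.
# The same lines are selected; only the matched text (visible with -o) differs.
error_lines=$(printf '%s\n' "$sample_log" | grep -E ' [45][0-9]{2} ')

# -c counts matching lines instead of printing them.
error_count=$(printf '%s\n' "$sample_log" | grep -cE ' [45][0-9]{2} ')

printf '%s\n' "$error_lines"
printf 'error lines: %s\n' "$error_count"
```

Because the spaces are consumed by the match, this form suits line selection; extracting only the status code with -o would still need lookarounds or a second step.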
sed
sed uses POSIX BRE by default. The -E flag (or -r on older GNU sed)
enables ERE so you don't need to escape + and ().
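To make the difference concrete, here is a small sketch that runs the same substitution twice, once in BRE and once in ERE. The input line and variable names are made up for the example. It uses groups and an interval because those are escaped the same way in every POSIX BRE implementation.

```shell
# Made-up input line for illustration.
line='error code 404 at 12:30'

# BRE (default): groups need \( \) and the interval needs \{ \}.
bre_out=$(printf '%s\n' "$line" | sed 's/code \([0-9]\{3\}\)/status=\1/')

# ERE (-E): the same metacharacters are written bare.
ere_out=$(printf '%s\n' "$line" | sed -E 's/code ([0-9]{3})/status=\1/')

printf '%s\n%s\n' "$bre_out" "$ere_out"
```

Both commands produce the same line; the replacement side (\1) is identical in either mode.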
Basic substitution
# Replace first occurrence of "colour" with "color" on each line
sed 's/colour/color/' file.txt
# Replace all occurrences (g flag)
sed 's/colour/color/g' file.txt
Capture group references
sed uses \1, \2, etc. to reference captured groups in the replacement:
# Reformat dates from YYYY-MM-DD to DD/MM/YYYY
sed -E 's/([0-9]{4})-([0-9]{2})-([0-9]{2})/\3\/\2\/\1/' dates.txt
# Remove any whitespace between a function name and its opening parenthesis
sed -E 's/function ([a-zA-Z_][a-zA-Z0-9_]*)\s*\(/function \1(/' src.js
In-place editing
# Edit file in place (make a backup with -i.bak)
sed -i.bak 's/localhost/db.internal/g' config.env
# Delete lines matching a pattern
sed -i '/^\s*#/d' config.env # remove comment lines
VS Code find and replace
VS Code's find panel supports regex when you click the .* icon (or press
Alt+R; Option+Cmd+R on macOS). Capture groups are referenced as $1, $2
(not \1) in the replacement field.
Example: rename a function across a codebase
Find:
function getUserData\((\w+)\)
Replace with:
function fetchUser($1)
Example: convert double-quoted strings to template literals
Find:
"([^"]*\$\{[^"]*)"
Replace with:
`$1`
Multi-line matching
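VS Code has no separate multi-line toggle. With regex mode on, ^ and $ anchor
at every line boundary, and a pattern that contains \n matches across lines
(press Ctrl+Enter in the find box to type a literal newline). The pattern
^(\s+) matches leading whitespace on each line, useful for bulk indentation
changes.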
VS Code also supports regex in the Search panel (Ctrl+Shift+F / Cmd+Shift+F).
Use the same syntax for project-wide substitutions. The "Preserve case" option
(accessible via the AB icon) lets replacements follow the case of the original
match — useful when renaming identifiers that appear in both camelCase and
PascalCase contexts.
git log --grep
git log --grep filters commits whose messages match a pattern:
# Find commits mentioning a bug fix for issue 42
git log --grep='#42'
# Find all commits touching authentication (case-insensitive)
git log --grep='auth' -i
# Basic regex is the default, so grouping and ? need backslashes (GNU-style escapes)
git log --grep='fix\(es\)\?'
# Extended regex with -E (same as --extended-regexp): match "fix", "fixes", or "fixed"
git log -E --grep='fix(es|ed)?'
# Combine --grep with --author and a date range
git log --grep='deploy' --author='alice' --since='2024-01-01'
Note: git log --grep matches the commit message only, not the code
changes. To search code in commits, use git log -S (pickaxe) or git log -G
(regex on the diff).
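The difference between the three searches is easiest to see on a tiny repository. The sketch below builds a throwaway one; the file name, contents, commit messages, and the g helper are all hypothetical. It assumes a git recent enough to support -C.

```shell
# Throwaway repository; every name and message here is made up.
repo=$(mktemp -d)
g() { git -C "$repo" -c user.name='Dev' -c user.email='dev@example.com' "$@"; }

g init -q
printf 'timeout = 30\n' > "$repo/app.conf"
g add app.conf
g commit -q -m 'Add config file'

printf 'timeout = 60\n' > "$repo/app.conf"
g commit -q -am 'Tune request settings'

# --grep searches commit messages only: no message mentions "timeout".
by_message=$(g log --format=%s --grep='timeout')

# -S lists commits that change how many times the string occurs.
by_pickaxe=$(g log --format=%s -S'timeout')

# -G lists commits whose added or removed lines match the regex.
by_diff=$(g log --format=%s -G'timeout = [0-9][0-9]*')

printf 'grep: [%s]\n-S: [%s]\n-G: [%s]\n' "$by_message" "$by_pickaxe" "$by_diff"
rm -rf "$repo"
```

Here --grep finds nothing, -S reports only the commit that introduced the string, and -G reports both commits because each one adds or removes a matching line.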
PostgreSQL
The ~ operator
PostgreSQL uses ~ for case-sensitive regex match and ~* for case-insensitive.
The negations are !~ and !~*:
-- Find all users with a .edu email address
SELECT email FROM users WHERE email ~* '\.edu$';
-- Find product codes matching a pattern
SELECT sku FROM products WHERE sku ~ '^[A-Z]{2}-[0-9]{4}-[A-Z]{3}$';
-- Exclude rows where the log message contains an IP address
SELECT * FROM event_log WHERE message !~ '[0-9]{1,3}(\.[0-9]{1,3}){3}';
regexp_matches() and regexp_replace()
regexp_matches returns an array of captured groups for each match:
-- Extract host from each URL in a table
SELECT regexp_matches(url, 'https?://([^/]+)', 'g') AS host
FROM page_views;
-- Returns: ARRAY['example.com'], ARRAY['api.example.com'], etc.
The third argument is the flags string: 'g' for global (all matches),
'i' for case-insensitive.
regexp_replace works like a SQL-flavoured sed:
-- Normalise phone numbers to digits only
UPDATE contacts
SET phone = regexp_replace(phone, '[^0-9]', '', 'g');
-- Replace multiple spaces with a single space in descriptions
UPDATE products
SET description = regexp_replace(description, '\s{2,}', ' ', 'g');
regexp_split_to_table and regexp_split_to_array
-- Split a comma-or-semicolon-separated tag list into rows
SELECT regexp_split_to_table(tags, '[,;]\s*') AS tag
FROM articles
WHERE id = 42;
Quick reference
| Tool | Flag/syntax | Engine |
|---|---|---|
| grep | -E | POSIX ERE |
| grep | -P | PCRE |
| ripgrep | default | RE2-like (Rust regex crate) |
| ripgrep | --pcre2 | PCRE2 |
| sed | -E / -r | POSIX ERE |
| VS Code | .* toggle | JavaScript (V8) |
| git log | -E --grep | POSIX ERE |
| PostgreSQL | ~, regexp_* | POSIX ARE (advanced regular expressions, a superset of ERE) |
Where to go next
The final lesson in this module, When not to use regex, is the honest counterpoint — knowing when to stop and reach for a parser instead is just as important as knowing how to write a good pattern.