Grep Patterns

Use grep with BRE and ERE, anchors, character classes, and flags like -i, -v, -l, -c, -r, and --include to search text precisely.

By the end of this lesson you will be able to:
  • Explain the difference between BRE and ERE in grep
  • Use anchors and character classes in patterns
  • Apply -i, -v, -l, -c, -r flags effectively
  • Limit recursive searches with --include and --exclude

grep is the workhorse of shell text processing. You used it in the beginner track to filter output; now you'll use it with precision: extended regular expressions, anchored matches, and recursive searches with targeted file filtering. Most of the "find it in the codebase" tasks that agents and developers do every day are variations on a single well-crafted grep invocation.

BRE vs ERE

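grep uses Basic Regular Expressions (BRE) by default. In BRE, a bare +, ?, or | is a literal character. GNU grep and most other modern implementations accept the backslash-escaped forms \+, \?, and \| as operators, although POSIX does not require them: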

grep "colou\?r" file.txt       # BRE: matches "color" or "colour"
grep "https\?" url.txt         # BRE: ? must be escaped

grep -E enables Extended Regular Expressions (ERE), where +, ?, |, (), and {} work without escaping. The older egrep command does the same thing but is deprecated in GNU grep:

grep -E "colou?r" file.txt     # ERE: cleaner, same result
grep -E "error|warn|fatal" app.log
grep -E "[0-9]{4}-[0-9]{2}-[0-9]{2}" dates.txt   # ISO date format

Prefer -E in scripts — the unescaped syntax is less error-prone and easier to read.
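
To see the two syntaxes side by side, here is a minimal sketch that runs the same pattern in BRE and ERE against a throwaway file. The file contents and variable names are made up for the demo, and the escaped \? form assumes a grep that supports that extension (GNU grep does).

```shell
# Hypothetical sample data written to a temporary file.
tmp=$(mktemp)
printf '%s\n' 'color' 'colour' 'colr' 'warn: disk low' 'info: ok' > "$tmp"

# BRE: the optional "u" needs an escaped \? quantifier (an extension).
bre_hits=$(grep -c 'colou\?r' "$tmp")

# ERE: same pattern, no backslash needed.
ere_hits=$(grep -Ec 'colou?r' "$tmp")

# ERE alternation: count lines containing either word.
alt_hits=$(grep -Ec 'warn|info' "$tmp")

echo "BRE: $bre_hits  ERE: $ere_hits  alternation: $alt_hits"
rm -f "$tmp"
```

Both forms should report the same count for the colour pattern; "colr" is not matched because only the "u" is optional.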

Anchors

Anchors match positions, not characters:

grep "^ERROR" app.log          # lines starting with ERROR
grep "DONE$" build.log         # lines ending with DONE
grep "^$" file.txt             # empty lines
grep -E "^[A-Z]" names.txt     # lines starting with a capital letter

\b matches a word boundary. It is a GNU extension rather than part of POSIX ERE, and GNU grep accepts it in both BRE and ERE:

grep -E "\bcat\b" story.txt    # matches "cat" but not "concatenate"
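
The anchors above can be checked against a small sample. This is a sketch with invented log lines; for the whole-word case it uses the -w flag instead of \b so that it does not depend on \b support in your grep.

```shell
# Hypothetical sample lines, including one empty line.
tmp=$(mktemp)
printf '%s\n' 'ERROR disk full' 'no ERROR here' '' 'build DONE' \
  'DONE early exit' 'the cat sat' 'concatenate' > "$tmp"

starts=$(grep -c '^ERROR' "$tmp")   # only lines that begin with ERROR
ends=$(grep -c 'DONE$' "$tmp")      # only lines that end with DONE
empty=$(grep -c '^$' "$tmp")        # lines with nothing between start and end
word=$(grep -cw 'cat' "$tmp")       # "cat" as a whole word, not inside "concatenate"

echo "starts=$starts ends=$ends empty=$empty word=$word"
rm -f "$tmp"
```

Each count is 1 here: the unanchored occurrences ("no ERROR here", "DONE early exit", "concatenate") are deliberately present and deliberately skipped.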

Character classes

Use bracket expressions for sets:

grep "[aeiou]" words.txt        # lines with any vowel
grep "[^aeiou]" words.txt       # lines with any non-vowel character
grep -E "[0-9]{3}-[0-9]{4}" phones.txt  # simple phone pattern
grep -E "[[:upper:]][[:lower:]]+" names.txt  # POSIX: capital followed by lowercase

POSIX character classes ([[:alpha:]], [[:digit:]], [[:space:]], [[:upper:]], [[:lower:]]) are locale-aware and more portable than [a-z] for non-ASCII text.
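
A quick way to build intuition for bracket expressions is to count matches on a handful of known strings. The sample words below are invented for the sketch; note that [aeiou] is case-sensitive, so "CAROL" has no match.

```shell
# Hypothetical sample words, one per line.
tmp=$(mktemp)
printf '%s\n' 'Alice' 'bob' 'CAROL' '555-1234' 'rhythm' 'aeiou' > "$tmp"

vowel=$(grep -c '[aeiou]' "$tmp")                       # at least one lowercase vowel
non_vowel=$(grep -c '[^aeiou]' "$tmp")                  # at least one other character
phone=$(grep -Ec '[0-9]{3}-[0-9]{4}' "$tmp")            # three digits, dash, four digits
capitalized=$(grep -Ec '[[:upper:]][[:lower:]]+' "$tmp") # capital then lowercase letters

echo "vowel=$vowel non_vowel=$non_vowel phone=$phone capitalized=$capitalized"
rm -f "$tmp"
```

The negated class matches every line except "aeiou", which is a reminder that [^aeiou] means "contains a character that is not a vowel", not "contains no vowels".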

Essential flags

grep -i "error" app.log        # case-insensitive
grep -v "DEBUG" app.log        # invert: lines that do NOT match
grep -c "ERROR" app.log        # count matching lines (not the lines themselves)
grep -l "TODO" src/*.py        # list filenames that contain a match
grep -n "TODO" main.py         # show line numbers with matches
grep -w "cat" story.txt        # whole-word match (similar to \b...\b)
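
The flags combine freely, for example -ci or -vc. The following sketch exercises several of them on two made-up log files so the difference between counting, inverting, and listing is visible; the file names and contents are assumptions for the demo.

```shell
# Hypothetical log files in a temporary directory.
dir=$(mktemp -d)
printf '%s\n' 'ERROR one' 'debug: Error two' 'DEBUG three' 'info four' > "$dir/app.log"
printf '%s\n' 'all quiet' > "$dir/other.log"

ci_count=$(grep -ci 'error' "$dir/app.log")    # case-insensitive count
not_debug=$(grep -vc 'DEBUG' "$dir/app.log")   # count of lines WITHOUT "DEBUG"
files=$(grep -l 'ERROR' "$dir"/*.log)          # names of files with a match
numbered=$(grep -n 'ERROR' "$dir/app.log")     # match prefixed with its line number

echo "ci_count=$ci_count not_debug=$not_debug"
echo "files=$files"
echo "numbered=$numbered"
rm -rf "$dir"
```

Because -v here is case-sensitive, the lowercase "debug:" line still counts as a non-match for DEBUG; add -i if that is not what you want.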

-l is particularly useful in scripts that need to act on matching files:

# Read one name per line so file names containing spaces stay intact
grep -rl "DEPRECATED" src/ | while IFS= read -r f; do
  echo "Needs update: $f"
done
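
When file names may contain spaces or other unusual characters, one option is to have grep terminate each name with a NUL byte and let xargs split on that. This is a sketch, assuming your grep supports --null (GNU and BSD grep do) and your xargs supports -0; the directory and file names are invented.

```shell
# Hypothetical tree with an awkward file name.
dir=$(mktemp -d)
printf 'DEPRECATED call\n' > "$dir/old api.py"   # note the space in the name
printf 'fine\n' > "$dir/new.py"

# --null ends each file name with a NUL byte; xargs -0 splits on it,
# so "old api.py" is passed to echo as one argument.
updates=$(grep -rl --null 'DEPRECATED' "$dir" | xargs -0 -n 1 echo 'Needs update:')

echo "$updates"
rm -rf "$dir"
```

Replace echo with whatever command should act on each matching file.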

Recursive search with -r and --include/--exclude

grep -r "TODO" ./src/                   # all files under src/
grep -r "TODO" ./src/ --include="*.py"  # only Python files
grep -r "TODO" . --exclude="*.min.js"   # skip minified JS
grep -r "TODO" . --exclude-dir=".git"   # skip the .git directory

--include and --exclude accept globs, not regexes. To match multiple extensions, repeat the option (--include="*.py" --include="*.js") or use find ... | xargs grep. These options, along with --exclude-dir, are not specified by POSIX, but both GNU grep and the BSD grep shipped with macOS support them.

-r without --include on large trees can be slow and produce noisy output from binary files. Add --include to scope the search, add -I to skip binary files, or pipe through a file list from find. For very large codebases, ripgrep (rg) is usually much faster, and many of its flags are similar.
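
To see how the filters interact, here is a small sketch that builds a throwaway tree and searches it two ways. The directory layout and file names are assumptions made for the demo.

```shell
# Hypothetical project tree in a temporary directory.
root=$(mktemp -d)
mkdir -p "$root/src" "$root/dist" "$root/.git"
printf '# TODO: refactor\n'   > "$root/src/app.py"
printf '// TODO: types\n'     > "$root/src/ui.js"
printf '/* TODO */\n'         > "$root/dist/ui.min.js"
printf 'TODO: write docs\n'   > "$root/notes.txt"
printf 'TODO in git object\n' > "$root/.git/blob"

# Only .py and .js files under src/, by repeating --include.
py_js=$(cd "$root" && grep -rl --include='*.py' --include='*.js' 'TODO' ./src | sort)

# Everything except minified JS and the .git directory.
no_min=$(cd "$root" && grep -rl --exclude='*.min.js' --exclude-dir='.git' 'TODO' . | sort)

echo "$py_js"
echo "---"
echo "$no_min"
rm -rf "$root"
```

The first search lists ./src/app.py and ./src/ui.js. The second adds ./notes.txt and leaves out both dist/ui.min.js and anything under .git.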

Check your understanding

  1. Which flag enables Extended Regular Expressions, allowing + and ? without backslash-escaping?
  2. You want a list of filenames (not matching lines) that contain the word "password" in a directory tree. Which command is correct?
  3. True or false: the POSIX character class [[:digit:]] is equivalent to [0-9] in all locales.

Do it yourself

# Find lines in /etc/passwd that start with a lowercase letter
grep "^[a-z]" /etc/passwd | head -5

# Count how many lines in /etc/passwd do NOT start with #
grep -vc "^#" /etc/passwd

# List .sh files under /usr/local/bin that contain a shebang line (if any)
grep -rl --include="*.sh" '^#!' /usr/local/bin/ 2>/dev/null | head -5

# List .sh scripts under your home directory that contain TODO (first 3 matches)
find ~ -name "*.sh" -exec grep -l "TODO" {} + 2>/dev/null | head -3

Where to go next

You can now search files with surgical precision. Next: sed basics — stream editing to substitute, delete, and transform text in place, building on the same pattern-matching foundation.
