Code of the Day

What is a regular expression?

Understand what regex is, when to reach for it, and the three building blocks every pattern is made of.

Regular Expressions · Beginner · 7 min read
By the end of this lesson you will be able to:
  • Explain in plain terms what a regular expression is
  • Decide when regex is the right tool versus plain string methods
  • Name the three building blocks every pattern is composed of

A regular expression (regex for short) is a pattern that describes a set of strings. You write the pattern; the engine tests whether a given piece of text matches it, and optionally hands you back the matched portions.

In JavaScript that looks like this:

const result = "Order #4821 received".match(/\d+/);
// result[0] === "4821"

The pattern /\d+/ means "one or more digit characters." The engine scans the string and returns the first run of digits it finds — "4821". No loops, no index arithmetic, no branching.
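As a rough sketch of what the engine hands back, the snippet below inspects the match result and guards against the case where nothing matches. The helper name firstNumber is made up for illustration.

```javascript
// Hypothetical helper: return the first run of digits in `text`, or null.
function firstNumber(text) {
  const found = text.match(/\d+/);
  // match() returns null when nothing matches, so guard before indexing.
  return found === null ? null : found[0];
}

const result = "Order #4821 received".match(/\d+/);
console.log(result[0]);    // "4821", the matched text
console.log(result.index); // 7, where the match starts in the string

console.log(firstNumber("Order #4821 received")); // "4821"
console.log(firstNumber("No digits here"));       // null
```

The null check matters in practice: indexing straight into the result of a failed match throws a TypeError.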

When to use regex — and when not to

Regex is the right tool when the structure of the text is regular (meaning: it follows a predictable pattern) and you need to find, validate, or extract pieces of it. Common good fits:

  • Validation: does this string look like a phone number / ZIP code / email?
  • Extraction: pull every date out of a log file.
  • Transformation: replace all occurrences of one pattern with something else.
  • Parsing simple formats: split a CSV line on commas that are not inside quotes.
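The first three uses above might look roughly like this in JavaScript. The function names and sample strings are assumptions chosen for illustration, the ZIP pattern is deliberately simplified, and the extra syntax (^, $, {5}, the g flag) is explained in later lessons.

```javascript
// Validation: does the whole string look like a five-digit ZIP code?
// (Simplified on purpose; real postal formats vary.)
const looksLikeZip = (s) => /^\d{5}$/.test(s);

// Extraction: pull every YYYY-MM-DD date out of some log text.
const extractDates = (log) => log.match(/\d{4}-\d{2}-\d{2}/g) ?? [];

// Transformation: collapse each run of whitespace into a single space.
const squeezeSpaces = (s) => s.replace(/\s+/g, " ");

console.log(looksLikeZip("90210")); // true
console.log(looksLikeZip("9021a")); // false
console.log(extractDates("2024-01-05 start, 2024-01-06 stop"));
console.log(squeezeSpaces("too   many    spaces")); // "too many spaces"
```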

Regex is the wrong tool when the text has real nesting or context that a finite pattern cannot capture. HTML and JSON are the textbook examples: a tag can contain other tags, and a regex cannot track that depth. For those, use a dedicated parser.
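A small sketch of how nesting trips up a pattern. The sample markup is invented for illustration, and the pattern is a deliberately naive attempt rather than a recommended approach.

```javascript
const html = "<div>outer <div>inner</div> tail</div>";

// Naive attempt: match from an opening <div> to the nearest closing </div>.
const naive = html.match(/<div>.*?<\/div>/);

// The match stops at the first </div>, which belongs to the inner element,
// so the outer element is cut short.
console.log(naive[0]); // "<div>outer <div>inner</div>"
```

The pattern has no way to know that the first closing tag pairs with the inner element, which is the kind of bookkeeping a parser exists to do.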

A simple rule: if a plain string method (includes, startsWith, split, indexOf) reads clearly and does the job, use it. Reach for regex when the pattern is variable or structural — when you need "a digit" not "the digit 5".
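To make the rule concrete, here is a minimal comparison. The sample log line and the helper name hasStatusCode are assumptions for illustration.

```javascript
const line = "ERROR 404: page not found";

// Fixed text: plain string methods read clearly and need no pattern.
console.log(line.startsWith("ERROR")); // true
console.log(line.includes("404"));     // true

// Variable text: "any three digits in a row" is a pattern, not a fixed substring.
const hasStatusCode = (s) => /\d{3}/.test(s);
console.log(hasStatusCode(line));                  // true
console.log(hasStatusCode("ERROR: disk is full")); // false
```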

The three building blocks

Every regex pattern, no matter how long, is assembled from three kinds of things:

1. Literals — characters that match themselves exactly.

/cat/ matches the characters c, a, t in that order. Literal matching is case-sensitive by default, so /cat/ does not match "Cat".
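A quick check of literal matching, using test(), which simply reports whether the pattern matches anywhere in the string. The sample strings are chosen for illustration.

```javascript
const pattern = /cat/;

console.log(pattern.test("cat"));         // true
console.log(pattern.test("concatenate")); // true: "cat" appears inside the word
console.log(pattern.test("Cat"));         // false: uppercase C is a different character
console.log(pattern.test("c a t"));       // false: the letters must be adjacent
```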

2. Metacharacters — characters with special meaning.

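The metacharacters are: . ^ $ * + ? { } [ ] \ | ( )

The usual count is twelve: ] and } are listed for completeness, but they are only special when closing a [ or { that came earlier.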

They don't match themselves — they control the engine. For example, . means "any single character except a newline," and * means "zero or more of the preceding thing." You'll learn each of them in the coming lessons.
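A short sketch of the two metacharacters just mentioned, plus the backslash, which turns a metacharacter back into a literal. The sample strings are invented for illustration.

```javascript
// "." stands for any single character except a newline.
console.log(/c.t/.test("cat")); // true
console.log(/c.t/.test("c9t")); // true
console.log(/c.t/.test("ct"));  // false: "." needs exactly one character

// "*" allows zero or more of the preceding item.
console.log(/ca*t/.test("ct"));    // true: zero "a"s
console.log(/ca*t/.test("caaat")); // true: three "a"s

// A backslash makes the following metacharacter literal.
console.log(/3\.14/.test("3.14")); // true
console.log(/3\.14/.test("3514")); // false: "\." only matches a real dot
console.log(/3.14/.test("3514"));  // true: an unescaped "." matches the "5"
```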

3. Quantifiers — expressions that say how many times something should repeat.

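\d+ means \d (one digit) repeated one or more times (+). Quantifiers are themselves written with metacharacters (*, +, ?, and { }), but they earn their own category because they never match text on their own; they only repeat whatever comes just before them. That is how a single small pattern can cover variable-length input.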
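A brief sketch of the difference a quantifier makes. The helper name digitRuns and the sample text are assumptions for illustration; the g flag used here asks for every match rather than just the first.

```javascript
// Hypothetical helper: collect every run of digits, whatever its length.
const digitRuns = (s) => s.match(/\d+/g) ?? [];

console.log(digitRuns("7 items, 42 boxes, 1500 units")); // [ '7', '42', '1500' ]

// Without the quantifier, \d matches exactly one digit.
console.log("1500".match(/\d/)[0]);  // "1"
// With +, the same \d stretches across the whole run.
console.log("1500".match(/\d+/)[0]); // "1500"
```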

Regex lives everywhere

The same pattern notation, with tiny dialect variations, works in:

  • JavaScript: /pattern/flags literal syntax, or new RegExp("pattern")
  • Python: re module, re.search(r"pattern", text)
  • Go, Rust, Java, C#, PHP, Ruby: each has a regex library; the core syntax is nearly identical
  • SQL: many engines support REGEXP or SIMILAR TO
  • Command line: grep -E, sed, awk all speak regex
  • Text editors: VS Code, Vim, Emacs all have regex find-and-replace
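For the JavaScript entry, the two ways of writing a pattern can be sketched as follows. The variable names are made up for illustration. One detail worth knowing is that the constructor takes an ordinary string, so backslashes have to be doubled.

```javascript
const text = "Order #4821 received";

// Literal syntax: the pattern sits between slashes.
const literal = /\d+/;

// Constructor syntax: the pattern is an ordinary string, so the backslash
// itself must be escaped ("\\d") to survive string parsing.
const constructed = new RegExp("\\d+");

console.log(text.match(literal)[0]);     // "4821"
console.log(text.match(constructed)[0]); // "4821"

// Both forms end up holding the same pattern text.
console.log(literal.source === constructed.source); // true
```

The constructor form is mainly useful when the pattern has to be assembled at run time; for a fixed pattern the literal is easier to read.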

Learning regex once pays dividends across every environment you work in.

A first live example

Try editing the pattern below. Change /\d+/ to /[a-z]+/ (lowercase letters) and see what changes:

[Interactive JavaScript example: editable, runs in your browser]

Notice the g flag on the second pattern. Without it, .match() returns only the first result. With g ("global"), it returns every match. Flags are covered in depth in the Practical matching module.
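A minimal sketch of the comparison described here. The sample string and variable names are assumptions, not the lesson's original example.

```javascript
const text = "Rooms 12, 7 and 305 are free";

// Without g: an array describing only the first match.
const first = text.match(/\d+/);
console.log(first[0]); // "12"

// With g: a plain array containing every match.
const all = text.match(/\d+/g);
console.log(all); // [ '12', '7', '305' ]

// Swapping in /[a-z]+/g picks out runs of lowercase letters instead.
// The capital "R" in "Rooms" is skipped, so the first run is "ooms".
const words = text.match(/[a-z]+/g);
console.log(words); // [ 'ooms', 'and', 'are', 'free' ]
```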

Where to go next

Next up: literals and metacharacters — the twelve special characters and exactly what each one does.
