Regular expressions — regex for short — are a language for describing patterns in text. With a single regex you can search, validate, extract, or replace text in ways that would take dozens of lines of code otherwise. They look intimidating at first, but once you know the building blocks they become intuitive.
What does a regex look like?
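In JavaScript, and in several other languages and tools, a regex literal is written between two forward slashes: /pattern/flags. For example, /\d{4}/g finds every run of four consecutive digits (a longer run such as 12345678 yields two matches, 1234 and 5678). The g is a flag meaning "global": find all matches, not just the first one.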
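As a small illustration of the literal syntax and the g flag, here is a minimal JavaScript sketch. The sample string and the variable names are invented for this example; note how the eight-digit ID produces two four-digit matches.

```javascript
// Sample text is made up for illustration.
const text = "Founded in 1998, relaunched in 2014, ID 12345678.";

// With the g flag, match() returns every match as a plain array.
const allMatches = text.match(/\d{4}/g);
console.log(allMatches); // [ '1998', '2014', '1234', '5678' ]

// Without g, match() returns only the first match, with extra details.
const firstMatch = text.match(/\d{4}/);
console.log(firstMatch[0]);    // '1998'
console.log(firstMatch.index); // 11
```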
Literal characters
The simplest regex is just the text you want to find. The pattern /cat/ matches the exact string "cat" anywhere in the input. By default this is case-sensitive — /cat/ will not match "Cat" unless you add the i flag.
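A short sketch of literal matching and case sensitivity, assuming JavaScript and an invented sample sentence. It also shows that a literal pattern matches inside longer words.

```javascript
// Invented sample sentence: "cat" appears inside two longer words.
const sentence = "The Cat sat near a concatenated catalogue.";

const caseSensitive = /cat/.test("Cat");    // false: "C" is not "c"
const caseInsensitive = /cat/i.test("Cat"); // true: the i flag ignores case

// A literal pattern matches anywhere, including inside other words.
const lowerOnly = sentence.match(/cat/g);   // [ 'cat', 'cat' ]
const anyCase = sentence.match(/cat/gi);    // [ 'Cat', 'cat', 'cat' ]

console.log(caseSensitive, caseInsensitive, lowerOnly, anyCase);
```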
Character classes
Instead of matching a specific character, you can match any character from a set using square brackets.
| Pattern | Matches |
|---|---|
| [abc] | Any one of: a, b, or c |
| [a-z] | Any lowercase letter |
| [A-Z] | Any uppercase letter |
| [0-9] | Any digit |
| [^abc] | Anything except a, b, or c |
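The bracket forms in the table can be tried out with a few lines of JavaScript. This is only a sketch; the sample strings are assumptions chosen to make each class visible.

```javascript
// Invented sample text.
const sample = "Gray or grey? Room B12, level 3.";

// [ae] accepts either spelling of the middle vowel.
const spellings = sample.match(/gr[ae]y/gi); // [ 'Gray', 'grey' ]

// Ranges pick out uppercase letters and digits.
const capitals = sample.match(/[A-Z]/g);     // [ 'G', 'R', 'B' ]
const digits = sample.match(/[0-9]/g);       // [ '1', '2', '3' ]

// A leading ^ negates the class: replace everything except a and n.
const masked = "banana".replace(/[^an]/g, "_"); // '_anana'

console.log(spellings, capitals, digits, masked);
```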
Shorthand character classes
These are built-in shortcuts for common character sets:
| Shorthand | Equivalent | Matches |
|---|---|---|
| \d | [0-9] | Any digit |
| \D | [^0-9] | Any non-digit |
| \w | [a-zA-Z0-9_] | Word character (letter, digit, underscore) |
| \W | [^\w] | Non-word character |
| \s | [ \t\n\r\f\v] | Whitespace (in JavaScript, also Unicode spaces such as the non-breaking space) |
| \S | [^\s] | Non-whitespace |
| . | (anything) | Any character except a newline (unless the s flag is set) |
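To see the shorthand classes in action, here is a brief JavaScript sketch. The log-style sample line is hypothetical.

```javascript
// Hypothetical log line containing a tab character.
const entry = "user_42 logged in\tat 09:15";

const numbers = entry.match(/\d+/g); // [ '42', '09', '15' ]
const words = entry.match(/\w+/g);   // [ 'user_42', 'logged', 'in', 'at', '09', '15' ]
const fields = entry.split(/\s+/);   // [ 'user_42', 'logged', 'in', 'at', '09:15' ]

// The dot matches most characters, but not a newline by default.
const dotVsNewline = /a.c/.test("a\nc"); // false
const dotVsDash = /a.c/.test("a-c");     // true

console.log(numbers, words, fields, dotVsNewline, dotVsDash);
```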
Quantifiers
Quantifiers control how many times a character or group must appear:
| Quantifier | Meaning |
|---|---|
| * | Zero or more |
| + | One or more |
| ? | Zero or one (optional) |
| {3} | Exactly 3 |
| {2,5} | Between 2 and 5 (inclusive) |
| {3,} | 3 or more |
By default quantifiers are greedy — they match as much as possible. Add ? after a quantifier to make it lazy (match as little as possible): +?, *?.
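The difference between greedy and lazy matching is easiest to see side by side. The following JavaScript sketch uses an invented HTML-like string; it is an illustration only, not a recommendation to parse HTML with regex.

```javascript
// Invented sample markup.
const html = "<b>bold</b> and <i>italic</i>";

// Greedy: .+ runs to the last ">" in the string.
const greedy = html.match(/<.+>/)[0];  // '<b>bold</b> and <i>italic</i>'

// Lazy: .+? stops at the first ">" it can.
const lazy = html.match(/<.+?>/)[0];   // '<b>'
const allTags = html.match(/<.+?>/g);  // [ '<b>', '</b>', '<i>', '</i>' ]

// Counted and optional quantifiers.
const fourDigits = /^\d{2,5}$/.test("1234");  // true: 4 is within 2..5
const sixDigits = /^\d{2,5}$/.test("123456"); // false: too many digits
const optionalU = /^colou?r$/.test("color");  // true: the u is optional

console.log(greedy, lazy, allTags, fourDigits, sixDigits, optionalU);
```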
Anchors
Anchors match a position, not a character:
| Anchor | Matches |
|---|---|
| ^ | Start of the string (or line with m flag) |
| $ | End of the string (or line with m flag) |
| \b | Word boundary (between \w and \W, or between \w and the start or end of the string) |
| \B | Non-word boundary |
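A short JavaScript sketch of how the anchors behave, using an invented three-line string. The m flag is what lets ^ and $ match at line breaks.

```javascript
// Three lines of invented text.
const lines = "cat\nconcat\ncat nap";

// Without m, ^ and $ refer to the whole string, which is not just "cat".
const wholeString = /^cat$/.test(lines);     // false

// With m, each line is checked separately.
const exactLines = lines.match(/^cat$/gm);   // [ 'cat' ]
const lineStarts = lines.match(/^cat/gm);    // [ 'cat', 'cat' ]

// \b requires a word boundary, so the "cat" inside "concat" is skipped.
const wholeWords = lines.match(/\bcat\b/g);  // [ 'cat', 'cat' ]

// \B requires the opposite: only the "cat" inside "concat" qualifies.
const insideWords = lines.match(/\Bcat/g);   // [ 'cat' ]

console.log(wholeString, exactLines, lineStarts, wholeWords, insideWords);
```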
Groups and capturing
Parentheses create a capturing group. The matched text inside is stored and can be referenced later.
- (abc): capturing group, stored as $1, $2, etc.
- (?:abc): non-capturing group (grouped but not stored)
- (?<name>abc): named capturing group, referenced as $<name>
- a|b: alternation, matching "a" or "b"
Groups are useful for replacement. The pattern /(\d{4})-(\d{2})-(\d{2})/ with replacement $3/$2/$1 converts 2026-03-22 to 22/03/2026.
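The date conversion can be sketched in JavaScript as follows, along with the named and non-capturing forms. The variable names and the file-extension example are assumptions added for illustration.

```javascript
const isoDate = "2026-03-22";

// Numbered groups: $1 = year, $2 = month, $3 = day.
const reordered = isoDate.replace(/(\d{4})-(\d{2})-(\d{2})/, "$3/$2/$1");
console.log(reordered); // '22/03/2026'

// Named groups do the same job with self-describing references.
const datePattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const viaNames = isoDate.replace(datePattern, "$<day>/$<month>/$<year>");
const year = isoDate.match(datePattern).groups.year;
console.log(viaNames, year); // '22/03/2026' '2026'

// A non-capturing group with alternation: grouped, but nothing is stored.
const extension = "photo.jpeg".match(/\.(?:jpe?g|png)$/);
console.log(extension[0]);     // '.jpeg'
console.log(extension.length); // 1 (only the full match, no captures)
```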
Flags
| Flag | Name | Effect |
|---|---|---|
| g | global | Find all matches (not just the first) |
| i | case insensitive | Match regardless of case |
| m | multiline | ^ and $ match start/end of each line |
| s | dotAll | . also matches newline characters |
| u | unicode | Treat the pattern as a sequence of Unicode code points; enables \u{...} and \p{...} escapes |
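Each flag in the table can be observed with a one-line test. This JavaScript sketch uses an invented three-line string and an emoji written as an escape sequence; the names are illustrative only.

```javascript
// Invented sample: the same word in three capitalisations.
const poem = "Roses\nroses\nROSES";

const firstOnly = poem.match(/roses/i)[0];            // 'Roses' (no g: first match only)
const everyCase = poem.match(/roses/gi).length;       // 3 (g + i)
const everyLine = poem.match(/^roses$/gim).length;    // 3 (m lets ^ and $ match per line)

// Without s, the dot refuses to cross the newline.
const dotNoS = /Roses.roses/.test(poem);              // false
const dotWithS = /Roses.roses/s.test(poem);           // true

// An emoji is two UTF-16 code units; with u it counts as one character.
const emoji = "\u{1F600}";
const withoutU = /^.$/.test(emoji);                   // false
const withU = /^.$/u.test(emoji);                     // true

console.log(firstOnly, everyCase, everyLine, dotNoS, dotWithS, withoutU, withU);
```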
Real-world patterns
Email address (simple)
/^[\w.+-]+@[\w-]+\.[a-z]{2,}$/i
UK postcode
/^[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}$/i
Date (YYYY-MM-DD)
/^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$/
URL (basic)
/https?:\/\/[\w.-]+\.[a-z]{2,}(\/\S*)?/gi
Hex colour code
/#([0-9a-f]{3}|[0-9a-f]{6})\b/gi
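The patterns above can be exercised with a few sample inputs. The JavaScript sketch below copies them verbatim; the variable names and test strings are assumptions. It also surfaces two limits of these deliberately simple patterns: the email pattern rejects multi-part domains such as example.co.uk, and the date pattern checks the format only, so 30 February passes.

```javascript
const emailPattern = /^[\w.+-]+@[\w-]+\.[a-z]{2,}$/i;
const postcodePattern = /^[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}$/i;
const isoDatePattern = /^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$/;
const urlPattern = /https?:\/\/[\w.-]+\.[a-z]{2,}(\/\S*)?/gi;
const hexPattern = /#([0-9a-f]{3}|[0-9a-f]{6})\b/gi;

// Validation patterns (anchored with ^ and $) suit test().
const emailOk = emailPattern.test("jane.doe+news@example.com"); // true
const emailMultiPart = emailPattern.test("jane@example.co.uk"); // false: limitation
const postcodeOk = postcodePattern.test("SW1A 1AA");            // true
const dateOk = isoDatePattern.test("2026-03-22");               // true
const badMonth = isoDatePattern.test("2026-13-01");             // false
const impossibleDay = isoDatePattern.test("2026-02-30");        // true: format only

// Search patterns (with g) suit match(), which returns every hit.
const urls = "Docs: https://example.com/docs and http://test.org".match(urlPattern);
const colours = "#fff #1A2B3C #12345".match(hexPattern);

console.log(urls);    // [ 'https://example.com/docs', 'http://test.org' ]
console.log(colours); // [ '#fff', '#1A2B3C' ]
```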
Common mistakes
- Forgetting the g flag: without it, replace() only replaces the first match.
- Over-matching with .*: greedy dot-star will match across boundaries you did not expect. Use .*? or a character class instead.
- Forgetting to escape special characters: characters like . * + ? ( ) [ { \ (and also ^ $ |) have special meaning. Escape them with \ when you want them literally.
- Catastrophic backtracking: nested quantifiers like (a+)+ can cause exponential slowdowns on certain inputs. Keep patterns simple.
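The first three mistakes are easy to reproduce. In the JavaScript sketch below, the sample strings are invented and escapeRegExp is a hypothetical helper written for this example, not a built-in function. The backtracking case is left out on purpose, since running it on a hostile input could stall the script; (a+)+ matches the same strings as the simpler a+.

```javascript
// 1. Forgetting g: only the first hyphen is replaced.
const withoutG = "a-b-c".replace(/-/, "+");  // 'a+b-c'
const withG = "a-b-c".replace(/-/g, "+");    // 'a+b+c'

// 2. Greedy .* swallows everything between the first and last quote.
const quoted = 'say "hi" and "bye"';
const overMatched = quoted.match(/".*"/)[0];    // '"hi" and "bye"'
const perQuote = quoted.match(/"[^"]*"/g);      // [ '"hi"', '"bye"' ]

// 3. An unescaped dot matches any character, not just a full stop.
const unescapedDot = /3.14/.test("3x14");   // true (probably not intended)
const escapedDot = /3\.14/.test("3x14");    // false

// Hypothetical helper: escape user text before building a RegExp from it.
function escapeRegExp(literal) {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}
const rawSearch = new RegExp("1+1=2").test("1+1=2");                // false: + is a quantifier
const safeSearch = new RegExp(escapeRegExp("1+1=2")).test("1+1=2"); // true

console.log(withoutG, withG, overMatched, perQuote);
console.log(unescapedDot, escapedDot, rawSearch, safeSearch);
```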