Regex Quick Reference

  • . Any character except newline
  • \d \w \s Digit, word char, whitespace
  • * + ? Zero or more, one or more, zero or one occurrence
  • [abc] [^abc] Character class / negated class
  • ( ) \1 Capture group / backreference
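
A minimal sketch of the entries above, using Python's re module as an assumed engine (other flavors behave the same for these basics). The sample strings and variable names are illustrative only.

```python
import re

text = "cat 42\ncot_7"

# .  matches any character except a newline
dot_matches = re.findall(r"c.t", text)              # ['cat', 'cot']

# \d digit, \w word character, \s whitespace
digits = re.findall(r"\d+", text)                   # ['42', '7']
words = re.findall(r"\w+", text)                    # ['cat', '42', 'cot_7']
spaces = re.findall(r"\s", text)                    # [' ', '\n']

# * zero or more, + one or more, ? zero or one
star = re.findall(r"ab*", "a ab abb")               # ['a', 'ab', 'abb']
plus = re.findall(r"ab+", "a ab abb")               # ['ab', 'abb']
optional = re.findall(r"colou?r", "color colour")   # ['color', 'colour']

# [abc] character class, [^abc] negated class
in_class = re.findall(r"[abc]", "abcd")             # ['a', 'b', 'c']
not_in_class = re.findall(r"[^abc]", "abcd")        # ['d']

# ( ) captures a group, \1 refers back to what group 1 matched
repeated = re.search(r"(\w+) \1", "hello hello world")
repeated_text = repeated.group(0)                   # 'hello hello'
captured_word = repeated.group(1)                   # 'hello'
```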

Copy-Paste Patterns

  • Email: ^[\w.-]+@[\w.-]+\.[a-zA-Z]{2,}$
  • URL: https?://[\w.-]+(?:/[\w./-]*)?
  • Phone (US): \(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}
  • Date (YYYY-MM-DD, shape only): \d{4}-\d{2}-\d{2}
  • IPv4 Address (shape only, accepts octets above 255): \d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}
  • Hex Color (6-digit): #[0-9A-Fa-f]{6}\b
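
A hedged sketch of how these patterns might be used, assuming Python 3.7+ and its re module. The names PATTERNS, looks_like, is_real_date, and is_real_ipv4 are hypothetical helpers, not part of any library. The patterns check the shape of the text only, so the date and IP helpers hand the value to the standard library for a real range check.

```python
import re
from datetime import date
from ipaddress import IPv4Address

# Patterns copied from the list above; the dictionary itself is illustrative.
PATTERNS = {
    "email": r"^[\w.-]+@[\w.-]+\.[a-zA-Z]{2,}$",
    "url": r"https?://[\w.-]+(?:/[\w./-]*)?",
    "phone_us": r"\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}",
    "date": r"\d{4}-\d{2}-\d{2}",
    "ip": r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}",
    "hex_color": r"#[0-9A-Fa-f]{6}\b",
}


def looks_like(kind: str, text: str) -> bool:
    """Return True when the whole string has the shape of the named pattern."""
    return re.fullmatch(PATTERNS[kind], text) is not None


def is_real_date(text: str) -> bool:
    """Shape check first, then let datetime reject impossible dates."""
    if not looks_like("date", text):
        return False
    try:
        date.fromisoformat(text)
    except ValueError:
        return False
    return True


def is_real_ipv4(text: str) -> bool:
    """Shape check first, then let ipaddress reject octets above 255."""
    if not looks_like("ip", text):
        return False
    try:
        IPv4Address(text)
    except ValueError:
        return False
    return True
```

For example, looks_like("date", "2024-02-30") is True because the digits are in the right places, while is_real_date("2024-02-30") is False because February has no 30th day. The same gap applies to "999.999.999.999" as an IP address.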

Flags Explained

  • g (global): Find ALL matches, not just the first one
  • i (case-insensitive): 'hello' matches 'Hello', 'HELLO', etc.
  • m (multiline): ^ and $ match start/end of each LINE, not just the string
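
The g/i/m letters are the JavaScript-style spelling of these flags. As a rough sketch in Python (an assumed engine here), re.findall plays the role of g, re.IGNORECASE corresponds to i, and re.MULTILINE corresponds to m. The sample text is illustrative.

```python
import re

text = "Hello world\nhello again\nHELLO there"

# g (global): re.search returns only the first match, re.findall returns all
first_match = re.search(r"hello", text, flags=re.IGNORECASE).group(0)  # 'Hello'
all_matches = re.findall(r"hello", text, flags=re.IGNORECASE)

# i (case-insensitive): without it, only the exact lowercase spelling matches
case_sensitive = re.findall(r"hello", text)                # ['hello']

# m (multiline): ^ and $ match at every line, not just the whole string
start_default = re.findall(r"^\w+", text)                  # ['Hello']
start_multiline = re.findall(r"^\w+", text, flags=re.MULTILINE)
end_default = re.findall(r"\w+$", text)                    # ['there']
end_multiline = re.findall(r"\w+$", text, flags=re.MULTILINE)

# Flags combine with the | operator
combined = re.findall(r"^hello", text, flags=re.IGNORECASE | re.MULTILINE)
```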

Common Regex Mistakes

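  • Forgetting to escape special chars: to match . * + ? literally, write \. \* \+ \?
  • Greedy vs lazy: .* matches as much as possible; .*? matches as little as possible
  • ^ and $ match at every line only when the multiline flag is enabled; otherwise they anchor the whole string
  • Most special chars need no escaping inside a character class: [.] matches a literal dot (but ] \ ^ - still need care)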
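  • Backreferences start at \1. Group 0 is the whole match, but inside a pattern \0 is usually not a backreference (in JavaScript and Python it matches a NUL character)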
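
The pitfalls above can be reproduced in a few lines. This is a small sketch assuming Python's re module; the sample strings and variable names are made up for illustration.

```python
import re

# Escaping: an unescaped dot matches any character, an escaped dot only "."
unescaped = re.findall(r"3.14", "3.14 3x14")         # ['3.14', '3x14']
escaped = re.findall(r"3\.14", "3.14 3x14")          # ['3.14']

# re.escape builds a literal pattern from text that contains special chars
literal = re.escape("1+1=2?")
literal_ok = re.fullmatch(literal, "1+1=2?") is not None

# Greedy vs lazy: .* grabs as much as it can, .*? as little as it can
html = "<b>one</b><b>two</b>"
greedy = re.findall(r"<b>.*</b>", html)              # one long match
lazy = re.findall(r"<b>.*?</b>", html)               # two short matches

# Character classes: . + * ? are literal inside [ ]
class_dot = re.findall(r"[.]", "a.b")                # ['.']
class_ops = re.findall(r"[+*?]", "1+2*3?")           # ['+', '*', '?']

# Groups: \1 is the first capture; the whole match is group 0 of the result
doubled = re.search(r"(\w)\1", "bookkeeper")
whole_match = doubled.group(0)                       # 'oo'
first_group = doubled.group(1)                       # 'o'
```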

Regex Origin Story

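  • Introduced by mathematician Stephen Kleene in 1951 to describe what are now called 'regular languages'
  • First implemented by Ken Thompson in the QED editor in the late 1960s, then popularized by Unix tools (ed, grep, sed) in the 1970s
  • The name 'grep' comes from the ed command g/re/p: 'global regular expression print'
  • Now built into, or available as a library for, nearly every programming language, IDE, and text editor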