Regex to Extract Email Addresses From Text
Extract Email Addresses is a regex pattern that matches the standard email format: local part (letters, numbers, dots, special characters) + @ symbol + domain (letters, numbers, dots, hyphens) + TLD (2+ letters). Formula Genius generates and validates this formula automatically from a plain-English prompt.
A battle-tested regex pattern for finding email addresses in unstructured text. It handles the common formats; its known limitations are listed under Edge Cases & Warnings.
The Formula
"Extract all email addresses from a block of text"
[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}
This pattern matches the standard email format: local part (letters, numbers, dots, and the special characters _ % + -) + @ symbol + domain (letters, numbers, dots, hyphens) + TLD (2+ letters).
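As a rough illustration of how the pattern might be applied outside a spreadsheet, here is a minimal Python sketch. The names `EMAIL_RE` and `extract_emails` are made up for this example; only the pattern itself comes from this page.

```python
import re
from typing import List

# The pattern from this page, compiled once so it can be reused.
EMAIL_RE = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")


def extract_emails(text: str) -> List[str]:
    """Return every non-overlapping match, in order of appearance."""
    return EMAIL_RE.findall(text)


sample = "Contact us at hello@example.com or support@test.co.uk"
print(extract_emails(sample))
# ['hello@example.com', 'support@test.co.uk']
```

`findall` returns the matched strings themselves because the pattern has no capture groups. Adding groups would change what it returns.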
Step-by-Step Breakdown
- [a-zA-Z0-9._%+-]+ — local part: letters, numbers, dots, underscores, percent, plus, hyphen
- @ — literal @ symbol
- [a-zA-Z0-9.-]+ — domain: letters, numbers, dots, hyphens
- \. — literal dot before the TLD
- [a-zA-Z]{2,} — TLD: at least 2 letters (com, org, io, etc.)
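The breakdown above can also be written directly into the pattern. The sketch below assumes Python's `re.VERBOSE` flag, which ignores whitespace and `#` comments outside character classes; the name `EMAIL_VERBOSE` and the sample sentence are illustrative only.

```python
import re

# Same pattern as above, split into its five components.
EMAIL_VERBOSE = re.compile(
    r"""
    [a-zA-Z0-9._%+-]+   # local part: letters, digits, dot, underscore, percent, plus, hyphen
    @                   # literal @ symbol
    [a-zA-Z0-9.-]+      # domain: letters, digits, dots, hyphens
    \.                  # literal dot before the TLD
    [a-zA-Z]{2,}        # TLD: at least 2 letters
    """,
    re.VERBOSE,
)

sample = "Reach jane.doe+news@mail.example.org today"
match = EMAIL_VERBOSE.search(sample)
print(match.group(0) if match else None)
# jane.doe+news@mail.example.org
```

The verbose form behaves the same as the one-line pattern. It is only easier to read and maintain.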
Edge Cases & Warnings
- Doesn't match emails with IP address domains (user@[192.168.1.1])
- Matches most common formats but not all RFC 5322 valid addresses
- Quoted local parts ("john doe"@example.com) are not matched
- International domain names (IDN) with non-ASCII characters need a broader pattern
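To see these limitations in practice, the following sketch runs the pattern against one address of each unsupported kind. The IDN sample (`bücher.example`) is a made-up address chosen for illustration; the other two come from the list above.

```python
import re

EMAIL_RE = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

unsupported = [
    "user@[192.168.1.1]",        # IP address domain in brackets
    '"john doe"@example.com',    # quoted local part
    "user@b\u00fccher.example",  # IDN: "bücher" contains a non-ASCII letter
]

for address in unsupported:
    # findall returns an empty list when nothing in the string matches.
    print(repr(address), "->", EMAIL_RE.findall(address))
```

Each of the three inputs yields an empty list. Other non-ASCII inputs can behave differently, for example by producing a partial match, so it is worth testing with samples from your own data.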
Examples
"Contact us at hello@example.com or support@test.co.uk"
hello@example.com, support@test.co.uk
"Invalid: @nodomain.com or user@ or user@.com"
None matched (all invalid)
Frequently Asked Questions
Is this regex 100% RFC compliant?
No. No practical regex validates every RFC 5322 email address, and this pattern does not attempt to. It is aimed at the address formats most commonly seen in real-world text. For strict validation, use a dedicated email validation library.
How do I use this in Google Sheets?
Use =REGEXEXTRACT(A1, "[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}") to extract the first email from cell A1.