
Bash Script to Extract Email Addresses from File

Use grep -E -o '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}' filename to extract email addresses from a file in Bash.
📋

Examples

Input: Contact us at support@example.com for help.
Output: support@example.com

Input: Emails: alice@mail.com, bob123@work.net, invalid@address, test@site.org
Output:
alice@mail.com
bob123@work.net
test@site.org

Input: No emails here!
Output: (no output)
🧠

How to Think About It

An email address has a recognizable shape: a local part, an '@' symbol, a domain, and a dot followed by a suffix of at least two letters. Describe that shape with a regular expression, then let a tool such as grep find every match in the file and print it.
📐

Algorithm

1. Read the input file line by line.
2. Use a regular expression to find substrings that look like email addresses.
3. Print each matched email address on its own line.
4. Ignore lines without any email addresses.
💻

Code

bash
#!/bin/bash

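# Extract email addresses from a file passed as the first argument
if [ $# -eq 0 ]; then
  # Send the usage message to stderr so it never mixes with extracted emails
  echo "Usage: $0 filename" >&2
  exit 1
fi

filename="$1"
# -E: extended regex; -o: print only the matched text, one match per line
# --: end of options, so a filename starting with '-' is not read as a flag
grep -E -o '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}' -- "$filename"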
Output
support@example.com
alice@mail.com
bob123@work.net
test@site.org
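
To try the command without saving the script first, the sketch below writes the three example lines to a temporary file and runs the same grep on it. The sample data and the variable names `sample` and `emails` are illustrative assumptions, not part of the original script.

```bash
#!/bin/bash
# Build a throwaway input file from the example lines (illustrative data only).
sample=$(mktemp)
printf '%s\n' \
  'Contact us at support@example.com for help.' \
  'Emails: alice@mail.com, bob123@work.net, invalid@address, test@site.org' \
  'No emails here!' > "$sample"

# Same pattern as the script; each match is printed on its own line.
emails=$(grep -E -o '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}' "$sample")
printf '%s\n' "$emails"

# Clean up the temporary file.
rm -f "$sample"
```

The line without any address produces no output, and `invalid@address` is skipped because it has no dot followed by a suffix.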
🔍

Dry Run

Let's trace extracting emails from a file with lines containing emails and other text.

1. Read line
   Line: 'Contact us at support@example.com for help.'

2. Match emails
   Found: 'support@example.com'

3. Print result
   Output: 'support@example.com'

| Line | Matched Emails |
|---|---|
| Contact us at support@example.com for help. | support@example.com |
| Emails: alice@mail.com, bob123@work.net, invalid@address, test@site.org | alice@mail.com, bob123@work.net, test@site.org |
| No emails here! | (none) |
💡

Why This Works

Step 1: Regular Expression

The regex [a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,} matches typical email formats with username, '@', domain, and extension.

Step 2: grep Options

The -E option enables extended regular expressions, so + and {2,} work without backslashes. The -o option prints only the matched text, one match per line, instead of the whole line.
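
A small comparison may make the effect of -o concrete. This is a sketch using one of the example lines; the variable names `pattern`, `line`, `whole_line`, and `match_only` are made up for illustration.

```bash
#!/bin/bash
pattern='[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}'
line='Contact us at support@example.com for help.'

# Without -o, grep prints the entire line that contains a match.
whole_line=$(printf '%s\n' "$line" | grep -E "$pattern")

# With -o, grep prints only the text that matched the pattern.
match_only=$(printf '%s\n' "$line" | grep -E -o "$pattern")

printf 'without -o: %s\n' "$whole_line"
printf 'with -o:    %s\n' "$match_only"
```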

Step 3: File Input

grep reads the file named on the command line one line at a time, applies the regex to each line, and prints every match on its own line.

🔄

Alternative Approaches

Using awk
bash
awk '{for(i=1;i<=NF;i++) if ($i ~ /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/) print $i}' filename
This splits each line into whitespace-separated fields and prints every field that contains a match, so punctuation attached to the address (such as a trailing comma) is printed too.
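
If the attached punctuation is a problem, one possible variant is to have awk print only the matched substring using its built-in `match()` function with `RSTART` and `RLENGTH`. This is a sketch rather than a replacement for the command above: the variable names `input`, `cleaned`, and `rest` are illustrative, and `[a-zA-Z][a-zA-Z]+` is used in place of `[a-zA-Z]{2,}` on the assumption that some awk builds do not support interval expressions.

```bash
#!/bin/bash
input='Emails: alice@mail.com, bob123@work.net, invalid@address, test@site.org'

# match() sets RSTART and RLENGTH for the leftmost match; the loop prints it,
# then keeps searching in the remainder of the line.
# [a-zA-Z][a-zA-Z]+ stands in for [a-zA-Z]{2,} (assumed more portable).
cleaned=$(printf '%s\n' "$input" | awk '{
  rest = $0
  while (match(rest, /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z][a-zA-Z]+/)) {
    print substr(rest, RSTART, RLENGTH)
    rest = substr(rest, RSTART + RLENGTH)
  }
}')
printf '%s\n' "$cleaned"
```

Each address is printed without the trailing comma, and several addresses on one line are handled by the loop.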
Using Perl one-liner
bash
perl -nle 'print for /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g' filename
Perl offers powerful regex matching and global extraction in one command.

Complexity: O(n) time, O(m) space

Time Complexity

The script reads each line once and applies regex matching, so time grows linearly with file size (n lines).

Space Complexity

grep streams the file rather than loading it whole, so memory holds only the current line and the matches found in it (m) while they are printed.

Which Approach is Fastest?

Using grep is generally faster and simpler than awk or perl for this task due to optimized pattern matching.

| Approach | Time | Space | Best For |
|---|---|---|---|
| grep with regex | O(n) | O(m) | Simple and fast extraction |
| awk field matching | O(n) | O(m) | Field-based processing, flexible |
| Perl regex | O(n) | O(m) | Complex regex and multiple matches |
💡
Always quote your filename variable in Bash to handle spaces or special characters safely.
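
As a quick illustration of why the quotes matter, the sketch below runs the extraction on a file whose name contains a space. The directory, the file name `contact list.txt`, and the variable names are hypothetical.

```bash
#!/bin/bash
# Hypothetical file name containing a space, created in a temporary directory.
dir=$(mktemp -d)
file="$dir/contact list.txt"
printf 'Reach alice@mail.com today\n' > "$file"

# Quoted: grep receives a single argument, the full path including the space.
quoted=$(grep -E -o '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}' "$file")
printf '%s\n' "$quoted"

# Unquoted, $file would be split at the space into two separate arguments,
# and grep would report that neither of those files exists.

# Clean up the temporary directory.
rm -rf "$dir"
```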
⚠️
Forgetting to escape the dot before the suffix causes incorrect matches: an unescaped . matches any character, so a string such as invalid@address is accepted as an email.
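
The difference can be checked directly. In this sketch the only change between the two commands is the backslash before the dot; the variable names `text`, `escaped`, and `unescaped` are illustrative.

```bash
#!/bin/bash
text='invalid@address, test@site.org'

# Correct: \. requires a literal dot before the suffix.
escaped=$(printf '%s\n' "$text" |
  grep -E -o '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}')

# Buggy: an unescaped . matches any character, so "invalid@address" slips through.
unescaped=$(printf '%s\n' "$text" |
  grep -E -o '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+.[a-zA-Z]{2,}')

printf 'escaped dot:\n%s\n' "$escaped"
printf 'unescaped dot:\n%s\n' "$unescaped"
```

With the escaped dot only `test@site.org` is printed; with the unescaped dot `invalid@address` is printed as well.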