
Bash Script to Extract Email Addresses from File

Use grep -E -o '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}' filename to extract email addresses from a file in Bash.

📋

Examples

Input: Contact us at support@example.com for help.

Output: support@example.com

Input: Emails: alice@mail.com, bob123@work.net, invalid@address, test@site.org

Output (one match per line):
alice@mail.com
bob123@work.net
test@site.org

Input: No emails here!

Output: (nothing is printed)

🧠

How to Think About It

An email address has a recognizable shape: a local part, an '@' symbol, a domain, and a dot followed by a suffix such as com or org. Describe that shape with a regular expression, then use a tool like grep to find every substring of the file that fits it and print it.

📐

Algorithm

Read the input file line by line.

Use a regular expression to find substrings that look like email addresses.

Print each matched email address on its own line.

Ignore lines without any email addresses.
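The steps above can be sketched directly in Bash with a read loop and the [[ =~ ]] operator, without calling grep. This is only an illustration of the algorithm, not the article's main method: the function name extract_emails is made up for this example, and it assumes Bash (not plain sh), since it relies on BASH_REMATCH.

```bash
# Hypothetical helper: prints every email-like substring read from stdin,
# one per line. Lines without a match print nothing.
extract_emails() {
  local re='[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}'
  local line
  # '|| [ -n "$line" ]' keeps a final line that has no trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    # Print the first match, drop everything up to and including it,
    # then search the remainder of the line again.
    while [[ $line =~ $re ]]; do
      printf '%s\n' "${BASH_REMATCH[0]}"
      line=${line#*"${BASH_REMATCH[0]}"}
    done
  done
}
```

It would be called as extract_emails < filename. In practice the grep one-liner is shorter and usually faster; the loop is mainly useful for seeing each step of the algorithm spelled out.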

💻

Code

bash

#!/bin/bash

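# Extract email addresses from the file passed as the first argument
if [ $# -eq 0 ]; then
  # Send the usage message to stderr so it never mixes with extracted output
  echo "Usage: $0 filename" >&2
  exit 1
fi

filename="$1"
# -E: extended regex syntax; -o: print only the matched text, one match per line
# --: end of options, so a filename starting with "-" is not read as a flag
grep -E -o -- '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}' "$filename"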

Output

support@example.com
alice@mail.com
bob123@work.net
test@site.org

🔍

Dry Run

Let's trace extracting emails from a file with lines containing emails and other text.

Read line

Line: 'Contact us at support@example.com for help.'

Match emails

Found: 'support@example.com'

Print result

Output: 'support@example.com'

Line	Matched Emails
Contact us at support@example.com for help.	support@example.com
Emails: alice@mail.com, bob123@work.net, invalid@address, test@site.org	alice@mail.com bob123@work.net test@site.org
No emails here!	(none)

💡

Why This Works

Step 1: Regular Expression

The regex [a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,} matches typical email formats with username, '@', domain, and extension.
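To make the pieces of the pattern easier to see, the sketch below builds the same regex from named parts. The variable names (local_part, domain, suffix, email_re) and the sample address are illustrative and do not appear in the original script.

```bash
# Build the pattern from named pieces (variable names are illustrative).
local_part='[a-zA-Z0-9._%+-]+'   # one or more allowed characters before the @
domain='[a-zA-Z0-9.-]+'          # letters, digits, dots, and hyphens
suffix='[a-zA-Z]{2,}'            # two or more letters following a literal dot
email_re="${local_part}@${domain}\\.${suffix}"

# Prints: jane.doe+news@mail.example.org
printf '%s\n' 'Write to jane.doe+news@mail.example.org today.' |
  grep -E -o "$email_re"
```

Inside double quotes, \\. becomes \. in the final pattern, which is what makes the dot literal.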

Step 2: grep Options

The -E enables extended regex, and -o prints only the matched parts, not the whole line.
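A small comparison shows what -o changes. The sample line is adapted from the examples above; the variable names are only for this sketch.

```bash
sample='Emails: alice@mail.com, bob123@work.net, invalid@address'
re='[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}'

# Without -o: grep prints the whole line that contains a match.
printf '%s\n' "$sample" | grep -E "$re"

# With -o: grep prints only the matches, each on its own line:
#   alice@mail.com
#   bob123@work.net
printf '%s\n' "$sample" | grep -E -o "$re"
```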

Step 3: File Input

grep reads the file line by line, applies the regex to each line, and prints every match on its own line. Lines without a match produce no output.

🔄

Alternative Approaches

Using awk

bash

awk '{for(i=1;i<=NF;i++) if ($i ~ /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/) print $i}' filename

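awk splits each line into whitespace-separated fields and prints the whole field when it contains a match, so attached punctuation comes along: the example input yields alice@mail.com, with the trailing comma. Some awk implementations (older mawk, for example) also do not support the {2,} interval syntax.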
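One way around the punctuation issue, sketched here as an assumption rather than as part of the original article, is to use awk's match() function, which records where the match starts (RSTART) and how long it is (RLENGTH). The function name extract_emails_awk is hypothetical, and the suffix is written as [a-zA-Z][a-zA-Z]+ instead of [a-zA-Z]{2,} so it does not depend on interval support.

```bash
# Hypothetical helper: awk variant that prints only the matched text.
extract_emails_awk() {
  awk '{
    rest = $0
    # match() finds the leftmost match in rest and sets RSTART and RLENGTH.
    while (match(rest, /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z][a-zA-Z]+/)) {
      print substr(rest, RSTART, RLENGTH)
      # Continue searching after the end of this match.
      rest = substr(rest, RSTART + RLENGTH)
    }
  }' "$@"
}
```

It would be called as extract_emails_awk filename, or fed from a pipe with no arguments.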

Using Perl one-liner

bash

perl -nle 'print for /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g' filename

Perl offers powerful regex matching and global extraction in one command.

⚡

Complexity: O(n) time, O(m) space

Time Complexity

The script reads each line once and applies regex matching, so time grows linearly with file size (n lines).

Space Complexity

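All three tools process the input one line at a time and write each match as soon as it is found, so the tool itself needs little more than the current line. The O(m) figure applies when the m matched addresses are collected before use, for example in a shell variable or by a later sorting step.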
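The difference between streaming matches and holding them can be seen in a short sketch. The sample input and variable names are made up for illustration.

```bash
re='[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}'
input='a@x.org b@y.net
c@z.com'

# Streaming: each match is written as soon as it is found.
printf '%s\n' "$input" | grep -E -o "$re"

# Collecting: the shell now holds all m matches in one variable.
emails=$(printf '%s\n' "$input" | grep -E -o "$re")
# grep -c . counts non-empty lines, so an empty result counts as 0.
count=$(printf '%s\n' "$emails" | grep -c .)
echo "collected $count addresses"
```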

Which Approach is Fastest?

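grep is generally the simplest choice for this task and is usually at least as fast as awk or perl, since it does nothing but pattern matching. All three run in linear time, so any difference is a constant factor; if speed matters, time each command on your own data.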

Approach	Time	Space	Best For
grep with regex	O(n)	O(m)	Simple and fast extraction
awk field matching	O(n)	O(m)	Field-based processing, flexible
Perl regex	O(n)	O(m)	Complex regex and multiple matches

💡

Always quote your filename variable in Bash to handle spaces or special characters safely.
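A minimal sketch of why the quotes matter, using a temporary file whose name contains a space. The file name and address are invented for this example.

```bash
re='[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}'
tmpdir=$(mktemp -d)
filename="$tmpdir/my contacts.txt"   # hypothetical file name containing a space
printf '%s\n' 'Reach me at carol@example.com' > "$filename"

# Quoted: grep receives the full path as a single argument.
result=$(grep -E -o "$re" "$filename")
echo "$result"

# Unquoted, $filename would be split at the space into two arguments,
# and grep would look for two files that do not exist:
#   grep -E -o "$re" $filename

rm -r "$tmpdir"
```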

⚠️

Forgetting to escape the dot in the domain part of the regex causes false matches: an unescaped . matches any character, so a string like invalid@address is accepted.
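The effect is easy to reproduce. The sketch below runs the same sample line through both versions of the pattern; the sample text and variable names are illustrative.

```bash
sample='invalid@address, valid@site.org'

# Escaped dot: the suffix must follow a literal ".".
escaped=$(printf '%s\n' "$sample" |
  grep -E -o '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}')

# Unescaped dot: "." matches any character, so the "e" in "address"
# can stand in for the separator and "ss" for the suffix.
unescaped=$(printf '%s\n' "$sample" |
  grep -E -o '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+.[a-zA-Z]{2,}')

printf 'escaped:\n%s\n\nunescaped:\n%s\n' "$escaped" "$unescaped"
```

The escaped pattern reports only valid@site.org, while the unescaped one also reports invalid@address.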