Bash Script to Extract Specific Column from File
Use cut -d 'delimiter' -f column_number filename or awk '{print $column_number}' filename to extract a specific column from a file in Bash.
Examples
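A minimal illustration of both commands, assuming a small comma-separated sample file. The file name fruits.csv and its contents are made up for this sketch, and awk is given -F ',' because the sample is not whitespace-separated.

```shell
#!/bin/bash
# Hypothetical sample data, written to a temporary directory for illustration only
tmpdir="$(mktemp -d)"
trap 'rm -rf "$tmpdir"' EXIT
sample="$tmpdir/fruits.csv"
printf '%s\n' 'apple,banana,cherry' 'cat,dog,elephant' > "$sample"

# cut: -d sets the delimiter, -f selects the field number
cut -d ',' -f 2 "$sample"

# awk: -F sets the field separator, $2 refers to the second field
awk -F ',' '{print $2}' "$sample"
```

Each command should print banana and dog on separate lines.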
How to Think About It
Use cut with the delimiter and column number, or awk, to print the desired column. Either approach reads each line and outputs only the selected column.
Algorithm
Code
    #!/bin/bash
    # Usage: ./extract_column.sh filename delimiter column_number
    filename="$1"
    delimiter="$2"
    column="$3"

    # Split each line on the delimiter and print only the requested field
    cut -d "$delimiter" -f "$column" "$filename"
Dry Run
Let's trace extracting the 2nd column from a comma-separated file with lines 'apple,banana,cherry' and 'cat,dog,elephant'.
Read first line
Line: apple,banana,cherry
Split by comma
Parts: [apple, banana, cherry]
Select 2nd column
Selected: banana
Print selected column
Output: banana
| Line | Split Parts | Selected Column |
|---|---|---|
| apple,banana,cherry | apple, banana, cherry | banana |
| cat,dog,elephant | cat, dog, elephant | dog |
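The trace above can be reproduced with a short sketch. Note that cut does not literally build a list of parts; the array here only mirrors the columns of the dry-run table. The function name trace_line is hypothetical.

```shell
#!/bin/bash
# trace_line LINE DELIMITER COLUMN
# Prints the intermediate steps of the dry run for a single line.
trace_line() {
  local line="$1" delimiter="$2" column="$3"
  local -a parts
  local joined

  # Split the line into an array on the delimiter
  IFS="$delimiter" read -r -a parts <<< "$line"

  # Join the parts with ", " for display, then trim the trailing separator
  joined="$(printf '%s, ' "${parts[@]}")"

  printf 'Line: %s\n' "$line"
  printf 'Parts: [%s]\n' "${joined%, }"
  # Bash arrays are zero-indexed, so column N lives at index N-1
  printf 'Selected: %s\n' "${parts[column-1]}"
}

trace_line 'apple,banana,cherry' ',' 2
trace_line 'cat,dog,elephant' ',' 2
```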
Why This Works
Step 1: Using cut with delimiter
The cut command splits each line by the specified delimiter using -d and selects the field with -f.
Step 2: Using awk for flexibility
awk splits each line on runs of whitespace by default (use -F to set a different delimiter) and prints the desired column with {print $column_number}.
Step 3: Output only the chosen column
Both tools output only the selected column, making it easy to extract data from structured text files.
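The two tools differ in how they treat repeated separators, which the following sketch illustrates: awk collapses runs of whitespace by default, while cut treats every single delimiter character as a field boundary. The sample strings are invented for this example.

```shell
#!/bin/bash
# A line with irregular spacing (hypothetical sample data)
line='alice    30   london'

# awk: a run of spaces counts as one separator, so $2 is "30"
awk_field="$(printf '%s\n' "$line" | awk '{print $2}')"

# cut: each single space is a delimiter, so field 2 is the empty
# string between the first and second space
cut_field="$(printf '%s\n' "$line" | cut -d ' ' -f 2)"

printf 'awk: [%s]\n' "$awk_field"
printf 'cut: [%s]\n' "$cut_field"

# With an explicit separator, awk behaves like cut for simple cases
printf '%s\n' 'apple,banana,cherry' | awk -F ',' '{print $2}'
```

For files aligned with variable amounts of whitespace, awk is usually the simpler choice; for single-character delimiters such as commas or tabs, either tool works.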
Alternative Approaches
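Using awk (whitespace-separated columns):

    #!/bin/bash
    filename="$1"
    column="$2"

    # Pass the column number as an awk variable instead of splicing it into the program text
    awk -v col="$column" '{print $col}' "$filename"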
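Using a while read loop:

    #!/bin/bash
    filename="$1"
    delimiter="$2"
    column="$3"

    # Split each line into an array on the delimiter; arrays are zero-indexed
    while IFS="$delimiter" read -r -a fields; do
      printf '%s\n' "${fields[column-1]}"
    done < "$filename"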
Complexity: O(n) time, O(1) space
Time Complexity
The script reads each line once, so time grows linearly with file size (O(n)).
Space Complexity
Uses constant extra space since it processes one line at a time without storing all lines.
Which Approach is Fastest?
cut is generally faster for simple delimiter extraction; awk is more flexible but slightly slower.
| Approach | Time | Space | Best For |
|---|---|---|---|
| cut | O(n) | O(1) | Simple delimiter-separated columns |
| awk | O(n) | O(1) | Flexible extraction with complex patterns |
| while read + split | O(n) | O(1) | Custom processing per line |
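To check the relative speed on a particular machine, a rough timing sketch such as the following may help. The generated data (20000 lines of the form N,nameN,valueN) is purely hypothetical, and the timings depend heavily on the system and on which cut and awk implementations are installed.

```shell
#!/bin/bash
# Generate hypothetical test data: 20000 comma-separated lines
tmpfile="$(mktemp)"
trap 'rm -f "$tmpfile"' EXIT
awk 'BEGIN { for (i = 1; i <= 20000; i++) print i ",name" i ",value" i }' > "$tmpfile"

# Time each approach; the timing report goes to stderr
time cut -d ',' -f 2 "$tmpfile" > /dev/null

time awk -F ',' '{print $2}' "$tmpfile" > /dev/null

time while IFS=',' read -r -a fields; do
  printf '%s\n' "${fields[1]}"
done < "$tmpfile" > /dev/null
```

All three approaches remain linear in the file size; the differences measured here are constant factors.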
Use cut for simple delimiter-separated columns and awk for more complex extraction.
Specifying the wrong delimiter with cut -d causes wrong or empty output.
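A small sketch of the delimiter pitfall with cut -d, using an invented sample line. When the delimiter does not occur in a line, cut passes the whole line through unchanged; adding -s suppresses such lines, which produces empty output instead.

```shell
#!/bin/bash
# Hypothetical sample line
line='apple,banana,cherry'

# Wrong delimiter: the line contains no ';', so cut prints it unchanged
wrong="$(printf '%s\n' "$line" | cut -d ';' -f 2)"

# -s drops lines that do not contain the delimiter, so nothing is printed
suppressed="$(printf '%s\n' "$line" | cut -s -d ';' -f 2)"

# Correct delimiter: the second field is selected
right="$(printf '%s\n' "$line" | cut -d ',' -f 2)"

printf 'wrong:      [%s]\n' "$wrong"
printf 'suppressed: [%s]\n' "$suppressed"
printf 'right:      [%s]\n' "$right"
```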