
Bash Script to Extract Specific Column from File

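Use cut -d 'delimiter' -f column_number filename or awk '{print $column_number}' filename to extract a specific column from a file in Bash. cut needs a single-character delimiter; awk splits on whitespace by default and accepts another delimiter with -F.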
📋

Examples

Input
file.txt content:
apple,banana,cherry
cat,dog,elephant
Command: cut -d ',' -f 2 file.txt
Output
banana
dog

Input
file.txt content:
1 2 3
4 5 6
Command: awk '{print $3}' file.txt
Output
3
6

Input
file.txt content:
name|age|city
alice|30|nyc
bob|25|la
Command: cut -d '|' -f 1 file.txt
Output
name
alice
bob
🧠

How to Think About It

To extract a specific column from a file, first identify the delimiter separating columns, like a comma or space. Then use a tool like cut with the delimiter and column number or awk to print the desired column. This approach reads each line and outputs only the selected column.
📐

Algorithm

1. Get the filename, delimiter, and column number as input.
2. Read the file line by line.
3. Split each line by the delimiter.
4. Select the specified column from the split parts.
5. Print the selected column.
6. Repeat until all lines are processed.
💻

Code

bash
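#!/bin/bash
# Usage: ./extract_column.sh filename delimiter column_number

# Stop early if any of the three arguments is missing.
if [ "$#" -ne 3 ]; then
  echo "Usage: $0 filename delimiter column_number" >&2
  exit 1
fi

filename="$1"    # file to read
delimiter="$2"   # single-character field separator
column="$3"      # column number, counted from 1

# "--" ends option parsing, so a filename starting with "-" is still read as a file.
cut -d "$delimiter" -f "$column" -- "$filename"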
Output (from ./extract_column.sh file.txt ',' 2, using the first example's file.txt)
banana
dog
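
The script above passes its arguments straight to cut. As a hedged sketch, the same logic can be wrapped in a function that checks its inputs first. The function name extract_column, the error messages, and the temporary sample file are illustrative assumptions, not part of the original script.

```bash
#!/bin/bash
# Sketch: extract_column is a hypothetical wrapper around cut with basic checks.
extract_column() {
  if [ "$#" -ne 3 ]; then
    echo "Usage: extract_column filename delimiter column_number" >&2
    return 1
  fi
  local filename="$1" delimiter="$2" column="$3"

  # cut accepts only a single-character delimiter.
  if [ "${#delimiter}" -ne 1 ]; then
    echo "Error: delimiter must be exactly one character" >&2
    return 1
  fi
  # Columns are numbered from 1: reject empty, non-numeric, and zero-led values.
  case "$column" in
    ''|*[!0-9]*|0*)
      echo "Error: column must be a positive integer" >&2
      return 1
      ;;
  esac
  if [ ! -r "$filename" ]; then
    echo "Error: cannot read '$filename'" >&2
    return 1
  fi

  cut -d "$delimiter" -f "$column" -- "$filename"
}

# Example run against a temporary copy of the sample data.
sample=$(mktemp)
printf 'apple,banana,cherry\ncat,dog,elephant\n' > "$sample"
extract_column "$sample" ',' 2
```

With the sample data this prints banana and dog on separate lines. A column of 0 or a two-character delimiter makes the function return 1 before cut is ever called.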
🔍

Dry Run

Let's trace extracting the 2nd column from a comma-separated file with lines 'apple,banana,cherry' and 'cat,dog,elephant'.

1. Read first line
   Line: apple,banana,cherry

2. Split by comma
   Parts: [apple, banana, cherry]

3. Select 2nd column
   Selected: banana

4. Print selected column
   Output: banana

| Line | Split Parts | Selected Column |
|---|---|---|
| apple,banana,cherry | apple, banana, cherry | banana |
| cat,dog,elephant | cat, dog, elephant | dog |
💡

Why This Works

Step 1: Using cut with delimiter

The cut command splits each line by the specified delimiter using -d and selects the field with -f.

Step 2: Using awk for flexibility

By default, awk splits each line on runs of whitespace (spaces and tabs) and prints the desired column with {print $column_number}; pass -F to use a different delimiter.

Step 3: Output only the chosen column

Both tools output only the selected column, making it easy to extract data from structured text files.
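
The two tools differ in how they treat repeated separators, which matters for space-aligned files. The sketch below uses made-up sample lines: cut treats every single space as a separator, so a run of spaces produces empty fields, while awk's default splitting collapses runs of whitespace. It also shows awk's -F option for a non-whitespace delimiter.

```bash
#!/bin/bash
# Two spaces between "alice" and "30" (illustrative sample data).
line='alice  30 nyc'

# cut: field 2 is the empty string between the two spaces.
cut_result=$(printf '%s\n' "$line" | cut -d ' ' -f 2)

# awk: a run of whitespace counts as one separator, so field 2 is "30".
awk_result=$(printf '%s\n' "$line" | awk '{print $2}')

printf 'cut: [%s]\nawk: [%s]\n' "$cut_result" "$awk_result"

# awk with an explicit separator, equivalent to cut -d ',' -f 2.
csv_result=$(printf 'apple,banana,cherry\n' | awk -F ',' '{print $2}')
printf 'awk -F: [%s]\n' "$csv_result"
```

This prints cut: [], awk: [30], and awk -F: [banana], so awk is usually the safer choice when columns are padded with a variable number of spaces.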

🔄

Alternative Approaches

Using awk
bash
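#!/bin/bash
filename="$1"
column="$2"
# Pass the column number with -v so the awk program can stay in single quotes.
awk -v col="$column" '{print $col}' "$filename"
Handles whitespace-separated files and more complex patterns well. Passing the column with -v avoids splicing a shell variable into the awk program, which is easy to misquote.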
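Using while read and an array
bash
#!/bin/bash
filename="$1"
delimiter="$2"
column="$3"
# -r keeps backslashes literal; -a splits each line into the array "fields".
while IFS="$delimiter" read -ra fields; do
  # Array indexes start at 0, column numbers at 1.
  printf '%s\n' "${fields[$((column-1))]}"
done < "$filename"
Reads the file line by line and splits each line in the shell itself; useful for custom per-line processing, but it needs more code and is much slower than cut or awk on large files.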

Complexity: O(n) time, O(1) space

Time Complexity

The script reads each line once, so time grows linearly with file size (O(n)).

Space Complexity

Uses constant extra space since it processes one line at a time without storing all lines.

Which Approach is Fastest?

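cut is generally the fastest for simple single-character delimiters; awk is more flexible but slightly slower. The while read loop is the slowest by a wide margin because the shell itself processes every line.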

| Approach | Time | Space | Best For |
|---|---|---|---|
| cut | O(n) | O(1) | Simple delimiter-separated columns |
| awk | O(n) | O(1) | Flexible extraction with complex patterns |
| while read + split | O(n) | O(1) | Custom processing per line |
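
To check the speed comparison on your own machine, a rough sketch like the following can help. The file size (20000 lines), the temporary file, and the function names with_cut, with_awk, and with_loop are arbitrary choices for illustration. Absolute timings depend on the system, so no numbers are shown here.

```bash
#!/bin/bash
# Build a throwaway comma-separated file: "1,a1,b1", "2,a2,b2", ...
data=$(mktemp)
seq 1 20000 | awk '{print $1 ",a" $1 ",b" $1}' > "$data"

# Three ways to extract column 2 from the file given as the first argument.
with_cut()  { cut -d ',' -f 2 -- "$1"; }
with_awk()  { awk -F ',' '{print $2}' "$1"; }
with_loop() {
  local -a fields
  while IFS=',' read -ra fields; do
    printf '%s\n' "${fields[1]}"
  done < "$1"
}

# bash's time keyword reports to stderr; the extracted column is discarded.
for approach in with_cut with_awk with_loop; do
  echo "== $approach" >&2
  time "$approach" "$data" > /dev/null
done
```

All three functions produce the same column, so the timings compare like with like. On larger files the gap between the loop and the external tools tends to widen.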
💡
Use cut for simple delimiter-separated columns and awk for more complex extraction.
⚠️
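Giving cut the wrong delimiter with -d (or omitting -d, in which case cut splits on tabs) produces wrong output: lines that do not contain the delimiter are printed unchanged unless you add -s.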
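
A quick way to see this pitfall: when a line does not contain the delimiter given to -d, cut prints the whole line unchanged rather than reporting an error. The sketch below (the sample data is illustrative) also shows the -s option, which suppresses lines that contain no delimiter.

```bash
#!/bin/bash
data='apple,banana,cherry
cat,dog,elephant'

# Wrong delimiter: no line contains ';', so every line is printed unchanged.
wrong=$(printf '%s\n' "$data" | cut -d ';' -f 2)

# With -s, lines without the delimiter are dropped, so nothing is printed.
suppressed=$(printf '%s\n' "$data" | cut -s -d ';' -f 2)

# Correct delimiter: only the second field of each line.
right=$(printf '%s\n' "$data" | cut -d ',' -f 2)

printf 'wrong:\n%s\n\nsuppressed: [%s]\n\nright:\n%s\n' "$wrong" "$suppressed" "$right"
```

If the output of cut looks identical to the input file, the delimiter is the first thing to check.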