Bash Script to Split File into Parts Easily
Use the Bash command split -l 100 filename prefix to split a file into parts of 100 lines each, where filename is your file and prefix is the prefix for the output file names.
Examples
Input: file.txt with 250 lines, split into parts of 100 lines each
Output: Creates files prefixaa (lines 1-100), prefixab (lines 101-200), prefixac (lines 201-250)
Input: file.txt with 50 lines, split into parts of 20 lines each
Output: Creates files prefixaa (lines 1-20), prefixab (lines 21-40), prefixac (lines 41-50)
Input: file.txt with 10 lines, split into parts of 15 lines each
Output: Creates one file prefixaa with all 10 lines
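The three examples above can be reproduced with a short script. This is a minimal sketch: try_split is a hypothetical helper name, the sample files are generated with seq (assumed to be installed), and everything runs in a scratch directory created by mktemp.

```bash
#!/bin/bash
set -euo pipefail

workdir=$(mktemp -d)   # scratch directory so nothing existing is touched
cd "$workdir"

# try_split TOTAL PER_PART (hypothetical helper): build a TOTAL-line file,
# split it into parts of PER_PART lines, and report each part's line count.
try_split() {
    local total=$1 per_part=$2 part
    local dir="case_${total}_${per_part}"
    mkdir "$dir"
    seq 1 "$total" > "$dir/file.txt"
    ( cd "$dir" && split -l "$per_part" file.txt prefix )
    for part in "$dir"/prefix*; do
        echo "$total lines, $per_part per part: ${part##*/} has $(( $(wc -l < "$part") )) lines"
    done
}

try_split 250 100
try_split 50 20
try_split 10 15
```

The last call should report a single part, prefixaa, holding all 10 lines, because the input is shorter than the requested part size.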
How to Think About It
To split a file into parts, decide how many lines each part should have. Then use the split command with the -l option to divide the file into smaller files, each containing that many lines. The command names the output files automatically by appending a suffix to the prefix you give it.
Algorithm
1. Get the input file name and the desired number of lines per part.
2. Use the split command with the -l option and a prefix for the output files.
3. The split command creates multiple files, each with the specified number of lines; the last file holds whatever lines remain.
4. Check the output files to confirm the split.
Code
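bash
#!/bin/bash
# File to split, lines per part, and prefix for the output file names
input_file="file.txt"
lines_per_part=100
output_prefix="part_"

# Create part_aa, part_ab, ... with at most $lines_per_part lines each
split -l "$lines_per_part" "$input_file" "$output_prefix"
echo "File split into parts with prefix '$output_prefix'"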
Output
File split into parts with prefix 'part_'
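The four steps of the algorithm, including the final check, can also be wrapped in a function. The sketch below is one possible shape, not the only one: split_by_lines is a hypothetical helper name, and the check assumes that no unrelated files in the current directory start with the chosen prefix.

```bash
#!/bin/bash
# split_by_lines FILE LINES PREFIX (hypothetical helper wrapping split).
# Assumes no other files in the current directory start with PREFIX.
split_by_lines() {
    local input_file=$1 lines_per_part=$2 output_prefix=$3

    # Step 1: check the inputs before doing any work.
    if [[ ! -r $input_file ]]; then
        echo "Cannot read '$input_file'" >&2
        return 1
    fi
    if [[ ! $lines_per_part =~ ^[1-9][0-9]*$ ]]; then
        echo "Lines per part must be a positive integer" >&2
        return 1
    fi

    # Steps 2 and 3: split creates the parts.
    split -l "$lines_per_part" "$input_file" "$output_prefix" || return 1

    # Step 4: the parts together should hold as many lines as the input.
    local input_lines part_lines
    input_lines=$(( $(wc -l < "$input_file") ))
    part_lines=$(( $(cat "$output_prefix"* | wc -l) ))
    if (( input_lines != part_lines )); then
        echo "Line count mismatch: $input_lines in, $part_lines out" >&2
        return 1
    fi
    echo "Split '$input_file' into parts with prefix '$output_prefix' ($part_lines lines)"
}

# Usage sketch in a scratch directory with a generated 250-line file.
cd "$(mktemp -d)"
seq 1 250 > file.txt
split_by_lines file.txt 100 part_
```

Validating the line count up front gives a clearer error message than letting split reject the value itself.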
Dry Run
Let's trace splitting a file with 250 lines into parts of 100 lines each.
1. Set variables: input_file = 'file.txt', lines_per_part = 100, output_prefix = 'part_'
2. Run split command: split -l 100 file.txt part_
3. Resulting files: part_aa (lines 1-100), part_ab (lines 101-200), part_ac (lines 201-250)
| Output File | Lines Included |
|---|---|
| part_aa | 1-100 |
| part_ab | 101-200 |
| part_ac | 201-250 |
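The dry run can be checked on a real file. In this sketch the input is generated with seq so that line N contains the number N; the first and last line of each part then show directly which range of the original file it holds. The scratch directory and the generated file are assumptions made for the demonstration.

```bash
#!/bin/bash
set -euo pipefail

cd "$(mktemp -d)"              # scratch directory
seq 1 250 > file.txt           # line N contains the number N
split -l 100 file.txt part_

# The first and last line of each part reveal its original line range.
for part in part_*; do
    first=$(head -n 1 "$part")
    last=$(tail -n 1 "$part")
    echo "$part: lines $first-$last"
done
# Prints:
#   part_aa: lines 1-100
#   part_ab: lines 101-200
#   part_ac: lines 201-250
```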
Why This Works
Step 1: Using split command
The split command divides a file into smaller files based on line count or size.
Step 2: Option -l for lines
The -l option tells split how many lines each output file should contain.
Step 3: Output file naming
Split names output files by adding suffixes like 'aa', 'ab' to the given prefix.
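The suffix scheme can be adjusted. The sketch below uses -a, which sets the suffix length and is part of the POSIX split interface, and then tries -d, which requests numeric suffixes but is a GNU coreutils extension; the script probes the help text first and skips that step on other implementations. The file names and scratch directory are assumptions for the demonstration.

```bash
#!/bin/bash
set -euo pipefail

cd "$(mktemp -d)"              # scratch directory
seq 1 250 > file.txt

# -a 3 asks for three-letter suffixes: part_aaa, part_aab, part_aac
split -l 100 -a 3 file.txt part_
ls part_*

# -d (numeric suffixes) is a GNU extension, so only use it if advertised.
help_text=$(split --help 2>&1 || true)
if [[ $help_text == *--numeric-suffixes* ]]; then
    split -l 100 -d file.txt num_    # num_00, num_01, num_02
    ls num_*
fi
```

A longer suffix matters for large jobs: POSIX specifies a default suffix length of 2, which allows 676 distinct output names.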
Alternative Approaches
Split by byte size
bash
split -b 1M file.txt part_
Splits the file into parts of 1 MiB (1,048,576 bytes) each instead of counting lines; useful for binary files, where line boundaries carry no meaning.
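Parts created by size can be joined back together with cat, because the shell expands the glob in the same alphabetical order in which split assigned the suffixes. The sketch below uses a small 2500-byte random file and 1000-byte parts so it runs quickly; data.bin, chunk_, and rebuilt.bin are hypothetical names.

```bash
#!/bin/bash
set -euo pipefail

cd "$(mktemp -d)"                        # scratch directory
head -c 2500 /dev/urandom > data.bin     # 2500 bytes of sample binary data

split -b 1000 data.bin chunk_            # chunk_aa, chunk_ab (1000 bytes each), chunk_ac (500 bytes)

# Reassemble in suffix order and compare byte for byte with the original.
cat chunk_* > rebuilt.bin
if cmp -s data.bin rebuilt.bin; then
    echo "rebuilt.bin is identical to data.bin"
fi
```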
Using csplit for pattern-based splitting
bash
csplit file.txt '/pattern/' '{*}'
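Splits the file before each line matching the pattern; more flexible but more complex. The '{*}' repeat count is a GNU extension, and the output files are named xx00, xx01, and so on by default.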
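As an illustration of pattern-based splitting, the sketch below cuts a small file at every line that starts with "# ". It assumes GNU csplit, since '{*}' is not available everywhere; notes.txt and the section_ prefix are hypothetical names. The -f option sets the output prefix and -s suppresses the byte counts csplit normally prints.

```bash
#!/bin/bash
set -euo pipefail

cd "$(mktemp -d)"    # scratch directory

# Sample input: one intro line followed by two "# " headed sections.
printf '%s\n' 'intro' '# one' 'a' '# two' 'b' 'c' > notes.txt

# Start a new output file before every line that begins with "# ".
csplit -s -f section_ notes.txt '/^# /' '{*}'

for part in section_*; do
    echo "$part starts with: $(head -n 1 "$part")"
done
```

Here section_00 holds the text before the first match, and each later file begins with its own matching line.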
Manual split with awk
bash
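awk 'NR%100==1 {if (file) close(file); file=sprintf("part_%03d.txt", ++i)} {print > file}' file.txt
Splits the file every 100 lines using awk; gives more control over file naming but needs more code. Closing each part before opening the next keeps awk from running out of open file handles on large inputs.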
Complexity: O(n) time, O(1) space
Time Complexity
The split command reads the entire file once, so time grows linearly with file size.
Space Complexity
Split writes output files directly without loading the whole file into memory, so space is constant.
Which Approach is Fastest?
All of these approaches read the input once, so they scale the same way. The built-in split command is the simplest and is usually at least as fast as a hand-written script.
| Approach | Time | Space | Best For |
|---|---|---|---|
| split -l | O(n) | O(1) | Simple line-based splitting |
| split -b | O(n) | O(1) | Splitting by file size |
| csplit | O(n) | O(1) | Pattern-based splitting |
| awk manual | O(n) | O(1) | Custom splitting logic |
Use a clear prefix to easily identify split files and avoid overwriting.
If you omit the output prefix, split falls back to the default prefix x and creates files named xaa, xab, and so on, which are hard to identify later.
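Both points can be seen in a short sketch: the first split call omits the prefix and produces the default names, and the second is guarded by a check that the chosen prefix is not already in use. prefix_in_use is a hypothetical helper built on the Bash builtin compgen, not a feature of split.

```bash
#!/bin/bash
# prefix_in_use PREFIX (hypothetical helper): succeeds if any file in the
# current directory already starts with PREFIX.
prefix_in_use() {
    compgen -G "$1*" > /dev/null
}

cd "$(mktemp -d)"              # scratch directory
seq 1 250 > file.txt

split -l 100 file.txt          # no prefix given: creates xaa, xab, xac
ls

output_prefix="part_"
if prefix_in_use "$output_prefix"; then
    echo "Files starting with '$output_prefix' already exist; not splitting" >&2
else
    split -l 100 file.txt "$output_prefix"
fi
```

Running the guarded part a second time in the same directory would print the warning instead of overwriting the existing parts.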