Looping over files and directories in Bash Scripting - Deep Dive

Overview - Looping over files and directories
What is it?
Looping over files and directories means running a set of commands repeatedly for each file or folder in a location. In bash scripting, this helps automate tasks like checking, moving, or modifying many files without doing each one by hand. It uses simple loops that go through each item one by one. This makes managing files faster and less error-prone.
Why it matters
Without looping over files and directories, you would have to manually handle each file, which is slow and boring. Automating this saves time and reduces mistakes, especially when dealing with many files. It allows scripts to adapt to changing file lists, making your work more flexible and powerful. This is essential for tasks like backups, organizing files, or batch processing.
Where it fits
Before learning this, you should understand basic bash commands and how to write simple scripts. After mastering loops over files, you can learn advanced file handling, conditional processing, and automation workflows. This topic is a foundation for automating system tasks and managing data efficiently.
Mental Model
Core Idea
Looping over files and directories means repeating actions for each item in a folder automatically, like checking every book on a shelf one by one.
Think of it like...
Imagine you have a basket of apples and you want to check each apple for bruises. Instead of looking at all apples at once, you pick one apple, check it, then move to the next until all are checked. Looping over files is the same but with files instead of apples.
Folder
  ├─ file1
  ├─ file2
  ├─ dir1
  │    ├─ file3
  │    └─ file4
  └─ file5

Loop:
  for each item in folder:
    do something with item
    move to next item
  end
Build-Up - 7 Steps
1
Foundation: Basic for loop syntax in bash
Concept: Learn the simplest way to write a loop that repeats commands a fixed number of times.
In bash, a basic for loop looks like this:
  for i in 1 2 3
  do
    echo "Number $i"
  done
This prints the numbers 1, 2, and 3, one per line.
Result
  Number 1
  Number 2
  Number 3
Understanding the basic for loop syntax is the first step to automating repeated tasks in bash.
2
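Foundation: Using wildcards to list files
Concept: Use wildcards like * to select multiple files or directories in a folder.
The * wildcard matches every file and folder name in the current directory, except hidden names that start with a dot. Example: echo * prints all matching names in sorted order. (ls * would also print the contents of each matched directory; ls -d * lists only the names.) You can use patterns like *.txt to match only names ending in .txt.
Result
  docs file1.txt file2.txt image.png script.sh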
Wildcards let you select groups of files easily, which is essential for looping over them.
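As a minimal sketch of how these patterns expand, the following creates a scratch directory and expands two patterns inside it. The file names and the demo_dir variable are invented for illustration. Note that * leaves out the hidden name .hidden.

```bash
#!/usr/bin/env bash
# Scratch directory with a few hypothetical sample entries.
demo_dir=$(mktemp -d)
touch "$demo_dir/file1.txt" "$demo_dir/file2.txt" "$demo_dir/image.png" "$demo_dir/.hidden"
mkdir "$demo_dir/docs"

# The shell replaces each pattern with the sorted list of matching names.
# The subshell keeps the cd from affecting the rest of the script.
all_items=$(cd "$demo_dir" && echo *)
txt_items=$(cd "$demo_dir" && echo *.txt)

echo "All items:  $all_items"
echo "Text files: $txt_items"

rm -rf "$demo_dir"
```

The pattern is expanded by the shell itself, so echo never sees the * character; it only receives the finished list of names.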
3
Intermediate: Looping over files with for and wildcards
🤔 Before reading on: do you think 'for file in *' loops over files only or files and directories? Commit to your answer.
Concept: Combine for loops with wildcards to process each file or directory in a folder.
Example:
  for file in *
  do
    echo "Found: $file"
  done
This loop prints the name of every visible file and directory in the current folder, in sorted order.
Result
  Found: docs
  Found: file1.txt
  Found: file2.txt
  Found: image.png
  Found: script.sh
Knowing that * matches both files and directories helps you handle all items or filter them as needed.
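A minimal sketch of the same loop applied to a directory passed as an argument rather than the current one. The helper name list_items and the sample names are hypothetical. It also guards against one detail of globbing: when nothing matches, bash leaves the pattern as literal text, so the loop would otherwise run once with a name that does not exist.

```bash
#!/usr/bin/env bash
# list_items is a hypothetical helper: print every visible entry of a directory.
list_items() {
  local dir=$1 item
  for item in "$dir"/*; do
    # If nothing matched, the pattern stays as literal text; skip it.
    [ -e "$item" ] || continue
    echo "Found: ${item##*/}"   # strip the leading directory part
  done
}

demo_dir=$(mktemp -d)
touch "$demo_dir/file1.txt" "$demo_dir/script.sh"
mkdir "$demo_dir/docs" "$demo_dir/empty"

found=$(list_items "$demo_dir")
empty_found=$(list_items "$demo_dir/empty")

echo "$found"
echo "Entries in empty directory: [${empty_found}]"

rm -rf "$demo_dir"
```

Setting the nullglob option (shopt -s nullglob) is another way to handle the no-match case; the explicit test is used here because it does not change shell-wide behavior.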
4
Intermediate: Filtering only files or directories
🤔 Before reading on: can you guess how to check if an item is a file or directory inside a loop? Commit to your answer.
Concept: Use test commands inside the loop to act only on files or only on directories.
Inside the loop, use -f to check for regular files and -d for directories. Example:
  for item in *
  do
    if [ -f "$item" ]; then
      echo "$item is a file"
    elif [ -d "$item" ]; then
      echo "$item is a directory"
    fi
  done
Result
  docs is a directory
  file1.txt is a file
  file2.txt is a file
  image.png is a file
  script.sh is a file
Filtering lets you target your commands precisely, avoiding errors or unwanted actions.
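The filtering idea can be sketched as a small function that tallies both kinds of entry. The name count_by_type and the sample files are assumptions made for this illustration.

```bash
#!/usr/bin/env bash
# count_by_type is a hypothetical helper: tally regular files and directories.
count_by_type() {
  local dir=$1 item files=0 dirs=0
  for item in "$dir"/*; do
    if [ -f "$item" ]; then
      files=$((files + 1))
    elif [ -d "$item" ]; then
      dirs=$((dirs + 1))
    fi
  done
  echo "files=$files dirs=$dirs"
}

demo_dir=$(mktemp -d)
touch "$demo_dir/a.txt" "$demo_dir/b.txt" "$demo_dir/c.png"
mkdir "$demo_dir/docs"

summary=$(count_by_type "$demo_dir")
echo "$summary"

rm -rf "$demo_dir"
```

Both -f and -d follow symbolic links, so a link to a directory counts as a directory here; add a [ -L "$item" ] check if links should be treated separately.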
5
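Intermediate: Handling filenames with spaces safely
🤔 Before reading on: do you think 'for file in *' handles filenames with spaces correctly? Commit to your answer.
Concept: Learn how to loop safely over files with spaces or special characters in their names.
The glob in for file in * hands each name to the loop intact, spaces included, as long as you write "$file" in quotes wherever you use it. What does split on spaces is a list built from command output, such as for file in $(ls). When the names come from another command, use null-delimited output instead:
  while IFS= read -r -d '' file; do
    echo "File: $file"
  done < <(find . -mindepth 1 -maxdepth 1 -print0)
This uses find with -print0 and reads each name up to the NUL byte, so spaces and even newlines are preserved. The -mindepth 1 option keeps find from listing the directory itself (.).
Result
  File: ./file1.txt
  File: ./my file with spaces.txt
  File: ./docs
  File: ./script.sh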
Handling spaces correctly prevents bugs and data loss in scripts working with real-world files.
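A minimal sketch of the null-delimited technique, collecting names into a bash array so they can be reused later. The sample file names are invented, and -mindepth and -maxdepth are assumed to be available (they are GNU and BSD find options, not POSIX).

```bash
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
touch "$demo_dir/plain.txt" "$demo_dir/my file with spaces.txt"

names=()
# -print0 ends each path with a NUL byte; read -d '' splits on that byte only.
while IFS= read -r -d '' path; do
  names+=("${path##*/}")        # keep just the file name
done < <(find "$demo_dir" -mindepth 1 -maxdepth 1 -type f -print0)

printf 'Collected %d names\n' "${#names[@]}"
printf '[%s]\n' "${names[@]}"   # brackets make the name boundaries visible

rm -rf "$demo_dir"
```

The loop reads from process substitution rather than a pipe, so it runs in the current shell and the names array is still populated after the loop ends.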
6
Advanced: Recursively looping through directories
🤔 Before reading on: do you think a simple for loop with * goes into subdirectories automatically? Commit to your answer.
Concept: Use recursion or the find command to process files inside subfolders as well.
A simple * only matches names in the current folder. To loop recursively:
  find . -type f -print0 | while IFS= read -r -d '' file; do
    echo "File: $file"
  done
This finds all files in the current folder and its subfolders and loops over them. The -print0 and read -d '' pair keeps names with spaces or newlines intact.
Result
  File: ./file1.txt
  File: ./docs/readme.md
  File: ./docs/manual.pdf
  File: ./script.sh
Recursion lets scripts handle complex folder trees, essential for backups or large data processing.
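Recursion can also be written directly in the shell. The following is a sketch of a hypothetical function, walk, that calls itself for each subdirectory; the directory layout is invented for the example. It is an illustration of the idea rather than a replacement for find.

```bash
#!/usr/bin/env bash
# walk is a hypothetical recursive helper: print every regular file under a directory.
walk() {
  local dir=$1 item             # local keeps each call's variables separate
  for item in "$dir"/*; do
    if [ -d "$item" ] && [ ! -L "$item" ]; then
      walk "$item"              # descend, but do not follow symlinked directories
    elif [ -f "$item" ]; then
      echo "File: $item"
    fi
  done
}

demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/docs/old"
touch "$demo_dir/file1.txt" "$demo_dir/docs/readme.md" "$demo_dir/docs/old/notes.txt"

listing=$(cd "$demo_dir" && walk .)
echo "$listing"

rm -rf "$demo_dir"
```

Because it relies on *, this version skips hidden files and directories, which find would include.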
7
Expert: Avoiding common pitfalls with globbing and quoting
🤔 Before reading on: do you think quoting variables inside loops is optional or necessary? Commit to your answer.
Concept: Understand how the shell expands wildcards (globbing) and why quoting variables prevents bugs.
Globbing expands * before the loop runs, and each matched name becomes one loop item. If you forget quotes when using the variable:
  for file in *; do
    echo $file
  done
the shell splits the value of $file on whitespace and treats any wildcard characters in it as a new pattern, so a command can receive several wrong arguments instead of one filename. Correct:
  for file in *; do
    echo "$file"
  done
Always quote variables to preserve exact filenames.
Result
Correctly prints filenames even with spaces, avoiding broken output.
Quoting is a simple habit that prevents subtle bugs and data corruption in scripts.
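The effect of quoting is easy to observe by counting the arguments a command receives. This sketch uses a hypothetical helper, count_args, and an invented file name; the file does not need to exist.

```bash
#!/usr/bin/env bash
# count_args is a hypothetical helper: report how many arguments it received.
count_args() { echo "$#"; }

file="my file.txt"

unquoted=$(count_args $file)     # word splitting turns one name into two words
quoted=$(count_args "$file")     # quotes keep the name as a single argument

echo "unquoted: $unquoted argument(s)"
echo "quoted:   $quoted argument(s)"
```

A command such as rm or mv would act on those two separate words, which is how an unquoted variable ends up touching the wrong files.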
Under the Hood
When you write 'for file in *', the shell first expands the * wildcard into a list of matching filenames, one list item per name, and the loop runs once for each. Names produced by globbing are never split, even if they contain spaces. The splitting happens later: when a variable such as $file is expanded without quotes, the shell breaks its value into separate words at whitespace, so a command receives the wrong arguments. Commands like 'find' produce their lists as a stream of text, which is safe to read back only with a delimiter that cannot occur in a filename (the NUL byte from -print0). The shell's parsing and expansion rules control how loops receive and process file names.
Why designed this way?
The shell uses globbing (wildcard expansion) to let users easily select files without typing each name. This design is simple and fast, but every unquoted variable expansion is still subject to word splitting. Quoting and commands like 'find' cover real-world complexities such as spaces and recursion. This balance keeps the shell lightweight but requires careful scripting to avoid pitfalls.
User writes: for file in *
       ↓
Shell expands * → "file1.txt" "file2.txt" "my file.txt"
       ↓
Loop runs:
  Iteration 1: file = file1.txt
  Iteration 2: file = file2.txt
  Iteration 3: file = my file.txt  (one item; glob results are not split)

Inside the loop:
  cat $file    → cat receives "my" and "file.txt" (wrong split)
  cat "$file"  → cat receives "my file.txt" (one item)
Myth Busters - 4 Common Misconceptions
Quick: Does 'for file in *' loop only over files, or also directories? Commit to your answer.
Common Belief: The loop 'for file in *' only processes files, not directories.
Reality: The * wildcard matches both files and directories, so the loop processes all items in the folder.
Why it matters: Scripts may fail or behave unexpectedly if they assume only files are processed, causing errors when directories appear.
Quick: Can you safely loop over filenames with spaces using 'for file in *'? Commit to your answer.
Common Belief: Filenames with spaces need no extra care anywhere in a 'for file in *' loop.
Reality: The glob itself delivers each filename intact, spaces included. The breakage happens when $file is expanded without quotes inside the loop, or when the list is built from command output such as $(ls); either one splits a single filename into multiple parts.
Why it matters: This causes bugs, data loss, or incorrect processing in scripts working with real files.
Quick: Does a simple for loop with * go into subdirectories automatically? Commit to your answer.
Common Belief: Using 'for file in *' loops through all files in the current directory and all subdirectories automatically.
Reality: The * wildcard only matches items in the current directory; it does not recurse into subfolders.
Why it matters: Scripts that expect recursive processing will miss files deeper in folders, causing incomplete results.
Quick: Is quoting variables inside loops optional? Commit to your answer.
Common Belief: Quoting variables like "$file" inside loops is optional and mostly stylistic.
Reality: Quoting variables is necessary to preserve filenames exactly, especially those with spaces or special characters.
Why it matters: Not quoting leads to broken scripts and hard-to-find bugs when filenames contain spaces or unusual characters.
Expert Zone
1
Globbing happens before the loop runs, so the list of files is fixed at loop start; changes during the loop won't affect iteration.
2
Using 'find' with -print0 and 'read -d ""' handles all filenames safely, including those with newlines, which line-based loops such as 'find | while read' cannot handle.
3
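Filenames with spaces survive globbing intact. The care is needed in two places: when reading names from command output, where IFS= and read -r stop read from trimming whitespace or interpreting backslashes, and whenever a variable is expanded, where quoting prevents splitting.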
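The first point, that the file list is fixed when the loop starts, can be checked with a short experiment. This is a sketch with invented file names: the loop body creates new files that match the pattern, yet the loop still runs only once per original file.

```bash
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
touch "$demo_dir/a.txt" "$demo_dir/b.txt"

iterations=0
for f in "$demo_dir"/*.txt; do
  # Create a new matching file on every pass; the loop never sees these.
  touch "$demo_dir/new_$iterations.txt"
  iterations=$((iterations + 1))
done

remaining=("$demo_dir"/*.txt)   # a fresh expansion does see the new files
echo "Loop ran $iterations times; directory now holds ${#remaining[@]} .txt files"

rm -rf "$demo_dir"
```

The same snapshot behavior means a file deleted during the loop still appears as a loop item, so a check such as [ -e "$f" ] inside the body can be worthwhile.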
When NOT to use
Simple for loops with * skip hidden files, do not recurse, and run once with the literal pattern when nothing matches (unless the nullglob option is set); use 'find' with null-separated output when you need recursion or filtering by type, name, or age. For very large directories, 'find' streams its results instead of building the whole list in memory first. Avoid loops when a single command can process multiple files at once (like 'xargs' or built-in commands).
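To illustrate replacing a loop with a single command, this sketch counts lines across several files twice: once with a loop and once by letting find hand all the files to one cat process through -exec ... {} +. The sample files and variable names are assumptions for the example.

```bash
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
printf 'one\ntwo\n' > "$demo_dir/first log.txt"
printf 'three\n'    > "$demo_dir/second.txt"

# Loop version: one wc process per file.
loop_total=0
for f in "$demo_dir"/*.txt; do
  n=$(wc -l < "$f")
  loop_total=$((loop_total + n))
done

# Single-command version: "-exec ... {} +" passes many files to one cat.
# The outer $(( )) strips any padding some wc implementations print.
batch_total=$(( $(find "$demo_dir" -type f -name '*.txt' -exec cat {} + | wc -l) ))

echo "loop total:  $loop_total"
echo "batch total: $batch_total"

rm -rf "$demo_dir"
```

With thousands of files the batch form starts far fewer processes. find -print0 piped to xargs -0 is a comparable alternative where those options are available.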
Production Patterns
In real systems, scripts often combine 'find' with while-read loops for safe recursive processing. They also use functions to encapsulate file handling and trap errors for robustness. Batch renaming, backups, and log processing commonly use these patterns with careful quoting and error checking.
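One way such a pattern might look is sketched below. The function names process_file and process_tree, the rule that empty files count as failures, and the sample tree are all hypothetical; a real script would substitute its own per-file work.

```bash
#!/usr/bin/env bash
# process_file is a hypothetical per-file handler; here it fails on empty files.
process_file() {
  local path=$1
  [ -s "$path" ] || { echo "skip (empty): $path" >&2; return 1; }
  echo "ok: ${path##*/}"
}

# process_tree is a hypothetical driver: run the handler on every file
# under a root directory and count how many calls failed.
process_tree() {
  local root=$1 path failures=0
  while IFS= read -r -d '' path; do
    process_file "$path" || failures=$((failures + 1))
  done < <(find "$root" -type f -print0)
  echo "failures=$failures"
  [ "$failures" -eq 0 ]           # non-zero exit status if anything failed
}

demo_dir=$(mktemp -d)
trap 'rm -rf "$demo_dir"' EXIT    # clean up even if the script aborts
mkdir -p "$demo_dir/logs"
echo "data" > "$demo_dir/logs/app log.txt"
: > "$demo_dir/empty.txt"

if report=$(process_tree "$demo_dir" 2>/dev/null); then
  status=0
else
  status=$?
fi
echo "$report"
echo "exit status: $status"
```

Keeping the per-file work in its own function means the loop only has to count failures, and the driver's exit status lets a caller decide whether to stop or continue.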
Connections
Recursion in Computer Science
Looping recursively through directories builds on the idea of recursion, where a function calls itself to handle nested structures.
Understanding recursion helps grasp how scripts explore folder trees deeply and systematically.
Regular Expressions
Wildcards in shell globbing are simpler cousins of regular expressions used for pattern matching.
Knowing regex helps create more precise file selection patterns beyond basic wildcards.
Assembly Line in Manufacturing
Looping over files is like an assembly line where each item is processed step-by-step automatically.
Seeing loops as automation pipelines clarifies how scripts save time and reduce human error.
Common Pitfalls
#1 Unquoted variable breaks on filenames with spaces
Wrong approach:
  for file in *
  do
    echo $file
  done
Correct approach:
  for file in *
  do
    echo "$file"
  done
Root cause: Without quotes, the shell splits the value of $file on whitespace, so the command receives several words instead of one filename.
#2 Assuming loop processes only files, not directories
Wrong approach:
  for file in *
  do
    # process as file without checking
    cat "$file"
  done
Correct approach:
  for item in *
  do
    if [ -f "$item" ]; then
      cat "$item"
    fi
  done
Root cause: Wildcard * matches both files and directories; ignoring this causes errors when directories are processed as files.
#3 Expecting recursive processing with simple for loop
Wrong approach:
  for file in *
  do
    echo "$file"
  done
Correct approach:
  find . -type f -print0 | while IFS= read -r -d '' file; do
    echo "$file"
  done
Root cause: The * wildcard only matches the current directory; recursion requires a command like find.
Key Takeaways
Looping over files and directories automates repetitive tasks, saving time and reducing errors.
The shell expands wildcards before loops run, so understanding globbing is key to correct file selection.
Always quote variables holding filenames to handle spaces and special characters safely.
Simple loops do not recurse into subdirectories; use 'find' or recursion for deep folder processing.
Mastering these concepts builds a strong foundation for powerful and reliable bash scripting.