
Adding a submodule in Git - Deep Dive

Overview - Adding a submodule
What is it?
Adding a submodule in git means including one git repository inside another as a folder. This lets you keep a separate project inside your main project, but still track it with git. The submodule points to a specific commit of the other project, so you can control which version you use. It helps manage dependencies or shared code cleanly.
Why it matters
Without submodules, you would have to copy code from other projects manually or mix unrelated histories, making updates and collaboration messy. Submodules solve this by linking projects while keeping them separate, so updates and version control stay clear and organized. This saves time and reduces errors when working with multiple codebases.
Where it fits
Before learning submodules, you should understand basic git commands like clone, commit, and push. After mastering submodules, you can explore advanced git topics like submodule updates, nested submodules, and git workflows involving multiple repositories.
Mental Model
Core Idea
A git submodule is a pointer inside your project to a specific commit of another separate git repository.
Think of it like...
It's like having a book with a chapter that is actually a separate booklet inserted inside. The main book references the booklet, but the booklet can be updated independently.
Main Project Repo
└── Submodule Folder (points to another repo commit)

[Main Repo] ── contains ──> [Submodule Folder]
[Submodule Folder] ── tracks ──> [External Repo at specific commit]
Build-Up - 6 Steps
1. Foundation: What is a git submodule
Concept: Introduce the idea of a submodule as a separate git repository inside another.
A git submodule is a way to include one git repository inside another as a folder. This folder is linked to a specific commit of the external repository. It allows you to keep projects separate but connected.
Result
You understand that a submodule is not just a folder, but a link to another git repo at a fixed version.
Understanding that submodules keep projects separate but linked helps avoid confusion about mixing codebases.
2. Foundation: How to add a submodule
Concept: Learn the basic command to add a submodule to your git project.
Use the command: git submodule add <repository-url> <path>
This clones the external repo into the specified folder, registers it in .gitmodules, and stages both changes so that your next commit records the submodule. Example:
git submodule add https://github.com/example/libfoo.git libs/libfoo
Result
The submodule folder appears in your project, and git tracks it as a submodule.
Knowing the exact command to add a submodule is the first step to managing external code cleanly.
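The command can be tried safely without touching a real remote. The following is a minimal sketch, assuming Git 2.28 or newer (for git init -b); the libfoo and project repositories, the demo identity, and the temporary directory are all hypothetical stand-ins created by the script itself.

```shell
#!/bin/sh
# Sketch: add a local stand-in repository as a submodule and inspect the result.
set -eu

# Throwaway identity so commits work without any global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Hypothetical external library, standing in for a real remote URL.
git init -q -b main "$work/libfoo"
echo "hello from libfoo" > "$work/libfoo/README"
git -C "$work/libfoo" add README
git -C "$work/libfoo" commit -q -m "Initial libfoo commit"

# The main project that will contain the submodule.
git init -q -b main "$work/project"
cd "$work/project"
git commit -q --allow-empty -m "Initial project commit"

# protocol.file.allow=always is needed only because the "remote" is a local
# path; it is not required for https:// or ssh:// URLs.
git -c protocol.file.allow=always submodule add "$work/libfoo" libs/libfoo

# 'submodule add' stages .gitmodules and the submodule entry; commit them together.
staged=$(git diff --cached --name-only)
git commit -q -m "Add libfoo as a submodule"

echo "Staged by 'submodule add':"
echo "$staged"
echo "Contents of .gitmodules:"
cat .gitmodules
```

With a real dependency, the local path would simply be replaced by the repository URL, as in the libfoo example above.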
3. Intermediate: Submodule commit tracking
🤔 Before reading on: do you think the submodule automatically updates to the latest commit of the external repo, or stays fixed at a commit? Commit to your answer.
Concept: Understand that submodules track a specific commit, not the latest changes automatically.
When you add a submodule, git records the exact commit of the external repo. If the external repo changes, your submodule stays at the old commit until you check out a newer commit inside the submodule and commit the new pointer in the main repo.
Result
Your main project uses a fixed version of the submodule, ensuring stability.
Knowing that submodules do not auto-update prevents surprises when your code suddenly changes due to external repo updates.
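The pinning behaviour can be observed directly. This is a rough sketch under the same assumptions as a local experiment would need (Git 2.28+, hypothetical libfoo and project repositories in a temporary directory): the external repo gains a commit, yet the commit recorded by the main repo does not move.

```shell
#!/bin/sh
# Sketch: show that the main repo keeps pointing at the pinned submodule commit.
set -eu

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Hypothetical external library.
git init -q -b main "$work/libfoo"
echo "v1" > "$work/libfoo/README"
git -C "$work/libfoo" add README
git -C "$work/libfoo" commit -q -m "Initial libfoo commit"

# Main project with libfoo added as a submodule.
git init -q -b main "$work/project"
cd "$work/project"
git -c protocol.file.allow=always submodule add "$work/libfoo" libs/libfoo
git commit -q -m "Add libfoo as a submodule"

# The third field of ls-tree output is the commit recorded for the submodule path.
pinned=$(git ls-tree HEAD libs/libfoo | awk '{print $3}')

# The external repo moves on...
echo "v2" >> "$work/libfoo/README"
git -C "$work/libfoo" commit -q -am "Upstream change"
upstream=$(git -C "$work/libfoo" rev-parse HEAD)

# ...but neither the recorded pointer nor the checked-out submodule changes.
recorded=$(git ls-tree HEAD libs/libfoo | awk '{print $3}')
checked_out=$(git -C libs/libfoo rev-parse HEAD)

echo "pinned:      $pinned"
echo "recorded:    $recorded"
echo "checked out: $checked_out"
echo "upstream:    $upstream"
```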
4. Intermediate: Cloning projects with submodules
🤔 Before reading on: when cloning a repo with submodules, do you think the submodule content is cloned automatically or requires extra steps? Commit to your answer.
Concept: Learn how to clone a project that contains submodules properly.
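When you clone a repo with submodules using a plain git clone, the submodule folders exist but are empty. From inside the clone, run:
git submodule update --init --recursive
This fetches each submodule and checks out the commit recorded by the main repo. Alternatively, git clone --recurse-submodules <repository-url> performs both steps at once.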
Result
Your local copy has the main project and all submodules correctly checked out.
Understanding this prevents confusion when submodule folders appear empty after cloning.
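Both cloning routes can be compared side by side. The sketch below assumes Git 2.28+ and builds hypothetical libfoo and project repositories in a temporary directory; the clone and clone2 directory names are likewise illustrative.

```shell
#!/bin/sh
# Sketch: plain clone followed by submodule init, versus a one-step recursive clone.
set -eu

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Hypothetical external library and a main project that uses it as a submodule.
git init -q -b main "$work/libfoo"
echo "hello from libfoo" > "$work/libfoo/README"
git -C "$work/libfoo" add README
git -C "$work/libfoo" commit -q -m "Initial libfoo commit"

git init -q -b main "$work/project"
git -C "$work/project" -c protocol.file.allow=always submodule add "$work/libfoo" libs/libfoo
git -C "$work/project" commit -q -m "Add libfoo as a submodule"

# Route 1: plain clone leaves the submodule folder empty.
git clone -q "$work/project" "$work/clone"
cd "$work/clone"
if [ -e libs/libfoo/README ]; then before=present; else before=missing; fi

# Fetch the submodule and check out the recorded commit.
# (protocol.file.allow is needed only because the demo remotes are local paths.)
git -c protocol.file.allow=always submodule update --init --recursive
if [ -e libs/libfoo/README ]; then after=present; else after=missing; fi
echo "README before init: $before, after init: $after"

# Route 2: clone and initialize submodules in one step.
git -c protocol.file.allow=always clone -q --recurse-submodules \
    "$work/project" "$work/clone2"
ls "$work/clone2/libs/libfoo"
```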
5. Advanced: Updating submodules to new commits
🤔 Before reading on: do you think updating a submodule requires committing in the main repo or just inside the submodule? Commit to your answer.
Concept: Learn how to update a submodule to a newer commit and record that change in the main repo.
To update a submodule:
1. Enter the submodule folder.
2. Pull or check out the desired commit.
3. Go back to the main repo and commit the changed submodule pointer.
Example:
cd libs/libfoo
git checkout main
git pull
cd ../..
git add libs/libfoo
git commit -m "Update libfoo submodule"
Result
The main repo now points to the updated submodule commit, keeping history consistent.
Knowing that submodule updates require commits in the main repo avoids lost changes and confusion.
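The three steps can be rehearsed end to end on throwaway repositories. This is a sketch, assuming Git 2.28+ and a default branch named main; libfoo, project, and the demo identity are hypothetical and created by the script.

```shell
#!/bin/sh
# Sketch: move a submodule to a newer upstream commit and record it in the main repo.
set -eu

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Hypothetical external library and main project.
git init -q -b main "$work/libfoo"
echo "v1" > "$work/libfoo/README"
git -C "$work/libfoo" add README
git -C "$work/libfoo" commit -q -m "Initial libfoo commit"

git init -q -b main "$work/project"
cd "$work/project"
git -c protocol.file.allow=always submodule add "$work/libfoo" libs/libfoo
git commit -q -m "Add libfoo as a submodule"

# Upstream publishes a new commit.
echo "v2" >> "$work/libfoo/README"
git -C "$work/libfoo" commit -q -am "Upstream change"
upstream=$(git -C "$work/libfoo" rev-parse HEAD)

# Steps 1 and 2: enter the submodule and bring it to the desired commit.
cd libs/libfoo
git checkout -q main
git pull -q --ff-only
cd ../..

# The main repo now sees the submodule path as modified.
git status --short

# Step 3: stage and commit the new pointer in the main repo.
git add libs/libfoo
git commit -q -m "Update libfoo submodule"

recorded=$(git ls-tree HEAD libs/libfoo | awk '{print $3}')
echo "recorded: $recorded"
echo "upstream: $upstream"
```

Until the final commit is pushed, collaborators keep receiving the old pointer, which is why step 3 matters as much as the pull itself.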
6. Expert: Handling submodule pitfalls and workflows
🤔 Before reading on: do you think submodules simplify or complicate collaboration? Commit to your answer.
Concept: Explore common challenges with submodules and best practices to manage them in teams.
Submodules can cause confusion if team members forget to update them or commit submodule changes. Best practices include:
- Always run 'git submodule update --init --recursive' after cloning.
- Commit submodule pointer changes in the main repo.
- Use scripts or CI checks to ensure submodules are correct.
- Consider alternatives like git subtree if submodules are too complex.
Result
Teams avoid common submodule errors and keep dependencies consistent.
Understanding submodule challenges helps you decide when to use them and how to avoid collaboration issues.
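One possible shape for such a CI check is sketched below. It relies on the prefix characters printed by git submodule status: "-" for an uninitialized submodule, "+" when the checked-out commit differs from the recorded one, and "U" for merge conflicts. The check_submodules function name and the demo repositories are hypothetical; Git 2.28+ is assumed for the setup.

```shell
#!/bin/sh
# Sketch: fail a build when any submodule is uninitialized or out of sync.
set -eu

check_submodules() {
    # Keep only status lines that start with -, + or U.
    bad=$(git submodule status --recursive | grep '^[-+U]' || true)
    if [ -n "$bad" ]; then
        echo "Submodules not initialized or out of sync:" >&2
        echo "$bad" >&2
        return 1
    fi
    return 0
}

# --- Demo on throwaway repositories ---
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

git init -q -b main "$work/libfoo"
echo "hello" > "$work/libfoo/README"
git -C "$work/libfoo" add README
git -C "$work/libfoo" commit -q -m "Initial libfoo commit"

git init -q -b main "$work/project"
git -C "$work/project" -c protocol.file.allow=always submodule add "$work/libfoo" libs/libfoo
git -C "$work/project" commit -q -m "Add libfoo as a submodule"

# A fresh plain clone has an uninitialized submodule, so the check fails.
git clone -q "$work/project" "$work/clone"
cd "$work/clone"
if check_submodules; then fresh=ok; else fresh=stale; fi

# After initialization the check passes.
git -c protocol.file.allow=always submodule update --init --recursive
if check_submodules; then ready=ok; else ready=stale; fi

echo "fresh clone: $fresh, after init: $ready"
```

In a pipeline, only the function and a bare call to it would be needed; the non-zero return value is what fails the job.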
Under the Hood
Git records a submodule in two places. The .gitmodules file in the main repo maps each submodule's path to its URL. The index, and the tree of each commit, holds a gitlink entry (mode 160000) at the submodule path, and this gitlink stores the exact commit hash of the submodule repo. When you commit in the main repo, git saves this commit hash, not the submodule files themselves. The submodule folder is a separate git repository inside the main repo folder; in current Git versions its .git is usually a small file that points to the real metadata under the main repo's .git/modules directory.
Why designed this way?
Submodules were designed to keep projects separate but linked, avoiding code duplication and mixing histories. This design allows independent versioning and updates of subprojects while maintaining a clear reference in the main project. Alternatives like copying code or merging repos lose this separation and cause maintenance headaches.
Main Repo
├─ .gitmodules (tracks submodule URLs and paths)
├─ Submodule Folder (contains separate git repo)
│  └─ .git (or git metadata)
└─ Gitlink entry (in main repo index) points to submodule commit

Workflow:
[Main Repo commit] ──> records submodule commit hash
[Submodule Folder] ──> independent repo with own commits
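The pieces in the diagram can be inspected with ordinary git commands. The following sketch assumes Git 2.28+ and uses hypothetical libfoo and project repositories in a temporary directory; the exact gitdir path printed may vary between Git versions.

```shell
#!/bin/sh
# Sketch: look at the gitlink entry and the submodule's git metadata.
set -eu

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

git init -q -b main "$work/libfoo"
echo "hello" > "$work/libfoo/README"
git -C "$work/libfoo" add README
git -C "$work/libfoo" commit -q -m "Initial libfoo commit"

git init -q -b main "$work/project"
cd "$work/project"
git -c protocol.file.allow=always submodule add "$work/libfoo" libs/libfoo
git commit -q -m "Add libfoo as a submodule"

# Index entry for the submodule path: "<mode> <commit-hash> <stage>	<path>".
entry=$(git ls-files --stage libs/libfoo)
mode=${entry%% *}            # first field: the file mode
echo "index entry: $entry"
echo "mode:        $mode"    # 160000 marks a gitlink

# The same pointer as stored in the committed tree.
git ls-tree HEAD libs/libfoo

# The submodule's .git is a file that points at the real metadata directory.
gitfile=$(cat libs/libfoo/.git)
echo "libs/libfoo/.git contains: $gitfile"

# URL and path mapping tracked in .gitmodules.
git config -f .gitmodules --list
```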
Myth Busters - 4 Common Misconceptions
Quick: do you think updating a submodule automatically updates the main repo's pointer? Commit yes or no.
Common Belief: Updating the submodule repo automatically updates the main repo to use the new submodule version.
Reality: The main repo only updates its pointer to the submodule commit when you explicitly commit that change in the main repo.
Why it matters: Without committing the pointer update, collaborators will still use the old submodule version, causing inconsistencies.
Quick: do you think cloning a repo with submodules fetches all submodule content automatically? Commit yes or no.
Common Belief: Cloning a repo with submodules downloads all submodule files automatically.
Reality: Submodule folders are empty after cloning until you run 'git submodule update --init'.
Why it matters: New users may think submodules are missing or broken, leading to confusion and wasted time.
Quick: do you think submodules merge their history into the main repo? Commit yes or no.
Common Belief: Submodules merge their commit history into the main repo's history.
Reality: Submodules keep their history separate; the main repo only tracks a commit pointer, not the full history.
Why it matters: Misunderstanding this can cause incorrect assumptions about project size, history, and blame tracking.
Quick: do you think submodules are always the best way to include external code? Commit yes or no.
Common Belief: Submodules are always the best method to include external projects in git.
Reality: Submodules add complexity and are not always the best choice; alternatives like git subtree or package managers may be better.
Why it matters: Choosing submodules blindly can cause workflow problems and slow down development.
Expert Zone
1. Submodules require explicit commits in the main repo to update their tracked commit, which can cause subtle bugs if forgotten.
2. Nested submodules (submodules inside submodules) add complexity and require recursive commands to manage properly.
3. The .gitmodules file is versioned in the main repo and controls submodule URLs and paths, so changing it affects all collaborators.
When NOT to use
Avoid submodules when your team prefers simpler workflows or when you want to merge external code history directly. Alternatives include git subtree for embedding repos without separate pointers, or using package managers to handle dependencies outside git.
Production Patterns
In production, submodules are often used for shared libraries or tools that evolve independently. Teams automate submodule updates in CI pipelines and enforce policies to commit submodule pointer changes. Some projects use scripts to simplify submodule initialization for new developers.
Connections
Package Management
Alternative approach
Understanding submodules helps compare them with package managers that handle external code dependencies outside git, highlighting tradeoffs in version control and workflow.
Modular Programming
Builds-on
Submodules support modular programming by allowing separate code modules to be developed and versioned independently but used together.
Containerization (Docker)
Complementary technology
Knowing submodules helps understand how code dependencies can be managed at the source level, complementing containerization which manages dependencies at runtime.
Common Pitfalls
#1 Forgetting to initialize submodules after cloning.
Wrong approach:
git clone https://github.com/example/project.git
# Then start working without running any submodule commands
Correct approach:
git clone https://github.com/example/project.git
cd project
git submodule update --init --recursive
Root cause: Assuming cloning fetches all submodule content automatically.
#2 Updating the commit inside the submodule but not committing the pointer in the main repo.
Wrong approach:
cd submodule_folder
git pull
# Then continue working in the main repo without git add/commit
Correct approach:
cd submodule_folder
git pull
cd ..
git add submodule_folder
git commit -m "Update submodule pointer"
Root cause: Not realizing the main repo tracks the submodule commit via a pointer that must be committed.
#3 Changing the submodule URL in .gitmodules but not updating the local config.
Wrong approach:
Edit .gitmodules manually but do not run git submodule sync
Correct approach:
Edit .gitmodules
Run git submodule sync to copy the new URL into the local config
Root cause: Not realizing that the URL in .gitmodules and the copy stored in the local git config must be kept consistent.
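The effect of git submodule sync can be seen by comparing the local config before and after. This is a sketch under stated assumptions: Git 2.28+, hypothetical libfoo and project repositories, and a hypothetical libfoo-mirror.git standing in for the submodule's new location.

```shell
#!/bin/sh
# Sketch: change a submodule URL in .gitmodules and propagate it with 'sync'.
set -eu

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

git init -q -b main "$work/libfoo"
echo "hello" > "$work/libfoo/README"
git -C "$work/libfoo" add README
git -C "$work/libfoo" commit -q -m "Initial libfoo commit"

git init -q -b main "$work/project"
cd "$work/project"
git -c protocol.file.allow=always submodule add "$work/libfoo" libs/libfoo
git commit -q -m "Add libfoo as a submodule"

# Hypothetical new home for the library.
git clone -q --bare "$work/libfoo" "$work/libfoo-mirror.git"

# Edit the URL in .gitmodules (equivalent to editing the file by hand).
git config -f .gitmodules submodule.libs/libfoo.url "$work/libfoo-mirror.git"

# The local config still holds the old URL at this point.
before=$(git config submodule.libs/libfoo.url)

# Copy the URL from .gitmodules into .git/config and the submodule's remote.
git submodule sync

after=$(git config submodule.libs/libfoo.url)
remote=$(git -C libs/libfoo remote get-url origin)

# Commit the .gitmodules change so collaborators receive the new URL.
git add .gitmodules
git commit -q -m "Point libfoo submodule at its new location"

echo "before sync: $before"
echo "after sync:  $after"
echo "submodule remote: $remote"
```

Collaborators who pull this commit still need to run git submodule sync themselves, because each clone keeps its own copy of the URL in its local config.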
Key Takeaways
Git submodules let you include one git repository inside another as a separate folder linked to a specific commit.
Submodules track a fixed commit and do not update automatically; you must update and commit the pointer in the main repo.
After cloning a repo with submodules, you must run 'git submodule update --init --recursive' to fetch submodule content.
Submodules add complexity and require careful management, so understand their workflow and consider alternatives when appropriate.
Mastering submodules improves your ability to manage multi-repository projects and shared code dependencies cleanly.