Git Submodule
What Is a Submodule
A code repository often depends on code from other repositories. You can copy the external code directly into the main repository, or pull it in through a language's package management system. However, both methods have the downside of not tracking changes to the external repository. Git lets you include other Git repositories, called submodules, inside a single parent repository, so that changes across multiple repositories can be tracked from one place.
A submodule is a repository included in the parent repository at a specific path in the working directory. Submodules can be located anywhere in the working directory and are configured via the .gitmodules file at the root of the parent repository. The .gitmodules file maps each submodule's URL to its local directory. Submodules support adding, synchronizing, updating, and cloning. A submodule tracks only a specific commit, not a Git reference or branch.
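The .gitmodules mapping and the pinned commit can be inspected directly. The following is a minimal sketch, assuming a throwaway directory with two local repositories; the names lib, parent, and hello.txt are hypothetical, and the protocol.file.allow=always setting is only there because the submodule URL is a local path, which recent Git versions refuse for submodules by default.

```shell
set -eu

# Work in a throwaway directory; every path and name below is hypothetical.
demo=$(mktemp -d)
cd "$demo"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# A stand-in for the external repository, with a single commit.
git init -q lib
echo "hello" > lib/hello.txt
git -C lib add hello.txt
git -C lib commit -q -m "lib: first commit"

# A parent repository that includes lib at the path textexample.
# protocol.file.allow=always is needed only because the URL is a local path.
git init -q parent
cd parent
git -c protocol.file.allow=always submodule --quiet add "$demo/lib" textexample
git commit -q -m "added submodule"

# .gitmodules maps the submodule's local path to its URL.
sub_path=$(git config -f .gitmodules --get submodule.textexample.path)
sub_url=$(git config -f .gitmodules --get submodule.textexample.url)
echo "path=$sub_path url=$sub_url"

# The parent's tree stores one commit ID for the submodule (mode 160000),
# not a branch name.
git ls-tree HEAD textexample
recorded=$(git rev-parse HEAD:textexample)
echo "recorded commit: $recorded"
```

The ls-tree entry is what "tracks only a specific commit" means in practice: the parent stores a commit ID for the path, and nothing about which branch that commit came from.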
When to Use Submodules
Working with submodules is tricky, so consider them mainly in the following cases.
- The subproject changes quickly, or upcoming changes will break its API, and you want to lock the code to a specific commit for safety.
- A component isn’t updated very often, and you want to track it as a vendor dependency.
- You delegate part of the project to a third party and want to integrate their work at a particular time (this works only when updates are not too frequent).
- Your codebase is massive and you don’t want to fetch it every time; submodules let collaborators avoid downloading the entire codebase.
If the technological context allows packaging and formal dependency management, use a package manager instead of submodules.
Commands for Git Submodules
To add a new submodule to an existing repository, use git submodule add. First, create a repository to work in:
mkdir git-submodule-demo
cd git-submodule-demo/
git init
Initialized empty Git repository in /Users/example/git-submodule-demo/.git/
To add a submodule to the new repository, run the following:
git submodule add https://somehost/example/textexample
Cloning into '/Users/example/git-submodule-demo/textexample'...
remote: Counting objects: 8, done.
remote: Compressing objects: 100% (6/6), done.
remote: Total 8 (delta 1), reused 0 (delta 0)
Unpacking objects: 100% (8/8), done.
The git submodule add command takes a URL pointing to a Git repository. Git immediately clones the textexample submodule. Check the state of the repository by running the git status command:
git status
On branch master
No commits yet
Changes to be committed:
(use "git rm --cached <file>..." to unstage)
new file: .gitmodules
new file: textexample
You can commit the files to the repository using the git add and git commit commands:
git add .gitmodules textexample/
git commit -m "added submodule"
[master (root-commit) d5002d0] added submodule
2 files changed, 4 insertions(+)
create mode 100644 .gitmodules
create mode 160000 textexample
Updating Submodules
Team members should update the submodule code when it has been modified elsewhere. Running git pull alone is not enough, because it updates only the parent repository's reference to the submodule commit, not the submodule's checked-out code. To update the submodule to the recorded commit, run:
git submodule update
Note: Without the --remote flag, this command only checks out the commit recorded in the parent repository and does not move the submodule to newer upstream commits.
If the .gitmodules file is updated (for example, if the submodule URL changes), run git submodule sync to update the local .git/config with the new URLs before running git submodule update.
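The difference between a plain update, an update with --remote, and a sync can be seen in a small local experiment. This is a sketch under stated assumptions: the repositories lib and parent, the file version.txt, and the lib-mirror URL are hypothetical, the upstream branch is forced to master so that --remote resolves the same way on older and newer Git versions, and protocol.file.allow=always is needed only because the URLs are local paths.

```shell
set -eu

# Throwaway setup; all names and paths are hypothetical.
demo=$(mktemp -d)
cd "$demo"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Upstream repository at "v1".
git -c init.defaultBranch=master init -q lib
echo "v1" > lib/version.txt
git -C lib add version.txt
git -C lib commit -q -m "lib: v1"

# Parent repository that pins lib at v1.
git init -q parent
git -C parent -c protocol.file.allow=always submodule --quiet add "$demo/lib" textexample
git -C parent commit -q -m "add submodule at v1"
pinned=$(git -C parent rev-parse HEAD:textexample)

# Upstream moves on to "v2".
echo "v2" > lib/version.txt
git -C lib commit -q -am "lib: v2"
latest=$(git -C lib rev-parse HEAD)

cd parent

# Plain update: the submodule stays at the commit the parent records.
git -c protocol.file.allow=always submodule --quiet update
after_plain=$(git -C textexample rev-parse HEAD)

# --remote: fetch upstream and check out the tip of the tracked branch.
git -c protocol.file.allow=always submodule --quiet update --remote
after_remote=$(git -C textexample rev-parse HEAD)
echo "pinned=$pinned plain=$after_plain remote=$after_remote"

# Simulate a URL change in .gitmodules, then copy it into .git/config.
git config -f .gitmodules submodule.textexample.url "$demo/lib-mirror"
git submodule --quiet sync
synced_url=$(git config --get submodule.textexample.url)
echo "synced url: $synced_url"
```

After the --remote update the parent repository sees the submodule as modified, because its checked-out commit no longer matches the recorded one; the parent keeps pointing at the old commit until the submodule path is added and committed.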
Cloning Git Submodules
To clone a project with submodules, use the git clone command. By default, it clones the parent repository but leaves submodule directories empty. You must then run git submodule init and git submodule update. The former updates the local .git/config with the mappings from .gitmodules, while the latter fetches the submodule data and checks out the recorded commit.
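The clone, init, update sequence described above can be sketched with local repositories. The names lib, parent, clone, and hello.txt are hypothetical, and protocol.file.allow=always is an assumption needed only because the submodule URL here is a local path.

```shell
set -eu

# Throwaway setup; all names and paths are hypothetical.
demo=$(mktemp -d)
cd "$demo"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q lib
echo "hello" > lib/hello.txt
git -C lib add hello.txt
git -C lib commit -q -m "lib: first commit"

git init -q parent
git -C parent -c protocol.file.allow=always submodule --quiet add "$demo/lib" textexample
git -C parent commit -q -m "added submodule"

# A plain clone creates the submodule directory but leaves it empty.
git clone -q parent clone
cd clone
before=$(ls -A textexample)
echo "contents before init/update: '$before'"

# init copies the mapping from .gitmodules into .git/config.
git submodule --quiet init
registered_url=$(git config --get submodule.textexample.url)

# update clones the submodule and checks out the recorded commit.
git -c protocol.file.allow=always submodule --quiet update
recorded=$(git rev-parse HEAD:textexample)
checked_out=$(git -C textexample rev-parse HEAD)
echo "recorded=$recorded checked_out=$checked_out"
```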
Alternatively, use the --recursive flag with git clone to automatically initialize and update submodules. If you cloned without --recursive, run the following:
git clone /url/to/repo/with/submodules
git submodule init
git submodule update
Pulling the Submodule's Code
When someone adds a new submodule to the parent repository, other team members need to initialize it after pulling. First, run git pull to fetch the latest parent repository state. If new submodules are listed, initialize them with:
git submodule init
Note that init only updates the local .git/config file. To actually fetch the code and check out the recorded commit, you must run git submodule update.
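A teammate's side of this workflow might look like the sketch below. It assumes local stand-in repositories (lib, parent, and teammate are hypothetical names), a fast-forward pull, and protocol.file.allow=always because the submodule URL is a local path.

```shell
set -eu

# Throwaway setup; all names and paths are hypothetical.
demo=$(mktemp -d)
cd "$demo"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q lib
echo "hello" > lib/hello.txt
git -C lib add hello.txt
git -C lib commit -q -m "lib: first commit"

# The parent starts without submodules, and a teammate clones it.
git init -q parent
echo "demo" > parent/README
git -C parent add README
git -C parent commit -q -m "initial commit"
git clone -q parent teammate

# Later, someone adds a submodule to the parent.
git -C parent -c protocol.file.allow=always submodule --quiet add "$demo/lib" textexample
git -C parent commit -q -m "added submodule"

# The teammate pulls: the submodule path appears, but it is empty.
cd teammate
git pull -q --ff-only
after_pull=$(ls -A textexample)

# init only registers the URL in .git/config; the directory stays empty.
git submodule --quiet init
url_after_init=$(git config --get submodule.textexample.url)
after_init=$(ls -A textexample)

# update fetches the code and checks out the recorded commit.
git -c protocol.file.allow=always submodule --quiet update
ls textexample
```

The two steps can also be combined as git submodule update --init.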
Pushing Updates in Submodule
Since a submodule is a separate repository, you can push changes to it like any regular Git repository by running commands inside the submodule’s directory. If you make new commits inside the submodule, the parent repository will still point to the old commit. When you run git status in the main repository, the submodule is listed under “Changes not staged for commit” as modified, with the annotation “(new commits)”. This indicates that the submodule is checked out at a different commit than the one the parent repository records. To point the parent repository at the new commit, run git add on the submodule directory, then commit and push.
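The round trip can be sketched locally as follows. The bare repository lib.git stands in for the submodule's remote so that pushing to it works; seed, parent, and version.txt are hypothetical names, the parent has no remote here so its final push is omitted, and protocol.file.allow=always is needed only because the submodule URL is a local path.

```shell
set -eu

# Throwaway setup; all names and paths are hypothetical.
demo=$(mktemp -d)
cd "$demo"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# A bare repository plays the role of the submodule's remote.
git init -q seed
echo "v1" > seed/version.txt
git -C seed add version.txt
git -C seed commit -q -m "lib: v1"
git clone -q --bare seed lib.git

git init -q parent
git -C parent -c protocol.file.allow=always submodule --quiet add "$demo/lib.git" textexample
git -C parent commit -q -m "added submodule"

# Commit and push from inside the submodule, like any other repository.
cd parent/textexample
echo "v2" > version.txt
git commit -q -am "lib: v2"
git push -q origin HEAD
cd ..

# The parent still records the old commit and reports the mismatch.
status_line=$(LC_ALL=C git status | grep textexample)
echo "$status_line"

# Record the new submodule commit in the parent.
git add textexample
git commit -q -m "update textexample to new commit"
recorded=$(git rev-parse HEAD:textexample)
pushed=$(git -C "$demo/lib.git" rev-parse HEAD)
echo "recorded=$recorded pushed=$pushed"
```

Pushing the submodule before committing the new pointer in the parent matters: if the parent records a commit that was never pushed, collaborators cannot fetch it when they update.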
Submodules are a good way to keep projects in separate repositories while still referencing them as folders in another repository’s working directory. However, keep in mind that for many projects, submodules are not the best practice, and working with them can be tricky.