Git Submodule

What Is a Submodule

Very often, a code repository depends on external code from other repositories. You can copy and paste the external code directly into the main repository, or you can rely on the language's package management system. Both approaches have the downside that the main repository does not track changes to the external repository. Git lets you include other Git repositories, called submodules, inside a single parent repository, so that changes in several repositories can be tracked from one.

A submodule is a repository placed at a specific path in the working directory of the parent repository. Submodules can be located anywhere in the working directory and are configured via the .gitmodules file at the root of the parent repository. The .gitmodules file contains the mapping between each submodule project's URL and its local directory. The git submodule command supports adding, synchronizing, updating, and cloning submodules. The parent repository records a specific commit for each submodule, not a branch or any other Git reference.
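
As a rough illustration of that mapping, the sketch below writes a .gitmodules file by hand in a scratch directory and reads it back with git config. The submodule name textexample and its URL are the placeholder values used later in this article. In practice git submodule add generates the file, so the hand-written version is only an assumption made for demonstration.

```shell
# Work in a scratch directory so nothing real is touched.
cd "$(mktemp -d)"

# A hand-written .gitmodules; normally `git submodule add` creates this file.
cat > .gitmodules <<'EOF'
[submodule "textexample"]
    path = textexample
    url = https://somehost/example/textexample
EOF

# Read the mapping back: the local directory and the URL of the submodule.
sub_path=$(git config -f .gitmodules --get submodule.textexample.path)
sub_url=$(git config -f .gitmodules --get submodule.textexample.url)
echo "$sub_path -> $sub_url"
```

Because .gitmodules uses the same syntax as other Git configuration files, git config -f can query it without a surrounding repository.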

When to Use Submodules

Working with submodules is tricky, so we suggest some best use cases for them.

  • If the subproject is changing too fast or upcoming changes will break the API, lock the code to a specific commit for safety.
  • If a component isn’t updated very often, and you want to track it as a vendor dependency.
  • If you delegate a part of the project to a third party, and you want to integrate their work at a particular time (works only when updates are not too frequent).
  • If the technological context does not allow packaging and formal dependency management, use submodules. Where it does, prefer the package manager.
  • If your codebase is massive and you don’t want to fetch it every time, use submodules so that collaborators don’t have to fetch the entire codebase.

Commands for Git Submodules

To add a submodule to an existing repository, use git submodule add. The following commands create a new directory, enter it, and initialize it as a new repository:

mkdir git-submodule-demo
cd git-submodule-demo/
git init
Initialized empty Git repository in /Users/example/git-submodule-demo/.git/

To add a submodule to the new repository run the following:

git submodule add https://somehost/example/textexample
Cloning into '/Users/example/git-submodule-demo/textexample'...
remote: Counting objects: 8, done.
remote: Compressing objects: 100% (6/6), done.
remote: Total 8 (delta 1), reused 0 (delta 0)
Unpacking objects: 100% (8/8), done.

The git submodule add command takes the URL of a Git repository as its argument. Git immediately clones the textexample submodule. Check the state of the repository by running the git status command:

git status
On branch master

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)

 new file: .gitmodules
 new file: textexample

Two new entries appear in the repository: the .gitmodules file and the textexample directory, which Git records as a single entry pointing to a submodule commit. Commit them to the parent repository with the git add and git commit commands:

git add .gitmodules textexample
git commit -m "added submodule"
[master (root-commit) d5002d0] added submodule
 2 files changed, 4 insertions(+)
 create mode 100644 .gitmodules
 create mode 160000 textexample
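
To see what that commit actually stores, the following sketch repeats the steps in a scratch directory and inspects the result. It assumes a throwaway local repository ($work/lib) as a stand-in for the remote URL, a demo identity for the commits, and protocol.file.allow=always, which recent Git versions need only because the stand-in is a local path. None of these come from the example above.

```shell
# Scratch area; "$work/lib" is a hypothetical local stand-in for the remote.
work=$(mktemp -d)
git init -q "$work/lib"
git -C "$work/lib" -c user.name=Demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"

# Parent repository with the submodule added and committed.
git init -q "$work/parent"
cd "$work/parent"
git -c protocol.file.allow=always submodule --quiet add "$work/lib" textexample
git -c user.name=Demo -c user.email=demo@example.com commit -q -m "added submodule"

# The tree entry for the submodule has mode 160000 and type "commit".
entry=$(git ls-tree HEAD textexample)
echo "$entry"

# The recorded commit is the one checked out inside the submodule.
recorded=$(git rev-parse HEAD:textexample)
checked_out=$(git -C textexample rev-parse HEAD)
echo "recorded:    $recorded"
echo "checked out: $checked_out"
```

The parent stores only that one commit ID for textexample, which is why the commit output reports mode 160000 instead of a regular file mode.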

Updating Submodules

Team members should update their submodule code whenever someone changes the commit the submodule points to. git pull alone is not enough: it only retrieves the information that the submodule now points to another commit and does not update the submodule's code. To update the submodule's code, run the following:

git submodule update
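
The difference between pulling and updating can be observed with a small local experiment. The sketch below is only an illustration: the lib, parent, and teammate repositories are hypothetical stand-ins created in a temporary directory, and the g helper is an assumed shorthand that sets a demo identity and allows local-path submodule remotes.

```shell
work=$(mktemp -d)
# Assumed helper: demo identity, and file transport allowed for local stand-in remotes.
g() { git -c user.name=Demo -c user.email=demo@example.com \
          -c protocol.file.allow=always "$@"; }

# "lib" plays the submodule's remote, "parent" the main repository.
git init -q "$work/lib"
g -C "$work/lib" commit -q --allow-empty -m "lib v1"
git init -q "$work/parent"
g -C "$work/parent" submodule --quiet add "$work/lib" textexample
g -C "$work/parent" commit -q -m "added submodule"

# A teammate clones the project together with its submodule.
g clone -q --recurse-submodules "$work/parent" "$work/teammate"

# Someone publishes a new lib commit and moves the parent's pointer to it.
g -C "$work/lib" commit -q --allow-empty -m "lib v2"
g -C "$work/parent/textexample" pull -q --ff-only
g -C "$work/parent" add textexample
g -C "$work/parent" commit -q -m "bump textexample"

# The teammate pulls: the pointer changes, the checked-out submodule code does not.
g -C "$work/teammate" pull -q --ff-only
after_pull=$(git -C "$work/teammate/textexample" rev-parse HEAD)

# git submodule update checks out the commit the parent now records.
g -C "$work/teammate" submodule --quiet update
after_update=$(git -C "$work/teammate/textexample" rev-parse HEAD)
recorded=$(git -C "$work/teammate" rev-parse HEAD:textexample)

echo "after pull:   $after_pull"
echo "after update: $after_update"
echo "recorded:     $recorded"
```

With default settings, the commit printed after the pull is still the old one, and only the update moves the submodule to the recorded commit.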

Cloning Git Submodules

To clone a project with submodules, use the git clone command. By default it creates the submodule directories but not the files within them, so you then need to run git submodule init and git submodule update. The first copies the mapping from the .gitmodules file into the local .git/config, and the second fetches all the data from the submodule project and checks out the commit recorded in the parent project.

The --recursive option of the git clone command (spelled --recurse-submodules in current Git) initializes and updates submodules in one step. Alternatively, run the following:

git clone /url/to/repo/with/submodules
cd submodules
git submodule init
git submodule update
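
Both routes can be compared side by side in a scratch directory. This is a sketch under stated assumptions: lib and parent are hypothetical local repositories standing in for real remotes, the g helper is an assumed shorthand for a demo identity plus permission to use local-path submodule remotes, and update --init is used as a one-command form of init followed by update.

```shell
work=$(mktemp -d)
# Assumed helper: demo identity, and file transport allowed for local stand-in remotes.
g() { git -c user.name=Demo -c user.email=demo@example.com \
          -c protocol.file.allow=always "$@"; }

# A submodule remote with one file, and a parent repository that includes it.
git init -q "$work/lib"
echo "hello" > "$work/lib/README"
g -C "$work/lib" add README
g -C "$work/lib" commit -q -m "lib v1"
git init -q "$work/parent"
g -C "$work/parent" submodule --quiet add "$work/lib" textexample
g -C "$work/parent" commit -q -m "added submodule"

# Route 1: plain clone. The submodule directory exists but is empty.
g clone -q "$work/parent" "$work/plain"
before=$(ls -A "$work/plain/textexample")
echo "files before update: '$before'"

# init and update in one command fill the directory.
g -C "$work/plain" submodule --quiet update --init
ls -A "$work/plain/textexample"

# Route 2: clone and populate submodules in one step.
g clone -q --recurse-submodules "$work/parent" "$work/onestep"
ls -A "$work/onestep/textexample"
```

After either route the README file from the submodule is present in textexample.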

Pulling the Submodule's Code

When someone adds a new submodule, the other members of the team have to initialize it. First, run git pull to get the information about the submodule. If there are new submodules, you'll see them in the output of git pull. Then initialize them and fetch their code with:

git submodule update --init

git submodule init on its own only registers the submodule in the local configuration. Adding the update step pulls the code from the submodule and places it in the directory it is configured to use.

Pushing Updates in Submodule

Because the submodule is a separate repository, you can commit and push in it like in a regular Git repository by running commands in the submodule’s directory. After you make new commits inside the submodule, the main repository will still point to the old commit. If you want the changes in the main repository too, you have to tell the main repository to use the new commit of the submodule. When you run git status in the main repository, the submodule appears under “Changes not staged for commit” with the note “new commits” (or “modified content” if the submodule has uncommitted changes). This means the submodule is checked out at a different commit than the one the main repository points at. To make the main repository point to the new commit, run git add on the submodule's path, then commit and push.
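
The whole round trip might look like the sketch below. It is an illustration under assumptions, not a prescribed workflow: lib.git is a hypothetical bare repository standing in for the submodule's remote, seed exists only to give it a first commit, the parent has no remote of its own (so its final push is left as a comment), and g is an assumed helper for a demo identity and local-path remotes.

```shell
work=$(mktemp -d)
# Assumed helper: demo identity, and file transport allowed for local stand-in remotes.
g() { git -c user.name=Demo -c user.email=demo@example.com \
          -c protocol.file.allow=always "$@"; }

# Bare stand-in remote for the submodule, seeded with one commit.
git init -q --bare "$work/lib.git"
git init -q "$work/seed"
g -C "$work/seed" commit -q --allow-empty -m "lib v1"
g -C "$work/seed" push -q "$work/lib.git" HEAD

# Main repository that includes the submodule.
git init -q "$work/parent"
g -C "$work/parent" submodule --quiet add "$work/lib.git" textexample
g -C "$work/parent" commit -q -m "added submodule"

# Commit and push inside the submodule, like in any other repository.
g -C "$work/parent/textexample" commit -q --allow-empty -m "lib v2"
g -C "$work/parent/textexample" push -q origin HEAD

# The main repository still records the old commit; status reports the mismatch.
status_text=$(git -C "$work/parent" status)
echo "$status_text"

# Point the main repository at the new submodule commit.
g -C "$work/parent" add textexample
g -C "$work/parent" commit -q -m "update textexample"
# git push   (would publish the parent; it has no remote in this sketch)

recorded=$(git -C "$work/parent" rev-parse HEAD:textexample)
sub_head=$(git -C "$work/parent/textexample" rev-parse HEAD)
remote_head=$(git -C "$work/lib.git" rev-parse HEAD)
echo "recorded=$recorded submodule=$sub_head remote=$remote_head"
```

Pushing the submodule before the main repository matters: otherwise teammates would receive a pointer to a commit they cannot fetch.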

Submodules are a good way to keep projects in separate repositories while still being able to reference them as folders in the working directory of other repositories. However, keep in mind that for many projects submodules are not the best choice, and working with them is tricky.



