Recently, for the first time in my career, I had to merge 2 working repos, each with a long commit history, into one repo, and the challenge was to keep the history of both repos after merging.
Doing that without caring much about the history is super easy: simply copy/paste one of them into the other one. In that case you keep only the history of the target repo. But what if you would like to create a brand new repo that holds both of them with a new setup (which was my case)? Now the problem stands out more, as the copy/paste technique will make you lose both histories.
To explain the issue better, let’s assume that we have a repo Original_A that has a history of 4 years of changes and a repo Original_B that has about 3 years of different changes. We would like to have both of them in a new monorepo (as we decided in the end) because we found quite a few monorepo features that fit us best; I may talk about that in a different post.
The target in the end is to have both apps inside a new repo, in different folders. To be honest, the article mentioned in the resources was quite helpful in giving me a starting point, but it was misleading and imprecise about which changes to make and in which path. That’s why I decided to write this post in a clearer way.
Let’s assume that the final result should be:
NewRepo
|_ ProjectA // a folder that holds the content of Original_A repo
|_ ProjectB // a folder that holds the content of Original_B repo
Implementation
Step One: Cloning original repos
You should first have the 2 original repos that you want to merge in your projects folder. If you have already cloned them, you can skip this step.
Hint: do that outside the new repo folder, ideally at the same level as the new repo folder.
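As a rough sketch of this step: the workspace path and the upstream_* stand-in repos below are made up so the snippet can run offline. With real projects you would pass your actual remote URLs to git clone instead.

```shell
# Hypothetical workspace; in practice this is your projects folder.
WORKSPACE="$(mktemp -d)"
cd "$WORKSPACE"

# Stand-ins for the real remotes, so the sketch runs without network access.
for name in Original_A Original_B; do
  git init -q "upstream_$name"
  git -C "upstream_$name" -c user.name=demo -c user.email=demo@example.com \
      commit -q --allow-empty -m "initial commit of $name"
done

# The actual step: clone both repos side by side,
# at the level where NewRepo will be created later.
git clone -q upstream_Original_A Original_A
git clone -q upstream_Original_B Original_B
```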
Step Two: Rewriting the git history of both repos
⚠️ Update: after some trials/reads, you could use git subtree instead. Also, you can sometimes even skip this step in case you would like to keep the commit IDs as they were, for tracking or anything else.
Since the content of both repos will live in different folders inside the new repo, it is important to rewrite the paths of all files into the right folder before moving them. We are going to use the command git filter-branch, which has quite a lot of options but is very dangerous to use because it rewrites history (every rewritten commit gets a new ID), so try to avoid it if you can.
Go to the path of the first repo Original_A and run the following command:
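Here is a sketch of that command, following the example in the git filter-branch documentation with ProjectA as the target folder. The first lines only build a throwaway stand-in for Original_A so the snippet can run anywhere. The demo path, the file names and the FILTER_BRANCH_SQUELCH_WARNING line are my additions; the command itself is the last five lines.

```shell
# Throwaway stand-in for Original_A (hypothetical files, for the demo only).
DEMO="$(mktemp -d)"
git init -q "$DEMO/Original_A"
cd "$DEMO/Original_A"
git config user.name "Demo User"
git config user.email "demo@example.com"
mkdir src
echo "hello" > src/app.txt
echo "readme" > README.md
git add .
git commit -q -m "first commit in Original_A"

# Newer git versions print a warning and pause before filter-branch starts;
# this variable skips that pause.
export FILTER_BRANCH_SQUELCH_WARNING=1

# Rewrite every commit reachable from HEAD so all paths start with ProjectA/.
git filter-branch --index-filter \
  'git ls-files -s | sed "s-\t\"*-&ProjectA/-" |
   GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
   git update-index --index-info &&
   mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"' HEAD
```

Running git ls-tree -r --name-only HEAD afterwards is a quick way to check that every path now sits under ProjectA/.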
Don’t worry, I’ll explain it line by line.
- line 1: it is the main command, as I mentioned earlier, and it takes one of its options, which is --index-filter. You can read more about it from here 🔗, but concisely it is used to rewrite the index (the list of tracked files) of each commit.
- line 2: we list all the files using git ls-files -s and then rename them to start with ProjectA/ as the root folder instead of the root of the original repo (that is only the final setup folder name), read more about the sed command from here 🔗.
- line 3: we point the index file to a new one, $GIT_INDEX_FILE.new, so the renamed list from line 2 goes into a fresh index instead of the current one.
- line 4: is for writing the renamed entries into that new index, read more 🔗.
- line 5: finally, we move the index.new that we created over the original index, so the rewritten paths take effect.
Note: if you are using macOS, the sed that ships with it does not understand \t, so replace \t in line 2 with a literal tab (press control+v, then TAB). It should become as follows:
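A literal tab is invisible on the page, so this sketch produces it with printf and splices it into the sed expression, which should have the same effect as typing control+v, TAB. The TAB variable is my own addition, and the first lines again only build a throwaway stand-in repo with made-up files.

```shell
# Throwaway stand-in for Original_A (hypothetical files, for the demo only).
DEMO="$(mktemp -d)"
git init -q "$DEMO/Original_A"
cd "$DEMO/Original_A"
git config user.name "Demo User"
git config user.email "demo@example.com"
mkdir src
echo "hello" > src/app.txt
echo "readme" > README.md
git add .
git commit -q -m "first commit in Original_A"

# Skip the warning pause that newer git versions add to filter-branch.
export FILTER_BRANCH_SQUELCH_WARNING=1

# A real tab character, stored in a variable instead of typed by hand.
TAB="$(printf '\t')"

# Same command as before, with the literal tab in place of \t.
git filter-branch --index-filter \
  'git ls-files -s | sed "s-'"$TAB"'\"*-&ProjectA/-" |
   GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
   git update-index --index-info &&
   mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"' HEAD
```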
Also, one of the articles in the resources mentions an alternative command for the case where the repo you are merging has a single root folder, which rarely happens IMO.
And of course, in both cases you need to repeat that for each repo (with ProjectB/ as the folder for Original_B).
Step Three: Create a new Repo
All the previous steps were done in the original repos. Now we are going to create a brand new directory, at the same level as the other repos, that will eventually become the monorepo. First create a new directory (folder), which we will call NewRepo, and open the terminal in its path.
In the end, the NewRepo folder should hold both repos merged in 2 folders, ProjectA and ProjectB. For now, initialize the NewRepo folder as a git repo simply by running git init in the folder path.
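A minimal sketch of this step, using a temporary folder as a stand-in for your projects folder (the WORKSPACE path is an assumption):

```shell
# Hypothetical location; in practice, the folder that already holds
# Original_A and Original_B.
WORKSPACE="$(mktemp -d)"

# Create the empty NewRepo folder and turn it into a git repo.
mkdir "$WORKSPACE/NewRepo"
cd "$WORKSPACE/NewRepo"
git init
```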
Step Four: Add the 2 repos as remote repos for the “NewRepo” and fetch them
Set the two repos as remotes of the new repo and fetch their content into it as follows:
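A minimal sketch of this step, assuming the original repos sit next to NewRepo. The remote names origA and origB are placeholders I picked, and the first lines only create stand-in repos so the snippet can run on its own.

```shell
WORKSPACE="$(mktemp -d)"
cd "$WORKSPACE"

# Stand-ins for the two original repos, one commit each.
for name in Original_A Original_B; do
  git init -q "$name"
  git -C "$name" -c user.name=demo -c user.email=demo@example.com \
      commit -q --allow-empty -m "history of $name"
done

git init -q NewRepo
cd NewRepo

# The actual step: register each original repo as a remote, then fetch both.
git remote add origA ../Original_A
git remote add origB ../Original_B
git fetch -q --all

# List what was fetched; the working folder itself stays empty.
git branch -r
ls
```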
At this point nothing visible has happened: fetching only downloads the commits into the .git folder, so you will not find any change in your folder and it will remain empty.
Step Five: Merge the 2 remote repos
Now we are in the final step: merging the history of both repos. The good thing to do here is to merge them one by one, to make sure that the root files of the two repos do not conflict.
It is good to mention that merging disparate branches (repos) is now disabled by default in git, but it can be enabled with the --allow-unrelated-histories flag.
First run the following command:
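A sketch of the first merge, under a few assumptions: the remote is called origA (a placeholder name), the branch to merge is master, and the stand-in repo built in the first lines replaces your real Original_A.

```shell
WORKSPACE="$(mktemp -d)"
cd "$WORKSPACE"

# Stand-in for Original_A with a single commit on master.
git init -q Original_A
git -C Original_A symbolic-ref HEAD refs/heads/master
echo "content of Original_A" > Original_A/app.txt
git -C Original_A add .
git -C Original_A -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "history of Original_A"

# NewRepo with the remote already added and fetched (Steps Three and Four).
git init -q NewRepo
cd NewRepo
git remote add origA ../Original_A
git fetch -q origA

# The actual step: merge the fetched branch into the still empty NewRepo.
git merge origA/master --allow-unrelated-histories

# The original commits are now part of NewRepo's log.
git log --oneline
```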
You will find all the content of the Original_A repo in the root of NewRepo, including the history, so if you run git log you should find the whole log of the original repo.
If the files are still at the root (for example because you skipped Step Two), you can now create a folder ProjectA in the root, move all the merged files into it and commit that move, again to prevent any conflicts after merging the other repo. Then you run the same command for the other repo:
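Put together, the tidy-up and the second merge could look like this sketch. The remote names origA and origB, the commit messages and the stand-in repos are all assumptions for the demo. I also added --no-edit so git keeps the default merge message without opening an editor.

```shell
WORKSPACE="$(mktemp -d)"
cd "$WORKSPACE"

# Stand-ins for the two original repos, one file and one commit each.
for name in Original_A Original_B; do
  git init -q "$name"
  git -C "$name" symbolic-ref HEAD refs/heads/master
  echo "content of $name" > "$name/$name.txt"
  git -C "$name" add .
  git -C "$name" -c user.name=demo -c user.email=demo@example.com \
      commit -q -m "history of $name"
done

# NewRepo with both remotes fetched and Original_A already merged.
git init -q NewRepo
cd NewRepo
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origA ../Original_A
git remote add origB ../Original_B
git fetch -q --all
git merge -q origA/master --allow-unrelated-histories

# Move every top-level entry that came from Original_A into ProjectA/.
mkdir ProjectA
git ls-tree --name-only HEAD | while IFS= read -r entry; do
  git mv "$entry" ProjectA/
done
git commit -q -m "Move Original_A into ProjectA"

# The same merge command, now for the other repo.
git merge -q origB/master --allow-unrelated-histories --no-edit

# Both histories show up in one log.
git log --oneline
```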
Then you do the same thing again: create a folder ProjectB and move all the files merged from Original_B into it. Now, by running git log, you will also find the log history of Original_B alongside the Original_A history, all together in the same place. The good thing is that there will not be any conflicts, since they are originally 2 different codebases.
Hint: you can replace the branch master in the command with whatever branch you would like to merge from.
In the end, I hope this article was helpful for you. If you have any comments or need some help with a similar case, you can easily reach out to me on twitter @med7atdawoud 🔗
Resources
- git filter-branch 🔗
- Merge git repositories into a new repository 🔗
- Git ‘--allow-unrelated-histories’ 🔗
Tot ziens 👋