Migrating a repo from SVN to Git


Git
For more information on Git, visit git-scm.com

Description:


SVN is one of the major source code management (SCM) tools and is still widely used. It was created as a successor to CVS. This tutorial covers the process of converting an SVN repository to Git, a more modern, more powerful SCM.


   COMPATIBILITY NOTICE:
These instructions are only compatible with CentOS 7 and RHEL 7


Pre-Requisites:


The following need to be installed on the server or container that will be used to perform the migration. This migration example uses CentOS 6 as the base OS.

1.    Directory Structure:
Create a directory to be used for the migration. In this example, a directory named svn2git will be used.

mkdir /svn2git; cd /svn2git


2.    Java:
Install Java 1.7 from the yum repository. It is required to run Atlassian Bitbucket's SVN migration tool.

yum install java-1.7.0-openjdk.x86_64 java-1.7.0-openjdk-devel.x86_64


3.    Atlassian Migration Utility:
Download the Atlassian Bitbucket migration tool (svn-migration-scripts.jar):

cd /svn2git
wget https://bitbucket.org/atlassian/svn-migration-scripts/downloads/svn-migration-scripts.jar


4.    Git:
The default version of git in the CentOS 6 repository (version 1.7) does not meet the minimum version required by the Atlassian migration utility. To get a newer version, we will add the PUIAS 6 repository and install git from it. We will also install the git-svn tool.

wget -O /etc/yum.repos.d/PUIAS_6_computational.repo https://gitlab.com/gitlab-org/gitlab-recipes/raw/master/install/centos/PUIAS_6_computational.repo
wget -O /etc/pki/rpm-gpg/RPM-GPG-KEY-puias http://springdale.math.ias.edu/data/puias/6/x86_64/os/RPM-GPG-KEY-puias
rpm --import /etc/pki/rpm-gpg/RPM-GPG-KEY-puias
yum clean all; yum install git git-svn
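
To confirm that the installed git really is newer than the stock 1.7 package, the version strings can be compared in the shell. The following is a minimal sketch and not part of the Atlassian tooling: the helper name version_ge is made up for this example, the threshold 1.7.7.5 is an assumed value (check the migration tool's documentation for the real minimum), and the comparison relies on the -V option of GNU sort.

```shell
# Hypothetical helper: succeeds when version $1 is greater than or equal to version $2.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example using a fixed version string; 1.7.7.5 is an assumed threshold.
if version_ge "1.8.3.1" "1.7.7.5"; then
    echo "git version is new enough"
else
    echo "git version is too old"
fi
```

Against a real installation, the first argument could be taken from git itself, for example version_ge "$(git --version | awk '{print $3}')" "1.7.7.5".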


5.    SVN:
As with git, the default version of SVN in the CentOS 6 repository doesn't make the cut. To get a version of SVN that meets the minimum requirements, we will add the WANdisco repository and install SVN 1.8 from it.


Copy the following code block and paste it into the terminal in order to install the repository.

cat >> /etc/yum.repos.d/wandiscosvn.repo << "EOF"
[WandiscoSVN]
name=Wandisco SVN Repo
baseurl=http://opensource.wandisco.com/centos/$releasever/svn-1.8/RPMS/$basearch/
enabled=1
gpgcheck=0
EOF
yum clean all; yum install svn
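
The quotes around "EOF" in the block above matter: they stop the shell from expanding $releasever and $basearch, which must reach the repo file as literal text so that yum can substitute them itself. A small sketch of that behaviour follows. It writes to a temporary file rather than /etc/yum.repos.d, and the host repo.example.invalid is a placeholder, not a real repository.

```shell
# Write a sample line to a temporary file, not to /etc/yum.repos.d.
tmp_repo=$(mktemp)

# Quoted delimiter: the shell leaves $releasever and $basearch untouched.
cat > "$tmp_repo" << "EOF"
baseurl=http://repo.example.invalid/centos/$releasever/$basearch/
EOF

# Read the file back and clean up; the variables are still literal text.
written=$(cat "$tmp_repo")
rm -f "$tmp_repo"
echo "$written"
```

The sketch uses > so that running it again overwrites the file. With >>, running the same block twice appends a second copy of the stanza.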


Migration tool verification:


To ensure that the required utilities meet the minimum versions, run the Atlassian migration tool with the verify option.

cd /svn2git
java -jar svn-migration-scripts.jar verify


Output should look similar to the following:

[root@svn2git GitMigration]# java -jar svn-migration-scripts.jar verify
svn-migration-scripts: using version 0.1.56bbc7f
Git: using version 1.8.3.1
Subversion: using version 1.8.16
git-svn: using version 1.8.3.1


Create authors.txt file:


The most tedious part of the migration is compiling a list of every user who has ever checked in an SVN commit. In the /svn2git directory, create a file named authors.txt and compile your list of committers as follows:

jsnow = John Snow <jsnow@yourcompany.com>
dtarg = Daenerys Targaryen <dtarg@yourcompany.com>
kdrogo = Khal Drogo <kdrogo@yourcompany.com>
astark = Arya Stark <astark@yourcompany.com>
omartell = Oberyn Martell <omartell@yourcompany.com>
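
Rather than typing the whole list by hand, a starting point can be generated from the SVN log. The sketch below is a rough helper and not part of the original procedure: the function name svn_log_to_authors is made up, the sample log is invented for the demonstration, and the yourcompany.com address pattern is a placeholder. The real names still have to be filled in manually afterwards. It reads the output of svn log --quiet on standard input.

```shell
# Hypothetical helper: turn "svn log --quiet" output (stdin) into authors.txt skeleton lines.
svn_log_to_authors() {
    awk -F'|' '/^r[0-9]+ / {
        user = $2
        gsub(/^ +| +$/, "", user)   # trim the spaces around the username
        print user " = " user " <" user "@yourcompany.com>"
    }' | sort -u
}

# Invented sample in the format printed by "svn log --quiet".
sample_log='------------------------------------------------------------------------
r3 | jsnow | 2016-03-02 11:30:00 +0000 (Wed, 02 Mar 2016)
------------------------------------------------------------------------
r2 | dtarg | 2016-03-02 09:00:00 +0000 (Wed, 02 Mar 2016)
------------------------------------------------------------------------
r1 | jsnow | 2016-03-01 10:15:42 +0000 (Tue, 01 Mar 2016)
------------------------------------------------------------------------'

# Each committer appears once, however many commits they made.
printf '%s\n' "$sample_log" | svn_log_to_authors
```

Against a live server the input would come from the repository instead of the sample, for example svn log --quiet http://svn.yourcompany.com/your_repo | svn_log_to_authors > authors.txt.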


   NOTICE:
If you have forgotten to include someone who has checked in a commit for the project, the migration will stop when it hits the first commit by that user. If you add the user to the list and restart the migration, it will pick back up where it stopped and continue churning.
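
Because a single missing entry halts the clone, it can be worth checking authors.txt for gaps before starting. The following is a rough sketch with a made-up function name (missing_authors) and invented demo data. It expects one SVN username per line on standard input and the path to the authors file as its argument, and prints every username that has no matching entry.

```shell
# Hypothetical helper: print usernames from stdin that have no "user = ..." line in the file $1.
missing_authors() {
    authors_file=$1
    while IFS= read -r user; do
        [ -n "$user" ] || continue
        # An entry counts as present when some line starts with "<user> = ".
        if ! awk -v u="$user" 'index($0, u " = ") == 1 { found = 1 } END { exit !found }' "$authors_file"; then
            echo "$user"
        fi
    done
}

# Demonstration with a temporary authors file that only knows jsnow.
demo_authors=$(mktemp)
printf '%s\n' 'jsnow = John Snow <jsnow@yourcompany.com>' > "$demo_authors"
missing=$(printf '%s\n' jsnow astark | missing_authors "$demo_authors")
rm -f "$demo_authors"
echo "missing: $missing"
```

In practice the usernames on standard input would be extracted from svn log --quiet for the repository being migrated, and the argument would be /svn2git/authors.txt.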


Run the SVN migration:


Once the authors.txt file is complete, we are ready to start the migration. This step churns through the SVN repository, converting it to git format.

cd /svn2git
git svn clone --stdlayout --authors-file=authors.txt http://svn.yourcompany.com/your_repo Git_Repo_Name --username svnuser


   NOTICE:
This step could take several hours, depending on both the size of the repository and the number of commits. A repository of 150MB with roughly 700 commits, for example, took around an hour to complete, while a repository of 400MB with 30000 commits could take more than 12 hours.
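
For a run that lasts this long, it helps to keep a log with start and finish times and the exit status, so that a stopped clone is easy to spot. The sketch below is one possible approach and not part of the original procedure: run_logged is a made-up helper, and the demonstration runs a harmless echo instead of the real clone.

```shell
# Hypothetical helper: run a command, appending its output, timing and exit status to the log file $1.
run_logged() {
    log_file=$1
    shift
    echo "started: $(date)" >> "$log_file"
    "$@" >> "$log_file" 2>&1
    status=$?
    echo "finished: $(date) (exit status $status)" >> "$log_file"
    return "$status"
}

# Harmless demonstration; in practice the arguments would be the git svn clone command.
demo_log=$(mktemp)
run_logged "$demo_log" echo "clone would run here"
cat "$demo_log"
```

For the real migration, the echo would be replaced by the git svn clone command shown above, and the script would typically be started inside screen or tmux, or with nohup, so that it survives a dropped SSH session.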


Clean the new Git repository:


Once the migration has completed, use the Atlassian migration tool to clean up any anomalies in the new git repository.


   NOTICE:
The following command only performs a dry run to show what it would change.


cd /svn2git/Git_Repo_Name
java -Dfile.encoding=utf-8 -jar ../svn-migration-scripts.jar clean-git


Once the dry run has completed and you are satisfied with the results, then it's time to run the cleanup for real.

cd /svn2git/Git_Repo_Name
java -Dfile.encoding=utf-8 -jar ../svn-migration-scripts.jar clean-git --force


Push the repo to Git:


The final step of the migration is to push the new repository to the Git server. On the git server, create a new repository and copy its git URL. Then, back on the migration server, initialize the repo, add the new repository as a remote, and push to it.

cd /svn2git/Git_Repo_Name
git init
git remote add origin http://gitlab.yourcompany.com/namespace/git_repo_name.git
git config --global user.email "kdrogo@yourcompany.com"
git config --global user.name "Khal Drogo"
git add --all
git commit -m "Conversion from SVN to Git"
git push origin master
git push --all
git push --tags


Post Requisites:


Go and crack yourself a beer.. you deserve one!



References:


Atlassian SVN Migration Tutorial