The other day, I read the following article. If you don’t want to read it, you really should. It’s awesome. Essentially, a student managed to get access to all of India’s standardized test scores and who they belonged to. The author posted his code and results on Github (thanks for sharing!). However, the data itself was not mirrored and is no longer available. Grumpy. I wanted to pull the data off Github.
This got me thinking about Git, how to use Github with it, and my own personal Git habits. Git is designed as a distributed version control system. This means that there should be no single ‘master’ copy of your source code anywhere; multiple copies should be available from different nodes across the network, be that the local network or the Internet.
Imagine this scenario. You are working on a code project at home and dutifully using Git for version control (as you should!). You commit your changes locally and push them to Github. You get off the computer, drink beer, socialize, work on your plan for world domination, sleep, and head to work the next day. Your co-worker asks to see what you were working on (of course you told her about how awesome your side project was). You go to Github to pull the code and *gasp!* Github is down! Not only that, but your cat ate the laptop at home with all the code! Suddenly, all your code is gone forever. (Maybe this example is a bit hyperbolic, but stay with me.)
When most people use Git, the only remote they use is Github. In the previous scenario, Github was treated as a 100% safe backup solution. By pushing only to Github, what you’ve done in this case is create a single point of failure.
When you only use Github as the remote you push to, you are negating a lot of benefits of distributed version control. Your code should be spread out among multiple nodes across the network. When you only use one remote, you are using two nodes: your local machine and the remote. This may work fine most of the time, but what if you forget your laptop and the remote is down? You should have multiple remotes to pull from in cases like this.
Note that when I say ‘remotes’, I essentially just mean nodes on the network. I do a lot of development in my free time and use Git to keep track of my personal projects. In a company environment though, you could also push code to your co-workers, team leads, build servers, and a backup. In case of a build server failure, you can then pull changes from any of those machines.
The solution is to simply have more than one remote! This is not hard to set up or configure. When you have multiple remotes, you are never at the mercy of a single remote server. Not only could a single remote go down, but the admins could also decide to be super evil and just delete your code. Bad news bears.
With multiple remotes, if one remote goes down, no worries, just pull and push to the others instead.
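To make the "just use the others" idea concrete, here is a rough sketch of a fallback fetch. The `fetch_from_any` helper is hypothetical (it is not a Git command), and the remote names `primary` and `backup` plus the temporary directory are assumptions for the demo: a missing local path stands in for a host that is down, and a local bare repository stands in for a second hosting service.

```shell
set -e

# Hypothetical helper: try each configured remote in turn and
# print the name of the first one that answers a fetch.
fetch_from_any() {
    for remote in $(git remote); do
        if git fetch --quiet "$remote" 2>/dev/null; then
            echo "$remote"
            return 0
        fi
    done
    return 1
}

# Demo setup in a throwaway directory (all paths are assumptions).
work=$(mktemp -d)
git init --quiet --bare "$work/backup.git"
git init --quiet "$work/project"
cd "$work/project"
git -c user.name=demo -c user.email=demo@example.invalid \
    commit --quiet --allow-empty -m "first commit"
branch=$(git symbolic-ref --short HEAD)   # master or main, depending on your Git config

git remote add primary "$work/missing.git"   # does not exist, simulates an outage
git remote add backup "$work/backup.git"
git push --quiet backup "$branch"

used=$(fetch_from_any)
echo "fetched from: $used"
```

In a real project you would drop the setup and just call the helper inside your existing repo. It only fetches, so nothing is merged until you decide which remote's branch to use.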
Call to Action
There are plenty of free options for Git hosting. Since Git is distributed, using Github doesn’t mean you can’t also use BitBucket! I myself use both. If you have a server at home or pay for a VPS, add that as a remote too! Below I have provided an example of adding two new remotes (one each for BitBucket and Github) for a project I’m working on in my free time.
Add a new remote
$ # git remote add alt alt-machine:/path/to/repo
$ git remote add github firstname.lastname@example.org:stkerr/MemoryPIN.git
$ git remote add bitbucket email@example.com:samoz/memorypin.git
Push to the new remote with
$ # git push <remotename> <branch name>
$ git push github master
$ git push bitbucket master
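If you want to try the add-and-push steps without touching a real hosting account, something like the following sketch should work. It is only an illustration: the two local bare repositories stand in for Github and BitBucket, and the temporary directory, the demo commit identity, and the empty commit are all assumptions made for the example.

```shell
set -e

# Throwaway sandbox: two bare repos play the part of the hosting services.
work=$(mktemp -d)
git init --quiet --bare "$work/github.git"
git init --quiet --bare "$work/bitbucket.git"

# A project with a single (empty) commit to push.
git init --quiet "$work/project"
cd "$work/project"
git -c user.name=demo -c user.email=demo@example.invalid \
    commit --quiet --allow-empty -m "first commit"
branch=$(git symbolic-ref --short HEAD)   # master or main, depending on your Git config

# Add both remotes and list them to check the URLs.
git remote add github "$work/github.git"
git remote add bitbucket "$work/bitbucket.git"
git remote -v

# Push the same branch to each remote.
git push --quiet github "$branch"
git push --quiet bitbucket "$branch"

# Both remotes should now report the same commit hash for the branch.
git ls-remote github "refs/heads/$branch"
git ls-remote bitbucket "refs/heads/$branch"
```

Comparing the two `git ls-remote` lines is a quick way to confirm that every remote really has your latest commit.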
Push to multiple remotes at once
This part is somewhat confusing, so maybe check the StackOverflow link for more info. Essentially, you are editing your .git/config file to create a new remote that has multiple URLs (that you added previously) in it. Edit your .git/config file and add:
[remote "AllTheThings"]
    url = firstname.lastname@example.org:stkerr/MemoryPIN.git
    url = email@example.com:samoz/memorypin.git
Then to push to all of those different remotes, do
$ git push AllTheThings master
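If hand-editing .git/config makes you nervous, the same multi-URL remote can, as far as I can tell, be built from the command line with `git remote set-url --add`. The sketch below is an illustration under assumptions, not the one true way: the local bare repositories, the temporary directory, and the demo commit identity are stand-ins so it can run anywhere.

```shell
set -e

# Throwaway sandbox: two bare repos stand in for the two hosting services.
work=$(mktemp -d)
git init --quiet --bare "$work/github.git"
git init --quiet --bare "$work/bitbucket.git"
git init --quiet "$work/project"
cd "$work/project"
git -c user.name=demo -c user.email=demo@example.invalid \
    commit --quiet --allow-empty -m "first commit"
branch=$(git symbolic-ref --short HEAD)   # master or main, depending on your Git config

# Create the combined remote with the first URL, then append the second.
git remote add AllTheThings "$work/github.git"
git remote set-url --add AllTheThings "$work/bitbucket.git"

# Lists every url entry stored under the remote in .git/config.
git config --get-all remote.AllTheThings.url

# A single push goes to each URL listed under the remote.
git push --quiet AllTheThings "$branch"

# Check that both stand-in servers received the commit.
git --git-dir="$work/github.git" rev-parse "$branch"
git --git-dir="$work/bitbucket.git" rev-parse "$branch"
```

One thing to keep in mind with this setup: pushes go to every URL, but fetching from the combined remote only uses the first one, so it is worth keeping the individual remotes around for pulling.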
If you’re only using Github or one remote server to push your Git code to, you are doing it wrong. Git is designed to have multiple remote servers, so in case of a failure, you simply use a different remote. I have pointed out some options you can use for new Git remotes, such as Github, Bitbucket, or using a personal server. I also showed you how to add these new remotes to your existing repo and to push to them.
Leave a comment if you have any suggestions! I’m not a Git expert by any means, so I’m sure there are other ways of doing this!