Backing up your GitHub content.

Written by  on December 16, 2013

I never like trusting others to care for my things as well as I do. GitHub, Bitbucket, and Google Docs are big piles of trust. So here is a quickie that clones (or, on later runs, pulls) every repository belonging to a list of users.


#!/usr/bin/python
# Quick and dirty backup of your git files.
from pygithub3 import Github
import git
import os
import subprocess
os.chdir('/home/git/repositories')
gh = Github(login='git')
for user in ['JoeDumoulin','soycamo','suspect-devices','feurig'] :
    for repo in gh.repos.list(user).all():
        # Make sure the per-user directory exists before cloning into it.
        owner_dir = repo.full_name.split('/')[0]
        if not os.path.isdir(owner_dir):
            os.mkdir(owner_dir)

        print repo.git_url, '->', repo.full_name
        try:
            git.Repo.clone_from(repo.git_url, repo.full_name)
            # A fresh clone carries git's placeholder description; replace it
            # with the one from github so gitweb has something to show.
            description_file = open(repo.full_name + "/.git/description", "r+")
            if "Unnamed repository" in description_file.read() and repo.description:
                description_file.seek(0)
                print repo.description
                description_file.write(repo.description.encode('utf-8'))
                description_file.truncate()
            description_file.close()
        except Exception as e:
            # The clone fails when the directory is left over from an earlier
            # run; in that case pull instead.
            if 'already exists and is not an empty' in str(e):
                try:
                    git.cmd.Git(repo.full_name).pull()
                    print repo.full_name, "should be up to date"
                except Exception as e2:
                    print str(e2)
            else:
                print str(e)

subprocess.call(["chown", "-R","git:git","."]) 

Notes:

The script uses two Python tools: pygithub3 (http://pygithub3.readthedocs.org/en/latest/index.html), which gives us access to GitHub's API, and GitPython (http://pythonhosted.org/GitPython/0.3.2/index.html), which we use to work with the git repositories themselves.
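
The script tells a first run from a later one by reading git's error message. A rough sketch of another way to make that choice is to check for an existing .git directory before calling GitPython at all. The helper names backup_action and backup_repo are made up for this illustration; the two GitPython calls are the same ones the script already uses.

```python
import os


def backup_action(dest):
    """Return 'pull' if dest already holds a git checkout, else 'clone'."""
    if os.path.isdir(os.path.join(dest, ".git")):
        return "pull"
    return "clone"


def backup_repo(url, dest):
    """Clone url into dest, or pull if dest was cloned on an earlier run."""
    # Imported here so backup_action can be used without GitPython installed.
    import git

    if backup_action(dest) == "pull":
        git.cmd.Git(dest).pull()
    else:
        git.Repo.clone_from(url, dest)
```

In the loop this would be called as backup_repo(repo.git_url, repo.full_name). The tradeoff is that a directory that exists but is not a checkout now fails in the clone step instead of being quietly pulled.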

We have been looking at gitweb, which displays the description file from each repository's .git directory. Since many of these repos were created on GitHub, that file still holds only git's "Unnamed repository" placeholder, even though GitHub provides the description in its JSON, so we populate it from there.
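
The description step can also be pulled out into a small function of its own. This is only a sketch: set_description and PLACEHOLDER are names invented here, and it assumes the description arrives as text (a str on Python 3) and that the file is UTF-8.

```python
import io
import os

# Start of the text git writes into .git/description for a new repository.
PLACEHOLDER = u"Unnamed repository"


def set_description(repo_path, description):
    """Fill in .git/description if it still holds git's placeholder.

    Returns True when the file was rewritten, False otherwise.
    """
    if not description:
        # GitHub repos without a description give us nothing to write.
        return False
    path = os.path.join(repo_path, ".git", "description")
    with io.open(path, "r", encoding="utf-8") as f:
        current = f.read()
    if PLACEHOLDER not in current:
        # Someone already set a real description; leave it alone.
        return False
    with io.open(path, "w", encoding="utf-8") as f:
        f.write(description.strip() + u"\n")
    return True
```

Called as set_description(repo.full_name, repo.description) after a clone, it leaves hand-edited descriptions untouched and skips repos where GitHub has none.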