How to Use the GitHub API in Python

Using the GitHub API v3 in Python to search for repositories and users, make a commit, delete a file, and more, with the requests and PyGithub libraries.
Abdou Rockikz · 6 min read · Updated May 2020 · Using APIs


GitHub is a Git repository hosting service that adds many features of its own, such as a web-based graphical interface for managing repositories, access control, wikis, organizations, gists and more.

As you may already know, there is a ton of data to be grabbed. In this tutorial, you will learn how to use the GitHub API v3 in Python with both the requests and PyGithub libraries.

To get started, let's install the dependencies:

pip3 install PyGithub requests

The GitHub API v3 is pretty straightforward to use: you can make a simple GET request to a specific URL and retrieve the results:

import requests
from pprint import pprint

# github username
username = "x4nth055"
# url to request
url = f"https://api.github.com/users/{username}"
# make the request and return the json
user_data = requests.get(url).json()
# pretty print JSON data
pprint(user_data)

Here I used my account. Below is part of the returned JSON (you can see it in the browser as well):

{'avatar_url': 'https://avatars3.githubusercontent.com/u/37851086?v=4',
 'bio': None,
 'blog': 'https://www.thepythoncode.com',
 'company': None,
 'created_at': '2018-03-27T21:49:04Z',
 'email': None,
 'events_url': 'https://api.github.com/users/x4nth055/events{/privacy}',
 'followers': 93,
 'followers_url': 'https://api.github.com/users/x4nth055/followers',
 'following': 41,
 'following_url': 'https://api.github.com/users/x4nth055/following{/other_user}',
 'gists_url': 'https://api.github.com/users/x4nth055/gists{/gist_id}',
 'gravatar_id': '',
 'hireable': True,
 'html_url': 'https://github.com/x4nth055',
 'id': 37851086,
 'login': 'x4nth055',
 'name': 'Rockikz',
<..SNIPPED..>
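
Most of the time you only need a handful of these keys. Here is a minimal sketch that trims the response down to a few fields. The helper name summarize_user and the choice of fields are mine, purely for illustration, and the sample dictionary is copied from the output above so the snippet runs without a network call:

```python
# Keys picked for illustration; all of them appear in the response above.
SUMMARY_FIELDS = ("login", "name", "blog", "followers", "following", "created_at")


def summarize_user(user_data, fields=SUMMARY_FIELDS):
    """Return only the selected keys of a /users/{username} response."""
    # dict.get() yields None instead of raising if a key is missing
    return {field: user_data.get(field) for field in fields}


# Part of the JSON shown above; with a live request you would pass
# requests.get(url).json() instead.
sample = {
    "login": "x4nth055",
    "name": "Rockikz",
    "blog": "https://www.thepythoncode.com",
    "followers": 93,
    "following": 41,
    "created_at": "2018-03-27T21:49:04Z",
    "hireable": True,
}

summary = summarize_user(sample)
print(summary["login"], "has", summary["followers"], "followers")
```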

That's a lot of data, and extracting all of it manually with the requests library alone isn't handy. This is where PyGithub comes to the rescue. Let's get all the public repositories of that user using PyGithub:

import base64
from github import Github
from pprint import pprint

# Github username
username = "x4nth055"
# pygithub object
g = Github()
# get that user by username
user = g.get_user(username)

for repo in user.get_repos():
    print(repo)

Here is my output:

Repository(full_name="x4nth055/aind2-rnn")
Repository(full_name="x4nth055/awesome-algeria")
Repository(full_name="x4nth055/emotion-recognition-using-speech")
Repository(full_name="x4nth055/emotion-recognition-using-text")
Repository(full_name="x4nth055/food-reviews-sentiment-analysis")
Repository(full_name="x4nth055/hrk")
Repository(full_name="x4nth055/lp_simplex")
Repository(full_name="x4nth055/price-prediction")
Repository(full_name="x4nth055/product_recommendation")
Repository(full_name="x4nth055/pythoncode-tutorials")
Repository(full_name="x4nth055/sentiment_analysis_naive_bayes")

Alright, so I made a simple function to extract some useful information from this Repository object:

def print_repo(repo):
    # repository full name
    print("Full name:", repo.full_name)
    # repository description
    print("Description:", repo.description)
    # the date of when the repo was created
    print("Date created:", repo.created_at)
    # the date of the last git push
    print("Date of last push:", repo.pushed_at)
    # home website (if available)
    print("Home Page:", repo.homepage)
    # programming language
    print("Language:", repo.language)
    # number of forks
    print("Number of forks:", repo.forks)
    # number of stars
    print("Number of stars:", repo.stargazers_count)
    print("-"*50)
    # repository content (files & directories)
    print("Contents:")
    for content in repo.get_contents(""):
        print(content)
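    try:
        # repo license (the API returns its text base64-encoded)
        print("License:", base64.b64decode(repo.get_license().content.encode()).decode())
    except Exception:
        # no license found, or the request failed; skip it
        pass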
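
Printing is fine for a quick look, but if you want to sort or filter the repositories, it helps to collect the fields first. Here is a minimal sketch that relies only on attributes already used in print_repo. The helper names repo_summary and top_by_stars are hypothetical, and the demo uses stand-in objects with made-up numbers instead of real Repository objects so it runs offline:

```python
from types import SimpleNamespace


def repo_summary(repo):
    """Collect a few Repository attributes into a plain dictionary."""
    return {
        "full_name": repo.full_name,
        "language": repo.language,
        "forks": repo.forks,
        "stars": repo.stargazers_count,
    }


def top_by_stars(repos, limit=5):
    """Return summaries of the `limit` most-starred repositories."""
    summaries = [repo_summary(repo) for repo in repos]
    summaries.sort(key=lambda item: item["stars"], reverse=True)
    return summaries[:limit]


# Stand-ins with made-up numbers; pass user.get_repos() for real data.
fake_repos = [
    SimpleNamespace(full_name="demo/a", language="Python", forks=1, stargazers_count=3),
    SimpleNamespace(full_name="demo/b", language="Python", forks=7, stargazers_count=12),
    SimpleNamespace(full_name="demo/c", language=None, forks=0, stargazers_count=0),
]

for item in top_by_stars(fake_repos, limit=2):
    print(item["full_name"], item["stars"])
```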

Let's iterate over repositories again:

# iterate over all public repositories
for repo in user.get_repos():
    print_repo(repo)
    print("="*100)

This will print some information about each public repository of this user:

====================================================================================================
Full name: x4nth055/pythoncode-tutorials
Description: The Python Code Tutorials
Date created: 2019-07-29 12:35:40
Date of last push: 2020-04-02 15:12:38
Home Page: https://www.thepythoncode.com
Language: Python
Number of forks: 154
Number of stars: 150
--------------------------------------------------
Contents:
ContentFile(path="LICENSE")
ContentFile(path="README.md")
ContentFile(path="ethical-hacking")
ContentFile(path="general")
ContentFile(path="images")
ContentFile(path="machine-learning")
ContentFile(path="python-standard-library")
ContentFile(path="scapy")
ContentFile(path="web-scraping")
License: MIT License
<..SNIPPED..>

I've truncated the output, as it prints every repository and its information. Notice that we used the repo.get_contents("") method to retrieve the files and folders at the top level of the repository; PyGithub parses each one into a ContentFile object (use dir(content) to see its other useful fields). There is a lot of other data to extract from the Repository object as well; use dir(repo) to see the other fields.
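
Since repo.get_contents("") only lists the top level, listing every file means descending into each directory. Here is a sketch of that walk, relying on the type and path attributes of ContentFile. The function name list_files is my own, and the FakeRepo class (with a made-up tree) is just a stand-in so the example runs without network access:

```python
from types import SimpleNamespace


def list_files(repo, path=""):
    """Return the paths of all files under `path`, descending into directories."""
    files = []
    pending = list(repo.get_contents(path))
    while pending:
        entry = pending.pop(0)
        if entry.type == "dir":
            # queue the directory's entries to visit later
            pending.extend(repo.get_contents(entry.path))
        else:
            files.append(entry.path)
    return files


class FakeRepo:
    """Stand-in for a PyGithub Repository, backed by a made-up in-memory tree."""

    tree = {
        "": [("README.md", "file"), ("src", "dir")],
        "src": [("src/main.py", "file")],
    }

    def get_contents(self, path):
        return [SimpleNamespace(path=p, type=t) for p, t in self.tree[path]]


print(list_files(FakeRepo()))
```

Keep in mind that with a real repository each directory costs one API request, so a large tree can use up an unauthenticated rate limit quickly.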

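Also, if you have private repositories, you can access them by authenticating your account with PyGithub. GitHub has since stopped accepting account passwords for API requests, so pass a personal access token instead:

# personal access token, created in your GitHub account settings
token = "your_personal_access_token"

# authenticate to github (newer PyGithub releases prefer Github(auth=Auth.Token(token)))
g = Github(token)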
# get the authenticated user
user = g.get_user()
for repo in user.get_repos():
    print_repo(repo)
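
Hard-coding a credential in the script makes it easy to leak by accident. Here is a minimal sketch that reads it from an environment variable instead. The variable name GITHUB_TOKEN and the helper load_token are my own picks for illustration, not something PyGithub requires, and the value is assumed to be a personal access token:

```python
import os


def load_token(env=os.environ, name="GITHUB_TOKEN"):
    """Return the token stored in the environment, or fail with a clear message."""
    token = env.get(name, "").strip()
    if not token:
        raise RuntimeError(f"Set the {name} environment variable first")
    return token


# Demo with an explicit mapping so the snippet runs anywhere;
# in a real script call load_token() and pass the result to Github(...).
print(load_token({"GITHUB_TOKEN": "example-token"}))
```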

GitHub also recommends authenticating your requests: unauthenticated requests are limited to 60 per hour, and PyGithub raises a RateLimitExceededException once you go over the limit.
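
If you stay with plain requests, responses from the GitHub API include rate limit counters in the X-RateLimit-Remaining and X-RateLimit-Reset headers (the latter is a Unix timestamp). Here is a small sketch that reads them. The function name rate_limit_status is hypothetical and the header values in the demo are made up:

```python
import time


def rate_limit_status(headers, now=None):
    """Return (requests remaining, seconds until the limit resets)."""
    now = time.time() if now is None else now
    remaining = int(headers["X-RateLimit-Remaining"])
    reset_at = int(headers["X-RateLimit-Reset"])  # Unix timestamp (UTC)
    return remaining, max(0, reset_at - now)


# Made-up header values; with a live call pass requests.get(url).headers.
demo_headers = {"X-RateLimit-Remaining": "57", "X-RateLimit-Reset": "1590000600"}
remaining, wait = rate_limit_status(demo_headers, now=1590000000)
print(f"{remaining} requests left, resets in {wait} seconds")
```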

The GitHub API is quite rich. You can search for repositories by a specific query, just like you do on the website:

# search repositories by name
for repo in g.search_repositories("pythoncode tutorials"):
    # print repository details
    print_repo(repo)

At the time of writing, this returned 9 repositories and their information.

You can also search by programming language or topic:

# search by programming language
for i, repo in enumerate(g.search_repositories("language:python")):
    print_repo(repo)
    print("="*100)
    if i == 9:
        break
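
Qualifiers such as language: or topic: are just part of the query string, so they can be combined with free text and with each other. Here is a minimal sketch of assembling such a query and capping the number of results with itertools.islice instead of a manual counter. build_query is a hypothetical helper, not part of PyGithub:

```python
from itertools import islice


def build_query(text="", **qualifiers):
    """Join free text and key:value qualifiers into one search string."""
    parts = [text] if text else []
    parts.extend(f"{key}:{value}" for key, value in qualifiers.items())
    return " ".join(parts)


query = build_query("tutorials", language="python", topic="machine-learning")
print(query)  # tutorials language:python topic:machine-learning

# islice stops after 10 items without a manual counter; with PyGithub
# you would write: islice(g.search_repositories(query), 10)
first_ten = list(islice(range(1000), 10))
print(len(first_ten))  # 10
```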

If you're using the authenticated version, you can also create, update and delete files very easily using the API:

# searching for my repository (the first result must be a repo you can push to)
repo = g.search_repositories("pythoncode tutorials")[0]

# create a file, then commit and push it
repo.create_file("test.txt", "commit message", "content of the file")

# delete that created file
contents = repo.get_contents("test.txt")
repo.delete_file(contents.path, "remove test.txt", contents.sha)
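
Updating works the same way as deleting: the API wants the SHA of the file's current version, which get_contents provides. Here is a sketch built on PyGithub's repo.update_file(path, message, content, sha). The wrapper name update_text_file is my own, and RecordingRepo is only a stand-in that records the call so the example can run without touching a real repository:

```python
from types import SimpleNamespace


def update_text_file(repo, path, message, new_content):
    """Replace the content of an existing file with a new commit."""
    current = repo.get_contents(path)  # needed for the current SHA
    return repo.update_file(current.path, message, new_content, current.sha)


class RecordingRepo:
    """Stand-in for a Repository that just records what was requested."""

    def __init__(self):
        self.calls = []

    def get_contents(self, path):
        return SimpleNamespace(path=path, sha="made-up-sha")

    def update_file(self, path, message, content, sha):
        self.calls.append((path, message, content, sha))


demo_repo = RecordingRepo()
update_text_file(demo_repo, "test.txt", "update test.txt", "new content of the file")
print(demo_repo.calls)
```

Run against a real repository, an update like this shows up as one more commit next to the create and delete commits.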

And sure enough, after the execution of the above lines of code, the commits were created and pushed:

(Image: Github Commits)

Since there are a lot of other functions and methods you can use and we can't cover all of them, here are some useful ones you can test on your own:

  • g.get_organization(login): Returns an Organization object that represents a GitHub organization
  • g.get_gist(id): Returns a Gist object that represents a gist on GitHub
  • g.search_code(query): Returns a paginated list of ContentFile objects that represent matching files across several repositories
  • g.search_topics(query): Returns a paginated list of Topic objects that represent GitHub topics
  • g.search_commits(query): Returns a paginated list of Commit objects that represent commits on GitHub

There are a lot more. Use dir(g) to list the other methods, or check the PyGithub documentation and the GitHub API documentation for detailed information.
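
dir(g) also returns private and dunder names, so filtering it makes the list easier to scan. Here is a small sketch. The helper public_methods is hypothetical, and the Demo class stands in for the Github object so the snippet runs without PyGithub installed:

```python
def public_methods(obj, prefix=""):
    """List callable attribute names that start with `prefix`, skipping private ones."""
    return [
        name
        for name in dir(obj)
        if name.startswith(prefix)
        and not name.startswith("_")
        and callable(getattr(obj, name))
    ]


class Demo:
    """Stand-in object; try public_methods(g, "search_") on a Github instance."""

    def search_code(self): ...
    def search_topics(self): ...
    def get_gist(self): ...
    def _private(self): ...


print(public_methods(Demo(), "search_"))  # ['search_code', 'search_topics']
```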

Learn also: How to Use Google Custom Search Engine API in Python.

Happy Coding ♥
