How to Use Google Drive API in Python

Learn how you can use Google Drive API to list files, search for specific files or file types, download and upload files from/to Google Drive in Python.
Abdou Rockikz · 13 min read · Updated May 2020 · Using APIs


Google Drive lets you store your files in the cloud, where you can access them anytime from anywhere in the world. In this tutorial, you will learn how to list your Google Drive files, search over them, download stored files, and even upload local files to your drive programmatically using Python.

To get started, let's install the required libraries for this tutorial:

pip3 install google-api-python-client google-auth-httplib2 google-auth-oauthlib tabulate requests tqdm

Enable the Drive API

First, you need a Google account with Google Drive enabled. Head to this page and click the "Enable the Drive API" button.

A new window will pop up asking for your type of application. I'm going to stick with "Desktop app"; then hit the "Create" button. After that, another window appears saying you're all set.

Download your credentials by clicking the "Download Client Configuration" button, then hit "Done".

Finally, put the credentials.json file you just downloaded into your working directory (i.e., where you will execute the upcoming Python scripts).
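Before running any of the scripts, it can help to confirm that credentials.json is where the code expects it. The following is only a minimal sketch: describe_client_secrets() is a hypothetical helper (not part of any Google library), and it assumes the file is JSON with a single top-level key (for example "installed" for a desktop app) whose value contains a "client_id" entry.

```python
import json
import os


def describe_client_secrets(path="credentials.json"):
    """Return (app_type, client_id) read from a client secrets file.

    Hypothetical helper: assumes one top-level key (e.g. "installed")
    whose value is a dictionary holding "client_id".
    """
    if not os.path.exists(path):
        raise FileNotFoundError(f"{path} not found in {os.getcwd()}")
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)
    if not data:
        raise ValueError(f"{path} is empty")
    # take the first (normally the only) top-level key
    app_type = next(iter(data))
    return app_type, data[app_type].get("client_id")
```

If this raises FileNotFoundError, the script is most likely being run from a different directory than the one holding the file.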

List Files and Directories

Before we do anything, we need to authenticate our code with our Google account. The function below does that:

import pickle
import os
from googleapiclient.discovery import build
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request
from tabulate import tabulate

# If modifying these scopes, delete the file token.pickle.
SCOPES = ['https://www.googleapis.com/auth/drive.metadata.readonly']

def get_gdrive_service():
    creds = None
    # The file token.pickle stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the first
    # time.
    if os.path.exists('token.pickle'):
        with open('token.pickle', 'rb') as token:
            creds = pickle.load(token)
    # If there are no (valid) credentials available, let the user log in.
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                'credentials.json', SCOPES)
            creds = flow.run_local_server(port=0)
        # Save the credentials for the next run
        with open('token.pickle', 'wb') as token:
            pickle.dump(creds, token)
    # return Google Drive API service
    return build('drive', 'v3', credentials=creds)

We've imported the necessary modules. The above function was grabbed from the Google Drive quickstart page. It basically looks for the token.pickle file to authenticate with your Google account; if it doesn't find it (or the saved credentials are no longer valid and can't be refreshed), it uses credentials.json to prompt you for authentication in your browser. After that, it initiates the Google Drive API service and returns it.

Moving on to the main function, let's define a function that lists files in our drive:

def main():
    """Shows basic usage of the Drive v3 API.
    Prints the names and ids of the first 5 files the user has access to.
    """
    service = get_gdrive_service()
    # Call the Drive v3 API
    results = service.files().list(
        pageSize=5, fields="nextPageToken, files(id, name, mimeType, size, parents, modifiedTime)").execute()
    # get the results
    items = results.get('files', [])
    # print the returned files & folders as a table
    list_files(items)

So we used the service.files().list() method to return the first 5 files/folders the user has access to by specifying pageSize=5. We passed some useful fields to the fields parameter in order to get details about the listed files, such as mimeType (type of file), size in bytes, parent directory IDs, and the last modified date and time. Check this page to see all the other fields.
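Since pageSize=5 only returns the first page, here is a rough sketch of how the rest could be fetched by following nextPageToken. The name iter_all_files() and its default arguments are my own choices for illustration; the only API calls used are the same files().list() and execute() shown above.

```python
def iter_all_files(service, page_size=100,
                   fields="nextPageToken, files(id, name, mimeType)"):
    """Yield every file dictionary the account can see, page by page.

    Note: `fields` must include nextPageToken, otherwise the loop
    stops after the first page.
    """
    page_token = None
    while True:
        response = service.files().list(
            pageSize=page_size,
            fields=fields,
            pageToken=page_token,
        ).execute()
        # hand out the files of the current page one by one
        yield from response.get("files", [])
        page_token = response.get("nextPageToken")
        if not page_token:
            # no more pages
            break
```

Because it is a generator, you can stop early (for example with itertools.islice) without requesting pages you don't need.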

Notice we used the list_files(items) function, which we haven't defined yet. Since items is a list of dictionaries, it isn't that readable, so we pass it to this function in order to print it in a human-readable format:

def list_files(items):
    """given items returned by Google Drive API, prints them in a tabular way"""
    if not items:
        # empty drive
        print('No files found.')
    else:
        rows = []
        for item in items:
            # get the File ID
            id = item["id"]
            # get the name of file
            name = item["name"]
            try:
                # parent directory ID
                parents = item["parents"]
            except KeyError:
                # has no parents
                parents = "N/A"
            try:
                # get the size in nice bytes format (KB, MB, etc.)
                size = get_size_format(int(item["size"]))
            except (KeyError, ValueError):
                # no size reported, may be a folder
                size = "N/A"
            # get the Google Drive type of file
            mime_type = item["mimeType"]
            # get last modified date time
            modified_time = item["modifiedTime"]
            # append everything to the list
            rows.append((id, name, parents, size, mime_type, modified_time))
        print("Files:")
        # convert to a human readable table
        table = tabulate(rows, headers=["ID", "Name", "Parents", "Size", "Type", "Modified Time"])
        # print the table
        print(table)

We converted the items variable (a list of dictionaries) into the rows variable (a list of tuples), and then passed it to the tabulate module we installed earlier to print it in a nice format. Let's call the main() function:

if __name__ == '__main__':
    main()

See my output:

Files:
ID                                 Name                            Parents                  Size      Type                          Modified Time
---------------------------------  ------------------------------  -----------------------  --------  ----------------------------  ------------------------
1FaD2BVO_ppps2BFm463JzKM-gGcEdWVT  some_text.txt                   ['0AOEK-gp9UUuOUk9RVA']  31.00B    text/plain                    2020-05-15T13:22:20.000Z
1vRRRh5OlXpb-vJtphPweCvoh7qYILJYi  google-drive-512.png            ['0AOEK-gp9UUuOUk9RVA']  15.62KB   image/png                     2020-05-14T23:57:18.000Z
1wYY_5Fic8yt8KSy8nnQfjah9EfVRDoIE  bbc.zip                         ['0AOEK-gp9UUuOUk9RVA']  863.61KB  application/x-zip-compressed  2019-08-19T09:52:22.000Z
1FX-KwO6EpCMQg9wtsitQ-JUqYduTWZub  Nasdaq 100 Historical Data.csv  ['0AOEK-gp9UUuOUk9RVA']  363.10KB  text/csv                      2019-05-17T16:00:44.000Z
1shTHGozbqzzy9Rww9IAV5_CCzgPrO30R  my_python_code.py               ['0AOEK-gp9UUuOUk9RVA']  1.92MB    text/x-python                 2019-05-13T14:21:10.000Z

These are the files in my Google Drive. Notice that the Size column is scaled to a readable unit (B, KB, MB, etc.); that's because we used the get_size_format() function in the list_files() function. Here is the code for it:

def get_size_format(b, factor=1024, suffix="B"):
    """
    Scale bytes to its proper byte format
    e.g:
        1253656 => '1.20MB'
        1253656678 => '1.17GB'
    """
    for unit in ["", "K", "M", "G", "T", "P", "E", "Z"]:
        if b < factor:
            return f"{b:.2f}{unit}{suffix}"
        b /= factor
    return f"{b:.2f}Y{suffix}"

The above function should be defined before main() is called; otherwise, Python will raise a NameError. For convenience, check the full code.
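The Modified Time column in the output above is a plain string. If you want to sort or reformat it, a small sketch like the one below could turn it into a datetime object. parse_drive_time() is a hypothetical helper, and it assumes the timestamp always has fractional seconds and a trailing Z (UTC), as in my output.

```python
from datetime import datetime, timezone


def parse_drive_time(value):
    """Parse a string such as '2020-05-15T13:22:20.000Z' into an aware datetime."""
    # assumption: fractional seconds and the trailing 'Z' are always present
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    # 'Z' means UTC, so attach that timezone explicitly
    return parsed.replace(tzinfo=timezone.utc)
```

For example, parse_drive_time(item["modifiedTime"]).strftime("%Y-%m-%d %H:%M") would give a shorter value for the table.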

Remember that after you run the script, you'll be prompted in your default browser to select your Google account and permit your application for the scopes you specified earlier. Don't worry, this only happens the first time you run it; after that, token.pickle is saved and the authentication details are loaded from there instead.

Note: Sometimes you'll encounter a "This application is not validated" warning (since Google didn't verify your app) after you choose your Google account. It's okay to go to the "Advanced" section and permit the application to access your account.

Upload Files

In order to upload files to our Google Drive, we need to change the SCOPES list we specified earlier by adding the permission to add files/folders:

from __future__ import print_function
import pickle
import os.path
from googleapiclient.discovery import build
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request
from googleapiclient.http import MediaFileUpload

# If modifying these scopes, delete the file token.pickle.
SCOPES = ['https://www.googleapis.com/auth/drive.metadata.readonly',
          'https://www.googleapis.com/auth/drive.file']

Different scopes mean different privileges, so you need to delete the token.pickle file in your working directory and run the code again to authenticate with the new scopes.
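If you keep forgetting to delete token.pickle by hand, something like the sketch below could do it for you. Both function names are hypothetical, and the sketch assumes the unpickled credentials object exposes the granted scopes as a scopes attribute; if it doesn't, the token is simply treated as stale.

```python
import os


def token_covers_scopes(creds, wanted_scopes):
    """Return True when the saved credentials already include every wanted scope."""
    # assumption: creds.scopes is a list of scope URLs (or missing/None)
    granted = set(getattr(creds, "scopes", None) or [])
    return set(wanted_scopes).issubset(granted)


def drop_stale_token(creds, wanted_scopes, token_path="token.pickle"):
    """Delete token_path when creds lack a wanted scope; return True if deleted."""
    if creds is not None and not token_covers_scopes(creds, wanted_scopes):
        if os.path.exists(token_path):
            os.remove(token_path)
            return True
    return False
```

One possible place to call drop_stale_token(creds, SCOPES) is right after the token is unpickled in get_gdrive_service(), resetting creds to None when it returns True so that the browser flow runs again.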

We will use the same get_gdrive_service() function to authenticate our account. Let's make a function to create a folder and upload a sample file to it:

def upload_files():
    """
    Creates a folder and upload a file to it
    """
    # authenticate account
    service = get_gdrive_service()
    # folder details we want to make
    folder_metadata = {
        "name": "TestFolder",
        "mimeType": "application/vnd.google-apps.folder"
    }
    # create the folder
    file = service.files().create(body=folder_metadata, fields="id").execute()
    # get the folder id
    folder_id = file.get("id")
    print("Folder ID:", folder_id)
    # upload a text file
    # first, define file metadata, such as the name and the parent folder ID
    file_metadata = {
        "name": "test.txt",
        "parents": [folder_id]
    }
    # upload
    media = MediaFileUpload("test.txt", resumable=True)
    file = service.files().create(body=file_metadata, media_body=media, fields='id').execute()
    print("File created, id:", file.get("id"))

We used the service.files().create() method to create a new folder. We passed the folder_metadata dictionary, which holds the type and the name of the folder we want to create, and fields="id" to retrieve the folder ID so we can upload a file into that folder.

Next, we used the MediaFileUpload class to upload the sample file and passed it to the same service.files().create() method. Make sure you have a test file of your choice called test.txt. This time we specified the "parents" attribute in the metadata dictionary, where we simply put the ID of the folder we just created. Let's run it:

if __name__ == '__main__':
    upload_files()

After I ran the code, a new folder was created in my Google Drive. And indeed, after I enter that folder, I see the file we just uploaded.

We used a text file for demonstration, but you can upload any type of file you want. Check the full code of uploading files to Google Drive.
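To upload other kinds of files, the metadata dictionary can be built from the local path instead of being hard-coded. Here is a small sketch using only the standard library; build_upload_args() is a hypothetical helper, and the fallback to "application/octet-stream" for unknown extensions is my own choice.

```python
import mimetypes
import os


def build_upload_args(path, folder_id=None):
    """Return (file_metadata, mime_type) for a local file path."""
    # the name shown in Drive is the local file name without its directory
    metadata = {"name": os.path.basename(path)}
    if folder_id:
        # upload into this folder instead of the drive root
        metadata["parents"] = [folder_id]
    # guess the MIME type from the file extension
    mime_type, _ = mimetypes.guess_type(path)
    return metadata, mime_type or "application/octet-stream"


# possible usage together with the code above:
# metadata, mime_type = build_upload_args("google-drive-512.png", folder_id)
# media = MediaFileUpload("google-drive-512.png", mimetype=mime_type, resumable=True)
# service.files().create(body=metadata, media_body=media, fields="id").execute()
```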

Search for Files and Directories

Google Drive enables us to search for files and directories using the previously used list() method, just by passing the 'q' parameter. The function below takes the Drive API service and a query, and returns the filtered items:

def search(service, query):
    # search for the file
    result = []
    page_token = None
    while True:
        response = service.files().list(q=query,
                                        spaces="drive",
                                        fields="nextPageToken, files(id, name, mimeType)",
                                        pageToken=page_token).execute()
        # iterate over filtered files
        for file in response.get("files", []):
            result.append((file["id"], file["name"], file["mimeType"]))
        page_token = response.get('nextPageToken', None)
        if not page_token:
            # no more files
            break
    return result

Let's see how to use this function:

def main():
    # filter to text files
    filetype = "text/plain"
    # authenticate Google Drive API
    service = get_gdrive_service()
    # search for files that have the type text/plain
    search_result = search(service, query=f"mimeType='{filetype}'")
    # convert to table to print well
    table = tabulate(search_result, headers=["ID", "Name", "Type"])
    print(table)

So we're filtering text/plain files here by using "mimeType='text/plain'" as the query parameter. If you want to filter by name instead, you can simply use "name='filename.ext'" as the query parameter. See the Google Drive API documentation for more detailed information.

Let's execute this:

if __name__ == '__main__':
    main()

Output:

ID                                 Name           Type
---------------------------------  -------------  ----------
15gdpNEYnZ8cvi3PhRjNTvW8mdfix9ojV  test.txt       text/plain
1FaE2BVO_rnps2BFm463JwPN-gGcDdWVT  some_text.txt  text/plain

Check the full code here.
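Building the query with an f-string breaks as soon as a file name contains a single quote. The sketch below shows one way to assemble the query string more safely. quote_value() and build_query() are hypothetical helpers; they assume that single quotes and backslashes are escaped with a backslash inside query values, and they add a trashed = false clause by default, which the code above doesn't do.

```python
def quote_value(value):
    """Escape backslashes and single quotes for use inside a query value."""
    return value.replace("\\", "\\\\").replace("'", "\\'")


def build_query(name=None, mime_type=None, include_trashed=False):
    """Combine the optional filters into one query string joined with 'and'."""
    clauses = []
    if name is not None:
        clauses.append(f"name = '{quote_value(name)}'")
    if mime_type is not None:
        clauses.append(f"mimeType = '{quote_value(mime_type)}'")
    if not include_trashed:
        # skip files that are sitting in the trash
        clauses.append("trashed = false")
    return " and ".join(clauses)
```

The result can be passed straight to the search() function, e.g. search(service, query=build_query(mime_type="text/plain")).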

Download Files

To download a file, we first need to get the file we want to download. We can either search for it using the previous code or manually get its Drive ID. In this section, we're going to search for the file by name and download it to our local disk:

import pickle
import os
import re
import io
from googleapiclient.discovery import build
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request
from googleapiclient.http import MediaIoBaseDownload
import requests
from tqdm import tqdm

# If modifying these scopes, delete the file token.pickle.
SCOPES = ['https://www.googleapis.com/auth/drive.metadata',
          'https://www.googleapis.com/auth/drive',
          'https://www.googleapis.com/auth/drive.file'
          ]

I've added 2 scopes here because we're going to need to create a permission that makes files shareable and downloadable. Here is the main function:

def download():
    service = get_gdrive_service()
    # the name of the file you want to download from Google Drive 
    filename = "bbc.zip"
    # search for the file by name
    search_result = search(service, query=f"name='{filename}'")
    if not search_result:
        print(f"No file named {filename} was found.")
        return
    # get the GDrive ID of the first match
    file_id = search_result[0][0]
    # make it shareable
    service.permissions().create(body={"role": "reader", "type": "anyone"}, fileId=file_id).execute()
    # download file
    download_file_from_google_drive(file_id, filename)

You saw the first three lines in previous recipes: we simply authenticate with our Google account and search for the file that we want to download.

After that, we extract the file ID and create a new permission that will allow us to download the file. This is the same as clicking the button that creates a shareable link in the Google Drive web interface, so anyone who has the link can read the file.
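Since that permission stays in place after the download, you may want to remove it once you're done. The following is a sketch of one way to do that with a context manager. temporarily_public() is a hypothetical helper; it relies on permissions().create() returning the permission ID when fields="id" is requested and on permissions().delete() to revoke it.

```python
from contextlib import contextmanager


@contextmanager
def temporarily_public(service, file_id):
    """Share file_id with anyone who has the link, then revoke it on exit."""
    permission = service.permissions().create(
        fileId=file_id,
        body={"role": "reader", "type": "anyone"},
        fields="id",
    ).execute()
    try:
        # hand the permission ID to the caller while the file is shared
        yield permission["id"]
    finally:
        # runs even if the download fails, so the file doesn't stay public
        service.permissions().delete(
            fileId=file_id, permissionId=permission["id"]
        ).execute()
```

With this in place, the download call could be wrapped as `with temporarily_public(service, file_id): ...` in place of the bare permissions().create() line.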

Finally, we use our own download_file_from_google_drive() function to download the file. Here it is:

def download_file_from_google_drive(id, destination):
    def get_confirm_token(response):
        for key, value in response.cookies.items():
            if key.startswith('download_warning'):
                return value
        return None

    def save_response_content(response, destination):
        CHUNK_SIZE = 32768
        # get the file size from Content-length response header
        file_size = int(response.headers.get("Content-Length", 0))
        # extract Content disposition from response headers
        content_disposition = response.headers.get("content-disposition")
        # parse filename
        filename = re.findall("filename=\"(.+)\"", content_disposition)[0]
        print("[+] File size:", file_size)
        print("[+] File name:", filename)
        # the total is in bytes, so we advance the bar manually by each chunk's length
        progress = tqdm(total=file_size, desc=f"Downloading {filename}", unit="B", unit_scale=True, unit_divisor=1024)
        with open(destination, "wb") as f:
            for chunk in response.iter_content(CHUNK_SIZE):
                if chunk: # filter out keep-alive new chunks
                    f.write(chunk)
                    # update the progress bar
                    progress.update(len(chunk))
        progress.close()

    # base URL for download
    URL = "https://docs.google.com/uc?export=download"
    # init a HTTP session
    session = requests.Session()
    # make a request
    response = session.get(URL, params = {'id': id}, stream=True)
    print("[+] Downloading", response.url)
    # get confirmation token
    token = get_confirm_token(response)
    if token:
        params = {'id': id, 'confirm':token}
        response = session.get(URL, params=params, stream=True)
    # download to disk
    save_response_content(response, destination)  

I've grabbed a part of the above code from the downloading files tutorial. It simply makes a GET request to the target URL we constructed, passing the file ID as params in the session.get() method.

I've used tqdm to print a progress bar so we can see when the download will finish, which comes in handy for large files. Let's execute it:

if __name__ == '__main__':
    download()

This will search for the bbc.zip file, download it, and save it in your working directory. Check the full code.
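As an alternative that doesn't require making the file shareable, the MediaIoBaseDownload class we imported (but didn't use) can download through the authenticated service itself. The following is a sketch rather than a drop-in replacement: run_download() and download_with_api() are names I made up, and as far as I know this route works for regular binary files, while native Google Docs formats need an export instead.

```python
def run_download(downloader, report=print):
    """Keep pulling chunks until the downloader reports it is done."""
    done = False
    while not done:
        status, done = downloader.next_chunk()
        if status:
            # status.progress() is a float between 0 and 1
            report(f"Downloaded {int(status.progress() * 100)}%")


def download_with_api(service, file_id, destination):
    """Download a file through the authenticated Drive service."""
    # imported here so run_download() can be tried without the library installed
    from googleapiclient.http import MediaIoBaseDownload

    request = service.files().get_media(fileId=file_id)
    with open(destination, "wb") as fh:
        downloader = MediaIoBaseDownload(fh, request)
        run_download(downloader)
```

In download(), this would replace both the permissions().create() call and download_file_from_google_drive(), i.e. download_with_api(service, file_id, filename).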

Conclusion

Alright, there you have it. These are basically the core functionalities of Google Drive, and now you know how to do them in Python without manual mouse clicks!

Remember, whenever you change the SCOPES list, you need to delete the token.pickle file to authenticate to your account again with the new scopes. See this page for further information, along with a list of scopes and their explanations.

Feel free to edit the code to accept file names as parameters to download or upload them. Try to make the script as dynamic as possible by introducing the argparse module to make some useful scripts. Let's see what you build!
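As a starting point, a command-line interface could look roughly like the sketch below. The subcommand names and options are just my suggestion; wiring each command to the functions from this tutorial is left to you.

```python
import argparse


def build_parser():
    """Build a parser with one subcommand per task covered in this tutorial."""
    parser = argparse.ArgumentParser(description="Small Google Drive helper")
    sub = parser.add_subparsers(dest="command", required=True)

    list_cmd = sub.add_parser("list", help="list files and folders")
    list_cmd.add_argument("-n", "--page-size", type=int, default=5,
                          help="number of files to list")

    search_cmd = sub.add_parser("search", help="search with a Drive query")
    search_cmd.add_argument("query", help="e.g. \"mimeType='text/plain'\"")

    upload_cmd = sub.add_parser("upload", help="upload a local file")
    upload_cmd.add_argument("path", help="path of the local file")

    download_cmd = sub.add_parser("download", help="download a file by name")
    download_cmd.add_argument("filename", help="name of the file in Drive")
    return parser
```

Calling build_parser().parse_args() gives you args.command plus the arguments of the chosen subcommand, which you can then dispatch with a simple if/elif chain.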

Learn also: How to Use Google Custom Search Engine API in Python.

Happy Coding ♥
