OAuth2

This note accompanies the Bail Fund Case

Introduction

Most users are accustomed to accessing online services using password authentication – logging in to a website using a username and password. In this case, we will require a Python script to access an email inbox. In principle, we could simply provide our script with the Gmail account’s username and password, and thus give it all the rights it needs.

There are a number of important problems with this approach. First and foremost, passwords are an "all or nothing" authentication technique. Anyone with a Gmail account’s credentials can do anything with that account – read emails, send emails, change the account’s password, make purchases, etc.

Second, passwords don’t allow granular access controls. In particular, suppose you give access to two different apps, but then want to revoke one app’s access. If both apps have your password, there’s no way to do this – you would have to change the password, and then give the new password to all the apps that should still have access.

Third, passwords don’t allow you to know for sure which app did what on your account. Suppose you grant access to a number of apps by giving them your password, and you later discover a large purchase was made on your account. Each of the apps denies having made the purchase. There’s no way for you to know who’s to blame, because each app would have accessed your account using the same technique.

In fact, for these reasons, Google has not allowed this authentication technique since 2015 – even if you were to give your username and password to an app, it wouldn’t be able to access your Google account programmatically using these details. (As of the time this case was written, there are ways to get around that, but we won’t go into them here).

Enter OAuth2

Instead, Google uses a far more robust authentication technique called OAuth2 (this stands for Open Authorization, version 2). It is an open standard for access delegation, commonly used for users to grant limited access to their information on other websites.

The steps to getting an app authorization using OAuth2 are as follows (see the official documentation):

  1. Request OAuth2 client credentials from Google while signed in to your account – these credentials are linked to a specific project; in some sense, they are your app’s "calling card" to Google.
  2. When your app runs, it presents its "calling card" to Google, and it requests very specific permissions (for example, permission to read emails, or to send them, or to create calendar events).
  3. Google will pop up a browser window that asks the user to grant your app the specific permissions it has requested.
  4. Assuming the user grants permissions, Google sends a token back to your app.
  5. Your app can then use this token as a password to do the specific things you gave it permission for.
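
As a rough illustration of steps 2 and 5, the sketch below builds the kind of URL an app sends the user to when it requests permissions, and the header it later attaches to its API requests. It uses only the Python standard library. The endpoint and parameter names follow Google's OAuth2 documentation for installed apps as we understand it; the client ID and redirect address are made-up placeholders, and the function names are our own. In practice, the libraries introduced later in this note do all of this for you.

```python
from urllib.parse import urlencode

# Google's OAuth2 authorization endpoint
AUTH_ENDPOINT = 'https://accounts.google.com/o/oauth2/v2/auth'

def build_authorization_url(client_id, scopes, redirect_uri):
    '''Build the URL the user visits to grant the requested permissions (step 2).'''
    params = {
        'client_id': client_id,          # the app's "calling card"
        'redirect_uri': redirect_uri,    # where Google sends the user afterwards
        'response_type': 'code',         # ask for an authorization code
        'scope': ' '.join(scopes),       # space-separated list of permissions
    }
    return AUTH_ENDPOINT + '?' + urlencode(params)

def bearer_header(access_token):
    '''Build the HTTP header that presents a token to the API (step 5).'''
    return {'Authorization': 'Bearer ' + access_token}

# Placeholder values, for illustration only
url = build_authorization_url(
    client_id='my-client-id.apps.googleusercontent.com',
    scopes=['https://www.googleapis.com/auth/gmail.readonly'],
    redirect_uri='http://localhost:8080/')
print(url)
```

In the full protocol, Google first returns a short-lived authorization code, which the app then exchanges for the token; the steps above fold this exchange into step 4.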

Note that OAuth2 is ubiquitous, and also used by Facebook, Amazon, Microsoft, and Twitter. In fact, you’ve almost certainly granted permission to your Facebook profile using an OAuth workflow before. When logging into Tinder using Facebook, for example, the app redirects you to Facebook and asks you to grant permission to your profile. When it does this, Tinder is requesting a token from Facebook, which it then uses to pull pictures and information from Facebook for your profile.

This protocol solves all the problems we mentioned above. Each token only grants access to specific resources, and tokens can be revoked for one app only, with all other tokens remaining valid. In fact, you can see all the tokens you have ever issued on your Google and Facebook accounts and revoke them.
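
To make the contrast with passwords concrete, here is a toy model of a service that issues one token per app, limits each token to specific scopes, and lets the user revoke a single token without affecting the others. This is not how Google implements tokens, and every name in it (ToyTokenRegistry, the scope names, the app names) is invented for illustration.

```python
import secrets

class ToyTokenRegistry:
    '''A toy stand-in for the service that issues and checks tokens.'''

    def __init__(self):
        # Maps each token string to the app it was issued to and its scopes
        self._tokens = {}

    def issue(self, app_name, scopes):
        '''Issue a fresh random token to one app, limited to the given scopes.'''
        token = secrets.token_urlsafe(16)
        self._tokens[token] = {'app': app_name, 'scopes': set(scopes)}
        return token

    def is_allowed(self, token, scope):
        '''Return True only if the token is still valid and covers this scope.'''
        entry = self._tokens.get(token)
        return entry is not None and scope in entry['scopes']

    def who(self, token):
        '''Return the app a token was issued to, or None if unknown or revoked.'''
        entry = self._tokens.get(token)
        return entry['app'] if entry else None

    def revoke(self, token):
        '''Revoke one token; all other tokens remain valid.'''
        self._tokens.pop(token, None)

registry = ToyTokenRegistry()
mail_token = registry.issue('mail-reader', ['read_email'])
calendar_token = registry.issue('calendar-app', ['create_events'])

print(registry.is_allowed(mail_token, 'read_email'))         # True: within its scope
print(registry.is_allowed(mail_token, 'send_email'))         # False: never granted
print(registry.who(calendar_token))                          # calendar-app
registry.revoke(mail_token)
print(registry.is_allowed(mail_token, 'read_email'))         # False: revoked
print(registry.is_allowed(calendar_token, 'create_events'))  # True: unaffected
```

Because every request arrives with a token that belongs to exactly one app, the service can also tell which app did what, which addresses the third problem above.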

In Google, go to google.com, click on your picture on the top right hand corner, click on "Manage your Google Account", choose the "Security" option on the menu on the left, and scroll down to the section entitled "Third-party apps with account access". If you click on "Manage third party access", you’ll see every token you ever issued – click on any item to revoke access.

In Facebook, click on the down arrow at the top right corner of the Facebook page, click on "Settings & Privacy", then "Settings", and then select "Apps and Websites" in the menu on the left.

Accessing the Google API Using Python

In this section, we will teach you how to register a project with Google, download client credentials, and then use Python to obtain a token using those credentials.

Note that if you are covering the Bail Fund Case in class, you should have been given a file called client_credentials.json. You can skip the instructions to obtain these credentials, and jump directly to the last part of this section.

Creating a project

Before we begin, we have to create a Google Project. This project is simply a way to register your app on the Google Cloud Platform - it is this project that will be requesting a token, and access to the Google API.

Open the Google Cloud Platform Console. If this is your first time using the console, you will likely be asked to select your country, and accept the terms and conditions:

You will then be asked how you intend to use the project - feel free to answer the question or skip it.

Next, create a project by clicking "Select a Project" near the top left-hand corner of the page. In the window that pops up, select "New Project" on the top right. In the resulting dialogue, choose a project name (Google will automatically suggest one for you).

Finally click on "Create". A few seconds later, your project will be ready.

Requesting client credentials

The next step is to download client credentials - your app's "calling card". Your Python script will send these credentials to Google to obtain a token.

When you return to the Google Cloud Platform Console, the new project you created should be selected by default.

Before you download credentials, you will need to configure the OAuth consent screen. This is the screen that will be presented to your user in step 3 above. Click on the three lines at the top left of the window to open the navigation menu, hover over "APIs & Services", and select "OAuth consent screen":

You will then be asked to choose whether your app is an "Internal" or "External" app. "Internal" apps are meant for Google Workspace users who want to design apps that will only be used by users at their company in the same Google Workspace. You will likely be using a personal account to carry out these steps, and so you should select "External":

Click "Create". You will then be asked to select a name for your app, and an email address users can contact if they have issues with the app. If you look on the right of this window, you will see a preview of what the consent screen will look like. It should look familiar from times you've granted apps a Google token before:

Once you've entered these details, you can leave the rest blank, scroll to the bottom of the page, enter your email address, and click "Save and Continue".

The next page will ask you to select the scopes you would like your app to have access to. "Scopes" is Google's name for the specific permissions your app might request. For example, the https://www.googleapis.com/auth/gmail.readonly scope will allow your app to read emails, but not send them. You can scroll to the bottom of this page and click "Save and Continue" without selecting any scopes - we will be specifying the correct scopes in our code.

The next screen will allow you to specify the users allowed to use the app. In a real app, any user would be able to use the app and request a token from their account. To prevent people from creating malicious apps, Google will limit the number of users who can use the app to 100 until it is verified by Google. In this use case, it shouldn't be a problem, since you will be using this app to get programmatic access to your own inbox. Simply click on "Add Users", enter your email address, and click "Add". Then click "Save and Continue".

Finally, you'll see a summary screen. Scroll to the bottom and click on "Back to Dashboard".

All right! We're now finally ready to download our credentials. Once again, click on the three lines at the top left of the window to open the navigation menu, hover over "APIs & Services", and select "Credentials".

At the top of the page, select "Create Credentials", and "OAuth client ID":

In the "Application Type" dropdown, select "Desktop app" (since you will simply be using this app in a Python script on your desktop). Pick a name, and click "Create".

You will then see a screen that lists two long pieces of text - these are your client credentials! You can use these credentials to request a token from Google. Click on "Download JSON", and save the file as client_credentials.json somewhere safe - the file will contain everything you need to make that request.
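
If you would like to check that the file you downloaded looks right, the sketch below loads it and reports a few of its fields without printing the client secret. It assumes the usual layout of these files, in which a "Desktop app" credential is stored under a top-level "installed" key (and a web app credential under "web"); the function name is our own.

```python
import json

def describe_client_credentials(path='client_credentials.json'):
    '''Return the non-secret fields of a client credentials file.'''
    with open(path) as f:
        data = json.load(f)
    # Desktop app credentials are expected under "installed";
    # web app credentials are expected under "web"
    for app_type in ('installed', 'web'):
        if app_type in data:
            details = data[app_type]
            return {
                'app_type': app_type,
                'client_id': details.get('client_id'),
                'token_uri': details.get('token_uri'),
                'has_client_secret': 'client_secret' in details,
            }
    raise ValueError('Unrecognized client credentials file: ' + path)

# Usage, once the file has been downloaded:
# print(describe_client_credentials('client_credentials.json'))
```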

Getting a Token in Python

Important note: the steps in this section will generate a token with access to a real Google account. In class, we will only generate tokens for the "toy" Google accounts associated with the bail fund case. If you generate a token for your own personal account, you do so at your own risk - anyone with that token will be able to access it.

You should now have a file called client_credentials.json. We will now go through the process of requesting a token from Google using these credentials.

First, you will need to install a number of packages by running the following command wherever you usually run pip:

pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib
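
If you are unsure whether the installation succeeded, a quick check along the following lines can help. It is only a sketch: it looks for the top-level modules that we understand these three packages to provide, and does not verify their versions.

```python
import importlib.util

# Module names provided by the three packages installed above
REQUIRED_MODULES = {
    'googleapiclient': 'google-api-python-client',
    'google_auth_httplib2': 'google-auth-httplib2',
    'google_auth_oauthlib': 'google-auth-oauthlib',
}

def missing_packages():
    '''Return the pip names of any required packages that cannot be found.'''
    return [pip_name for module, pip_name in REQUIRED_MODULES.items()
            if importlib.util.find_spec(module) is None]

missing = missing_packages()
if missing:
    print('Still need to install:', ', '.join(missing))
else:
    print('All required packages are installed.')
```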

These packages will provide functions that will allow us to easily interact with the Google API. Import them in your code:

from googleapiclient.discovery import build
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request
from google.oauth2.credentials import Credentials

As we briefly mentioned above, "Scopes" is the term Google uses for the specific permissions you want to grant using this token. These scopes need to be provided to Google in the form of a list of strings, where each string contains one scope. Here, we will only want to read emails, and we will define a list with one item only:

SCOPES = ['https://www.googleapis.com/auth/gmail.readonly']

You can see what other scopes are available for access to Gmail in Google's Gmail API documentation.

Next, we need to create an authentication flow object - this is the process by which the app will present its client credentials to Google and get a token in response. Luckily, the packages you've just imported make this easy:

flow = InstalledAppFlow.from_client_secrets_file('client_credentials.json', SCOPES)

This instructs Python to create an authentication flow object using the client credentials you downloaded into your JSON file.

We're now ready to get the token! Run the following line:

creds = flow.run_local_server()

This should open up a new window that will allow you to carry out the authorization flow. First, select the account you would like your app to have authorization to. As you might remember from when we configured the OAuth consent screen, our app is still in testing mode, and can therefore only get access to the specific user accounts you listed at that point. This is fine for our purposes, since we'll only be accessing the bail fund email address.

Select that account (or log in to it using the details provided by your instructor). You will then see a blood-curdling warning informing you that Google has not verified this app yet. Click "Continue". Finally, you will see a window listing all the scopes your script has requested, with a checkbox next to each one. Put a check next to every box, and click "Continue".

You will finally be told the authentication flow has completed - you can then close the window.

Returning to Python, you will find your code has run. The creds variable will now contain the token your script will need to access the Google API with the scopes requested. You can then connect to Gmail using this token as follows:

service = build('gmail', 'v1', credentials=creds)

You can now use service to interact with Google - we will discuss how in a later appendix.
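
In the meantime, if you would like a quick way to confirm that the connection works, a sketch along the following lines lists the IDs of a few messages in the mailbox. It relies on the Gmail API's users().messages().list() call; the helper name list_message_ids is our own, and the details of working with messages are left to the later appendix.

```python
def list_message_ids(service, max_results=5):
    '''Return the IDs of up to max_results messages in the mailbox.'''
    response = service.users().messages().list(
        userId='me',              # 'me' means the account that issued the token
        maxResults=max_results).execute()
    # The 'messages' key is absent when there are no messages to return
    return [message['id'] for message in response.get('messages', [])]

# Usage, with the service object created above:
# print(list_message_ids(service))
```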

One last note - it would be a little inconvenient to have to go through this process every time you restart your Python kernel. Luckily, there is a way to save the token in a file. First, try and run the following command:

creds.to_json()

The result of this command contains the token you just created. Be incredibly careful with it - treat this token like a password, because it is. Note that it will include an expiration date.
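
If you want to look at the expiration date and scopes without displaying the secret parts of the token on your screen, you could use a small helper like the one sketched below. It assumes the JSON has the fields shown in the example later in this section ("scopes", "refresh_token", and an "expiry" timestamp in UTC ending in "Z"); the function name is our own.

```python
import json
from datetime import datetime, timezone

def summarize_token(token_json):
    '''Summarize a token's JSON text without revealing the secret values.'''
    data = json.loads(token_json)
    summary = {
        'scopes': data.get('scopes'),
        'has_refresh_token': bool(data.get('refresh_token')),
        'expiry': None,
        'expired': None,
    }
    if data.get('expiry'):
        # The expiry is a UTC timestamp ending in "Z"; rewrite the suffix so
        # datetime.fromisoformat accepts it on older Python versions
        expiry = datetime.fromisoformat(data['expiry'].replace('Z', '+00:00'))
        summary['expiry'] = expiry
        summary['expired'] = expiry <= datetime.now(timezone.utc)
    return summary

# Usage, once you have obtained creds as described above:
# print(summarize_token(creds.to_json()))
```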

We could copy and paste this token directly into our code, and then use it to directly authenticate with Google rather than go through the whole flow procedure:

# Note: DO NOT DO THIS!
token = {"token": "ya29.a0ARrdaM",
         "refresh_token": "1//0d6hgIQqptO92CgYIARAAGA0",
         "token_uri": "https://oauth2.googleapis.com/token",
         "client_id": "172124637047-apa2crsnseeugeinle6rp92bml7c77ro.apps.googleusercontent.com",
         "client_secret": "2JSYRxAyrcer8zHPj-2pAead",
         "scopes": ["https://www.googleapis.com/auth/gmail.readonly"],
         "expiry": "2021-09-20T18:58:27.496896Z"}
         
SCOPES = ['https://www.googleapis.com/auth/gmail.readonly']
creds = Credentials.from_authorized_user_info(token, SCOPES)
service = build('gmail', 'v1', credentials=creds)

This is, of course, a terrible idea. Putting a token in your code directly means you might inadvertently reveal it to someone else when you send them your code. A much better thing to do is to save the token in a file as follows:

with open('token.json', 'w') as token_file:
    token_file.write(creds.to_json())

A file called token.json will be created containing the token. You can then authenticate with Google using the file directly, as follows:

SCOPES = ['https://www.googleapis.com/auth/gmail.readonly']
creds = Credentials.from_authorized_user_file('token.json', SCOPES)
service = build('gmail', 'v1', credentials=creds)
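
Because token.json is as sensitive as a password, you may also want to make sure other users of your computer cannot read it. The sketch below is one way to do this, using only the standard library; the function name is our own, and the permission settings only have their full effect on Linux and macOS.

```python
import os

def save_token_privately(token_json, path='token.json'):
    '''Write token text to a file that only the current user can read.'''
    # Create (or overwrite) the file with owner-only read/write permissions
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, 'w') as token_file:
        token_file.write(token_json)
    # If the file already existed, tighten its permissions as well
    os.chmod(path, 0o600)

# Usage, once you have obtained creds as described above:
# save_token_privately(creds.to_json())
```

It is also worth keeping token.json and client_credentials.json out of any folder you share or place under version control.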

Getting a Token - A More Streamlined Approach

We can combine the steps described in the previous section into a more streamlined function gmail_connect which checks whether a token file currently exists, loads it if it does, and obtains a new token if not. Note that this function also requires the os.path module, to check whether the file exists.

import os.path
from googleapiclient.discovery import build
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request
from google.oauth2.credentials import Credentials

def gmail_connect(client_credentials_file='client_credentials.json', token_file='token.json'):
    '''
    This function establishes a connection to the Gmail API for reading
    only. It takes two arguments:
      - client_credentials_file: a json file containing client credentials
        from Google
      - token_file: a json file containing a previously obtained token. If
        this file exists and is valid, it will be used to establish a
        connection. If not, a new token will be obtained and saved to
        this file
    
    The function returns a Google service object that can be used to access
    the Gmail API.
    '''
    
    # Specify the read-only scope we'll want
    SCOPES = ['https://www.googleapis.com/auth/gmail.readonly']
    
    # Begin with empty credentials
    creds = None
    
    # Check whether the token file exists
    if os.path.exists(token_file):
        # Attempt to connect using this token file
        creds = Credentials.from_authorized_user_file(token_file, SCOPES)
        
    # If valid credentials were not obtained, get new ones
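    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            # The saved token has expired but can be refreshed without
            # asking the user to go through the consent screen again
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(client_credentials_file, SCOPES)
            creds = flow.run_local_server(port=0)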
        
        # Save the credentials for future use
        with open(token_file, 'w') as token:
            token.write(creds.to_json())
    
    # Authenticate to Gmail with this token, and return the Google service
    service = build('gmail', 'v1', credentials=creds)
    return service