Import GA Information to Salesforce – Part 6, Breaking it Down

As the final part of the series I will go through the script from the last part to give an idea of what it does (which isn’t terribly much, since most of the work is done by the respective packages).

First we have to import the packages:


from apiclient.discovery import build
from oauth2client.service_account import ServiceAccountCredentials

import httplib2
from oauth2client import client
from oauth2client import file
from oauth2client import tools

import requests
from simple_salesforce import Salesforce

“import”, as the name says, imports a library or package, meaning it’s made available within the program. Packages are organized in modules; if you do not need everything in a package, use the “from” statement (i.e. from <package> import <name>) to import only the part you need. So we are importing the API client library for Google and the OAuth stuff that is needed to authenticate against your Google account. simple_salesforce is of course the Salesforce API client, and the requests library is a dependency thereof (meaning Salesforce access will not work without it, as simple_salesforce calls functions from the requests library).
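To make the difference between the two import styles concrete, here is a minimal sketch that uses only the Python standard library. The json module and os.path.basename are stand-ins I picked for illustration; they are not part of the script.

```python
# Plain import: the whole module is available under its own name.
import json

# "from" import: only the named part is pulled in and can be used without a prefix.
from os.path import basename

encoded = json.dumps({"medium": "cpc"})          # needs the "json." prefix
key_file = basename("/tmp/client_secrets.p12")   # no prefix needed

print(encoded)
print(key_file)
```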

We use the Google library in the next bit:


def get_service(api_name, api_version, scope, key_file_location,
                service_account_email):

  credentials = ServiceAccountCredentials.from_p12_keyfile(
    service_account_email,
    key_file_location,
    scopes = scope
  )

  http = credentials.authorize(httplib2.Http())

  # Build the service object.
  service = build(api_name, api_version, http=http)

  return service

def is how Python defines a function. As in other languages, a function is a container for a reusable piece of code (there are more complex containers like objects, but we do not create our own objects in this script). The def keyword is followed by the function name.

A function usually (but not necessarily) takes some input and usually (but not necessarily) returns a value (which might be computed from the input values). A function may call other functions in its function body.

The input values are called arguments (or sometimes parameters). The names in the parentheses after the function name are the arguments, and one can use them as variables within the function.

The output value is referred to as the return value, and fittingly it is the variable that appears after the “return” statement (which is a keyword that tells the function to end and emit a value). The return is necessary because the function has a “scope”, which basically means that things that happen in the function stay in the function. If you want to access a value that is computed in the function, you have to return it.
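A small sketch of arguments, return values and scope. The function build_view_id and the id '12345' are made up for this illustration and do not appear in the script.

```python
def build_view_id(profile_id):
    # "prefixed" exists only while the function runs.
    prefixed = 'ga:' + profile_id
    return prefixed

# The returned value is the only thing that leaves the function.
view_id = build_view_id('12345')
print(view_id)

# The local variable is not visible outside the function.
try:
    print(prefixed)
except NameError:
    print("prefixed is not defined outside the function")
```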

The function passes the service account email and the location of the key file to ServiceAccountCredentials.from_p12_keyfile. That method reads the key file for us, so we do not have to open it ourselves, and uses the key together with the service account email to create a valid set of credentials, which in turn are used to authorize access.

The penultimate line finally creates a service, i.e. a means of access to a specific Google API (again, as specified in the function arguments). The service exists only in the function that created it, so to use it in some other place in the program you have to “return” it to the place where the function was called.

For Salesforce access we have something similar:


def get_sf_service(sfuser,sfpass,sftoken):
  sf = Salesforce(username=sfuser, password=sfpass, security_token=sftoken)
  return sf

The reason this is shorter is that authorization is somewhat less roundabout with SF. You pass your username, password and your security token to the Salesforce API client and it returns the thing (an object instance of the SF API client).

To import data to Salesforce we first need to extract it from Google Analytics. This bit does the very thing:


def get_results(service, profile_id):
  return service.data().ga().get(
      ids='ga:' + profile_id,
      start_date='30daysAgo',
      end_date='today',
      dimensions='ga:dimension1,ga:source,ga:medium,ga:campaign',
      metrics='ga:sessions').execute()

We pass in a service and return the value of a (somewhat convoluted) function call. As you can see we return it directly; we do not assign the value to a variable first (not necessarily good practice, but possible). There are a few arguments that are used to make the Google Analytics query.

There is the profile id (i.e. the id of the data view we extract the values from), a timeframe (determined by start_date and end_date respectively), dimensions (i.e. the categorical data by which we want to break down our metrics) and the metrics (the numbers we want to retrieve). We do not really need the metrics, but they are a technical requirement: one cannot make a query to GA without at least one metric.

If you find you want to import data other than campaign data to SF, the line with the dimensions is what you need to change.
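If you expect to change the dimensions more than once, one option is to pass them in instead of hard-coding them. The following is only a sketch: build_query and get_results_for are hypothetical helpers of mine, and ga:keyword is just an example of a different dimension. The first dimension stays ga:dimension1 because the update function relies on the id being in the first column.

```python
def build_query(profile_id, dimensions):
    # Same parameters as in get_results, but the dimensions are
    # passed in as a list instead of being hard-coded.
    return {
        'ids': 'ga:' + profile_id,
        'start_date': '30daysAgo',
        'end_date': 'today',
        'dimensions': ','.join(dimensions),
        'metrics': 'ga:sessions',
    }

def get_results_for(service, profile_id, dimensions):
    # "service" is the object returned by get_service.
    return service.data().ga().get(**build_query(profile_id, dimensions)).execute()

# Only the query is built here, so this part runs without network access.
query = build_query('12345', ['ga:dimension1', 'ga:keyword'])
print(query['dimensions'])
```

Keep in mind that the function that writes to Salesforce would then also have to map the new columns to the right fields.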

It is not enough to read the values from GA, you also need to write them to Salesforce.


def update_sf_lead(sf_service, row):
  try:
    # Look up the lead whose gaid__c field matches the GA custom dimension.
    sf_row = sf_service.Lead.get_by_custom_id('gaid__c', row[0])
    if sf_row:
      lead_id = sf_row['Id']
      # Write the campaign values from GA to the lead.
      sf_service.Lead.update(lead_id, {'source__c': row[1], 'medium__c': row[2], 'campaign__c': row[3]})
      print("Lead updated from row with Id " + row[0])
  except Exception:
    print("Row with Id " + row[0] + " did not update anything")

First is the function definition, and we pass in the Salesforce API object and a single row of the results we got from Google.

This is the only function in this program that implements some error handling. “try” does exactly what the name says: it tries whether the following code can be executed without errors. If that is not possible, it executes the code that follows the line “except Exception:”. Usually a program stops after an error; exception handling allows it to continue in a controlled fashion (usually to close all open resources before the program shuts down, although here we simply print an error message and continue with the program).
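Here is try/except in isolation, as a minimal sketch that is unrelated to Salesforce. The function parse_sessions and the sample values are invented for the illustration.

```python
def parse_sessions(value):
    try:
        # int() raises a ValueError if the text is not a whole number.
        return int(value)
    except ValueError:
        # Report the problem and carry on instead of stopping the program.
        print("Could not read " + repr(value) + " as a number")
        return None

for raw in ['12', 'n/a', '7']:
    print(parse_sessions(raw))
```

Unlike the script, this catches one specific error type (ValueError) rather than every Exception, which makes it less likely that an unrelated bug gets silently swallowed.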

Before we write we have to read; the update function requires a unique identifier for the row we want to write to, so we check whether Salesforce has a lead where the UAID field (gaid__c in the code) is the same as the custom dimension value in Google Analytics. If there is an id, we can update the lead by passing in the campaign values.
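How a row from Google Analytics maps to the Salesforce fields is perhaps easier to see when it is pulled out into its own step. This is a sketch only; row_to_lead_fields is a hypothetical helper and the sample row contains made-up values.

```python
def row_to_lead_fields(row):
    # Column order follows the dimensions in the GA query:
    # ga:dimension1, ga:source, ga:medium, ga:campaign (then the metric).
    return row[0], {
        'source__c': row[1],
        'medium__c': row[2],
        'campaign__c': row[3],
    }

sample_row = ['abc-123', 'google', 'cpc', 'spring_sale', '4']  # made-up values
ga_id, fields = row_to_lead_fields(sample_row)
print(ga_id)
print(fields)
```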

By the way, the sf_service variable is an object, a complex variable that contains both values and “methods” (functions that are part of the object). Object-oriented programming has a number of advantages that I won’t go into, but among other things it is a better way to organize program code. One can access “properties” (both values and methods) by writing the name of the object, then a dot “.”, and then the name of the variable or function one wants to access.
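The dot notation works the same way for every object, so here is a short sketch with a date object from the standard library (chosen only because it needs no credentials; the date itself is arbitrary).

```python
import datetime

report_day = datetime.date(2016, 1, 31)   # create an object

print(report_day.year)         # a value stored in the object
print(report_day.isoformat())  # a method, called with parentheses
```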

We could define a return value that tells the calling function whether the update has been properly executed, but instead we print a status message to stdout and do not return anything.
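For the curious, a variant with a return value might look roughly like this. It is a sketch under assumptions: update_sf_lead_with_status is a hypothetical name, and FakeLead and FakeSalesforce are stand-ins I wrote so the example runs without a Salesforce account. They only imitate the two calls the function uses.

```python
def update_sf_lead_with_status(sf_service, row):
    # Variant of update_sf_lead that reports success to the caller.
    try:
        sf_row = sf_service.Lead.get_by_custom_id('gaid__c', row[0])
        if not sf_row:
            return False
        sf_service.Lead.update(sf_row['Id'], {
            'source__c': row[1],
            'medium__c': row[2],
            'campaign__c': row[3],
        })
        return True
    except Exception:
        return False


class FakeLead(object):
    # Stand-in for the real Lead resource so the sketch runs offline.
    def get_by_custom_id(self, field, value):
        if value != 'known-id':
            raise KeyError(value)
        return {'Id': '00Q000000000001'}

    def update(self, record_id, data):
        return 204


class FakeSalesforce(object):
    Lead = FakeLead()


rows = [['known-id', 'google', 'cpc', 'spring_sale', '4'],
        ['unknown-id', 'bing', 'organic', '(not set)', '1']]
updated = sum(1 for row in rows if update_sf_lead_with_status(FakeSalesforce(), row))
print(str(updated) + " of " + str(len(rows)) + " rows updated")
```

With a return value the calling code can count successes or react to failures, which a printed message does not allow.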

The last function is the main function. In this script it gets automatically called if you run the script through the Python interpreter.


def main():

  scope = ['https://www.googleapis.com/auth/analytics.readonly']

  sfuser = "XXXXXXXXXXXXX.XXXXXXXXXXXXXXX.com"
  sfpass = "XXXXXXXXXXXXXX"
  sftoken = "XXXXXXXXXXXXXX"

  service_account_email = "my-project@boreal-furnace-118319.iam.gserviceaccount.com"
  key_file_location = "client_secrets.p12"
  profile = "XXXXXXXXXXXXXX"

  sf_service = get_sf_service(sfuser,sfpass,sftoken)
  service = get_service('analytics', 'v3', scope, key_file_location,service_account_email)

  results = get_results(service, profile)
  if results.get('rows', []):
    for row in results.get('rows'):
      update_sf_lead(sf_service, row)

if __name__ == '__main__': main()

This defines a few variables (i.e. assigns values to placeholders). There is the scope for the Google APIs (i.e. what the program will be allowed to do – in this case only reading Google Analytics data). Then there are credentials for Salesforce and the Google Service Account, and the id of the data view we want to use.

The rest is simply calling the functions we defined above: create an authorized Salesforce object and get a Google Analytics service. Fetch the results from Google Analytics, which come as a list of many rows, each of which is divided into data fields. One by one we pass each row to the function that updates the Salesforce lead (if possible).
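To picture the shape of the results, here is a sketch with invented data. The real response contains more entries than 'rows', and the values below are not real Analytics output.

```python
# Made-up sample: a dictionary whose 'rows' entry is a list of rows,
# each row being a list of field values.
results = {
    'rows': [
        ['abc-123', 'google', 'cpc', 'spring_sale', '4'],
        ['def-456', 'newsletter', 'email', 'march_issue', '2'],
    ]
}

for row in results.get('rows', []):
    print(row[0] + " came from " + row[1] + " / " + row[2])

# Without a 'rows' entry the default [] means the loop simply does nothing.
empty = {}
print(len(empty.get('rows', [])))
```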

The last line simply runs the main function when the script is called, and that is all.

As I’ve said before, this is not exactly production code. But frankly, a lot of people doodle around in this fashion. As long as you keep your credentials safe and run this on your own computer, this is a workable, if not particularly elegant, way to enrich your SF data with values from Google Analytics.
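One common way to keep credentials out of the script file is to read them from environment variables. This is a sketch of that idea, not something the script does: load_sf_credentials and the variable names SF_USER, SF_PASS and SF_TOKEN are my own invention.

```python
import os

def load_sf_credentials(environ=os.environ):
    # SF_USER, SF_PASS and SF_TOKEN are hypothetical variable names.
    names = ['SF_USER', 'SF_PASS', 'SF_TOKEN']
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return tuple(environ[name] for name in names)

# Demonstration with a plain dictionary instead of the real environment.
sfuser, sfpass, sftoken = load_sf_credentials({
    'SF_USER': 'user@example.com',
    'SF_PASS': 'not-a-real-password',
    'SF_TOKEN': 'not-a-real-token',
})
print(sfuser)
```

In the script the three values would then come from load_sf_credentials() instead of being typed into the main function.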

