October 29, 2007 by Justin Cutroni
A lot of people ask about integrating Google Analytics data with other types of data. Most people are interested in some type of CRM integration so that’s what I’m going to write about.
Let me start by saying that there is no API to Google Analytics. There is no formal way to pull data out and integrate it into some other application. Sure, there are some hacks out there to manipulate the report URLs but, in my opinion, this is a pretty big hack and I don’t recommend it to our clients.
When we talk about integrating Google Analytics with a CRM we’re talking about pulling information about the visitor’s originating source and sending it to a CRM system. We’re not going to pull the visitor’s entire history, just where they came from, and attach that to the other information they enter into a form.
Why do this?
Pulling a visitor’s source information is very helpful to a sales team. Imagine if a sales rep could identify how a lead found the website before picking up the phone and contacting them. Remember, a visitor’s source info describes how the visitor found the site. Did they respond to a specific email with a certain offer? Or did they come from search and, if so, what keyword did they use? This type of info can help a sales team understand not only the intent of the prospect but also where they are in the buying process. That makes life for the team a little easier.
So how do we connect a visitor’s source to a visitor?
Conceptual Overview
Google Analytics stores a visitor’s source data in a cookie. That cookie is named __utmz. The data can be extracted from the cookie and added to a lead generation form. When the visitor submits the form the source information is connected to the other information that the visitor entered into the form (usually her name and other contact information).
If the contact form is integrated with a CRM application, like SalesForce.com or NetSuite, it may be possible to store the marketing information with the individual’s contact information. Direct CRM integration depends on your CRM system. Some systems allow form fields to be pulled directly into the application. Check with your CRM provider for information about your specific system.
I would like to note that you don’t need to use some fancy CRM to take advantage of this technique. You can use the technique below even if your lead form generates an email or dumps the data into a database. The key is that we’re leveraging the data in the Google Analytics cookies and connecting it with information that the visitor sends you.
Detailed Instructions
Ok, so how do you actually do this? It takes some coding, either client-side code (JavaScript) or server-side code (PHP, ColdFusion, .NET, Java, etc.). My example uses JavaScript. Here’s a basic explanation of what the code needs to do:
1. Extract the visitor’s source data from the __utmz cookie
2. Manipulate data as needed
3. Place data in hidden form fields
When the visitor submits the form the source data in the hidden form fields will be sent back to the server where it can be used.
The sample code below implements these three steps.
<html>
<head>
<script src="http://www.google-analytics.com/urchin.js"></script>
<script>
_uacct = "UA-XXXXXX-X";
urchinTracker();
</script>
<script>
//
// Get the __utmz cookie value. This is the cookie that
// stores all campaign information.
//
var z = _uGC(document.cookie, '__utmz=', ';');
//
// The cookie has a number of name-value pairs.
// Each identifies an aspect of the campaign.
//
// utmcsr = campaign source
// utmcmd = campaign medium
// utmctr = campaign term (keyword)
// utmcct = campaign content (used for A/B testing)
// utmccn = campaign name
// utmgclid = unique identifier used when AdWords auto tagging is enabled
//
// This is very basic code. It separates the campaign-tracking cookie
// and populates a variable with each piece of campaign info.
//
var source = _uGC(z, 'utmcsr=', '|');
var medium = _uGC(z, 'utmcmd=', '|');
var term = _uGC(z, 'utmctr=', '|');
var content = _uGC(z, 'utmcct=', '|');
var campaign = _uGC(z, 'utmccn=', '|');
var gclid = _uGC(z, 'utmgclid=', '|');
//
// The gclid is ONLY present when AdWords auto-tagging is enabled.
// When it is present, every other variable except term
// will be '(not set)'. Because the gclid only appears for
// Google AdWords traffic, we can fill in the source and medium
// that would otherwise be left blank.
//
if (gclid !="-") {
source = 'google';
medium = 'cpc';
}
// Data from the custom segmentation cookie can also be passed
// back to your server via a hidden form field
var csegment = _uGC(document.cookie, '__utmv=', ';');
// The cookie value is the domain hash, a dot, then the segment value.
var csegmentmatch = csegment.match(/^\d+\.(.*)$/);
// Fall back to an empty string when the cookie is missing or malformed.
csegment = csegmentmatch ? csegmentmatch[1] : '';
function populateHiddenFields(f) {
f.source.value = source;
f.medium.value = medium;
f.term.value = term;
f.content.value = content;
f.campaign.value = campaign;
f.segment.value = csegment;
return true;
}
</script>
</head>
<body>
<form name='contactform'
onsubmit="return populateHiddenFields(this);">
<input type='hidden' name='source' />
<input type='hidden' name='medium' />
<input type='hidden' name='term' />
<input type='hidden' name='content' />
<input type='hidden' name='campaign' />
<input type='hidden' name='segment' />
</form>
</body>
</html>
Now, if this form is directly connected to a CRM then the data in the hidden form fields will go directly into the CRM along with other form info. That’s the magic. Again, you can use this method to collect source information even if the form is not connected to a CRM. Any type of lead generation form will be more valuable if you use this technique.
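If your form posts to your own server rather than straight into a CRM, the receiving code only has to read the six hidden fields. Here is a rough sketch of that step, assuming the form arrives as a standard URL-encoded body. The function name leadSourceFromBody and the sample body are my own inventions for illustration, not part of Google Analytics or any CRM. Because _uGC() returns a dash when a value is missing, the sketch treats a dash as empty.

```javascript
// Names of the hidden fields used in the sample form above.
var FIELD_NAMES = ['source', 'medium', 'term', 'content', 'campaign', 'segment'];

// Hypothetical helper: pull the source fields out of a URL-encoded form body.
function leadSourceFromBody(body) {
  var params = new URLSearchParams(body);
  var lead = {};
  FIELD_NAMES.forEach(function (name) {
    var value = params.get(name);
    // A missing field, or the dash that _uGC() returns, becomes an empty string.
    lead[name] = (value === null || value === '-') ? '' : value;
  });
  return lead;
}

// Illustrative body only; real submissions also carry the visitor's contact fields.
var lead = leadSourceFromBody(
  'source=newsletter&medium=email&term=-&content=-&campaign=fall_sale&segment=');
```

From here the lead object can be written to a database row or appended to the notification email alongside the visitor's contact details.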
How The Code Works
When the above page loads in the browser the JavaScript starts to execute and extracts data from the cookies. First, it extracts the value of the __utmz cookie and stores it in a variable named z. Then it parses the z variable and looks for information about the visitor’s source. The __utmz cookie has a number of name-value pairs separated by a pipe ('|') character. Each name-value pair holds a different attribute of the visitor’s source. Here’s an example of the __utmz cookie.
12454562.1193706926.14.5.utmcsr=google|utmccn=(organic)|
utmcmd=organic|utmctr=google%2Banalytics%2Bshortcut
You can see that the name-value pairs look very similar to the parameters we use for link tagging. Just by looking at the above cookie you can figure out that the visitor performed an organic search on Google for the term ‘google%2Banalytics%2Bshortcut’, which decodes to ‘google analytics shortcut’. That’s the type of information that we want to put in hidden form fields and send back to the server. [You can learn more about the __utmz cookie in the reference section below.]
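To make the cookie layout concrete, here is a small sketch that splits the example value into its name-value pairs. It is a different approach from the _uGC() calls in the sample page, and the function name parseUtmz is hypothetical. It assumes the first four dot-separated fields are always the hash, timestamp, and two counters, and that everything after them is pipe-separated campaign data.

```javascript
// Hypothetical helper (not part of urchin.js): turn a __utmz value
// into an object keyed by the utm* names.
function parseUtmz(cookieValue) {
  // Skip the four leading dot-separated fields. The rest is rejoined
  // with dots because a source such as "example.com" contains dots itself.
  var campaignPart = cookieValue.split('.').slice(4).join('.');
  var pairs = campaignPart.split('|');
  var fields = {};
  for (var i = 0; i < pairs.length; i++) {
    var eq = pairs[i].indexOf('=');
    if (eq === -1) continue; // skip empty or malformed segments
    fields[pairs[i].substring(0, eq)] = pairs[i].substring(eq + 1);
  }
  return fields;
}

var example = '12454562.1193706926.14.5.utmcsr=google|utmccn=(organic)|' +
  'utmcmd=organic|utmctr=google%2Banalytics%2Bshortcut';
var fields = parseUtmz(example);

// The keyword is URL-encoded; decode it, then turn plus signs into spaces.
var keyword = decodeURIComponent(fields.utmctr).replace(/\+/g, ' ');
```

For the example cookie, fields.utmcsr is 'google', fields.utmcmd is 'organic', and keyword is 'google analytics shortcut'.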
Getting back to the code, we were talking about how the code extracts the information from the z variable. It uses a function named _uGC(), which is found in the urchin.js JavaScript, to do all of the work. _uGC() extracts the value part of a name-value pair in the cookie. We call _uGC() once for each name-value pair that we want, and it parses the cookie and pulls out that value. [If you want to know more about _uGC() please see the reference section at the end of this post.]
Once the information is out of the z variable the populateHiddenFields() function puts the data in a series of hidden form fields. Then, when the form is submitted, the data is sent to the server.
You’ll notice a few things about the above code. I’ve added some logic to deal with AdWords auto-tagging. Auto-tagging populates part of the cookie with a value named gclid. When gclid is present, some of the info that we need, like source and medium, is missing from the cookie. The logic in the above code fills in that missing data. I’ve also added some code that extracts the custom segment value, which is stored in the __utmv cookie. I thought it would be useful to send this info back to the server as well.
Conclusion
Pulling visitor source data and connecting it with a visitor is very valuable. While your implementation will almost certainly be different, the concept illustrated above is the foundation for all implementations. Regardless of your implementation the business use for this data is fantastic.
Good luck!

Reference
About the _uGC() Function
_uGC() takes three arguments:
_uGC(string, start-string, end-string)
• A string to search (target string)
• A start string
• An end string
The function will return the string between start string and end string. If the start string is not found then the function will return a dash (-).
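If you want to experiment without loading urchin.js, a stand-in with the behavior described above might look like the sketch below. This is not the actual urchin.js source, and the name extractBetween is made up. One assumption goes beyond the description: when the end string never appears, the sketch reads to the end of the target, which is what the last pair in a __utmz cookie (with no trailing pipe) would need.

```javascript
// Hypothetical stand-in for _uGC(); not the real urchin.js implementation.
function extractBetween(target, startString, endString) {
  if (!target || !startString || !endString) return '-';
  var start = target.indexOf(startString);
  if (start === -1) return '-'; // start string not found: return a dash
  var valueStart = start + startString.length;
  var end = target.indexOf(endString, valueStart);
  // Assumption: with no end string after the value, read to the end.
  if (end === -1) end = target.length;
  return target.substring(valueStart, end);
}

var sample = 'utmcsr=google|utmccn=(organic)|utmcmd=organic';
var sampleSource = extractBetween(sample, 'utmcsr=', '|');   // 'google'
var sampleMedium = extractBetween(sample, 'utmcmd=', '|');   // 'organic'
var sampleGclid = extractBetween(sample, 'utmgclid=', '|');  // '-'
```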
About the __utmz Cookie
The __utmz cookie is the referral-tracking cookie. It tracks all referral information regardless of the referral medium or source. This means that all organic, CPC, campaign, or plain referral information is stored in the __utmz cookie. By default the cookie expires in six months, but that can be customized by changing the tracking code.
Cookie Format:
domain-hash.ctime.nsessions.nresponses.utmcsr=X(|utmccn=X|utmctr=X|utmcmd=X|utmci
Data about the referrer is stored in a number of name-value pairs, one for each attribute of the referral:
utmcsr
Identifies a search engine, newsletter name, or other source specified in the utm_source query parameter. See the “Marketing Campaign Tracking” section for more information about query parameters.
utmccn
Stores the campaign name or value in the utm_campaign query parameter.
utmctr
Identifies the keywords used in an organic search or the value in the utm_term query parameter.
utmcmd
A campaign medium or value of utm_medium query parameter.
utmcct
Campaign content, or the content of a particular ad (used for A/B testing). The value comes from the utm_content query parameter.
utmgclid
A unique identifier used when AdWords auto-tagging is enabled. This value is reconciled during data processing with information from AdWords.