Regular Expressions: A Powerful Tool for the Analyst’s Tool Kit

October 10, 2015

Regular expressions can seem pretty intimidating at first, especially if you have little to no software development experience, like me. Your first glance at a regular expression string might even make you want to curl up into the fetal position and cry for your mama. After facing your fears, you’ll soon realize that regular expressions aren’t as bad as they seem. In fact, regular expressions can be an extremely powerful addition to your analysis skill set.

What Are Regular Expressions?

Before I dive into all the fabulous benefits regular expressions have to offer, let’s all get a refresher on what regular expressions are. The following is a brief introduction for the vaguely familiar and for those who are just venturing into the world of regular expressions (don’t worry, you’re not alone).

A regular expression is a pattern, written in a compact special-purpose syntax, that provides a concise and flexible means for matching strings of text, such as particular characters, words, or patterns of characters. A popular use for regular expressions in our world of website optimization is creating filters for reports and goals in Google Analytics. Each special character you use in your regular expression is associated with a particular rule. You can combine these special characters and rules to request specific data sets in your GA reports. You can also use regular expressions as a match type in your goals and in your funnel step URLs to tell Google Analytics which pages you would like captured and reported within your funnels.

For example, imagine you have two confirmation pages for a purchase on your site, one for a logged-in member and the other for a guest. Regular expressions can help you create a goal funnel that captures the purchase confirmation for both your member and your guest customers.
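To make the confirmation page scenario concrete, here is a minimal sketch in Python. The two URL paths are hypothetical (your site’s paths will differ), and Google Analytics uses its own regular expression engine, so treat this as an illustration of the pattern rather than a GA configuration. The pipe character means “or,” and the caret and dollar sign pin the match to the start and end of the path.

```python
import re

# Hypothetical confirmation paths, assumed for illustration only.
# (member|guest) matches either word; ^ and $ anchor the whole path.
CONFIRMATION_PATTERN = re.compile(r"^/checkout/(member|guest)/confirmation$")


def is_confirmation_page(path: str) -> bool:
    """Return True when the path is one of the two confirmation pages."""
    return CONFIRMATION_PATTERN.search(path) is not None


pages = [
    "/checkout/member/confirmation",
    "/checkout/guest/confirmation",
    "/checkout/guest/shipping",
    "/",
]

# Keep only the pages a goal built on this pattern would count.
matched = [p for p in pages if is_confirmation_page(p)]
print(matched)
```

A single pattern covers both pages, so one goal can count both kinds of purchase.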

How To Create Regular Expressions

There are multiple characters that can be used to create a regular expression, each with its own meaning. Rather than get into the nitty-gritty of that here, I’ll reference the regular expression experts over at LunaMetrics.

There is an excellent series of blog posts written about regular expressions on the LunaMetrics blog. You can check out their 14-post series focused solely on regular expressions; the first article in the series outlines escaping with the backslash (“\”). The entire series covers various regular expression characters and how to use them when writing your own regular expressions. It even comes complete with a regular expression self-mastery quiz!
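As a quick taste of why escaping matters, here is a small Python sketch; the page paths are made up for illustration. In a regular expression an unescaped dot matches any single character, while a backslash followed by a dot matches only a literal dot.

```python
import re

# Made-up paths for illustration; only the first contains a real dot.
paths = ["/index.html", "/index-html", "/indexAhtml"]

# Unescaped: "." stands for any single character.
unescaped = re.compile(r"^/index.html$")
# Escaped: "\." stands for a literal dot only.
escaped = re.compile(r"^/index\.html$")

unescaped_hits = [p for p in paths if unescaped.search(p)]
escaped_hits = [p for p in paths if escaped.search(p)]

print(unescaped_hits)
print(escaped_hits)
```

The unescaped pattern lets all three paths through, while the escaped one keeps only the path you meant.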

Using Regular Expressions in Analysis

Now I’ll get into an example of a handy use of regular expressions for data analysis in Google Analytics. This explanation assumes that you have read the 14-post series and understand a little about how to create regular expressions of your own.

Let’s pretend you have a large number of pages on your site (as many eCommerce websites out there do). You want to shine a spotlight on just your product pages to understand how they are performing. If you pull up a top content report, you find that you are looking at a data set with too much information. Your product pages are lost among your homepage, various category pages, and your checkout process. What’s an analyst to do?

Regular Expressions to the rescue!

First, take a look at the URL structure of the site before even beginning to tackle writing a regular expression. View multiple product pages and note the similarities and differences in their URLs. Try to identify a unique characteristic of the product page URLs that sets them apart from every other page on the site. Once you have done this, you can create a regular expression that will only pull up the pages whose URLs contain that unique identifier, and voilà! You now have a regular expression that you can use to filter reports or to create a custom segment.
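Here is a minimal Python sketch of that idea. It assumes the unique characteristic of a product page is a “/products/” prefix in its path followed by a product name; both the prefix and the page view numbers are invented for illustration, and your own site’s identifier will likely look different.

```python
import re

# Assumed identifier: product pages live under /products/<name>,
# where <name> uses lowercase letters, digits, and hyphens.
# The optional trailing slash is allowed by "/?".
PRODUCT_PATTERN = re.compile(r"^/products/[a-z0-9-]+/?$")

# Invented (path, page views) rows standing in for a top content report.
report_rows = [
    ("/", 1200),
    ("/products/blue-widget", 340),
    ("/category/widgets", 510),
    ("/products/red-widget/", 95),
    ("/checkout/shipping", 60),
    ("/products", 20),
]

# Keep only the rows whose path looks like a product page.
product_rows = [
    (path, views) for path, views in report_rows
    if PRODUCT_PATTERN.search(path)
]
print(product_rows)
```

The homepage, category page, checkout step, and the bare “/products” listing all drop out, leaving only the two product pages.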

The Takeaway

How can this be useful to your analysis? In this particular scenario, you now have the ability to isolate only the pages that are of interest to you. It will now be much easier to shed light on this data set by sorting your reports to uncover trends and potential challenges. For example, you can see how well highly trafficked product pages are converting relative to other product pages on the site.

This is just the beginning of your analysis journey with regular expressions in your toolkit. Use your imagination and your knowledge of regular expressions to get a deeper understanding of your site’s performance.
