For our retention/attrition analysis, it would be meaningful to restrict counting visitors – whether retained or lost – with respect to acquired visitors over a certain period of time.
As the diagram shows, we will focus on visitors who were acquired in a certain month, e.g. January. Only this segment of visitors -acquired in January- will be investigated if they were retained or lost during the following 2 months, i.e. February and March. That is, green is marking the time window of acquisition and yellow is marking the time window for counting retention or attrition.
Consequently, this leaves some visitors out of our data model. For example:
- Visitors who were acquired before January won’t be included.
- Visitors who were acquired in January and came for the second time after March will be considered lost.
- Visitors who were acquired in January and came for the second time in Feb/Mar but never came back later will be considered retained.
This should be totally acceptable, since we are only trending and not measuring actual long-term retention and attrition rates. Also, that’s why defining the length of both time windows – acquisition and retention – can vary greatly from one site to another and can change the results significantly. You can decide the acquisition/retention window that make the most sense for your business and adjust the queries accordingly.
That is, remember that you’re not bound by these numbers. As you run the queries on your own data, you can certainly decrease or increase to align with your business model and typical/desired user behavior.
Let’s now begin building our queries.
Step 1: Extracting Relevant Fields To An Intermediate Table
The following query will generate an intermediate table, extracting only a few relevant fields and simplifying following queries. Specifically, the query will extract the visitor id, visit id, visit number, and visit time in different formats. The calcMonth is the number of months starting with January 2015, the start of our report.
Before you run this query, replace XXX_ViewID_XXX with the view id of your Google Analytics view that is exported to BigQuery.
Listing 1: Intermediate table to extract visitor id, visit id, visit number, and visit time and calculates calcMonth, all to be used in queries below.