Closing the Loop on Data Validation in Google Analytics
Using Realtime Reports and Segmentation Techniques to Verify Google Analytics Tags and Reports
by Nikolay Gradinarov
Manual data validation in Google Analytics often starts with checking the dataLayer variables and then inspecting the Google Analytics network requests.
My favorite tools for this task (apart from the built-in browser Dev Tools) are Philip Lawrence's Omnibug and David Vallejo's GTM/GA Debug.
In addition to inspecting browser states and requests, it often makes sense to confirm that data dispatched by the browser is also being properly consumed by the target Google Analytics reporting property and view. Some examples where this is a good idea include:
- Changing view filter settings
- Confirming marketing channel configurations
- Validating activity from native apps (where you may not be able to inspect the details of the network requests)
- Testing cross-domain tracking and eliminating issues with over-counting of visits/visitors
- Ruling out unexpected issues with data collection
There are two Google Analytics features that I use for such cases—Realtime reports and Segmentation.
The idea behind both of these techniques is that we are able to key off of a unique visitor attribute that will uniquely identify our testing session. Armed with this attribute, we are in turn able to validate that the activity we register in the browser during the testing session matches the metrics and the different dimension values reported by Google Analytics. This blog post will review both of these methods and a few example scenarios.
The biggest advantage of the Realtime method, as the name suggests, is that you can validate things in more or less real time. The simplicity and immediacy of this approach also make it easy to explain to team members who may otherwise be intimidated by the prospect of checking tags. With this method your colleagues will be able to validate various Google Analytics data on their own and improve their understanding of, and confidence in, the data. (This technique is especially useful for marketing campaign validation, where a marketer may want to make sure that there are no omissions in the tracking (UTM) codes or other tracking inconsistencies.)
To use this approach:
1. Add a query parameter to the end of your URL. I usually add the "test=" parameter name and assign it a unique value that is specific to the use case I am trying to test.
For example, if I wanted to check that the "Learn More" CTA on QA2L's home page correctly records a click, I might use the parameter name/value pair "test=learnmore" and append it to the end of the URL "www.qa2l.com/", so it becomes "www.qa2l.com/?test=learnmore". The parameter name and value can be customized as you prefer; the only requirement is that they are unique enough to identify your testing activity.
2. Paste the URL in the address bar of your browser. Click on the "Learn More" CTA to generate the click event.
3. In GA's Realtime feature, I can navigate to the "Content" tab and find the unique value "/?test=learnmore".
4. Clicking on the value would allow me to select the HTTP requests associated with this session.
5. Navigating to the "Events" tab would reveal all events fired during the session, including the event tracking for the "Learn More" CTA that I clicked on and originally wanted to validate.
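The URL-building part of the steps above can be sketched in a few lines of Python. This is only an illustration: the helper names add_test_param and realtime_page_value are hypothetical, the "https://" scheme is an assumption (the example URL above is written without one, and urlsplit needs a scheme to find the host), and the second helper simply mirrors the "/?test=learnmore" form shown in the Realtime example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_test_param(url: str, value: str, name: str = "test") -> str:
    """Append name=value to the URL's query string, keeping existing parameters."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((name, value))
    path = parts.path or "/"
    return urlunsplit(
        (parts.scheme, parts.netloc, path, urlencode(query), parts.fragment)
    )


def realtime_page_value(url: str) -> str:
    """Path plus query string, the value to look for in the Realtime "Content" tab."""
    parts = urlsplit(url)
    path = parts.path or "/"
    return f"{path}?{parts.query}" if parts.query else path


test_url = add_test_param("https://www.qa2l.com/", "learnmore")
print(test_url)                       # https://www.qa2l.com/?test=learnmore
print(realtime_page_value(test_url))  # /?test=learnmore
```

Keeping existing query parameters intact matters when the page under test already relies on them; the test parameter is simply added at the end.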
This same approach can be adapted for use with UTM codes: you might build a URL such as "www.qa2l.com/?utm_source=learnmore&utm_medium=test&utm_campaign=test". When the landing page loads, you would check the "Traffic Sources" tab and look for the unique Source / Medium values you have specified.
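A small sketch of the UTM variant, again in Python: one helper builds the tagged URL and a second lists any of the three UTM parameters that are missing or empty, which is the kind of omission a marketer would want to catch before launch. Both function names are hypothetical and the "https://" scheme is an assumption.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

REQUIRED_UTM = ("utm_source", "utm_medium", "utm_campaign")


def build_utm_url(base_url: str, source: str, medium: str, campaign: str) -> str:
    """Append the three UTM parameters to base_url."""
    params = {"utm_source": source, "utm_medium": medium, "utm_campaign": campaign}
    separator = "&" if "?" in base_url else "?"
    return f"{base_url}{separator}{urlencode(params)}"


def missing_utm_params(url: str) -> list:
    """Return the required UTM parameters that are absent or blank in url."""
    # parse_qs drops blank values by default, so "utm_medium=" counts as missing.
    present = parse_qs(urlsplit(url).query)
    return [name for name in REQUIRED_UTM if name not in present]


url = build_utm_url("https://www.qa2l.com/", "learnmore", "test", "test")
print(url)
# https://www.qa2l.com/?utm_source=learnmore&utm_medium=test&utm_campaign=test
print(missing_utm_params(url))  # []
print(missing_utm_params("https://www.qa2l.com/?utm_source=learnmore"))
# ['utm_medium', 'utm_campaign']
```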
This type of data validation can be performed by anyone with access to the digital property and Google Analytics—and the real-time component makes it quick and easy.
This same approach would also allow you to verify various GA Goals that may have been set up in the target view.
There are, however, quite a few elements you won't be able to validate in this way—custom metrics, custom dimensions, e-commerce related events/parameters, attribution-related counts, etc. For such advanced use cases, you would want to use segmentation.
The segmentation method starts out the same way:
1. You specify a unique URL by passing a unique query parameter name/value pair. We'll use the same testing URL we provided earlier to illustrate this example: www.qa2l.com/?test=learnmore.
2. When using Segments, you would need to wait for the data to become fully processed. The data is fully processed by your Google Analytics property when you see the unique query parameter name/value pair in the "All Pages" report under Behavior > Site Content.
The easiest thing to do is to search directly for your unique value in this report. If the value shows up, you are cleared to move on to the next step. (Data usually becomes fully processed within a few minutes of registering the activity, but depending on the volume of hits processed by your Google Analytics property / view, it may take longer to see the unique values in the "All Pages" report.)
3. Once you have confirmed you can see the value, build a User-scoped segment. Once enabled, the User segment restricts your reports to hits from the visitor who generated the unique parameter value.
4. With this segment enabled, you can examine any Google Analytics built-in or custom metric or dimension. You can then make sure that the list of actions you have executed as part of your test activity is correctly represented in your various Google Analytics reports.
This method is great for verifying all sorts of advanced reporting capabilities / configuration (flows, marketing channel/campaign data, visitor counts) and for gaining a deeper understanding of how various Google Analytics reports and settings work. You will notice that since we are using a User-scoped segment, the data will show activity for past, current, and future sessions associated with the tracking cookie that Google Analytics uses to identify the testing session. Depending on your needs, you may need to change the scope of the segment to Session or clear cookies periodically.
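For readers who pull data programmatically rather than through the segment builder, the same User-scoped or Session-scoped condition can be written as a dynamic segment string. The sketch below assumes the legacy (Universal Analytics) Core Reporting API segment syntax, in which "=@" means "contains" and commas, semicolons, and backslashes in values are backslash-escaped. The function name segment_expression is hypothetical, and matching on ga:pagePath is an assumption based on the test URL used in this post.

```python
def segment_expression(
    unique_value: str, scope: str = "users", dimension: str = "ga:pagePath"
) -> str:
    """Build a dynamic segment matching hits whose dimension contains unique_value.

    scope is "users" (all sessions tied to the same cookie) or "sessions"
    (only the sessions in which the value was recorded).
    """
    if scope not in ("users", "sessions"):
        raise ValueError("scope must be 'users' or 'sessions'")
    # Escape the characters that carry meaning in the expression syntax.
    escaped = (
        unique_value.replace("\\", "\\\\").replace(",", "\\,").replace(";", "\\;")
    )
    return f"{scope}::condition::{dimension}=@{escaped}"


print(segment_expression("test=learnmore"))
# users::condition::ga:pagePath=@test=learnmore
print(segment_expression("test=learnmore", scope="sessions"))
# sessions::condition::ga:pagePath=@test=learnmore
```

Switching the scope argument is the programmatic counterpart of changing the segment from User to Session when activity from earlier or later sessions gets in the way.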
A Note on Native App Validation
A version of the segmentation method can also be used effectively to validate data coming from native apps. If your app has a freeform field that is tracked by a Google Analytics dimension (for example, any "search"-like functionality or a user login field), you can build a segment off of a unique value typed into that field in order to uniquely identify a visitor/session.
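One way to come up with a value that is unlikely to collide with real user input is to generate it. The snippet below is a minimal sketch; the helper name make_test_token and the "qa-" prefix are hypothetical conventions, not something Google Analytics requires.

```python
import uuid


def make_test_token(label: str) -> str:
    """Return a short, hard-to-collide value to type into the tracked field."""
    # Eight hex characters from a random UUID keep the token easy to type
    # while making accidental matches with real searches or logins unlikely.
    return f"qa-{label}-{uuid.uuid4().hex[:8]}"


token = make_test_token("search")
print(token)  # a value of the form qa-search-xxxxxxxx, different on each run
```

Type the printed value into the app's search or login field, then build the segment on the dimension that captures that field.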
QA2L is a data governance platform specializing in the automated validation of tracking tags/pixels. We focus on making it easy to automate even the most complicated user journeys / flows and to QA all your KPIs in a robust set of tests that is a breeze to maintain.