While much of the discussion around non-human or spam data in Google Analytics focuses on (malicious) bots and referral spam, there is another increasingly prevalent form of fictitious Web analytics data: hits generated through direct HTTP requests.
In Google Analytics terminology, the mechanism for recording data through HTTP requests is referred to as the Measurement Protocol. The Measurement Protocol is certainly not a bad thing; in fact, it's the most "universal" aspect of Google Universal Analytics, since it allows us to send hits to the Google Analytics servers from any programmable, networked environment in the form of name/value pairs, where each name corresponds to a Google Analytics dimension. (Classic use case: a kiosk in a shopping mall.)
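To make the name/value-pair format concrete, here is a minimal sketch of how a pageview hit could be assembled for the Universal Analytics version of the Measurement Protocol. The function name `build_hit`, the placeholder property ID `UA-XXXXX-Y`, the client ID, and the hostname `example.com` are assumptions for illustration; the code only builds the payload and does not send anything.

```python
from urllib.parse import urlencode

# Measurement Protocol (Universal Analytics, v1) collection endpoint.
ENDPOINT = "https://www.google-analytics.com/collect"


def build_hit(tracking_id, client_id, page, hostname=None):
    """Build the URL-encoded payload for a pageview hit (nothing is sent)."""
    params = {
        "v": "1",            # protocol version
        "tid": tracking_id,  # property ID (placeholder value in this sketch)
        "cid": client_id,    # anonymous client ID
        "t": "pageview",     # hit type
        "dp": page,          # document path
    }
    if hostname is not None:
        params["dh"] = hostname  # document hostname
    return urlencode(params)


# Hypothetical values, for illustration only.
payload = build_hit("UA-XXXXX-Y", "555", "/kiosk/home", hostname="example.com")
payload_without_host = build_hit("UA-XXXXX-Y", "555", "/kiosk/home")
```

Sending the hit would amount to an HTTP POST of the payload to `ENDPOINT`. Note that a payload built without `hostname` simply has no `dh` pair, so the request itself carries no hostname value.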
Interestingly, hits generated from analytics.js in the browser and from the mobile SDKs are ultimately also sent through the Measurement Protocol, so in some ways it is the backbone of Google Analytics data collection. As with most other technology, the Measurement Protocol is bad only when used for nefarious purposes.
One of the dimensions that is likely to be unpopulated in a Measurement Protocol spam hit is hostname. To block hits where the hostname is (not set), whether they come from bad Measurement Protocol requests or from other types of GA spam, from entering your Google Analytics view, you can use the following set of filters.
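The effect of filtering on hostname can be sketched outside Google Analytics as follows. This is only an illustration of the logic, not the recipe's filter configuration: the pattern for `example.com` and the function name `keep_hit` are assumptions, and you would substitute the hostnames your site actually serves.

```python
import re

# Hypothetical include pattern: replace with your own valid hostnames.
VALID_HOSTNAME = re.compile(r"^(www\.)?example\.com$")


def keep_hit(hostname):
    """Return True if the hit's hostname is populated and matches the pattern."""
    # A missing hostname is reported as "(not set)" in Google Analytics.
    if not hostname or hostname == "(not set)":
        return False
    return VALID_HOSTNAME.search(hostname) is not None


# Hypothetical sample of hostnames seen on incoming hits.
hostnames = ["www.example.com", "(not set)", None, "spam.example.net"]
kept = [h for h in hostnames if keep_hit(h)]
```

An include-style rule like this one drops hits with an unpopulated hostname as well as hits that report a hostname you do not own.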
Analysis Benefits:
- Reduced spam data in your Google Analytics reporting
Analytics Team Lead - Implementation
E-Nor (Corporate Account)