Filtering out Google Analytics Referral Spam
Posted on May 11, 2015
Knowing your website stats is vital if you want to properly monitor the progress of your marketing efforts, and make informed data-influenced decisions. At Optimising, we are looking at analytics all the time for our clients, so we need to ensure we are getting the most accurate information possible.
You may have noticed a lot of 'referral spam' popping up in your Google Analytics data over the past few months. These are some of the worst offenders:
This has become a growing problem, and Google has yet to issue a fix themselves, despite having been aware of it since at least 2013.
Here's an example from one of our clients using a brand new domain! The referral spam is making up over 70% of their traffic!
These spammers get your attention by posting their URLs in your Analytics data. You then visit their website to see what the hell it is!
What not to do
A few of these spam services offer an 'opt out' link on their websites. I can confirm that this does not work! It's probably best not to even bring your website to their attention by submitting your site in this fashion.
So now what?
Your best method is to simply filter these out of your Google Analytics. I know some people would prefer to fix the problem at the source (myself included) and not even allow them to post to your Analytics at all, but you have no other option here.
Here is our simple method, which works for the majority of the spam, or at least enough of it not to skew your real data too much.
Log in to your Google Analytics.
Set the Date Range to the past 12 months (or so) to get a good amount of sample data.
Go to Acquisition > All Traffic > Referrals
Add a 'Secondary Dimension' for 'Hostname'
Sort the table by 'Hostname'.
In my example the only real one was:
But read through your list carefully so as not to miss CDN hostnames or similar.
It is pretty safe to add these ones whilst you're at it:
Go to Admin > View > Create new view (always do this so you don't ruin your master data source).
Call it 'Excluding Spam' or similar.
Set timezone accordingly.
Go to Admin > Filters > New Filter (it should be applied to the 'Excluding Spam' View you just created).
Call it 'Exclude Spam' and set the type to 'Include' on the 'Hostname' field.
Using your real hostnames from step #6, format them like so (with pipes between the hostnames, and the full stops escaped with backslashes):
Click Verify this Filter and you should see the spam traffic that is going to be excluded.
If there are any wrong ones in there, go back to step #5.
Now, when you view your reports, use your newly created View.
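If you have more than a couple of hostnames, the pattern described in the steps above can be generated rather than typed by hand. The following is a minimal Python sketch, not part of Google Analytics itself: the function name `build_hostname_pattern` and the example hostnames are made up for illustration, and it assumes the filter pattern is treated as a partial-match regular expression.

```python
import re


def build_hostname_pattern(hostnames):
    """Escape the dots in each hostname and join the results with pipes."""
    cleaned = [host.strip().lower() for host in hostnames if host.strip()]
    escaped = [host.replace(".", r"\.") for host in cleaned]
    return "|".join(escaped)


# Hypothetical hostnames, used only for illustration.
valid_hostnames = ["www.example.com", "example.com", "cdn.example.net"]

pattern = build_hostname_pattern(valid_hostnames)
print(pattern)  # www\.example\.com|example\.com|cdn\.example\.net

# Quick local check of which hostnames the pattern would keep.
for hostname in ["www.example.com", "cdn.example.net", "(not set)"]:
    kept = re.search(pattern, hostname) is not None
    print(hostname, "kept" if kept else "filtered out")
```

Paste the printed pattern into the filter, then still use Verify this Filter to confirm the result against your own data.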
Unfortunately, this method will only work for data captured going forwards, and your historic data is still going to be skewed. So the earlier you implement this, the better!
The only flaw with this method is that if the (smarter) spammers use one of your actual hostnames, they will not be excluded. In that case, you can set up a second filter to specifically exclude these referrals. So far, though, we have found this method to work ~90% of the time.
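To see how the two filters would combine, here is a rough Python sketch of the logic only. It does not call any Google Analytics API. The patterns, the domain names, and the function name `keep_hit` are all hypothetical, and it assumes the second filter matches on the referral source, a field this post does not name.

```python
import re

# Hypothetical patterns for illustration only; substitute your own values.
HOSTNAME_INCLUDE = r"www\.example\.com|example\.com"
REFERRAL_EXCLUDE = r"spam-referrer\.example|another-spammer\.example"


def keep_hit(hostname, referral_source):
    """Return True if a hit would survive both filters."""
    # First filter: include only hits recorded against a real hostname.
    if not re.search(HOSTNAME_INCLUDE, hostname):
        return False
    # Second filter: drop known spam referrers that used a valid hostname.
    if re.search(REFERRAL_EXCLUDE, referral_source):
        return False
    return True


print(keep_hit("www.example.com", "google.com"))             # True
print(keep_hit("(not set)", "spam-referrer.example"))        # False
print(keep_hit("www.example.com", "spam-referrer.example"))  # False
```

The third case is the one the hostname filter misses on its own: the hostname is valid, so only the second filter catches it.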
If you are serious about your website data, and you should be, start with the above so that you can start to trust your own numbers!