Google Analytics is a great source of data for businesses and agencies alike, offering a wealth of knowledge that can be used to help improve the performance of a site.
Unfortunately, there are those who abuse elements of Google Analytics to send fake, useless traffic to your site. This is known as referral spam, and it can be a real challenge for those looking to get the most out of Analytics.
What is it?
A referral occurs when Analytics tracks a visit to your site that originates from somewhere other than the search engine results, most commonly from a user clicking a link from one site to another. This is then shown as a ‘referral’ from the site where they clicked the link.
So where does referral spam come in? Referral spam creates fake visits to your site from other sites that want to promote themselves by exploiting weaknesses in the way that Google Analytics works. They leave their website’s address in your list of referrers, hoping you will get curious and search for it, which will either lead back to the spammer’s site to get them visits, or redirect users to an affiliate site which is paying the spammer for traffic.
One commonly used method is to send fake data to Google Analytics using your site’s tracking ID, creating ‘ghost’ sessions that never actually happened. These sessions have a bounce rate of 100% and view only one page, which can distort site metrics such as average session duration and bounce rate.
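To see how much a batch of ghost sessions can skew a metric, here is a rough sketch of the arithmetic in Python. The function name and the figures are made up for illustration; they are not taken from any real account.

```python
def blended_bounce_rate(real_sessions, real_bounce_rate, ghost_sessions):
    """Bounce rate reported once ghost sessions are mixed in with real ones."""
    total_sessions = real_sessions + ghost_sessions
    # Every ghost session counts as a bounce, so it adds one bounce each.
    bounces = real_sessions * real_bounce_rate + ghost_sessions
    return bounces / total_sessions


# Hypothetical figures: 1,000 real sessions at a 40% bounce rate,
# plus 200 ghost sessions.
reported = blended_bounce_rate(1000, 0.40, 200)
print(f"{reported:.0%}")  # prints 50%
```

In this made-up case, a site whose real visitors bounce 40% of the time would be reported as bouncing 50% of the time.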
Another commonly used method is for referral spam to be delivered by crawlers which ignore robots.txt and hunt for information that the spammers can use. This type of referral spam can be harder to filter out as the crawlers actually visit the site and crawl through it.
What does it look like?
In the picture above, you can see some of the referral spam that can affect a website. This report can be found in Acquisition > All Traffic > Channels > Referrals. As you can see, the bounce rate for each of these referral spam sources is 100% and they have spent a total of 00:00 on the site, suggesting they never visited at all.
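If you export the referrals report, the same two signals (a 100% bounce rate and no time on site) can be checked in a few lines of Python. This is only a sketch: the field names and the sample rows are hypothetical and will not match a real export without adjustment.

```python
def looks_like_ghost_referral(row):
    """Return True when a row shows a 100% bounce rate and zero time on site."""
    return row["bounce_rate"] >= 1.0 and row["avg_session_seconds"] == 0


# Hypothetical rows standing in for an exported referrals report.
rows = [
    {"source": "spam-example.invalid", "bounce_rate": 1.0, "avg_session_seconds": 0},
    {"source": "partner-example.com", "bounce_rate": 0.45, "avg_session_seconds": 95},
]

suspects = [row["source"] for row in rows if looks_like_ghost_referral(row)]
print(suspects)  # prints ['spam-example.invalid']
```

Treat the result as a list of candidates to review rather than a verdict, since a genuine referrer with very little traffic could show the same numbers.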
How can you deal with this?
There are several methods that you can use to reduce referral spam, but before you engage in any of them, it’s important that you set up your analytics account so that you can test any changes before they go live. You can use different ‘Views’ to do this. At Klood, we recommend using an unfiltered view (with the raw data), a test view (for testing new filters), and a filtered view (with the spam filtered out).
To do this, head to the Admin section of your Google Analytics account and click on the name of the view. Then, click in the highlighted area below.
You can then set up an unfiltered, test and filtered view, ready for use.
Implementing a valid hostname filter
The first method is to implement a valid hostname filter. Real sessions on your site will have both a source and your hostname (the server that the landing page is pointing to). Ghost referrals do not visit your pages, and will use an invalid hostname such as ‘(not set)’, meaning that they can be filtered out using a simple regular expression.
Firstly though, you need to identify your valid hostnames. To do this, go to Behavior > All Pages, and then filter by Hostname.
The highlighted sites above are legitimate hostnames for our site, which we can tell from the fact that they have a number of sessions and a bounce rate below 100%.
A regular expression that matches only these hostnames will filter out the bulk of referral spam visits by requiring every visit to have a valid hostname. You can then create the filter using the process below.
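Before saving a hostname expression to your test view, it can help to try it against a handful of hostnames. The Python sketch below does that with a made-up pattern for a site served from example.com and shop.example.com. These hostnames are placeholders, not the ones used at Klood, and Python’s regular expression engine may not behave identically to the one in Analytics.

```python
import re

# Hypothetical valid hostnames; swap in the ones from your own Hostname report.
VALID_HOSTNAME_PATTERN = r"^(www\.)?example\.com$|^shop\.example\.com$"


def is_valid_hostname(hostname):
    """Return True when the hostname matches one of the valid hostnames."""
    return re.search(VALID_HOSTNAME_PATTERN, hostname) is not None


for hostname in ["www.example.com", "shop.example.com", "(not set)", "example.com.spam.invalid"]:
    print(hostname, is_valid_hostname(hostname))
```

The anchors (`^` and `$`) are a deliberate choice here: without them, a spam hostname that merely contains your domain name would still pass.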
Implement spam crawler filters
The next stage is to implement filters that will remove spam crawlers from your data. Ben Travis has written an excellent guide to applying this type of filter, which includes some regularly updated expressions.
Below you can find a visual guide to applying this type of filter, if you need something to refer back to.
These filters will need to be kept up-to-date, as new referral spam sources are always appearing, and you will need to add these to the filters to ensure they are kept out of your data.
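If you keep your own list of spam sources, a small script can turn it into filter expressions each time the list changes. The Python sketch below is one way to do that. The domain names are placeholders, and the 255-character limit per filter pattern is an assumption you should check against your own account.

```python
import re

# Assumed maximum length of a single filter pattern; verify before relying on it.
MAX_PATTERN_LENGTH = 255


def build_exclusion_patterns(domains, max_length=MAX_PATTERN_LENGTH):
    """Join domains with '|', starting a new pattern when one would grow too long."""
    patterns, current = [], ""
    for domain in sorted(set(domains)):
        escaped = domain.replace(".", r"\.")  # match a literal dot, not any character
        candidate = f"{current}|{escaped}" if current else escaped
        if current and len(candidate) > max_length:
            patterns.append(current)
            current = escaped
        else:
            current = candidate
    if current:
        patterns.append(current)
    return patterns


# Hypothetical spam sources; replace with the ones you see in your reports.
spam_domains = ["spamone.invalid", "spamtwo.invalid"]
patterns = build_exclusion_patterns(spam_domains)
print(patterns)
```

Each pattern returned would go into its own exclude filter on the campaign source, so a long list simply becomes several filters.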
Remove all bots & spiders
Google Analytics also has an inbuilt option to filter out hits from known bots and spiders. Simply head to ‘Admin’, then to each view you want filtered. Select ‘View Settings’, and tick the box ‘Exclude all hits from known bots and spiders’.
Implementing these filters will make your data more accurate, enabling you to make better-informed decisions about your site.
While referral spam is always evolving, and new methods of getting around Google’s spam detection are always being created, these filters can provide a strong first line of defence against the bulk of referral spammers.