I was looking at the traffic for some of my sites the other day and spotted a relatively large amount of bandwidth being used by one particular site. That site only really serves up a script to convert RSS feeds to JavaScript files so that I can then embed news items from selected sources into some client sites.
So I looked at the web stats for that site and discovered that nearly all the traffic was coming from a site in China. I followed a couple of the referring links and found that the pages were basically just generating page after page of potential search terms with embedded news feeds, presumably to serve ads on those pages.
That does raise the question: if they are intelligent enough to code those pages or that system, why aren’t they intelligent enough to simply add the scripts to their own site and serve them from there?
Now my site is hosted on a regular Linux box running Apache Web Server, so it was a fairly straightforward task to simply block all traffic from that domain name using an .htaccess file with this code:
RewriteEngine on
RewriteCond %{HTTP_REFERER} ^http://(www\.)?baddomain\.com [NC]
RewriteRule .* - [F]
So the next day when I checked the stats, there were many thousands of Failed Referrer entries, showing that they were no longer able to leech the code. Job done!
But it did then appear that my site had some particular attraction to them because they then started running the scripts on a different domain! Now, my first thought was to simply amend the .htaccess file to read as follows:
RewriteEngine on
RewriteCond %{HTTP_REFERER} ^http://(www\.)?baddomain\.com [NC,OR]
RewriteCond %{HTTP_REFERER} ^http://(www\.)?anotherbadone\.com [NC]
RewriteRule .* - [F]
But I realised I could end up playing cat and mouse with them for life, so instead I have now set the .htaccess file to only allow specific referring domains access to the scripts by using this code:
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^http://(www\.)?gooddomain\.co\.uk [NC]
RewriteCond %{HTTP_REFERER} !^http://(www\.)?anothergoodone\.com [NC]
RewriteRule .* - [F]
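By adding the “!”, each expression now says “if the referrer is not gooddomain, then…”. Because consecutive RewriteCond lines are ANDed by default, the [OR] flag is gone and the request is refused only when the referrer matches none of the listed domains. The only difficulty for me then is making sure there are matching entries for all the legitimate referrers (trickier as one of the sites has multiple domain names).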
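For a site with several domain names, one possible way to keep the list manageable is to fold the allowed domains into a single pattern. The following is only a sketch under a few assumptions that are not part of the setup described above: that legitimate pages may be served over https as well as http, that requests arriving with no Referer header at all (some browsers and proxies strip it) should be let through, and that the domain names are the same placeholders used earlier.

```apache
RewriteEngine on
# Assumption: allow requests that carry no Referer header at all
RewriteCond %{HTTP_REFERER} !^$
# Refuse any referrer that is not one of the allowed domains (http or https).
# The trailing (/|$) stops look-alikes such as gooddomain.co.uk.example.com
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?(gooddomain\.co\.uk|anothergoodone\.com)(/|$) [NC]
RewriteRule .* - [F]
```

Adding another legitimate domain then means adding one more alternative inside the brackets rather than a whole new RewriteCond line. Dropping the first condition would make the rule stricter, at the cost of also refusing visitors whose browsers send no referrer.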
We’ll see how we get on with this.
[edited to add]
And lo and behold! The blocking is working well, especially as the leecher in question, hosted by NetEase.com, Inc., has now started doing it with a third domain name.