Thursday, June 20, 2013

How to Handle Spammers or Spambots (on Contact Forms)

The Form

If you have a contact form on your website for the purpose of allowing users to request additional information or forward contact information, then you've probably come across spambots.

Here is the simple form we use on our website to notify us by email of simple requests.


Contact Us

Your Name: *
Phone:
Email: *
Purpose:
Comments:



The Spammer

In addition to legitimate requests, we've also been receiving form submissions whose comments are nothing but spam.

Actual received form data: 
Name: Arianna 
Email: pitfighter@hotmail.com 
Phone: 41684792535 
Purpose: Proposal Request 
Comment: Could I have , please? <a href=" http://www.digitrak.com ">accutane prescription requirements</a>  Hepatic Issues, Atrial Fibrillation

I'm not sure who makes money from sending this type of spam.  I'm even less sure who would be foolish enough to click on the link, except out of curiosity or by accident.

Exposing the Spammer

It would be interesting to know where this spam is coming from.  Luckily, finding out who is sending it is easy; being sure is not (because the IP address may be spoofed or may belong to a proxy).  I chose a nice free geolocation plugin for PHP, which can be found at http://www.geoplugin.com

The sender's IP address can be captured with a snippet from Stack Overflow:

function getClientIP() {
    // HTTP_CLIENT_IP and HTTP_X_FORWARDED_FOR are headers supplied by the client
    // (or a proxy) and can be forged; only REMOTE_ADDR comes from the connection itself.
    foreach (array('HTTP_CLIENT_IP', 'HTTP_X_FORWARDED_FOR', 'REMOTE_ADDR') as $key) {
        if (!empty($_SERVER[$key])) {
            // X-Forwarded-For may hold a comma-separated list; use the first entry
            $parts = explode(',', $_SERVER[$key]);
            $ip = trim($parts[0]);
            if (filter_var($ip, FILTER_VALIDATE_IP)) {
                return $ip;
            }
        }
    }
    return "localhost";
}


If you're willing to send me spam, I am perfectly willing to share your IP address: the spammer in this case sent from 188.143.232.31.  The PHP function above returns the IP address, which you can then pass to the geoplugin PHP API.

/** Returns an array of geolocation data for $ip, or an empty array on failure.
**/
function getGeolocation($ip) {
    $geo_str = "http://www.geoplugin.net/php.gp?ip=" . urlencode($ip);
    $response = @file_get_contents($geo_str);
    if ($response === false) {
        return array();   // service unreachable
    }
    // Note: unserialize() on remote data means trusting the service completely
    $arr = unserialize($response);
    // Useful keys: geoplugin_countryName, geoplugin_countryCode,
    // geoplugin_regionName, geoplugin_regionCode, geoplugin_city

    return is_array($arr) ? $arr : array();
}


Running the IP address through the geolocation service returns the following data:

Array
(
    [geoplugin_request] => 188.143.232.31
    [geoplugin_status] => 200
    [geoplugin_credit] => Some of the returned data includes GeoLite data created by MaxMind, available from http://www.maxmind.com.
    [geoplugin_city] => Saint Petersburg
    [geoplugin_region] => Sankt-Peterburg
    [geoplugin_areaCode] => 0
    [geoplugin_dmaCode] => 0
    [geoplugin_countryCode] => RU
    [geoplugin_countryName] => Russian Federation
    [geoplugin_continentCode] => EU
    [geoplugin_latitude] => 59.894402
    [geoplugin_longitude] => 30.2642
    [geoplugin_regionCode] => 66
    [geoplugin_regionName] => Sankt-Peterburg
    [geoplugin_currencyCode] => RUB
    [geoplugin_currencySymbol] => руб
    [geoplugin_currencySymbol_UTF8] => руб
    [geoplugin_currencyConverter] => 32.4624
)

So now we have information that points to a spammer in Saint Petersburg, Russia.  This makes sense: a lot of the world's spam comes from Russia (although many sources indicate that most of it comes out of the good ol' USA).
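
For logging or for the notification email, only a couple of those fields are usually needed. Here is a minimal sketch of pulling a readable location out of the returned array. The helper name describeLocation is hypothetical (it is not part of geoplugin), and the sample values are copied from the output above.

```php
<?php
// Hypothetical helper (not part of geoplugin): builds a short,
// human-readable location string from a geoplugin-style result array.
function describeLocation($arr) {
    if (!is_array($arr)) {
        return 'unknown location';   // lookup failed or returned garbage
    }
    $parts = array();
    foreach (array('geoplugin_city', 'geoplugin_countryName') as $key) {
        if (!empty($arr[$key])) {
            $parts[] = $arr[$key];
        }
    }
    return count($parts) > 0 ? implode(', ', $parts) : 'unknown location';
}

// Sample values copied from the geolocation output shown earlier.
$sample = array(
    'geoplugin_city'        => 'Saint Petersburg',
    'geoplugin_countryName' => 'Russian Federation',
);
echo describeLocation($sample), "\n";   // Saint Petersburg, Russian Federation
```

Falling back to 'unknown location' keeps the form handler working even when the geolocation service is unreachable.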

Blocking Spam - Preventing spam from getting successfully sent

  1. Block by country code or region.  Once you know where people are sending data from, you may find that your website only makes sense for traffic from certain regions.  You can easily check the values returned by the geoplugin function to see whether the sender is on your blocked list.  Customers outside the United States aren't relevant to our business, so we block them:
    // a failed lookup (no country code) is blocked too
    if (!isset($arr['geoplugin_countryCode']) || $arr['geoplugin_countryCode'] != 'US') {
        //silently block the request
    }
  2. Check the comment text.  Comments sent by spammers typically have links in them.  You can submit the text to a service called Akismet, which will tell you whether the content is detected as spam.  Akismet was originally intended for WordPress, but don't worry: you don't need to be running WordPress to use it.  You do need a PHP class for the Akismet API (Akismet.class.php below) and a free API key, which you have to sign up for.
    require_once("Akismet.class.php");
    function is_spam($name, $email, $comment) {
        $WordPressAPIKey = 'xxxxxx';
        $MyBlogURL = "http://mysite.com";
        $akismet = new Akismet($MyBlogURL, $WordPressAPIKey);
        $akismet->setCommentAuthor($name);
        $akismet->setCommentAuthorEmail($email);
        $akismet->setCommentContent($comment);
        $akismet->setPermalink("http://lutz-engr.com/index.php");
        return $akismet->isCommentSpam();
    }
  3. The above steps have been enough to block the vast majority of the spam for our sites.  If you are really serious about blocking more, use a CAPTCHA.  Personally, I hate these and don't recommend their use unless absolutely necessary, but they are definitely needed in certain applications.
Even when spam is detected using these methods, I still send it along, but mark it as spam.  It's fun to peek every once in a while to see what silly things these spam bots are doing to promote a product I'll never buy or a web link I'll never click.
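
The "send it along, but mark it" approach can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the "[SPAM]" subject prefix, and the crude link count (standing in for a real Akismet call) are all hypothetical and not taken from our actual form handler.

```php
<?php
// Hypothetical helpers for illustration only.

// True when the lookup failed or the country is not on the allowed list.
function isBlockedCountry($geo, $allowed = array('US')) {
    if (!is_array($geo) || empty($geo['geoplugin_countryCode'])) {
        return true;   // assumption: treat failed lookups as suspicious
    }
    return !in_array($geo['geoplugin_countryCode'], $allowed, true);
}

// Crude stand-in for a spam service: counts URLs in the comment text.
function countLinks($comment) {
    return (int) preg_match_all('/https?:\/\//i', $comment, $matches);
}

// Prefixes the email subject so flagged messages are easy to filter.
function buildSubject($subject, $isSpam) {
    return $isSpam ? '[SPAM] ' . $subject : $subject;
}

// Example using the country code and comment from the spam shown earlier.
$geo     = array('geoplugin_countryCode' => 'RU');
$comment = 'Could I have , please? <a href=" http://www.digitrak.com ">accutane prescription requirements</a>';
$isSpam  = isBlockedCountry($geo) || countLinks($comment) > 0;
echo buildSubject('Contact form request', $isSpam), "\n";   // [SPAM] Contact form request
```

Flagging instead of dropping means a false positive (say, a real customer who pastes a link) still reaches the inbox.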

Do spammers/spam bots run client-side JavaScript?

UPDATE: After running the code below, I have found that JavaScript is NOT being run on the forms.  This is based on submissions from the very same IP address identified earlier.  Since our website requires JavaScript anyway, it might make sense to block submission of the form unless JavaScript has run on the client machine.  The offending machine appears to be running an Apache 2.2.15 server on CentOS, and it may be worth exploring further to see what else I can find.


I have been wondering about this, and I suspected it would be easier and safer for a spambot not to run any client-side scripting.  After all, doesn't it just crawl a page and fill out forms?  Maybe I could require users to be running JavaScript to submit the form.  To test this, I've added a single line that changes the value of a hidden field if JavaScript is running:

<head>
...
<!-- jQuery must already be loaded at this point -->
<script>
$(document).ready(function() { $('#timedayjs').val('set by js'); });
</script>
...
</head>

...
<input type="hidden" id="timedayjs" name="timedayjs" value="nojs (unused)">

I called it 'timedayjs' just to be inconspicuous, but really this value just tells me whether the client has run the document-ready JavaScript function on the page.  I'll get either the value 'nojs (unused)' or 'set by js' passed in with my contact form.
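
On the server side, reading that value could look something like the sketch below. The function name ranJavascript is hypothetical; the field name 'timedayjs' and the value 'set by js' come from the form snippet above. The real form handler would pass $_POST instead of the sample arrays.

```php
<?php
// Hypothetical helper: reports whether the hidden field shows that the
// page's document-ready JavaScript ran before the form was submitted.
function ranJavascript($post) {
    return isset($post['timedayjs']) && $post['timedayjs'] === 'set by js';
}

// In the form handler this would be called as ranJavascript($_POST).
var_dump(ranJavascript(array('timedayjs' => 'set by js')));      // bool(true)
var_dump(ranJavascript(array('timedayjs' => 'nojs (unused)')));  // bool(false)
var_dump(ranJavascript(array()));                                // bool(false)
```

A missing field is treated the same as "no JavaScript", since a bot that posts directly to the handler may not send the hidden input at all.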

I'm awaiting results from this experiment now.


UPDATE: Now that the results are in, see the next post for how you can fight contact form spam by requiring the use of JavaScript.