Sunday, January 4, 2015

Don't Look Bad - Test Your Website

Don't be caught looking like a fool when your website fails unexpectedly.  Find errors ahead of time using automated testing techniques.  In this article we discuss automatic regression testing.  Use PHP to set up your own continuous integration server.

When you are responsible for updating a website that is in active use, you want to feel confident that the changes you make to the code do not break any currently working pages.  Nothing is worse than making a small change to fix one problem and finding out later from a customer that it caused another page to stop working.  Spare yourself that embarrassment and set up automated webpage checking.  It's not too difficult, and you will thank yourself later when you identify errors before pushing changes from your development server to the production server.

Types of Testing

  • Unit Testing - Using a tool like PHPUnit, this tests the individual components of your programs and can be run each time code changes are made.
  • Regression Testing - The subject of this article.  These tests help verify that the webpage works as a whole.  They are designed to supplement unit testing and provide early notification when any page or web service breaks.  Scheduling these tests to run repeatedly is the next best thing to having your own continuous integration service, but is much easier to set up.
  • Integration/UI Testing - Using a UI simulator like Selenium, this tests the end user's experience on the website.  While useful, it is time consuming to set up and breaks easily when the site layout changes.  This testing is beyond the scope of this article.

 

First Things First, the Overall Process


  1. Separate development servers from production.  Keep a development server that is distinct from the production server.  Development servers can be low-power, low-memory machines, which are not only inexpensive to maintain but have the added benefit of exposing performance issues early.
  2. Perform unit testing on the development server each time you make code changes.
  3. Perform automatic regression tests from a different computer than the one you are checking.  If the computer you are testing goes down, you want to be notified.  If the test runs on the same computer it checks and that computer fails, you risk not being notified of a serious problem.  Run an automated task on the production server that checks the development server.
  4. Test in both directions.  Run an automated task on the development server that checks the production server.  
  5. Confidently push code changes to the production server after testing successfully passes.

 

Writing a Regression Test - Get what you expect, not what you don't 


Pick the first page you want to check and identify three things: the URL of the page, the content that is expected to be on the page (e.g. the page title), and content that should never be on the page (error messages, etc.).  The test will verify that the content you expect is present and that the content you don't want is absent.  This could be any page you want; I am using the homepage of checkliststogo.com (a site I develop and maintain).  The error messages I use are geared towards a Linux, PHP, and Apache server, but the same basics apply to other systems.  Make sure errors are both reported and displayed by your programming language (for PHP this can be done by setting error_reporting = E_ALL and display_errors = On in /etc/php.ini).
URL - "www.checkliststogo.com"
Expected content - "Checklists ToGo", "Popular Checklists", "WorxForUs ©"
Error content - "Parse error: syntax error", "{local filesystem root path to your site}", in my case: "/home/ec2-user/www/htdocs/"

Checking for the local filesystem path while the server is configured to display errors is an extremely powerful and easy way to detect site errors.  Syntax errors, database errors, run-time errors, and all kinds of other problems are easily detected, since PHP error messages include the full filesystem path of the failing file.  NOTE: Do not use only the relative path below the filesystem root as an error string, since that text also appears in ordinary page links.
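To make the idea concrete, here is a minimal sketch of the string check applied to two made-up pages.  The sample HTML, the file name list.php, and the helper name find_error_indicators are illustrative assumptions and are not part of the site or of the helper shown later; the warning text only approximates what PHP prints when errors are displayed.

```php
<?php
// Illustrative only: roughly what PHP prints for a warning when errors are
// displayed. The file name and line number are made up for this example.
$sample_page = "<html><body><b>Warning</b>: Undefined variable \$row in "
    . "<b>/home/ec2-user/www/htdocs/list.php</b> on line <b>42</b></body></html>";

// A healthy page links to relative paths but never mentions the filesystem root.
$healthy_page = '<html><body><a href="/list.php">Lists</a></body></html>';

/** Returns the error indicators found in $content (empty array when clean). */
function find_error_indicators(string $content, array $indicators): array {
    $found = array();
    foreach ($indicators as $indicator) {
        // stripos is case-insensitive and returns false when there is no match
        if (stripos($content, $indicator) !== false) {
            $found[] = $indicator;
        }
    }
    return $found;
}

$indicators = array("Parse error: syntax error", "/home/ec2-user/www/htdocs/");
print_r(find_error_indicators($sample_page, $indicators));   // lists the root path
print_r(find_error_indicators($healthy_page, $indicators));  // empty array
```

The healthy page contains "/list.php" in a link but not the filesystem root, which is why the root path works as an indicator while the relative path would not.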

Running the Regression Test - Main Code


Now that we know the site and which strings to check for, we can build the program that runs the actual test, in a file called sample_index_test.php.

<?php
    include_once("validate_site_helper.php");

    $url = "http://www.checkliststogo.com";
    //These strings must not be in the page content to pass testing
    $err_arr = array();
    $err_arr[] = "Parse error: syntax error";
    $err_arr[] = "/home/ec2-user/www/htdocs/"; //This is probably the best detector in this group

    //These strings are required to be in the page content to pass testing
    $pass_arr = array();
    $pass_arr[] = "Checklists ToGo"; //check page title is on page
    $pass_arr[] = "Popular Checklists"; //check sample header
    $pass_arr[] = "WorxForUs &copy;"; //check copyright

    //Check the site
    $result = validate_site_helper::check_site($url, $err_arr, $pass_arr, basename(__FILE__));

    //Report the test result
    if (!$result->success) {
        $message = "Page {$url} testing failed - {$result->error}";
        handle_error_notification($message);
    } else {
        //(optional) let developer know the site was ok
        echo ("Site {$url} is ok");
    }

    //This is your custom function to send the notification to the administrator or developer
    function handle_error_notification($message) {
        //Notify admin of failure - email, print to screen, etc.
        //Please see other blog entries on sending emails which are a great notifier
        echo ("ERROR: {$message}");
    }
?>


The validate site helper encapsulates all this checking and returns a result object that lets you know how the testing went.

The error notification is going to be different for each system and is beyond the scope of this article.  In my case, I like to use Amazon Simple Email Service (tutorial here) to send emails to myself when errors are detected and find that works very well.  
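As one possible starting point, here is a minimal sketch of a notifier built on PHP's built-in mail() function rather than Amazon SES.  The function names build_notification and notify_admin, the default subject text, and the address admin@example.com are all hypothetical, and mail() only works if the server has a mail transport configured.

```php
<?php
/**
 * Builds the subject and body of a failure email.
 * $subject and $error mirror the validation_result fields used in this article.
 */
function build_notification(string $subject, string $error): array {
    return array(
        'subject' => $subject !== "" ? $subject : "Website test failure",
        'body'    => "A regression test failed at " . date('c') . ":\r\n\r\n" . $error,
    );
}

/**
 * Sends the notification. $sender defaults to PHP's mail(); pass a different
 * callable to use another transport (or a fake one while testing).
 */
function notify_admin(string $to, string $subject, string $error, ?callable $sender = null): bool {
    $note = build_notification($subject, $error);
    $sender = $sender ?? 'mail';
    return (bool) $sender($to, $note['subject'], $note['body']);
}

// Possible use inside handle_error_notification (the address is a placeholder):
// notify_admin("admin@example.com", $result->subject, $result->error);
```

Keeping the transport behind a callable means the message formatting can be exercised without sending real email.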


Validate Site Helper - Code


The validate_site_helper does all the hard work of getting the page content from the URL, searching the text for the expected and error strings, and returning the result.

<?php

class validation_result {
        public $success = true;
        public $error = ""; //for passing errors
        public $subject = ""; //for providing a quick summary to email
}

/**
 *  validate_site_helper - this is a tool to capture and parse a specific web site page
 * @author sbossen
 */
class validate_site_helper {

        protected static function check_site_helper($site_content, $host_url, $err_indicators_arr, $pass_indicators_arr, $calling_file) {
                $result = new validation_result();
                //using try here so any parse errors will be caught by this script
                try {
                        $ctg_content = $site_content;
                        //check for the errors
                        foreach ($err_indicators_arr as $err_str) {
                                if (stripos($ctg_content, $err_str) !== false) { //explicit false check so a match that evaluates as falsy still counts
                                        $result->success = false;
                                        $result->error .= "Suspected error indication: '$err_str' found in generated page content.\r\n";
                                }
                        }
                        //check for the required items
                        foreach ($pass_indicators_arr as $pass_str) {
                                if (stripos($ctg_content, $pass_str) === false) { //explicit false check so a match that evaluates as falsy still counts
                                        $result->success = false;
                                        $result->error .= "Validation indication: '$pass_str' was not found in generated page content.\r\n";
                                }
                        }

                        if (!$result->success) {
                                $result->subject = "$host_url - Warning - $calling_file";
                        }
                } catch (Exception $e) {
                        $result->success = false;
                        //email user
                        $body = $e->getMessage()."\r\n".$e->getTraceAsString();
                        $result->error .= $body;
                        $result->subject = "$host_url - Execution Error - $calling_file";
                }
                return $result;
        }

        public static function check_site($host_url, $err_indicators_arr, $pass_indicators_arr, $calling_file) {
                $result = new validation_result();
                //file_get_contents returns false (with a warning) on most failures instead of throwing; the try guards against any exception raised along the way
                try {
                        $ctg_content = file_get_contents($host_url);
                        $result = validate_site_helper::check_site_helper($ctg_content, $host_url, $err_indicators_arr, $pass_indicators_arr, $calling_file);
                } catch (Exception $e) {
                        $result->success = false;
                        //email user
                        $body = $e->getMessage()."\r\n".$e->getTraceAsString();
                        $result->error .= $body;
                        $result->subject = "$host_url - Execution Error - $calling_file";
                }
                return $result;
        }

        public static function check_site_with_post($host_url, $post_params_array, $err_indicators_arr, $pass_indicators_arr, $calling_file) {
                $result = new validation_result();
                //using try here so any errors will be caught by this script and emailed
                try {
                        // use key 'http' even if you send the request to https://...
                        $options = array(
                                'http' => array(
                                        'header'  => "Content-type: application/x-www-form-urlencoded\r\n",
                                        'method'  => 'POST',
                                        'content' => http_build_query($post_params_array),
                                ),
                        );
                        $context  = stream_context_create($options);
                        $ctg_content = file_get_contents($host_url, false, $context);

                        $result = validate_site_helper::check_site_helper($ctg_content, $host_url, $err_indicators_arr, $pass_indicators_arr, $calling_file);
                } catch (Exception $e) {
                        $result->success = false;
                        //email user
                        $body = $e->getMessage()."\r\n".$e->getTraceAsString();
                        $result->error .= $body;
                        $result->subject = "$host_url - Execution Error - $calling_file";
                }
                return $result;
        }

}

?>
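The check_site_with_post method is not demonstrated in this article, so here is a minimal sketch of what it prepares before sending the request.  The form field names and values are hypothetical, and the example stops short of contacting a server: it only builds the request body and stream context and prints what would be sent.

```php
<?php
// Hypothetical form fields; substitute the names your own form expects.
$post_params_array = array('username' => 'test user', 'remember' => '1');

// http_build_query URL-encodes the fields into the request body.
$body = http_build_query($post_params_array);
echo $body . "\n"; // username=test+user&remember=1

// Same options array the helper passes to stream_context_create
$options = array(
    'http' => array(
        'header'  => "Content-type: application/x-www-form-urlencoded\r\n",
        'method'  => 'POST',
        'content' => $body,
    ),
);
$context = stream_context_create($options);

// stream_context_get_options shows the options attached to the context
$sent = stream_context_get_options($context);
echo $sent['http']['method'] . "\n"; // POST
```

A call to the helper would then pass the same array as the second argument, in the same way sample_index_test.php calls check_site.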

Validation Results

 

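This works by using PHP's built-in file_get_contents function, which fetches the contents of the URL from the server.  The call sits inside a try block so that any exception raised while fetching or checking the page is caught and reported rather than stopping the script.  Note that file_get_contents signals most failures (such as the page not being found) by returning false and raising a warning instead of throwing; the helper still reports a failure in that case because none of the required strings are found in the empty content.  Either way the notification code runs, which matters because a test that dies silently when the page cannot be retrieved would be a big problem.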
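If you would rather report a failed fetch with its own message than rely on the missing-string check, one option is to test the return value explicitly.  This is a minimal sketch under that assumption; the function name fetch_page and the shape of the returned array are hypothetical and not part of the helper above.

```php
<?php
/**
 * Fetches a page and reports failure explicitly.
 * Returns array('ok' => bool, 'content' => string, 'error' => string).
 */
function fetch_page(string $url): array {
    // @ suppresses the warning; failure is detected from the false return value
    $content = @file_get_contents($url);
    if ($content === false) {
        $last = error_get_last();
        $reason = $last !== null ? $last['message'] : "unknown error";
        return array('ok' => false, 'content' => "", 'error' => "Could not retrieve $url: $reason");
    }
    if (trim($content) === "") {
        return array('ok' => false, 'content' => "", 'error' => "Page $url is blank");
    }
    return array('ok' => true, 'content' => $content, 'error' => "");
}

// Possible use before running the string checks:
// $page = fetch_page("http://www.checkliststogo.com");
// if (!$page['ok']) { echo "ERROR: {$page['error']}"; }
```

Separating "could not fetch" from "fetched but wrong content" makes the notification easier to act on.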


The returned validation_result object is just a holder to pass along the results of the validation.  When you get the results, you'll want to pass them on somewhere to let the developer know that an error has occurred.  In the sample code here we are just outputting to the screen.  

This code was tested under multiple failure scenarios, including:
  • Server is offline (IP address could not be resolved)
  • Page is not authorized
  • Page does not exist
  • Page is blank

    ERROR: Page http://www.checkliststogo.com/ctg/app testing failed - Validation indication: 'Checklists ToGo' was not found in generated page content. Validation indication: 'Popular Checklists' was not found in generated page content. Validation indication: 'WorxForUs ©' was not found in generated page content.
  • Page is OK

    Site http://www.checkliststogo.com/ctg/app is ok

Automating the Testing


When the validation code is ready, you'll want to run it continually.  An easy way to do this is with cron on Linux or Task Scheduler on Windows.

For me, I just run this test every hour on the 14th minute:
    sudo crontab -e
    14 * * * * php {path to file}/sample_index_test.php
The new schedule is installed when the editor exits, so save and quit (in vi):
    :wq

Of course, you will need to have added the email notification (or another notification system), since cron jobs run without a terminal and you will not see the script's output directly.
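Alongside email, a simple log file gives you a history of every run.  Here is a minimal sketch, assuming a hypothetical helper named log_test_result and a log path of your choosing that the cron user can write to.

```php
<?php
/** Appends one timestamped line to a log file; returns true on success. */
function log_test_result(string $log_file, string $message): bool {
    $line = date('Y-m-d H:i:s') . " " . $message . PHP_EOL;
    // FILE_APPEND keeps earlier entries; LOCK_EX avoids interleaved writes
    return file_put_contents($log_file, $line, FILE_APPEND | LOCK_EX) !== false;
}

// Possible use at the end of sample_index_test.php (the path is a placeholder):
// log_test_result(sys_get_temp_dir() . "/site_test.log", "Site {$url} is ok");
```

Each run adds one line, so a gap in the timestamps also tells you when the scheduled task itself stopped running.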

If you find this code useful, please let me know in the comments, give a +1, or send a smiley cat picture.