anti-hammer - how to exclude my IP?

  • cerio
  • Proficient
  • Posts: 263
  • Loc: UK

Post 3+ Months Ago

Hi.
Does anyone use Anti-Hammer (http://corz.org/serv/tools/anti-hammer/)? I added it to my site to cut down on wasted bandwidth, and it works fine, except that it also catches me and blocks me while I'm working in my site admin. Does anyone know how I can exempt my own IP in Anti-Hammer?

It does suggest a sort of workaround for admins getting caught by it: it seems to say you can exempt a specific browser, such as Firefox, which is what I mainly use. That isn't ideal, though, and I don't understand how to do it anyway. I want to exempt my IP instead.

Their anti hammer bypass solution is...
Quote:
There's also the facility to enable one correctly configured browser to bypass Anti-Hammer at all times. This is designed for busy webmasters who sometimes, in the course of their daily activities, will need to hammer their own site. I know I do!

This setting ("admin_agent_string"), along with many other settings, can be found in the preferences section inside anti-hammer.php. Essentially, you tag a unique string onto the end of your browser's User Agent string, so that Anti-Hammer can recognize you as you. It's not high-security, but it is handy. I've used a similar approach to avoid logging my own site hits for years.


...though I don't actually know how to 'tag a unique string onto the end of your browser's User Agent string'.

The anti-hammer php file contains this...

Code:
<?php    // ۞ text{ encoding:utf-8; bom:no; linebreaks:unix; tabs:4; } ۞//
/* direct access -> \/ */                         $anti_hammer_version = '0.9.3';
if (realpath($_SERVER['SCRIPT_FILENAME']) == realpath(__FILE__)) { die(
'This script is designed to run as a php auto-prepend, like so (in .htaccess)..<br /><br />
<tt>php_value auto_prepend_file "/var/www/vhosts/wafuku.co.uk/httpdocs/anti-hammer.php"</tt>'); }


/*
    Anti-Hammer

    Automatically set temporary bans for web site hammering.
    Protect your valuable server resources for genuine clients.

    Full details here..

        http://corz.org/serv/tools/anti-hammer/

    Have fun!

    ;o) Cor

    © 2007-> corz.org

*/




/*
prefs.. */



/*
    Anti-Hammer data directory

    [default: $_SERVER['DOCUMENT_ROOT'].'/Anti-Hammer/anti-hammer';]

    When using Anti-Hammer's built-in client-tracking (the default), files will
    be stored, in this directory..
                                                    */
$anti_hammer['info_path'] = $_SERVER['DOCUMENT_ROOT'].'/anti-hammer/anti-hammer';




/*
    Client ID File Prefix            
    
    [default: $anti_hammer['ID_prefix'] = 'HammerID_';]

    This text is placed before the client ID in the ID filename. e.g..

        "HammerID_06fa71c938a108f4a2b1f1ef091653ef"

    You may wish to use a different name..
                                                        */
$anti_hammer['ID_prefix'] = 'HammerID_';




/*
    File Types                
    
    [default: $anti_hammer['types'] = 'php,html';]

    Which file types (extensions) to protect with the anti-hammer?

    We only want to count hits on the main pages, not associated files, css,
    javascript includes, and such (if they are generated by php), as many of
    these will normally be requested within milliseconds of the initial page hit.

    If we run the anti-hammer indiscriminately, such files would automatically
    count towards hammering, and folk would probably be penalized on their first
    visit. If you don't use php to generate other (non - .php) files, the
    anti-hammer won't be running anyway - it only runs before php scripts, as
    it's designed to protect server resources, not bandwidth; basic requests get
    spat out without any real processing power or memory usage.

    These list items match the extension of the *actual* physical script file,
    regardless of the requested URI, so for example, these..

        http://mysite.com/index.php
        http://mysite.com/foo.php?page=bar.htm
        http://mysite.com/genny.php?image=img1.jpg
        
    .. would *all* match 'php'. Other extensions are fine, so long as they are
    parsed by php on your setup. Separate entries with commas, and put the whole
    thing in quotes..

    Extensionless files are not supported.
                                                                        */
$anti_hammer['types'] = 'php,html';



/*
    Generated Extensions     
    
    [default: $anti_hammer['gen_types'] = 'jpg,png';]

    This is a list of (usually) image extensions which you serve via php. As
    there may be many of these on a single page, we want to skip these, too.

    These list items match the extension of the *request*, regardless of the
    physical script file generating the output. Links such as..

        http://mysite.com/gen.php?image=foo.jpg
        http://mysite.com/png-pusher.html/foo.jpg

    .. would match "jpg".
    
    Separate entries with commas and put the whole thing in quotes..
                                                                            */
$anti_hammer['gen_types'] = 'jpg,png';
/*
    NOTES:

        The file type *generating* these url's MUST be included in your
        $anti_hammer['types'] array (above), presumably. 'php'.

        You could also use the above preference array to skip other non-
        image generated types, if you have such things onsite.            
*/



/*
    Skip certain files and folders..    
    
        aka, basic "Ignore"..

        [default: $anti_hammer['skip'] = '/chat/,/foobar/members,rdf.php,/blog/rss.php';]

        A list of areas/folders and specific files you DON'T want the
        anti-hammer to cover. Enter the full path (from site root) to each
        file/folder.

        You can also skip ALL the instances of "rss.php", etc. on your entire
        site by using only the file name, e.g..

        $anti_hammer['skip'] = 'rdf.php,rss.php';

        This also works for folders. Using the full path enables you to target
        specific files and folders, using only the name gives you blanket
        coverage. Your call.

        Basically, if your string is contained anywhere within the requested
        URI, the script returns control to your page immediately, bypassing
        Anti-Hammer.

        Do put comments *between* entries.

*/
$anti_hammer['skip'] = '/chat/,/foobar/members,rdf.php,/blog/rss.php';

/*
    RSS feeds are a good example of a file to skip (assuming they are
    php-generated). Firefox, for example, will often grab all the feeds on a
    page at-once, quickly notching up a user's hammer count.

*/



/*
    Hammer Time!

    [default: $anti_hammer['hammer_time'] = 100;]    (One Second)

    If they make two requests within this time, the counter increases by one.

    The faster and more capable your server, the lower this setting can be.
    The higher you set this, the more likely they are to get a warning.

    100 is a reasonable setting for a fast server, enabling one-hit-per-second
    spidering, but penalizing anything faster.
    
    Enter an integer, representing 100th/s..
                                            */
$anti_hammer['hammer_time'] = 90;        



/*
    Trigger levels.        
    
    [default: $anti_hammer['trigger_levels'] = '5,10,20,30';]

    Enter the number of violations that will trigger each of the four levels..

    i.e. At the default settings, they get their first warning after five
    violations (with a ban time of three seconds, set below). The time penalty
    increases after ten and twenty violations, up to the maximum level of 30
    violations (which imposes the maximum ban time of 20 seconds). You can set
    the actual times in the next preference.

    Specify four integer values, separated by commas, whole thing in quotes.
                                                                */
$anti_hammer['trigger_levels'] = '5,10,20,30';



/*
    Ban Times.        
    
    [default: $anti_hammer['waiting_times'] = '3,5,10,20';]

    This list sets the individual times that offenders will be 'banned' for.
    They will have to wait *this* long before they can try again.

    Each of the four settings corresponds to one of the above trigger_levels.

    Specify four integer values, separated by commas, whole thing in quotes.
*/
$anti_hammer['waiting_times'] = '3,5,10,20';
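
/*
    Worked example (an illustrative sketch added to this listing; it is NOT
    part of the original Anti-Hammer script, and ah_example_wait_for() is a
    hypothetical helper name).

    It shows how the two lists above pair up, assuming that a client with
    strictly more violations than trigger level N waits for waiting time N,
    which appears to be how the predefined ban levels further down behave.
    The function takes both lists as arrays of integers.
*/
function ah_example_wait_for($violations, $levels, $times) {
    $wait = 0; // at or below the first trigger level: no fixed waiting time
    foreach ($levels as $i => $level) {
        if ($violations > $level && isset($times[$i])) {
            $wait = $times[$i]; // keep the time for the highest level exceeded
        }
    }
    return $wait;
}
// e.g. ah_example_wait_for(12, array(5,10,20,30), array(3,5,10,20)) gives 5.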



/*
    Rolling Trigger Times
    
    [default: $anti_hammer['rolling_trigger'] = false;]

    This increases the ban time automatically with EACH hammer.

    <hit>
        You must wait three seconds..
    <hit>
        You must wait four seconds..
    <hit>
        You must wait five seconds..

    And so on.

*/
$anti_hammer['rolling_trigger'] = false;



/*
    Cut-Off

    [default: $anti_hammer['cut_off'] = '']

    You can also set an absolute cut-off point.

    Anyone receiving this many hammer violations is simply dropped, and from
    that point onward, their pages die before they even begin - blank.

    This works with both preset and rolling triggers.

    Leave blank to disable the cut-off.
                            */
$anti_hammer['cut_off'] = '75';



/*
    Bye Bye! Message.

    [default: $anti_hammer['cut_off_msg'] = '<h1>Bye Now!</h1>';]

    A final word from our sponsor?

    This is the final message they see before it all goes blank.
    No other text is presented.
                                                */
$anti_hammer['cut_off_msg'] = '<h1>Bye Now!</h1>';



/*
    Ban Time

    [default: $anti_hammer['ban_time'] = '12';]

    And for how many hours will the above cut-off (ban) last?
                                                            */
$anti_hammer['ban_time'] = '12';

//    NOTE:    If you set your Garbage Collection age to any less than this, you
//            effectively reset all bans older than THAT figure.
//
//            In other words, ensure your garbage collection age ('GC_age', below)
//            is larger than your 'ban_time' setting here, probably x2.
//            Think: if GC happened one minute after someone was banned, and their
//            session ID file was >= GC_age, it would be cleaned up! Then no ban!
//
// Also Note: Humans are daily creatures, for them a 12h ban, is effectively 24!



/*
    Log File

    [default: $anti_hammer['log'] = $_SERVER['DOCUMENT_ROOT'].'/log/.ht_hammers';]

    We will log each banned hit, for reference.
    Enter full path to log location..        

    NOTE: If the parent directory does not exist, Anti-Hammer will not attempt
    to create it, and you will get no logging.
                                                                            */
$anti_hammer['log'] = $_SERVER['DOCUMENT_ROOT'].'/log/.ht_hammers';

//             It is recommended you watch this log very carefully for the first
//    NOTE:     few minutes/ days after installation, in case of unexpected side-
//             effects. And in that case, please do mail me about it!



/*

    Kill Message.

    [default: $anti_hammer['kill_msg'] = 'Please do not hammer this site.<br />';]

    When a request is killed - send this message (before the other text).
    You can use any valid HTML in here, header tags, or whatever you like..
                                                                        */
$anti_hammer['kill_msg'] = 'Please do not hammer this site!<br />';

/* NOTE: No <br /> is placed after this text.
         If you aren't using <h> tags, and want a break, add it yourself. */

/*

    Page Title.

    [default: $anti_hammer['page_title'] = 'Please do not hammer this site!';]

    This is what is displayed in the title bar of their browser.
    Keep this one plain text.
                                                            */
$anti_hammer['page_title'] = 'Please do not hammer this site!';


/*
    WebMaster's Name

    [default: $anti_hammer['webmaster'] = 'the webmaster';]

    Name of the webmaster, will be included in the kill page.
    e.g. "If you believe this is in error, please mail <Insert Name> about it!"
                                                            */
$anti_hammer['webmaster'] = 'the webmaster';



/*
    Admin Bypass

    [default: $anti_hammer['admin_agent_string'] = 'MyCrazyUserAgentString';]

    If you insert this exact string into your web browser's user-agent string
    (just tag it onto the end), you can bypass the hammer altogether.
    Very handy for busy webmasters.
                                                                    */
$anti_hammer['admin_agent_string'] = 'MyCrazyUniqueUserAgentString';    


//             It's not advisable to go messing with the main body of your
//    NOTE:    browser's user agent string. Lots of web designers rely on this
//            information to serve you beautiful, functional web pages.
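
/*
    What "tagging it onto the end" looks like (an illustrative sketch added to
    this listing; it is NOT part of the original Anti-Hammer script). The user
    agent in the example is made up, and ah_example_is_admin() is a
    hypothetical helper that does the same kind of case-insensitive "contains"
    test as the Admin Bypass check further down. How you actually change your
    browser's user agent string depends on the browser.
*/
function ah_example_is_admin($user_agent, $tag) {
    // an empty tag must never match, or every client would bypass the hammer
    return $tag !== '' && stripos((string) $user_agent, $tag) !== false;
}
// e.g. ah_example_is_admin('Mozilla/5.0 (X11; Linux) ExampleBrowser/1.0 MyCrazyUniqueUserAgentString',
//                          'MyCrazyUniqueUserAgentString') is true.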



/*
    WebMaster email address (string).

    [default: $anti_hammer['error_mail'] = 'bugs at mydomain dot com';]

    The usual text format of so-and-so at such-and-such dot com works well.
    This is tagged on to the end of the message inside <> angle brackets,
    to look like an address.
                                                                        */
$anti_hammer['error_mail'] = 'admin@wafuku.co.uk';



/*
    Lookup Failures.

    When an event worth logging occurs, we can lookup the host name of the
    client to add to our logs. This takes a moment, but only occurs while
    logging bad clients, and can be useful in quickly identifying abusers
    (or good bots using bad user agent string - to come)
                                                    */
$anti_hammer['lookup_failures'] = true;


/*

    Allow known bots?

    [default: $anti_hammer['allow_bots'] = false;]

    We can allow certain bots to bypass the Anti-Hammer.

    To do this, specify the expected user agent strings in..

        path-to/anti-hammer/exemptions/exemptions.ini
        
    and then supply an IP-mask file where said user agent is expected to be
    making requests FROM, one ip per line, in the standard Spider IP list format
    as found here..

        http://www.iplists.com/
        http://www.iplists.com/nw/    <- updated, reorganised, with msnbot+more

        A blog URI is listed there, where list updates are posted.
        (this doesn't happen a lot, maybe 2-3 times a year)

    NOTE:    User agent string matches are CaSe SenSiTivE! If you want to match
            "msnbot" and "MSNBOT", you need two entries. (a case-insensitive
            test is roughly five times slower than case-sensitive; so testing
            two separate entries is much faster)

    NOTE:    If cooking up your own anti-hammer.ini, you probably do not want to
            include the generic user agent strings (e.g. Yahoo's "Mozilla/4.0"),
            which would create a lot of processing overhead, as ALL browsers
            send that. Doh! (More notes within that file.)


    You can set this to "true" (no quotes), in which case, all specified bots
    are simply allowed to bypass the hammer. You can also set it to an integer,
    e.g..

        $anti_hammer['allow_bots'] = 50;
    
    ..that integer representing the hammer_time that will apply to the specified
    clients. "50" would enable 2 hits-per-second spidering, but nothing faster,
    which is half the normal hammer_time of One Second (hammer_time=100).


                                    */
$anti_hammer['allow_bots'] = true;


/*
    The following two preferences control Anti-Hammer's built-in Client session
    Garbage collection routines..
*/


/*
    Garbage Collection Limit

    [default: $anti_hammer['GC_limit'] = 10000;]

    To prevent your server's hard drive filling up with stale client sessions,
    we run a periodic garbage collection routine to sweep up the old files.

    How periodically, is up to you. By default, Anti-Hammer will check for
    garbage every 10,000 hits. For a small site (at around 5000 hits per day),
    that works out to roughly one sweep every two days.

    Obviously, you can change this number to anything you like, depending on how
    busy your site is, and how much space you have on the disks.
    
    If you don't want Anti-Hammer to clean up its garbage, set this to 0.

    Remember to ensure that this limit falls well outside your longest ban time,
    probably at least 2x that.
                                */
$anti_hammer['GC_limit'] = 10000;


/*

    GarbAge!

    [default: $anti_hammer['GC_age'] = 24;]

    How old, in hours, is considered "stale"?
    Any ID files older than this will be swept away (deleted).
                                                                */
$anti_hammer['GC_age'] = 24;
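
/*
    Sanity check (an illustrative sketch added to this listing; it is NOT part
    of the original Anti-Hammer script, and ah_example_gc_age_ok() is a
    hypothetical helper name).

    The note under 'ban_time' suggests keeping 'GC_age' at roughly twice
    'ban_time', so that garbage collection cannot sweep away a ban early.
    Both values are in hours.
*/
function ah_example_gc_age_ok($gc_age, $ban_time) {
    return (float) $gc_age >= 2 * (float) $ban_time;
}
// e.g. ah_example_gc_age_ok(24, '12') is true; ah_example_gc_age_ok(12, '12') is false.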


/*
    NOTE: The previous two preferences have no effect if you set the following
    preference ('use_php_sessions') to true. They are only for AntiHammer's
    built-in client session files.

*/


/*
    Use php sessions..

    [default: $anti_hammer['use_php_sessions'] = false;]

    You would think it might be a nice idea to detect if the client has cookies
    enabled, and if so, use php sessions, only falling-back to some other method
    when they have not. However, it is not possible to detect whether or not a
    client has cookies enabled, with a single request. You need Two. Clearly,
    that isn't a lot of use for a protection mechanism designed to operate
    before they have even had one. So you gotta choose, now..

    By default, anti-hammer will use its own session mechanism, writing client-
    unique data to files in a directory of your choosing, irrespective of their
    ability to accept cookies. As it is an independent system, it in no way
    interferes with any session magic you may have running on your site, and in
    most scenarios is just as fast as php's own session handling.

    However, you may wish to use that, instead; particularly if you have
    millions of hits a day, and your web server stores the php sessions in
    some uberfast /tmp space you can't otherwise get to, where the difference
    might be worth it. Or if in-website space is extremely limited. At any rate,
    you have a choice.

    NOTE: if you enable this, you will ALWAYS start a php session with each
    request. This usually presents no problems, but you and your server may know
    better. Testing is always advised! I ran it this way for many months on
    corz.org, with no issues whatsoever, and I use php sessions all over the
    site. If you use proper names in your session, everything should work fine.

    Also NOTE: With this enabled, if the client/spider/script kiddie/etc. has
    cookies disabled in their web browser, they bypass anti-hammer protection!
    This is why, by default, Anti-Hammer uses its own session mechanism.

    There should be no performance concerns; Anti-Hammer writes the data in the
    same way as a php session - it's a simple serialized array in a flat file.

*/
$anti_hammer['use_php_sessions'] = false;



/*
:end prefs: */




// let's go..
//


$killpage = false;
$gentime = explode(' ', microtime());
$anti_hammer['now_time'] = $gentime[1].substr($gentime[0], 2, 2);                //     1/100th of a second accuracy! (first two decimal digits of the "0.12345600" part)
settype($anti_hammer['now_time'], "double");                                    // scientifically tested!
$anti_hammer['final_time'] = 0;    // will be used to set the retry header on killed page (503)


// Collect all usable client data..
$anti_hammer['remote_ip']        = $_SERVER['REMOTE_ADDR'];
$anti_hammer['user_agent']        = @$_SERVER['HTTP_USER_AGENT'];
$anti_hammer['referrer']        = @$_SERVER['HTTP_REFERER'];
$anti_hammer['request']            = $_SERVER['REQUEST_URI'];
$anti_hammer['user_accept']        = @$_SERVER['HTTP_ACCEPT'];
$anti_hammer['user_charset']    = @$_SERVER['HTTP_ACCEPT_CHARSET'];
$anti_hammer['user_encoding']    = @$_SERVER['HTTP_ACCEPT_ENCODING'];
$anti_hammer['user_language']    = @$_SERVER['HTTP_ACCEPT_LANGUAGE'];


// Admin Bypass..

// Is this the admin user? let's see..
if (stristr($anti_hammer['user_agent'], $anti_hammer['admin_agent_string'])) {
    return;
}
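
// IP bypass (an illustrative sketch added to this listing; it is NOT part of
// the original Anti-Hammer script). It assumes a hypothetical 'admin_ips'
// preference: a comma-separated list of exact IP addresses that should skip
// the hammer checks, much as 'admin_agent_string' does above. The address
// below is a documentation placeholder; substitute your own. If your IP
// changes, or you reach the site through a proxy, the address the server sees
// may not be the one you expect.
if (!isset($anti_hammer['admin_ips'])) {
    $anti_hammer['admin_ips'] = '203.0.113.5';
}
function ah_ip_is_exempt($ip, $list) {
    // exact, whole-address comparison, so '1.2.3.4' does not match '1.2.3.45'
    return in_array($ip, array_map('trim', explode(',', $list)), true);
}
if (isset($anti_hammer['remote_ip']) && ah_ip_is_exempt($anti_hammer['remote_ip'], $anti_hammer['admin_ips'])) {
    return;
}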


// local server access (for readfile() requests..
// (and as a potential catch-all for user pref errors!))
if ($anti_hammer['remote_ip'] == $_SERVER['SERVER_ADDR']) {
    return;
}/*
     A note about readfile()..

        If you use readfile() to include resources on your pages, remember,
        those requests will come in right after the first, and as they are
        technically brand new hits, they count towards the hammer.

        Use of include() is preferred.

        However, the code right above this notice should prevent any issues. If
        it does /not/, and include() isn't working, you might want to hack in
        the actual IP Address of the local server. See my debug-report.zip for
        a way to easily get this sort of information in your browser.

        NOTE: If you are having difficulty include()ing URI resources in your
        pages, remember you need to enable BOTH php allow_url_* flags (this is
        the .htaccess version of those two switches..)

        php_flag allow_url_fopen on
        php_flag allow_url_include on
*/




// skip protection for known bots and spiders..
//
// okay, this is some cute code! simple, but effective.
// we load an ini file of user-agent=ip-mask-file pairs, and check our client's
// user agent string for a match (it must match the beginning of the string
// exactly). If there is a match, we load the associated IP Mask file, and
// run through the IP/masks, again looking for a perfect match at the start of
// the two strings. Commented lines are no problem. We use strpos() for both
// tests, so it's nice and fast, and the IP test covers our comments, too!
//
// having said (coded) all this, you gotta ask yourself, why are they hammering?
// Surely it would be better to get them to slow down, instead!


$IP_file = '';
$anti_hammer['ini_file'] = $anti_hammer['info_path'].'/exemptions/exemptions.ini';

if ($anti_hammer['allow_bots']) {
    $bot_agent_array = read_bots_ini($anti_hammer['ini_file']);
    if (is_array($bot_agent_array)) {
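        $ip_array = false;
        foreach ($bot_agent_array as $bot_agent_string => $bot_ip_file) {
            if ($bot_agent_string and strpos($anti_hammer['user_agent'], $bot_agent_string) === 0) {
                // remember the IP file only for the agent that actually matched; otherwise
                // $IP_file would be left holding the last entry in the list after the loop.
                $IP_file = $bot_ip_file;
                break;
            }
        }
        if ($IP_file) {
            $ip_array = file($anti_hammer['info_path'].'/exemptions/'.$IP_file);
        }
        if (is_array($ip_array)) {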
            foreach($ip_array as $bot_ip) {
                if (@strpos($anti_hammer['remote_ip'], trim($bot_ip)) === 0) {
                    if ($anti_hammer['allow_bots'] > 1) {
                        $anti_hammer['hammer_time'] = $anti_hammer['allow_bots'];
                    } else {
                        return;
                    }
                }
            }
        }
    }
}



// User prefs..
// Get user values into usable arrays, do some error-checking.


// trigger thresholds..
if (!stristr($anti_hammer['trigger_levels'], ',') or (str_word_count($anti_hammer['trigger_levels'], 0, "0123456789") != 4)) {
    $anti_hammer['trigger_levels'] = '5,10,20,30';
}
// A neat way to create an array from numeric prefs..
$anti_hammer['trigger_levels'] = str_word_count($anti_hammer['trigger_levels'], 1, "0123456789");

// Get user penalty times into correct values..
if (!stristr($anti_hammer['waiting_times'], ',') or (str_word_count($anti_hammer['waiting_times'], 0, "0123456789") != 4)) {
    $anti_hammer['waiting_times'] = '3,5,10,20';
}
$anti_hammer['waiting_times'] = str_word_count($anti_hammer['waiting_times'], 1, "0123456789");

// file types to protect..
if (!$anti_hammer['types']) { return; } // no types specified, forget it!
$anti_hammer['types'] = explode(',', $anti_hammer['types']);

// generated types to skip..
$anti_hammer['gen_types'] = explode(',', $anti_hammer['gen_types']);

// ignored locations..
$anti_hammer['skip'] = explode(',', $anti_hammer['skip']);




// run through ignored locations and if matched, return immediately..
//
foreach($anti_hammer['skip'] as $nogo) {
    if (stristr($anti_hammer['request'], trim($nogo))) { return; }
}

// Anti-Hammer only for php files, not generated css, etc..
//
$ah_type_ok = false;
foreach($anti_hammer['types'] as $ah_type) {
    if (!$ah_type) { continue; }
    if (@end(explode('.', $_SERVER['SCRIPT_FILENAME'])) == trim($ah_type)) { // @ to avoid strict php5 errors
        $ah_type_ok = true;
    }
    //2do.. could make this code more efficient, for those using MANY types.
}
if ($ah_type_ok /* still! */ == false) { return; }

// skip protection for selected generated types..
//
if (in_array(@end(explode('.', $_SERVER['REQUEST_URI'])), $anti_hammer['gen_types'])) { return; }



/*

    okay, let's do it..

*/



// read session data..
$session = array();
if ($anti_hammer['use_php_sessions']) {

    // Regular php session..
    session_start();
    $session = isset($_SESSION['anti_hammer']) ? $_SESSION['anti_hammer'] : array(); // empty on a client's first request

} else {

    // Anti-Hammer's built-in session mechanism..

    // Create a unique Client ID for this client..
    // we simply MD5 all the browser data concatenated together (and blanks are not a problem)..
    $anti_hammer['client_id'] = md5($anti_hammer['user_agent'].
                                    $anti_hammer['user_accept'].
                                    $anti_hammer['user_language'].
                                    $anti_hammer['user_encoding'].
                                    $anti_hammer['user_charset'].
                                    $anti_hammer['remote_ip']);
    $fake_sess_file = $anti_hammer['info_path'].'/'.$anti_hammer['ID_prefix'].$anti_hammer['client_id'];
    if (file_exists($fake_sess_file)) {
        $session = read_fake_session($fake_sess_file);
    }
}
/*
    Useful use of a "cat"..

    It seems to me that I unwittingly created a system whereby the less
    information a client is willing to give, the more likely they are to be
    banned. I say "seems", because we create an md5 of this information, so the
    actual likelihood of colliding session ID's is astronomically low. However,
    I like the *principle* of the thing.
*/



// Calculate the Hammer Rate..
//

// How much time since their last request (in 100ths of a second)
$hammer_rate = $anti_hammer['now_time'] - @$session['start_time'] + 1;

// Their ban has elapsed (but GC has not swept up their session)..
if ($hammer_rate > ($anti_hammer['ban_time']*60*60*100)) {    // 8640000 = 24 hours (in 100th/second)
    $session['start_time'] = $anti_hammer['now_time'] - 1;
    $hammer_rate = $anti_hammer['hammer_time'];
    $session['hammer'] = $anti_hammer['trigger_levels'][0]-1; // repeat-offenders do not get to start from 0!
    unset($session['cut_off']);
    // do not return here - we still need to write the updated session data.
}


// CUT_OFF has already been set -- BYE NOW!
if ($anti_hammer['cut_off'] and isset($session['cut_off'])) { die(); }




// okay, still here..




// Start with Garbage Collection..
if (!$anti_hammer['use_php_sessions']) {
    CollectGarbage($anti_hammer['info_path'].'/Counter', $anti_hammer['GC_limit']);
}


// Anti-Hammer Protection has been activated!
if ($hammer_rate < $anti_hammer['hammer_time']) {

    $retry_str = 'a few ';
    @$session['hammer'] += 1;

    if ($session['hammer'] > ($anti_hammer['trigger_levels'][0]-1)) {

        // cut-off..
        if ($anti_hammer['cut_off'] and $session['hammer'] > $anti_hammer['cut_off']) {
            $anti_hammer['kill_msg'] = $anti_hammer['cut_off_msg'];
            $session['cut_off'] = true;
        }
        if ($anti_hammer['cut_off'] and $session['hammer'] == $anti_hammer['cut_off']) {
            $anti_hammer['kill_msg'] = '<h1>THIS IS YOUR LAST WARNING!</h1>'.$anti_hammer['kill_msg'];
        }

        // rolling ban time, increments with each hammer..
        if ($anti_hammer['rolling_trigger']) {
            $session['start_time'] = $anti_hammer['now_time'] + (($session['hammer']*100)-1);
            $retry_str = ah_int2eng($session['hammer']);
        } else {
            // predefined ban levels.. these are more effective, as they shock the user with increasing jumps!
            if (($session['hammer'] > $anti_hammer['trigger_levels'][0]) and ($session['hammer'] <= $anti_hammer['trigger_levels'][1])) {
                // we simply nudge their start time forward by *this* many seconds (into the future!)..
                $session['start_time'] = $anti_hammer['now_time'] + (($anti_hammer['waiting_times'][0]*100)-1); // 299 = Three second penalty.
                $retry_str = ah_int2eng($anti_hammer['waiting_times'][0]);

            } elseif (($session['hammer'] > $anti_hammer['trigger_levels'][1]) and ($session['hammer'] <= $anti_hammer['trigger_levels'][2])) {
                $session['start_time'] = $anti_hammer['now_time'] + (($anti_hammer['waiting_times'][1]*100)-1); // Five second penalty! (by default)
                $retry_str = ah_int2eng($anti_hammer['waiting_times'][1]);

            } elseif (($session['hammer'] >= $anti_hammer['trigger_levels'][2]) and ($session['hammer'] <= $anti_hammer['trigger_levels'][3])) {
                $session['start_time'] = $anti_hammer['now_time'] + (($anti_hammer['waiting_times'][2]*100)-1); // Ten second penalty! (etc.)
                $retry_str = ah_int2eng($anti_hammer['waiting_times'][2]);

            } elseif ($session['hammer'] >= $anti_hammer['trigger_levels'][3]) {
                $session['start_time'] = $anti_hammer['now_time'] + (($anti_hammer['waiting_times'][3]*100)-1); // Twenty second penalty!
                $retry_str = ah_int2eng($anti_hammer['waiting_times'][3]);
            }
        }
        $killpage = true;
    }

} else {
    $session['start_time'] = $anti_hammer['now_time'];
}


// write client session data..
SetHammer();


if ($killpage) {
    $km = '<!DOCTYPE HTML SYSTEM><html><head><title>'.$anti_hammer['page_title'].'</title></head><body>'.$anti_hammer['kill_msg'];
    if (!isset($session['cut_off'])) {
        $km .= '
        You must wait '.$retry_str.'seconds before trying again.<br />
        <br />
        If you believe this is in error, please mail '.$anti_hammer['webmaster'].' about it!<br />
        &lt;'.$anti_hammer['error_mail'].'&gt;<br />
        <span style="font-size:x-small;position:fixed;bottom:1em;right:1em;"><a title="Automatically ban web site hammers! Protect your valuable server resources for *genuine* clients"
        id="link-Get-Anti-Hammer" href="http://corz.org/serv/tools/anti-hammer/">Get Anti-Hammer protection for your own site!</a></span></body></html>';
    }
    kill_page($km);
}


if (function_exists('debug')) { debug('out'); } //:debug:


//2do..
// include auto-ban.php ? hmm.



/*

    fin

            */



// You're outta here!
function kill_page($msg) {
global $anti_hammer;

    $r_host = '';
    if ($anti_hammer['lookup_failures']) {
        $r_host = gethostbyaddr($anti_hammer['remote_ip']).' ';
    }
    if (file_exists(dirname($anti_hammer['log']))) {
        $this_hit = ''
        ."page:  "."\t".$anti_hammer['request']."\n"
        ."time:  "."\t".date('Y.m.d h:i:s A')."\t".'ID: '.$anti_hammer['client_id']."\t"."x ".$GLOBALS['session']['hammer']."\n"
        ."visitor:"."\t".$r_host.'['.$anti_hammer['remote_ip'].']'."\t"."(".$anti_hammer['user_agent'].")"."\n"
        ."accepts:"."\t".$anti_hammer['user_accept']."\n"
        ."referer:"."\t".$anti_hammer['referrer']."\n"
        ;
        add_data($anti_hammer['log'], $this_hit."\n");
    }
    header('Content-Type: text/html; charset=utf-8');    // Old IE probably still won't play ball, though.
    header('HTTP/1.1 503 Service Temporarily Unavailable');
    // For CGI/*suexec use..
    if (substr(php_sapi_name(), 0, 3) == 'cgi') { header('Status: 503 Service Temporarily Unavailable'); }
    header('Retry-After: '.($anti_hammer['final_time']+1)); // the calculation needs to be enclosed in parentheses to work.
    die($msg);
}


// write the updated hammer info to the fake/session file..
function SetHammer() {
    if ($GLOBALS['anti_hammer']['use_php_sessions']) {
        $_SESSION['anti_hammer']['start_time'] = $GLOBALS['session']['start_time'];
        $_SESSION['anti_hammer']['hammer'] = $GLOBALS['session']['hammer'];
        $_SESSION['anti_hammer']['cut_off'] = $GLOBALS['session']['cut_off'];
    } else {
        write_fake_session($GLOBALS['fake_sess_file'], $GLOBALS['session']);
    }
}


/*
    Append data to a file.
    Pass true as the 3rd parameter to wipe the file.
                                            */
function add_data($file, $data, $wipe=false) {

    // if it's not there, try to create it (and close the handle again, so it doesn't leak)..
    if (!file_exists($file)) { $fp = fopen($file, 'wb'); if ($fp) { fclose($fp); } }
    
    $flag = 'ab';
    if ($wipe) { $flag = 'wb'; }

    if (is_writable($file)) {
        $fp = fopen($file, $flag);
        $lock = flock($fp, LOCK_EX);
        if ($lock) {
            fwrite($fp, $data);
            flock ($fp, LOCK_UN);
        } else {
            $GLOBALS['errors']['add_data'] = "couldn't lock $file";
        }
        fclose($fp);
    } else {
        $GLOBALS['errors']['add_data'] = "can't write to $file";
    }
}


// read serialized array data from a file, and return as an array..
function read_fake_session($no_cookie_file) {
    if (file_exists($no_cookie_file)) {
        $file_handle = fopen($no_cookie_file, 'rb');
        $file_contents = @fread($file_handle, filesize($no_cookie_file));
        fclose($file_handle);
    } else { return false; }
    $file_contents = unserialize($file_contents);
    if (is_array($file_contents)) {
        return $file_contents;
    }
}

// serialize an array and write the string data to a file..
function write_fake_session($no_cookie_file, $array) {
    $data = serialize($array);
    if (empty($data)) { return; }
    $fp = @fopen($no_cookie_file, 'wb');
    if ($fp) {
        $lock = flock($fp, LOCK_EX);
        if ($lock) {
            fwrite($fp, $data);
            flock ($fp, LOCK_UN);
        }
        fclose($fp);
        clearstatcache();
        return (1);
    }
}



/*

    CollectGarbage

    You could transplant this into another web app fairly easily.
    Useful.

*/
function CollectGarbage($count_file, $limit) {
    if ($limit === 0) { return; }
    if (increment_hit_counter($count_file) >= $limit) {
        $file_list = array();
        if ($the_dir = @opendir(dirname($count_file))) {
            // strict comparison, so a file named "0" doesn't end the loop early..
            while (false !== ($file = readdir($the_dir))) {
                if ((ord($file) != 46) and strpos($file, $GLOBALS['anti_hammer']['ID_prefix']) === 0) {
                    $file_path = dirname($count_file).'/'.$file;
                    if (filemtime($file_path) < (time() - $GLOBALS['anti_hammer']['GC_age']*60*60)) {
                        unlink($file_path);
                    }
                }
            }
            closedir($the_dir);
        }
        increment_hit_counter($count_file, 0, 1); // reset the counter
    }
}
//2do..
//        Run this in another thread? Or maybe a simple http request, perhaps
//        with $_GET, to flip Anti-Hammer to GC mode in the Background - this task
//        could be done after the request is already sent, even simultaneously;
//        there may be a *lot* of files in this directory.
//
//        Having said that, it's *very* fast, and only runs once per 10,000 or so
//        ($limit) hits.
//


/*
increment a counter()    
from my "file-tools.php", available elsewhere.
                                                                            */
function increment_hit_counter($count_file, $report_only=false, $reset=false) {

    $count = false;

    if (!file_exists($count_file) or $reset) {
        $file_pointer = fopen($count_file, 'wb');
        fwrite ($file_pointer, '0');
        fclose ($file_pointer);
    }

    // now the counter..
    if (file_exists($count_file)) {

        // read in the old score..
        $count = trim(file_get_contents($count_file));
        if ($report_only) { return $count; }
        if (!$count) { $count = 0; }
        $count++;
        
        // write out new score..
        if (is_writable($count_file)) {
            $file_pointer = fopen($count_file, 'wb+');
            $lock = flock($file_pointer, LOCK_EX);
                if ($lock) {
                    fwrite($file_pointer, $count);
                    flock ($file_pointer, LOCK_UN);
                }
                fclose($file_pointer);
                clearstatcache();
        }
    }
    return $count;
}



/*
    Integers To English Words.

    Converts 1145432 into..

    "one million, one hundred and forty five thousand, four hundred and thirty two"

    Fairly groovy. ;o)

    The regular version is in my "text-func.php", with some other stuff.

                            */
function ah_int2eng($number) {

    $output = '';
    if ($number < 1) $number = 1;

    $GLOBALS['anti_hammer']['final_time'] = $number;

    $units = array(' ', 'one ', 'two ', 'three ', 'four ', 'five ', 'six ', 'seven ', 'eight ', 'nine ');
    $teens = array('ten ', 'eleven ', 'twelve ', 'thirteen ', 'fourteen ', 'fifteen ', 'sixteen ', 'seventeen ', 'eighteen ', 'nineteen ');
    $tenners = array('', '', 'twenty ', 'thirty ', 'forty ', 'fifty ', 'sixty ', 'seventy ', 'eighty ', 'ninety ');

    $lint = strlen($number);
    if ($lint > 2) $bigger = true;

    for ($x = $lint ; $x >= 1 ; $x--) {    
    
        $last = substr($output, -5, 4);
        $digit = substr($number, 0, 1);
        $number = substr($number, 1);
    
        if ($x % 3 == 2) {
        
            if ($digit == 1) { // 10-19..
                $digit = substr($number, 0, 1);
                $number = substr($number, 1);
                $x--;
                if ($last == 'sand') { $output .= 'and '; }
                $output .= $teens[$digit];
                
            } else { // 20-99..
            
                if (($last == 'sand') ) { $output .= 'and '; }
                $output .= $tenners[$digit];
            }
        } else {
            if (($x % 3 != 1) and ($digit > 0) and (!empty($output))) { $output .= ', '; }
            $output .= $units[$digit];
        }
        if ((strlen($number) % 3) == 0) {
            $bignum = ah_bignumbers(strlen($number) / 3);
            if (($last == 'dred') and ($bignum != 'thousand')) { $output .= 'and ';}
            $output .= $bignum;
        }
        if ((strlen($number) % 3) == 2 and $digit > 0) {
            $output .= 'hundred and ';
        }
    }
    
    // clean up the output..
    $output = str_replace('  ', ' ', $output); // collapse double spaces into one
    $output = str_replace('red and thou', 'red thou', $output);
    $output = str_replace('red and mill', 'red mill', $output);
    $output = str_replace('lion thousand', 'lion ', $output);
    if (substr($output, -5) == ' and ') { $output = substr($output, 0, -5).' '; }
    
return $output;
}


/*
it just looks better, okay!    */

function ah_bignumbers($test) {
    switch ($test) {
        case 0:
            $test = "";
            break;
        case 1:
            $test = "thousand";
            break;
        case 2:
            $test = "million";
            break;
        case 3:
            $test = "billion"; // <- that's a lot of comments!
            break;
    }
    return $test;
}


/*
    function read_ini()        [from my 'ini-tools.php']

    pull the data from the ini file and return as an array

    Usage: array (string {path to file})

    returns false on failure.
                                */
function read_bots_ini($data_file) {
    $ini_array = array();
    if (is_readable($data_file)) {
        $file = file($data_file);
        foreach($file as $conf) {
            // if first real character isn't '#' or ';' and there is a '=' in the line..
            if ( (substr(trim($conf),0,1) != '#')
                and (substr(trim($conf),0,1) != ';')
                and (substr_count($conf,'=') >= 1) ) {
                $eq = strpos($conf, '=');
                $ini_array[trim(substr($conf,0,$eq))] = trim(substr($conf, $eq + 1));
            }
        }
        unset($file);
        return $ini_array;
    } else {
        $GLOBALS['errors']['read_bots_ini'] = "ini file: $data_file does not exist or is not readable.";
        return false;
    }
}
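

/*
    Sketch only, NOT part of Anti-Hammer 0.9.3: a possible helper for exempting
    your own IP address, as asked in this thread. The function name
    ah_ip_is_exempt() and the $exempt_ips list are hypothetical; nothing in the
    script calls them. One assumed way to use it would be near the top of the
    script, before any hammer counting, e.g..

        if (ah_ip_is_exempt($_SERVER['REMOTE_ADDR'], array('203.0.113.7'))) { return; }

    (in an auto-prepended file, a top-level return hands control to the page).
    203.0.113.7 is a documentation-only address; substitute your own.
                                                                            */
function ah_ip_is_exempt($remote_ip, $exempt_ips) {
    // exact string comparison against each listed address; no wildcards or ranges.
    return in_array(trim($remote_ip), array_map('trim', $exempt_ips), true);
}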


/*

    changes:

    0.9.3

        +    You now have the option to perform a quick DNS lookup of the
            IP Address of bad clients, and have this added to the logging.

            This was already enabled, you now have the option to *disable* it, if required.

        +    Anti-Hammer now sends a valid "Retry-After" header, which is set to the client's current hammer delay + 1 second.

        +    Added a link to the Anti-Hammer page, should lessen the wtf-factor.



    0.9.2

        +    You can now choose whether to allow your specified clients (aka "exemptions") to either completely bypass anti-hammer (current exemption method)..

                $anti_hammer['allow_bots'] = true;

            Or else specify an integer, representing a hammer_time, in
            1/100th Second, which will apply to *only* these clients..

                $anti_hammer['allow_bots'] = 50;

            This setting would enable your specified clients to hammer the site at a rate of two hits-per-second, but no faster.

            Effectively, we now have two hammer rates, one for known good
            clients, and one for everyone else.

    0.9

        +    Good bots & spiders can now be allowed to bypass the hammer. This is achieved through the use of standard spider IP lists, as published here..

                http://www.iplists.com/

            along with a simple ini file, detailing which user-agent links to which IP list. A working ini, and more details, will be included in the preference section (above), as well as the release.


    0.8.*

        +    Anti-Hammer now sends a proper 503 (service temporarily unavailable) message, rather than a 200 OK message. This will be useful in situations where valid bots are temporarily hammering, and is more correct in this scenario. The resource *will* be back, if they cut out the crazy hammering!

            If you are running under cgi/*suexec (non-module), the extra
            required header is automatically sent.

            In use, this causes many bots to back-off immediately. Excellent!


        ~    Improved the ban resetting (which needs to work independently of the Garbage collection mechanism). After the ban time, the client's cut-off is wiped, and their start time set to *now*, just like a new client; however, their hammer_count is set to one hammer below the first trigger level. In other words, a single hammer gets them the NO Hammer! page, and they reach the final page quicker than new clients. Even if you use rolling triggers, Anti-Hammer will still use the first trigger level to calculate this number, so set that to whatever you want.

        ~    ban_times and ban_levels have been renamed to waiting_times and trigger_levels, to avoid confusion with the ban_time (for the new total cut-off). These also make more sense, as they are not bans, simply delays.


    0.7.*

        +    Added rolling ban times. Rather than have set limits which the
            client can cross, this simply increments the ban time with each and every hammer attempt. 1-2-3-4-5-6-7.. etc. cut-off still functions as before for each system (rolling or preset levels).

            This was, in fact, the original system, which I replaced with the level presets early on, but it's kinda fun, and the code is simple.

        ~    Removed the file-tools.php include statement, and put the functions directly into here (slightly renamed). I figure anyone smart enough
            to be including my file-tools, will be smart enough to figure out how to put that back, if required.

        +    More things are configurable, like the page title. Why not!


    0.6.*

        +    Added capability to work with clients who do not accept, or have chosen not to accept (read: disabled) cookies.

            Basically, we write a "fake" session. The fake session uses a
            serialized array in a flat file, just like regular php sessions, and is created before they even receive their first page. From that
            point on, they are known (by Anti-Hammer) by this ID.

            The name of the client's session ID file is the session ID itself; an MD5 of all the known usable client data concatenated together.

        ~    php session usage is still available as an option, if required.

        +    Added Garbage Collection for the fake session files. Both how often this happens (every 'so-many' requests), and how old is considered
            "stale", are configurable.

        +    Ban time is now configurable (in hours). Remember to ensure that Garbage Collection isn't happening before this time.

        +    Added penultimate message for cut-off. You get one *final* warning!

        ~    Cleaned-up the code regarding sessions. We now make a clean break, converting whichever type of session data into a local array, and
            then work with that. At the end, we write the pertinent data back to whichever type of session is being utilized (built-in or php).


    0.5.*

        +    You can now configure a cut-off point. When the number of violations reaches this number, their pages simply die. This is disabled by default. This point is, of course, configurable. (actually, I got called away in the middle of this, so I'll need to check how far I got!)

    0.4.*

        +    Added user preferences for lots of the settings, violation levels, times, etc. Added error checking for these, so they should be fairly foolproof (good movie, by the way, "Foolproof", 2003).

    0.3.*

        +    Added configurable protection skipping for certain file types
            (usually associated files and such). This replaces a nasty hack that lived at the top of the script.

        +    Added skipping for generated images, too (GD images, etc.). This can also be used to skip other types. See the preferences for more
            details.

        +    Added configurable messages. I'll likely put this out eventually, it's kinda useful.

    0.2.*

        +    Added ignored areas, for chat scripts and such. places where either hammering is allowed, or is dealt with by the local script.

*/
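

/*
    Sketch only, NOT the script's actual ID routine (that lives earlier in this
    file): the 0.6.* note above describes the client ID as an MD5 of the usable
    client data concatenated together, with the ID doubling as the file name.
    Assuming that data is the IP address, user agent and accept header, and
    using the hypothetical name ah_sketch_client_id(), it could look like..
                                                                            */
function ah_sketch_client_id($remote_ip, $user_agent, $user_accept, $prefix = 'HammerID_') {
    // md5() returns 32 lowercase hex characters, so the result is safe as a file name.
    return $prefix.md5($remote_ip.$user_agent.$user_accept);
}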

?>
  1. <?php    // Ûž text{ encoding:utf-8; bom:no; linebreaks:unix; tabs:4; } Ûž//
  2. /* direct access -> \/ */                         $anti_hammer_version = '0.9.3';
  3. if (realpath($_SERVER['SCRIPT_FILENAME']) == realpath(__FILE__)) { die(
  4. 'This script is designed to run as a php auto-prepend, like so (in .htaccess)..<br /><br />
  5. <tt>php_value auto_prepend_file "/var/www/vhosts/wafuku.co.uk/httpdocs/anti-hammer.php"</tt>'); }
  6. /*
  7.     Anti-Hammer
  8.     Automatically set temporary bans for web site hammering.
  9.     Protect your valuable server resources for genuine clients.
  10.     Full details here..
  11.         http://corz.org/serv/tools/anti-hammer/
  12.     Have fun!
  13.     ;o) Cor
  14.     Â© 2007-> corz.org
  15. */
  16. /*
  17. prefs.. */
  18. /*
  19.     Anti-Hammer data directory
  20.     [default: $_SERVER['DOCUMENT_ROOT'].'/Anti-Hammer/anti-hammer';]
  21.     When using Anti-Hammer's built-in client-tracking (the default), files will
  22.     be stored, in this directory..
  23.                                                     */
  24. $anti_hammer['info_path'] = $_SERVER['DOCUMENT_ROOT'].'/anti-hammer/anti-hammer';
  25. /*
  26.     Client ID File Prefix            
  27.     
  28.     [default: $anti_hammer['ID_prefix'] = 'HammerID_';]
  29.     This text is placed before the client ID in the ID filename. e.g..
  30.         "HammerID_06fa71c938a108f4a2b1f1ef091653ef"
  31.     You may wish to use a different name..
  32.                                                         */
  33. $anti_hammer['ID_prefix'] = 'HammerID_';
  34. /*
  35.     File Types                
  36.     
  37.     [default: $anti_hammer['types'] = 'php,html';]
  38.     Which file types (extensions) to protect with the anti-hammer?
  39.     We only want to count hits on the main pages, not associated files, css,
  40.     javascript includes, and such (if they are generated by php), as many of
  41.     these will normally be requested within miliseconds of the initial page hit.
  42.     If we run the anti-hammer indiscriminately, such files, would automatically
  43.     count towards hammering, and folk would probably be penalized on their first
  44.     visit. If you don't use php to generate other (non - .php) files, the
  45.     anti-hammer won't be running anyway - it only runs before php scripts, as
  46.     it's designed to protect server resources, not bandwidth; basic requests get
  47.     spat out without any real processing power or memory usage.
  48.     These list items matches the extension of the *actual* physical script file,
  49.     regardless of the requested URI, so for example, these..
  50.         http://mysite.com/index.php
  51.         http://mysite.com/foo.php?page=bar.htm
  52.         http://mysite.com/genny.php?image=img1.jpg
  53.         
  54.     .. would *all* match 'php'. Other extensions are fine, so long as they are
  55.     parsed by php on your setup. Separate entries with commas, and put the whole
  56.     thing in quotes..
  57.     Extensionless files are not supported.
  58.                                                                         */
  59. $anti_hammer['types'] = 'php,html';
  60. /*
  61.     Generated Extensions     
  62.     
  63.     [default: $anti_hammer['gen_types'] = 'jpg,png';]
  64.     This is an list of (usually) image extensions which you serve via php. As
  65.     there may be many of these on a single page, we want to skip these, too.
  66.     These list items match the extension of the *request*, regardless of the
  67.     physical script file generating the output. Links such as..
  68.         http://mysite.com/gen.php?image=foo.jpg
  69.         http://mysite.com/png-pusher.html/foo.jpg
  70.     .. would match "jpg".
  71.     
  72.     Separate entries with commas and put the whole thing in quotes..
  73.                                                                             */
  74. $anti_hammer['gen_types'] = 'jpg,png';
  75. /*
  76.     NOTES:
  77.         The file type *generating* these url's MUST be included in your
  78.         $anti_hammer['types'] array (above), presumably. 'php'.
  79.         You could also use the above preference array to skip other non-
  80.         image generated types, if you have such things onsite.            
  81. */
  82. /*
  83.     Skip certain files and folders..    
  84.     
  85.         aka, basic "Ignore"..
  86.         [default: $anti_hammer['skip'] = '/chat/,/foobar/members,rdf.php,/blog/rss.php';]
  87.         A list of areas/folders and specific files you DON'T want the
  88.         anti-hammer to cover. Enter the full path (from site root) to each
  89.         file/folder.
  90.         You can also skip ALL the instances of "rss.php", etc. on your entire
  91.         site by using only the file name, e.g..
  92.         $anti_hammer['skip'] = 'rdf.php,rss.php';
  93.         This also works for folder. Using the full path enables you to target
  94.         specific files and folders, using only the name gives you blanket
  95.         coverage. Your call.
  96.         Basically, if your string is contained anywhere within the requested
  97.         URI, the script returns control to your page immediately, bypassing
  98.         Anti-Hammer.
  99.         Do put comments *between* entries.
  100. */
  101. $anti_hammer['skip'] = '/chat/,/foobar/members,rdf.php,/blog/rss.php';
  102. /*
  103.     RSS feeds are a good example of a file to skip (assuming they are
  104.     php-generated). Firefox, for example, will often grab all the feeds on a
  105.     page at-once, quickly notching up a user's hammer count.
  106. */
  107. /*
  108.     Hammer Time!
  109.     [default: $anti_hammer['hammer_time'] = 100;]    (One Second)
  110.     If they make two requests within this time, the counter increases by one.
  111.     The faster and more capable your server, the lower this setting can be.
  112.     The higher you set this, the more likely they are to get a warning.
  113.     100 is a reasonable setting for a fast server, enabling one-hit-per-second
  114.     spidering, but penalizing anything faster.
  115.     
  116.     Enter an integer, representing 100th/s..
  117.                                             */
  118. $anti_hammer['hammer_time'] = 90;        
  119. /*
  120.     Trigger levels.        
  121.     
  122.     [default: $anti_hammer['trigger_levels'] = '5,10,20,30';]
  123.     Enter the number of violations that will trigger each of the four levels..
  124.     i.e. At the default settings, they get their first warning after five
  125.     violations (with a ban time of three seconds, set below). The time penalty
  126.     increases after ten and twenty violations, up to the maximum level of 30
  127.     violations (which imposes the maximum ban time of 20 seconds). You can set
  128.     the actual times in the next preference.
  129.     Specify four integer values, separated by commas, whole thing in quotes.
  130.                                                                 */
  131. $anti_hammer['trigger_levels'] = '5,10,20,30';
  132. /*
  133.     Ban Times.        
  134.     
  135.     [default: $anti_hammer['waiting_times'] = '3,5,10,20';]
  136.     This list sets the individual times that offenders will be 'banned' for.
  137.     They will have to wait *this* long before they can try again.
  138.     Each of the four setting corresponds to one of the above trigger_levels.
  139.     Specify four integer values, separated by commas, whole thing in quotes.
  140. */
  141. $anti_hammer['waiting_times'] = '3,5,10,20';
  142. /*
  143.     Rolling Trigger Times
  144.     
  145.     [default: $anti_hammer['rolling_trigger'] = false;]
  146.     This increases the ban time automatically with EACH hammer.
  147.     <hit>
  148.         You must wait three seconds..
  149.     <hit>
  150.         You must wait four seconds..
  151.     <hit>
  152.         You must wait five seconds..
  153.     And so on.
  154. */
  155. $anti_hammer['rolling_trigger'] = false;
  156. /*
  157.     Cut-Off
  158.     [default: $anti_hammer['cut_off'] = '']
  159.     You can also set an absolute cut-off point.
  160.     Anyone receiving this many hammer violations is simply dropped, and from
  161.     that point onward, their pages die before they even begin - blank.
  162.     This works with both preset and rolling triggers.
  163.     Leave blank to disable the cut-off.
  164.                             */
  165. $anti_hammer['cut_off'] = '75';
  166. /*
  167.     Bye Bye! Message.
  168.     [default: $anti_hammer['cut_off_msg'] = '<h1>Bye Now!</h1>';]
  169.     A final word from our sponsor?
  170.     This is the final message they see before it all goes blank.
  171.     No other text is presented.
  172.                                                 */
  173. $anti_hammer['cut_off_msg'] = '<h1>Bye Now!</h1>';
  174. /*
  175.     Ban Time
  176.     [default: $anti_hammer['ban_time'] = '12';]
  177.     And for how many hours will the above cut-off (ban) last?
  178.                                                             */
  179. $anti_hammer['ban_time'] = '12';
  180. //    NOTE:    If you set your Garbage Collection age to any less than this, you
  181. //            effectively reset all bans older than THAT figure.
  182. //
  183. //            In other words, ensure your garbage collection age ('GC_age', below)
  184. //            is larger than your 'ban_time' setting here, probably x2.
  185. //            Think: if GC happened one minute after someone was banned, and their
  186. //            session ID file was >= GC_age, it would be cleaned up! Then no ban!
  187. //
  188. // Also Note: Humans are daily creatures, for them a 12h ban, is effectively 24!
  189. /*
  190.     Log File
  191.     [default: $anti_hammer['log'] = $_SERVER['DOCUMENT_ROOT'].'/log/.ht_hammers';]
  192.     We will log each banned hit, for reference.
  193.     Enter full path to log location..        
  194.     NOTE: If the parent directory does not exist, Anti-Hammer will not attempt
  195.     to create it, and you will get no logging.
  196.                                                                             */
  197. $anti_hammer['log'] = $_SERVER['DOCUMENT_ROOT'].'/log/.ht_hammers';
  198. //             It is recommend you watch this log very carefully for the first
  199. //    NOTE:     few minutes/ days after installation, in case of unexpected side-
  200. //             effects. And in that case, please do mail me about it!
  201. /*
  202.     Kill Message.
  203.     [default: $anti_hammer['kill_msg'] = 'Please do not hammer this site.<br />';]
  204.     When a request is killed - send this message (before the other text).
  205.     You can use any calid HTML in here, header tags, or whatever you like..
  206.                                                                         */
  207. $anti_hammer['kill_msg'] = 'Please do not hammer this site!<br />';
  208. /* NOTE: No <br /> is placed after this text.
  209.          If you aren't using <h> tags, and want a break, add it yourself. */
  210. /*
  211.     Page Title.
  212.     [default: $anti_hammer['page_title'] = 'Please do not hammer this site!';]
  213.     This is what is displayed in the title bar of their browser.
  214.     Keep this one plain text.
  215.                                                             */
  216. $anti_hammer['page_title'] = 'Please do not hammer this site!';
  217. /*
  218.     WebMaster's Name
  219.     [default: $anti_hammer['webmaster'] = 'the webmaster';]
  220.     Name of the webmaster, will be included in the kill page.
  221.     e.g. "If you believe this is in error, please mail <Insert Name> about it!"
  222.                                                             */
  223. $anti_hammer['webmaster'] = 'the webmaster';
  224. /*
  225.     Admin Bypass
  226.     [default: $anti_hammer['admin_agent_string'] = 'MyCrazyUserAgentString';]
  227.     If you insert this exact string into your web browser's user-agent string
  228.     (just tag it onto the end), you can bypass the hammer altogether.
  229.     Very handy for busy webmasters.
  230.                                                                     */
  231. $anti_hammer['admin_agent_string'] = 'MyCrazyUniqueUserAgentString';    
  232. //             It's not advisable to go messing with the main body of your
  233. //    NOTE:    browser's user agent string. Lots of web designers rely on this
  234. //            information to serve you beautiful, functional web pages.
  235. /*
  236.     WebMaster email address (string).
  237.     [default: $anti_hammer['error_mail'] = 'bugs at mydomain dot com';]
  238.     The usual text format of so-and-so at such-and-such dot com works well.
  239.     This is tagged on to the end of the massage inside <> angle brackets,
  240.     to look like an address.
  241.                                                                         */
  242. $anti_hammer['error_mail'] = 'admin@wafuku.co.uk';
  243. /*
  244.     Lookup Failures.
  245.     When an event worth logging occurs, we can lookup the host name of the
  246.     client to add to our logs. This takes a moment, but only occurs while
  247.     logging bad clients, and can be useful in quickly identifying abusers
  248.     (or good bots using bad user agent string - to come)
  249.                                                     */
  250. $anti_hammer['lookup_failures'] = true;
  251. /*
  252.     Allow known bots?
  253.     [default: $anti_hammer['allow_bots'] = false;]
  254.     We can allow certain bots to bypass the Anti-Hammer.
  255.     Do do this, specify the expected user agent strings in..
  256.         path-to/anti-hammer/exemptions/exemptions.ini
  257.         
  258.     and then supply an IP-mask file where said user agent is expected to be
  259.     making requests FROM, one ip per line, in the standard Spider IP list format
  260.     as found here..
  261.         http://www.iplists.com/
  262.         http://www.iplists.com/nw/    <- updated, reorganised, with msnbot+more
  263.         A blog URI is listed there, where list updates are posted.
  264.         (this doesn't happen a lot, maybe 2-3 times a year)
  265.     NOTE:    User agent string matches are CaSe SenSiTivE! If you want to match
  266.             "msnbot" and "MSNBOT", you need two entries. (a case-insensitive
  267.             test is roughly five times slower than case-sensitive; so testing
  268.             two separate entries is much faster)
  269.     NOTE:    If cooking up your own anti-hammer.ini, you probably do not want to
  270.             include the generic user agent strings (e.g. Yahoo's "Mozilla/4.0"),
  271.             which would create a lot of processing overhead, as ALL browsers
  272.             send that. Doh! (More notes within that file.)
  273.     You can set this to "true" (no quotes), in which case, all specified bots
  274.     are simply allowd to bypass the hammer. You can also set it to an integer,
  275.     e.g..
  276.         $anti_hammer['allow_bots'] = 50;
  277.     
  278.     ..that integer representing the hammer_time that will apply to the specified
  279.     clients. "50" would enable 2 hits-per-second spidering, but nothing faster,
  280.     which is half the normal hammer_time of One Second (hammer_time=100).
  281.                                     */
  282. $anti_hammer['allow_bots'] = true;
  283. /*
  284.     The following two preferences control Anti-Hammer's built-in Client session
  285.     Garbage collection routines..
  286. */
  287. /*
  288.     Garbage Collection Limit
  289.     [default: $anti_hammer['GC_limit'] = 10000;]
  290.     To prevent your server's hard drive filling up with stale client sessions,
  291.     we run a periodic garbage collection routine to sweep up the old files.
  292.     How periodically, is up to you. By default, Anti-Hammer will check for
  293.     garbage every 10,000 hits. I'm thinking this would be around a 2-daily hit
  294.     rate for a small site (@ 5000 hits per day).
  295.     Obviously, you can chage this number to anything you like, depending on how
  296.     busy your site is, and how much space you have on the disks.
  297.     
  298.     If you don't want Anti-Hammer to clean up its garbage, set this to 0.
  299.     Remember to ensure that this limit falls well outside your longest ban time,
  300.     probably at least 2x that.
  301.                                 */
  302. $anti_hammer['GC_limit'] = 10000;
  303. /*
  304.     GarbAge!
  305.     [default: $anti_hammer['GC_age'] = 24;]
  306.     How old, in hours, is considered "stale"?
  307.     Any ID files older than this will be swept away (deleted).
  308.                                                                 */
  309. $anti_hammer['GC_age'] = 24;
  310. /*
  311.     NOTE: The previous two preferences have no effect if you set the following
  312.     preference ('use_php_sessions') to true. They are only for AntiHammer's
  313.     built-in client session files.
  314. */
  315. /*
  316.     Use php sessions..
  317.     [default: $anti_hammer['use_php_sessions'] = false;]
  318.     You would think it might be a nice idea to detect if the client has cookies
  319.     enabled, and if so, use php sessions, only falling-back to some other method
  320.     when they have not. However, it is not possible to detect whether or not a
  321.     client has cookies enabled, with a single request. You need Two. Clearly,
  322.     that isn't a lot of use for a protection mechanism designed to operate
  323.     before they have even had one. So you gotta choose, now..
  324.     By default, anti-hammer will use its own session mechanism, writing client-
  325.     unique data to files in a directory of your choosing, irrespective of their
  326.     ability to accept cookies. As it is an independent system, it in no way
  327.     interferes with any session magic you may have running on your site, and in
  328.     most scenarios is just as fast as php's own session handling.
  329.     However, you may wish to use that, instead; particularly if you have
  330.     millions of hits a day, and your web server stores the php sessions in
  331.     some uberfast /tmp space you can't otherwise get to, where the difference
  332.     might be worth it. Or if in-website space is extremely limited. At any rate,
  333.     you have a choice.
  334.     NOTE: if you enable this, you will ALWAYS start a php session with each
  335.     request. This usually presents no problems, but you and your server may know
  336.     better. Testing is always advised! I ran it this way for many months on
  337.     corz.org, with no issues whatsoever, and I use php sessions all over the
  338.     site. If you use proper names in your session, everything should work fine.
  339.     Also NOTE: With this enabled, if the client/spider/script kiddie/etc. has
  340.     cookies disabled in their web browser, they bypass anti-hammer protection!
  341.     This is why, by default, Anti-Hammer uses its own session mechanism.
  342.     There should be no performance concerns; Anti-Hammer writes the data in the
  343.     same way as a php session - it's a simple serialized array in a flat file.
  344. */
  345. $anti_hammer['use_php_sessions'] = false;
  346. /*
  347. :end prefs: */
  348. // let's go..
  349. //
  350. $killpage = false;
  351. $gentime = explode(' ', microtime());
  352. $anti_hammer['now_time'] = $gentime[1].substr($gentime[0], 6, -2);                //     1/100th of a second accuracy!
  353. settype($anti_hammer['now_time'], "double");                                    // scientifically tested!
  354. $anti_hammer['final_time'] = 0;    // will be used to set the retry header on killed page (503)
  355. // Collect all usable client data..
  356. $anti_hammer['remote_ip']        = $_SERVER['REMOTE_ADDR'];
  357. $anti_hammer['user_agent']        = @$_SERVER['HTTP_USER_AGENT'];
  358. $anti_hammer['referrer']        = @$_SERVER['HTTP_REFERER'];
  359. $anti_hammer['request']            = $_SERVER['REQUEST_URI'];
  360. $anti_hammer['user_accept']        = @$_SERVER['HTTP_ACCEPT'];
  361. $anti_hammer['user_charset']    = @$_SERVER['HTTP_ACCEPT_CHARSET'];
  362. $anti_hammer['user_encoding']    = @$_SERVER['HTTP_ACCEPT_ENCODING'];
  363. $anti_hammer['user_language']    = @$_SERVER['HTTP_ACCEPT_LANGUAGE'];
  364. // Admin Bypass..
  365. // Is this the admin user? let's see..
  366. if ($anti_hammer['admin_agent_string'] and stristr((string) $anti_hammer['user_agent'], $anti_hammer['admin_agent_string'])) { // an empty tag must never match
  367.     return;
  368. }
  369. // local server access (for readfile() requests..
  370. // (and as a potential catch-all for user pref errors!))
  371. if ($anti_hammer['remote_ip'] == $_SERVER['SERVER_ADDR']) {
  372.     return;
  373. }/*
  374.      A note about readfile()..
  375.         If you use readfile() to include resources on your pages, remember,
  376.         those requests will come in right after the first, and as they are
  377.         technically brand new hits, they count towards the hammer.
  378.         Use of include() is preferred.
  379.         However, the code right above this notice should prevent any issues. If it does /not/, and include() isn't working, you might want to hack in the actual IP Address of the local server. See my debug-report.zip for a way to easily get this sort of information in your browser.
  380.         
  381.         NOTE: If you are having difficulty include()ing URI resource in your
  382.         pages, remember you need to enable BOTH php allow_url_* flags (this is the .htaccess version of those two switches..)
  383.         php_flag allow_url_fopen on
  384.         php_flag allow_url_include on
  385. */
  386. // skip protection for known bots and spiders..
  387. //
  388. // okay, this is some cute code! simple, but effective.
  389. // we load an ini file of user-agent=ip-mask-file pairs, and check our client's
  390. // user agent string for a match (it must match the beginning of the string
  391. // exactly). If there is a match, we load the associated IP Mask file, and
  392. // run through the IP/masks, again looking for a perfect match at the start of
  393. // the two strings. Commented lines are no problem. We use strpos() for both
  394. // tests, so it's nice and fast, and the IP test covers our comments, too!
  395. //
  396. // having said (coded) all this, you gotta ask yourself, why are they hammering?
  397. // Surely it would be better to get them to slow down, instead!
  398. $IP_file = '';
  399. $anti_hammer['ini_file'] = $anti_hammer['info_path'].'/exemptions/exemptions.ini';
  400. if ($anti_hammer['allow_bots']) {
  401.     $bot_agent_array = read_bots_ini($anti_hammer['ini_file']);
  402.     if (is_array($bot_agent_array)) {
  403.         foreach ($bot_agent_array as $bot_agent_string => $bot_IP_file) {
  404.             if ($bot_agent_string and strpos($anti_hammer['user_agent'], $bot_agent_string) === 0) {
  405.                 $IP_file = $bot_IP_file; // only a matched agent selects an IP list
  406.                 break;
  407.             }
  408.         }
  409.         $ip_array = false; // stays false when no user agent matched
  410.         if ($IP_file) { $ip_array = file($anti_hammer['info_path'].'/exemptions/'.$IP_file); }
  411.         if (is_array($ip_array)) {
  412.             foreach($ip_array as $bot_ip) {
  413.                 if (@strpos($anti_hammer['remote_ip'], trim($bot_ip)) === 0) {
  414.                     if ($anti_hammer['allow_bots'] > 1) {
  415.                         $anti_hammer['hammer_time'] = $anti_hammer['allow_bots'];
  416.                     } else {
  417.                         return;
  418.                     }
  419.                 }
  420.             }
  421.         }
  422.     }
  423. }
  424. // User prefs..
  425. // Get user values into usable arrays, do some error-checking.
  426. // trigger thresholds..
  427. if (!stristr($anti_hammer['trigger_levels'], ',') or (str_word_count($anti_hammer['trigger_levels'], 0, "0123456789") != 4)) {
  428.     $anti_hammer['trigger_levels'] = '5,10,20,30';
  429. }
  430. // A neat way to create a array from numeric prefs..
  431. $anti_hammer['trigger_levels'] = str_word_count($anti_hammer['trigger_levels'], 1, "0123456789");
  432. // Get user penalty times into correct values..
  433. if (!stristr($anti_hammer['waiting_times'], ',') or (str_word_count($anti_hammer['waiting_times'], 0, "0123456789") != 4)) {
  434.     $anti_hammer['waiting_times'] = '3,5,10,20';
  435. }
  436. $anti_hammer['waiting_times'] = str_word_count($anti_hammer['waiting_times'], 1, "0123456789");
  437. // file types to protect..
  438. if (!$anti_hammer['types']) { return; } // no types specified, forget it!
  439. $anti_hammer['types'] = explode(',', $anti_hammer['types']);
  440. // generated types to skip..
  441. $anti_hammer['gen_types'] = explode(',', $anti_hammer['gen_types']);
  442. // ignored locations..
  443. $anti_hammer['skip'] = explode(',', $anti_hammer['skip']);
  444. // run through ignored locations and if matched, return immediately..
  445. //
  446. foreach($anti_hammer['skip'] as $nogo) {
  447.     if (trim($nogo) !== '' and stristr($anti_hammer['request'], trim($nogo))) { return; } // skip blank entries
  448. }
  449. // Anti-Hammer only for php files, not generated css, etc..
  450. //
  451. $ah_type_ok = false;
  452. foreach($anti_hammer['types'] as $ah_type) {
  453.     if (!$ah_type) { continue; }
  454.     if (@end(explode('.', $_SERVER['SCRIPT_FILENAME'])) == trim($ah_type)) { // @ to avoid strict php5 errors
  455.         $ah_type_ok = true;
  456.     }
  457.     //2do.. could make this code more efficient, for those using MANY types.
  458. }
  459. if ($ah_type_ok /* still! */ == false) { return; }
  460. // skip protection for selected generated types..
  461. //
  462. if (in_array(@end(explode('.', $_SERVER['REQUEST_URI'])), $anti_hammer['gen_types'])) { return; }
  463. /*
  464.     okay, let's do it..
  465. */
  466. // read session data..
  467. $session = array();
  468. if ($anti_hammer['use_php_sessions']) {
  469.     // Regular php session..
  470.     session_start();
  471.     $session = $_SESSION['anti_hammer'];
  472. } else {
  473.     // Anti-Hammer's built-in session mechanism..
  474.     // Create a unique Client ID for this client..
  475.     // we simply MD5 all the browser data concatenated together (and blanks are not a problem)..
  476.     $anti_hammer['client_id'] = md5($anti_hammer['user_agent'].
  477.                                     $anti_hammer['user_accept'].
  478.                                     $anti_hammer['user_language'].
  479.                                     $anti_hammer['user_encoding'].
  480.                                     $anti_hammer['user_charset'].
  481.                                     $anti_hammer['remote_ip']);
  482.     $fake_sess_file = $anti_hammer['info_path'].'/'.$anti_hammer['ID_prefix'].$anti_hammer['client_id'];
  483.     if (file_exists($fake_sess_file)) {
  484.         $session = read_fake_session($fake_sess_file);
  485.     }
  486. }
  487. /*
  488.     Useful use of a "cat"..
  489.     It seems to me that I unwittingly created a system whereby the less
  490.     information a client is willing to give, the more likely they are to be
  491.     banned. I say "seems", because we create an md5 of this information, so the
  492.     actual likelihood of colliding session ID's is astronomically low. However,
  493.     I like the *principle* of the thing.
  494. */
  495. // Calculate the Hammer Rate..
  496. //
  497. // How much time since their last request (in 1/100ths of a second)
  498. $hammer_rate = $anti_hammer['now_time'] - @$session['start_time'] + 1;
  499. // Their ban has elapsed (but GC has not swept up their session)..
  500. if ($hammer_rate > ($anti_hammer['ban_time']*60*60*100)) {    // 8640000 = 24 hours (in 100th/second)
  501.     $session['start_time'] = $anti_hammer['now_time'] - 1;
  502.     $hammer_rate = $anti_hammer['hammer_time'];
  503.     $session['hammer'] = $anti_hammer['trigger_levels'][0]-1; // repeat-offenders do not get to start from 0!
  504.     unset($session['cut_off']);
  505.     // do not return here - we still need to write the updated session data.
  506. }
  507. // CUT_OFF has already been set -- BYE NOW!
  508. if ($anti_hammer['cut_off'] and isset($session['cut_off'])) { die(); }
  509. // okay, still here..
  510. // Start with Garbage Collection..
  511. if (!$anti_hammer['use_php_sessions']) {
  512.     CollectGarbage($anti_hammer['info_path'].'/Counter', $anti_hammer['GC_limit']);
  513. }
  514. // Anti-Hammer Protection has been activated!
  515. if ($hammer_rate < $anti_hammer['hammer_time']) {
  516.     $retry_str = 'a few ';
  517.     @$session['hammer'] += 1;
  518.     if ($session['hammer'] > ($anti_hammer['trigger_levels'][0]-1)) {
  519.         // cut-off..
  520.         if ($anti_hammer['cut_off'] and $session['hammer'] > $anti_hammer['cut_off']) {
  521.             $anti_hammer['kill_msg'] = $anti_hammer['cut_off_msg'];
  522.             $session['cut_off'] = true;
  523.         }
  524.         if ($anti_hammer['cut_off'] and $session['hammer'] == $anti_hammer['cut_off']) {
  525.             $anti_hammer['kill_msg'] = '<h1>THIS IS YOUR LAST WARNING!</h1>'.$anti_hammer['kill_msg'];
  526.         }
  527.         // rolling ban time, increments with each hammer..
  528.         if ($anti_hammer['rolling_trigger']) {
  529.             $session['start_time'] = $anti_hammer['now_time'] + (($session['hammer']*100)-1);
  530.             $retry_str = ah_int2eng($session['hammer']);
  531.         } else {
  532.             // predefined ban levels.. these are more effective, as they shock the user with increasing jumps!
  533.             if (($session['hammer'] > $anti_hammer['trigger_levels'][0]) and ($session['hammer'] <= $anti_hammer['trigger_levels'][1])) {
  534.                 // we simply nudge their start time forward by *this* many seconds (into the future!)..
  535.                 $session['start_time'] = $anti_hammer['now_time'] + (($anti_hammer['waiting_times'][0]*100)-1); // 299 = Three second penalty.
  536.                 $retry_str = ah_int2eng($anti_hammer['waiting_times'][0]);
  537.             } elseif (($session['hammer'] > $anti_hammer['trigger_levels'][1]) and ($session['hammer'] <= $anti_hammer['trigger_levels'][2])) {
  538.                 $session['start_time'] = $anti_hammer['now_time'] + (($anti_hammer['waiting_times'][1]*100)-1); // Five second penalty! (by default)
  539.                 $retry_str = ah_int2eng($anti_hammer['waiting_times'][1]);
  540.             } elseif (($session['hammer'] >= $anti_hammer['trigger_levels'][2]) and ($session['hammer'] <= $anti_hammer['trigger_levels'][3])) {
  541.                 $session['start_time'] = $anti_hammer['now_time'] + (($anti_hammer['waiting_times'][2]*100)-1); // Ten second penalty! (etc.)
  542.                 $retry_str = ah_int2eng($anti_hammer['waiting_times'][2]);
  543.             } elseif ($session['hammer'] >= $anti_hammer['trigger_levels'][3]) {
  544.                 $session['start_time'] = $anti_hammer['now_time'] + (($anti_hammer['waiting_times'][3]*100)-1); // Twenty second penalty!
  545.                 $retry_str = ah_int2eng($anti_hammer['waiting_times'][3]);
  546.             }
  547.         }
  548.         $killpage = true;
  549.     }
  550. } else {
  551.     $session['start_time'] = $anti_hammer['now_time'];
  552. }
  553. // write client session data..
  554. SetHammer();
  555. if ($killpage) {
  556.     $km = '<!DOCTYPE HTML SYSTEM><html><head><title>'.$anti_hammer['page_title'].'</title></head><body>'.$anti_hammer['kill_msg'];
  557.     if (!isset($session['cut_off'])) {
  558.         $km .= '
  559.         You must wait '.$retry_str.'seconds before trying again.<br />
  560.         <br />
  561.         If you believe this is in error, please mail '.$anti_hammer['webmaster'].' about it!<br />
  562.         &lt;'.$anti_hammer['error_mail'].'&gt;<br />
  563.         <span style="font-size:x-small;position:fixed;bottom:1em;right:1em;"><a title="Automatically ban web site hammers! Protect your valuable server resources for *genuine* clients"
  564.         id="link-Get-Anti-Hammer" href="http://corz.org/serv/tools/anti-hammer/">Get Anti-Hammer protection for your own site!</a></span></body></html>';
  565.     }
  566.     kill_page($km);
  567. }
  568. if (function_exists('debug')) { debug('out'); } //:debug:
  569. //2do..
  570. // include auto-ban.php ? hmm.
  571. /*
  572.     fin
  573.             */
  574. // You're outta here!
  575. function kill_page($msg) {
  576. global $anti_hammer;
  577.     $r_host = '';
  578.     if ($anti_hammer['lookup_failures']) {
  579.         $r_host = gethostbyaddr($anti_hammer['remote_ip']).' ';
  580.     }
  581.     if (file_exists(dirname($anti_hammer['log']))) {
  582.         $this_hit = ''
  583.         ."page:  "."\t".$anti_hammer['request']."\n"
  584.         ."time:  "."\t".date('Y.m.d h:i:s A')."\t".'ID: '.$anti_hammer['client_id']."\t"."x ".$GLOBALS['session']['hammer']."\n"
  585.         ."visitor:"."\t".$r_host.'['.$anti_hammer['remote_ip'].']'."\t"."(".$anti_hammer['user_agent'].")"."\n"
  586.         ."accepts:"."\t".$anti_hammer['user_accept']."\n"
  587.         ."referer:"."\t".$anti_hammer['referrer']."\n"
  588.         ;
  589.         add_data($anti_hammer['log'], $this_hit."\n");
  590.     }
  591.     header('Content-Type: text/html; charset=utf-8');    // Old IE probably still won't play ball, though.
  592.     header('HTTP/1.1 503 Service Temporarily Unavailable');
  593.     // For CGI/*suexec use..
  594.     if (substr(php_sapi_name(), 0, 3) == 'cgi') { header('Status: 503 Service Temporarily Unavailable'); }
  595.     header('Retry-After: '.($anti_hammer['final_time']+1)); // the calculation needs to be enclosed in parentheses to work.
  596.     die($msg);
  597. }
  598. // write the updated hammer info to the fake/session file..
  599. function SetHammer() {
  600.     if ($GLOBALS['anti_hammer']['use_php_sessions']) {
  601.         $_SESSION['anti_hammer']['start_time'] = $GLOBALS['session']['start_time'];
  602.         $_SESSION['anti_hammer']['hammer'] = $GLOBALS['session']['hammer'];
  603.         $_SESSION['anti_hammer']['cut_off'] = $GLOBALS['session']['cut_off'];
  604.     } else {
  605.         write_fake_session($GLOBALS['fake_sess_file'], $GLOBALS['session']);
  606.     }
  607. }
  608. /*
  609.     Append data to a file.
  610.     Pass true as the 3rd parameter to wipe the file.
  611.                                             */
  612. function add_data($file, $data, $wipe=false) {
  613.     // if it's not there, try to create it..
  614.     if (!file_exists($file)) $fp = fopen($file, 'wb');
  615.     
  616.     $flag = 'ab';
  617.     if ($wipe) { $flag = 'wb'; }
  618.     if (is_writable($file)) {
  619.         $fp = fopen($file, $flag);
  620.         $lock = flock($fp, LOCK_EX);
  621.         if ($lock) {
  622.             fwrite($fp, $data);
  623.             flock ($fp, LOCK_UN);
  624.         } else {
  625.             $GLOBALS['errors']['add_data'] = "couldn't lock $file";
  626.         }
  627.         fclose($fp);
  628.     } else {
  629.         $GLOBALS['errors']['add_data'] = "can't write to $file";
  630.     }
  631. }
  632. // read serialized array data from a file, and return as an array..
  633. function read_fake_session($no_cookie_file) {
  634.     if (file_exists($no_cookie_file)) {
  635.         $file_handle = fopen($no_cookie_file, 'rb');
  636.         $file_contents = @fread($file_handle, filesize($no_cookie_file));
  637.         fclose($file_handle);
  638.     } else { return false; }
  639.     $file_contents = unserialize($file_contents);
  640.     if (is_array($file_contents)) {
  641.         return $file_contents;
  642.     }
  643. }
  644. // serialize an array and write the string data to a file..
  645. function write_fake_session($no_cookie_file, $array) {
  646.     $data = serialize($array);
  647.     if (empty($data)) { return; }
  648.     $fp = @fopen($no_cookie_file, 'wb');
  649.     if ($fp) {
  650.         $lock = flock($fp, LOCK_EX);
  651.         if ($lock) {
  652.             fwrite($fp, $data);
  653.             flock ($fp, LOCK_UN);
  654.         }
  655.         fclose($fp);
  656.         clearstatcache();
  657.         return (1);
  658.     }
  659. }
  660. /*
  661.     CollectGarbage
  662.     You could transplant this into another web app fairly easily.
  663.     Useful.
  664. */
  665. function CollectGarbage($count_file, $limit) {
  666.     if ($limit === 0) { return; }
  667.     if (increment_hit_counter($count_file) >= $limit) {
  668.         $file_list = array();
  669.         if ($the_dir = @opendir(dirname($count_file))) {
  670.             while (false !== ($file = readdir($the_dir))) { // strict test, so a file named "0" does not end the loop
  671.                 if ((ord($file) != 46) and strpos($file, $GLOBALS['anti_hammer']['ID_prefix']) === 0) {
  672.                     $file_path = dirname($count_file).'/'.$file;
  673.                     if (filemtime($file_path) < (time() - $GLOBALS['anti_hammer']['GC_age']*60*60)) {
  674.                         unlink($file_path);
  675.                     }
  676.                 }
  677.             }
  678.         }
  679.         increment_hit_counter($count_file, 0, 1); // reset the counter
  680.     }
  681. }//2do..
  682. //        Run this in another thread? Or maybe a simple http request, perhaps
  683. //        with $_GET, to flip Anti-Hammer to GC mode in the Background - this task
  684. //        could be done after the request is already sent, even simultaneously;
  685. //        there may be a *lot* of files in this directory.
  686. //
  687. //        Having said that, it's *very* fast, and only runs once per 10,000 or so
  688. //        ($limit) hits.
  689. //
  690. /*
  691. increment a counter()    
  692. from my "file-tools.php", available elsewhere.
  693.                                                                             */
  694. function increment_hit_counter($count_file, $report_only=false, $reset=false) {
  695.     $count = false;
  696.     if (!file_exists($count_file) or $reset) {
  697.         $file_pointer = fopen($count_file, 'wb');
  698.         fwrite ($file_pointer, '0');
  699.         fclose ($file_pointer);
  700.     }
  701.     // now the counter..
  702.     if (file_exists($count_file)) {
  703.         // read in the old score..
  704.         $count = trim(file_get_contents($count_file));
  705.         if ($report_only) { return $count; }
  706.         if (!$count) { $count = 0; }
  707.         $count++;
  708.         
  709.         // write out new score..
  710.         if (is_writable($count_file)) {
  711.             $file_pointer = fopen($count_file, 'wb+');
  712.             $lock = flock($file_pointer, LOCK_EX);
  713.                 if ($lock) {
  714.                     fwrite($file_pointer, $count);
  715.                     flock ($file_pointer, LOCK_UN);
  716.                 }
  717.                 fclose($file_pointer);
  718.                 clearstatcache();
  719.         }
  720.     }
  721.     return $count;
  722. }
  723. /*
  724.     Integers To English Words.
  725.     Converts 1145432 into..
  726.     "one million, one hundred and forty five thousand, four hundred and thirty two"
  727.     Fairly groovy. ;o)
  728.     The regular version is in my "text-func.php", with some other stuff.
  729.                             */
  730. function ah_int2eng($number) {
  731.     $output = '';
  732.     if ($number < 1) $number = 1;
  733.     $GLOBALS['anti_hammer']['final_time'] = $number;
  734.     $units = array(' ', 'one ', 'two ', 'three ', 'four ', 'five ', 'six ', 'seven ', 'eight ', 'nine ');
  735.     $teens = array('ten ', 'eleven ', 'twelve ', 'thirteen ', 'fourteen ', 'fifteen ', 'sixteen ', 'seventeen ', 'eighteen ', 'nineteen ');
  736.     $tenners = array('', '', 'twenty ', 'thirty ', 'forty ', 'fifty ', 'sixty ', 'seventy ', 'eighty ', 'ninety ');
  737.     $lint = strlen($number);
  738.     if ($lint > 2) $bigger = true;
  739.     for ($x = $lint ; $x >= 1 ; $x--) {    
  740.     
  741.         $last = substr($output, -5, 4);
  742.         $digit = substr($number, 0, 1);
  743.         $number = substr($number, 1);
  744.     
  745.         if ($x % 3 == 2) {
  746.         
  747.             if ($digit == 1) { // 10-19..
  748.                 $digit = substr($number, 0, 1);
  749.                 $number = substr($number, 1);
  750.                 $x--;
  751.                 if ($last == 'sand') { $output .= 'and '; }
  752.                 $output .= $teens[$digit];
  753.                 
  754.             } else { // 20-99..
  755.             
  756.                 if (($last == 'sand') ) { $output .= 'and '; }
  757.                 $output .= $tenners[$digit];
  758.             }
  759.         } else {
  760.             if (($x % 3 != 1) and ($digit > 0) and (!empty($output))) { $output .= ', '; }
  761.             $output .= $units[$digit];
  762.         }
  763.         if ((strlen($number) % 3) == 0) {
  764.             $bignum = ah_bignumbers(strlen($number) / 3);
  765.             if (($last == 'dred') and ($bignum != 'thousand')) { $output .= 'and ';}
  766.             $output .= $bignum;
  767.         }
  768.         if ((strlen($number) % 3) == 2 and $digit > 0) {
  769.             $output .= 'hundred and ';
  770.         }
  771.     }
  772.     
  773.     // clean up the output..
  774.     $output = str_replace('  ', ' ', $output);
  775.     $output = str_replace('red and thou', 'red thou', $output);
  776.     $output = str_replace('red and mill', 'red mill', $output);
  777.     $output = str_replace('lion thousand', 'lion ', $output);
  778.     if (substr($output, -5) == ' and ') { $output = substr($output, 0, -5).' '; }
  779.     
  780. return $output;
  781. }
  782. /*
  783. it just looks better, okay!    */
  784.     function ah_bignumbers($test) {
  785.         switch ($test) {
  786.             case 0:
  787.             $test = "";
  788.             break;
  789.             case 1:
  790.             $test = "thousand";
  791.             break;
  792.             case 2:
  793.             $test = "million";
  794.             break;
  795.             case 3:
  796.             $test = "billion"; // <- that's a lot of comments!
  797.             break;
  798.         }
  799.     return $test;
  800. }
  801. /*
  802.     function read_ini()        [from my 'ini-tools.php']
  803.     pull the data from the ini file and return as an array
  804.     Usage: array (string {path to file})
  805.     returns false on failure.
  806.                                 */
  807. function read_bots_ini($data_file) {
  808.     $ini_array = array();
  809.     if (is_readable($data_file)) {
  810.         $file = file($data_file);
  811.         foreach($file as $conf) {
  812.             // if first real character isn't '#' or ';' and there is a '=' in the line..
  813.             if ( (substr(trim($conf),0,1) != '#')
  814.                 and (substr(trim($conf),0,1) != ';')
  815.                 and (substr_count($conf,'=') >= 1) ) {
  816.                 $eq = strpos($conf, '=');
  817.                 $ini_array[trim(substr($conf,0,$eq))] = trim(substr($conf, $eq + 1));
  818.             }
  819.         }
  820.         unset($file);
  821.         return $ini_array;
  822.     } else {
  823.         $GLOBALS['errors']['read_bots_ini'] = "ini file: $data_file does not exist.";
  824.         return false;
  825.     }
  826. }
  827. /*
  828.     changes:
  829.     0.9.3
  830.         +    You now have the option to perform a quick DNS lookup of the
  831.             IP Address of bad clients, and have this added to the logging.
  832.             This was already enabled, you now have the option to *disable* it, if required.
  833.         +    Anti-Hammer now sends a valid "Retry-After" header, which is set to the client's current hammer delay + 1 second.
  834.         +    Added a link to the Anti-Hammer page, should lessen the wtf-factor.
  835.     0.9.2
  836.         +    You can now choose whether to allow your specified clients (aka "exemptions") to either completely bypass anti-hammer (current exemption method)..
  837.                 $anti_hammer['allow_bots'] = true;
  838.             Or else specify an integer, representing a hammer_time, in
  839.             1/100th Second, which will apply to *only* these clients..
  840.                 $anti_hammer['allow_bots'] = 50;
  841.             This setting would enable your specified clients to hammer the site at a rate of two hits-per-second, but no faster.
  842.             Effectively, we now have two hammer rates, one for known good
  843.             clients, and one for everyone else.
  844.     0.9
  845.         +    Good bots & spiders can now be allowed to bypass the hammer. This is achieved through the use of standard spider IP lists, as published here..
  846.                 http://www.iplists.com/
  847.             along with a simple ini file, detailing which user-agent links to which IP list. A working ini, and more details, will be included in the preference section (above), as well as the release.
  848.     0.8.*
  849.         +    Anti-Hammer now sends a proper 503 (service temporarily unavailable) message, rather than a 200 OK message. This will be useful in situations where valid bots are temporarily hammering, and is more correct in this scenario. The resource *will* be back, if they cut out the crazy hammering!
  850.             If you are running under cgi/*suexec (non-module), the extra
  851.             required header is automatically sent.
  852.             In use, this causes many bots to back-off immediately. Excellent!
  853.         ~    Improved the ban resetting (which needs to work independently of the Garbage collection mechanism). After the ban time, the client's cut-off is wiped, and their start time set to *now*, just like a new client, however, their hammer_count is set to one hammer below the first trigger level. In other words, a single hammer gets them the NO Hammer! page; and to the final page quicker than new clients. Even if you use rolling triggers, Anti-Hammer will still use the first ban level to calculate this number, so set that to whatever you want.
  854.         ~    ban_times and ban_levels have been renamed to waiting_times and trigger_levels, to avoid confusion with the ban_time (for the new total cut-off). These also make more sense, as they are not bans, simply delays.
  855.     0.7.*
  856.         +    Added rolling ban times. Rather than have set limits which the
  857.             client can cross, this simply increments the ban time with each and every hammer attempt. 1-2-3-4-5-6-7.. etc. cut-off still functions as before for each system (rolling or preset levels).
  858.             This was, in fact, the original system, which I replaced with the level presets early on, but it's kinda fun, and the code is simple.
  859.         ~    Removed the file-tools.php include statement, and put the functions directly into here (slightly renamed). I figure anyone smart enough
  860.             to be including my file-tools, will be smart enough to figure out how to put that back, if required.
  861.         +    More things are configurable, like the page title. Why not!
  862.     0.6.*
  863.         +    Added capability to work with clients who do not accept, or have chosen not to accept (read: disabled) cookies.
  864.             Basically, we write a "fake" session. The fake session uses a
  865.             serialized array in a flat file, just like regular php sessions, and is created before they even receive their first page. From that
  866.             point on, they are known (by Anti-Hammer) by this ID.
  867.             The name of the client's session ID file is the session ID itself; an MD5 of all the known usable client data concatenated together.
  868.         ~    php session usage is still available as an option, if required.
  869.         +    Added Garbage Collection for the fake session files. Both how often this happens (every 'so-many' requests), and how old is considered
  870.             "stale", are configurable.
  871.         +    Ban time is now configurable (in hours). Remember to ensure that Garbage Collection isn't happening before this time.
  872.         +    Added penultimate message for cut-off. You get one *final* warning!
  873.         ~    Cleaned-up the code regarding sessions. We now make a clean break, converting whichever type of session data into a local array, and
  874.             then work with that. At the end, we write the pertinent data back to whichever type of session is being utilized (built-in or php).
  875.     0.5.*
  876.         +    You can now configure a cut-off point. When the number of violations reaches this number, their pages simply die. This is disabled by default. This point is, of course, configurable. (actually, I got called away in the middle of this, so I'll need to check how far I got!)
  877.     0.4.*
  878.         +    Added user preferences for lots of the settings, violation levels, times, etc. Added error checking for these, so they should be fairly foolproof (good movie, by the way, "Foolproof", 2003).
  879.     0.3.*
  880.         +    Added configurable protection skipping for certain file types
  881.             (usually associated files and such). This replaces a nasty hack that lived at the top of the script.
  882.         +    Added skipping for generated images, too (GD images, etc.). This can also be used to skip other types. See the preferences for more
  883.             details.
  884.         +    Added configurable messages. I'll likely put this out eventually, it's kinda useful.
  885.     0.2.*
  886.         +    Added ignored areas, for chat scripts and such. places where either hammering is allowed, or is dealt with by the local script.
  887. */
  888. ?>


and the exemptions.ini file has this...
Code: [ Select ]
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# Anti-Hammer Exemptions..
# These spiders, at their correct IP Addresses, are exempt from Anti-Hammer.
#
# How-To..
#
# On the left, is the expected User Agent string (can be a partial match).
# On the right, is the file to look at for IP Mask information, one IP per line.
#
# Basically, you enter only as much of the string as is required to make a match.
# Entries are Case Sensitive. If you want to match "msnbot" and "MSNBOT", you
# need two entries. Entries are read in order, and processing halts as soon as a
# match is found. See "Scooter" entries, below, for an example.
#

[exemptions]

# The one and only Google..
Mozilla/5.0 (compatible; Googlebot=google.txt
Googlebot/Test=google.txt
Googlebot=google.txt
Mediapartners-Google=google.txt
AdsBot-Google=google.txt
gsa-crawler (Enterprise; S4-E9LJ2B82FJJAA=google.txt
SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; Googlebot-Mobile=google.txt

# Microsoft..
msnbot=msn.txt
# the test is Case Sensitive, so we need two entries for this bot..
MSNBOT=msn.txt
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT; MS Search=msn.txt
lanshanbot/1.0 ( http://search.msn=msn.txt

# Alta Vista .. put this before the Yahoo! version, so it's picked up first..
Scooter/3.3Y!CrawlX=altavista.txt

# Yahoo!/Inktomi..
Fast Crawler v X=inktomi.txt
Scooter=inktomi.txt
Y!J-BSC=inktomi.txt
Yahoo=inktomi.txt
slurp=inktomi.txt
Mozilla/5.0 (compatible; Yahoo! Slurp=inktomi.txt
Mozilla/5.0 (Slurp=inktomi.txt

# Excite..
Excite=excite.txt
Infoseek=infoseek.txt

# Lycos..
Lycos_Spider=lycos.txt

# NorthernLight..
NorthernLight=northernlight.txt

# Ask/Jeeves
Mozilla/2.0 (compatible; Ask=askjeeves.txt
Mozilla/5.0 (compatible; Ask=askjeeves.txt
teoma_agent1=askjeeves.txt

# Miscellaneous non-bots..
Jigsaw/2.2.5 W3C=custom.txt


and the exemptions folder contains several text files: one for Google, one for Excite, one for Lycos, etc.

Ideally, I just want to exclude my own IP. If no one can help with that, please tell me how to use the admin_agent_string setting in the anti-hammer.php file to exempt Firefox 5.0.0.4183.


Many thanks,
Cerio
  • cerio
  • Proficient
  • Posts: 263
  • Loc: UK

Post 3+ Months Ago

I found this section, which apparently shows how to make it skip a folder, but it didn't work: either I did it incorrectly or this isn't the way to do it. Any idea whether it should have worked and, if so, what I did wrong?

Code: [ Select ]
/*
    Skip certain files and folders..    
    
        aka, basic "Ignore"..

        [default: $anti_hammer['skip'] = '/chat/,/foobar/members,rdf.php,/blog/rss.php';]

        A list of areas/folders and specific files you DON'T want the
        anti-hammer to cover. Enter the full path (from site root) to each
        file/folder.

        You can also skip ALL the instances of "rss.php", etc. on your entire
        site by using only the file name, e.g..

        $anti_hammer['skip'] = 'rdf.php,rss.php';

        This also works for folders. Using the full path enables you to target
        specific files and folders, using only the name gives you blanket
        coverage. Your call.

        Basically, if your string is contained anywhere within the requested
        URI, the script returns control to your page immediately, bypassing
        Anti-Hammer.

        Do put comments *between* entries.

*/
$anti_hammer['skip'] = '/chat/,/foobar/members,rdf.php,/blog/rss.php';

/*
    RSS feeds are a good example of a file to skip (assuming they are
    php-generated). Firefox, for example, will often grab all the feeds on a
    page at-once, quickly notching up a user's hammer count.

*/


I added
Code: [ Select ]
$anti_hammer['skip'] = '/var/www/vhosts/wafuku.co.uk/httpdocs/admin';
(/var/www/vhosts/wafuku.co.uk/httpdocs/admin is the full path to the site's admin files) after this line
Code: [ Select ]
$anti_hammer['skip'] = '/chat/,/foobar/members,rdf.php,/blog/rss.php';

but, as I said, it didn't work.
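
One possible explanation, going by the comment block quoted above: the skip strings are tested against the requested URI, so a filesystem path such as /var/www/vhosts/wafuku.co.uk/httpdocs/admin would never appear in it. A second assignment to $anti_hammer['skip'] also replaces the first rather than adding to it. Below is a minimal sketch of that substring test. It is not Anti-Hammer's own code: should_skip() is a made-up helper, and "/admin/" as the URL path of the admin area is an assumption.

```php
<?php
// Hypothetical helper, not part of anti-hammer.php: returns true when any
// comma-separated skip entry occurs anywhere in the requested URI.
function should_skip($skip_list, $request_uri)
{
    foreach (explode(',', $skip_list) as $entry) {
        $entry = trim($entry);
        // strpos() can return 0 for a match at the start, so compare with !== false.
        if ($entry !== '' && strpos($request_uri, $entry) !== false) {
            return true;
        }
    }
    return false;
}

// A single assignment holding every entry; "/admin/" is an assumed URL path.
$anti_hammer['skip'] = '/chat/,/foobar/members,rdf.php,/blog/rss.php,/admin/';

// A URL path matches..
var_dump(should_skip($anti_hammer['skip'], '/admin/orders.php')); // bool(true)

// ..but a filesystem path never appears in the URI.
var_dump(should_skip('/var/www/vhosts/wafuku.co.uk/httpdocs/admin', '/admin/orders.php')); // bool(false)
```

If that reading is right, adding the URL path of the admin folder to the existing skip line, rather than adding a second line with the server path, would be the thing to try.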

Cerio
  • WritingBadCode
  • Graduate
  • Posts: 214
  • Loc: Sweden

Post 3+ Months Ago

I have not tested this script and don't intend to, but would it perhaps be possible to just wrap the whole thing in a simple if statement?


eg: if((NOT)(your_ip/admin_session/admin_cookie/whatever)) { do anti hammer code }
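
In PHP that wrap could look something like the sketch below. It is untested against Anti-Hammer itself. The address is a documentation placeholder, and $admin_ips and is_exempt_ip() are made-up names, not settings or functions from anti-hammer.php. It also assumes the visitor's address arrives in $_SERVER['REMOTE_ADDR'], which may not hold if the site sits behind a proxy.

```php
<?php
// Hypothetical names throughout; none of this is part of anti-hammer.php.
// 203.0.113.45 is a documentation placeholder; substitute your own address.
$admin_ips = array('203.0.113.45');

// True when the given address is on the exempt list (strict comparison).
function is_exempt_ip($remote_addr, array $exempt_ips)
{
    return in_array($remote_addr, $exempt_ips, true);
}

// REMOTE_ADDR is not set when run from the command line, hence the fallback.
$remote_addr = isset($_SERVER['REMOTE_ADDR']) ? $_SERVER['REMOTE_ADDR'] : '';

if (!is_exempt_ip($remote_addr, $admin_ips)) {
    // ...the existing Anti-Hammer code would go here...
}
```

This only helps if your IP is static; with a dynamic address the list would need updating each time it changes.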

cerio wrote:
...though I don't actually know how to 'tag a unique string onto the end of your browser's User Agent string'.



As for the browser's user agent string, it's easily done in Firefox: type about:config in the address field, add a new string preference, and name it general.useragent.override

Then simply enter the user agent string you want. Note that some pages use the user agent string to serve different layouts, so you may end up with some sites looking weird. It's a good thing to do for a number of reasons, though, one being security: a browser usually tells sites quite a lot about you, and this hides some of it.
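
To show how the server side of that might work, here is a rough sketch of checking for a token in the User-Agent header. It is not taken from anti-hammer.php: is_admin_agent() is a made-up helper, the token is invented, $admin_agent_string is only assumed to correspond to the "admin_agent_string" preference, and whether Anti-Hammer does a "contains" or an "ends with" test is a guess. Since the override replaces the whole user agent string, the example value is a typical Firefox 5 string with the token tagged on the end.

```php
<?php
// Assumed to mirror the "admin_agent_string" preference; the token is invented.
$admin_agent_string = 'my-secret-token-7391';

// Hypothetical helper: true when the User-Agent header contains the token.
function is_admin_agent($user_agent, $token)
{
    return $token !== '' && strpos($user_agent, $token) !== false;
}

// Example value as it might be entered in general.useragent.override.
$ua = 'Mozilla/5.0 (Windows NT 6.1; rv:5.0) Gecko/20100101 Firefox/5.0 my-secret-token-7391';

var_dump(is_admin_agent($ua, $admin_agent_string)); // bool(true)
```

Whatever token you choose would have to be the same in both places: in the browser override and in the script's preference.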

Post Information

  • Total Posts in this topic: 3 posts

© 1998-2014. Ozzu® is a registered trademark of Unmelted, LLC.