Caching with PHP and mod_rewrite

  • joebert
  • Fart Bubbles
  • Genius
  • Posts: 13503
  • Loc: Florida

Post 3+ Months Ago

Someone posted a thread months ago that turned into a caching thread, and I replied with what was apparently a confusing diagram that nobody understood.

Well, I can't find that thread right now because I really have no idea what I'm searching for, but I'm back with more confusing stuff that nobody is likely to understand, just the way I like it! :D

Anywho, this involves caching dynamically generated HTML pages created by PHP and having Apache skip PHP altogether when possible.

This basically takes advantage of mod_rewrite's RewriteCond directive and the "-f" test available to it, which checks whether the given path is an existing file on the filesystem.

My DocumentRoot looks like this

Code: [ Select ]
.
..
cache/
index.php

and my URIs look like this

Code: [ Select ]
/word-rewritemarker.html
/another-word-rewritemarker.html

My rewrites look like this

Code: [ Select ]
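## cache: if a file matching the request exists under cache/, serve it and stop
RewriteCond /full_path_to/public_html/cache%{REQUEST_URI} -f
RewriteRule \.html$ /cache%{REQUEST_URI} [L]
 
## category and page links: everything before "-rewritemarker.html" becomes $1
RewriteRule ^([a-z0-9\-]+)-rewritemarker\.html$ index.php?words=$1 [L]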

And my PHP looks like this

Code: [ Select ]
// $page is generated using a template class which includes a __toString method
$page = '<html> ... </html>';
 
if(isset($cache_filename))
{
    // $cache_filename is generally generated from keywords and a $cache_path
    // I save it to disk, then read the file contents back out to the visitor.
    // this could probably be more efficient, but it works for now
    file_put_contents($cache_filename, $page);
    unset($page);
    echo file_get_contents($cache_filename);
}
else
{
    echo $page;
}

Basically, here's how this all works. When a request gets to the server for, say, "/words-rewritemarker.html", Apache checks the cache folder for a file with that name. If such a file exists, Apache returns that file's contents under the current URL instead of redirecting to the cache folder.
If such a file doesn't exist, mod_rewrite moves on to the usual SEO-friendly RewriteRule that sends the request to "index.php", where a page is dynamically generated like any normal PHP page.
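One thing to watch with this flow: the "-f" test can succeed while PHP is still writing the file, so a visitor could be handed a half-written page. Here's a minimal sketch of one way around that, assuming a write-to-temp-file-then-rename approach (that's a swapped-in technique, not what the code above does). The function names cache_filename_for() and write_cache_atomically() are made up for illustration, and the "-rewritemarker.html" suffix just mirrors the URIs shown earlier.

```php
<?php
// Hypothetical helper: map the "words" query value to a file inside $cache_dir.
// Returns null when the value contains anything other than a-z, 0-9 and "-",
// which also keeps "../" style values out of the cache directory.
function cache_filename_for(string $cache_dir, string $words): ?string
{
    if (!preg_match('/\A[a-z0-9-]+\z/', $words)) {
        return null;
    }
    return rtrim($cache_dir, '/') . '/' . $words . '-rewritemarker.html';
}

// Hypothetical helper: write to a temp file in the same directory, then rename.
// On POSIX filesystems rename() swaps the file into place in one step, so the
// "-f" check should never see a partially written page.
function write_cache_atomically(string $cache_filename, string $page): bool
{
    $tmp = tempnam(dirname($cache_filename), 'tmp');
    if ($tmp === false) {
        return false;
    }
    // tempnam() creates the file readable by its owner only; open it up so
    // the web server can read it (assumes 0644 suits your setup).
    chmod($tmp, 0644);
    if (file_put_contents($tmp, $page) === false || !rename($tmp, $cache_filename)) {
        @unlink($tmp);
        return false;
    }
    return true;
}
```

In the PHP above, write_cache_atomically($cache_filename, (string) $page) would take the place of the file_put_contents() call.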

The application is smart enough to determine whether or not it should cache the request, or just display the generated page.
One of the reasons this works so well is the template class I use. There is no HTML in my index.php. I load HTML files with my template class, and the class remains an object that retains references to arrays and other objects until the template object is converted to a string by echoing $page or anything along those lines. The template class basically just remembers where everything it's going to need is located until it's time to display output.
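For anyone wondering what a template class like that might look like, here's a rough sketch. It is not my actual class: the name LazyTemplate, the bind() method, and the "{name}" placeholder syntax are all assumptions made up for the example. The point is only that nothing gets rendered until the object is used as a string.

```php
<?php
// Hypothetical stand-in for the template class described above.
final class LazyTemplate
{
    private string $file;
    private array $vars = [];

    public function __construct(string $file)
    {
        $this->file = $file;
    }

    // Store a reference to the caller's variable; nothing is rendered yet.
    public function bind(string $name, &$value): void
    {
        $this->vars[$name] = &$value;
    }

    // Rendering happens only when the object is converted to a string,
    // for example by echo or by file_put_contents().
    public function __toString(): string
    {
        $html = (string) file_get_contents($this->file);
        foreach ($this->vars as $name => $value) {
            // Values are expected to be strings or objects with __toString,
            // so nested templates render at this point too.
            $html = str_replace('{' . $name . '}', (string) $value, $html);
        }
        return $html;
    }
}
```

Because bind() keeps a reference, a variable changed after binding still shows up with its latest value when the page is finally echoed.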

Caching pages like this provides a HUGE boost in speed in situations where pages are likely to be viewed more often than they're changed.

I've got a cron job that looks through my templates folder once a day and compares filesystem last-modified timestamps to those of the files in the cache. If any template files have changed, it trashes the cache files made before the changed file. It also purges stale files from the cache that are more than a week old.
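A minimal sketch of what such a purge script could look like, assuming flat (non-nested) templates and cache folders. The function name purge_cache(), its parameters, and the templates path in the usage comment are hypothetical, not lifted from my actual cron job.

```php
<?php
// Hypothetical purge routine. Deletes a cache file when it is older than
// $max_age seconds, or when it was written before the most recently
// modified template. Returns the number of files removed.
function purge_cache(string $templates_dir, string $cache_dir, int $max_age, int $now): int
{
    // Newest last-modified time among the template files.
    $newest_template = 0;
    foreach (glob(rtrim($templates_dir, '/') . '/*') ?: [] as $template) {
        if (is_file($template)) {
            $newest_template = max($newest_template, (int) filemtime($template));
        }
    }

    $removed = 0;
    foreach (glob(rtrim($cache_dir, '/') . '/*.html') ?: [] as $cached) {
        $written  = (int) filemtime($cached);
        $stale    = ($now - $written) > $max_age;   // past the expiry window
        $outdated = $written < $newest_template;    // made before a template change
        if (($stale || $outdated) && unlink($cached)) {
            $removed++;
        }
    }
    return $removed;
}

// Example cron entry point with a one-week expiry (paths are placeholders):
// purge_cache('/full_path_to/templates', '/full_path_to/public_html/cache', 7 * 86400, time());
```

Once a cache file is gone, the "-f" check fails on the next request and the page gets regenerated by index.php as usual.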

This would probably be a little tougher to pull off if I was using a database. Everything on the site comes from the filesystem where I can check last-modified timestamps, something that's hard to do efficiently with a database.

Ok I'm done, if somehow you understand a word of what I said, awesome, if not, oh well it wouldn't be the first time. :D
  • spork
  • Brewmaster
  • Silver Member
  • Posts: 6252
  • Loc: Seattle, WA

Post 3+ Months Ago

Why not turn this into a tutorial, Joe?
  • joebert
  • Fart Bubbles
  • Genius
  • Posts: 13503
  • Loc: Florida

Post 3+ Months Ago

Because it meanders with no real structure about it and it's something I'm still working on. :D


© 1998-2014. Ozzu® is a registered trademark of Unmelted, LLC.