Introduction

I wrote this tutorial because of this thread and other related threads. I was beginning to get frustrated with trying to explain myself without going off in a bunch of directions so I figured it would be easier to just start from the beginning and walk through the code I use when doing this sort of thing.

This is not a novice tutorial, it assumes you've got a handle on PHP and MySQL for the most part and are perhaps exploring new ways to do things.

Categories Database Table

First I need a database table to store my categories. For the purpose of example, I only need three columns to maintain a hierarchy.

CREATE TABLE `categories` (
  `category_id` smallint(5) unsigned NOT NULL auto_increment,
  `parent_id` smallint(6) unsigned default NULL,
  `label` varchar(64) NOT NULL,
  PRIMARY KEY  (`category_id`),
)

category_id as you can see is an auto_increment and my primary key. I'll use this column to keep track of each category individually and this column with be referenced by parent_id to create a hierarchy. label is exactly what it sounds like, the human readable name of the category.

If a categories parent_id is ever NULL, I know it's a root category and not a sub-category.

Querying The Database

My first instinct would be to ask the database for all of the root categories, or categories with a NULL parent_id. This is a bad idea though, it means I need to make another query to retrieve the sub-categories, and then I need to make more queries to retrieve the sub-categories of the sub-categories, and so on. If I have 4 root categories with 4 sub-categories which each have 2 sub-categories, I need a bunch of queries. That's like following a recipe in a cook book and going to the store and buying each individual ingredient as I get to it in the recipes ingredient list.

Instead, I'm going to use 1 query. A simple query that just pulls every row in the table at once without sorting or anything.

SELECT category_id, parent_id, label
FROM categories;

Now for a threaded discussion or something likely to return hundreds of thousands of rows, this would probably be a horrible idea because of how much memory would be wasted holding the entire table in most requests. (a much more likely scenario would be running out of memory) But for the purpose of categories where it's likely the majority of rows can be utilized in each request and where, assuming I have the 65535 categories my SMALLINT accounts for, will only use 16M.

Why on earth would I ever have 65535 categories? The only reason I used a SMALLINT in the first place is to provide room for deleted categories.
I can't recall seeing a website, which is actually usable, using even 1000 categories. Heck, I think the highest category id for any of Ozzus forums is only in the low 100s at most.

Using that most basic of database table I have up there basically gives me 4 categories for every 1 KB.

Hierarchy

First I want a variable to build my hierarchy in. I want to setup this hierarchy in a way that can be utilized by numerous routines for building navigation elements. The idea here is build it once and reuse it multiple times.

$category_lineage = new array();

I'm going to use the MySQLi PHP extension and assume you're familiar with it, meaning I'm going to skip all of the connection parts and tell you that from here on $db is the database object returned from "$db = new MySQLi(...);"

Using the MySQLi objects query method, I pull all of the rows using the query before. I want to reiterate this query so I can point out that I'm using the MYSQLI_USE_RESULT flag when actually executing the query. I use this flag so the step of retrieving the rows before I use them is skipped. Without using this flag I would basically be retrieving the rows twice since I don't want to execute any other queries before I've iterated through the return rows and therefore have no worries about out of sync query errors.

$result = $db->query('SELECT category_id, parent_id, label FROM categories', MYSQLI_USE_RESULT);

With $result holding a reference to my category rows, I can begin iterating through the rows and filling my $category_lineage using the category_id as the array index instead of just letting PHP assign indexes.

while($row = $result->fetch_object())
{
	$row->sub_categories = array();
	$category_lineage[$row->category_id] = $row;
}
$result->close();

The reason I use $category_lineage[$row->category_id] = $row; instead of using for instance $category_lineage[] = $row; is that I want a quick answer to whether a category_id exists when I need to answer that question, whether I'm asking it for a security check, or when I'm looking for the parent of a sub-category.

For instance, if a user wants to view the category with category_id "12" I can quickly go to $category_lineage and see if that index exists.

if( empty($category_lineage[(int)$_GET['category_id']]))
{
    // bail on account of non-existant category
}

Without the $category_lineage I probably would have queried the database to see if the category exists. Since I'm more than likely going to need the $category_lineage for menu items and other things anyways, I might as well check against it instead of asking the database if a category exists.

Another thing you'll notice is that I'm adding an array for sub-categories to the object row returned. I could pass the classname argument to fetch_object to have this array added automatically via a class, but for the sake of this example I'm just going to do it like I have below.

$row->sub_categories = array();

At first glance it might seem like I'm going to fill that array with other arrays to create a multi-dimensional array for the hierarchy, but I'm not. That sub-category array will actually only contain references to other items in the $category_lineage.

Here's the array I'll have after I've retrieved all of the rows from the database and setup an array to hold the sub-category references.
Note how the category_id property of each corresponds to the items array key/index.

Array
(
	[1] => stdClass Object
		(
			[category_id] => 1
			[parent_id] => 
			[label] => Sports
			[sub_categories] => Array()
		)
	[2] => stdClass Object
		(
			[category_id] => 2
			[parent_id] => 1
			[label] => Baseball
			[sub_categories] => Array()
		)
	[5] => stdClass Object
		(
			[category_id] => 5
			[parent_id] => 1
			[label] => Basketball
			[sub_categories] => Array()
		)
	[8] => stdClass Object
		(
			[category_id] => 8
			[parent_id] => 1
			[label] => Hockey
			[sub_categories] => Array()
		)
	[11] => stdClass Object
		(
			[category_id] => 11
			[parent_id] => 1
			[label] => Football
			[sub_categories] => Array()
		)
	[14] => stdClass Object
		(
			[category_id] => 14
			[parent_id] => 
			[label] => Babes
			[sub_categories] => Array()
		)
	[15] => stdClass Object
		(
			[category_id] => 15
			[parent_id] => 14
			[label] => Eva Mendes
			[sub_categories] => Array()
		)
	[16] => stdClass Object
		(
			[category_id] => 16
			[parent_id] => 14
			[label] => Elisha Cuthbert
			[sub_categories] => Array()
		)
	[21] => stdClass Object
		(
			[category_id] => 21
			[parent_id] => 
			[label] => test
			[sub_categories] => Array()
		)
)

Here's where using those references combined with those meaningful item keys really comes in handy. Instead of iterating through the array and for each item iterating through the array again looking for items with the current item as the parent id, I only iterate through the $category_lineage array one time to build the entire hierarchy.

foreach($category_lineage as $key => $val)
{
	if($val->parent_id)
	{
		$category_lineage[$val->parent_id]->sub_categories[$key] =& $category_lineage[$key];
	}
}

Keep in mind that $category_lineage is still a two dimensional array, but when I use print_r on it this time I get the following result.

Array
(
    [1] => stdClass Object
        (
            [category_id] => 1
            [parent_id] => 
            [label] => Sports
            [sub_categories] => Array
                (
                    [2] => stdClass Object
                        (
                            [category_id] => 2
                            [parent_id] => 1
                            [label] => Baseball
                            [sub_categories] => Array()
                        )
                    [5] => stdClass Object
                        (
                            [category_id] => 5
                            [parent_id] => 1
                            [label] => Basketball
                            [sub_categories] => Array()
                        )
                    [8] => stdClass Object
                        (
                            [category_id] => 8
                            [parent_id] => 1
                            [label] => Hockey
                            [sub_categories] => Array()
                        )
                    [11] => stdClass Object
                        (
                            [category_id] => 11
                            [parent_id] => 1
                            [label] => Football
                            [sub_categories] => Array()
                        )
                )
        )
    [2] => stdClass Object
        (
            [category_id] => 2
            [parent_id] => 1
            [label] => Baseball
            [sub_categories] => Array()
        )
    [5] => stdClass Object
        (
            [category_id] => 5
            [parent_id] => 1
            [label] => Basketball
            [sub_categories] => Array()
        )
    [8] => stdClass Object
        (
            [category_id] => 8
            [parent_id] => 1
            [label] => Hockey
            [sub_categories] => Array()
        )
    [11] => stdClass Object
        (
            [category_id] => 11
            [parent_id] => 1
            [label] => Football
            [sub_categories] => Array()
        )
    [14] => stdClass Object
        (
            [category_id] => 14
            [parent_id] => 
            [label] => Babes
            [sub_categories] => Array
                (
                    [15] => stdClass Object
                        (
                            [category_id] => 15
                            [parent_id] => 14
                            [label] => Eva Mendes
                            [sub_categories] => Array()
                        )
                    [16] => stdClass Object
                        (
                            [category_id] => 16
                            [parent_id] => 14
                            [label] => Elisha Cuthbert
                            [sub_categories] => Array()
                        )
                )
        )
    [15] => stdClass Object
        (
            [category_id] => 15
            [parent_id] => 14
            [label] => Eva Mendes
            [sub_categories] => Array()
        )
    [16] => stdClass Object
        (
            [category_id] => 16
            [parent_id] => 14
            [label] => Elisha Cuthbert
            [sub_categories] => Array()
        )
    [21] => stdClass Object
        (
            [category_id] => 21
            [parent_id] => 
            [label] => test
            [sub_categories] => Array()
        )
)

Remember, $category_lineage is still a two dimensional array. It can be iterated like an unlimited depth hierarchy because of references instead of copied values though.

Display

Looking back to the example from before, where empty() is used to see if the category requested by the user even exists, we're going to assume that the category does exist for the purpose of this tutorial and I'm going to refer to the screened $_GET['category_id'] as $current_category from here on.

$current_category = (int)$_GET['category_id'];

How could I generate a list of links for breadcrumbs to the current category ?

	$t->breadcrumbs				= new stdClass;
	$t->breadcrumbs->current	=& $category_lineage[$current_category];
	$t->breadcrumbs->url_mask	= 'index.php?category_id=%1$s&category_label=%2$s&page=%3$s';

	$t->breadcrumbs->links = array(
		sprintf('<a href="%1$s">%2$s</a>',
			sprintf($t->breadcrumbs->url_mask, $t->breadcrumbs->current->category_id, $t->breadcrumbs->current->label, 1),
			$t->breadcrumbs->current->label
		)
	);

	while($t->breadcrumbs->current->parent_id)
	{
		$t->breadcrumbs->current =& $category_lineage[$t->breadcrumbs->current->parent_id];
		array_unshift(
			$t->breadcrumbs->links,
			sprintf('<a href="%1$s">%2$s</a>',
				sprintf($t->breadcrumbs->url_mask, $t->breadcrumbs->current->category_id, $t->breadcrumbs->current->label, 1),
				$t->breadcrumbs->current->label
			)
		);
	}

	$t->breadcrumbs = implode(' • ', $t->breadcrumbs->links);

How about a category hierarchy using nested <ul> elements, something you'd see a lot in menus and sitemaps ?

$c = new stdClass;
$c->listing = '';

function recursive_ul(&$cat)
{
	$str = sprintf('<div><a href="%1$s">%2$s</a></div><ul>', $cat->list_link, $cat->label);
	foreach($cat->sub_categories as &$sub_cat)
	{
		$str .= '<li>' . recursive_ul($sub_cat) . '</li>';
	}
	return "$str</ul>";
}

foreach($category_lineage as &$c->category)
{
	$c->url = '?category_id=%1$s&title=%2$s&page=%3$s';
	$c->category->list_link = sprintf($c->url, $c->category->category_id, $c->category->label, 1);
}

foreach($category_lineage as &$c->category)
{
	if(empty($c->category->parent_id))
	{
		$c->listing .= '<li>' . recursive_ul($c->category) . '</li>';
	}
}

Additions

What if I wanted to retrieve a list of items from a category, and every level of sub-category under that category ?
I could hit the database a few times looking for sub-categories, or I could reuse my $category_lineage somehow.

I can add a descendants array along side the sub_categories array. I could use references for the items in the descendants array like I do with the sub_categories array, but I think more times than not I'm just going to want a list of category ids rather than all of the information about each category when it comes to this.

function category_lineage($branch)
{
	$descendants = array();
	foreach($branch->sub_categories as $key => $val)
	{
		$descendants[]	= $key;
		$descendants	= array_merge($descendants, category_lineage($val));
	}
	return $descendants;
}

foreach($category_lineage as $key => $val)
{
	$category_lineage[$key]->descendants = category_lineage($val);
}

Now let's say my $current_category is 14, and when I view category 14 I want to include the items from categories 15 and 16 since they're sub-categories of category 14. I can do something like this.

	if(isset($category_lineage[$current_category]->descendants[0]))
	{
		$t->category_sql = "SELECT item_id FROM category_relations WHERE category_id IN($current_category," . implode(',', $category_lineage[$current_category]->descendants) . ')';
	}
	else
	{
		$t->category_sql = "SELECT item_id FROM category_relations WHERE category_id = $current_category";
	}

Conclusion

Remember, you're not a computer, meaning you should ask as many questions as you need to without worrying about efficiency when asking questions. I could sit here for days and anticipate questions everyone could ask, but then I'd never get to go fishing. 😉

This page was published on It was last revised on

Contributing Authors

0

0 Comments

  • Votes
  • Oldest
  • Latest