Are these MySQL indexes redundant ?

  • joebert
  • Fart Bubbles
  • Genius
  • User avatar
  • Posts: 13502
  • Loc: Florida

Post 3+ Months Ago

Consider the following table.

SQL Code: [ Select ]
CREATE TABLE `relations` (
   category_id SMALLINT(6) UNSIGNED NOT NULL,
   item_id MEDIUMINT(8) UNSIGNED NOT NULL,
   insert_order INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
   PRIMARY KEY (insert_order),
   UNIQUE KEY cat_id_2 (category_id, item_id),
   KEY cat_id (category_id),
   KEY item_id_2 (item_id)
) ENGINE=MyISAM;
  1. CREATE TABLE `relations` (
  2.    category_id SMALLINT(6) UNSIGNED NOT NULL,
  3.    item_id MEDIUMINT(8) UNSIGNED NOT NULL,
  4.    insert_order INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
  5.    PRIMARY KEY (insert_order),
  6.    UNIQUE KEY cat_id_2 (category_id, item_id),
  7.    KEY cat_id (category_id),
  8.    KEY item_id_2 (item_id)
  9. ) ENGINE=MyISAM;


Is there any benefit to having the last two KEY indexes there ?

Would queries that have WHERE clauses dependent on those single columns use the single column INDEX, would they utilize the multi-column INDEX, would they not utilize anything because of the wildcard (*) column selection ?

SQL Code: [ Select ]
SELECT * FROM relations WHERE category_id = 5 LIMIT 0,9
  • Anonymous
  • Bot
  • No Avatar
  • Posts: ?
  • Loc: Ozzuland
  • Status: Online

Post 3+ Months Ago

  • joebert
  • Fart Bubbles
  • Genius
  • User avatar
  • Posts: 13502
  • Loc: Florida

Post 3+ Months Ago

A few thoughts.

A uniqueness between category_id and item_id pairs is required so that there are no redundant rows. Therefore I need either a PRIMARY or UNIQUE key that uses both of those columns.

I could drop insert_order and use the UNIQUE KEY as the PRIMARY KEY. If I do that though, the ability to have any sort of chronology applied to rows becomes more complicated than I care to think about.

According to the manual, my UNIQUE key inherently provides an index on the category_id column alone, as well as on both columns together because of the "left-most rule".

I could drop the single column INDEX on category_id and rely on the multi-column KEY for the category_id column, but the cardinality of the multi-column key is likely to be close to if not the same as the cardinality of the PRIMARY key, or, the size of the entire table, whereas the cardinality of the single column category_id INDEX should always be smaller since it is limited in size to the number of categories.

Smaller keys mean faster lookups. MySQL appears to favor single column keys when both are present if I understand that manual page right.

Because it is possible to associate an item_id with multiple category_ids, the INDEX on item_id alone makes lookups faster when managing the associations between an item and a category. Because of the "left-most rule", the UNIQUE index in the table can not be used for item_id in situations where I'm looking up the item_id and want to know all categories associated with it.

Ok there's my reasoning.
Am I forgetting anything ?
  • joebert
  • Fart Bubbles
  • Genius
  • User avatar
  • Posts: 13502
  • Loc: Florida

Post 3+ Months Ago

Well this is certainly interesting.

I've got the following query that uses this relations table when displaying category pages.

PHP Code: [ Select ]
   $sql = '
      SELECT wp.item_id, wp.label, wp.path
      FROM ' . RELATIONS_TABLE . ' rel
         LEFT JOIN ' . ITEMS_TABLE . " wp
            ON rel.item_id = wp.item_id
      WHERE rel.{$t->category_sql}
      ORDER BY rel.insert_order ASC
      LIMIT {$t->start_id}, {$config->options->display->category->per_page}";
  1.    $sql = '
  2.       SELECT wp.item_id, wp.label, wp.path
  3.       FROM ' . RELATIONS_TABLE . ' rel
  4.          LEFT JOIN ' . ITEMS_TABLE . " wp
  5.             ON rel.item_id = wp.item_id
  6.       WHERE rel.{$t->category_sql}
  7.       ORDER BY rel.insert_order ASC
  8.       LIMIT {$t->start_id}, {$config->options->display->category->per_page}";


Because a category can be assigned as a parent category, I'm fetching the category_id of any category with the current category set as its' parent before this query and that gives me either a "category_id = N" or "category_id IN(N,N)" for "$t->category_sql" depending on how many categories there are. This allows me to have parent categories that both have their own items, and will include the items of their immediate children in their category pages.

The intersting part is that when using EXPLAIN with this query, I see the extra values "Using WHERE; Using Filesort". From what I understand "using filesort" is bad news. There used to be a DISTINCT clause on that query, but that was causing "using temporary" which is bad news as well.

However, when I drop that ORDER BY clause from the query, my extra turns into "Using Index", which means it doesn't even have to look at the data rows, and it still returns the same order. I'm guessing because of that insert_id PRIMARY KEY.

The query also notes both the UNIQUE KEY and the single category_id INDEX as possible keys, it's deciding to use the single column key by the looks of it, though the key_len being 2 in the output for that table has me confused. :scratchhead:
  • PolishHurricane
  • Mastermind
  • Mastermind
  • User avatar
  • Posts: 1585

Post 3+ Months Ago

You have me confused too! :scratchhead:

Nice blog post "Is Our Understanding Backwards", that's pretty trippy, you're probably correct too! haha.
  • joebert
  • Fart Bubbles
  • Genius
  • User avatar
  • Posts: 13502
  • Loc: Florida

Post 3+ Months Ago

key_len is the length of the index row in bytes, not the number of columns in the index. That's what had me confused. :D

Post Information

  • Total Posts in this topic: 5 posts
  • Users browsing this forum: No registered users and 124 guests
  • You cannot post new topics in this forum
  • You cannot reply to topics in this forum
  • You cannot edit your posts in this forum
  • You cannot delete your posts in this forum
  • You cannot post attachments in this forum
 
 

© 1998-2014. Ozzu® is a registered trademark of Unmelted, LLC.