PHP Search Engine: Effiecient coding?

  • genxservers
  • Novice
  • Novice
  • genxservers
  • Posts: 30

Post 3+ Months Ago

Ok, this is my first attempt at writing a "search engine" so I am pretty sure there are a lot of points that I overlooked or what not, but here's my algorithm.

The program is supposed to search for keywords from a MySQL DB with many fields. All fields are basic int, varchar fields that can be searched easily. One field though, is type tinytext. That field holds the description of an object and can go up 255 chars.


I have 2 tables that will deal with the search words.

One is the master list of all unique word that have been passed to this function, WordsDB (this db will have 2 fields: word_id int(11) auto_increment, and word varchar(20) .. both field will be indexed.)

The second table logs how many times a certain unique word have occurred in each unique product, OccurenceDB (this db will have 5 fields, occurence_id int(11) auto increment, brand_id int(11), product_id int(11), word_id int(11), and occurences int(11))



Person A (admin) will be adding products to the db, while processing the product add form, the code will:

1. Break $_POST[description] into associative array
2. Associative array will contain key = unique word, value = number occurence.
3. Traverse this associative array using foreach while
- checking each unique keyword against WordsDB
- if unique word is found in WordsDB, meaning word have been encountered before, get corresponding word_id value from WordsDB
- if not found, then add new entry to WordsDB and get word_id for newly added word.
- finally add new entry into OccurencesDB with word_id;


Now we have a list off where each word occurred and how many times that word occurred there. Right? I hope so.


Now Person B (site visitor) might want to search my db for product with Brand X with keyword 1, keyword 2 and keyword 3.
Algorithm for the search goes like this:

1. First do the easy search, search for all items of Brand X
2. Create SQL string with search words:
- select * from products where brand_id = 'Brand X'
3. Store MySQL link to $result_brandSearch .. dont get values yet
4. Break $_POST[searchwords] into an array
5. Traverse keyword array using foreach while:
- matching each keyword to a word_id in WordsDB.
- if no match found, that means never encountered this word before, meaning no resulting product will be found, so discard.
- store all word_id in $searchword_id array
6. Create query string with these word_id.
- select product_id from OccurencesDB where keyword = $searchword_id[0] ... and brand_id = Brand X sort by occurence
7. Store all product_id into $keywordResult array
8. Fetch result from previously saved MySQL link, $result_brandSearch, while filtering out results that are not in $keywordResult array
- while($current = mysql_fetch_assoc($result_brandSearch)){
- check $current[product_id] is in $keywordResult array
- if is in $keywordResult, then add all values of $current to $finalResult array
- if not, then move on to next

What will ultimately be left is $finalResult which contain all values for all product of brand X and have keyword 1, keyword 2, keyword 3.

These values can now be styled and display.

What do you guys think of it? Any advice? Usable? Too long? Inefficient?

Do you guys think I should just scrap the WordsDB all together and just store the words themself in OccurenceDB instead of the word_id?

And finally, if you guys got to this part, I really appreciate the 10 minutes that it probly took for you to read all this. =D
  • Anonymous
  • Bot
  • No Avatar
  • Posts: ?
  • Loc: Ozzuland
  • Status: Online

Post 3+ Months Ago

Post Information

  • Total Posts in this topic: 1 post
  • Users browsing this forum: No registered users and 111 guests
  • You cannot post new topics in this forum
  • You cannot reply to topics in this forum
  • You cannot edit your posts in this forum
  • You cannot delete your posts in this forum
  • You cannot post attachments in this forum
 
cron
 

© 1998-2014. Ozzu® is a registered trademark of Unmelted, LLC.