PHP - Search engine accents issue

  • argrafic

Post January 25th, 2008, 10:29 am

Hello all.

In a site I'm developing in Spanish I have a simple search engine, but I have a small problem.

I want the search engine to find words whether or not they are typed with their accents.

For example, atún (tuna) has an accent, and right now it only shows results if I write atún; if I write atun it tells me there are no results.

My SQL statement is the following:
SELECT * FROM r_glosario WHERE titulo LIKE '%variable%' ORDER BY titulo ASC

How can I make it so that I get the same results whether I write atún or atun?

Thanks!

  • TsX

Post January 31st, 2008, 12:21 pm

The only way I see this working is to keep a list of accented characters and search for them along with the original character. This could end up being a pain, but it will work. The list is easy to build from character codes; you just need to find all the accented characters you want to cover.

I don't speak Spanish, but I am guessing accents usually fall on vowels only? This is just an example (I tried to generalize as much as possible). Note: the byte value of an accented letter depends on the encoding; ú is 163 in code page 437 but 250 in ISO-8859-1.

Writing this, I see a problem when a word can have more than one accent. Hopefully this handles that case, because I take every search word and repeat it with each combination of accents. I could imagine this being brutal with some words, but if you limit your list of accents it might be OK. What am I saying, this is the computer age, let the code do the work.
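
To get a feel for how brutal it can get, here is a rough sketch of the arithmetic: each letter contributes its plain form plus however many accented forms you list for it, and those counts multiply across the word. The helper name countVariants and the per-letter counts are made up for illustration.

```php
<?php
// Hypothetical helper: counts how many spellings a full expansion would produce.
// $accentCounts maps a plain letter to the number of accented forms listed for it.
function countVariants($word, array $accentCounts)
{
    $total = 1;
    for ($i = 0; $i < strlen($word); $i++) {
        $letter = $word[$i];
        $extra = isset($accentCounts[$letter]) ? $accentCounts[$letter] : 0;
        $total *= 1 + $extra; // the plain letter plus each accented form
    }
    return $total;
}

// Assumed list sizes: one accent per vowel, except 'u' which also gets ü.
$counts = array('a' => 1, 'e' => 1, 'i' => 1, 'o' => 1, 'u' => 2);

echo countVariants('atun', $counts), "\n";       // 2 * 3 = 6
echo countVariants('murcielago', $counts), "\n"; // 3 * 2 * 2 * 2 * 2 = 48
```

So a short word stays cheap, but a ten-letter word with five vowels already means 48 queries, which is why keeping the accent list short matters.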

[php]
// Assumes the search term is plain ASCII and that this file, the table
// and the connection all use UTF-8.
$string = "atun";
$searchWords = array(''); // partial spellings built so far

// Extend every partial spelling with each form of the next letter,
// so words with more than one accented letter are covered too.
for ($i = 0; $i < strlen($string); $i++) {
    $forms = array_merge(array($string[$i]), allAccents($string[$i]));
    $next = array();
    foreach ($searchWords as $prefix) {
        foreach ($forms as $form) {
            $next[] = $prefix . $form;
        }
    }
    $searchWords = $next;
}

// and now you have a list of search words, with the original spelling first
// I'm thinking most accented spellings won't return anything, so the results should be decent
// $db is assumed to be an open mysqli connection (the old mysql_* functions were removed in PHP 7)
$stmt = $db->prepare("SELECT * FROM `yourTable` WHERE `title` = ?"); // order however you want
foreach ($searchWords as $word) {
    $stmt->bind_param('s', $word);
    $stmt->execute();
    $result = $stmt->get_result();
    while ($row = $result->fetch_assoc()) {
        print_r($row); // really just for testing, do whatever you need to do here
    }
}

// this function just returns an array of all accented variants of the letter given
// problem is, you need to define all of them yourself
function allAccents($letter)
{
    $accents = array(
        'a' => array('á'),
        'e' => array('é'),
        'i' => array('í'), // list other accented i's here, and so on
        'o' => array('ó'),
        'u' => array('ú', 'ü'),
    );
    return isset($accents[$letter]) ? $accents[$letter] : array();
}
[/php]

Looking back through the code, it looks like hell to me and probably has some mistakes. But for 10 minutes of work I think it has potential, and hopefully it helps you with your problem. If you have any questions about the code, feel free to ask. I am still looking it over myself.
  • TsX

Post January 31st, 2008, 2:51 pm

This has been bugging me, so here is a wild idea: why not just replace each vowel (or any character that can take an accent) with a wildcard? Return the results that match the term exactly as typed first, and then all the results that match the wildcard version after.
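
A minimal sketch of that wildcard idea, assuming UTF-8 input and a MySQL LIKE query with the default backslash escape character. The function name wildcardPattern is hypothetical; it swaps every plain or accented vowel for `_`, which LIKE treats as "exactly one character".

```php
<?php
// Hypothetical helper: builds a LIKE pattern in which every vowel,
// accented or not, is replaced by the single-character wildcard "_".
function wildcardPattern($term)
{
    // Escape LIKE metacharacters the user may have typed (backslash first).
    $escaped = str_replace(
        array('\\', '%', '_'),
        array('\\\\', '\\%', '\\_'),
        $term
    );
    // The /u flag makes the regex treat "ú" as one character, not two bytes.
    $pattern = preg_replace('/[aeiouáéíóúü]/iu', '_', $escaped);
    return '%' . $pattern . '%';
}

echo wildcardPattern('atun'), "\n"; // %_t_n%
echo wildcardPattern('atún'), "\n"; // %_t_n%
```

The pattern would then be bound to something like `WHERE titulo LIKE ?`. Note that `_t_n` also matches unrelated words such as "eton", which is why listing the exact matches first, as suggested above, is worth doing.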
