Replacing Words With Numbers

  • joebert

Post 3+ Months Ago

Basically it should take "tom has twenty-three thousand, one hundred and twelve cats" and give back "tom has 23,112 cats"

It appears to work with what I've used to test it. Can anyone break it? :D

PHP Code:
<?php
 
$str = 'twelve billion people know iPhone has two hundred and thirty thousand, seven hundred and eighty-three apps as well as over one million units sold';
 
function strlen_sort($a, $b)
{
   if(strlen($a) > strlen($b))
   {
      return -1;
   }
   else if(strlen($a) < strlen($b))
   {
      return 1;
   }
   return 0;
}
 
$keys = array(
   'one' => '1', 'two' => '2', 'three' => '3', 'four' => '4', 'five' => '5', 'six' => '6', 'seven' => '7', 'eight' => '8', 'nine' => '9',
   'ten' => '10', 'eleven' => '11', 'twelve' => '12', 'thirteen' => '13', 'fourteen' => '14', 'fifteen' => '15', 'sixteen' => '16', 'seventeen' => '17', 'eighteen' => '18', 'nineteen' => '19',
   'twenty' => '20', 'thirty' => '30', 'forty' => '40', 'fifty' => '50', 'sixty' => '60', 'seventy' => '70', 'eighty' => '80', 'ninety' => '90',
   'hundred' => '100', 'thousand' => '1000', 'million' => '1000000', 'billion' => '1000000000'
);
 
 
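// A run must start on a number word, so a leading "and" (as in "cats and two dogs") is never swallowed.
$alt = implode('|', array_keys($keys));
preg_match_all('#\b(?:' . $alt . ')\b(?:(?:\s*,\s*|\s+and\s+|\s+|-)\b(?:' . $alt . ')\b)*#i', $str, $tokens);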
//print_r($tokens); exit;
$tokens = $tokens[0];
usort($tokens, 'strlen_sort');
 
foreach($tokens as $token)
{
   $token = trim(strtolower($token));
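   // Pull out the individual number words; the old pattern glued a separator plus "one" onto the previous word (e.g. "twenty-one").
   preg_match_all('#\b(?:' . implode('|', array_keys($keys)) . ')\b#', $token, $words);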
   $words = $words[0];
   //print_r($words);
   $num = '0'; $total = '0'; // bcmath works on numeric strings
   foreach($words as $word)
   {
      $word = trim($word);
      $val = $keys[$word];
      //echo "$val\n";
      if(bccomp($val, 100) == -1)
      {
         $num = bcadd($num, $val);
         continue;
      }
      else if(bccomp($val, 100) == 0)
      {
         $num = bcmul($num, $val);
         continue;
      }
      $num = bcmul($num, $val);
      $total = bcadd($total, $num);
      $num = '0';
   }
   $total = bcadd($total, $num);
   echo "$total:$token\n";
   $str = preg_replace("#\b$token\b#i", number_format($total), $str);
}
echo "\n$str\n";
 
?>
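
For anyone skimming, the accumulation rule from the inner loop can be pulled out into a small function. This is only a sketch under a few assumptions: `words_to_int` is a made-up name, it uses native integers instead of bcmath (so it assumes a 64-bit PHP build), and it expects the phrase to be already split into lowercase number words.

```php
<?php
// Hypothetical helper, not part of the script above. Same rule as the inner
// loop: values under 100 are added, "hundred" multiplies the current group,
// and anything larger multiplies the group and flushes it into the total.
function words_to_int(array $words): int
{
    $values = [
        'one' => 1, 'two' => 2, 'three' => 3, 'four' => 4, 'five' => 5,
        'six' => 6, 'seven' => 7, 'eight' => 8, 'nine' => 9, 'ten' => 10,
        'eleven' => 11, 'twelve' => 12, 'thirteen' => 13, 'fourteen' => 14,
        'fifteen' => 15, 'sixteen' => 16, 'seventeen' => 17, 'eighteen' => 18,
        'nineteen' => 19, 'twenty' => 20, 'thirty' => 30, 'forty' => 40,
        'fifty' => 50, 'sixty' => 60, 'seventy' => 70, 'eighty' => 80,
        'ninety' => 90, 'hundred' => 100, 'thousand' => 1000,
        'million' => 1000000, 'billion' => 1000000000,
    ];

    $group = 0; // value built up since the last thousand/million/billion
    $total = 0; // sum of all flushed groups
    foreach ($words as $word) {
        $val = $values[$word];
        if ($val < 100) {
            $group += $val;
        } elseif ($val === 100) {
            $group *= 100;
        } else {
            $total += $group * $val;
            $group = 0;
        }
    }
    return $total + $group;
}

echo words_to_int(['twenty', 'three', 'thousand', 'one', 'hundred', 'twelve']), "\n"; // 23112
```

As in the original, a magnitude word with nothing in front of it multiplies zero, so a bare "hundred" or "million" comes out as 0.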
  • Bogey

Post 3+ Months Ago

It works perfectly fine if you use only words. If you mix digits in, though... okay, okay, let me show an example.

I gave it the following (it doesn't make any sense; I was testing the number handling, not the meaning :lol: ):
Quote:
<p>I know twelve people who don't know fifty thousand and thirty six chickens. Only 4 million six hundred twenty four thousand and two hundred sixty seven people who know six thousand and twenty five alligators.</p>

And it gave me the following:
Code:
624267:million six hundred twenty four thousand and two hundred sixty seven
50036:fifty thousand and thirty six
6025:six thousand and twenty five
12:twelve

I know 12 people who don't know 50,036 chickens. Only 4 624,267 people who know 6,025 alligators.

Pretty good joebert, but that is not really breaking the system, it's merely doing what it is intended to do... I don't think I can break that system :lol:
  • joebert

Post 3+ Months Ago

You have a good point with things like "4 million", I see that used a lot now that you mention it. Though I don't see much if any mixing of larger numbers like you have there.

In any event, fixing it for simple common occurrences will probably fix it for uncommon occurrences too.

I'll just edit my original post and reply after I've fixed and edited it.

Good catch Bogey. :D
  • Bogey

Post 3+ Months Ago

I get lazy sometimes and write things like '4 million', so I had to check whether you'd accounted for people like me :D
  • joebert

Post 3+ Months Ago

I just realized it will mess with shorthand dates too.

"The best Camaro was built in nineteen sixty seven".
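
One possible heuristic for spotting these, offered as a guess rather than a fix: in an ordinary number phrase with no "hundred" or bigger word, you never get two values of ten or more side by side, so "nineteen" followed by "sixty" is a hint that the phrase is a spoken year. Both function names below are made up, and `$values` is assumed to be the list of numeric values for the words in one matched phrase.

```php
<?php
// Hypothetical check, not part of the original script. $values holds the
// numeric value of each number word in one matched phrase, in order.
function looks_like_spoken_year(array $values): bool
{
    // Any "hundred", "thousand", etc. means it is a normal cardinal number.
    foreach ($values as $val) {
        if ($val >= 100) {
            return false;
        }
    }
    // Two neighbouring values of ten or more, e.g. nineteen (19) then sixty (60).
    for ($i = 1; $i < count($values); $i++) {
        if ($values[$i - 1] >= 10 && $values[$i] >= 10) {
            return true;
        }
    }
    return false;
}

// Reads the first value as the century and the rest as the two-digit year,
// so [19, 60, 7] becomes 1967. Only meaningful when the check above passed.
function spoken_year_value(array $values): int
{
    $century = array_shift($values);
    return $century * 100 + array_sum($values);
}

$values = [19, 60, 7]; // "nineteen sixty seven"
echo looks_like_spoken_year($values) ? spoken_year_value($values) : array_sum($values), "\n"; // 1967
```

It only catches years where the first word is the whole century, and a year found this way should skip number_format() so it doesn't pick up a comma.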
  • Bogey

Post 3+ Months Ago

The only way you could fix the script to work for dates is if the script knows it is a date... unless I'm misunderestimating your power with PHP.

Did you fix the '4 million' thing? Because I did and I'm waiting to see what you would do (I'm not sure I did that the best way). :lol:
  • Bogey

Post 3+ Months Ago

Anyway, my fix was adding the following array (before the $keys array).
Code:
$nums = array('1' => 'one', '2' => 'two', '3' => 'three', '4' => 'four', '5' => 'five', '6' => 'six', '7' => 'seven', '8' => 'eight', '9' => 'nine');

And adding the following after $keys (no particular reason it goes there rather than before; it just needs to come before the first preg_match_all() call).
Code:
$str = strtr($str, $nums);

Making the whole script look like...
Code:
<?php

$str = '<p>1 million units sold</p>';

function strlen_sort($a, $b)
{
    if(strlen($a) > strlen($b))
    {
        return -1;
    }
    else if(strlen($a) < strlen($b))
    {
        return 1;
    }
    return 0;
}

$nums = array('1' => 'one', '2' => 'two', '3' => 'three', '4' => 'four', '5' => 'five', '6' => 'six', '7' => 'seven', '8' => 'eight', '9' => 'nine');

$keys = array(
    'one' => '1', 'two' => '2', 'three' => '3', 'four' => '4', 'five' => '5', 'six' => '6', 'seven' => '7', 'eight' => '8', 'nine' => '9',
    'ten' => '10', 'eleven' => '11', 'twelve' => '12', 'thirteen' => '13', 'fourteen' => '14', 'fifteen' => '15', 'sixteen' => '16', 'seventeen' => '17', 'eighteen' => '18', 'nineteen' => '19',
    'twenty' => '20', 'thirty' => '30', 'forty' => '40', 'fifty' => '50', 'sixty' => '60', 'seventy' => '70', 'eighty' => '80', 'ninety' => '90',
    'hundred' => '100', 'thousand' => '1000', 'million' => '1000000', 'billion' => '1000000000'
);

$str = strtr($str, $nums);

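// A run must start on a number word, so a leading "and" (as in "cats and two dogs") is never swallowed.
$alt = implode('|', array_keys($keys));
preg_match_all('#\b(?:' . $alt . ')\b(?:(?:\s*,\s*|\s+and\s+|\s+|-)\b(?:' . $alt . ')\b)*#i', $str, $tokens);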
//print_r($tokens); exit;
$tokens = $tokens[0];
usort($tokens, 'strlen_sort');

foreach($tokens as $token)
{
    $token = trim(strtolower($token));
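    // Pull out the individual number words; the old pattern glued a separator plus "one" onto the previous word (e.g. "twenty-one").
    preg_match_all('#\b(?:' . implode('|', array_keys($keys)) . ')\b#', $token, $words);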
    $words = $words[0];
    //print_r($words);
    $num = '0'; $total = '0'; // bcmath works on numeric strings
    foreach($words as $word)
    {
        $word = trim($word);
        $val = $keys[$word];
        //echo "$val\n";
        if(bccomp($val, 100) == -1)
        {
            $num = bcadd($num, $val);
            continue;
        }
        else if(bccomp($val, 100) == 0)
        {
            $num = bcmul($num, $val);
            continue;
        }
        
        $num = bcmul($num, $val);
        $total = bcadd($total, $num);
        $num = '0';
    }
    $total = bcadd($total, $num);
    echo "$total:$token\n";
    $str = preg_replace("#\b$token\b#i", number_format($total), $str);
}
echo "\n$str\n";

?>

You decide if that's efficient or not :lol: (And that doesn't fix the dates thing :( ... :lol: )
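
One thing to watch with the strtr() approach: it swaps every digit character it sees, so "10 million" turns into "one0 million" before the matching even starts. A narrower pre-pass might look something like the sketch below. The function name is made up, it only handles a standalone run of one to three digits followed by thousand, million, or billion, and it assumes a 64-bit PHP build.

```php
<?php
// Hypothetical pre-pass, not from the thread: expands "<digits> <magnitude>"
// directly instead of turning the digits into words first.
function expand_digit_magnitudes(string $str): string
{
    $scale = ['thousand' => 1000, 'million' => 1000000, 'billion' => 1000000000];

    // The lookbehind skips digits that are part of a longer number such as
    // "2014", "1,234" or "4.5", which this sketch leaves untouched.
    $pattern = '/(?<![\d.,])(\d{1,3})\s+(thousand|million|billion)\b/i';

    return preg_replace_callback($pattern, function (array $m) use ($scale) {
        $value = (int) $m[1] * $scale[strtolower($m[2])];
        return number_format($value);
    }, $str);
}

echo expand_digit_magnitudes('<p>1 million units sold</p>'), "\n"; // <p>1,000,000 units sold</p>
```

It does nothing for the mixed case from earlier in the thread ("4 million six hundred twenty four thousand..."); the digit part and the word part would still be converted separately.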
  • Rabid Dog

Post 3+ Months Ago

Didn't Sam Hughes do something like this way back in the day?
  • joebert

Post 3+ Months Ago

If he did, I can't seem to find it.

Post Information

  • Total Posts in this topic: 9 posts

© 1998-2014. Ozzu® is a registered trademark of Unmelted, LLC.