PHP clean up a string

  • devilwood
  • Silver Member
  • Posts: 447

Post 3+ Months Ago

I have some data I'm dumping into MySQL from Excel, which is basically just an item name and a cost.

The problem is with the cost.

Sometimes the cost is for example:

125.88 (which is perfect) ... but sometimes the employees have put the cost to reflect the unit of measure....

and even once used a slash such as

I've done tons of string manipulation, but this one is bugging me. Most of the time I've used regex, str_split, explode or substr, any of which I could write to GET RID of the M, C, or /M, but I kinda wanted to keep those values and use them. str_split seemed good, but I have no delimiter. Plus the /M one is going to be tough. Any suggestions or help on how to edit a string but keep the data being removed? Thnx.
  • joebert
  • Genius
  • Posts: 13511
  • Loc: Florida

Post 3+ Months Ago

I would use preg_match.

This is the pattern I would start with based on what I'm reading. $matches will contain the full string, the number, and the suffix, in that order.
Code:
$matches = array();
preg_match('#^([\d.]+)(/?[MC])?$#', $str, $matches);
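
As a rough sketch of how the result could be used: assuming the cost arrives as a plain string with no surrounding whitespace, the helper below (splitCost is a made-up name, not something from this thread) wraps the pattern and hands back the number and the suffix separately. When a cost has no suffix, the second group is simply missing from $matches, so the sketch falls back to an empty string.

```php
<?php
// Hypothetical helper: split "125.88/M" into the number and the unit suffix.
// It uses the pattern from the post unchanged, so lowercase suffixes do not match.
function splitCost(string $str): ?array
{
    $matches = array();
    if (preg_match('#^([\d.]+)(/?[MC])?$#', $str, $matches) !== 1) {
        return null; // not in a recognised format
    }
    return array(
        'cost' => $matches[1],
        // The suffix group is absent from $matches when it took no part in the match.
        'uom'  => isset($matches[2]) ? $matches[2] : '',
    );
}

print_r(splitCost('125.88/M')); // cost and uom as separate array entries
var_dump(splitCost('125.88m')); // lowercase suffix is rejected by this pattern
```

Returning null for anything that doesn't fit makes it easy to log the odd rows instead of inserting them. If lowercase suffixes should be accepted too, adding the i modifier after the closing # would be one option.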
  • devilwood
  • Silver Member
  • Posts: 447

Post 3+ Months Ago

Thnx joebert, I'm going to work with that regex you gave me to simplify the following that I cooked up.

First get just the decimal string
Code:
$cost = "111.11/M";
$findme = ".";
$pos1 = strpos($cost, $findme);    // position of the decimal point
$totpos = strlen($cost);
$strcheck = $totpos - $pos1;       // 3 when the string ends right after the cents
$fixedcost = $cost;                // already clean unless something trails the cents
if ($strcheck > 3) {
    $countbackpos = 3 - $strcheck; // negative: number of trailing characters to drop
    $fixedcost = substr($cost, 0, $countbackpos);
}

Next I get the U/M that some idiot attached to the end.

Code:
function getuom($hay) {
    // strpbrk() treats its second argument as a set of characters, so '/M'
    // has to be tried first or the slash is lost from "111.11/M".
    $k = array('/M','M','m','C','c');
    foreach ($k as $value) {
        $check_uom = strpbrk($hay, $value);
        if ($check_uom !== false) {
            return $check_uom;
        }
    }
}
$uom = getuom($cost);

I implemented this in my main script and it's giving errors in my SQL, but it works in a test file I made, so I think I'm on the right track. However, I want to mess with that regex to see if it makes things easier and the edits more exact for the volume of inserts I'm reading through. This should get me rolling, but any other suggestions would be nice.

The more I work on this, the more I realize I've never really separated a string and actually kept the individual characters. I was always just removing/deleting unwanted characters. I guess I could always treat the string as an array, because each character in the variable $cost can be referenced by $cost[0], $cost[1], etc.

I've also checked the data, and the most I should be dealing with is 1 or 2 additional characters after the cents, which should always be M, m, /M, C, /C, or c. Really, the /C and lowercase c were never used, but I threw them in to maybe catch future errors.
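
One way to deal with the variants listed above (M, m, /M, C, /C, c) might be to normalise whatever suffix gets pulled off before storing it. A minimal sketch, assuming the slash and the letter case carry no meaning of their own; normalizeUom is a hypothetical name used only for illustration:

```php
<?php
// Hypothetical helper: reduce "M", "m", "/M", "C", "/C", "c" to "M" or "C".
// Assumption: the slash and the letter case do not change the unit of measure.
function normalizeUom(string $suffix): ?string
{
    $unit = strtoupper(ltrim($suffix, '/')); // drop leading slashes, then uppercase
    if ($unit === '') {
        return ''; // plain cost, no unit attached
    }
    // Anything other than M or C is reported as null so it can be checked by hand.
    return in_array($unit, array('M', 'C'), true) ? $unit : null;
}

echo normalizeUom('/M'), "\n";
echo normalizeUom('c'), "\n";
```

That way the table only ever sees two unit codes, and anything unexpected shows up as null instead of slipping into an insert.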
  • joebert
  • Genius
  • Posts: 13511
  • Loc: Florida

Post 3+ Months Ago

I like to use preg_* functions because it's easy to work in different formats down the road. Just about every time I get started with something, I end up encountering freak instances down the road and when I just have to alter a regex pattern those are easy to deal with.

If you're sure there are no spaces at the beginning of the strings, and that the first N characters will be ones you want, there's the strspn function.

In the conditions I described, that function will end up returning the position of the first bunk character. Technically, it's returning the length of the part of the string you want.

For instance, if you have the string "12.34/m" or "12.34m", in both cases strspn restricted to numbers and a dot would return 5.

PHP Code:
<?php

$str = array(
   '12.34m',
   '12.34/m',
   '12.34M'
);

foreach($str as $s)
{
   echo "$s: " . substr($s, 0, strspn($s, '1234567890.')) . "\n";
}

?>
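
Since strspn gives the length of the numeric part, the same number can double as the offset of the leftover characters, so nothing has to be thrown away. A small sketch along those lines, assuming every string starts with the number and running on PHP 7 or later; the variable names and the $parts array are just for illustration:

```php
<?php
$str = array('12.34m', '12.34/m', '12.34M', '12.34');
$parts = array();

foreach ($str as $s) {
    $len  = strspn($s, '1234567890.'); // length of the leading digits-and-dot run
    $cost = substr($s, 0, $len);       // the numeric part, e.g. "12.34"
    $uom  = substr($s, $len);          // whatever follows it, e.g. "/m"
    $parts[$s] = array($cost, $uom);
    echo "$s: cost=$cost uom=$uom\n";
}
```

Unlike a regex, this doesn't check what the suffix actually is, so it may be worth validating $uom separately before it goes into the database.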
  • devilwood
  • Silver Member
  • Posts: 447

Post 3+ Months Ago

Ahh, yes that is what I'm looking for. That's got it. strspn is exactly what I needed. thnx.

© 1998-2017. Ozzu® is a registered trademark of Unmelted, LLC.