explode() question. results not accurate.

  • CStrauss
  • Student
  • Student
  • User avatar
  • Joined: 23 Mar 2006
  • Posts: 75
  • Loc: St. Louis MO. USA
  • Status: Offline

Post May 5th, 2008, 10:27 am

Here is the code in question.

<?php
// This represents how data is stored in the database, with <p> tags.
$article_content = "<p>This is paragraph 1.</p><p>This is paragraph 2</p><p>This is paragraph 3</p><p>This is paragraph4</p>";

// Number of paragraphs per page
$para = 1;

// Split up the article by paragraphs
$paragraph = explode("<p>", $article_content);

$total_paragraph = count($paragraph);
echo $total_paragraph;

?>


When I run the code and print $total_paragraph, the result is 5, but there are only 4 paragraphs. Why is it showing 5? Could it be counting the </p> as well? Is there a better way, or another function other than explode(), to parse out each paragraph separately?
Rules and Bones Where Made To Be Broken. Now Which Do You Want? Broken Rules Or Your Bones Broken.
  • Bigwebmaster
  • Site Admin
  • Site Admin
  • User avatar
  • Joined: 20 Dec 2002
  • Posts: 6225
  • Loc: Seattle, WA
  • Status: Offline

Post May 5th, 2008, 10:43 am

It might be doing that because it uses the <p> as the divider or boundary, and since you are starting with a <p> it might be counting the empty side to the left of it.
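A minimal sketch of that behaviour, using the sample string from the first post. The array_filter() step is just one possible workaround and is an assumption, not something proposed in the thread; note that each remaining piece still ends with its closing </p>.

```php
<?php
// Same sample string as in the first post.
$article_content = "<p>This is paragraph 1.</p><p>This is paragraph 2</p><p>This is paragraph 3</p><p>This is paragraph4</p>";

// explode() returns every piece between delimiters, including the empty
// string that sits to the left of the very first "<p>".
$pieces = explode("<p>", $article_content);
var_dump($pieces[0] === "");   // the leading element is the empty string
echo count($pieces), "\n";     // 5

// Possible workaround (assumption): drop empty pieces, then re-index from 0.
$paragraphs = array_values(array_filter($pieces, function ($piece) {
    return $piece !== "";
}));
echo count($paragraphs), "\n"; // 4
```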

If I were to do this I would actually use the preg_match function as in something like this:

preg_match("/<p>.*?<\/p>/i",$article_content,$matches);
$total_paragraph = count($matches);
echo $total_paragraph;


The PHP preg_match function basically uses regular expressions for matching and stores each match into the $matches array. At that point I just counted how many elements were put into the array to figure out how many paragraphs it found. I believe this should work correctly.
Webmaster Resources
UNFLUX.net - Quality Web Hosting
  • CStrauss

Post May 5th, 2008, 11:00 am

Thanks big, I will play with your code. I knew regex would be better, but I'm still really trying to learn it. I'm at the point where I understand what preg_match or any regex function does; it's knowing which symbols to use in the parameters (the *, $, ?, etc.) that throws me off and makes me go cross-eyed.

Anyway, thanks for the tip. I will play with it and see how it works in the little script I'm writing.
  • Bigwebmaster

Post May 5th, 2008, 11:05 am

Using regex can be confusing, especially if you have not had much practice with it. I would highly recommend you spend time learning and becoming comfortable with it though as it is very powerful for many types of things you will need to do. I use pattern matching in almost every program I write at some point.
  • CStrauss

Post May 5th, 2008, 11:27 am

Yeah, I've been looking for a good book that covers regex and PHP, but I haven't found one yet. I have found some articles on the web, but none that puts it in words I can understand very well.

As for your code, I tried it and it returned a result of 1 paragraph instead of 3 (there are 4, but the first element of the array is stored at index 0). I can still access all the elements of the array by using echo $paragraph[2], and so forth.

So does that mean it's working, or should the result be 3 instead of 1?
  • CStrauss

Post May 5th, 2008, 11:39 am

Correction: I forgot to take out the explode() in my code, which is why I was able to access the $paragraph array. I'm still figuring out why it displays 1 instead of 4 paragraphs with your preg_match.
  • spork
  • /dev/null
  • Silver Member
  • User avatar
  • Joined: 22 Sep 2003
  • Posts: 3987
  • Loc: Rochester, NY
  • Status: Offline

Post May 5th, 2008, 12:05 pm

preg_match("/<p>.*<\/p>/Ui", $article_content, $matches);
$total_paragraph = count($matches);
echo $total_paragraph;

Try adding the U option to the end of the regex to make it non-greedy.
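For comparison, here is a small sketch of greedy versus non-greedy matching with preg_match. The two-paragraph $html string is made up for illustration and is not from the thread.

```php
<?php
// Hypothetical two-paragraph sample, shortened for readability.
$html = "<p>one</p><p>two</p>";

preg_match("/<p>.*<\/p>/", $html, $greedy);    // greedy: runs to the last </p>
preg_match("/<p>.*?<\/p>/", $html, $lazy);     // lazy quantifier .*?
preg_match("/<p>.*<\/p>/U", $html, $ungreedy); // U modifier inverts greediness

echo $greedy[0], "\n";   // <p>one</p><p>two</p>
echo $lazy[0], "\n";     // <p>one</p>
echo $ungreedy[0], "\n"; // <p>one</p>
```

In all three cases count($matches) is 1, because preg_match stops after the first match whether or not the pattern is greedy.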
That's great, but 3 weeks is so far away...
I'm going TOMORROW.
  • Bigwebmaster

Post May 5th, 2008, 12:40 pm

The purpose of the ? in the .*? is to make it non-greedy as well. So either of those ways should have the same effect. I did make a typo earlier and had this line:

$total_paragraph = count($matches);

as:

$total_paragraph = $count($matches);


So hopefully you had taken the corrected version without the $.
  • joebert
  • Turdburglus III
  • Genius
  • User avatar
  • Joined: 10 Feb 2004
  • Posts: 8802
  • Loc: Clearwater, FL
  • Status: Offline

Post May 5th, 2008, 12:49 pm

count($matches) is good with preg_match_all, but realistically you're only ever going to have 0 or 1 returned from preg_match.

Using preg_match_all will likely clear this issue up. :)

Quote:
but still figuring out why it displays 1 instead of 4 paragraphs with your preg_match
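A short sketch of the difference, run against the sample string from the first post. The variable names $found, $total and $all are chosen here for illustration.

```php
<?php
$article_content = "<p>This is paragraph 1.</p><p>This is paragraph 2</p><p>This is paragraph 3</p><p>This is paragraph4</p>";
$pattern = "/<p>.*?<\/p>/";

// preg_match stops after the first match, so it returns 0 or 1.
$found = preg_match($pattern, $article_content, $matches);
echo $found, "\n";          // 1
echo count($matches), "\n"; // 1 (only the full text of the first match)

// preg_match_all keeps going and returns the number of full matches.
$total = preg_match_all($pattern, $article_content, $all);
echo $total, "\n";          // 4
echo count($all[0]), "\n";  // 4 (the full matches live in $all[0])
```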
Error: 0xC0FFEE is empty
ooms - bb3-mods
  • Bigwebmaster

Post May 5th, 2008, 12:55 pm

I just did a quick test and joebert is correct. This code works:

preg_match_all("/<p>.*?<\/p>/",$article_content,$matches);
$total_paragraph = count($matches[0]);
echo $total_paragraph;


I figured it out after seeing the results returned from using the print_r($matches) function.
  • CStrauss

Post May 5th, 2008, 1:09 pm

I think I figured it out, duh. I found a little tutorial on regex and am reading it as I try to learn. So let me know if this sounds correct.

preg_match("/<p>.*?<\/p>/",$article_content,$matches);


If I'm understanding this tutorial correctly, preg_match returns 0 (false, no match found) or 1 (true, a match was found)? If so, that could be why $total_paragraph has a value of 1 when I echo it: it reflects that a match was found, not the number of paragraphs found.
  • CStrauss

Post May 5th, 2008, 1:48 pm

Okay, I was wrong about my last post, because if I type:

echo $matches[0];

it displays the content between the first <p> tags. If I replace [0] with another element number, it prints nothing, so all it's doing is finding the first paragraph. My overall goal in this project is to store all the paragraphs in an array, but with all your examples it's storing only 1 of the 4 paragraphs I want stored.
  • Bigwebmaster

Post May 5th, 2008, 2:10 pm

Did you use the example I last posted above with the preg_match_all function? The problem with the preg_match function is that it will only match once, while preg_match_all is like using the 'g' option in Perl, where it matches globally until everything is used up. I probably should have tested my code the first time before posting here, but I did test the code above and it returns 4 correctly for me now.
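Since the stated goal is to hold every paragraph in an array and show $para paragraphs per page, one way that might look is sketched below. The use of array_chunk(), the /s modifier (so that . also matches newlines inside a paragraph) and the $page_number variable are assumptions for illustration, not part of the tested code above.

```php
<?php
$article_content = "<p>This is paragraph 1.</p><p>This is paragraph 2</p><p>This is paragraph 3</p><p>This is paragraph4</p>";
$para = 1; // paragraphs per page, as in the first post

// Collect every <p>...</p> block; the full matches end up in $matches[0].
preg_match_all("/<p>.*?<\/p>/s", $article_content, $matches);
$paragraphs = $matches[0];

// array_chunk() groups the paragraphs into pages of $para items each.
$pages = array_chunk($paragraphs, $para);
echo count($pages), "\n"; // 4 pages when $para is 1

// $page_number is a hypothetical 1-based page index (e.g. from a query string).
$page_number = 2;
echo implode("\n", $pages[$page_number - 1]), "\n"; // <p>This is paragraph 2</p>
```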
  • CStrauss

Post May 5th, 2008, 2:12 pm

Okay lol, I finally got it to work; I had missed Bigwebmaster's last post. But one thing kind of throws me off, and I'd like an explanation of what this statement means:

$total_paragraph = count($matches[0]);


What's the purpose of count($matches[0]); instead of just using $matches?

Because $matches is an array and [0] is the first element in that array, it looks to me like it counts only the first element in the array?

Anyway, if anyone could explain that so I can understand it better, it would be helpful. Overall, thanks to you all for the replies and the help.
  • Bigwebmaster

Post May 5th, 2008, 2:51 pm

The reason why is because it is a multidimensional array. If you look at the definition for preg_match_all here:

http://www.php.net/preg_match_all

It says:

Quote:
matches:
Array of all matches in multi-dimensional array ordered according to flags.


So for example, if you set the flag PREG_PATTERN_ORDER, it orders the results so that $matches[0] is an array of full pattern matches, $matches[1] is an array of strings matched by the first parenthesized subpattern, and so on.

Since you passed no flags, it defaulted to PREG_PATTERN_ORDER and simply stored all of the matches in $matches[0]. If you had used parentheses for additional subpattern matching, it would have stored another array in $matches[1].
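A small sketch of how the result array is laid out once a parenthesized subpattern is added. The two-paragraph $html string is made up for illustration; PREG_SET_ORDER is shown only for contrast with the default.

```php
<?php
// Hypothetical two-paragraph sample, shortened for readability.
$html = "<p>one</p><p>two</p>";

// Default flag (PREG_PATTERN_ORDER): one sub-array per pattern group.
preg_match_all("/<p>(.*?)<\/p>/", $html, $by_pattern);
print_r($by_pattern[0]); // full matches: <p>one</p>, <p>two</p>
print_r($by_pattern[1]); // first subpattern: one, two

// PREG_SET_ORDER: one sub-array per match instead.
preg_match_all("/<p>(.*?)<\/p>/", $html, $by_set, PREG_SET_ORDER);
print_r($by_set[0]);       // first match: <p>one</p>, one
echo count($by_set), "\n"; // 2
```

With the default ordering, count($matches[0]) gives the number of matches, while count($matches) gives one plus the number of subpatterns.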