regex help

byrdman
administrator
Posts: 225
Joined: Thu May 08, 2003 1:59 pm
Location: In the cloud

Post by byrdman » Mon Dec 07, 2009 5:11 pm

I was wondering if anyone would have a solution to my problem:

I am trying to get stock information from a site and use it in a way that is better for me.

The site I am using lets me pull the company name and symbol, but it returns them as one line.

So when I pull Google's stock, the first line I get is:

Google Inc. (Nasdaq:GOOG)

I would like to have PHP take everything before the first paren "(" and assign it to $name,
and take everything between the ":" and the last paren ")" and assign it to $symbol.

It seems like it would be easy, but I can't figure it out, as the regex rules REALLY confuse me. Or is regex even the right way to go?

Void Main
Site Admin
Posts: 5715
Joined: Wed Jan 08, 2003 5:24 am
Location: Tuxville, USA

Post by Void Main » Wed Dec 09, 2009 6:33 am

This is easy to do using Perl regular expressions:

stock.pl:

Code:

#!/usr/bin/perl -w

use strict;

my $html = "Google Inc. (Nasdaq:GOOG)";

if ($html =~ /(.*)\((.*)\)/) {
   my $name = $1;
   my $symbol = $2;
   print "name: $name\n";
   print "symbol: $symbol\n";
}

PHP has similar regular expression support, including Perl-compatible regular expressions through the preg_* functions:

stock.php:

Code:

#!/usr/bin/php
<?php
$html = "Google Inc. (Nasdaq:GOOG)";

if (preg_match('/(.*)\((.*)\)/',$html,$matches)) {
   $name = $matches[1];
   $symbol = $matches[2];
   print "name: $name\n";
   print "symbol: $symbol\n";
}
?>
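
Note that the pattern above leaves a trailing space on the name and keeps the exchange prefix in the symbol ("Nasdaq:GOOG"), while the question asked for only the part after the colon. Below is a minimal sketch of a tighter pattern, assuming every line has the shape "Name (Exchange:SYMBOL)". The helper name parse_stock_line is hypothetical and not part of the original posts.

```php
<?php
// Hypothetical helper: splits "Name (Exchange:SYMBOL)" into name and symbol.
// Assumes exactly one "(Exchange:SYMBOL)" group at the end of the line.
function parse_stock_line($line) {
    // ^\s*(.*?)\s*   the name, without surrounding whitespace
    // \(             the opening paren of the (Exchange:SYMBOL) group
    // [^:()]*:       the exchange prefix up to the colon (not captured)
    // ([^()]*)\)     the symbol, up to the closing paren
    if (preg_match('/^\s*(.*?)\s*\([^:()]*:([^()]*)\)\s*$/', $line, $m)) {
        return array('name' => $m[1], 'symbol' => trim($m[2]));
    }
    return null; // the line did not have the expected shape
}

$parsed = parse_stock_line("Google Inc. (Nasdaq:GOOG)");
print "name: {$parsed['name']}\n";     // name: Google Inc.
print "symbol: {$parsed['symbol']}\n"; // symbol: GOOG
```

Returning null for a non-matching line lets the caller skip or log unexpected input instead of working with empty strings.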

Post by byrdman » Fri Dec 11, 2009 8:32 am

perfect! thank you!!

Now for a more difficult one. I found a cheat sheet on regex, was doing some practicing, and ran into a bump.

My PHP reads from an XML file. It loops through each of the nodes and pulls the data correctly. The second variable it grabs from the XML is in this format:

last: 1,345.34, change: +4.34, percent: +1.45%

so, statically I can do a regex to grab

Code:

$description = 'last: 1,345.34, change: +4.34, percent: +1.45%';
preg_match("/^last:\s?([\d,\-\+\.]*),\schange:\s?([\d,\-\+\.]*),\spercent:\s?([\d,\-\+\.%]*)$/"

Post by Void Main » Fri Dec 11, 2009 10:42 am

Is that "x$" a typo?

Post by byrdman » Fri Dec 11, 2009 11:02 am

yes, my apologies. the original has [$x]

Post by Void Main » Fri Dec 11, 2009 11:19 am

I almost need to see a little more code with the loop.

Post by byrdman » Fri Dec 11, 2009 11:34 am

Code:

for ($x=0;$x<count($title_array);$x++) {
//insert rss item into database store
        $title = $title_array[$x]->title;
        $newdesc = $title_array[$x]->description;
                preg_match_all("/^\slast:\s?([\d,\-\+\.]*),change:\s?([\d,\-\+\.}*),percent:\s?([\d,\-\+\.%]*)$/", $newdesc, $matches);
                $price = $matches[1];
                $change = $matches[2];
                $pchange = $matches[3];

        //$insert_query = "INSERT INTO $tablename (`name`,`price`,`change`, `pchange`) VALUES ('$title','$price','$change','$pchange')";
        //$insert_result = mysql_query($insert_query);  
        //if (!$insert_result){echo $insert_query; exit;}                                 

//echo "<br>";
echo "Title: $title";
//echo "<br>";
//      echo "Price: $newdesc";
echo "Price: $matches[1]";
echo "Change: $matches[2]";
echo "PChange: $matches[3]";
        }

}

Above is the loop in the code. If I uncomment the line echo "Price: $newdesc"; it displays the whole line. The line echo "Price: $matches[1]"; prints "Array" and not the actual price. It is the same with echo "Change: $matches[2]"; and echo "PChange: $matches[3]"; does not display anything...

I put my regex into the Regex Tester
http://www.myregextester.com/index.php and it displays:

Code:

$matches Array:
(
    [0] => Array
        (
            [0] =>  last: 297.86,change: +0.69,percent: +0.23%
        )

    [1] => Array
        (
            [0] => 297.86
        )

    [2] => Array
        (
            [0] => +0.69
        )

    [3] => Array
        (
            [0] => +0.23%
        )

)

So I am close, or at least my regex is correct, but is $matches[1] correct? Is there another level of the array? Should it be $matches[1][0]?

Post by Void Main » Fri Dec 11, 2009 1:05 pm

You were using preg_match_all, which fills $matches as a multidimensional array, instead of preg_match, which fills $matches as a single-dimension array. It's possible you could use preg_match_all in place of your loop and cut down on code, but this works. Also, you had a "}" instead of a "]" in your regex. I made some mods and this works for me:

Code:

<?php
// Sample data standing in for the items parsed from the XML feed.
$title_array = array(
   (object) array('title' => 'goog 1', 'description' => 'last: 297.86,change: +0.69,percent: +0.23%'),
   (object) array('title' => 'goog 2', 'description' => 'last: 397.86,change: +0.49,percent: +0.28%'),
);

for ($x=0;$x<count($title_array);$x++) {

   $title = $title_array[$x]->title;
   $newdesc = $title_array[$x]->description;
   if (preg_match("/last:\s?([\d,\-\+\.]*),change:\s?([\d,\-\+\.]*),percent:\s?([\d,\-\+\.%]*)/", $newdesc, $matches)) {
      $price = $matches[1];
      $change = $matches[2];
      $pchange = $matches[3];

      echo "============\n";
      echo "title $x: $title\n";
      echo "newdesc $x: $newdesc\n";
      echo "Price: $price\n";
      echo "Change: $change\n";
      echo "PChange: $pchange\n";
      echo "============\n";
   }
}
?>

$ php tst.php
============
title 0: goog 1
newdesc 0: last: 297.86,change: +0.69,percent: +0.23%
Price: 297.86
Change: +0.69
PChange: +0.23%
============
============
title 1: goog 2
newdesc 1: last: 397.86,change: +0.49,percent: +0.28%
Price: 397.86
Change: +0.49
PChange: +0.28%
============
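
To make the difference between the two functions concrete, here is a small sketch that runs the same pattern through preg_match and preg_match_all and reads the price from each result. It reuses one of the sample description strings from this thread. The variable names $one, $all and $sets are only illustrative.

```php
<?php
$desc = 'last: 297.86,change: +0.69,percent: +0.23%';
$re = '/last:\s?([\d,\-\+\.]*),change:\s?([\d,\-\+\.]*),percent:\s?([\d,\-\+\.%]*)/';

// preg_match fills a flat array: index 0 is the whole match, 1..n are the groups.
preg_match($re, $desc, $one);
echo "preg_match price: {$one[1]}\n";

// preg_match_all fills an array of arrays (PREG_PATTERN_ORDER by default):
// $all[1] lists every match of group 1, so the first price is $all[1][0].
preg_match_all($re, $desc, $all);
echo "preg_match_all price: {$all[1][0]}\n";

// With PREG_SET_ORDER each element holds one complete match instead.
preg_match_all($re, $desc, $sets, PREG_SET_ORDER);
echo "set-order price: {$sets[0][1]}\n";
```

This is also why echo "Price: $matches[1]" printed "Array" in the earlier loop: with preg_match_all, $matches[1] is itself an array, and PHP converts an array inside a string to the word "Array".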

Post by byrdman » Fri Dec 11, 2009 3:34 pm

THANK YOU!! it worked!!

Post by byrdman » Tue Mar 16, 2010 9:17 am

Another regex request for help.

I have a request from a client to get them NCAA scores. My feed gives me:

March 13, 2010: Team A 68* vs. Team B 61 (NCAAB)
March 13, 2010: North Team A 82* vs. South Team B 75 (NCAAB)

How could I take the above to

$date = $matches[1]
$team_a = $matches[2]
$score_a = $matches[3]
$team_b = $matches[5]
$score_b = $matches[6]

so that each line gives, for example for the second line:

$date = March 13, 2010
$team_a = North Team A
$score_a = 82
$team_b = South Team B
$score_b = 75

Thanking you in advance if anyone can help... :oops:
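
One possible approach, offered as a sketch rather than a definitive answer: a single preg_match per line with lazy groups for the team names. It assumes every line follows the two samples above exactly (date, colon, team, score, optional "*", "vs.", team, score, optional "*", league in parens). The helper name parse_score_line is hypothetical. The groups are numbered 1 to 5 here, so the second team's fields land in $m[4] and $m[5] rather than the 5 and 6 guessed in the question.

```php
<?php
// Hypothetical helper; returns null when a line does not fit the assumed format.
function parse_score_line($line) {
    // (\w+ \d{1,2}, \d{4})   date such as "March 13, 2010"
    // (.+?)\s+(\d+)\*?       team name (lazy), score, optional asterisk
    // \(\w+\)                league tag such as (NCAAB), matched but not captured
    $re = '/^(\w+ \d{1,2}, \d{4}):\s+(.+?)\s+(\d+)\*?\s+vs\.\s+(.+?)\s+(\d+)\*?\s+\(\w+\)\s*$/';
    if (!preg_match($re, $line, $m)) {
        return null;
    }
    return array(
        'date'    => $m[1],
        'team_a'  => $m[2],
        'score_a' => (int) $m[3],
        'team_b'  => $m[4],
        'score_b' => (int) $m[5],
    );
}

$feed = array(
    'March 13, 2010: Team A 68* vs. Team B 61 (NCAAB)',
    'March 13, 2010: North Team A 82* vs. South Team B 75 (NCAAB)',
);
foreach ($feed as $line) {
    $g = parse_score_line($line);
    if ($g !== null) {
        echo "{$g['date']} | {$g['team_a']} {$g['score_a']} | {$g['team_b']} {$g['score_b']}\n";
    }
}
```

The asterisk is simply skipped here. If it carries meaning in the feed, it could be captured with an extra group instead.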
