[Cialug] OT: can somebody help me with a regular expression?

Chris Freeman cwfreeman at gmail.com
Fri Jul 25 22:13:54 CDT 2008


On Fri, Jul 25, 2008 at 4:39 PM, Todd Walton <tdwalton at gmail.com> wrote:

> On Fri, Jul 25, 2008 at 3:07 PM, Chris Freeman <cwfreeman at gmail.com>
> wrote:
> > \b(|D|PP)\d{1,3}(,{0,1}\d{3})*
>
> You'll miss the first 999 patents.  I'm guessing that's not gonna
> matter, but couldn't you make character 13 above be a zero instead of
> a one?
>

I'm not sure how it's going to miss the first 999 patents. It matches 88,
and 999, etc, just fine in my tests.

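But \b also matches next to a comma (the boundary between a digit and a
comma counts as a word boundary), and the pattern has nothing anchoring its
end, so malformed input like 12,34,56 still matches on its leading digits.
You'd need something like: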
(\s|^)(|D|PP)\d{1,3}(,{0,1}\d{3})*(\s|$)
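
To see the comma problem concretely, here is a minimal sketch that runs both
patterns from this thread against the two malformed inputs. The names
$with_b, $with_ws and verdict() are made up for illustration; only the
patterns themselves come from the messages above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The two patterns from this thread; the variable names are illustrative.
my $with_b  = qr/\b(|D|PP)\d{1,3}(,{0,1}\d{3})*/;
my $with_ws = qr/(\s|^)(|D|PP)\d{1,3}(,{0,1}\d{3})*(\s|$)/;

# Return "Match" or "No match" for one pattern and one input string.
sub verdict {
    my ($re, $text) = @_;
    return $text =~ $re ? "Match" : "No match";
}

# Both inputs are malformed, yet the \b version accepts them because it
# is satisfied by the leading digits alone.
for my $text ("12,34,56", "12,345,678,") {
    printf "%-12s \\b: %-8s  \\s: %s\n",
        $text, verdict($with_b, $text), verdict($with_ws, $text);
}
```

The \b version reports a match for both inputs, while the whitespace-delimited
version rejects them.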

I'm using "\s" to match spaces, which may not be a valid assumption.
However, it does correctly match all of the examples set forth (including 1,
12, and 123).
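
If the one-liner is hard to read, the same pattern can be spread out with
Perl's /x modifier so that each piece carries a comment. This is only a
restatement for readability, not a different regex; the name $patent_re is
hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same pattern as above, laid out with /x. $patent_re is an illustrative name.
my $patent_re = qr/
    (\s|^)           # preceded by whitespace or start of line
    (|D|PP)          # optional prefix: nothing, D, or PP
    \d{1,3}          # leading group of one to three digits
    (,{0,1}\d{3})*   # any number of three-digit groups, comma optional
    (\s|$)           # followed by whitespace or end of line
/x;

for my $text ("1", "12", "123", "A123", "PP12,345,678", "12,34,56") {
    print "$text: ", ($text =~ $patent_re ? "Match" : "No match"), "\n";
}
```

Because \d{1,3} already allows a single digit, short numbers like 1, 12, and
123 match without changing the lower bound to zero.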

Chris

$ cat tmp.pl
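#!/usr/bin/perl
use strict;
use warnings;

# Read candidate patent numbers from STDIN, one per line, and report
# whether each line matches the pattern.
while (<STDIN>) {
    if ( $_ =~ /(\s|^)(|D|PP)\d{1,3}(,{0,1}\d{3})*(\s|$)/ ) {
        print "Match\n";
    } else {
        print "No match\n";
    }
}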

$ perl tmp.pl
1
Match
A1
No match
D1
Match
PP1
Match
12
Match
A12
No match
D12
Match
PP12
Match
123
Match
A123
No match
D123
Match
PP123
Match
1234
Match
A1234
No match
D1234
Match
PP1234
Match
1,234
Match
A1,234
No match
D1,234
Match
PP1,234
Match
12,345,678
Match
A12,345,678
No match
D12,345,678
Match
PP12,345,678
Match
12,34,56
No match
12,345,678,
No match