[Cialug] OT: can somebody help me with a regular expression?

Chris Freeman cwfreeman at gmail.com
Fri Jul 25 15:07:26 CDT 2008


Try:

\b(|D|PP)\d{1,3}(,{0,1}\d{3})*\b
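
If it helps, here is a rough sketch for checking the idea against your
examples. It uses Python's re module only because that is quick to run; the
constructs involved (\b, \d, lookahead, fixed-width lookbehind) should behave
the same way in java.util.regex. BASIC is the pattern with a trailing \b,
which makes the engine backtrack until a run such as 7225309 is consumed
whole. STRICT is a hypothetical stricter variant, not something from your
spec: it uses lookarounds to skip digits glued to '.', '-' or ',', which is
how pi, SSNs and IP addresses show up. The names BASIC, STRICT and
find_patents are made up for the example.

```python
import re

# The suggested pattern with a trailing \b, so the match cannot stop
# in the middle of a run of digits.
BASIC = re.compile(r"\b(|D|PP)\d{1,3}(,{0,1}\d{3})*\b")

# Hypothetical stricter variant (an assumption, not part of the request):
#   (?<![\w.,-])  the match may not start right after a word char, '.', ',' or '-'
#   (?![\w-])     it may not be followed directly by a word char or '-'
#   (?![.,]\d)    it may not be followed by '.' or ',' and then another digit
STRICT = re.compile(
    r"(?<![\w.,-])(?:D|PP)?\d{1,3}(?:,?\d{3})*(?![\w-])(?![.,]\d)"
)


def find_patents(pattern, text):
    """Return every substring of text matched by the compiled pattern."""
    return [m.group(0) for m in pattern.finditer(text)]


if __name__ == "__main__":
    sample = ("7,225,309 or 7225309 and 88,809 or 88809 "
              "and D339,456 or D339456 PP12,345")
    noise = "pi is 3.14159, SSN 123-45-6789, host 192.168.1.10"

    print(find_patents(BASIC, sample))
    print(find_patents(STRICT, sample))
    print(find_patents(BASIC, noise))   # still picks up pieces of the noise
    print(find_patents(STRICT, noise))  # the lookarounds filter them out
```

The trade-off with STRICT is that it also skips a number followed directly by
a hyphen (for example "7,225,309-based"), so loosen the lookaheads if your
input contains that kind of text.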

Chris

On Fri, Jul 25, 2008 at 2:46 PM, Nathan C. Smith <nathan.smith at ipmvs.com>
wrote:

>
> Do we have any regex wizards in our midst?
>
> I need help creating a regular expression that can find a US patent number
> in a block of text.
>
> Elements that should match could look like 7,225,309 or 7225309 and 88,809
> or 88809 and D339,456 or D339456.  Some numbers also start with PP for plant
> patent.
>
> Obviously 7225309 looks like pretty much any number, but I'm using a
> free-form input where any number is likely to be a patent.  I'm not sure if
> I should use one, two, or more regexes or really how to approach the
> problem.  Do I try to match the numbers as numbers or as characters, or do
> the comma format one way and the non-comma format the other?
>
> This was my first attempt:
> \b([D]?[0-9](,?)|D)[0-9]?[0-9]?[0-9](,?)[0-9][0-9][0-9]
>
> It matches most of my test cases, but I need to figure out how to account
> for shorter variations without commas (more question marks?) and probably
> add more alternations with '|'.  It also has the drawback of matching
> 3.14159.  It also looks a little long, and I suspect there is a better way
> to do some of it.
>
> It looks like I would be happy matching just about anything that wasn't pi,
> a Social Security number, or an IP address.  The expression engine I am
> using is in Java but is *supposed* to be PCRE-compatible.
>
> Thanks.
>
> -Nate