[Cialug] OT: can somebody help me with a regular expression?

Colin Burnett cmlburnett at gmail.com
Fri Jul 25 15:06:28 CDT 2008


On Fri, Jul 25, 2008 at 2:46 PM, Nathan C. Smith <nathan.smith at ipmvs.com> wrote:
>
> Do we have any regex wizards in our midst?
>
> I need help creating a regular expression that can find a US patent number in a block of text.

I know it's always fun to write one really huge regex that does exactly
what you want in one swoop, but I don't think the time put in is worth
it (unless pride is your cardinal sin).  It took about 15 seconds to
write these 9.  In each group the first two handle comma-ed numbers and
the last handles comma-less ones.  The first group is for standard
patents, then design, then plant.

[0-9]{1,3},[0-9]{3}
[0-9]{1,3},[0-9]{3},[0-9]{3}
[0-9]{1,7}

D[0-9]{1,3},[0-9]{3}
D[0-9]{1,3},[0-9]{3},[0-9]{3}
D[0-9]{1,7}

PP[0-9]{1,3},[0-9]{3}
PP[0-9]{1,3},[0-9]{3},[0-9]{3}
PP[0-9]{1,7}
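
As a rough sketch of how the nine patterns could be run one at a time,
here is a Python version.  Python's re module writes a repetition range
with a comma, as in {1,3}, so that is the spelling used here.  The
names PATTERNS and find_candidates are made up for illustration.

```python
import re

# The nine patterns, grouped as in the list above: standard (utility),
# design, then plant. Repetition ranges use the {m,n} spelling.
PATTERNS = {
    "utility": [
        r"[0-9]{1,3},[0-9]{3}",
        r"[0-9]{1,3},[0-9]{3},[0-9]{3}",
        r"[0-9]{1,7}",
    ],
    "design": [
        r"D[0-9]{1,3},[0-9]{3}",
        r"D[0-9]{1,3},[0-9]{3},[0-9]{3}",
        r"D[0-9]{1,7}",
    ],
    "plant": [
        r"PP[0-9]{1,3},[0-9]{3}",
        r"PP[0-9]{1,3},[0-9]{3},[0-9]{3}",
        r"PP[0-9]{1,7}",
    ],
}


def find_candidates(text):
    """Run every pattern separately; return (kind, matched_text, start, end) tuples."""
    hits = []
    for kind, patterns in PATTERNS.items():
        for pattern in patterns:
            for m in re.finditer(pattern, text):
                hits.append((kind, m.group(), m.start(), m.end()))
    return hits


if __name__ == "__main__":
    for hit in find_candidates("See U.S. Pat. No. 5,123,456 and D345,678."):
        print(hit)
```

Because each pattern is run on its own, one number shows up several
times: the design number D345,678 is also reported by the standard
patterns as 345,678, and the comma-less pattern reports every run of
digits by itself.  Those are the extra hits that need weeding out.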

I assume you'll have to weed out bad ones (though you could probably
get rid of the 1-to-7 digit range and go with 3 to 7 unless you really
are dealing with the first 99 patents issued).  I've never analyzed run
times of regexes, but as far as I know the more complex you get the
longer they take to run, so I don't see a problem with running 9
simpler regexes.
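
One possible way to do the weeding, sketched in Python: collect the
span of every match from all nine patterns, then drop any span that
lies entirely inside another one.  This is only one assumption about
what counts as a "bad one", and the name find_patent_numbers is
hypothetical.

```python
import re

# Build the nine patterns from three number shapes and three prefixes
# (no prefix for standard patents, D for design, PP for plant).
BODIES = [
    r"[0-9]{1,3},[0-9]{3}",
    r"[0-9]{1,3},[0-9]{3},[0-9]{3}",
    r"[0-9]{1,7}",
]
PREFIXES = ["", "D", "PP"]
PATTERNS = [prefix + body for prefix in PREFIXES for body in BODIES]


def find_patent_numbers(text):
    """Return matched strings, keeping only spans not contained in another span."""
    spans = set()
    for pattern in PATTERNS:
        for m in re.finditer(pattern, text):
            spans.add((m.start(), m.end()))

    # Drop a span if some other span covers it completely.
    kept = [
        s for s in spans
        if not any(o != s and o[0] <= s[0] and s[1] <= o[1] for o in spans)
    ]
    return [text[start:end] for start, end in sorted(kept)]


if __name__ == "__main__":
    sample = "See U.S. Pat. No. 5,123,456, D345,678 and PP12,345."
    print(find_patent_numbers(sample))
```

This only removes fragments of longer matches.  A bare number such as a
year still comes through as a candidate, so some check on the
surrounding text would still be needed.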


Colin

