[Cialug] Text Processing Choices

Josh More morej at alliancetechnologies.net
Tue Jun 30 11:19:36 CDT 2009


I just do it all in perl, but I'm sure that surprises no one.

Really, there are always many ways to do it, so pick one, learn it and
learn it well.  Then you'll know where its weak spots are and you can
learn something else to supplement them.  (For the record, perl's main
weak spot is that it is extremely easy to write bad code and have it
still work.  This isn't bad if you're a good programmer, but does make
some of the CPAN modules a bit iffy for production use.)


 

-Josh More, RHCE, CISSP, NCLP, GIAC 
 morej at alliancetechnologies.net 
 515-245-7701



>>> Matthew Nuzum <newz at bearfruit.org> 6/30/2009 11:07 AM >>> 
On Tue, Jun 30, 2009 at 8:56 AM, Daniel A. Ramaley <daniel.ramaley at drake.edu> wrote:
> sed, awk, grep, sort, cut, comm, perl. Not necessarily in that order.

Except for perl, I'm with Daniel. If the command line tools don't work
out, I often move to a spreadsheet. For example, I commonly need to get
a bunch of data ready to put into a database, so I bring the raw data
into Calc, split into columns if possible. If not, I put all of it in
one column and use formulas to split it out into columns. Then, in the
far right column, I write a formula that builds an SQL query that can
be pasted in and run. For example:

="INSERT INTO table values ("&A2&", '"&B2&"',"&C2&");"
(and then fill down)
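
The same row-by-row idea can also be sketched outside the spreadsheet.
What follows is only a minimal Python sketch, not the method described
above. It assumes three values per row (number, text, number), and the
names sql_quote, build_insert and mytable are made up for illustration.
Unlike the formula, it doubles any single quote inside the text value
so the statement stays valid. The numeric values are pasted in as-is,
so this is only suitable for data you trust.

```python
def sql_quote(text):
    """Wrap text in single quotes, doubling embedded quotes (standard SQL escaping)."""
    return "'" + str(text).replace("'", "''") + "'"


def build_insert(table, row):
    """Build one INSERT statement from a 3-item row: number, text, number.

    Like the spreadsheet formula, the first and third values are pasted in
    unquoted and the second is wrapped in single quotes.
    """
    first, second, third = row
    return "INSERT INTO {} VALUES ({}, {}, {});".format(
        table, first, sql_quote(second), third
    )


if __name__ == "__main__":
    # Hypothetical sample rows standing in for three spreadsheet columns.
    rows = [(1, "widget", 9.5), (2, "O'Brien", 12)]
    for row in rows:
        print(build_insert("mytable", row))
```

Each printed line plays the role of one filled-down cell in the far
right column.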

Of course, this fails if you have more than about 65k rows (older
spreadsheets stop at 65,536), though I haven't hit this problem in a
while, and I think some spreadsheets handle more now.
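
If the row limit does get in the way, one option is to skip the
spreadsheet for the load step. The sketch below is an assumption-laden
illustration, not something from this thread: it uses Python's csv and
sqlite3 modules, a made-up table called items with columns id, name
and price, and a hypothetical helper named load_rows. Rows are read
one at a time, so there is no row ceiling, and the ? placeholders
leave the quoting to the database driver.

```python
import csv
import io
import sqlite3


def load_rows(conn, csv_file):
    """Stream rows from a CSV file object into the hypothetical 'items' table.

    csv.reader yields one row at a time, so the whole file is never held in
    memory, and the ? placeholders let the driver handle quoting.
    """
    reader = csv.reader(csv_file)
    cur = conn.executemany(
        "INSERT INTO items (id, name, price) VALUES (?, ?, ?)", reader
    )
    conn.commit()
    return cur.rowcount


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER, name TEXT, price REAL)")
    # Hypothetical sample data; a real run would pass an open file instead.
    sample = io.StringIO("1,widget,9.5\n2,O'Brien,12\n")
    print(load_rows(conn, sample), "rows inserted")
```

Other databases have their own Python drivers with different
placeholder styles, so the INSERT line would need adjusting there.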

-- 
Matthew Nuzum
newz2000 on freenode, skype, linkedin, identi.ca and twitter


