[Cialug] Columns of Data

Jeffrey Ollie jeff at ocjtech.us
Fri Jul 31 16:38:45 UTC 2020


BTW, in honor of Sysadmin Day (and not directed at anyone here in
particular): as a greybeard sysadmin myself (literally), my advice to
younger sysadmins is _learn to program_. Python is a fine choice for
sysadmin work, but almost anything will do. There have been *so* many times
over the past <censored/> years when whipping up a script like the one I
just did made my work much easier than trying to bash together tools like
sed & grep to get the job done.

On Fri, Jul 31, 2020 at 11:28 AM Jeffrey Ollie <jeff at ocjtech.us> wrote:

> I'd use Python (or Perl if that floats your boat) regular expressions to
> split up the file. You'll notice that the "start" of each "line" is a
> timestamp or duration with an easily recognized pattern. That can be used
> to split the file up. See the code here:
>
> https://gist.github.com/jcollie/02cd24dffb695210fdeaece31925c7bc
>
> $ python3 test.py < test.txt
> ('9m49s', 'Normal', 'Updated', 'machine/oo-r6sr3-worker-us-east-1d-nvfzh', 'Updated machine\noo-r')
> ('9m47s', 'Normal', 'Updated', 'machine/oo-r6sr3-master-1', 'Updated machine oo-r')
> ('9m46s', 'Normal', 'Updated', 'machine/oo-r6sr3-worker-us-east-1b-hsmsx', 'Updated machine\noo-r')
> ('9m46s', 'Normal', 'Updated', 'machine/oo-r6sr3-worker-us-east-1a-wk5cs', 'Updated machine\noo-r')
> ('9m46s', 'Normal', 'Updated', 'machine/oo-r6sr3-worker-us-east-1e-z9xlb', 'Updated machine\noo-r')
> ('9m44s', 'Normal', 'Updated', 'machine/oo-r6sr3-master-0', 'Updated machine oo-r')
> ('9m43s', 'Normal', 'Updated', 'machine/oo-r6sr3-master-2', 'Updated machine oo-r')
> ('9m43s', 'Normal', 'Updated', 'machine/oo-r6sr3-worker-us-east-1d-tfg6x', 'Updated machine\noo-r')
> ('9m43s', 'Normal', 'Updated', 'machine/oo-r6sr3-worker-us-east-1c-6l42j', 'Updated machine\noo-r')
> ('59s', 'Normal', 'SuccessfulUpdate', 'clusterautoscaler/default', 'Updated ClusterAutoscaler deployment:\nmachine-api/cluster-autoscaler-default\n')
> ('4m7s', 'Normal', 'Pulled', 'pod/gateway-laravel-schedule-1296080-h43n6', 'Container image\n"dockerregistry:4567/group/gateway/master:alpine-nodejs-fpm" already\npresent on machine\n')
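
A minimal sketch of that approach, for anyone who doesn't want to follow
the link. It is not a copy of the gist: the pattern, the name parse_events,
and the SAMPLE text are assumptions made for illustration. A record is
taken to start on any line that begins with a duration such as 59s or
9m49s. Anchoring the pattern to the start of a line keeps the "6s" inside a
name like oo-r6sr3 from being read as a duration. The first four fields are
split on whitespace and the remainder is kept as the message, with wrapped
lines folded back into one.

```python
import re

# Assumption: a record starts on a line beginning with a kubectl-style age
# made of number+unit pairs (d, h, m, s), e.g. "59s", "4m7s", "9m49s".
RECORD_START = re.compile(r"^(?:\d+[dhms])+\s", re.MULTILINE)

# Hypothetical sample, copied from two records of the wrapped text.
SAMPLE = """\
9m49s    Normal    Updated
 machine/oo-r6sr3-worker-us-east-1d-nvfzh    Updated machine
oo-r6sr3-worker-us-east-1d-nvfzh
59s      Normal    SuccessfulUpdate   clusterautoscaler/default
      Updated ClusterAutoscaler deployment:
machine-api/cluster-autoscaler-default
"""


def parse_events(text):
    """Split text into records, then each record into five fields."""
    starts = [m.start() for m in RECORD_START.finditer(text)]
    records = []
    for begin, end in zip(starts, starts[1:] + [len(text)]):
        # The first four fields contain no whitespace; the fifth is
        # whatever is left over, line breaks included.
        fields = text[begin:end].split(None, 4)
        if len(fields) != 5:
            continue  # skip anything that does not have all five columns
        age, kind, reason, obj, message = fields
        # Fold wrapped lines and runs of spaces into single spaces.
        records.append((age, kind, reason, obj, " ".join(message.split())))
    return records


if __name__ == "__main__":
    for record in parse_events(SAMPLE):
        print(record)
```

Each tuple then has exactly five fields, so the message can be printed,
filtered, or written out as CSV without the wrapping getting in the way.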
>
> On Fri, Jul 31, 2020 at 10:18 AM Todd Walton <tdwalton at gmail.com> wrote:
>
>> Happy SysAdmin Day, everyone!
>>
>> Here is an example bit of text coughed up by a Kubernetes command-line
>> tool:
>>
>> 9m49s    Normal    Updated
>>  machine/oo-r6sr3-worker-us-east-1d-nvfzh    Updated machine
>> oo-r6sr3-worker-us-east-1d-nvfzh
>> 9m47s    Normal    Updated            machine/oo-r6sr3-master-1
>>       Updated machine oo-r6sr3-master-1
>> 9m46s    Normal    Updated
>>  machine/oo-r6sr3-worker-us-east-1b-hsmsx    Updated machine
>> oo-r6sr3-worker-us-east-1b-hsmsx
>> 9m46s    Normal    Updated
>>  machine/oo-r6sr3-worker-us-east-1a-wk5cs    Updated machine
>> oo-r6sr3-worker-us-east-1a-wk5cs
>> 9m46s    Normal    Updated
>>  machine/oo-r6sr3-worker-us-east-1e-z9xlb    Updated machine
>> oo-r6sr3-worker-us-east-1e-z9xlb
>> 9m44s    Normal    Updated            machine/oo-r6sr3-master-0
>>       Updated machine oo-r6sr3-master-0
>> 9m43s    Normal    Updated            machine/oo-r6sr3-master-2
>>       Updated machine oo-r6sr3-master-2
>> 9m43s    Normal    Updated
>>  machine/oo-r6sr3-worker-us-east-1d-tfg6x    Updated machine
>> oo-r6sr3-worker-us-east-1d-tfg6x
>> 9m43s    Normal    Updated
>>  machine/oo-r6sr3-worker-us-east-1c-6l42j    Updated machine
>> oo-r6sr3-worker-us-east-1c-6l42j
>> 59s      Normal    SuccessfulUpdate   clusterautoscaler/default
>>       Updated ClusterAutoscaler deployment:
>> machine-api/cluster-autoscaler-default
>> 4m7s     Normal    Pulled
>> pod/gateway-laravel-schedule-1296080-h43n6  Container image
>> "dockerregistry:4567/group/gateway/master:alpine-nodejs-fpm" already
>> present on machine
>>
>> For the purpose of this email, never mind the semantics. This could
>> be anything. But do notice that the output is arranged into neat columns.
>> The first four columns are strings of non-space characters. The fifth
>> column, however, gives us trouble. It seeks to undermine the movement from
>> within, throwing a wrench into the works. Fifth columns, amiright?
>>
>> Here's another example, this one taken from my /var/log/messages:
>>
>> Jun 28 02:50:42 ilm01-ll-ttwalto NetworkManager[2238]: <info>  device
>> (wlp1s0): set-hw-addr: set MAC address to 9:7:8:9:2:F (scanning)
>> Jun 28 02:50:42 ilm01-ll-ttwalto kernel: IPv6: ADDRCONF(NETDEV_UP):
>> wlp1s0: link is not ready
>> Jun 28 02:50:42 ilm01-ll-ttwalto NetworkManager[2238]: <info>  device
>> (wlp1s0): supplicant interface state: inactive -> disabled
>> Jun 28 02:50:42 ilm01-ll-ttwalto NetworkManager[2238]: <info>  device
>> (wlp1s0): supplicant interface state: disabled -> inactive
>> Jun 28 02:55:57 ilm01-ll-ttwalto NetworkManager[2238]: <info>  device
>> (wlp1s0): set-hw-addr: set MAC address to 62:0F:6E:7A:B3:2C (scanning)
>> Jun 28 02:55:57 ilm01-ll-ttwalto kernel: IPv6: ADDRCONF(NETDEV_UP):
>> wlp1s0: link is not ready
>> Jun 28 02:55:57 ilm01-ll-ttwalto NetworkManager[2238]: <info>  device
>> (wlp1s0): supplicant interface state: inactive -> disabled
>>
>> Here again the fifth column is making things difficult. Also the first
>> three could certainly stand to be one column, but at least they're
>> standard, predictable, and manipulable. Manipulable being what I'm looking
>> for.
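
For log lines like these, one option is a regular expression with named
groups that treats the three timestamp pieces as a single column and leaves
the message alone. This is only a sketch: it assumes the traditional
"Mon DD HH:MM:SS host tag: message" layout with each entry on one line, and
the names SYSLOG_LINE and parse_syslog are made up for the example.

```python
import re

# Assumption: classic syslog layout, one entry per line, e.g.
# "Jun 28 02:50:42 host NetworkManager[2238]: <info>  device ..."
SYSLOG_LINE = re.compile(
    r"^(?P<timestamp>[A-Z][a-z]{2}\s+\d{1,2}\s\d{2}:\d{2}:\d{2})\s"
    r"(?P<host>\S+)\s"
    r"(?P<tag>[^:\s]+):\s*"
    r"(?P<message>.*)$"
)


def parse_syslog(line):
    """Return a dict of timestamp, host, tag and message, or None."""
    match = SYSLOG_LINE.match(line.rstrip("\n"))
    return match.groupdict() if match else None


if __name__ == "__main__":
    entry = parse_syslog(
        "Jun 28 02:50:42 ilm01-ll-ttwalto kernel: "
        "IPv6: ADDRCONF(NETDEV_UP): wlp1s0: link is not ready"
    )
    print(entry["timestamp"], "|", entry["tag"], "|", entry["message"])
```

The tag stops at the first colon, so later colons in the message (as in the
kernel lines) stay part of the message.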
>>
>> This happens frequently, where a command or log outputs text in columns
>> that are not quite usefully arranged. How does one deal with columnar data
>> like this? I can't use 'cut'. What would I cut on that would capture the
>> first columns *and* keep the last one intact? I'm not sure how one would
>> easily use awk for this. Is there something like '{ print $5- }'? Meaning,
>> from column 5 onwards? I can't use "column -t" because that screws
>> everything up royally.
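
As far as I know awk has no '$5-' shorthand, but the "first N columns, then
everything else" idea maps directly onto the maxsplit argument of Python's
str.split. A minimal sketch, with split_columns as a made-up helper name:

```python
def split_columns(line, leading):
    """Return `leading` whitespace-separated columns plus the untouched rest.

    The result always has leading + 1 items; short lines are padded with
    empty strings so callers can unpack without checking the length.
    """
    # split(None, n) splits on runs of whitespace at most n times, so the
    # last item keeps its internal spacing exactly as it was.
    parts = line.rstrip("\n").split(None, leading)
    parts += [""] * (leading + 1 - len(parts))
    return parts


if __name__ == "__main__":
    line = (
        "59s      Normal    SuccessfulUpdate   clusterautoscaler/default"
        "   Updated ClusterAutoscaler deployment"
    )
    age, kind, reason, obj, message = split_columns(line, 4)
    print(obj, "->", message)
```

This only works when the leading columns never contain whitespace, which
holds for the first four columns in the example above.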
>>
>> Another thing that trips me up. Sometimes I'll have a nice set of
>> comma-separated values but there'll be a comma in one of the fields. The
>> typical way of dealing with this in CSV files is to quote the entire
>> field. But that doesn't help me, the bash scripter.
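
Quoted fields are exactly what Python's standard csv module is for. A short
sketch; the sample rows and the name read_csv_rows are invented for
illustration:

```python
import csv
import io

# Hypothetical data: two of the fields contain commas and are quoted.
SAMPLE_CSV = 'host,owner,note\nweb01,"Smith, Pat","reboot, then patch"\n'


def read_csv_rows(text):
    """Parse CSV text into a list of rows, honoring quoted fields."""
    # newline="" lets the csv module handle line endings itself, which
    # matters when a quoted field contains a line break.
    return list(csv.reader(io.StringIO(text, newline="")))


if __name__ == "__main__":
    for row in read_csv_rows(SAMPLE_CSV):
        print(row)
```

The second row comes back as three fields, with the commas inside the
quotes left in place.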
>>
>> Any suggestions for how to deal with stuff like this?
>>
>> --
>> Todd
>>
>
>
> --
> Jeff Ollie
> The majestik møøse is one of the mäni interesting furry animals in Sweden.
>


-- 
Jeff Ollie
The majestik møøse is one of the mäni interesting furry animals in Sweden.

