[Cialug] Regular Expression for Pathnames

Daniel A. Ramaley daniel.ramaley at drake.edu
Mon Nov 24 13:43:58 CST 2014


I realized right after sending it that the way your problem is stated, 
you might want to match different numbers of slashes. To match strings 
that have between 1 and 3 (inclusive) slashes, change the part between 
curly braces to be "1,3" instead of just "3":

	egrep '^(/[^/]+){1,3}$'

Explaining what this bit of line noise means:

^	Match beginning of line.
()	Groups stuff and treats it as one object.
/	Match a literal "/".
[]	Matches a single character from the set inside the brackets.
	For example, "[abcde]" matches any one of those 5 lowercase
	letters.
^	If used as the first character in square brackets, this
	inverts the characters matched.
/	A "/" inside the brackets. Because of the preceding "^" the
	set is inverted, so "[^/]" matches any character *except* "/".
+	Match the [] expression 1 or more times.
{}	Counter. How many times the preceding () group should match.
1,3	Match between 1 and 3 times, inclusive.
$	Match end of line.
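
To see the pieces above working together, here is a quick sketch that runs the pattern over the five sample paths from the original question. It uses "grep -E", which takes the same extended syntax as egrep; the variable names "paths" and "matches" are just made up for the example.

```shell
# Sample paths from the original question (held in a variable for illustration).
paths='/var/log/folder1
/var/log/folder2
/home/todd/mydir
/var/log/folder1/fileh
/var/log/folder1/foldersub/fileh'

# Keep only lines made of 1 to 3 "/component" groups.
matches=$(printf '%s\n' "$paths" | grep -E '^(/[^/]+){1,3}$')
printf '%s\n' "$matches"
```

Only the first three paths should come out, since the other two have 4 and 5 slashes.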

You can find a more precise explanation in the regex documentation; this 
is just off the top of my head. The most important thing with reading 
regexes is figuring out how they are grouped. The basic pattern here is 
(foo){bar}, which just means "match foo, bar number of times".
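
If you end up changing the numbers a lot, the (foo){bar} idea can be wrapped in a small shell function. This is only a sketch; the name paths_with_depth and its two arguments are made up for illustration, and it assumes you pass sensible integers.

```shell
# Hypothetical helper: print the lines from stdin that consist of
# between $1 and $2 (inclusive) "/component" groups.
paths_with_depth() {
    # Double quotes let $1 and $2 expand; \$ keeps the end-of-line anchor.
    grep -E "^(/[^/]+){$1,$2}\$"
}

# Example: keep only the paths that are 4 or 5 levels deep.
printf '%s\n' /var/log/folder1 \
              /var/log/folder1/fileh \
              /var/log/folder1/foldersub/fileh |
    paths_with_depth 4 5
```

Passing the same number twice, such as "paths_with_depth 3 3", behaves like the original {3} version.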

On 2014-11-24 at 10:58:24 Ron Houk wrote:
> Wow. Your solution is a lot more elegant. I'm still trying to learn
> this stuff. :)
> 
> On Nov 24, 2014 10:35 AM, "Daniel A. Ramaley"
> <daniel.ramaley at drake.edu>
> wrote:
> > This should work for your purposes if all the data looks like the
> > samples. Set the number between curly braces to whatever you need.
> > (Note that your sample data didn't have any matches for just 2
> > slashes, but does for 3 slashes.)
> > 
> >         egrep '^(/[^/]+){3}$'
> > 
> > On 2014-11-24 at 10:23:07 Todd Walton wrote:
> > > If I have a text file full of pathnames, like:
> > > 
> > > /var/log/folder1
> > > /var/log/folder2
> > > /home/todd/mydir
> > > /var/log/folder1/fileh
> > > /var/log/folder1/foldersub/fileh
> > > 
> > > ...etc, what's the regular expression to find where a string has
> > > exactly two (or however many) forward slashes to the left of it?
> > > I have a 360,000 line list of path names, and I'd like to find
> > > where a certain string falls early in the path.  I'm really only
> > > interested in paths where it's in the top three or four
> > > directories.
> > > 
> > > --
> > > Todd
> > > _______________________________________________
> > > Cialug mailing list
> > > Cialug at cialug.org
> > > http://cialug.org/mailman/listinfo/cialug
> > 
> > __
> > Daniel A. Ramaley  |  Network Engineer 2
> > Drake Technology Services (DTS) | Drake University
> > 
> > T: +1 515 271-4540
> > F: +1 515 271-1938
> > E: daniel.ramaley at drake.edu
> > 


