Thanks, Trevor, for your useful comments.  As a result, I've spent some time in the PCRE regex documentation, and have discovered just how feeble the regex implementation is in my Vedit (no, not vi!) text editor.  Even tonight, I've run into more problems.
 
Other than the lousy regex implementation, though, Vedit has served me well continuously since 1982 (with a large number of upgrades of course).
 
Hartmut W Sager - Tel +1-204-339-8331


On Sun, 5 Jan 2020 at 04:10, Trevor Cordes <trevor@tecnopolis.ca> wrote:
On 2020-01-04 Hartmut W Sager wrote:
>
> It turns out, at least in this regex implementation, that a pair of
> enclosing parentheses can serve only one of two purposes at a time,
> not both.  Those two purposes are:
>
> 1.  Mark a group that can then be referred to by a variable like "\3"
> in the replacement string.
> 2.  Enclose a group with alternation (regex terminology) containing
> several alternatives separated by the "or" operator "|".

That's just plain evil.  Nasty!

The de facto standard is (obviously) PCRE and your program (you said
vi?) is obviously not PCRE.  I'd be shocked if vi doesn't offer you
some way to replace the regex engine?  Or at least out-source the regex
work to a filter?  Not sure, I don't use vi.

In PCRE each () serves both purposes, unless you use (?:) in which case
you only get purpose #2 (and save CPU cycles).
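
A minimal sketch of that difference, using Python's re module as a
stand-in (it behaves like PCRE for these constructs, but it is not the
PCRE library itself).  The month names, the sample date, and the helper
name day_first are made up for illustration:

```python
import re

# Both groups capture AND alternate: group 1 is the month, group 2 the day.
capturing = re.compile(r"(Jan|Feb) ([0-9][0-9]|\s[0-9]),")

# (?:...) alternates without capturing, so the day becomes group 1.
non_capturing = re.compile(r"(?:Jan|Feb) ([0-9][0-9]|\s[0-9]),")

sample = "Jan  5, 2020"  # hypothetical input with a space-padded day


def day_first(text):
    """Swap month and day using backreferences to the alternation groups."""
    return capturing.sub(r"\2 \1,", text)
```

Here day_first(sample) puts the day in front of the month, which only
works because each pair of parentheses does both jobs at once.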

The others are correct: using \s on the right-hand side is not PCRE.
In PCRE, \s means "(most) any whitespace" in the regex, but it is just
a literal "s" in the substitution.
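
As a rough illustration in Python (again not PCRE, and assuming Python
3.7 or later): \s is a whitespace class only in the pattern.  In the
replacement, Python does not even turn it into "s"; it rejects the
escape outright.  The function names below are hypothetical:

```python
import re


def squeeze_whitespace(text):
    # \s in the PATTERN matches whitespace; the replacement is a literal space.
    return re.sub(r"\s+", " ", text)


def replacement_accepts_backslash_s():
    # In the REPLACEMENT, \s is not a whitespace token.
    # Python 3.7+ raises re.error ("bad escape") instead of substituting.
    try:
        re.sub(r"\s+", r"\s", "a \t b")
    except re.error:
        return False
    return True
```

Either way, the portable fix is to put a literal space (or whatever
character is wanted) in the replacement string.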

PCRE = One Ring^H^H^H^HRegex to rule them all.  Most programs with
regex use the PCRE library now, or give the option, and if you always
use -P with grep you'll basically never have to touch another
substandard regex engine again! :-)  All the perl-haters might find it
amusing that they use "perl" on a daily basis because of PCRE :-)
(Well, sort of.)

> I am a bit suspicious of the ([0-9][0-9]|\s[0-9]) group re operator
> precedence of the "or"

In most (all?) regex engines (especially PCRE; but pretty sure all!)
the rule is "first, most": alternatives are tried left to right, and
the first one that matches is taken.  So the order of your alternatives
may matter.  In the above case, order probably doesn't matter because
things surrounding that bit must be space/comma.  Order matters in
things where surrounding bits can match the same bits, and things like
eating escaped chars, like escaped double-quotes in CSVs:
/"(\\"|[^"])+"/ works, but
/"([^"]|\\")+"/ doesn't.
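
A small sketch of why the order matters, written with Python's re
module (a backtracking engine that tries alternatives left to right,
like PCRE).  The sample text is invented for illustration:

```python
import re

# Escaped quote is tried first, so \" is consumed as a unit.
escaped_first = re.compile(r'"(\\"|[^"])+"')

# [^"] is tried first, so it eats the backslash alone and the
# following quote is then taken as the closing quote.
escaped_last = re.compile(r'"([^"]|\\")+"')

text = r'say "a\"b" now'  # a quoted field containing an escaped quote

whole_field = escaped_first.search(text).group(0)
cut_short = escaped_last.search(text).group(0)
```

With this input, whole_field covers the entire quoted field, while
cut_short stops at the escaped quote in the middle.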

As always, the O'Reilly regex book is an amazing way to fully
understand exactly what is going on and will really open a lot of eyes!!