On 2010-11-10 Sean Walberg wrote:
Adam and I were having an offline discussion, and some testing shows that AWK outperforms SED by a slight margin:
I know it's an old thread... but I had to have a go at you awk/sed weenies. ;-)
My solution is a Perl regex:
perl -e '$/=undef;open I,$ARGV[0];$_=<I>;/(?:^|\n)(output start\n.*\noutput end\n)/s and print $1' infile
It's not a filter (requires a filename) but could probably easily be made into one.
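For instance, a filter version might look like this (a sketch using perl's -0777 slurp switch so it reads stdin instead of opening a named file):

perl -0777 -ne 'print $1 if /(?:^|\n)(output start\n.*\noutput end\n)/s' < infile

-0777 sets $/ to undef, so the single -n pass gets the whole input in $_.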
I recall reading in Perl books that Perl's regex engine is faster than sed/awk, and the one-liner above takes advantage of the whole-file slurp that undefining $/ allows.
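For reference, the same slurp idiom spelled out in script form (a minimal sketch; 'infile' is just a placeholder filename):

{
    local $/;                        # undef $/ so the next read slurps the whole file
    open my $fh, '<', 'infile' or die "open infile: $!";
    my $text = <$fh>;                # entire file in one string
    print $1 if $text =~ /(?:^|\n)(output start\n.*\noutput end\n)/s;
}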
On my computer the awk/sed/perl times compare like so:
time sed -n '/output start/,/output end/p' < infile > /dev/null
0.264+0.002c 0:00.26s 100.0% 0+0<774k | 1+39cs 0+259pg 0sw 0sg

time awk '/output start/,/output end/' < infile > /dev/null
0.183+0.003c 0:00.18s 100.0% 0+0<774k | 1+28cs 0+298pg 0sw 0sg

time perl -e '$/=undef;open I,$ARGV[0];$_=<I>;/(?:^|\n)(output start\n.*\noutput end\n)/s and print $1' infile > /dev/null
0.032+0.017c 0:00.05s 80.0% 0+0<8168k | 1+19cs 0+4196pg 0sw 0sg
Wow! But yikes, look at the mem usage. Good thing RAM is plentiful these days. In 1980 sed would be the better bet for sure.
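(If memory were the constraint, a line-by-line Perl version keeps it flat at the cost of the slurp trick; a sketch using the flip-flop range operator, assuming the markers each appear once:

perl -ne 'print if /output start/ .. /output end/' infile

That is essentially the sed/awk range rewritten in Perl, reading one line at a time.)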
[sean@bob tmp]$ W=/usr/share/dict/words
[sean@bob tmp]$ (tail -1000 $W; echo output start; cat $W; echo output end; head -1000 $W) > infile
[sean@bob tmp]$ wc -l infile
481831 infile