[mdlug] UNIX tips: Learn 10 good UNIX usage habits
Michael Corral
micorral at comcast.net
Tue Mar 11 14:44:59 EDT 2008
On 2008-03-11, Robert Citek wrote:
> (
> mkdir -p tmp/a/
> tmpfile=tmp/a/longfile.txt
> for COUNT in 2 20000000; do
> yes and | head -$COUNT > $tmpfile
> time -p grep -c and $tmpfile
> time -p < $tmpfile grep -c and
> time -p cat $tmpfile | grep -c and
> done \
> >& output.txt
> )
Here are my results (this is on an AMD X2 3700+ 2.2GHz w/1GB RAM, in
Fedora 7 with a 2.6.24.2 kernel):
$ yes and | head -20000000 > tmp/a/longfile.txt
$ time grep -c and tmp/a/longfile.txt
20000000
real 0m1.848s
user 0m1.736s
sys 0m0.053s
$ time cat tmp/a/longfile.txt | grep -c and
20000000
real 0m1.977s
user 0m1.802s
sys 0m0.104s
So the single grep beats cat piped into grep, though only by about
0.13s of real time on 20 million lines.
Thank you for helping me to make my point, I appreciate it. :)
> 1) For the 95% of the time where cat+pipe is inefficient, it doesn't
> matter. So, don't worry about it.
I recall saying that it was *unnecessary* 95% of the time, but who
said that it was *inefficient* 95% of the time? There's a difference,
you know. ;)
> 2) Do the experiment yourself to verify the data. Don't believe
> everything you read on the internet.
Absolutely, couldn't agree more.
> 3) 87.4% of all statistics are made up on the spot.
I'd agree with that about 79.6% of the time.
Out of curiosity, I appended the string "lametest" twice to that
longfile.txt file, then grepped for "me" both ways:
$ time grep -c me tmp/a/longfile.txt
2
real 0m0.243s
user 0m0.180s
sys 0m0.045s
$ time cat tmp/a/longfile.txt | grep -c me
2
real 0m0.316s
user 0m0.196s
sys 0m0.102s
Single grep wins again.
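For anyone who wants to reproduce that second test, here is a minimal sketch. It assumes each "lametest" was appended as its own line (which is what a count of 2 from grep -c implies), and it uses a much smaller file and a scratch file from mktemp; both are illustrative choices, not what was run above.

```shell
# Sketch: rebuild a small version of the test file, then append two
# lines that contain "me" (inside "lametest") but not "and".
tmpfile=$(mktemp)                      # hypothetical scratch file
yes and | head -n 1000 > "$tmpfile"    # 1000 lines of "and"
echo lametest >> "$tmpfile"
echo lametest >> "$tmpfile"

# Count matching lines both ways; the two counts should agree.
me_direct=$(grep -c me "$tmpfile")
me_piped=$(cat "$tmpfile" | grep -c me)
echo "direct=$me_direct piped=$me_piped"

rm -f "$tmpfile"
```

Prefix either grep form with time to compare them, as in the transcript above.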
Of course, real-world tests would be preferable.
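Short of a real-world test, one cheap improvement is to repeat each form several times so that a single noisy run does not decide the result. The following is only a sketch, assuming bash (for the time keyword applied to a shell function); RUNS, LINES_WANTED and the helper names grep_alone, cat_pipe and repeat are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: time each form over several runs instead of just one.
# RUNS, LINES_WANTED and the scratch file are illustrative choices.
RUNS=5
LINES_WANTED=200000
tmpfile=$(mktemp)

# Same kind of test data as in the thread, only smaller.
yes and | head -n "$LINES_WANTED" > "$tmpfile"

# The two forms being compared.
grep_alone() { grep -c and "$tmpfile"; }
cat_pipe()   { cat "$tmpfile" | grep -c and; }

# Sanity check: both forms must report the same count.
direct_count=$(grep_alone)
piped_count=$(cat_pipe)
echo "direct=$direct_count piped=$piped_count"

# Run the named function RUNS times, discarding its output.
repeat() {
    n=0
    while [ "$n" -lt "$RUNS" ]; do
        "$1" > /dev/null
        n=$((n + 1))
    done
}

echo "grep alone, $RUNS runs:"
time repeat grep_alone
echo "cat | grep, $RUNS runs:"
time repeat cat_pipe

rm -f "$tmpfile"
```

The file will normally be in the page cache after it is written, so this compares process and pipe overhead rather than disk reads.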
Michael