On 11/05/2011 05:11 PM, Adam Thompson wrote:
-----Original Message-----
From: roundtable-bounces@muug.mb.ca [mailto:roundtable-bounces@muug.mb.ca] On Behalf Of Trevor Cordes
Sent: Saturday, November 05, 2011 10:00
To: MUUG RndTbl
Subject: [RndTbl] fast counting with find
I found myself needing a kind of -limit or -quit option in find, but couldn't find one. Why do I want to quit at all? Why not just do find|wc -l? The dirs I'm scanning hold about 200k files and are sometimes mounted over NFS. Either way, a full find|wc takes a long time and a lot of resources, especially when find has to stat() each file (for mtime tests, etc.). With find|wc, one of my find commands took 10+ minutes; with my new method it takes a few seconds.
Doesn't "find /path -args | head -1000 | wc -l" give you nearly the same result? It may generate more disk I/O in the background (depending on pipe buffering and signalling semantics), but it should be just as fast when used interactively.
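As a sketch of why the pipe stops early (the temporary directory, file count, and limit of 20 here are made up for the demo):

```shell
# Demo setup: a throwaway tree of 50 files.
dir=$(mktemp -d)
for i in $(seq 1 50); do : > "$dir/file$i"; done

# head exits after emitting 20 lines; on its next write, find gets
# SIGPIPE and dies, so the walk stops early instead of visiting
# every entry in the tree.
count=$(find "$dir" -type f | head -n 20 | wc -l)
echo "$count"    # 20

rm -rf "$dir"
```

The early exit is what makes this cheap on a 200k-file tree: find never finishes the traversal, it is killed as soon as the reader has enough lines.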
(For the pedantic among us, that should read "find /path -args -print | head -n 1000 | wc -l" since direct specification of the line count to head(1) in option-style syntax is deprecated in POSIX.)
I regularly use sed Nq (where N is a number) instead of head, because "sed 100q" works everywhere, whereas head sometimes requires -n and sometimes doesn't, which is annoying.
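The sed idiom drops into the same pipeline unchanged (again, the demo directory and the limit of 5 are invented for illustration):

```shell
# Demo setup: 10 files in a throwaway directory.
dir=$(mktemp -d)
for i in $(seq 1 10); do : > "$dir/f$i"; done

# "sed 5q" prints the first 5 lines and then quits -- same early-exit
# effect as head -n 5, but with one spelling that is portable across
# old and new heads.
count=$(find "$dir" -type f | sed 5q | wc -l)
echo "$count"    # 5

rm -rf "$dir"
```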
It seems limiting the number of matches may not be the real goal after all; perhaps it would be better to limit the resources that find uses? E.g., with recent coreutils you can cap how long it runs with the timeout(1) command.
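A sketch of the time-budget approach (the 5-second budget and the demo directory are arbitrary; this assumes GNU coreutils' timeout is available):

```shell
# Demo setup: a tiny tree that find will finish well inside the budget.
dir=$(mktemp -d)
: > "$dir/a"
: > "$dir/b"

# timeout kills find after 5 seconds of wall time; whatever paths were
# printed before the deadline still flow through to wc -l, so you get
# a partial count rather than waiting 10+ minutes.
count=$(timeout 5 find "$dir" -type f | wc -l)
echo "$count"    # 2

rm -rf "$dir"
```

On a slow NFS tree the count would be whatever find managed to emit before the deadline, so this trades an exact limit on matches for a hard limit on elapsed time.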
The difference becomes obvious if you think about which option you'd want added to findutils: one that stops find once a given number of matches has been produced, or one that stops find after a given number of paths has been examined. Depending on the other options supplied, those two can behave very differently.
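For the "stop on match" side of that distinction, GNU find already has a -quit action (the directory and file names below are invented for the demo):

```shell
# Demo setup: mostly hay, one needle.
dir=$(mktemp -d)
: > "$dir/hay1"
: > "$dir/needle"
: > "$dir/hay2"

# GNU find's -quit stops the walk as soon as the preceding actions run,
# i.e. at the first match -- a "limit matches" behaviour, not a
# "limit paths examined" one. (-quit is a GNU extension.)
match=$(find "$dir" -name 'needle*' -print -quit)
echo "$match"

rm -rf "$dir"
```

A "stop after N paths seen" limit would fire even when nothing matches, which is exactly why the two hypothetical options could give very different results on a selective test like -mtime.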
Peter