Hey, Kevin. I guess I misled you last week when I suggested two
processes reading from the same pipe might work to implement a
queue. I thought the "read" calls in the script would do the right
thing, but that appears not to be the case.
The key to avoiding race conditions between co-operating shell
processes is to find some sort of operation that's "atomic" - that
does its work all in one shot so it won't be interrupted
mid-stream. It appears that "read" is atomic at the single
character level, not at the single line level as would be needed for
this queue to work properly. I recall now that on Unix/Linux
systems, the link system call (or the ln command) is good for this:
if two processes try to make a hard link of the same name to the
same file without forcing it (-f), only one of them will succeed.
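To illustrate, here's roughly what you'd see at a shell prompt (the
file names are just made up for the example, and the exact error
wording depends on your ln, but the behaviour should be the same):

    $ touch somefile
    $ ln somefile somefile.lock && echo "got the lock"
    got the lock
    $ ln somefile somefile.lock && echo "got the lock"
    ln: failed to create hard link 'somefile.lock': File exists

The second ln fails because the link name already exists; the
underlying link(2) call either creates the name or fails, with no
window in between where two callers could both think they succeeded.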
So, instead of implementing a queue of file names to process, you
could make both processes use the same file list and, for each file,
check that a) the file has not already been processed, and b) you can
make a link to the file. If both conditions are met, the loop can go
ahead and process that file. If I recall correctly, the processing
was to compress each tar file, and you wanted two simultaneous
processes to speed up the task on a multi-core system. Each process
could be implemented with something like the following:
ls /zonebackup/*tar | while read -r F; do
    # Skip files already compressed or already claimed, then try to
    # claim this one atomically by making a hard link.
    if [ ! -e "$F.gz" ] && [ ! -e "$F.lock" ] && ln "$F" "$F.lock" 2>/dev/null; then
        gzip -f "$F"    # -f needed: the lock is a second hard link to the
                        # file, and gzip skips multiply-linked files without it
        rm -f "$F.lock"
    fi
done
Theoretically, you should be able to run two (or more) of these and
they should compress all the files without getting in each other's
way. The loop won't even attempt the ln command if it sees the .gz
file or the .lock link already there. In the event of a race where
both processes decide almost simultaneously that the lock isn't there
and both run the ln command, only one of the two ln's will succeed,
and only the successful one will go on to compress the file and
remove the lock link. The redirect to /dev/null just keeps the losing
ln command quiet. I haven't actually tested this whole script, but I
have successfully used ln in the past for locking files in this
manner.
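To run it, you could put that loop in a small script (call it
compress-backups.sh, say - the name is just an example) and start two
copies in the background from one shell:

    ./compress-backups.sh &
    ./compress-backups.sh &
    wait    # returns once both background workers have finished

Since each worker walks the full file list and skips anything already
compressed or locked, it's also safe to start another copy partway
through if you decide you want more.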
The next step would be to find the optimum number of processes to
run, beyond which adding more gains no further speedup of the
overall job. This would
likely depend on the number of files, their sizes, and the number
and speed of your CPU cores. The faster a CPU core is, the more the
compression task goes from being CPU-bound to being I/O-bound, and
the less gain you get from more parallel processes. Compressing
lots of small files, rather than a few larger ones, would likely
skew things towards being more I/O-bound too, because of the extra
overhead of each gzip startup.
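One crude way to find that sweet spot is to time the whole batch with
different worker counts, starting each trial from a fresh set of
uncompressed files (otherwise the later runs have nothing left to do).
Something along these lines, using bash's time keyword:

    N=2                 # repeat with 1, 2, 4, ... and compare elapsed times
    time (
        for i in $(seq "$N"); do
            ./compress-backups.sh &   # same worker script as above
        done
        wait
    )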
On 01/03/2014 4:48 PM, Kevin McGregor wrote:
Aw. I was afraid of that. In my 'production' script
there would be a long time between most reads, so it's unlikely
there would be a problem, but I still don't want rare random
failures. I'll find a work-around.
--
Gilles R. Detillieux E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre WWW: http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba Winnipeg, MB R3E 0J9 (Canada)