I'm running into a strange race condition that appears to me to be a Linux bug of some sort.
Here's what I'm doing, in perl pseudocode:
system "process-file outputfileprefix";  # does the work; writes outputfileprefix-0, outputfileprefix-1, etc.
sleep 1;
if (!<outputfileprefix-[0-9]*>) {
    warn "try #2\n";
    sleep 2;
    if (!<outputfileprefix-[0-9]*>) {
        warn "try #3\n";
        die;
    }
}
# do something on outputfile*
For those who don't know, <> is perl's glob op, which simply returns a list of all the files matching the glob. !<> is thus true if no files match the glob.
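For illustration, here's a minimal, self-contained sketch of that check; the scratch directory and filenames are stand-ins for the real ones:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch dir with one matching file, standing in for process-file's output.
my $dir = tempdir(CLEANUP => 1);
open my $fh, '>', "$dir/outputfileprefix-0" or die $!;
close $fh;

# In list context, glob returns every matching filename at once.
my @files = glob("$dir/outputfileprefix-[0-9]*");
warn "try #2\n" unless @files;
print scalar(@files), " file(s) found\n";
```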
What's happening is that 10-30% of the time, I get a "try #2" output. I haven't yet seen a try #3. Of course, those retries (and sleeps) shouldn't be required at all: I had to add them because I was seeing this program blow up in unexpected ways.
process-file does not do anything asynchronously, AFAIK. The last thing it does, and the output I need, is write a png file using the GD library. Only after that does process-file exit. There are no threads or forks, unless GD is creating one, but even then the mini-program above should not return from system until all threads and (non-daemonized) forks are done.
It appears that process-file writes and closes a file and returns, yet the directory entry doesn't become visible to the calling script for 0 to 3 seconds! I was under the impression that such UNIX filesystem operations were guaranteed to occur reliably in sequence!
Note: the fs I'm using is local and ordinary harddisk-based. It is not NFS or SMB, which of course could show such results.
Comments? Ideas?
I would write a small test case in bash. Not hating on perl, but if bash also exhibits those symptoms, that rules perl out completely.
Also, post the bash script if you do end up writing one; I wouldn't mind taking a stab at it.
What file system are you using?
You could run strace on process-file, which should tell you if it is indeed doing something funny.
My guess is it's perl-specific; I'd test without perl and try to replicate.
Rob
_______________________________________________
Roundtable mailing list
Roundtable@muug.mb.ca
http://www.muug.mb.ca/mailman/listinfo/roundtable
Also, http://perldoc.perl.org/functions/system.html suggests using backticks rather than system if you want to capture the output.
Very strange. I try to stay away from perl, but it looks like you should never get “try #2”.
Both system and backticks should block until the child process is finished, then return to the script.
Is it possible that the main perl process has a copy of the directory tree that is stale? What would happen if you did system “ls” or something similar?
-Dan
Dan Martin
GP Hospital Practitioner
Computer Scientist
ummar143@shaw.ca
(204) 831-1746 (answering machine always on)
Doh! Hitting the docs again (RTFM!), I discovered perl's <> globbing is now (as of 5.something) handled internally by File::Glob. Poking around, I started to see the light. Globbing can be used in list *and* scalar context (as with most perl functions). Perl was "guessing" that I was asking for scalar when I really wanted a list, probably because I was immediately negating the result. Doh.
In scalar context, globbing magically acts as an iterator. If the glob string doesn't change, it maintains an internal pointer, iterates over each file found by the glob, and finishes by returning undef once it's out of files.
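To make that concrete, here's a small self-contained demo of the scalar-context iterator behavior (scratch files stand in for the real output):

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
for my $i (0..2) {
    open my $fh, '>', "$dir/outputfileprefix-$i" or die $!;
    close $fh;
}

# Same glob, evaluated repeatedly in scalar context: it hands back one
# filename per call, then undef once, even though all the files still exist.
my $count = 0;
$count++ while defined glob("$dir/outputfileprefix-[0-9]*");
print "iterator returned $count filenames, then undef\n";
```

That final undef is exactly what made the !<> test fire even though the files were there.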
So, my program was really foobar in terms of how I was using the glob to check for the error case of no files being output by my other script.
Since my other script creates between 1 and 8 output files on each run, that explains the "random" behavior I was seeing. It probably ran the glob the first time, found say 5 files, cached the glob results, and simply returned the next cached entry on each call. Once all 5 were exhausted, it returned undef and triggered the "try #2". At least I think that's what was happening. I did see a "try #3" just yesterday, and I can't really think how that could ever occur, since an undef from the first glob should "reload" on the second.
The bigger problem was that this meant my sanity check that >0 files were output would not work a certain percentage of the time.
Thanks all. As usual, talking through the problem leads one to the solution.
Aside: as for backticks vs system, I didn't need the output, so I used system. As I get older, I hate backticks more and more, as they can be susceptible to injection attacks unless you're very, very careful about the variable interpolation going on. I really love the list form of system, which gets no shell interpolation and is thus guaranteed shell-safe: system 'ps','-f',1;
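A quick self-contained illustration of why the list form is shell-safe; the filename here is a deliberately hostile stand-in:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);

# A filename full of shell metacharacters; interpolated into the string
# form of system, a shell would mangle (or even execute!) parts of it.
my $file = "$dir/evil; \$(whoami) name.txt";
open my $fh, '>', $file or die $!;
close $fh;

# List form: no shell is involved; each element becomes one argv entry,
# so the whole filename arrives at ls as a single argument.
my $rc = system('ls', '-l', $file);
die "ls failed\n" unless $rc == 0;
```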
I forgot to post what *does* work. There are a few different ways.
You can force the glob to return a list (hence no caching/iterating):
if (!@{[<$prefix-*>]}) {
That's sure ugly, but it makes sense. The [] forces the <> to return a list and turns it into an anonymous array ref. The @ then derefs it. The {} is required by the parser to make sense of it all. !@ then does its normal thing: true if there are zero elements in the array.
You could also force it to list by assigning to an array:
if (!(@a=<$prefix-*>)) {
But I hate using temp vars when we don't need them, when the results are throwaway.
Strangely, using the normal "we want a list" syntax of just putting () around the expression doesn't work in this case. I'm not sure why:
if (!(<$prefix-*>)) { # doesn't work
Also strangely, while perl has a "scalar" function to force scalar (non-list) context, it doesn't have a "list" or "array" function to force a list. The scalar docs say a list function isn't necessary, as list context is usually the default/implied or can be forced with (). Whoops, looks like I found a corner case.
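For what it's worth, parentheses alone don't supply context in perl; the surrounding operator does, and ! imposes boolean (scalar) context, which is presumably why the ()-wrapped version fails. One idiom that genuinely forces list context without naming a temp var is assigning to an empty list, then taking that assignment in scalar context, which yields the number of elements. A minimal sketch:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
for my $i (0..1) {
    open my $fh, '>', "$dir/outputfileprefix-$i" or die $!;
    close $fh;
}

# "() =" forces the glob into list context; the outer scalar assignment
# then sees how many elements were assigned. No iterator, no temp array.
my $n = () = glob("$dir/outputfileprefix-[0-9]*");
die "no output files\n" unless $n;
print "$n file(s)\n";
```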
Ya, I could have made certain everything worked in the first place by doing opendir and iterating through the results, like one would in C, but the whole point of using perl is to do as much as possible in as few characters and lines. Laziness is the #1 programmer virtue, according to L. Wall and R. Schwartz.
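For completeness, the opendir route mentioned above, as a minimal sketch; it's wordier, but there's no context for perl to guess wrong:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
for my $i (0..1) {
    open my $fh, '>', "$dir/outputfileprefix-$i" or die $!;
    close $fh;
}

# C-style directory scan: readdir taken in list context via grep, so
# there are no scalar-context iterator surprises to trip over.
opendir my $dh, $dir or die "opendir $dir: $!";
my @out = grep { /^outputfileprefix-[0-9]+$/ } readdir $dh;
closedir $dh;

die "no output files produced\n" unless @out;
print scalar(@out), " file(s)\n";
```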
While I will always manage my life by the three great virtues, I must thank you for reminding me why I gave up slinging Perl. DWIM and all the sigils drive a man to an early grave, or at least a lot of nights debugging.
Sean