On 2017-07-11 Kevin McGregor wrote:
Here's another attempt to send this out:
I copied a bunch of stuff with rsync from a Linux system to a BSD system. I'm fairly sure it worked fine, but as a check, I'd like to compare the total number of bytes in files on both systems. du -sk <dir> produces slightly different results on the two systems (smaller on dest).
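Ya, du counts allocated disk blocks, not bytes of file content, so two different filesystems will rarely agree exactly. GNU du (what you have on Linux) has a --bytes (-b) option, which should give you what you're looking for. On BSD, if you can get a GNU du, Bob's your uncle.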
I also tried out find <dir> -type f -print0 | xargs -0 stat --format=%s | awk '{s+=$1} END {print s}' on Linux and find rsync -type f -print0 | xargs -0 stat -f %Dz | awk '{s+=$1} END {print s}' on BSD.
GNU find can print the size itself (-printf '%s\n'), so you can skip the extra xargs/stat stage of the pipe, but only if the BSD box has GNU find <grin>
Otherwise your code above looks sane to me.
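
For what it's worth, the GNU find variant might look something like the sketch below. It assumes GNU find (for -printf) is installed; total_bytes is just a name I made up for illustration. The printf "%.0f" in awk is there because some awk implementations switch to exponent notation when printing large sums.

```shell
#!/bin/sh
# Sketch only: assumes GNU find (-printf is a GNU extension).
# total_bytes is a hypothetical helper name, not an existing tool.
# Prints the summed size in bytes of all regular files under directory $1.
total_bytes() {
    # %s is the file size in bytes; awk adds the sizes up.
    # "%.0f" avoids exponent notation for big totals in some awks.
    find "$1" -type f -printf '%s\n' | awk '{s += $1} END {printf "%.0f\n", s}'
}

# Example use (run on each host, then compare the two numbers by eye):
#   total_bytes /path/to/dir
```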
Instead of summing, I would dump the per-file sizes to a file on each host, sort each file, and run diff on the two. (You could alter the stat/find to output the (relative) filename too, to spot the offender.)
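
A rough sketch of that per-file listing, again assuming GNU find; size_listing and the output filenames are made up for the example. On the BSD side the find line would have to become something built on stat -f instead, which I have not tested here. The cd is what makes the paths relative, so the two listings line up even if the trees live in different places.

```shell
#!/bin/sh
# Sketch only: assumes GNU find (-printf). size_listing is a hypothetical name.
# Prints one "relative/path<TAB>size" line per regular file under $1,
# sorted by path so listings from two hosts can be compared with diff.
size_listing() {
    (
        cd "$1" || exit 1
        # LC_ALL=C gives the same byte-wise sort order on both hosts.
        find . -type f -printf '%p\t%s\n' | LC_ALL=C sort
    )
}

# Example use:
#   size_listing /path/to/dir > linux.txt     (on the Linux host)
#   ...produce bsd.txt the same way on the other host, copy it over...
#   diff linux.txt bsd.txt
```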
Even better, have find exec md5sum (or another hash) on every file, and output it all to a file with the filename, sort, and diff. That will catch bit-flip errors, which a size comparison can't. However, on a big set of files you'll need to go grab a coffee.
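
The hash version could be sketched like this. It assumes an md5sum command exists on both hosts, which may not be true on the BSD side (you might need to substitute whatever hash tool is there and massage its output into the same shape); hash_listing is a made-up name.

```shell
#!/bin/sh
# Sketch only: assumes md5sum is available. hash_listing is a hypothetical name.
# Prints one "md5  ./relative/path" line per regular file under $1,
# sorted by path so the output from two hosts can be diffed.
hash_listing() {
    (
        cd "$1" || exit 1
        # "{} +" hands many files to each md5sum run instead of one at a time.
        # Sorting on field 2 orders by path rather than by hash.
        find . -type f -exec md5sum {} + | LC_ALL=C sort -k 2
    )
}

# Example use:
#   hash_listing /path/to/dir > linux.md5     (on the Linux host)
#   ...produce bsd.md5 the same way on the other host, copy it over...
#   diff linux.md5 bsd.md5
```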
I have a perl script I wrote ages ago to compare two (possibly remote) dir trees based on multiple criteria and bring up the results in meld (GUI diff). I can email it to you off-list if you like. However, I'm not sure it will be portable to BSD, as I just call find/etc under the hood. Never did add in hash support, but I think I might now!