On 2020-08-04 1:28 p.m., Gilbert E. Detillieux wrote:
> On 2020-08-04 12:55 p.m., Adam Thompson wrote:
> > I can't remember, did you try disabling HW offload on both sending and receiving ends already? (Either end could trigger the SSH abort.)
> I hadn't yet. (I was trying a few other things first, such as changing MAC algorithms, and rebooting with the older kernel, neither of which seemed to affect things.)
> I've now disabled both rx and tx checksum offloading. We'll see if that makes a difference.
So, after almost 6 days running with rx and tx checksum offloading disabled, not a single "Corrupted MAC on input" error! My overnight rsync now runs to completion.
I hope this isn't premature, but I think we found the problem! (Who would have thought it could make such a difference?!)
It also doesn't seem to have caused a noticeable performance hit. I'm thinking we were disk I/O bound on the remote (receiving) end, anyway, so if the network I/O is a bit slower, we wouldn't see it.
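
For anyone who wants to try the same change, here is a minimal Python sketch of how rx and tx checksum offloading could be turned off with ethtool. The interface name "eth0" is an assumption (the interface isn't named in this thread), and the helper names are hypothetical, made up for illustration. By default it only prints the command; actually running it needs root.

```python
import shutil
import subprocess


def checksum_offload_cmd(iface, enabled=False):
    """Build the ethtool command that switches rx/tx checksum offload on or off."""
    state = "on" if enabled else "off"
    # "ethtool -K <iface> rx off tx off" disables receive and transmit
    # checksum offloading on the given interface.
    return ["ethtool", "-K", iface, "rx", state, "tx", state]


def set_checksum_offload(iface, enabled=False, dry_run=True):
    """Print the command (dry run) or execute it; returns the command list."""
    cmd = checksum_offload_cmd(iface, enabled)
    if dry_run or shutil.which("ethtool") is None:
        # Dry run, or ethtool is not installed: just show what would be run.
        print(" ".join(cmd))
        return cmd
    subprocess.run(cmd, check=True)  # requires root privileges
    return cmd


if __name__ == "__main__":
    # "eth0" is an assumed interface name; substitute your own.
    set_checksum_offload("eth0")
```

The current offload settings can be checked with "ethtool -k eth0". Settings applied this way normally don't survive a reboot, so they would need to go into the distribution's network configuration to be made permanent.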
Thanks, everyone, for your suggestions!
Gilbert