OK, this is a new one...
Updated my kernel and rebooted tonight. Everything came up fine.
Except dovecot (IMAP server). It complained:
Error: service(imap-login): listen(*, 993) failed: Address already in use
(993=imaps)
OK. Except:
#netstat -tulpn | grep 993
<nothing>
#lsof -i:993
<nothing>
#ss -t -l 'sport = 993'
<nothing>
A wasted hour later I did:
netstat -n |grep 993
Sure enough, I had a TCP connection from local port 993 on my workstation
to port 2049 on my file server. grep 2049 /etc/services... eureka!
2049 is nfs.
From what I've figured out, nfs, which I have mounting something at boot,
randomly grabbed 993 as its source port before dovecot could bind it, and
held it for as long as the nfs fs was mounted locally.
And because the nfs client is in-kernel, no process owns the socket, so it
didn't show up in lsof et al.
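
For anyone who hits the same thing, here is a rough sketch of the check
that finally found it, done straight against the kernel's socket table
instead of through lsof. The function name local_port_users is made up
for illustration, and it assumes the usual /proc/net/tcp layout (local
address in column 2, remote in column 3, state in column 4, ports in
hex). It only covers IPv4; /proc/net/tcp6 would need the same treatment.

```shell
# Hypothetical helper: read a /proc/net/tcp-style table on stdin and print
# "local remote state" for every socket whose LOCAL port equals $1.
# Ports in that table are hex, so 993 is 03E1 and 2049 is 0801.
local_port_users() {
    hex=$(printf '%04X' "$1")          # decimal port -> 4-digit hex
    awk -v want="$hex" '
        $1 == "sl" { next }            # skip the header line
        {
            split($2, l, ":")          # l[1] = address, l[2] = port
            if (toupper(l[2]) == want) print $2, $3, $4
        }'
}
```

Usage would be something like: local_port_users 993 < /proc/net/tcp
A state of 01 in the last column means ESTABLISHED, and a remote port of
0801 means the other end is nfs.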
I umounted it, waited for TIME_WAIT to expire and then I could start
dovecot.
*** This has never happened before on dozens of reboots... I guess it's
pretty much luck / race condition what port nfs gets, especially with
systemd possibly trying to do the mount and/or dovecot in parallel. I
have no idea when it tries to do the mounts, but I'm guessing before
dovecot.
Note, I'm using NFSv4 which (unlike older NFS) is stateful and runs over a
persistent TCP connection. Pretty much everything you know about v2/v3 you
can throw away when it comes to v4.
I poked around nfs docs and google and found this option:
resvport / noresvport
Specifies whether the NFS client should use a privileged source port when
communicating with an NFS server for this mount point. If this option is
not specified, or the resvport option is specified, the NFS client uses a
privileged source port. If the noresvport option is specified, the NFS
client uses a non-privileged source port. This option is supported in
kernels 2.6.28 and later.
Refer to the SECURITY CONSIDERATIONS section for important details.
OK, down there is a section:
Using non-privileged source ports
Which basically says using privileged ports is for security so normal
users can't make/fake their own daemons and pretend to be any user they
want. OK, so I know that's a bit hokey from a sec standpoint, but it's
better than nothing, I guess. Regardless, having nfs client pick a random
high port might hurt me also because I use some of them for my own
purposes/daemons and I need them open also.
The docs say this:
"The exact range of privileged source ports that can be chosen is set by a
pair of sysctls to avoid choosing a well-known port, such as the port used
by ssh."
Which would be perfect, except I can't find any further reference to these
magic sysctls. I tried searching my box's /proc and /sys fs's for nfs &
ports, nfs & priv, nfs & mount, etc. Couldn't find anything relevant.
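
A broader search might be sketched like this: match on a keyword in the
file name anywhere under the sysctl tree, rather than only looking at
paths with nfs in them. NFS rides on the kernel's RPC layer, so the knobs
may not sit under an nfs-named directory at all. The function name
find_sysctls is made up, and the guess that the sysctl names share the
"resvport" keyword with the mount option is an assumption, not something
the docs quoted above confirm.

```shell
# Hypothetical helper: print path=value for every file under the tree $1
# whose file name contains the keyword $2. Unreadable files print an
# empty value instead of aborting the search.
find_sysctls() {
    find "$1" -type f -name "*$2*" 2>/dev/null | sort |
    while IFS= read -r f; do
        printf '%s=%s\n' "$f" "$(cat "$f" 2>/dev/null)"
    done
}
```

Usage would be something like: find_sysctls /proc/sys resvport
If nothing turns up, the relevant kernel module may simply not be loaded
yet, so it is worth running while the nfs fs is mounted.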
It would be really nice if I could specify the local source port, or at
least specify the list of no-no ports using this elusive, promised,
"sysctl". Anyone have any ideas?
What a strange bug to run into. Chance of 1 in 1024 and it had to hit me.