Passwordless SSH can lock you out
If you follow standard security practices, you would not allow root logins at all, let alone over SSH (as with a standard Debian install). But this would render your PVE unable to function properly, so the best you can do is set this option in /etc/ssh/sshd_config:
PermitRootLogin prohibit-password
That way, you only allow connections with valid keys (not passwords). Prior to this, you would have copied over your public keys with ssh-copy-id or otherwise added them to /root/.ssh/authorized_keys.
But this has a huge caveat on any standard PVE install. When you examine the file, it is actually a symbolic link:
/root/.ssh/authorized_keys -> /etc/pve/priv/authorized_keys
This is because the other nodes’ keys are already there to allow for cross-connecting, and the location is shared. This has several issues, the most important of which is that the actual file lies in /etc/pve, which is a virtual filesystem mounted only when all goes well during boot-up.
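To see this on your own node, a small check like the one below can help. It is only a sketch: the function name check_authorized_keys is made up for illustration, and the default path is the one discussed above. It reports whether the file is a symlink and whether the link target can currently be reached.

```shell
#!/bin/sh
# Hypothetical helper: report whether an authorized_keys path is a symlink
# and whether its target is currently reachable.
check_authorized_keys() {
    keyfile="${1:-/root/.ssh/authorized_keys}"
    if [ -L "$keyfile" ]; then
        target="$(readlink "$keyfile")"
        # -e follows the link, so it fails when the target is missing
        if [ -e "$keyfile" ]; then
            echo "symlink to $target (target reachable)"
        else
            echo "symlink to $target (target MISSING)"
        fi
    elif [ -e "$keyfile" ]; then
        echo "not a symlink"
    else
        echo "not found"
    fi
}
```

Calling check_authorized_keys without arguments on a PVE node inspects /root/.ssh/authorized_keys. A missing target is what you would expect to see whenever /etc/pve is not mounted.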
What could go wrong
If your /etc/pve does not get mounted during boot-up, your node will appear offline and will not be accessible over SSH, let alone the GUI.
Warning
If accessing via another node’s GUI, you will get a confusing Permission denied (publickey,password) in the “Shell”.
You are essentially locked out, even though the system has otherwise booted up, except for the PVE services. You cannot troubleshoot over SSH; you would need to resort to OOB management or physical access.
This is because during your SSH connection, there is no way to verify your key against /etc/pve/priv/authorized_keys.
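Once you do get in through a console (OOB or physical), a quick way to confirm this state is to check whether /etc/pve is a mount point at all. The snippet below is a minimal sketch; the function name pmxcfs_state is hypothetical, and it assumes the mountpoint utility from util-linux is available, as it is on a Debian-based system.

```shell
#!/bin/sh
# Hypothetical helper: print whether a directory is currently a mount point.
# Defaults to /etc/pve; if the mountpoint tool is missing, this also
# reports "not mounted", so treat the answer as a hint only.
pmxcfs_state() {
    dir="${1:-/etc/pve}"
    if mountpoint -q "$dir" 2>/dev/null; then
        echo "mounted"
    else
        echo "not mounted"
    fi
}

# On the affected node's console, the service state and this boot's log
# usually tell you more:
#   systemctl status pve-cluster
#   journalctl -b -u pve-cluster
```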
Caution
If you also allow root to authenticate by password, you will be locked out of the GUI only. SSH will, obviously, not work with your key, but it will fall back to a password prompt.
How to avoid this
You need to use your own authorized_keys file, different from the default one that has been hijacked by PVE. The proper way to do this is to define its location in the config:
cat > /etc/ssh/sshd_config.d/LocalAuthorizedKeys.conf <<< "AuthorizedKeysFile .ssh/local_authorized_keys"
If you now copy your own keys to the /root/.ssh/local_authorized_keys file (on every node), you are immune to this design flaw.
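Copying the key over by hand could look something like the sketch below. The function name install_local_key and its arguments are made up for illustration, and it assumes the public key file holds a single key line. It creates the file with the strict permissions sshd expects and appends the key only if it is not there yet.

```shell
#!/bin/sh
# Hypothetical helper: add a public key to local_authorized_keys.
#   $1 = path to a public key file (assumed to hold one key line)
#   $2 = target .ssh directory (defaults to /root/.ssh)
install_local_key() {
    pubkey="$1"
    ssh_dir="${2:-/root/.ssh}"
    dest="$ssh_dir/local_authorized_keys"

    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir"
    touch "$dest" && chmod 600 "$dest"

    # Append only if this exact line is not present yet
    grep -qxF -f "$pubkey" "$dest" || cat "$pubkey" >> "$dest"
}

# On a real node, after adding the drop-in config, you would then
# validate the configuration, reload the daemon and check the result:
#   sshd -t && systemctl reload ssh
#   sshd -T | grep -i authorizedkeysfile
```

Note that sshd only picks up the new setting after a reload. A single path in AuthorizedKeysFile replaces the default list; the option accepts several whitespace-separated paths, so whether the PVE-managed file should also be listed is something to decide for your own setup.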
Tip
There are even better ways to approach this, e.g. SSH certificates, in which case you are not prone to this bug in your own setup. This is out of scope for this post.
Alternatives
If you were planning to use an additional non-privileged user with sudo, that is indeed a good alternative. Do note that PVE does not come with sudo pre-installed and will nevertheless require root to be allowed to log in over SSH to preserve the full features of the PVE stack, and these would remain broken.
Due to the Proxmox stack setup, inaccessible SSH for the root user prevents you from e.g. troubleshooting failing services (even when the SSH daemon itself is healthy) from the GUI shell of a healthy node. For this same reason, it is impossible to remove SSH access for the root account in Proxmox, which is also the only reason why this post “embraces” it. However, if you have another way in through other steps, it is just as good (the GUI path will still not work though).
Notes
As much as this post might appear to describe an infrequent issue, the failure of the pve-cluster service at boot (which needs to run on standalone nodes too) that causes the “lockout” is quite a common side effect of e.g. networking misconfiguration or corruption of the pmxcfs backend database. These are out of scope for this post, but they definitely happen more often than SSH alone failing, let alone networking as a whole, which of course would require an out-of-band (OOB) management approach anyway. This post was also written with home systems in mind, which do not have OOB/KVM or even rely entirely on the GUI.