It happens to the best of us from time to time: that server that’s been quietly churning away in the corner for months (years?) with little or no issues finally has something go wrong. It should be a pretty easy fix but when you try to do so you suddenly realize “I don’t know the password”.
With more and more applications tying into AD (or, at least, a directory of some sort), this is becoming less of an issue, but it still comes up occasionally. It is even worse if the machine in question is a hypervisor, because every guest on that machine is affected as well. Thankfully, if the machine is an ESXi server, there’s a fairly easy way to fix this using tools freely available online.
One note before the how-to:
This is NOT supported by VMware. Be careful, change only the absolute minimum needed, and make sure you have good backups before you do anything.
- Make sure you have good backups of all your guests and power them off.
Since you don’t have access to the ESXi console or vSphere client, you aren’t going to be able to power off the hypervisor gracefully, which means no VMware tools shutdown for the guests.
- Once all the machines are down (remember, Windows machines may still be up and installing updates, even if you can’t ping them), power off the server.
- Power on the server and boot off your Linux live CD.
- This is the tricky part. Your live system probably did not mount the ESXi system partitions automatically, so you’ll need to review the `dmesg` output and the disks it discovered for FAT16 volumes. There will be several of them but only one should have a file “state.tgz”. That’s the one you need.
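The search described above could be scripted roughly as follows. This is only a sketch under a few assumptions that are not part of the original procedure: the live system provides `blkid`, it reports the FAT16 volumes as type `vfat`, and `/mnt/esxi` (a hypothetical scratch directory) is free to use as a mount point. The function names are made up for illustration, and the mount loop has to run as root.

```shell
#!/bin/sh
# Sketch: find the FAT volume that holds state.tgz.
# MOUNT_POINT is a hypothetical scratch directory chosen for illustration.
MOUNT_POINT="${MOUNT_POINT:-/mnt/esxi}"

# Succeeds if the given directory contains a file named state.tgz.
has_state_file() {
    [ -f "$1/state.tgz" ]
}

# Mount each FAT partition read-only in turn; stop at the first one
# that contains state.tgz, print its device name, and leave it mounted.
find_state_partition() {
    mkdir -p "$MOUNT_POINT" || return 1
    for dev in $(blkid -t TYPE=vfat -o device); do
        mount -o ro "$dev" "$MOUNT_POINT" 2>/dev/null || continue
        if has_state_file "$MOUNT_POINT"; then
            echo "$dev"
            return 0
        fi
        umount "$MOUNT_POINT"
    done
    return 1
}
```

Calling `find_state_partition` should print the matching device and leave it mounted read-only. The volume would need to be remounted read-write (for example with `mount -o remount,rw`) before the state file is replaced later on.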
- Copy state.tgz to an empty folder on a separate disk (the local RAM disk should be fine), and extract it:
# tar zxf state.tgz
- Inside is another gzip’d tarball, local.tgz; extract this as well.
# tar zxf local.tgz
- You should now have three entries in the current folder: state.tgz, local.tgz, and the etc/ directory that came out of local.tgz.
- Like any other *nix, there is a file “shadow” in that etc/ directory with usernames and crypt’d passwords. Edit it and remove everything between the first and second colons on the line starting “root:”. This effectively gives the root user no password at all.
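If you would rather not make the edit by hand, something along these lines should do the same thing. It is a minimal sketch: the function name `blank_root_password` is hypothetical, and it assumes the root entry starts at the beginning of a line, as is usual for a shadow file.

```shell
#!/bin/sh
# Sketch: blank the password field of the root entry in a shadow file.
# Usage: blank_root_password etc/shadow
blank_root_password() {
    shadow_file="$1"
    # Replace everything between the first and second colon on the root line.
    new_contents=$(sed 's/^root:[^:]*:/root::/' "$shadow_file") || return 1
    # Write back into the same file so its permissions are left alone.
    printf '%s\n' "$new_contents" > "$shadow_file"
}
```

Run it from the working directory as `blank_root_password etc/shadow`, then check the result with `grep '^root:' etc/shadow`; the line should begin with `root::`.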
- Rebuild the state.tgz file with the updated files:
# tar zcf local.tgz etc/
# tar zcf state.tgz local.tgz
- Go back to where the ESXi partition was mounted and back up the old state file:
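Before copying anything back, it may be worth confirming that the rebuilt archive has the expected layout. The sketch below lists the contents of the local.tgz nested inside a state.tgz without unpacking it to disk. The function name `list_nested_state` is hypothetical, and it assumes a `tar` that supports `-O` (extract to stdout), as GNU tar does.

```shell
#!/bin/sh
# Sketch: list the contents of the local.tgz nested inside a state.tgz.
# Usage: list_nested_state state.tgz
list_nested_state() {
    # -O writes the extracted member to stdout instead of to disk,
    # so the inner archive can be listed without unpacking anything.
    tar zxOf "$1" local.tgz | tar ztf -
}
```

For example, `list_nested_state state.tgz | grep shadow` should show `etc/shadow` if the rebuild worked.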
# mv state.tgz state.tgz.old
- Copy the new state file from your working directory to the ESXi partition and reboot. You should now be able to log in on the console, and through the vSphere client, as root by leaving the password blank.