In this In-the-Lab segment we’re going to look at how to recover from a failed ZFS version update in case you’ve become ambitious with your NexentaStor installation after the last Short-Take on ZFS/ZPOOL versions. If you used the “root shell” to make those changes, chances are your grub is failing after reboot. If so, this blog can help, but before you read on, observe this necessary disclaimer:
NexentaStor is an appliance operating system, not a general purpose one. The accepted way to manage the system volume is through the NMC shell and NMV web interface. Using a “root shell” to configure the file system(s) is unsupported and may void your support agreement(s) and/or license(s).
That said, let’s assume that you updated the syspool filesystem and zpool to the latest versions using the “root shell” instead of the NMC (e.g. following a system update where zfs and zpool warnings declare that your pool and filesystems are too old). In such a case, the resulting syspool will not be bootable until you update grub (this happens automagically when you use the NMC commands). When this happens, you’re greeted with the following boot prompt:
Grub is now telling you that it has no idea how to boot your NexentaStor OS. Chances are there are two things that will need to happen before your system boots again:
- Your boot archive will need updating, pointing to the latest checkpoint;
- Your master boot record (MBR) will need to have grub installed again.
We’ll update both in the same recovery session to save time. This assumes you know, or have a rough idea of, your intended boot checkpoint; it is usually the highest-numbered rootfs-nmu-NNN checkpoint, where NNN is a three-digit number. The first step is to load the recovery console. This could have been done from the “Safe Mode” boot menu option if grub was still active. However, since grub is blown away, we’ll boot from the latest NexentaStor CD and select the recovery option from the menu.
Import the syspool
Then, we log in as “root” (empty password). From this “root shell” we can import the existing syspool (its disks must be connected to active controllers) with the following command:
# zpool import -f syspool
Note the use of the “-f” flag to force the import of the pool. Chances are the pool was never “destroyed” or “exported”, so zpool will “think” it still belongs to another system (your boot system, not the rescue system). As a precaution, zpool assumes such a pool is still “in use” by the “other system” and rejects the import to avoid “importing an imported pool”, which would be catastrophic.
With the syspool imported, we need to mount the correct (latest) checkpointed filesystem as our boot reference for grub, remove that checkpoint’s zpool.cache file (in case the pool disks have been moved but are all still present), update the boot archive to correspond to the mounted checkpoint, and install grub to the disk(s) in the pool (i.e. each mirror member).
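Before touching the disks, it can help to see the whole sequence laid out. The sketch below only prints the commands for review and runs nothing. Note that print_recovery_plan is a hypothetical helper name, the /tmp/syspool mount point is simply the one used in this article, and it assumes installgrub is handed the stage1 and stage2 files from the mounted checkpoint as two separate arguments.

```shell
#!/bin/sh
# Hypothetical helper (not part of NexentaStor): print, but do not run,
# the recovery commands for a given checkpoint and set of mirror disks.
# Usage: print_recovery_plan CHECKPOINT DISK [DISK...]
print_recovery_plan() {
    checkpoint=$1
    shift
    mnt=/tmp/syspool   # assumption: same mount point as used in this article

    echo "mkdir -p $mnt"
    echo "mount -F zfs syspool/$checkpoint $mnt"
    echo "rm -f $mnt/etc/zfs/zpool.cache"
    echo "bootadm update-archive -R $mnt"
    # One installgrub line per mirror member, so every disk stays bootable.
    for disk in "$@"; do
        echo "installgrub -f -m $mnt/boot/grub/stage1 $mnt/boot/grub/stage2 /dev/rdsk/$disk"
    done
    echo "umount $mnt"
}

# Example: review the plan for a two-disk mirror before typing anything.
print_recovery_plan rootfs-nmu-013 c6t13d0s0 c6t14d0s0
```

Printing the plan first is a deliberate choice: on a rescue console, a mistyped device name is much cheaper to catch on screen than after the MBR has been rewritten.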
List the Checkpoints
# zfs list -r syspool
From the resulting list, we’ll pick our highest-numbered checkpoint; for the sake of this article let’s say it’s “rootfs-nmu-013” and mount it.
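If the list is long, picking the highest number by eye is error-prone. As a rough sketch, the dataset names can be filtered and sorted numerically instead. Here latest_checkpoint is a hypothetical helper, and it assumes checkpoints are named syspool/rootfs-nmu-NNN as described above; anything else in the listing (snapshots, swap, dump) is ignored.

```shell
#!/bin/sh
# Hypothetical helper: read dataset names on stdin (one per line) and
# print the rootfs-nmu-NNN checkpoint with the highest number.
latest_checkpoint() {
    # Keep only exact checkpoint names (this also drops name@snapshot lines),
    # then sort numerically on the third dash-separated field (the NNN part).
    grep -E '^syspool/rootfs-nmu-[0-9]+$' | sort -t- -k3,3n | tail -n 1
}

# On the rescue system this would be fed from zfs itself:
#   zfs list -H -o name -r syspool | latest_checkpoint
```

The -H and -o name options make zfs list print bare dataset names with no header, which keeps the filter simple. Treat the result as a suggestion and confirm it against the full listing before mounting.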
Mount the Checkpoint
# mkdir /tmp/syspool
# mount -F zfs syspool/rootfs-nmu-013 /tmp/syspool
Remove the ZPool Cache File
# cd /tmp/syspool/etc/zfs
# rm -f zpool.cache
Update the Boot Archive
# bootadm update-archive -R /tmp/syspool
Determine the Active Disks
# zpool status syspool
For the sake of this article, let’s say the syspool was a three-way mirror and the zpool status returned the following:
 scan: resilvered 8.64M in 0h0m with 0 errors on Tue Nov 16 12:34:40 2010

        NAME           STATE     READ WRITE CKSUM
        syspool        ONLINE       0     0     0
          mirror-0     ONLINE       0     0     0
            c6t13d0s0  ONLINE       0     0     0
            c6t14d0s0  ONLINE       0     0     0
            c6t15d0s0  ONLINE       0     0     0

errors: No known data errors
This shows that the three-disk mirror is composed of disks/slices c6t13d0s0, c6t14d0s0 and c6t15d0s0. We’ll use that information for the grub installation.
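With more than a couple of mirror members, copying device names by hand invites typos. The following is a minimal sketch for pulling them out of the status output. The name list_pool_disks is hypothetical, and the pattern assumes Solaris-style cXtYdZsN (or cXdZsN) device names like the ones shown above; any other naming scheme would need a different pattern.

```shell
#!/bin/sh
# Hypothetical helper: read `zpool status` output on stdin and print
# each member device name (first column) on its own line.
list_pool_disks() {
    # Match names such as c6t13d0s0 or c0d0s0; the pool name, the mirror-N
    # vdev lines, and the header line do not match and are skipped.
    # On older Solaris-derived systems, nawk may be needed instead of awk.
    awk '$1 ~ /^c[0-9]+(t[0-9]+)?d[0-9]+(s[0-9]+)?$/ { print $1 }'
}

# On the rescue system this would be fed from zpool itself:
#   zpool status syspool | list_pool_disks
```

Each name printed corresponds to a raw device under /dev/rdsk, which is the form the grub installation step expects.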
Install Grub to Each Mirror Disk
# cd /tmp/syspool/boot/grub
# installgrub -f -m stage1 stage2 /dev/rdsk/c6t13d0s0
# installgrub -f -m stage1 stage2 /dev/rdsk/c6t14d0s0
# installgrub -f -m stage1 stage2 /dev/rdsk/c6t15d0s0
Unmount and Reboot
# cd /
# umount /tmp/syspool
Now reboot, and the system should be restored to a bootable configuration based on the selected system checkpoint. A similar procedure can be found on Nexenta’s site when using the “Safe Mode” boot option. If you follow that process, you’ll quickly encounter an error, likely intentional and meant to elicit a call to support for help. See if you can spot the step…