Today I finally switched my US server from lilo to grub as its bootloader.
The reason for doing it only now is that I normally have no remote IP console (KVM) for that physical server, which sits in a US datacenter while I live in Europe. The server’s storage is configured as software RAID1, and it has been running a Slackware huge kernel for years because, when I set it up long ago, I was unable to make a generic kernel plus initrd work.
Now this server runs Slackware-current and the recent updates to Slackware-current include automatic recreation of an initrd, followed by an update of the Grub configuration when installing a new kernel. Essentially, when you have configured Grub to boot your machine, you won’t have to re-install the bootloader until the actual grub package is upgraded to a new version.
I wanted this switch from lilo to grub to ease future upgrades, but I had to wait until the server owner was able to connect a KVM and give me access. The reason for being hesitant to just install grub and reboot is obvious: what if the server won’t come up after boot, for whatever reason? I won’t be able to fix that remotely without access to the console and maybe the BIOS.
I did my research on installing Grub to a software RAID and in the course of that discovered that the file “/etc/mdadm.conf” is not even needed anymore to assemble the arrays, because the RAID metadata is written to the disks themselves and mdadm extracts it on the fly. Anyway, I also learned that there’s a peculiarity that you need to take into account when using Grub to boot off a software RAID: you need to install Grub to both of the hard drives that together form the software RAID1. I thought, well I can do that!
Then I scribbled down some possible variations of the “grub-install” command that I would try until one actually worked. Why that doubt about a successful outcome? Obviously because I was bitten before! I switched a Linode server from lilo to grub last year and the grub-install command crapped out with:
grub-install: warning: Embedding is not possible. GRUB can only be installed in this setup by using blocklists. However, blocklists are UNRELIABLE and their use is discouraged..
grub-install: error: will not proceed with blocklists
What I eventually needed to do was add “--force” to the grub-install command because apparently Grub works fine with blocklists despite the error. The command eventually became:
# grub-install --recheck /dev/sda --force
… and the Linode rebooted just fine into Grub. Which goes to show that Grub is at least as finicky as lilo.
Of course, when you anticipate bad things to happen, you’re jinxed and bad things will happen.
When I was ready to start the switch from lilo to grub, I had a KVM connected, installed the latest kernel-generic and let that generate an initrd.img and write a Grub configuration. Then I proceeded with:
# grub-install --recheck /dev/sda
What I did not anticipate was the resulting error:
Installing for i386-pc platform.
grub-install: warning: this GPT partition label contains no BIOS Boot Partition; embedding won’t be possible.
grub-install: error: embedding is not possible, but this is required for RAID and LVM install.
I had to search for that warning and the error to find out what was happening. Searching for the error message (last line) was not returning much useful information but when searching for the warning, I found the GRUB article on the Arch Linux Wiki (Arch to the rescue once again): GRUB#GUID_Partition_Table_(GPT)_specific_instructions
It is a really informative and instructive article! Basically, my disks are partitioned with GPT rather than with a classic MBR partition table. Grub’s requirements are vastly different in those two cases:
- When installing to a MBR disk, it squeezes the ‘core.img’ code in the post-MBR gap, between byte number 512 and the start of the first partition. Now it also makes sense why disk partitioning utilities start the first partition at sector 2048 which leaves almost 1 MB for the Grub bootloader.
- When installing to a GPT disk, Grub wants to install its ‘core.img’ to a separate “BIOS Boot Partition” (partition type code ‘EF02’ in gdisk).
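The “almost 1 MB” figure for the post-MBR gap is easy to verify with a bit of shell arithmetic. This is just a sketch that assumes 512-byte logical sectors and a first partition starting at sector 2048; the variable names are made up for the illustration.

```shell
#!/bin/sh
# Assumptions: 512-byte logical sectors, first partition at sector 2048.
sector_size=512
first_partition_start=2048

# Sector 0 holds the MBR itself, so the gap covers sectors 1 through 2047.
mbr_gap_bytes=$(( (first_partition_start - 1) * sector_size ))

echo "post-MBR gap: ${mbr_gap_bytes} bytes"
# prints: post-MBR gap: 1048064 bytes (1 MiB minus one sector)
```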
Oops… my disk was already partitioned, not leaving any space and was already running as ‘us.slackware.nl’ for years! I could not simply erase the disks and start from scratch.
Luckily the Arch Wiki article pointed out that this BIOS Boot partition can still be created, in the space before the first partition. Modern disk partition tools, as said earlier, create the first partition at sector 2048. The trick is to use gdisk (because this is a GPT disk) and prompt it to create a new partition. Gdisk will tell you that it can create one that starts at sector 34 and will end at sector 2047. Perfect for our needs.
So, I created that partition and assigned partition type ‘EF02’ to it (equivalent to GUID ‘21686148-6449-6E6F-744E-656564454649’). I skipped creating a filesystem on it, since none is needed.
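For anyone who prefers a scriptable route over the interactive gdisk prompts, roughly the same result should be achievable with sgdisk, the command-line sibling of gdisk. The sketch below is an assumption-laden illustration and not the procedure used on this server: it assumes partition number 4 is unused, that sectors 34 to 2047 are free, and it defaults to a dry run (the DRY_RUN variable and the bios_boot_cmd helper are hypothetical names) so that nothing is written unless you ask for it.

```shell
#!/bin/sh
# Sketch only: non-interactive equivalent of the gdisk session.
# Assumptions: sgdisk is installed, partition number 4 is unused,
# and sectors 34-2047 are free on the target disk.
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the command, 0 = really run it

bios_boot_cmd() {
    # -a 1          align on single sectors, so the start is not rounded up
    # -n 4:34:2047  new partition 4 spanning sectors 34 through 2047
    # -t 4:EF02     set its type to BIOS Boot Partition
    echo "sgdisk -a 1 -n 4:34:2047 -t 4:EF02 $1"
}

cmd=$(bios_boot_cmd /dev/sda)
if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $cmd"
else
    $cmd
fi
```

Every disk that Grub gets installed to needs its own BIOS Boot Partition, so on a two-disk RAID1 the same step applies to the second disk as well.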
It turns out that this partition can have any position in the partition table, as long as it lies within the first 2 TiB of the disk. In my case, it became partition number ‘4’ even though it’s physically located at the start of the disk.
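Since the partition number says nothing, the type GUID is what identifies a BIOS Boot Partition. A small check along these lines could confirm that a disk has one before running grub-install. It is a sketch: the has_bios_boot helper is a hypothetical name, and it simply scans a list of partition type GUIDs on standard input.

```shell
#!/bin/sh
# GUID that marks a BIOS Boot Partition (gdisk type code EF02).
BIOS_BOOT_GUID="21686148-6449-6E6F-744E-656564454649"

# Reads partition type GUIDs on stdin (one per line) and succeeds
# if the BIOS Boot Partition GUID is among them (case-insensitive).
has_bios_boot() {
    grep -qiF "$BIOS_BOOT_GUID"
}

# Usage on a real system (lsblk's PARTTYPE column prints the type GUID):
#   lsblk -no PARTTYPE /dev/sda | has_bios_boot && echo "BIOS Boot Partition found"
```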
After creating the partition, I re-ran the grub-install commands:
# grub-install --recheck /dev/sda
# grub-install --recheck /dev/sdb
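With more than one RAID member it is easy to forget a disk, so the two commands can also be wrapped in a loop. The following is a minimal sketch assuming the members are /dev/sda and /dev/sdb as in this post; RAID_DISKS, DRY_RUN and install_grub_all are hypothetical names, and the default is a dry run that only prints what would be executed.

```shell
#!/bin/sh
# Sketch: install Grub to every disk that is a member of the RAID1,
# so that either disk can boot the machine on its own.
# Assumption: the members are /dev/sda and /dev/sdb.
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = really run them
RAID_DISKS="/dev/sda /dev/sdb"

install_grub_all() {
    for disk in $RAID_DISKS; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "grub-install --recheck $disk"
        else
            # Stop at the first failure rather than rebooting half-installed.
            grub-install --recheck "$disk" || return 1
        fi
    done
}

install_grub_all
```

Run it as root with DRY_RUN=0 once the printed commands look right.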
There were no errors, so I double-checked my KVM access and rebooted. The server came back up without a hiccup.
I hope this write-up will help someone somewhere in the future.
Cheers, Eric