We’ve not experienced many problems over the years, but this one meant that we were offline for much longer than ever before and considerable sleep was lost.

The problem showed up after the server was restarted: the boot ground to a halt long before the webserver and database could start.

It also happened at 19:30 on a Friday, so it took a while to get an answer from our hosting support team (their ticketing system also had a fault).

Once they had logged the request, they hopped onto the system and had the problem identified and fixed within 30 minutes.

However, that was 14:30 on the Sunday.

So big apologies for the downtime.

For the technically minded, here’s what happened and how it was fixed.

Watching the server attempt to boot through the management interface, it looked to be dropping to a busybox shell with the following error:

Begin: Running /scripts/local-block ... mdadm: No arrays found in config file or automatically
mdadm: error opening /dev/md?*: No such file or directory
mdadm: No arrays found in config file or automatically
done.
mdadm: No arrays found in config file or automatically
done.
Begin: Running /scripts/local-block ... mdadm: No arrays found in config file or automatically
done.
Begin: Running /scripts/local-block ... mdadm: No arrays found in config file or automatically
done.
mdadm: No arrays found in config file or automatically
Begin: Running /scripts/local-block ... mdadm: No arrays found in config file or automatically
done.
done.
Gave up waiting for root file system device.  Common problems:
 - Boot args (cat /proc/cmdline)
   - Check rootdelay= (did the system wait long enough?)
 - Missing modules (cat /proc/modules; ls /dev)
ALERT!  UUID=65f98814-65b9-4d74-be2c-94e66d96d261 does not exist.  Dropping to a shell!

BusyBox v1.22.1 (Debian 1:1.22.0-19+b3) built-in shell (ash)
Enter 'help' for a list of built-in commands.

(initramfs)
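
The error message suggests checking the boot arguments with `cat /proc/cmdline`. As a minimal sketch of that check (the function name `root_from_cmdline` is made up for illustration, and it assumes `tr` and `sed` are available in the shell you are using), the `root=` value the kernel was given can be pulled out like this:

```shell
# Hypothetical helper: print the value of root= from a kernel command line.
# Reads /proc/cmdline by default, or a file given as the first argument.
root_from_cmdline() {
  # Put each boot argument on its own line, then keep only the root= value.
  tr ' ' '\n' < "${1:-/proc/cmdline}" | sed -n 's/^root=//p'
}
```

Running `root_from_cmdline` and comparing the result against the UUIDs listed by `blkid` shows whether the kernel is being asked for a device that the system cannot see at that point in the boot.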

Initially, this looked like it might be due to the UUID-based `/etc/fstab` entries, so we commented these out in favour of specifying the `/dev/vdaX` partitions explicitly:

# cat /etc/fstab
#UUID=65f98814-65b9-4d74-be2c-94e66d96d261     /     ext4     errors=remount-ro,noatime,nodiratime   0     1
/dev/vda2     /     ext4     errors=remount-ro,noatime,nodiratime   0     1
#UUID=17a487ad-14d5-4683-896b-27b84dae634d     /boot ext3   errors=remount-ro   0     1
/dev/vda1     /boot ext3   errors=remount-ro   0     1
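
Before editing `/etc/fstab` by hand, it can help to confirm whether the UUIDs it refers to actually exist. The following is only a sketch of one way to do that, not what we ran at the time; the function names `fstab_uuids` and `missing_fstab_uuids` are made up for illustration, and the second one assumes `blkid` from util-linux is installed:

```shell
# Hypothetical helper: print the UUID of every active fstab line that mounts
# by UUID. Commented-out lines start with "#UUID=" and so are skipped.
fstab_uuids() {
  awk '$1 ~ /^UUID=/ { sub(/^UUID=/, "", $1); print $1 }' "${1:-/etc/fstab}"
}

# Hypothetical helper: report each of those UUIDs that blkid cannot resolve
# to a device. If blkid itself is missing, every UUID is reported.
missing_fstab_uuids() {
  for uuid in $(fstab_uuids "${1:-/etc/fstab}"); do
    blkid -U "$uuid" >/dev/null 2>&1 || printf 'no device with UUID %s\n' "$uuid"
  done
}
```

If `missing_fstab_uuids` prints nothing, the fstab entries are probably not the culprit and the problem lies earlier in the boot.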

Rebooting after making these changes didn’t make any difference, so we suspected it might be due to GRUB’s boot configuration instead. Running `update-initramfs` returned the following errors (using `-u` to update, and `-k all` for all kernels):

# update-initramfs -u -k all
update-initramfs: Generating /boot/initrd.img-4.9.0-12-amd64
grep: /proc/cpuinfo: No such file or directory
Warning: couldn't identify filesystem type for fsck hook, ignoring.
W: mdadm: /etc/mdadm/mdadm.conf defines no arrays.
W: mkconf: MD subsystem is not loaded, thus I cannot scan for arrays.
W: mdadm: failed to auto-generate temporary mdadm.conf file.
update-initramfs: Generating /boot/initrd.img-4.9.0-11-amd64
grep: /proc/cpuinfo: No such file or directory
Warning: couldn't identify filesystem type for fsck hook, ignoring.
W: mdadm: /etc/mdadm/mdadm.conf defines no arrays.
W: mkconf: MD subsystem is not loaded, thus I cannot scan for arrays.
W: mdadm: failed to auto-generate temporary mdadm.conf file.
update-initramfs: Generating /boot/initrd.img-4.9.0-9-amd64
grep: /proc/cpuinfo: No such file or directory
Warning: couldn't identify filesystem type for fsck hook, ignoring.
W: mdadm: /etc/mdadm/mdadm.conf defines no arrays.
W: mkconf: MD subsystem is not loaded, thus I cannot scan for arrays.
W: mdadm: failed to auto-generate temporary mdadm.conf file.
update-initramfs: Generating /boot/initrd.img-4.9.0-8-amd64
grep: /proc/cpuinfo: No such file or directory
Warning: couldn't identify filesystem type for fsck hook, ignoring.
W: mdadm: /etc/mdadm/mdadm.conf defines no arrays.
W: mkconf: MD subsystem is not loaded, thus I cannot scan for arrays.
W: mdadm: failed to auto-generate temporary mdadm.conf file.
update-initramfs: /boot/initrd.img-4.9.0-6-amd64 has been altered.
update-initramfs: Cannot update. Override with -t option.
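
With several kernels installed, the output above gets long, and it is easy to miss that the last image was refused rather than regenerated. As a rough sketch (the function name `summarise_initramfs_run` is made up, and the patterns are based only on the message wording shown above), the output can be boiled down to one line per image:

```shell
# Hypothetical helper: read update-initramfs output on stdin and print
# "generated <version>" or "refused <version>" for each image mentioned.
summarise_initramfs_run() {
  sed -n \
    -e 's|^update-initramfs: Generating /boot/initrd\.img-\(.*\)$|generated \1|p' \
    -e 's|^update-initramfs: /boot/initrd\.img-\(.*\) has been altered\.$|refused \1|p'
}
```

For example, `update-initramfs -u -k all 2>&1 | summarise_initramfs_run` would list each kernel version alongside what happened to its image, with the warnings filtered out.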

We added the following entry to `/etc/mdadm/mdadm.conf` to ignore `/dev/vda` (since it’s a virtual rather than physical disk, we don’t need mdadm):

# cat /etc/mdadm/mdadm.conf
# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#
ARRAY <ignore> devices=/dev/vda
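
The earlier warning `W: mdadm: /etc/mdadm/mdadm.conf defines no arrays.` appears to be about the file having no `ARRAY` line at all. A quick way to see what the file currently defines is sketched below; the function name `active_array_lines` is made up for illustration, and the sketch only matches the keyword written out in full (mdadm.conf(5) also allows abbreviated keywords, which this does not handle):

```shell
# Hypothetical helper: print the uncommented ARRAY lines from an mdadm.conf.
# Comment lines have "#" as their first field, so they never match.
active_array_lines() {
  awk 'toupper($1) == "ARRAY"' "${1:-/etc/mdadm/mdadm.conf}"
}
```

With the entry above in place, `active_array_lines` should print the single `ARRAY <ignore> devices=/dev/vda` line; no output means the file still defines no arrays.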

We then passed the `-t` flag to `update-initramfs` to force an overwrite:

# update-initramfs -u -k all
update-initramfs: Generating /boot/initrd.img-4.9.0-12-amd64
update-initramfs: Generating /boot/initrd.img-4.9.0-11-amd64
update-initramfs: Generating /boot/initrd.img-4.9.0-9-amd64
update-initramfs: Generating /boot/initrd.img-4.9.0-8-amd64

Following this, we rebooted the machine, which then appeared to come back up successfully.

It’s difficult to say what originally caused the issue, but a recent kernel upgrade may have played a part: the server is now running 4.9.0-12, which is fairly recent (7th June 2020), and the upgrade would have affected the server’s initramfs entries.

Please let us know if you’re still seeing any issues!