Installing Debian 9 (Stable) on Dell Gen 14 servers with Perc H740P RAID Controllers


Dell’s 14th Generation servers can ship with Perc H740P RAID controllers. These are stonkingly good compared to the old H730 controllers, sporting 8GB of NVRAM against the old 2GB. However, Linux kernel 4.14 is the earliest kernel with driver support, and Debian 9.3 (at time of writing) runs 4.9.0-5. Debian can and does backport drivers into stable kernels but, at time of writing, it hasn’t done so for this one. I suspect that’s why you’re here reading this!

Foreword – Update your firmware!

[2018] These controllers are very new and, clearly, very unproven! Before you start, ensure they are running the latest firmware. We have had multiple perfectly working SSDs rejected from an array under heavy read load to the point that the array failed completely. Dell claim the latest firmware resolves it. We’ll see. If I’m honest, I regret buying 25 of the H740P and should probably have gotten the H730 until the issues with the H740P are ironed out. But we live and learn…

[May 2020] We’ve seen a few different firmware related issues on these controllers (regardless of OS) over the past few years. The latest firmware as of May 2020 seems pretty stable but it’s been a rocky ride so far. That said, the controllers have been great in terms of performance… just the reliability leaves something to be desired.

Anyway. Back to drivers. You have a few options…

Option 1: Use Debian 10

At the time of writing, Debian 10 was still the “Testing” release. Debian strongly discourage the use of “Testing” in production, though if you’re running a staging server (or you’ve got massive balls)… maybe it’s an option. Debian 10 worked fine at the time, and I suspected it would continue to do so.

Update: Debian 10 is now the latest stable version and you should be using it. That makes this blog post somewhat redundant so you might as well stop reading now 😀

Option 2: Install Debian on a disk not behind the RAID controller

As far as I’m aware, this isn’t how Dell ships servers with RAID controllers: the entire drive backplane is connected to the controller. If you can bodge in a disk or two that is not connected to the H740P, you can install Debian onto that. Once Debian is booted, you can either install a later kernel from Backports or you can compile and load the driver from Dell (see instructions below).

Once you’ve gotten Debian installed and able to detect arrays behind the RAID controller, it’s feasible that you can use something like an Ubuntu live CD (something with a kernel new enough to use the Perc H740P) to image your disk onto the RAID array and then boot from that. I’ve not tried this, but it’s probably possible.

Option 3: Install Debian 9 with the driver supplied by Dell

Dell supply drivers for Linux. They are listed as RPMs inside .tar.gz files for RHEL and SUSE. The SUSE driver works on Debian 9 – I’ve not tried the RHEL driver.

Download the .tar.gz (e.g. UnifiedDriver_7.700.52.00_SLES12.tar.gz) from Dell. Open it with 7zip, as that will allow you to browse through the many layers of archive. Do this:

  • Open the .tar.gz with 7zip
  • Open the tar
  • Open the folder
  • Open the -src RPM
  • Open the .cpio
  • Open the .tar.bz2
  • Open the .tar
  • Open the folder

Herein lie the source files!
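
If you’d rather not click through 7zip, the same layers can probably be peeled from a Linux shell. This is only a sketch: it assumes tar, bzip2, rpm2cpio and cpio are installed, and the function names and the '*src.rpm' file name pattern are my own guesses rather than anything Dell documents.

```shell
#!/bin/sh
# Sketch: unpack Dell's driver tarball layer by layer without 7zip.
# Assumes tar, bzip2, rpm2cpio and cpio are available on this machine.

# Print the first file under directory $1 whose name matches glob $2.
find_first() {
    find "$1" -type f -name "$2" | sort | head -n 1
}

# Unpack tarball $1 into work directory $2 and print where the sources landed.
unpack_dell_driver() {
    mkdir -p "$2/outer" "$2/rpm" "$2/src" || return 1
    work=$(cd "$2" && pwd) || return 1                 # absolute path, safe after cd
    tar -xzf "$1" -C "$work/outer" || return 1         # .tar.gz -> tar -> folder
    srpm=$(find_first "$work/outer" '*src.rpm')        # the -src RPM (pattern is a guess)
    [ -n "$srpm" ] || { echo "no source RPM found" >&2; return 1; }
    ( cd "$work/rpm" && rpm2cpio "$srpm" | cpio -idm --quiet ) || return 1   # RPM -> .cpio
    inner=$(find_first "$work/rpm" '*.tar.bz2')        # the inner source archive
    [ -n "$inner" ] || { echo "no .tar.bz2 found" >&2; return 1; }
    tar -xjf "$inner" -C "$work/src" || return 1       # .tar.bz2 -> .tar -> folder
    echo "$work/src"
}

# Usage sketch:
#   unpack_dell_driver UnifiedDriver_7.700.52.00_SLES12.tar.gz ./dell-driver
```

The directory it prints should contain the folder with the source files described above.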

Compilation

You’ll need to compile these on a Debian machine with an identical version of the kernel to that on which you wish to run the drivers. If you wish to install Debian from USB, Disk, etc. then you’ll need to load the drivers into the installer. As such, you’ll need to make sure that the kernel version you’re running is the same as the installer. To be safe, download a fresh install CD from the Debian website and use this to create your server and your compile machine. The compile machine can be a VirtualBox VM, or similar. It doesn’t need to have a Perc controller.

Your compile machine will need the Linux Kernel headers.

apt-get install linux-headers-$(uname -r)

Get a copy of the source files onto your compile machine and `cd` into your source directory.

Run the compile script:

./compile.sh

This will create a file called megaraid_sas.ko. That is your driver. You can copy this off your compile machine.
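
Before copying the module anywhere, it may be worth confirming that it was built for the kernel you expect. A minimal sketch: `modinfo -F vermagic` prints the module’s version string, whose first word is the kernel release it was compiled against. The helper name `vermagic_matches` is hypothetical.

```shell
#!/bin/sh
# Succeeds if the first word of the vermagic string $1 equals kernel release $2.
vermagic_matches() {
    built_for=${1%% *}          # strip everything from the first space onwards
    [ "$built_for" = "$2" ]
}

# Usage sketch on the compile machine:
#   vermagic_matches "$(modinfo -F vermagic ./megaraid_sas.ko)" "$(uname -r)" \
#       && echo "module matches the running kernel"
```

If the check fails, the kernel will most likely refuse to load the module, so recompile against the right headers first.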

Loading driver such that the Debian installer can “see” your RAID array

If you have physical access to the server, put this on a USB stick. If you don’t, the easiest way to use this in the Debian installer on a Dell is to make it into a floppy disk image and mount that with iDRAC. You can use MagicISO on Windows for this. Open MagicISO and:

File→New→Disk Image→2.88MB. Drag and drop your megaraid_sas.ko into the file pane on the right. Click File→Save As and save the file as something like driver.img.

The driver.img can be remotely loaded into iDRAC (assuming iDRAC Enterprise).
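
If you don’t have a Windows box with MagicISO, a similar image can probably be built on Linux. I haven’t tested this route with iDRAC, so treat it as a sketch: it assumes e2fsprogs 1.43 or later (for `mke2fs -d`, which fills the filesystem from a directory without needing root or a loop mount), and both function names are made up for illustration.

```shell
#!/bin/sh
# Create an empty 2.88MB image file at path $1 (2880 blocks of 1KiB).
blank_floppy_image() {
    dd if=/dev/zero of="$1" bs=1024 count=2880 2>/dev/null
}

# Build an ext2 image at $2 that contains the single driver file $1.
# Assumes mke2fs supports -d (e2fsprogs 1.43 or later).
make_driver_image() {
    stage=$(mktemp -d) || return 1
    cp "$1" "$stage/" && blank_floppy_image "$2" &&
        mke2fs -q -F -t ext2 -d "$stage" "$2"
    status=$?
    rm -rf "$stage"             # always clean up the staging directory
    return $status
}

# Usage sketch:
#   make_driver_image ./megaraid_sas.ko ./driver.img
```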

Start the Debian installer, ensuring that you do a Graphical Installation.

After network setup, hit Ctrl-Alt-F2 to enter a shell.

Find the device name of the USB or floppy disk image using fdisk (e.g. /dev/sdb). You can identify it based on size (floppy disk will be about 3MB).

fdisk -l
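
If the fdisk output is awkward to read in the installer shell, sysfs offers a rough alternative. The sketch below lists every sd* disk at or below a size limit; /sys/block/<dev>/size is reported in 512-byte sectors. The function name and the optional second argument (an alternative sysfs directory, handy for testing) are my own additions.

```shell
#!/bin/sh
# Print /dev/sdX for every sd* disk no larger than $1 bytes.
# $2 is the sysfs block directory and defaults to /sys/block.
list_small_disks() {
    for sizefile in "${2:-/sys/block}"/sd*/size; do
        [ -r "$sizefile" ] || continue      # glob matched nothing, or unreadable
        sectors=$(cat "$sizefile")
        bytes=$((sectors * 512))            # sysfs sizes are in 512-byte sectors
        if [ "$bytes" -gt 0 ] && [ "$bytes" -le "$1" ]; then
            dev=${sizefile%/size}
            echo "/dev/${dev##*/}"
        fi
    done
}

# Usage sketch: anything up to 4MB is probably the floppy image.
#   list_small_disks 4000000
```

For a USB stick, raise the limit to something just above the stick’s capacity.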

Mount the disk, changing the device name and filesystem type as appropriate (ext2 is right for the floppy image):

mount /dev/sdb /mnt -t ext2

Copy the driver across:

cp /mnt/megaraid_sas.ko /lib/modules/$(uname -r)/kernel/drivers/scsi/megaraid/

Load the module:

modprobe megaraid_sas
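
It’s worth checking that the module really did load before going back to the installer. A small sketch that looks for it in /proc/modules; the helper name is hypothetical and the optional second argument exists only so it can be pointed at a test file.

```shell
#!/bin/sh
# Succeeds if module $1 appears in the loaded-module list.
# $2 is the list to read and defaults to /proc/modules.
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}"
}

# Usage sketch:
#   module_loaded megaraid_sas && echo "driver loaded"
#   dmesg | tail        # the controller and its arrays should be mentioned here
```

If a stock megaraid_sas was already loaded before you copied the new file across, modprobe will do nothing; in that case `modprobe -r megaraid_sas` followed by another `modprobe megaraid_sas` should pick up the new one.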

Unmount the USB/floppy disk:

umount /mnt

Hit Ctrl-Alt-F5 to get back to the GUI.

Loading the driver into the installed OS, such that it boots correctly

Once you’ve installed Debian and are still in the installer, on the last screen before reboot, hit Ctrl-Alt-F2 to enter the shell again.

Copy the module into the installed system, add it to the initramfs module list and rebuild the initramfs:

cp /lib/modules/$(uname -r)/kernel/drivers/scsi/megaraid/megaraid_sas.ko /target/lib/modules/$(uname -r)/kernel/drivers/scsi/megaraid/
chroot /target
echo megaraid_sas >> /etc/initramfs-tools/modules
update-initramfs -u
exit
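
If you end up repeating these steps, the `echo … >>` line will add a duplicate entry each time. A minimal sketch of an idempotent version; the function name is my own, and it simply appends the module name only when it isn’t already listed.

```shell
#!/bin/sh
# Append module name $1 to the module list file $2 unless it is already there.
ensure_module_listed() {
    grep -qxF "$1" "$2" 2>/dev/null || echo "$1" >> "$2"
}

# Usage sketch, inside the chroot:
#   ensure_module_listed megaraid_sas /etc/initramfs-tools/modules
#   update-initramfs -u
#   lsinitramfs /boot/initrd.img-$(uname -r) | grep megaraid_sas
```

The final `lsinitramfs` line should print the path of megaraid_sas.ko inside the image; if it prints nothing, don’t reboot yet.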

Reboot, and Debian should boot.

When you update your kernel

On every kernel update, you’ll need to recompile and install the driver. DKMS can do this for you, but here are the manual instructions. It’s really important that you do this before rebooting with the new kernel, else it will not boot. You also won’t be able to boot older kernels… so beware!

Compile the driver against the kernel that you have upgraded to, as per the instructions above. You can do this on the server itself or on a separate compilation machine, as long as the right kernel headers are installed.

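Copy the driver to the server and `cd` into its directory. Now copy it into the new kernel’s module directory and rebuild that kernel’s initramfs. Bear in mind that `$(uname -r)` reports the kernel that is currently running: if the update changed the kernel version string, set KVER to the new version (the name of the new directory under /lib/modules; 4.9.0-6-amd64 below is only an example) rather than relying on `uname -r`:

KVER=4.9.0-6-amd64
cp ./megaraid_sas.ko /lib/modules/$KVER/kernel/drivers/scsi/megaraid/
update-initramfs -u -k $KVER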

You can now reboot into the new kernel.
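
For the DKMS route mentioned above, something along these lines might work. I haven’t run it against Dell’s source, so it is very much a sketch: it assumes the source tree ships a kbuild-compatible Makefile (the manual route uses compile.sh instead), and the function name `write_dkms_conf` is made up. The directives and the `${kernel_source_dir}` and `${dkms_tree}` variables are standard DKMS ones.

```shell
#!/bin/sh
# Write a minimal dkms.conf into source directory $1, using $2 as the
# package version (e.g. 7.700.52.00, taken from the tarball name).
write_dkms_conf() {
    cat > "$1/dkms.conf" <<EOF
PACKAGE_NAME="megaraid_sas"
PACKAGE_VERSION="$2"
BUILT_MODULE_NAME[0]="megaraid_sas"
DEST_MODULE_LOCATION[0]="/kernel/drivers/scsi/megaraid"
# Assumption: the source tree ships a kbuild-compatible Makefile.
MAKE[0]="make -C \${kernel_source_dir} M=\${dkms_tree}/\${PACKAGE_NAME}/\${PACKAGE_VERSION}/build modules"
CLEAN="make -C \${kernel_source_dir} M=\${dkms_tree}/\${PACKAGE_NAME}/\${PACKAGE_VERSION}/build clean"
AUTOINSTALL="yes"
EOF
}

# Usage sketch, with the sources copied to /usr/src/megaraid_sas-7.700.52.00:
#   write_dkms_conf /usr/src/megaraid_sas-7.700.52.00 7.700.52.00
#   dkms add     -m megaraid_sas -v 7.700.52.00
#   dkms build   -m megaraid_sas -v 7.700.52.00
#   dkms install -m megaraid_sas -v 7.700.52.00
```

Even with DKMS in place, I’d still check that the new kernel’s initramfs contains megaraid_sas before rebooting.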

I updated my kernel and rebooted already, now it’s broken

Oh dear. You can fix this by booting something like an Ubuntu Live CD (i.e. something that has the drivers for the controller already). In this, you can mount the partitions from your Debian install. If you have multiple partitions, ensure that at least /, /boot and /var are mounted.

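You can now chroot into your root mount (e.g. chroot /mnt/debian) and then follow the instructions above (copy the driver and update initramfs). Inside the chroot, `uname -r` reports the live CD’s kernel, so use the version of your Debian kernel (the directory name under /lib/modules) in the paths instead.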
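
The mounting part of the rescue can be sketched roughly as follows. The device name and mount point in the usage lines are placeholders for whatever your system uses, and the function name is hypothetical. Setting RUN=echo prints the commands instead of running them, which is a cheap way to sanity-check the plan first.

```shell
#!/bin/sh
# Mount the Debian root partition $1 at $2 and bind the pseudo-filesystems
# that update-initramfs tends to need inside a chroot.
# Set RUN=echo to print the commands rather than execute them.
prepare_rescue_chroot() {
    ${RUN:-} mkdir -p "$2" &&
    ${RUN:-} mount "$1" "$2" &&
    ${RUN:-} mount --bind /dev "$2/dev" &&
    ${RUN:-} mount --bind /proc "$2/proc" &&
    ${RUN:-} mount --bind /sys "$2/sys"
}

# Usage sketch from the live CD, as root (device name is a placeholder):
#   prepare_rescue_chroot /dev/sda1 /mnt/debian
#   chroot /mnt/debian
```

If /boot or /var live on separate partitions, mount them under the same mount point (e.g. /mnt/debian/boot) before entering the chroot.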

Some more revelations

[22/05/2020] – We’ve seen data corruption across 3 high volume database clusters which resulted in quite a lot of pain. Having restored the data from a fresh dump and upgrading to the latest firmware, at time of writing, the issues seem to have subsided but it might be too early to tell.

[10/03/2018] – The latest firmware to unlock the full 8GB of cache has been released. Be sure to upgrade to it.

[05/03/2018] – We had 3 disks in a 4 disk RAID 10 array fail. That was messy. Turns out it’s a bug in the controller firmware and the latest firmware fixes it. Be sure to upgrade!

[05/02/2018] – Today my colleague noticed that megacli only reports 4GB of cache on this controller. It turns out that Dell couldn’t get it working with 8GB and have promised to release a firmware update some time in March to unlock the full 8GB. For now, you’ve got 4… which is still better than the 2GB of the older models.

[23/02/2018] – An SSD failed and needed replacing. It was possible to mark the disk as offline, but attempts to mark it as missing or prepare it for removal failed. The assumption is that megacli isn’t compatible with these controllers at the moment. Storcli didn’t work either. In the end, my colleague had remote hands remove and replace the drive. This caused the controller to mark it as Foreign. megacli was used to remove the foreign flag from the drive. Following this, he ensured that auto rebuild was enabled for the controller and then set the disk as a hot spare. This caused the controller to add it back to the array and rebuild. If anyone else has found a nice way to replace disks in an array, please add a comment!

8 comments

  1. THANK YOU!

    Today I had to remotely install Debian Stretch on 2 Dell R440 servers which a customer of ours ordered – after the housing provider guaranteed “RAID controller is 100% compatible with Debian Stretch”. Funny. Not.

    Thanks to this article the H740P did cost me only about 4 hours extra.

    The missing | on the remote console’s virtual keyboard is beyond a joke too, but at least there’s a HTML5 console now (iDRAC 9 Enterprise).

    One hint for new kernels:

    The megaraid_sas in the current stretch-backports kernel, 4.14.0-0.bpo.3-amd64, is fresh enough by the way.

    Basically (after adding stretch-backports source to /etc/apt/sources.list):
    apt-get update
    apt-get install -t stretch-backports linux-image-amd64

    I managed to complete that before the first boot from the RAID, phew, obviously after
    chroot /target /bin/bash
    in console 2 of the installer.

    One more hint to optimize the installation phase work:

    Switching between text consoles does not require Ctrl (Ctrl-Alt-F2, …), Alt-Fx (Alt-F2) is satisfactory.
    Ctrl-Alt-Fx is only required to switch away from an X11 console (usually 7+).

    Regards, Christoph

  2. Good morning,

    Thanks for your article, it helped me a lot.
    My H740P RAID card is detected after adding the driver when installing Debian 9.5
    However, I have a problem with updates.

    So I will ask DELL to replace the H740P with H730P, can you tell me if the H730P is supported by the 4.9.0.7 kernel please?

    Thanks again for your help

    Best regards;
    Maki

    1. H730P is supported natively on Debian 9. What issues do you have with updates? You’ll struggle to get Dell to swap the RAID controller after purchase – computer will say no.

      1. Hello,

        >>730P is supported natively on Debian 9.
        This is great news;) I didn’t have any more in stock to test.

        We deploy and host servers for clients, they are autonomous thereafter, therefore likely to make kernel updates (remotely).
        We have a partnership with DELL, so I will arrange with them to deduct these new cards on my next orders (this is done regularly).
        Otherwise you are right, DELL cannot exchange an item already paid.

        Thank you again for your precious help
