nvidia-automatic-builds-via-dkms

Red Hat based systems ship with an nvidia-compatible graphics kernel module called nouveau. Keeping this default configuration is recommended because nouveau is supported by Red Hat and is provided with each new kernel update.

However, in some situations the third-party proprietary graphics kernel module from nvidia is needed instead. This module is not supplied or supported by Red Hat, and like any third-party kernel module it must be rebuilt for each new kernel. Doing this by hand is tedious and confusing (e.g., why did my graphical login break after a reboot?). A framework named DKMS (Dynamic Kernel Module Support) automates the rebuild of third-party kernel modules during the RPM install of a new kernel (i.e., DKMS builds the nvidia driver as part of the kernel RPM install). This is done via hooks in the kernel RPM's post-install scriptlet.

This document describes how to install the nvidia kernel module with DKMS support, so a manual rebuild of the nvidia kernel module is no longer required for every new kernel.

Find the nvidia graphics (video) model number.

# lspci | grep -i nvidia

An example of what the output looks like (this is just the graphics card line from the output):

01:00.0 VGA compatible controller: NVIDIA Corporation GP107GL [Quadro P600] (rev a1)
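
If the model name is needed in a script (for example, to compare it against the supported products list), it can be pulled out of the lspci line. The following is only a minimal sketch: the function name nvidia_model is made up for illustration, and it assumes the model name appears in square brackets as in the example line above.

```shell
#!/bin/sh
# Hypothetical helper (not part of any nvidia or Red Hat tool):
# read lspci output on stdin and print the text found inside the
# square brackets of each "NVIDIA Corporation" line.
# Lines without brackets are skipped.
nvidia_model() {
    sed -n 's/.*NVIDIA Corporation.*\[\(.*\)].*/\1/p'
}

# Usage: lspci | nvidia_model
```

Lines for other nvidia functions on the same card (such as the audio controller) normally carry no bracketed name and are ignored by this sketch.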

A google search for 'nvidia linux' finds the nvidia linux download page:

https://www.nvidia.com/object/unix.html

Under the "64 bit" section at the top of that page, select "Latest Long Lived Branch Version".

It currently looks like this:

Linux x86_64/AMD/EM64T
Latest Long Lived Branch Version: 410.78

After selecting the latest long lived branch, select the "Supported products" tab and search for the model name/number found with lspci above.

In the example above, "Quadro P600" appears in the list of supported products. If you cannot find your model number, go back and look under the supported products for the Latest Legacy GPU version.

Click download.

The current link (NON-legacy) from above is (as of 2018-Dec-7):
http://us.download.nvidia.com/XFree86/Linux-x86_64/410.78/NVIDIA-Linux-x86_64-410.78.run

And the current LEGACY driver as of 2018-Dec-10 is (this is for older systems / older graphics cards):
http://us.download.nvidia.com/XFree86/Linux-x86_64/390.87/NVIDIA-Linux-x86_64-390.87.run

An example of how to directly download one of the above using a command line:

# mkdir /scswork/ksa
# cd /scswork/ksa
# curl -sLO 'http://us.download.nvidia.com/XFree86/Linux-x86_64/410.78/NVIDIA-Linux-x86_64-410.78.run'
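
Both download links above follow the same layout, with the version number appearing twice. As a sketch under that assumption (the layout is inferred from those two links only, and nvidia may change it), the URL can be built from the version number. The function name nvidia_run_url is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the installer URL for a driver version.
# ASSUMPTION: the URL layout is inferred from the two links in this
# document; it is not a documented nvidia interface.
nvidia_run_url() {
    version="$1"
    echo "http://us.download.nvidia.com/XFree86/Linux-x86_64/${version}/NVIDIA-Linux-x86_64-${version}.run"
}

# Usage: curl -sLO "$(nvidia_run_url 410.78)"
```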

Install the DKMS rpm (this should be automatically available from the EPEL yum repository)

# yum install dkms

Installing the dkms rpm should also prompt you to install the kernel-devel RPM (as a dependency). If you already have the kernel-devel RPM installed, you won't get prompted. The kernel-devel RPM is required for the nvidia kernel module to be built. DKMS is not supplied or supported by Red Hat, but packages available in EPEL usually work well with RHEL.
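
The kernel-devel RPM has to match the kernel the module is built for. On RHEL-style systems kernel-devel places its files under /usr/src/kernels/ in a directory named after the kernel release, so one way to check is to look for that directory. This is a minimal sketch: the function name have_kernel_devel is hypothetical, and the optional second argument (an alternate root directory) exists only so the check can be tried without touching the real system.

```shell
#!/bin/sh
# Hypothetical helper: succeed if kernel-devel files exist for a kernel.
#   $1  kernel release to check (default: the running kernel)
#   $2  alternate root directory (default: none, i.e. the real system)
have_kernel_devel() {
    release="${1:-$(uname -r)}"
    root="${2:-}"
    [ -d "${root}/usr/src/kernels/${release}" ]
}

# Usage:
#   have_kernel_devel || echo "kernel-devel for $(uname -r) is not installed"
```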

You can view the help info (optional) with these commands (use -A to view the "Advanced help options"):

# /bin/sh ./NVIDIA-Linux-x86_64-[version].run --help
# /bin/sh ./NVIDIA-Linux-x86_64-[version].run -A

Stop the currently running X server by changing to run level 3.

This is required by the nvidia installer. This will kick off anyone who is logged in at the video console. So check to see if anyone is logged in at ":0" which is the video console. You can use the 'w' command for this. Here is an example of someone logged in at the video console:

[root@lcls-fairley ~]# w
USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU WHAT
dfairley tty1     :0               Tue07   27:54m  0.00s  0.04s pam: gdm-password
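
The same check can be scripted. The sketch below reads the output of 'w -h' (w without the header line) and prints the users whose FROM column is ":0". The function name console_users is hypothetical, and the column layout (USER, TTY, FROM, ...) is assumed to match the example above; other versions of w may lay the columns out differently.

```shell
#!/bin/sh
# Hypothetical helper: read "w -h" output on stdin and print each user
# whose third column (FROM) is ":0", i.e. logged in at the video console.
# ASSUMPTION: columns are USER TTY FROM ... as in the example above.
console_users() {
    awk '$3 == ":0" { print $1 }'
}

# Usage: w -h | console_users
```

No output means no one was found at the video console under these assumptions.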

If no one is logged in at ":0", or you get the OK to stop the graphical X server, then switch to runlevel 3:

# init 3

Run the installer (this installs and builds without any questions):

# /bin/sh ./NVIDIA-Linux-x86_64-[version].run --dkms --run-nvidia-xconfig --no-questions

The installer will build any required kernel modules. This could take quite a while (several minutes to tens of minutes).

If you prefer an interactive installation, so you can read and answer the questions during the install, run the installer without options:

# /bin/sh ./NVIDIA-Linux-x86_64-[version].run

Next, reboot. This is optional -- but recommended for verification that everything works after a reboot.

If you do not reboot, then switch back to run level 5 with this command:

# init 5

The "init 5" command will start the graphical login program, assuming the nvidia install/config was successful.

The following command will tell you which kernels have the nvidia module installed:

# find /lib/modules | grep nvidia.ko
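
To get just the kernel versions rather than full paths, the find output can be reduced to the first directory level under /lib/modules. This is a minimal sketch; the function name kernels_with_nvidia is hypothetical, and the pattern 'nvidia.ko*' is chosen so that a compressed module file would also match.

```shell
#!/bin/sh
# Hypothetical helper: print each kernel version under the modules
# directory that contains a file named nvidia.ko (or nvidia.ko.<suffix>).
#   $1  modules directory (default: /lib/modules)
kernels_with_nvidia() {
    modroot="${1:-/lib/modules}"
    (
        cd "$modroot" 2>/dev/null || exit 0
        # Paths look like ./<kernel version>/.../nvidia.ko,
        # so field 2 is the kernel version.
        find . -name 'nvidia.ko*' | cut -d/ -f2 | sort -u
    )
}

# Usage: kernels_with_nvidia
```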

After the reboot, this is how to verify the nvidia module is being used and the install was successful (look for gdm in the output from the last ps tree command below).

# lsmod | grep nvidia
# ps axuww | grep X
# ps axuwwf | less
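
For use in a script, the lsmod check can be made exact, so that only a module named exactly nvidia counts (a plain grep also matches related modules such as nvidia_drm). The sketch below is one way to do that; the function name module_loaded is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read lsmod output on stdin and succeed only if a
# module with exactly the given name is listed in the first column.
#   $1  module name, e.g. nvidia or nouveau
module_loaded() {
    awk -v mod="$1" '$1 == mod { found = 1 } END { exit !found }'
}

# Usage:
#   lsmod | module_loaded nvidia  && echo "nvidia module is loaded"
#   lsmod | module_loaded nouveau && echo "nouveau is still loaded"
```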

There are also nvidia and X server log files which can be viewed.

/var/log/nvidia-installer.log
/var/log/Xorg.0.log

How does a kernel install trigger an nvidia rebuild using DKMS?

$ rpm -q --scripts kernel
  -> /sbin/new-kernel-pkg
    -> /etc/kernel/postinst.d/dkms
      -> /usr/lib/dkms/dkms_autoinstaller
        -> /usr/sbin/dkms

Taylor runs during the night and runs 'yum upgrade'. When a new kernel is available, yum installs it, and the postinstall scripts in the kernel RPM are run. These scripts run many things, including a script called '/sbin/new-kernel-pkg'. /sbin/new-kernel-pkg looks in the /etc/kernel/postinst.d/ directory and runs anything it finds there, including a script called dkms. The /etc/kernel/postinst.d/dkms script runs the /usr/lib/dkms/dkms_autoinstaller script, which in turn runs the /usr/sbin/dkms script. The dkms man page describes in detail how the build happens with the /usr/sbin/dkms script.
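
If a rebuild does not happen after a kernel update, a first thing to look at is whether every link in this chain is present and executable. The following is a minimal sketch that checks the four paths named above. The function name check_dkms_chain is hypothetical, and the optional argument (an alternate root directory) is there only so the check can be tried against a test tree.

```shell
#!/bin/sh
# Hypothetical helper: report whether each script in the DKMS hook chain
# described above exists and is executable. Returns non-zero if any is
# missing.
#   $1  alternate root directory (default: none, i.e. the real system)
check_dkms_chain() {
    root="${1:-}"
    missing=0
    for f in /sbin/new-kernel-pkg \
             /etc/kernel/postinst.d/dkms \
             /usr/lib/dkms/dkms_autoinstaller \
             /usr/sbin/dkms; do
        if [ -x "${root}${f}" ]; then
            echo "ok      $f"
        else
            echo "missing $f"
            missing=1
        fi
    done
    return "$missing"
}

# Usage: check_dkms_chain
```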

A side note: if you want to get notified anytime a new kernel is installed via RPM on a certain host, you can write a script and put it in the /etc/kernel/postinst.d/ directory. When a new kernel is installed, the /sbin/new-kernel-pkg script will look in /etc/kernel/postinst.d/ and run anything it finds there (with 2 arguments supplied). This is a script I've installed on a machine, so I get an email whenever a new kernel is installed there:

# cat /etc/kernel/postinst.d/notify
#!/bin/sh

cat <<EOF | mail -r unix-admin@slac.stanford.edu -s "$1 installed on `hostname`" ksa@slac.stanford.edu
New kernel was installed on `hostname` at `date`.

$1
$2
EOF

References:

DKMS

See the dkms manual page, using the command: 'man dkms'
https://en.wikipedia.org/wiki/Dynamic_Kernel_Module_Support
https://github.com/dell/dkms

Kernel Module Weak Updates

https://trapsink.com/wiki/Kernel_Module_Weak_Updates

Nouveau

https://nouveau.freedesktop.org/
