e1000e NIC link failure on X9SCM / X9SCL: patch fix for CentOS

If you’ve had problems with the NIC driver failing on Supermicro X9 / X10 motherboards with the Intel e1000e NIC, we have a fix for you. CentOS 6 on X9 motherboards (X9SCL / X9SCM) seems to have the most problems, but other motherboards and operating systems are potentially affected as well.

There are a lot of suggestions floating around the internet, but many are partial solutions. There are several steps to take, and each one on its own improves reliability only somewhat. With all of the steps applied together, the network should be very stable, but most tutorials only tell you one or two fixes. We have the complete solution for you here, the same one we install for every customer of ours who requests CentOS.

Often, you will see a network error that requires a reboot to recover. It happens randomly, maybe after 5 minutes, maybe after 5 hours or 5 days, but it will come. The error looks like this:

Oct 15 14:25:24 ______ kernel: e1000e: eth0 NIC Link is Down

If you see this problem, then this solution is for you.
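To get a rough idea of how often the link is dropping, you can count these messages in the system log. The following is a minimal sketch, not part of our fix: the helper name count_link_down is hypothetical, and the default path /var/log/messages assumes the stock CentOS syslog location.

```shell
#!/bin/bash
# Hypothetical helper: count "NIC Link is Down" events logged by the e1000e
# driver in a syslog file. The default path is an assumption (stock CentOS).
count_link_down() {
  local logfile="${1:-/var/log/messages}"
  # grep -c prints the number of matching lines. It exits non-zero when the
  # count is 0, so "|| true" keeps callers that use "set -e" from aborting.
  grep -c 'e1000e: .* NIC Link is Down' "$logfile" || true
}

# Usage on a real system (not run here):
#   count_link_down /var/log/messages
```

A count that keeps growing between reboots suggests the problem described here rather than a one-off cable or switch event.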

First off, upgrade your kernel

yum -y upgrade

Then reboot into the new kernel.

Next, check your existing driver version:

modinfo e1000e | grep version:

Make a note of the current version so you can confirm later that the upgrade worked.
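If you would rather not write the version down by hand, you can save it to a file. This is only a sketch: the helper name driver_version and the output file path are hypothetical. It also filters more strictly than the grep above, which matches the "srcversion:" line as well as "version:".

```shell
#!/bin/bash
# Hypothetical helper: read modinfo output on stdin and print only the value
# of the "version:" field, skipping "srcversion:" and every other field.
driver_version() {
  awk '$1 == "version:" { print $2 }'
}

# Usage on a real system (not run here; the file name is just an example):
#   modinfo e1000e | driver_version > /root/e1000e-version.before
```

After the upgrade, running the same pipeline again and comparing the result against the saved file shows whether the driver actually changed.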

Next, copy and paste the following script into a file using your favorite text editor (I usually prefer nano). Make the file executable with chmod 755 and run it as root with bash.

#!/bin/bash
# Copyright 2014 Input Output Flood LLC
# www.IOFLOOD.com -- We Love Servers
# This script may be freely distributed so long as this copyright notice remains intact
#
# this is a pre-requisite for our nifty nic upgrade script
 yum -y install pciutils
 
 # update this network driver for the appropriate RHEL release and the appropriate driver (e1000e and igb supported)
 NIC=`lspci -nv | egrep "e1000e$|igb$" | sed 's/\tKernel driver in use: //g' | sed 's/\tKernel modules: //g' | uniq`
 if grep -q -i "release 5" /etc/redhat-release
 then
   RPM="http://elrepo.org/elrepo-release-5-5.el5.elrepo.noarch.rpm"
 elif grep -q -i "release 6" /etc/redhat-release
 then
   RPM="http://elrepo.org/elrepo-release-6-6.el6.elrepo.noarch.rpm"
   if [[ "$NIC" == "e1000e" ]]
   then
     grubby --update-kernel=ALL --args="pcie_aspm=off e1000e.IntMode=1,1 e1000e.InterruptThrottleRate=10000,10000 acpi=ht"
   fi
 elif grep -q -i "release 7" /etc/redhat-release
 then
   RPM="http://elrepo.org/elrepo-release-7.0-2.el7.elrepo.noarch.rpm"
   if [[ "$NIC" == "e1000e" ]]
   then
     grubby --update-kernel=ALL --args="pcie_aspm=off e1000e.IntMode=1,1 e1000e.InterruptThrottleRate=10000,10000 acpi=ht"
   fi
 fi
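 # install the repo package and the driver only if both a supported release
 # and a supported NIC driver were detected above
 if [[ -n "$RPM" && -n "$NIC" ]]
 then
   rpm --import http://elrepo.org/RPM-GPG-KEY-elrepo.org
   rpm -Uvh "$RPM"
   yum -y install kmod-$NIC
 fi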
#
# Copyright 2014 Input Output Flood LLC
# www.IOFLOOD.com -- We Love Servers
# This script may be freely distributed so long as this copyright notice remains intact

The above script does a few things. First, it installs pciutils, which provides the lspci command used to detect your NIC driver. Second, it checks your CentOS version, because CentOS 5, CentOS 6, and CentOS 7 each require a different repository rpm to be downloaded. On CentOS 6 and 7 with an e1000e NIC, it then uses grubby to set kernel flags chosen for maximum stability: “pcie_aspm=off”, “e1000e.IntMode=1,1”, “e1000e.InterruptThrottleRate=10000,10000”, and “acpi=ht”. The “pcie_aspm=off” flag disables PCIe Active State Power Management, and “acpi=ht” leaves hyperthreading as the only ACPI function turned on, with the other ACPI functions turned off. Finally, it downloads and installs the correct driver for your version and your NIC. The script works with both the e1000e and igb network drivers.

After running this script, you should reboot again (to enable the new kernel flags) and then check your driver version again:

modinfo e1000e | grep version:

You should now see an updated driver version and have a stable network card under CentOS. Congratulations.
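It is also worth confirming that the kernel flags set by the script are active after the reboot. The sketch below assumes the running kernel's command line is readable at /proc/cmdline, and the helper name has_kernel_arg is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the given argument appears as a whole word
# in a kernel command line. The second parameter is optional and defaults to
# the running kernel's command line from /proc/cmdline.
has_kernel_arg() {
  local arg="$1"
  local cmdline="${2:-$(cat /proc/cmdline)}"
  # Pad both sides with spaces so "acpi=h" does not match "acpi=ht".
  case " $cmdline " in
    *" $arg "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage on a real system (not run here):
#   for arg in pcie_aspm=off acpi=ht; do
#     has_kernel_arg "$arg" || echo "missing kernel flag: $arg"
#   done
```

If a flag is reported missing, the script may not have taken the e1000e branch, for example on CentOS 5 or on a system using the igb driver.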

If you have any questions about the above solution, or want any information about ioflood, feel free to email us at sales [at] ioflood.com