mpt2sas Driver System Log Analysis

Common mpt2sas Logs

/var/log/messages entries from production servers

code(0x03), sub_code(0x010a)

Jan 29 06:30:52 mpt2sas0: log_info(0x3003010a): originator(IOP), code(0x03), sub_code(0x010a)
Jan 29 06:30:52 mpt2sas0: log_info(0x3003010a): originator(IOP), code(0x03), sub_code(0x010a)
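
The originator, code and sub_code printed after log_info are simply fields of the 32-bit log_info value. The following is a minimal sketch of that decoding, assuming the bit layout used by the Linux mpt2sas driver (top 4 bits bus type, next 4 bits originator, next 8 bits code, low 16 bits sub_code) and the originator numbering 0 = IOP, 1 = PL, 2 = IR. The function name decode_log_info is hypothetical and only used for illustration.

```python
# Originator names as printed by the driver (assumed: 0 = IOP, 1 = PL, 2 = IR).
ORIGINATORS = {0: "IOP", 1: "PL", 2: "IR"}


def decode_log_info(log_info: int) -> dict:
    """Split a 32-bit mpt2sas log_info value into its bit fields."""
    originator = (log_info >> 24) & 0xF
    return {
        "bus_type": (log_info >> 28) & 0xF,      # 0x3 in all samples on this page
        "originator": ORIGINATORS.get(originator, "unknown(%d)" % originator),
        "code": (log_info >> 16) & 0xFF,
        "sub_code": log_info & 0xFFFF,
    }


if __name__ == "__main__":
    fields = decode_log_info(0x3003010A)
    print("originator(%s), code(0x%02x), sub_code(0x%04x)"
          % (fields["originator"], fields["code"], fields["sub_code"]))
```

Run against 0x3003010a this reproduces the tail of the log line above: originator(IOP), code(0x03), sub_code(0x010a).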

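The bug report "Driver mpt2sas0 spam dmesg - mpt2sas0: log_info(0x30030110): originator(IOP), code(0x03), sub_code(0x0110)" offers some troubleshooting leads:

  • Messages of this kind appear when sas2ircu-status is run

  • Dell provides a built-in diagnostic tool that can check SAS-related fault codes

  • Russell McOrmond (russell-flora) wrote on 2012-07-11: updating the storage controller's firmware fixes the problem

Troubleshooting case:

Check the hardware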

#lspci | grep -i lsi
04:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2308 PCI-Express Fusion-MPT SAS-2 (rev 05)

The controller chip is an LSI SAS2308.

code(0x11), sub_code(0x0630)

2016-12-12 17:08:32    mpt2sas0: log_info(0x31110630): originator(PL), code(0x11), sub_code(0x0630)
2016-12-12 17:08:33    mpt2sas0: log_info(0x31110630): originator(PL), code(0x11), sub_code(0x0630)
2016-12-12 17:08:35    mpt2sas0: log_info(0x31110630): originator(PL), code(0x11), sub_code(0x0630)
2016-12-12 17:08:36    mpt2sas0: log_info(0x31110630): originator(PL), code(0x11), sub_code(0x0630)
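
Messages like these tend to repeat many times, so it can help to tally which log_info values dominate before digging into any single one. The sketch below is one way to do that, assuming the line format shown in the samples on this page; the regular expression and the name count_log_info are illustrative, not part of any tool.

```python
import re
from collections import Counter

# Matches e.g. "mpt2sas0: log_info(0x31110630)" anywhere in a log line.
LOG_INFO_RE = re.compile(r"(mpt2sas\d+): log_info\((0x[0-9a-fA-F]{8})\)")


def count_log_info(lines):
    """Count (controller, log_info) pairs across an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = LOG_INFO_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2).lower())] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "2016-12-12 17:08:32    mpt2sas0: log_info(0x31110630): originator(PL), code(0x11), sub_code(0x0630)",
        "2016-12-12 17:08:33    mpt2sas0: log_info(0x31110630): originator(PL), code(0x11), sub_code(0x0630)",
        "Jan 29 06:30:52 mpt2sas0: log_info(0x3003010a): originator(IOP), code(0x03), sub_code(0x010a)",
    ]
    for (controller, value), n in count_log_info(sample).most_common():
        print(controller, value, n)
```

To scan a real log, pass an open file object instead of the sample list, for example count_log_info(open("/var/log/messages", errors="replace")).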

code(0x11), sub_code(0x1000)

2017-02-13 09:09:15    mpt2sas0: log_info(0x31111000): originator(PL), code(0x11), sub_code(0x1000)
2017-02-13 09:09:15    mpt2sas0: log_info(0x31111000): originator(PL), code(0x11), sub_code(0x1000)

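Here 0x31111000 appears to signal an abnormal condition: the server showed extremely high load, with disk iowait reaching 40+. Checking the disks showed that a failing disk, sdg, was driving util to 100. After the disk was handled using the procedure for taking a failed disk offline under mpt2sas, 0x31111000 no longer appeared.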
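
The util figure mentioned above can be cross-checked without iostat. The following is a rough sketch, assuming the standard Linux /proc/diskstats layout in which the tenth statistic after the device name is the time spent doing I/O, in milliseconds. It computes utilization from two snapshots of that file; the function names are hypothetical.

```python
import time


def io_ticks_ms(diskstats_text: str, device: str) -> int:
    """Return the 'time spent doing I/Os' counter (ms) for one device."""
    for line in diskstats_text.splitlines():
        fields = line.split()
        # fields: major, minor, name, then the per-device statistics
        if len(fields) >= 13 and fields[2] == device:
            return int(fields[12])
    raise KeyError(device)


def utilization_percent(before: str, after: str, device: str, interval_s: float) -> float:
    """Percent of the interval during which the device was busy."""
    delta_ms = io_ticks_ms(after, device) - io_ticks_ms(before, device)
    return min(100.0, 100.0 * delta_ms / (interval_s * 1000.0))


def sample_utilization(device: str, interval_s: float = 1.0) -> float:
    """Take two /proc/diskstats snapshots interval_s apart (Linux only)."""
    with open("/proc/diskstats") as f:
        before = f.read()
    time.sleep(interval_s)
    with open("/proc/diskstats") as f:
        after = f.read()
    return utilization_percent(before, after, device, interval_s)
```

For example, sample_utilization("sdg") staying close to 100 while the other disks sit near idle would be consistent with the single-disk fault described above.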

