
Multipath configuration for LSI HBA 3008

Unix & Linux user
Asked 2018-08-07 15:02:32

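I have 5 JBODs connected to the controller through an LSI 3008 HBA. I'm using Arch Linux 4.14.41-1-lts and multipath-tools v0.7.6 (03/10, 2018).

My problem is that when a disk starts producing I/O errors and its path starts flapping, multipath keeps checking the disk and trying to remap the failed path.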

Jul 23 04:59:51 FKM1 multipathd[5315]: 35000c50093d4e7c7: sdbe - tur checker timed out
Jul 23 04:59:51 FKM1 multipathd[5315]: checker failed path 67:128 in map 35000c50093d4e7c7
Jul 23 04:59:51 FKM1 multipathd[5315]: 35000c50093d4e7c7: remaining active paths: 0
Jul 23 04:59:51 FKM1 multipathd[5315]: sdbe: mark as failed
Jul 23 04:59:56 FKM1 multipathd[5315]: checker failed path 67:128 in map 35000c50093d4e7c7
Jul 23 05:04:37 FKM1 multipathd[5315]: 67:128: reinstated
Jul 23 05:04:37 FKM1 multipathd[5315]: 35000c50093d4e7c7: remaining active paths: 1
Jul 23 05:05:27 FKM1 multipathd[5315]: 35000c50093d4e7c7: sdbe - tur checker timed out
Jul 23 05:05:27 FKM1 multipathd[5315]: checker failed path 67:128 in map 35000c50093d4e7c7
Jul 23 05:05:27 FKM1 multipathd[5315]: 35000c50093d4e7c7: remaining active paths: 0
Jul 23 05:05:27 FKM1 multipathd[5315]: sdbe: mark as failed

Because the disk is faulty, multipath tries to remap the path every time the disk comes back.
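To quantify the flapping in the multipathd log above, a small script can count how often each path is failed and reinstated. This is only a sketch: the function names and the threshold of 3 are illustrative assumptions, and the regular expressions are written against the exact message wording shown in this log, which may differ in other multipath-tools versions.

```python
import re
from collections import Counter

# Patterns based on the multipathd messages quoted in the question, e.g.
#   "checker failed path 67:128 in map 35000c50093d4e7c7"
#   "67:128: reinstated"
FAILED_RE = re.compile(r"checker failed path (\d+:\d+) in map (\S+)")
REINSTATED_RE = re.compile(r"(\d+:\d+): reinstated")


def count_path_events(lines):
    """Return (failures, reinstatements) Counters keyed by major:minor."""
    failures, reinstated = Counter(), Counter()
    for line in lines:
        match = FAILED_RE.search(line)
        if match:
            failures[match.group(1)] += 1
            continue
        match = REINSTATED_RE.search(line)
        if match:
            reinstated[match.group(1)] += 1
    return failures, reinstated


def flapping_paths(lines, threshold=3):
    """Paths that failed at least `threshold` times and came back at least once.

    `threshold` is a hypothetical policy value, not a multipathd setting.
    """
    failures, reinstated = count_path_events(lines)
    return sorted(
        path for path, count in failures.items()
        if count >= threshold and reinstated[path] > 0
    )
```

Feeding it the journal output (for example the lines of `journalctl -u multipathd`) gives a list of major:minor numbers that are candidates for being taken out of service.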

[Fri Aug  3 00:18:37 2018] alua: device handler registered
[Fri Aug  3 00:18:37 2018] emc: device handler registered
[Fri Aug  3 00:18:37 2018] rdac: device handler registered
[Fri Aug  3 00:18:37 2018] device-mapper: uevent: version 1.0.3
[Fri Aug  3 00:18:37 2018] device-mapper: ioctl: 4.37.0-ioctl (2017-09-20) initialised: dm-devel@redhat.com
[Fri Aug  3 00:18:43 2018] device-mapper: multipath service-time: version 0.3.0 loaded
[Fri Aug  3 00:18:43 2018] device-mapper: table: 254:0: multipath: error getting device
[Fri Aug  3 00:18:43 2018] device-mapper: ioctl: error adding target to table
[Fri Aug  3 00:18:43 2018] device-mapper: table: 254:0: multipath: error getting device
[Fri Aug  3 00:18:43 2018] device-mapper: ioctl: error adding target to table
[Fri Aug  3 00:21:19 2018] sd 12:0:16:0: attempting task abort! scmd(ffffa03a6c4de948)
[Fri Aug  3 00:21:19 2018] sd 12:0:16:0: [sdbh] tag#1 CDB: opcode=0x88 88 00 00 00 00 02 ba a0 f0 00 00 00 02 00 00 00
[Fri Aug  3 00:21:19 2018] scsi target12:0:16: handle(0x001c), sas_address(0x5000c50093d5135d), phy(8)
[Fri Aug  3 00:21:19 2018] scsi target12:0:16: enclosure_logical_id(0x500304800929f87f), slot(8)
[Fri Aug  3 00:21:19 2018] scsi target12:0:16: enclosure level(0x0001),connector name(1   )
[Fri Aug  3 00:21:19 2018] sd 12:0:16:0: task abort: SUCCESS scmd(ffffa03a6c4de948)
[Fri Aug  3 00:21:19 2018] sd 12:0:16:0: attempting task abort! scmd(ffffa07b2eb87d48)
[Fri Aug  3 00:21:19 2018] sd 12:0:16:0: [sdbh] tag#0 CDB: opcode=0x88 88 00 00 00 00 02 ba a0 f0 00 00 00 02 00 00 00
[Fri Aug  3 00:21:19 2018] scsi target12:0:16: handle(0x001c), sas_address(0x5000c50093d5135d), phy(8)
[Fri Aug  3 00:21:19 2018] scsi target12:0:16: enclosure_logical_id(0x500304800929f87f), slot(8)
[Fri Aug  3 00:21:19 2018] scsi target12:0:16: enclosure level(0x0001),connector name(1   )
[Fri Aug  3 00:21:19 2018] sd 12:0:16:0: task abort: SUCCESS scmd(ffffa07b2eb87d48)
[Fri Aug  3 00:21:21 2018] device-mapper: multipath: Failing path 67:176.
[Fri Aug  3 00:21:21 2018] sd 12:0:16:0: attempting task abort! scmd(ffffa03a89b38148)
[Fri Aug  3 00:21:21 2018] sd 12:0:16:0: [sdbh] tag#11 CDB: opcode=0x0 00 00 00 00 00 00
[Fri Aug  3 00:21:21 2018] scsi target12:0:16: handle(0x001c), sas_address(0x5000c50093d5135d), phy(8)
[Fri Aug  3 00:21:21 2018] scsi target12:0:16: enclosure_logical_id(0x500304800929f87f), slot(8)
[Fri Aug  3 00:21:21 2018] scsi target12:0:16: enclosure level(0x0001),connector name(1   )
[Fri Aug  3 00:21:21 2018] sd 12:0:16:0: task abort: SUCCESS scmd(ffffa03a89b38148)
[Fri Aug  3 00:21:26 2018] print_req_error: I/O error, dev dm-208, sector 11721044480
[Fri Aug  3 00:21:26 2018] print_req_error: I/O error, dev dm-208, sector 0
[Fri Aug  3 00:21:26 2018] print_req_error: I/O error, dev dm-208, sector 512
[Fri Aug  3 00:21:26 2018] print_req_error: I/O error, dev dm-208, sector 11721043968
[Fri Aug  3 00:21:26 2018] print_req_error: I/O error, dev dm-208, sector 11721044480
[Fri Aug  3 00:21:26 2018] print_req_error: I/O error, dev dm-208, sector 0
[Fri Aug  3 00:21:26 2018] print_req_error: I/O error, dev dm-208, sector 512
[Fri Aug  3 00:21:26 2018] print_req_error: I/O error, dev dm-208, sector 11721043968
[Fri Aug  3 00:21:26 2018] print_req_error: I/O error, dev dm-208, sector 11721044480
[Fri Aug  3 00:21:57 2018] sd 12:0:16:0: attempting task abort! scmd(ffffa03a89b3f148)

After a while, the mpt3sas driver gives up and resets the LSI card, and then the cycle continues.

[Fri Aug  3 00:18:12 2018] mpt3sas_cm3: iomem(0x00000000fbe40000), mapped(0xffffbe0e8dca0000), size(65536)
[Fri Aug  3 00:18:12 2018] mpt3sas_cm3: ioport(0x000000000000e000), size(256)
[Fri Aug  3 00:18:12 2018] usb 2-1-port6: over-current condition
[Fri Aug  3 00:18:12 2018] mpt3sas_cm3: sending message unit reset !!
[Fri Aug  3 00:18:12 2018] mpt3sas_cm3: message unit reset: SUCCESS
[Fri Aug  3 00:18:12 2018] mpt3sas_cm3: Allocated physical memory: size(20778 kB)
[Fri Aug  3 00:18:12 2018] mpt3sas_cm3: Current Controller Queue Depth(9564),Max Controller Queue Depth(9664)
[Fri Aug  3 00:18:12 2018] mpt3sas_cm3: Scatter Gather Elements per IO(128)
[Fri Aug  3 00:18:12 2018] usb 3-14.1: new low-speed USB device number 3 using xhci_hcd
[Fri Aug  3 00:18:12 2018] mpt3sas_cm3: LSISAS3008: FWVersion(15.00.02.00), ChipRevision(0x02), BiosVersion(08.35.00.00)
[Fri Aug  3 00:18:12 2018] mpt3sas_cm3: Protocol=(
[Fri Aug  3 00:18:12 2018] Initiator
[Fri Aug  3 00:18:12 2018] ,Target
[Fri Aug  3 00:18:12 2018] ),
[Fri Aug  3 00:18:12 2018] Capabilities=(
[Fri Aug  3 00:18:12 2018] TLR
[Fri Aug  3 00:18:12 2018] ,EEDP
[Fri Aug  3 00:18:12 2018] ,Snapshot Buffer
[Fri Aug  3 00:18:12 2018] ,Diag Trace Buffer
[Fri Aug  3 00:18:12 2018] ,Task Set Full
[Fri Aug  3 00:18:12 2018] ,NCQ
[Fri Aug  3 00:18:12 2018] )
[Fri Aug  3 00:18:12 2018] scsi host13: Fusion MPT SAS Host
[Fri Aug  3 00:18:12 2018] mpt3sas_cm3: sending port enable !!
[Fri Aug  3 00:18:12 2018] mpt3sas_cm4: 64 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (528262416 kB)
[Fri Aug  3 00:18:12 2018] mpt3sas_cm3: host_add: handle(0x0001), sas_addr(0x500605b00c482a80), phys(8)
[Fri Aug  3 00:18:12 2018] mpt3sas_cm3: expander_add: handle(0x0009), parent(0x0001), sas_addr(0x5003048017aed57f), phys(38)
[Fri Aug  3 00:18:12 2018] scsi 13:0:0:0: Direct-Access     SEAGATE  ST800FM0173      0007 PQ: 0 ANSI: 6

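When mpt3sas sends a "diag reset", I lose an entire JBOD (90 disks) at once! Because of this, a single faulty disk can hang my ZFS pool.

Now I'm looking for a solution. I think that if I could tell multipath "if a disk fails 3 times, do not remap it", my problem would be solved: the pool would no longer use that disk, and if the pool doesn't use the faulty disk, the disk won't generate I/O errors.

So, put simply, I'm looking for a way to stop a failing disk from being used.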
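One way to picture the "fail 3 times, then stop using it" policy is a helper that marks a SCSI disk offline through its sysfs `state` file once an externally tracked failure count reaches a threshold. This is a hedged sketch, not a multipath-tools feature: the function names, the failure-count dictionary and the threshold are assumptions made for illustration, and whether multipathd then leaves the path alone on this setup would need testing.

```python
from pathlib import Path


def set_scsi_device_state(dev, state, sysfs_root="/sys"):
    """Write "offline" or "running" to /sys/block/<dev>/device/state.

    `dev` is a kernel name such as "sdbe". Writing needs root privileges.
    """
    if state not in ("offline", "running"):
        raise ValueError(f"unsupported state: {state!r}")
    state_file = Path(sysfs_root) / "block" / dev / "device" / "state"
    state_file.write_text(state + "\n")
    return state_file


def offline_after_failures(failure_counts, threshold=3,
                           sysfs_root="/sys", dry_run=True):
    """Offline every device whose failure count reached `threshold`.

    `failure_counts` maps kernel device names to a failure count collected
    elsewhere (hypothetical input). Returns the affected device names.
    With dry_run=True nothing is written.
    """
    selected = []
    for dev, count in sorted(failure_counts.items()):
        if count >= threshold:
            if not dry_run:
                set_scsi_device_state(dev, "offline", sysfs_root)
            selected.append(dev)
    return selected
```

The `sysfs_root` parameter exists only so the sketch can be tried against a scratch directory before pointing it at the real `/sys`.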

I found a few settings for /etc/multipath.conf, but I'm not sure whether they will solve my problem. Can you tell me the best way to solve it?

defaults {
    user_friendly_names     no
    polling_interval        10
    path_selector           "round-robin 0"
    path_grouping_policy    failover
    path_checker            readsector0
    failback                manual
    no_path_retry           3
    prio                    rdac
}


blacklist_exceptions {
        property "(ID_WWN|SCSI_IDENT_.*|ID_SERIAL)"
}

Here is the full dmesg log -> https://paste.ubuntu.com/p/XZZ2CScmHP/


1 Answer

Unix & Linux user

Posted 2018-09-22 16:55:56

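It won't be multipath sending the aborts for those SCSI commands; it will be the Linux kernel. Once an abort is not handled in time, SCSI error handling kicks in and progressively bigger things get reset (all the way up to an HBA reset) to try to bring the disk back. You need to convince Linux, somehow, to declare the disk dead faster.

You may be able to write a udev rule that lowers the timeout on the disks corresponding to the paths (https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/5/html/online_storage_reconfiguration_guide/task_), so that the disk is declared offline sooner. It will likely need a lot of experimentation, though, and there is a risk that the lower timeout ends up applying to all paths.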
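A udev rule would make the change persistent, but the underlying operation is just a write to each path device's sysfs `timeout` attribute, which is convenient for experimenting. The sketch below lowers that timeout for every path under one device-mapper device. The function names are illustrative, the 10-second value used in the usage note is an arbitrary assumption rather than a recommendation, and the right value would have to be found by testing.

```python
from pathlib import Path


def path_devices(dm_dev, sysfs_root="/sys"):
    """List the path devices (e.g. "sdbh") under a dm device such as "dm-208"."""
    slaves = Path(sysfs_root) / "block" / dm_dev / "slaves"
    return sorted(entry.name for entry in slaves.iterdir())


def set_scsi_timeout(dev, seconds, sysfs_root="/sys"):
    """Set /sys/block/<dev>/device/timeout and return the previous value."""
    if seconds <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    timeout_file = Path(sysfs_root) / "block" / dev / "device" / "timeout"
    previous = int(timeout_file.read_text().strip())
    timeout_file.write_text(f"{seconds}\n")
    return previous


def set_timeout_for_map(dm_dev, seconds, sysfs_root="/sys"):
    """Apply the timeout to every path of one multipath map.

    Returns {device: previous_timeout} so the change can be reverted.
    """
    return {
        dev: set_scsi_timeout(dev, seconds, sysfs_root)
        for dev in path_devices(dm_dev, sysfs_root)
    }
```

For example, `set_timeout_for_map("dm-208", 10)` run as root would touch only the paths of that one map, which keeps the experiment away from the healthy disks.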

Original page content provided by Unix & Linux. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://unix.stackexchange.com/questions/461092
