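As scientists in a corporate environment, we are given storage resources from a SAN inside an Ubuntu 20.04 virtual machine (Proxmox). The SAN controller is passed directly through to the VM (PCIe passthrough).
The SAN itself uses hardware RAID 60 (no other option is offered to us) and presents us with 380 TB, which we can split into multiple LUNs. We want to benefit from ZFS compression and snapshots. We opted for 30 x 11 TB LUNs, which we then organized as striped RAID-Z. The setup is redundant (two servers), we have backups, and performance is good, which led us to prefer striped RAID-Z over the usual striped mirrors.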
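To make the capacity trade-off behind that choice concrete, here is a rough sketch of the arithmetic. It assumes the layout visible in the zpool status output (six raidz1 vdevs of five LUNs each) and ignores ZFS metadata, padding, and compression, so the figures are approximations only. The function names are made up for this illustration.

```python
# Layout figures taken from the post and from the zpool status output.
LUN_COUNT = 30
LUN_SIZE_TB = 11
VDEV_WIDTH = 5        # LUNs per raidz1 vdev
PARITY_PER_VDEV = 1   # raidz1 spends one LUN's worth of space on parity


def raidz_usable_tb(lun_count, lun_size_tb, vdev_width, parity):
    """Return (vdev_count, raw_tb, approx_usable_tb) for striped RAID-Z."""
    if lun_count % vdev_width != 0:
        raise ValueError("LUN count must be a multiple of the vdev width")
    vdev_count = lun_count // vdev_width
    raw_tb = lun_count * lun_size_tb
    usable_tb = vdev_count * (vdev_width - parity) * lun_size_tb
    return vdev_count, raw_tb, usable_tb


def mirror_usable_tb(lun_count, lun_size_tb):
    """Approximate usable capacity for striped two-way mirrors."""
    return (lun_count // 2) * lun_size_tb


vdevs, raw, usable = raidz_usable_tb(
    LUN_COUNT, LUN_SIZE_TB, VDEV_WIDTH, PARITY_PER_VDEV
)
print(f"{vdevs} raidz1 vdevs, {raw} TB raw, about {usable} TB usable")
print(f"striped mirrors: about {mirror_usable_tb(LUN_COUNT, LUN_SIZE_TB)} TB usable")
```

Under these assumptions, striped RAID-Z keeps roughly four fifths of the raw LUN capacity, while striped mirrors would keep half.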
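Independent of the ZFS geometry, we noticed that a high write load (> 1 GB/s) during a ZFS scrub causes disk errors, eventually leading to faulted devices. By looking at the files that showed errors, we could link the problem to the scrub trying to access data that was still in the SAN cache. With a moderate load during the scrub, the process completes without any errors.
Why does this issue occur?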
Output of zpool status:
  pool: sanpool
 state: ONLINE
  scan: scrub repaired 0B in 2 days 02:05:53 with 0 errors on Thu Mar 17 15:50:34 2022
config:
    NAME                                        STATE     READ WRITE CKSUM
    sanpool                                     ONLINE       0     0     0
      raidz1-0                                  ONLINE       0     0     0
        wwn-0x60060e8012b003005040b0030000002e  ONLINE       0     0     0
        wwn-0x60060e8012b003005040b0030000002f  ONLINE       0     0     0
        wwn-0x60060e8012b003005040b00300000031  ONLINE       0     0     0
        wwn-0x60060e8012b003005040b00300000032  ONLINE       0     0     0
        wwn-0x60060e8012b003005040b00300000033  ONLINE       0     0     0
      raidz1-1                                  ONLINE       0     0     0
        wwn-0x60060e8012b003005040b00300000034  ONLINE       0     0     0
        wwn-0x60060e8012b003005040b00300000035  ONLINE       0     0     0
        wwn-0x60060e8012b003005040b00300000036  ONLINE       0     0     0
        wwn-0x60060e8012b003005040b00300000037  ONLINE       0     0     0
        wwn-0x60060e8012b003005040b00300000038  ONLINE       0     0     0
      raidz1-2                                  ONLINE       0     0     0
        wwn-0x60060e8012b003005040b00300000062  ONLINE       0     0     0
        wwn-0x60060e8012b003005040b00300000063  ONLINE       0     0     0
        wwn-0x60060e8012b003005040b00300000064  ONLINE       0     0     0
        wwn-0x60060e8012b003005040b00300000065  ONLINE       0     0     0
        wwn-0x60060e8012b003005040b00300000066  ONLINE       0     0     0
      raidz1-3                                  ONLINE       0     0     0
        wwn-0x60060e8012b003005040b0030000006a  ONLINE       0     0     0
        wwn-0x60060e8012b003005040b0030000006b  ONLINE       0     0     0
        wwn-0x60060e8012b003005040b0030000006c  ONLINE       0     0     0
        wwn-0x60060e8012b003005040b0030000006d  ONLINE       0     0     0
        wwn-0x60060e8012b003005040b0030000006f  ONLINE       0     0     0
      raidz1-4                                  ONLINE       0     0     0
        wwn-0x60060e8012b003005040b00300000070  ONLINE       0     0     0
        wwn-0x60060e8012b003005040b00300000071  ONLINE       0     0     0
        wwn-0x60060e8012b003005040b00300000072  ONLINE       0     0     0
        wwn-0x60060e8012b003005040b00300000073  ONLINE       0     0     0
        wwn-0x60060e8012b003005040b00300000074  ONLINE       0     0     0
      raidz1-5                                  ONLINE       0     0     0
        wwn-0x60060e8012b003005040b00300000075  ONLINE       0     0     0
        wwn-0x60060e8012b003005040b00300000076  ONLINE       0     0     0
        wwn-0x60060e8012b003005040b00300000077  ONLINE       0     0     0
        wwn-0x60060e8012b003005040b00300000079  ONLINE       0     0     0
        wwn-0x60060e8012b003005040b0030000007a  ONLINE       0     0     0
errors: No known data errors
Output of multipath -ll:
mpathr (360060e8012b003005040b00300000074) dm-18 HITACHI,OPEN-V
size=11T features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 7:0:0:25 sdz 65:144 active ready running
`- 8:0:0:25 sdbd 67:112 active ready running
mpathe (360060e8012b003005040b00300000064) dm-5 HITACHI,OPEN-V
size=11T features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 7:0:0:13 sdn 8:208 active ready running
`- 8:0:0:13 sdar 66:176 active ready running
mpathq (360060e8012b003005040b00300000073) dm-17 HITACHI,OPEN-V
size=11T features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 7:0:0:24 sdy 65:128 active ready running
`- 8:0:0:24 sdbc 67:96 active ready running
mpathd (360060e8012b003005040b00300000063) dm-4 HITACHI,OPEN-V
size=11T features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 7:0:0:12 sdm 8:192 active ready running
`- 8:0:0:12 sdaq 66:160 active ready running
mpathp (360060e8012b003005040b00300000072) dm-16 HITACHI,OPEN-V
size=11T features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 7:0:0:23 sdx 65:112 active ready running
`- 8:0:0:23 sdbb 67:80 active ready running
mpathc (360060e8012b003005040b00300000062) dm-3 HITACHI,OPEN-V
size=11T features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 7:0:0:11 sdl 8:176 active ready running
`- 8:0:0:11 sdap 66:144 active ready running
mpatho (360060e8012b003005040b00300000071) dm-15 HITACHI,OPEN-V
size=11T features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 7:0:0:22 sdw 65:96 active ready running
`- 8:0:0:22 sdba 67:64 active ready running
mpathb (360060e8012b003005040b00300000038) dm-2 HITACHI,OPEN-V
size=11T features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 7:0:0:10 sdk 8:160 active ready running
`- 8:0:0:10 sdao 66:128 active ready running
mpathn (360060e8012b003005040b00300000070) dm-14 HITACHI,OPEN-V
size=11T features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 7:0:0:21 sdv 65:80 active ready running
`- 8:0:0:21 sdaz 67:48 active ready running
mpatha (360060e8012b003005040b0030000002e) dm-1 HITACHI,OPEN-V
size=11T features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 7:0:0:1 sdb 8:16 active ready running
`- 8:0:0:1 sdaf 65:240 active ready running
mpathz (360060e8012b003005040b00300000033) dm-26 HITACHI,OPEN-V
size=11T features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 7:0:0:5 sdf 8:80 active ready running
`- 8:0:0:5 sdaj 66:48 active ready running
mpathm (360060e8012b003005040b0030000006f) dm-13 HITACHI,OPEN-V
size=11T features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 7:0:0:20 sdu 65:64 active ready running
`- 8:0:0:20 sday 67:32 active ready running
mpathy (360060e8012b003005040b00300000032) dm-25 HITACHI,OPEN-V
size=11T features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 7:0:0:4 sde 8:64 active ready running
`- 8:0:0:4 sdai 66:32 active ready running
mpathl (360060e8012b003005040b0030000002f) dm-12 HITACHI,OPEN-V
size=11T features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 7:0:0:2 sdc 8:32 active ready running
`- 8:0:0:2 sdag 66:0 active ready running
mpathx (360060e8012b003005040b0030000007a) dm-24 HITACHI,OPEN-V
size=11T features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 7:0:0:30 sdae 65:224 active ready running
`- 8:0:0:30 sdbi 67:192 active ready running
mpathad (360060e8012b003005040b00300000037) dm-30 HITACHI,OPEN-V
size=11T features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 7:0:0:9 sdj 8:144 active ready running
`- 8:0:0:9 sdan 66:112 active ready running
mpathk (360060e8012b003005040b0030000006d) dm-11 HITACHI,OPEN-V
size=11T features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 7:0:0:19 sdt 65:48 active ready running
`- 8:0:0:19 sdax 67:16 active ready running
mpathw (360060e8012b003005040b00300000031) dm-23 HITACHI,OPEN-V
size=11T features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 7:0:0:3 sdd 8:48 active ready running
`- 8:0:0:3 sdah 66:16 active ready running
mpathac (360060e8012b003005040b00300000036) dm-29 HITACHI,OPEN-V
size=11T features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 7:0:0:8 sdi 8:128 active ready running
`- 8:0:0:8 sdam 66:96 active ready running
mpathj (360060e8012b003005040b0030000006c) dm-10 HITACHI,OPEN-V
size=11T features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 7:0:0:18 sds 65:32 active ready running
`- 8:0:0:18 sdaw 67:0 active ready running
mpathv (360060e8012b003005040b00300000079) dm-22 HITACHI,OPEN-V
size=11T features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 7:0:0:29 sdad 65:208 active ready running
`- 8:0:0:29 sdbh 67:176 active ready running
mpathab (360060e8012b003005040b00300000035) dm-28 HITACHI,OPEN-V
size=11T features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 7:0:0:7 sdh 8:112 active ready running
`- 8:0:0:7 sdal 66:80 active ready running
mpathi (360060e8012b003005040b0030000006b) dm-9 HITACHI,OPEN-V
size=11T features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 7:0:0:17 sdr 65:16 active ready running
`- 8:0:0:17 sdav 66:240 active ready running
mpathu (360060e8012b003005040b00300000077) dm-21 HITACHI,OPEN-V
size=11T features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 7:0:0:28 sdac 65:192 active ready running
`- 8:0:0:28 sdbg 67:160 active ready running
mpathaa (360060e8012b003005040b00300000034) dm-27 HITACHI,OPEN-V
size=11T features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 7:0:0:6 sdg 8:96 active ready running
`- 8:0:0:6 sdak 66:64 active ready running
mpathh (360060e8012b003005040b0030000006a) dm-8 HITACHI,OPEN-V
size=11T features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 7:0:0:16 sdq 65:0 active ready running
`- 8:0:0:16 sdau 66:224 active ready running
mpatht (360060e8012b003005040b00300000076) dm-20 HITACHI,OPEN-V
size=11T features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 7:0:0:27 sdab 65:176 active ready running
`- 8:0:0:27 sdbf 67:144 active ready running
mpathg (360060e8012b003005040b00300000066) dm-7 HITACHI,OPEN-V
size=11T features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 7:0:0:15 sdp 8:240 active ready running
`- 8:0:0:15 sdat 66:208 active ready running
mpaths (360060e8012b003005040b00300000075) dm-19 HITACHI,OPEN-V
size=11T features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 7:0:0:26 sdaa 65:160 active ready running
`- 8:0:0:26 sdbe 67:128 active ready running
mpathf (360060e8012b003005040b00300000065) dm-6 HITACHI,OPEN-V
size=11T features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 7:0:0:14 sdo 8:224 active ready running
`- 8:0:0:14 sdas 66:192 active ready running
Posted on 2022-06-14 14:05:52
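We managed to fix the setup.
nano /etc/multipath.conf
Finally, we updated the initial RAM disk:
update-initramfs -u -k all
All of the issues described under load are now resolved: multipath -ll no longer shows any failed paths, and ZFS has stopped reporting errors.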
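The two checks mentioned above (no failed paths in multipath -ll, no errors reported by ZFS) can be automated roughly as follows. This is only a sketch: it assumes the text layouts shown in the question, the function names are made up here, and the failed path and FAULTED device in the sample data are hypothetical values for demonstration, not output from our system.

```python
import re

# Map header such as: mpathr (3600...) dm-18 HITACHI,OPEN-V
MAP_RE = re.compile(r"^(\S+)\s+\((\w+)\)\s+(dm-\d+)")
# Path line such as: |- 7:0:0:25 sdz 65:144 active ready running
PATH_RE = re.compile(
    r"(\d+:\d+:\d+:\d+)\s+(\S+)\s+\d+:\d+\s+(\S+)\s+(\S+)\s+(\S+)"
)


def unhealthy_paths(multipath_ll_text):
    """Return (map, device, states) for paths that are not 'active ready running'."""
    bad = []
    current_map = None
    for line in multipath_ll_text.splitlines():
        header = MAP_RE.match(line.strip())
        if header:
            current_map = header.group(1)
            continue
        path = PATH_RE.search(line)
        if path:
            states = path.group(3, 4, 5)
            if states != ("active", "ready", "running"):
                bad.append((current_map, path.group(2), " ".join(states)))
    return bad


def vdevs_with_errors(zpool_status_text):
    """Return config rows that are not ONLINE or have non-zero error counters."""
    bad = []
    in_config = False
    for line in zpool_status_text.splitlines():
        fields = line.split()
        if fields[:2] == ["NAME", "STATE"]:
            in_config = True
            continue
        if line.strip().startswith("errors:"):
            in_config = False
        if in_config and len(fields) >= 5:
            name, state, read, write, cksum = fields[:5]
            if state != "ONLINE" or (read, write, cksum) != ("0", "0", "0"):
                bad.append((name, state, read, write, cksum))
    return bad


# Hypothetical sample data: the failed path and FAULTED device are invented.
sample_multipath = """\
mpathr (360060e8012b003005040b00300000074) dm-18 HITACHI,OPEN-V
size=11T features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 7:0:0:25 sdz 65:144 active ready running
  `- 8:0:0:25 sdbd 67:112 failed faulty running
"""

sample_zpool = """\
config:
    NAME                                        STATE     READ WRITE CKSUM
    sanpool                                     DEGRADED     0     0     0
      raidz1-0                                  DEGRADED     0     0     0
        wwn-0x60060e8012b003005040b0030000002e  ONLINE       0     0     0
        wwn-0x60060e8012b003005040b0030000002f  FAULTED      3    12     0
errors: No known data errors
"""

print(unhealthy_paths(sample_multipath))
print(vdevs_with_errors(sample_zpool))
```

In practice the two strings would come from running multipath -ll and zpool status sanpool (for example via subprocess); on a healthy system both functions should return empty lists.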
Posted on 2022-05-07 02:02:21
You're looking in the wrong place. If you get failures under load, then you cannot rely on it, period. Fix the SAN.
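Posted on 2022-05-07 08:12:12
Given the specific nature of the setup and the odd SAN configuration, this really falls into the realm of professional services.
This can be tweaked and tuned for better behavior and performance.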
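As a starting point for any such tuning, it helps to know which ZFS module parameters are already set. ZFS module parameters are commonly set through `options zfs name=value` lines in a modprobe.d file, and the sketch below collects them into a dictionary. The function name is made up here, and the sample values are purely illustrative: they are neither taken from this system nor meant as recommendations.

```python
def parse_modprobe_options(text, module="zfs"):
    """Collect name=value pairs from 'options <module> ...' lines."""
    params = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "options" and fields[1] == module:
            for item in fields[2:]:
                name, sep, value = item.partition("=")
                if sep:
                    params[name] = value
    return params


# Hypothetical file content; the values are NOT from the original post.
sample = """\
# /etc/modprobe.d/zfs.conf (illustrative only)
options zfs zfs_arc_max=34359738368
options zfs zfs_vdev_scrub_max_active=2 zfs_scan_vdev_limit=4194304
"""

for name, value in parse_modprobe_options(sample).items():
    print(f"{name} = {value}")
```

On a running system, the effective values can be compared against the files under /sys/module/zfs/parameters/.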
Relevant configuration files: /etc/modprobe.d/zfs.conf and /etc/sysctl.conf
Source: https://serverfault.com/questions/1100165