A Linux iSCSI initiator sees high service times when writing to a NetApp FAS target that exposes a number of LUNs. Where should I look for the cause, and how can I fix it?
I am using iostat and sar from the sysstat package to measure "await", the average time a given request spends waiting to be served:
dd if=/dev/urandom of=test bs=8K count=1000000 & iostat -xdm 5 sdy dm-26
[...]
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sdy 0.00 1115.60 0.10 47.50 0.00 7.94 341.68 1.63 34.05 2.00 9.52
dm-26 0.00 0.00 0.10 2032.90 0.00 7.94 8.00 328.10 161.39 0.05 9.52
I would expect "await" here to be an order of magnitude lower than what iostat is reporting. A 10-second network capture of the traffic transferred during the period sampled above is available on CloudShark. dm-26 is the device-mapper device carrying the file system (a single-disk NSS volume); it maps onto the "physical" device sdy.
The initiator and the target are on the same subnet. The initiator host has the IP 192.168.20.72, the target 192.168.20.33; traffic is switched at 1GbE, jumbo frames are enabled (and confirmed in use by the network trace), iSCSI immediate data is enabled (and in use according to the trace above), and digests are disabled.
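For what it is worth, one quick way to double-check that jumbo frames really work end to end is a non-fragmenting ping just below the 9000-byte MTU (the byte count below assumes MTU 9000 on both sides: 8972 = 9000 minus 20 bytes IP header and 8 bytes ICMP header):
ping -M do -s 8972 -c 3 192.168.20.33
If the switch or either NIC silently refused the large frames, this would fail instead of returning normal replies.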
iSCSI session information:
iscsiadm -m session -P 3
iSCSI Transport Class version 2.0-870
version 2.0-873.suse
Target: iqn.1992-08.com.netapp:sn.151745715
Current Portal: 192.168.20.33:3260,2003
Persistent Portal: 192.168.20.33:3260,2003
**********
Interface:
**********
Iface Name: default
Iface Transport: tcp
Iface Initiatorname: iqn.2015-06.de.example.dom:01:gw-cl-07
Iface IPaddress: 192.168.20.72
Iface HWaddress: <empty>
Iface Netdev: <empty>
SID: 1
iSCSI Connection State: LOGGED IN
iSCSI Session State: LOGGED_IN
Internal iscsid Session State: NO CHANGE
*********
Timeouts:
*********
Recovery Timeout: 120
Target Reset Timeout: 30
LUN Reset Timeout: 30
Abort Timeout: 15
*****
CHAP:
*****
username: <empty>
password: ********
username_in: <empty>
password_in: ********
************************
Negotiated iSCSI params:
************************
HeaderDigest: None
DataDigest: None
MaxRecvDataSegmentLength: 262144
MaxXmitDataSegmentLength: 65536
FirstBurstLength: 65536
MaxBurstLength: 65536
ImmediateData: Yes
InitialR2T: No
MaxOutstandingR2T: 1
************************
Attached SCSI devices:
************************
Host Number: 3 State: running
scsi3 Channel 00 Id 0 Lun: 0
Attached scsi disk sdb State: running
scsi3 Channel 00 Id 0 Lun: 1
Attached scsi disk sdc State: running
scsi3 Channel 00 Id 0 Lun: 10
Attached scsi disk sdl State: running
scsi3 Channel 00 Id 0 Lun: 11
Attached scsi disk sdm State: running
scsi3 Channel 00 Id 0 Lun: 12
Attached scsi disk sdn State: running
scsi3 Channel 00 Id 0 Lun: 13
Attached scsi disk sdo State: running
scsi3 Channel 00 Id 0 Lun: 14
Attached scsi disk sdp State: running
scsi3 Channel 00 Id 0 Lun: 15
Attached scsi disk sdq State: running
scsi3 Channel 00 Id 0 Lun: 16
Attached scsi disk sdr State: running
scsi3 Channel 00 Id 0 Lun: 17
Attached scsi disk sds State: running
scsi3 Channel 00 Id 0 Lun: 18
Attached scsi disk sdt State: running
scsi3 Channel 00 Id 0 Lun: 19
Attached scsi disk sdu State: running
scsi3 Channel 00 Id 0 Lun: 2
Attached scsi disk sdd State: running
scsi3 Channel 00 Id 0 Lun: 20
Attached scsi disk sdv State: running
scsi3 Channel 00 Id 0 Lun: 21
Attached scsi disk sdw State: running
scsi3 Channel 00 Id 0 Lun: 22
Attached scsi disk sdx State: running
scsi3 Channel 00 Id 0 Lun: 23
Attached scsi disk sdy State: running
scsi3 Channel 00 Id 0 Lun: 24
Attached scsi disk sdz State: running
scsi3 Channel 00 Id 0 Lun: 25
Attached scsi disk sdaa State: running
scsi3 Channel 00 Id 0 Lun: 26
Attached scsi disk sdab State: running
scsi3 Channel 00 Id 0 Lun: 27
Attached scsi disk sdac State: running
scsi3 Channel 00 Id 0 Lun: 28
Attached scsi disk sdad State: running
scsi3 Channel 00 Id 0 Lun: 3
Attached scsi disk sde State: running
scsi3 Channel 00 Id 0 Lun: 4
Attached scsi disk sdf State: running
scsi3 Channel 00 Id 0 Lun: 5
Attached scsi disk sdg State: running
scsi3 Channel 00 Id 0 Lun: 6
Attached scsi disk sdh State: running
scsi3 Channel 00 Id 0 Lun: 7
Attached scsi disk sdi State: running
scsi3 Channel 00 Id 0 Lun: 8
Attached scsi disk sdj State: running
scsi3 Channel 00 Id 0 Lun: 9
Attached scsi disk sdk State: running
For some reason, the await reported for the dm device mapped onto the "physical" LUN rises significantly whenever write requests get merged in the request queue. But my question really is about the await on the underlying device: the NetApp FAS is supposed to simply put every write request into its NVRAM and acknowledge it immediately, even for synchronous writes, so I should not be seeing more than ~5 ms of await as long as the network link is not saturated (it is not) and the NVRAM is not under back-pressure (it is not -- the FAS is not handling any other load at the moment).
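To rule out block-layer effects on the initiator side, these are the queue settings I would sanity-check for the "physical" device (the paths below assume the sdy device from the sample above):
cat /sys/block/sdy/queue/scheduler        # I/O scheduler in use for the iSCSI disk
cat /sys/block/sdy/queue/nr_requests      # block-layer request queue depth
cat /sys/block/sdy/queue/max_sectors_kb   # largest request the kernel will issue
cat /sys/block/sdy/device/queue_depth     # SCSI command queue depth towards the target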
For reads, "await" is much lower, even for random reads. A 10-second sysstat sample from an iozone run doing a random read/write test (single-threaded automatic-mode test with O_DSYNC enabled, 8 KB block size) shows the effect:

[graph: latency and IOPS over time during the iozone run]
The first half of the graph is the random reads, running at 2-4 kIOPS with ~3 ms latency. In the second half the workload switches to random writes, await rises to >10 ms and IOPS drop to ~100 (the load is latency-bound and single-threaded, so IOPS are inversely proportional to latency).
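For reference, an iozone invocation that approximates this workload would look roughly like the following (the file path and size are placeholders, and iozone's -o flag opens the file with O_SYNC, the closest available option to the O_DSYNC behaviour described above):
iozone -i 0 -i 2 -o -r 8k -s 1g -f /mnt/nssvol/iozone.tmp   # -i 0 sequential write (prerequisite), -i 2 random read/write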
For some reason, when analysing the network capture above, Wireshark's "Service Response Time" statistic fails to recognise most of the WRITE calls: it finds only 19 requests and reports an average service response time of 3 ms, where I would expect roughly 500 calls and an average close to the 34 ms await.
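The same statistic can also be pulled from the capture on the command line; the file name below is a placeholder and the trailing 0 is, if I remember the tap syntax correctly, the SBC (disk) command set:
tshark -q -r capture.pcap -z scsi,srt,0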
The Linux side is SUSE SLES 11 SP3 with kernel 3.0.101-0.47.55-default.
Posted on 2015-08-11 12:46:31
Too long for a comment:
I am not a Linux expert, but on Windows I disable TCP Large Send Offload on the NIC because it introduces lag. It sends fewer but larger packets, which is not recommended for latency-critical I/O.
The official explanation:
The TCP large send offload option lets the AIX TCP layer build a TCP message up to 64 KB long and send it down the stack through IP and the Ethernet device driver in a single call. The adapter then re-segments the message into multiple TCP frames for transmission on the wire. The TCP packets sent on the wire are either 1500-byte frames for a maximum transmission unit (MTU) of 1500, or frames of up to 9000 bytes for a jumbo-frame MTU.
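On Linux, the rough equivalent, if you want to test whether segmentation offload is a factor, would be checking and temporarily disabling TSO/GSO with ethtool (the interface name below is just a placeholder):
ethtool -k eth1 | grep segmentation        # show the current offload settings
ethtool -K eth1 tso off gso off            # temporarily disable TCP/generic segmentation offload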
Posted on 2015-08-11 14:51:10
I will edit this answer as more information comes in. First, we need to determine whether the Netapp is also observing the high latency, or only the host. If the host sees high service times but the NAS claims its own service times are low, the problem lies somewhere between the NAS port and the server's SCSI stack.
What version of Data ONTAP are you running? 7-Mode or cDOT? What are the LUN OS type and the igroup OS type settings? With that information I can provide the commands you can use on the Netapp to check the latency the storage itself observes.
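As a rough idea of what those commands look like (exact syntax and counter names depend on the ONTAP version, so treat these as sketches):
stats show -i 1 lun:*:avg_latency          # 7-Mode: per-LUN average latency, sampled every second
qos statistics volume latency show         # clustered ONTAP: per-volume latency broken down by component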
https://serverfault.com/questions/713308