This is a fairly common problem: when something goes wrong in the SAN, ext3 detects the disk write errors and remounts the filesystem read-only. Which is all well and good, except that once the SAN is fixed I can't work out how to remount the filesystem read-write without rebooting.
Watch:
[root@localhost ~]# multipath -ll
mpath0 (36001f93000a310000299000200000000) dm-2 XIOTECH,ISE1400
[size=1.1T][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=2][active]
\_ 1:0:0:1 sdb 8:16 [active][ready]
\_ 2:0:0:1 sdc 8:32 [active][ready]
[root@localhost ~]# mount /dev/mapper/mpath0 /mnt/foo
[root@localhost ~]# touch /mnt/foo/blah
Great. Now I yank the LUN out from under it.
[root@localhost ~]# touch /mnt/foo/blah
[root@localhost ~]# touch /mnt/foo/blah
touch: cannot touch `/mnt/foo/blah': Read-only file system
[root@localhost ~]# tail /var/log/messages
Mar 18 13:17:33 localhost multipathd: sdb: tur checker reports path is down
Mar 18 13:17:34 localhost multipathd: sdc: tur checker reports path is down
Mar 18 13:17:35 localhost kernel: Aborting journal on device dm-2.
Mar 18 13:17:35 localhost kernel: Buffer I/O error on device dm-2, logical block 1545
Mar 18 13:17:35 localhost kernel: lost page write due to I/O error on dm-2
Mar 18 13:17:36 localhost kernel: ext3_abort called.
Mar 18 13:17:36 localhost kernel: EXT3-fs error (device dm-2): ext3_journal_start_sb: Detected aborted journal
Mar 18 13:17:36 localhost kernel: Remounting filesystem read-only
It only thinks it is read-only; in reality the device isn't even there.
[root@localhost ~]# multipath -ll
sdb: checker msg is "tur checker reports path is down"
sdc: checker msg is "tur checker reports path is down"
mpath0 (36001f93000a310000299000200000000) dm-2 XIOTECH,ISE1400
[size=1.1T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:0:1 sdb 8:16 [failed][faulty]
\_ 2:0:0:1 sdc 8:32 [failed][faulty]
[root@localhost ~]# ll /mnt/foo/
ls: reading directory /mnt/foo/: Input/output error
total 20
-rw-r--r-- 1 root root 0 Mar 18 13:11 bar
It still remembers that the "bar" file is there... mysterious, but not important right now. Now I bring the LUN back:
[root@localhost ~]# tail /var/log/messages
Mar 18 13:23:58 localhost multipathd: sdb: tur checker reports path is up
Mar 18 13:23:58 localhost multipathd: 8:16: reinstated
Mar 18 13:23:58 localhost multipathd: mpath0: queue_if_no_path enabled
Mar 18 13:23:58 localhost multipathd: mpath0: Recovered to normal mode
Mar 18 13:23:58 localhost multipathd: mpath0: remaining active paths: 1
Mar 18 13:23:58 localhost multipathd: dm-2: add map (uevent)
Mar 18 13:23:58 localhost multipathd: dm-2: devmap already registered
Mar 18 13:23:59 localhost multipathd: sdc: tur checker reports path is up
Mar 18 13:23:59 localhost multipathd: 8:32: reinstated
Mar 18 13:23:59 localhost multipathd: mpath0: remaining active paths: 2
Mar 18 13:23:59 localhost multipathd: dm-2: add map (uevent)
Mar 18 13:23:59 localhost multipathd: dm-2: devmap already registered
[root@localhost ~]# multipath -ll
mpath0 (36001f93000a310000299000200000000) dm-2 XIOTECH,ISE1400
[size=1.1T][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=2][enabled]
\_ 1:0:0:1 sdb 8:16 [active][ready]
\_ 2:0:0:1 sdc 8:32 [active][ready]
Great, right? It says rw right there. Not so fast:
[root@localhost ~]# touch /mnt/foo/blah
touch: cannot touch `/mnt/foo/blah': Read-only file system
OK, so it doesn't happen automatically; I'll just give it a little nudge:
[root@localhost ~]# mount -o remount /mnt/foo
mount: block device /dev/mapper/mpath0 is write-protected, mounting read-only
The hell it is:
[root@localhost ~]# mount -o remount,rw /mnt/foo
mount: block device /dev/mapper/mpath0 is write-protected, mounting read-only
Nope.
I have tried all sorts of mount/tune2fs/dmsetup commands, but I can't figure out how to get the block device un-flagged as write-protected. Rebooting fixes it, but I would much rather do this online. An hour of Googling got me nowhere too. Save me, ServerFault.
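For diagnosing where the read-only state actually lives, a minimal check, assuming the device names from the transcript above, might look like this (the flag can sit on the SCSI paths, on the device-mapper device, or on the mount itself):

cat /sys/block/sdb/device/state        # SCSI path state: "running" vs "offline"
cat /sys/block/sdc/device/state
blockdev --getro /dev/mapper/mpath0    # 1 = kernel read-only flag set on the dm device
dmsetup info mpath0                    # "State: ... (READ-ONLY)" would confirm it
grep /mnt/foo /proc/mounts             # shows whether the mount itself is ro or rw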
Posted on 2012-02-07 22:27:55
I ran into this problem recently myself and solved it with a reboot, but on further investigation it looks like issuing the command below may fix it.
echo running > /sys/block/device-name/device/state
I think you may want to look at Section 25.14.4, "Changing the Read/Write State of an Online Logical Unit," in this document; still, I would recommend rebooting.
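A hedged sketch of how that might be applied in this case, assuming the same sdb/sdc paths as in the question. Since the logs show ext3 aborted its journal, a full unmount/mount cycle (which replays the journal) is probably needed as well, rather than a bare remount:

echo running > /sys/block/sdb/device/state   # bring each failed SCSI path back online
echo running > /sys/block/sdc/device/state
umount /mnt/foo
fsck.ext3 -p /dev/mapper/mpath0              # optional sanity check; replays the journal
mount /dev/mapper/mpath0 /mnt/foo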
Posted on 2010-03-19 00:02:18
Try using:
mount -o remount,rw /mnt/foo
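If the remount still reports write-protected, as it did in the question, one hedged aside: it is worth checking whether the kernel still has its read-only flag set on the dm device, since blockdev can both report and clear it:

blockdev --getro /dev/mapper/mpath0   # 1 means the kernel flags the device read-only
blockdev --setrw /dev/mapper/mpath0   # clear the flag, then retry the remount
mount -o remount,rw /mnt/foo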
Posted on 2012-09-13 20:10:22
I like to prevent this problem in the first place. Most enterprise UNIX boxes will retry filesystem operations more or less forever. As an administrator, you need to do some homework before tuning your MPIO configuration. If your application should simply wait until the device comes back to a usable state, here is a solution: in /etc/multipath.conf, make sure the device types you care about have "no_path_retry" set to "queue". This setting causes failed I/O to queue until a valid path is available again. We do this so that our EMC Symmetrix/DMX boxes can ride out hiccups during drive/controller/SRDF path failures and recoveries. It gets more complicated when you want to manually fail a device during an outage, because you then need a tool like dmsetup to flush/fail the queued I/O, or to temporarily change the multipath.conf file and rescan devices... etc.
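As a hedged illustration of that setting, here is a minimal /etc/multipath.conf fragment using the XIOTECH ISE1400 array from the question; match the vendor/product strings to your own hardware:

devices {
    device {
        vendor        "XIOTECH"
        product       "ISE1400"
        no_path_retry queue    # queue failed I/O until a valid path returns
    }
}

And for the manual flush/fail case mentioned above, dm-multipath accepts runtime messages that toggle the queueing behaviour, for example:

dmsetup message mpath0 0 "fail_if_no_path"    # error out queued I/O during a long outage
dmsetup message mpath0 0 "queue_if_no_path"   # restore queueing once paths are back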
This approach has saved us countless times; it is our standard across hundreds of boxes on multi-vendor SANs, along with the replication we use for disaster recovery.
Just wanted to share this with you all. Take care.
https://serverfault.com/questions/124051