Dirty degraded array, unable to read superblock, kernel panic

Server Fault user
Asked 2012-03-07 11:49:16
Answers: 1 · Views: 3K · Following: 0 · Votes: 3

Machine: Linux CentOS 5.4 with 2 HDDs in a RAID 5 (yes, the third disk is missing).

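Situation:

  1. Everything was running fine (with the third disk missing).
  2. Then there was a power outage (the system shut itself down when the battery power ran out).
  3. The machine does not come back up.

Messages on screen: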

Memory for crash kernel (0x0 to 0x0) notwithin permissible range
PCI: BIOS Bug: MCFC area at e0000000 is not E820-reserved
PCI: Not using MMCONFIG.
Red Hat nash version 5.1.19.6 starting
insmod: error inserting '/lib/raid456.ko': -1 File exists
md: md2: raid array is not clean -- starting background reconstruction
raid5: cannot start dirty degraded array for md2
raid5: failed to run raid set md2
md: pers->run() failed ...
md: md2: raid array is not clean -- starting background reconstruction
raid5: cannot start dirty degraded array for md2
raid5: failed to run raid set md2
md: pers->run() failed ...
EXT3-fs: unable to read superblock
mount: error mounting /dev/root on /sysroot as ext3: Invalid argument
setuproot: moving /dev failed: No such file or directory00
setuproot: error mounting /proc: No such file or directory
setuproot: mount failed: No such file or directory
Kernel panic - not syncing: Attempted to kill init!

So I installed sysresccd on a USB stick and booted from it. Then I ran these tests:

smartctl -t short /dev/sda 
smartctl -X /dev/sda 
smartctl -l selftest /dev/sda 

The same for sdb. The results:

sda: test=Short offline, status="Completed without error", remaining=00%, lifetime=19230, firsterror=- 
sdb: test=Short offline, status="Completed: read failure", remaining=90%, lifetime=19256, firsterror=67031516 

And the details for sdb:

root@sysresccd /root % smartctl -A /dev/sdb
smartctl 5.42 2011-10-20 r3458 [i686-linux-3.0.21-std250-i586] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   180   180   021    Pre-fail  Always       -       5975
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       33
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   074   074   000    Old_age   Always       -       19256
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       32
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       27
193 Load_Cycle_Count        0x0032   183   183   000    Old_age   Always       -       51128
194 Temperature_Celsius     0x0022   111   093   000    Old_age   Always       -       39
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       17
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

A Current_Pending_Sector value of 17 may be a problem.
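To check this attribute without reading the whole table, the raw value can be pulled out of the `smartctl -A` output. The following is only a sketch: the helper name `smart_raw_value` is made up for illustration, and it assumes the ten-column layout shown above, with the raw value in column 10.

```shell
# Print the raw value of one SMART attribute from `smartctl -A` output on stdin.
# $1 is the attribute name as it appears in the ATTRIBUTE_NAME column.
smart_raw_value() {
    awk -v name="$1" '$2 == name { print $10 }'
}

# Sample rows copied from the smartctl output above.
sample='  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       17
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0'

pending=$(printf '%s\n' "$sample" | smart_raw_value Current_Pending_Sector)
if [ "$pending" -gt 0 ]; then
    echo "sdb: $pending sectors pending reallocation"
fi
```

On a live system the same helper would be fed directly, for example `smartctl -A /dev/sdb | smart_raw_value Current_Pending_Sector`.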

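Then, the further steps:

  1. Bought 3 x 2 TB HDDs.
  2. Booted from the USB stick.
  3. Copied the two old 1.5 TB disks, one after the other, onto two of the new disks:
     dd if=/dev/sda of=/dev/sdc bs=32M
     dd if=/dev/sdb of=/dev/sdc bs=32M
  4. Removed the two old disks (so as not to make things worse).
  5. Attached the 3 new disks and rebooted.

The output was then as follows: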

Memory for crash kernel (0x0 to 0x0) notwithin permissible range
PCI: BIOS Bug: MCFC area at e0000000 is not E820-reserved
PCI: Not using MMCONFIG.
Red Hat nash version 5.1.19.6 starting
insmod: error inserting '/lib/raid456.ko': -1 File exists
md: invalid raid superblock magic on sdb3
md: md2: raid array is not clean -- starting background reconstruction
raid5: not enough operational devices for md2 (2/3 failed)
raid5: failed to run raid set md2
md: pers->run() failed ...
md: md2: raid array is not clean -- starting background reconstruction
raid5: not enough operational devices for md2 (2/3 failed)
raid5: failed to run raid set md2
md: pers->run() failed ...
EXT3-fs: unable to read superblock
mount: error mounting /dev/root on /sysroot as ext3: Invalid argument
setuproot: moving /dev failed: No such file or directory
setuproot: error mounting /proc: No such file or directory
setuproot: error mounting /sys: No such file or directory
setuproot: mount failed: No such file or directory
Kernel panic - not syncing: Attempted to kill init!

So I booted sysresccd from the USB stick with the new disks attached. Here is some information:

fdisk -l
shows the two full disks exactly like the output was on the old disks

Device     Boot  Start    End         Blocks       Id  System
/dev/sda1  *     63       610469      305203+      fd  Linux raid autodetect
/dev/sda2        610470   8803619     4096575      fd  Linux raid autodetect
/dev/sda3        8803620  2930272064  1460734222+  fd  Linux raid autodetect

/dev/sdb1  *     63       610469      305203+      fd  Linux raid autodetect
/dev/sdb2        610470   8803619     4096575      fd  Linux raid autodetect
/dev/sdb3        8803620  2930272064  1460734222+  fd  Linux raid autodetect

sdc does not contain a valid partition table (it is the empty third disk).
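To relate the failing LBA reported by the SMART self-test (firsterror=67031516) to a position inside sdb3, the partition's start sector can be subtracted from it. This is a small sketch that assumes 512-byte sectors, which is consistent with the fdisk block counts above; the function name is made up for illustration.

```shell
# Convert an absolute LBA on a disk into a byte offset inside a partition.
# $1 = LBA, $2 = partition start sector, $3 = sector size in bytes (default 512)
lba_to_partition_offset() {
    echo $(( ($1 - $2) * ${3:-512} ))
}

# firsterror=67031516 from the self-test log; sdb3 starts at sector 8803620.
lba_to_partition_offset 67031516 8803620
```

This prints 29812682752, so the first reported read failure lies roughly 29.8 GB into sdb3.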

smartctl -t short /dev/sda
smartctl -X /dev/sda
smartctl -l selftest /dev/sda
sda: test=Short offline, status="Completed without error", remaining=00%, lifetime=19230, firsterror=-
sdb: test=Short offline, status="Completed: read failure", remaining=90%, lifetime=19256, firsterror=67031516

smartctl -A /dev/sdb
offline_uncorrectable: 0

Then:

root@sysresccd /root % cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md125 : inactive sda3[0](S)
      1460734144 blocks

md126 : active raid1 sda1[0] sdb1[1]
      305088 blocks [2/2] [UU]

md127 : active raid1 sda2[0] sdb2[1]
      4096448 blocks [2/2] [UU]

unused devices: <none>

Note: the raid5 array shows up there as md125.

Details of md127:

root@sysresccd /root % mdadm --detail /dev/md127
/dev/md127:
        Version : 0.90
  Creation Time : Sun Dec 13 18:45:15 2009
     Raid Level : raid1
     Array Size : 4096448 (3.91 GiB 4.19 GB)
  Used Dev Size : 4096448 (3.91 GiB 4.19 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 127
    Persistence : Superblock is persistent

    Update Time : Thu Mar  8 00:40:45 2012
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : 939f1a92:590d4172:2414ef47:5e2b15cb
         Events : 0.236

    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       1       8       18        1      active sync   /dev/sdb2

Details of md126:

root@sysresccd /root % mdadm --detail /dev/md126
/dev/md126:
        Version : 0.90
  Creation Time : Sun Dec 13 19:21:09 2009
     Raid Level : raid1
     Array Size : 305088 (297.99 MiB 312.41 MB)
  Used Dev Size : 305088 (297.99 MiB 312.41 MB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 126
    Persistence : Superblock is persistent

    Update Time : Wed Mar  7 23:34:02 2012
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : bde56644:86d3e3a4:1128f4fe:0f47f21f
         Events : 0.242

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1

Details of md125:

root@sysresccd /root % mdadm --detail /dev/md125
mdadm: md device /dev/md125 does not appear to be active.

sda3:

root@sysresccd /root % mdadm --examine /dev/sda3
/dev/sda3:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 062f3190:b9337fc1:0b38f5df:7ec7c53b
  Creation Time : Sun Dec 13 18:45:15 2009
     Raid Level : raid5
  Used Dev Size : 1460733952 (1393.06 GiB 1495.79 GB)
     Array Size : 2921467904 (2786.13 GiB 2991.58 GB)
   Raid Devices : 3
  Total Devices : 2
Preferred Minor : 2

    Update Time : Sat Mar  3 22:48:34 2012
          State : active
 Active Devices : 2
Working Devices : 2
 Failed Devices : 1
  Spare Devices : 0
       Checksum : e5ac0d6c - correct
         Events : 26243911

         Layout : left-symmetric
     Chunk Size : 256K

      Number   Major   Minor   RaidDevice State
this     0       8        3        0      active sync   /dev/sda3

   0     0       8        3        0      active sync   /dev/sda3
   1     1       8       19        1      active sync   /dev/sdb3
   2     2       0        0        2      faulty removed

sdb3:

root@sysresccd /root % mdadm --examine /dev/sdb3
mdadm: No md superblock detected on /dev/sdb3.

Then:

root@sysresccd /root % mdadm --examine /dev/sd[ab]3 | egrep 'dev|Update|Role|State|Chunk Size'
mdadm: No md superblock detected on /dev/sdb3.
/dev/sda3:
    Update Time : Sat Mar  3 22:48:34 2012
          State : active
     Chunk Size : 256K
      Number   Major   Minor   RaidDevice State
this     0       8        3        0      active sync   /dev/sda3
   0     0       8        3        0      active sync   /dev/sda3
   1     1       8       19        1      active sync   /dev/sdb3

More:

root@sysresccd /root % mdadm --verbose --examine --scan
ARRAY /dev/md2 level=raid5 num-devices=3 UUID=062f3190:b9337fc1:0b38f5df:7ec7c53b
   devices=/dev/sda3
ARRAY /dev/md126 level=raid1 num-devices=2 UUID=bde56644:86d3e3a4:1128f4fe:0f47f21f
   devices=/dev/sdb1,/dev/sda1
ARRAY /dev/md127 level=raid1 num-devices=2 UUID=939f1a92:590d4172:2414ef47:5e2b15cb
   devices=/dev/sdb2,/dev/sda2

(Note: md2 is listed here, not md125.)

root@sysresccd /root % mdadm --verbose --create --assume-clean /dev/md2 --level=5 --raid-devices=3 /dev/sda3 /dev/sdb3 missing
mdadm: layout defaults to left-symmetric
mdadm: chunk size defaults to 512K
mdadm: layout defaults to left-symmetric
mdadm: layout defaults to left-symmetric
mdadm: super1.x cannot open /dev/sda3: Device or resource busy
mdadm: failed container membership check
mdadm: device /dev/sda3 not suitable for any style of array
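Two things stand out in this attempt. /dev/sda3 is busy because it is still held by the inactive md125 shown in /proc/mdstat, and the defaults picked by `--create` (512K chunk, a 1.x superblock) differ from what `mdadm --examine /dev/sda3` reports for the original array (256K chunk, version 0.90). Re-creating an array is a last resort, but if it were attempted, the options would at least need to match the old superblock. A hedged sketch that derives them from the examine output (the helper name is invented for illustration):

```shell
# Read `mdadm --examine` output on stdin and print matching --create options.
create_opts_from_examine() {
    awk -F' : ' '
        /^ *Version :/    { split($2, v, "."); meta = v[1] "." v[2] }  # 0.90.00 -> 0.90
        /^ *Chunk Size :/ { chunk = $2; sub(/K$/, "", chunk) }         # 256K -> 256 (KiB)
        /^ *Layout :/     { layout = $2 }
        END { printf "--metadata=%s --chunk=%s --layout=%s\n", meta, chunk, layout }
    '
}

# Relevant lines copied from the `mdadm --examine /dev/sda3` output above.
examine='        Version : 0.90.00
         Layout : left-symmetric
     Chunk Size : 256K'

printf '%s\n' "$examine" | create_opts_from_examine
```

For the superblock shown above this prints `--metadata=0.90 --chunk=256 --layout=left-symmetric`.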

Update: the dd copy of disk sdb may not have succeeded. The copy of sdb looked suspicious, so I ran the following commands:

root@sysresccd /root % dd if=/dev/sda3 of=/dev/sdc3 bs=128M
11144+1 records in
11144+1 records out
1495791843840 bytes (1.5 TB) copied, 42354.9 s, 35.3 MB/s
root@sysresccd /root % dd if=/dev/sdb3 of=/dev/sdd3 bs=128M
dd: reading `/dev/sdb3': Input/output error
222+1 records in
222+1 records out
29813932032 bytes (30 GB) copied, 676.459 s, 44.1 MB/s
root@sysresccd /root %

This time I copied only the sdb3 partition, because sdb1 and sdb2 are fine. As you can see, the copy aborts. So I am now running:

ddrescue -S -c 20480 -f /dev/sdb3 /dev/sdd3 /tmp/log3

to copy it once more, this time with ddrescue. So far it reports errsize=17928 kB and errors=3, and it needs more time.

I will update this post when the copy has finished and I have found out more.
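The log file given to ddrescue (here /tmp/log3) can also be inspected while the copy runs. The sketch below sums the sizes of all blocks with a given status. It assumes the usual log layout of `pos size status` data lines with hexadecimal values, `#` comment lines, and a separate current-position line; the function name and the sample values are invented for illustration and are not taken from the poster's log.

```shell
# Sum the sizes of all blocks with a given status in a ddrescue log on stdin.
# $1 = status character: '-' bad sector, '+' finished, '?' not yet tried.
ddrescue_bytes_with_status() {
    total=0
    while read -r pos size status _; do
        case "$pos" in '#'*|'') continue ;; esac   # skip comments and blank lines
        case "$status" in
            '?'|'*'|'/'|'-'|'+') ;;                 # data line: pos size status
            *) continue ;;                          # current-position line
        esac
        if [ "$status" = "$1" ]; then
            total=$(( total + $size ))              # hex constants are valid here
        fi
    done
    echo "$total"
}

# Made-up sample log, for demonstration only.
sample_log='# Rescue Logfile. Created by GNU ddrescue
# current_pos  current_status
0x00020400     ?
#      pos        size  status
0x00000000  0x00010000  +
0x00010000  0x00000200  -
0x00010200  0x0000FE00  +
0x00020000  0x00000400  -
0x00020400  0x10000000  ?'

printf '%s\n' "$sample_log" | ddrescue_bytes_with_status -
```

For the sample this prints 1536, the total number of bytes currently marked as bad. Against the real log it would be called as `ddrescue_bytes_with_status - < /tmp/log3`.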


1 Answer

Server Fault user

Posted 2012-03-17 20:05:58

(Answering my own question)

ddrescue solved the problem; afterwards the raid5 array could be reassembled.
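The answer does not say which commands were used for the reassembly. One common approach for a dirty, degraded array is a forced assembly from the surviving members, sketched below as a dry run. The device names (the stale md125, md2, /dev/sda3 and the ddrescue target /dev/sdd3) are assumptions based on the question and may differ on the actual system; the helper names `run` and `reassemble_degraded` are hypothetical.

```shell
# Print each command; execute it instead only when RUN=1 is set.
run() { if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi; }

# $1 = stale inactive array to stop, $2 = array to assemble, rest = member partitions
reassemble_degraded() {
    stale=$1; array=$2; shift 2
    run mdadm --stop "$stale"                          # release members held by the inactive array
    run mdadm --assemble --force --run "$array" "$@"   # accept the dirty, degraded state
    run cat /proc/mdstat                               # check the result
}

reassemble_degraded /dev/md125 /dev/md2 /dev/sda3 /dev/sdd3
```

Without RUN=1 the script only prints the three commands, which makes it easy to review them before touching the disks.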

Votes: 1
Original page content provided by Server Fault.
Original link:

https://serverfault.com/questions/367168
