Topic: [SOLVED]: URGENT 810 Software RAID failed after power outage

Beeker

Hi All,
We had a power failure at home that took down my system running 810. When the power came back and it booted up again, the LMCE software RAID showed all the drives as removed. I have tried sudo mdadm -D /dev/md1 and it comes back with: mdadm: md device /dev/md1 does not appear to be active.

I have very limited command-line knowledge, so any help would be appreciated. Is there anything I can try to recover the RAID and its data? All our family photos are on it. I am currently building a new QNAP NAS with RAID6, as I was advised to do.

I have attached a photo from LMCE of the RAID

Kind regards
Beeker
« Last Edit: August 27, 2013, 04:52:24 am by Beeker »

Crumble

Re: URGENT 810 Software RAID failed after power outage
« Reply #1 on: June 16, 2013, 09:22:01 pm »
Read this

If the md driver detects a write error on a device in a RAID1, RAID4, RAID5, RAID6, or RAID10 array, it immediately disables that device (marking it as faulty) and continues operation on the remaining devices. If there are spare drives, the driver will start recreating on one of the spare drives the data which was on that failed drive, either by copying a working drive in a RAID1 configuration, or by doing calculations with the parity block on RAID4, RAID5 or RAID6, or by finding and copying originals for RAID10.

In kernels prior to about 2.6.15, a read error would cause the same effect as a write error. In later kernels, a read-error will instead cause md to attempt a recovery by overwriting the bad block. i.e. it will find the correct data from elsewhere, write it over the block that failed, and then try to read it back again. If either the write or the re-read fail, md will treat the error the same way that a write error is treated, and will fail the whole device.


Since all the drives seem to be removed, read this as well:

http://linuxexpresso.wordpress.com/2010/03/31/repair-a-broken-ext4-superblock-in-ubuntu/

Obviously ignore the parts about Parted Magic and TestDisk; fsck is already included in Linux (in case you didn't know). You haven't explained what we are working with here, by the way. A RAID5 external NAS, I assume?

Just run these two commands (/dev/xxx being each of the RAID partitions, of course) and report back the info, unless you feel comfortable fixing it yourself. I am not sure it's a bad superblock, and it's not a good idea to start trying to fix things without knowing what the problem is first. Just a guess. Good luck! :)

sudo fdisk -l

mdadm -E /dev/xxx (on all the RAID partitions)

Crumble

Re: URGENT 810 Software RAID failed after power outage
« Reply #2 on: June 16, 2013, 10:00:49 pm »
Oh, and this one too, derp: mdadm --detail /dev/mdX, where X is whatever RAID number is at the end of md.

Beeker

Re: URGENT 810 Software RAID failed after power outage
« Reply #3 on: June 17, 2013, 01:20:26 am »
Thanks very much for the info. I will try it now and report back. Sorry, I should have said: it's a software RAID5 in LMCE.

Kind regards
Beeker

Beeker

Re: URGENT 810 Software RAID failed after power outage
« Reply #4 on: June 17, 2013, 01:39:32 am »
Hi Crumble

This is what I got from sudo fdisk -l. From memory, sdf was always the spare drive.

dcerouter_1024641:/home/bruce# sudo fdisk -l

Disk /dev/sda: 750.1 GB, 750156374016 bytes
255 heads, 63 sectors/track, 91201 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x39686ed2

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1       90446   726507463+  83  Linux
/dev/sda2           90447       91201     6064537+   5  Extended
/dev/sda5           90447       91201     6064506   82  Linux swap / Solaris

Disk /dev/sdb: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x0006a5c8

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1      182401  1465136001   83  Linux

Disk /dev/sdc: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0xb217d64b

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1               1      182401  1465136001   83  Linux

Disk /dev/sde: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x43b7a284

   Device Boot      Start         End      Blocks   Id  System
/dev/sde1               1      182401  1465136001   83  Linux

Disk /dev/sdf: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0xc8efaf55

Disk /dev/sdf doesn't contain a valid partition table


I also tried

dcerouter_1024641:/home/bruce# mdadm -E /dev/md1
mdadm: No md superblock detected on /dev/md1.

Any help on how to proceed would be greatly appreciated; I wasn't sure where to go from here.

Regards
Beeker

Crumble

Re: URGENT 810 Software RAID failed after power outage
« Reply #5 on: June 17, 2013, 02:17:32 am »
What about mdadm --detail /dev/md1? It may be in the process of rebuilding, which takes some time.

Beeker

Re: URGENT 810 Software RAID failed after power outage
« Reply #6 on: June 17, 2013, 02:26:10 am »
I tried
mdadm –detail /dev/md1 and got

dcerouter_1024641:/home/bruce# mdadm -detail /dev/md1
mdadm: -d does not set the mode, and so cannot be the first option.

So then I tried mdadm -D /dev/md1 and got
dcerouter_1024641:/home/bruce# mdadm -D /dev/md1
mdadm: md device /dev/md1 does not appear to be active.


Crumble

Re: URGENT 810 Software RAID failed after power outage
« Reply #7 on: June 17, 2013, 03:01:31 am »
Run each command and post the results, please:

cat /proc/mdstat   


mdadm -E /dev/sdb1
mdadm -E /dev/sdc1
mdadm -E /dev/sde1


Beeker

Re: URGENT 810 Software RAID failed after power outage
« Reply #8 on: June 17, 2013, 03:20:18 am »

Here you go
cat /proc/mdstat   

dcerouter_1024641:/home/bruce# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : inactive sdb1[0](S) sdf[4](S) sde1[3](S) sdc1[1](S)
      5860546304 blocks
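
For readers less used to /proc/mdstat, the array line above can be split into its parts with a few lines of Python. This is only a reading aid, not part of the original post: the function name parse_mdstat_line is made up for illustration, and it assumes the simple "device[number](FLAG)" layout shown here, where (S) is the flag mdstat uses for spares and is also commonly what members of an array that has not been started look like.

```python
import re

# Sample line copied from the /proc/mdstat output above.
MDSTAT_LINE = "md1 : inactive sdb1[0](S) sdf[4](S) sde1[3](S) sdc1[1](S)"


def parse_mdstat_line(line):
    """Split one /proc/mdstat array line into (array, state, members).

    Each member is a (device, number, flags) tuple. flags holds the
    letters found in parentheses, e.g. "S", or "" when there are none.
    """
    name, rest = line.split(" : ", 1)
    fields = rest.split()
    state = fields[0]
    members = []
    for field in fields[1:]:
        # Tokens that are not "dev[n](X)" (e.g. a RAID level) are skipped.
        m = re.fullmatch(r"(\w+)\[(\d+)\]((?:\([A-Z]\))*)", field)
        if m:
            flags = m.group(3).replace("(", "").replace(")", "")
            members.append((m.group(1), int(m.group(2)), flags))
    return name.strip(), state, members


array, state, members = parse_mdstat_line(MDSTAT_LINE)
print(array, state)
for device, number, flags in members:
    print(device, number, flags or "-")
```

Read this way, md1 is inactive and lists four members (sdb1, sdc1, sde1 and the whole disk sdf), each flagged (S); no sdd device appears in the line.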

mdadm -E /dev/sdb1
dcerouter_1024641:/home/bruce# mdadm -E /dev/sdb1
/dev/sdb1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 5747d0ac:31c15bff:bd9f1658:0a1d2015 (local to host dcerouter)
  Creation Time : Sun Dec 27 10:14:43 2009
     Raid Level : raid5
  Used Dev Size : 1465135936 (1397.26 GiB 1500.30 GB)
     Array Size : 4395407808 (4191.79 GiB 4500.90 GB)
   Raid Devices : 4
  Total Devices : 5
Preferred Minor : 1

    Update Time : Thu Jun 13 09:02:55 2013
          State : clean
 Active Devices : 4
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 1
       Checksum : ee358c49 - correct
         Events : 1802

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0       8       17        0      active sync   /dev/sdb1

   0     0       8       17        0      active sync   /dev/sdb1
   1     1       8       33        1      active sync   /dev/sdc1
   2     2       8       49        2      active sync
   3     3       8       65        3      active sync   /dev/sde1
   4     4       8       80        4      spare   /dev/sdf
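
As a quick sanity check on the superblock above, the reported Array Size should equal the usable capacity of a 4-device RAID5, which is one member less than the total because one member's worth of space holds parity. The sketch below uses only the figures printed by mdadm -E /dev/sdb1 (in 1 KiB blocks); the helper name raid5_capacity is made up for illustration.

```python
def raid5_capacity(raid_devices, used_dev_size):
    """Usable RAID5 capacity: one member's worth of space goes to parity."""
    return (raid_devices - 1) * used_dev_size


# Values copied from the mdadm -E /dev/sdb1 output above (1 KiB blocks).
USED_DEV_SIZE = 1465135936
RAID_DEVICES = 4
REPORTED_ARRAY_SIZE = 4395407808

expected = raid5_capacity(RAID_DEVICES, USED_DEV_SIZE)
print(expected, expected == REPORTED_ARRAY_SIZE)
```

The two numbers agree, so the geometry recorded in this superblock (4 active members plus 1 spare) is at least internally consistent.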



mdadm -E /dev/sdc1
dcerouter_1024641:/home/bruce# mdadm -E /dev/sdc1
/dev/sdc1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 5747d0ac:31c15bff:bd9f1658:0a1d2015 (local to host dcerouter)
  Creation Time : Sun Dec 27 10:14:43 2009
     Raid Level : raid5
  Used Dev Size : 1465135936 (1397.26 GiB 1500.30 GB)
     Array Size : 4395407808 (4191.79 GiB 4500.90 GB)
   Raid Devices : 4
  Total Devices : 5
Preferred Minor : 1

    Update Time : Thu Jun 13 09:02:55 2013
          State : clean
 Active Devices : 4
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 1
       Checksum : ee358c5b - correct
         Events : 1802

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     1       8       33        1      active sync   /dev/sdc1

   0     0       8       17        0      active sync   /dev/sdb1
   1     1       8       33        1      active sync   /dev/sdc1
   2     2       8       49        2      active sync
   3     3       8       65        3      active sync   /dev/sde1
   4     4       8       80        4      spare   /dev/sdf


mdadm -E /dev/sde1
dcerouter_1024641:/home/bruce# mdadm -E /dev/sde1
/dev/sde1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 5747d0ac:31c15bff:bd9f1658:0a1d2015 (local to host dcerouter)
  Creation Time : Sun Dec 27 10:14:43 2009
     Raid Level : raid5
  Used Dev Size : 1465135936 (1397.26 GiB 1500.30 GB)
     Array Size : 4395407808 (4191.79 GiB 4500.90 GB)
   Raid Devices : 4
  Total Devices : 5
Preferred Minor : 1

    Update Time : Thu Jun 13 09:02:55 2013
          State : clean
 Active Devices : 4
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 1
       Checksum : ee358c7f - correct
         Events : 1802

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     3       8       65        3      active sync   /dev/sde1

   0     0       8       17        0      active sync   /dev/sdb1
   1     1       8       33        1      active sync   /dev/sdc1
   2     2       8       49        2      active sync
   3     3       8       65        3      active sync   /dev/sde1
   4     4       8       80        4      spare   /dev/sdf




Crumble

Re: URGENT 810 Software RAID failed after power outage
« Reply #9 on: June 17, 2013, 03:45:42 am »
OK, it looks like all the drives are clean. Your RAID is showing all the drives as spare (like in the pic); I wasn't sure how accurate the GUI is. This may not be a superblock problem, but let's find out for sure.
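
One way to back up the "all clean" reading is to compare a few superblock fields across the members: if UUID, Update Time, State and Events all match, the members were most likely last written together. The following is a minimal sketch under that assumption, not a recovery tool. The function names are made up for illustration, and the excerpt holds only the four fields that are identical in the three mdadm -E reports posted above (sdb1, sdc1, sde1); sdf and the missing slot 2 are not covered.

```python
# The four fields below are identical in all three reports posted above.
EXCERPT = """
           UUID : 5747d0ac:31c15bff:bd9f1658:0a1d2015 (local to host dcerouter)
    Update Time : Thu Jun 13 09:02:55 2013
          State : clean
         Events : 1802
"""
REPORTS = {dev: EXCERPT for dev in ("/dev/sdb1", "/dev/sdc1", "/dev/sde1")}

WANTED = ("UUID", "Update Time", "State", "Events")


def superblock_fields(text):
    """Extract the WANTED 'Key : value' fields from one mdadm -E report."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep and key.strip() in WANTED:
            fields[key.strip()] = value.strip()
    return fields


def differing_fields(reports):
    """Return the field names whose values are not the same on every member."""
    parsed = [superblock_fields(text) for text in reports.values()]
    keys = set().union(*parsed)
    return sorted(k for k in keys if len({p.get(k) for p in parsed}) > 1)


print(differing_fields(REPORTS))
```

An empty list means the examined members agree on all four fields; a member with a lower Events count would show up as ["Events"] and would deserve a closer look before any assemble attempt.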

Run

mdadm --assemble --scan -v

This will let us know which drives have bad superblocks, and whether that is indeed the problem.


Crumble

Re: URGENT 810 Software RAID failed after power outage
« Reply #10 on: June 17, 2013, 03:49:23 am »
Wait a sec, don't run that. Wish there was an edit option. Gimme a sec.

Beeker

Re: URGENT 810 Software RAID failed after power outage
« Reply #11 on: June 17, 2013, 03:50:08 am »
ok

Crumble

Re: URGENT 810 Software RAID failed after power outage
« Reply #12 on: June 17, 2013, 04:24:22 am »
OK, I just noticed that disk sdd1 did not appear in the fdisk -l report. This is odd; we need to figure out what is going on there. Run mdadm -E /dev/sdd1 and post what it says.
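
The superblocks posted earlier point the same way: slot 2 is listed as major 8, minor 49 with no device path next to it. Under the conventional Linux numbering for SCSI/SATA disks (major 8, 16 minors per disk), that pair can be translated back to a device name. The sketch below assumes that standard numbering and only handles the first 16 disks; sd_name is a made-up helper, not an mdadm feature.

```python
def sd_name(major, minor):
    """Map a major-8 device number to its conventional /dev/sdX[N] name.

    Assumes the standard layout for the first 16 sd disks: 16 minors per
    disk, where minor % 16 == 0 is the whole disk and 1..15 are partitions.
    """
    if major != 8 or not 0 <= minor <= 255:
        raise ValueError("this sketch only handles major 8, minors 0-255")
    disk, part = divmod(minor, 16)
    name = "/dev/sd" + chr(ord("a") + disk)
    return name + (str(part) if part else "")


# (major, minor) pairs copied from the device table in the mdadm -E output.
for major, minor in [(8, 17), (8, 33), (8, 49), (8, 65), (8, 80)]:
    print(minor, sd_name(major, minor))
```

With that mapping, minor 49 corresponds to /dev/sdd1, which fits the observation that the array expects a fourth active member on sdd that the system no longer sees.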

Beeker

Re: URGENT 810 Software RAID failed after power outage
« Reply #13 on: June 17, 2013, 04:43:17 am »
This is what I got

dcerouter_1024641:/home/bruce# mdadm -E /dev/sdd1
mdadm: cannot open /dev/sdd1: No such file or directory

Crumble

Re: URGENT 810 Software RAID failed after power outage
« Reply #14 on: June 17, 2013, 04:53:16 am »
This is odd. If that drive is or was an active hot spare, it should be partitioned and ready to write to if a drive failure occurs. I would think b, c and d would be the active drives and e the hot spare, with f as your backup hot spare. Is this correct?