LinuxMCE Forums

General => Users => Topic started by: Beeker on June 14, 2013, 10:50:26 pm

Title: [SOLVED]: URGENT 810 Software RAID failed after power outage
Post by: Beeker on June 14, 2013, 10:50:26 pm
Hi All,
We had a power failure at home, and when the power came back and my 810 system booted up again, the LMCE software RAID showed all the drives as removed. I have tried sudo mdadm -D /dev/md1 and it comes back with: mdadm: md device /dev/md1 does not appear to be active.

I have very limited command line knowledge, so any help would be appreciated to see if there is anything I can try to recover the RAID and data, as all our family photos are on it. I am currently building up a new QNAP NAS with RAID6, as I was advised to do.

I have attached a photo from LMCE of the RAID

Kind regards
Beeker
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on June 16, 2013, 09:22:01 pm
Read this

 If the md driver detects a write error on a device in a  RAID1,  RAID4,
       RAID5,  RAID6,  or  RAID10  array,  it immediately disables that device
       (marking it  as  faulty)  and  continues  operation  on  the  remaining
       devices.   If  there are spare drives, the driver will start recreating
       on one of the spare drives the data which was  on  that  failed  drive,
       either by copying a working drive in a RAID1 configuration, or by doing
       calculations with the parity block on RAID4,  RAID5  or  RAID6,  or  by
       finding and copying originals for RAID10.

       In  kernels  prior  to  about 2.6.15, a read error would cause the same
       effect as a write error.  In later kernels, a read-error  will  instead
       cause  md  to  attempt a recovery by overwriting the bad block. i.e. it
       will find the correct data from elsewhere, write it over the block that
       failed, and then try to read it back again.  If either the write or the
       re-read fail, md will treat the error the same way that a  write  error
       is treated, and will fail the whole device.


Since all seem to be removed read this:

http://linuxexpresso.wordpress.com/2010/03/31/repair-a-broken-ext4-superblock-in-ubuntu/

Obviously ignore the parts about Parted Magic and TestDisk.  fsck is already in Linux (if you didn't know that).  You have not explained what we are working with here, btw.  RAID5 external NAS, I assume?

Just run these two commands (/dev/xxx being one/all of the RAID partitions, of course) and report back the info, unless you feel comfortable fixing it yourself.  I am not sure it's a bad superblock, and it's not a good idea to start trying to fix things without knowing what the problem is first.  Just a guess.  Good Luck!   :)

sudo fdisk -l

mdadm -E /dev/xxx (on all the RAID partitions)
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on June 16, 2013, 10:00:49 pm
oh and this derp.  mdadm --detail /dev/mdX, where X is whatever RAID number is at the end of md
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on June 17, 2013, 01:20:26 am
Thanks very much for the info. I will try it now and report back. Sorry, it's a software RAID5 in LMCE.

Kind regards
Beeker
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on June 17, 2013, 01:39:32 am
Hi Crumble

This is what I got from sudo fdisk -l. From memory, sdf was always the spare drive.

dcerouter_1024641:/home/bruce# sudo fdisk -l

Disk /dev/sda: 750.1 GB, 750156374016 bytes
255 heads, 63 sectors/track, 91201 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x39686ed2

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1       90446   726507463+  83  Linux
/dev/sda2           90447       91201     6064537+   5  Extended
/dev/sda5           90447       91201     6064506   82  Linux swap / Solaris

Disk /dev/sdb: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x0006a5c8

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1      182401  1465136001   83  Linux

Disk /dev/sdc: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0xb217d64b

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1               1      182401  1465136001   83  Linux

Disk /dev/sde: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x43b7a284

   Device Boot      Start         End      Blocks   Id  System
/dev/sde1               1      182401  1465136001   83  Linux

Disk /dev/sdf: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0xc8efaf55

Disk /dev/sdf doesn't contain a valid partition table


I also tried

dcerouter_1024641:/home/bruce# mdadm -E /dev/md1
mdadm: No md superblock detected on /dev/md1.

Any help on how to proceed would be greatly appreciated, as I wasn't sure where to go from here.

Regards
Beeker
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on June 17, 2013, 02:17:32 am
what about mdadm --detail /dev/md1?  It may be in the process of rebuilding, which takes some time.
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on June 17, 2013, 02:26:10 am
I tried
mdadm –detail /dev/md1 and got

dcerouter_1024641:/home/bruce# mdadm -detail /dev/md1
mdadm: -d does not set the mode, and so cannot be the first option.

So then I tried mdadm -D /dev/md1 and got
dcerouter_1024641:/home/bruce# mdadm -D /dev/md1
mdadm: md device /dev/md1 does not appear to be active.

Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on June 17, 2013, 03:01:31 am
run each command and post results please

cat /proc/mdstat   


mdadm -E /dev/sdb1
mdadm -E /dev/sdc1
mdadm -E /dev/sde1

Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on June 17, 2013, 03:20:18 am

Here you go
cat /proc/mdstat   

dcerouter_1024641:/home/bruce# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : inactive sdb1[0](S) sdf[4](S) sde1[3](S) sdc1[1](S)
      5860546304 blocks

mdadm -E /dev/sdb1
dcerouter_1024641:/home/bruce# mdadm -E /dev/sdb1
/dev/sdb1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 5747d0ac:31c15bff:bd9f1658:0a1d2015 (local to host dcerouter)
  Creation Time : Sun Dec 27 10:14:43 2009
     Raid Level : raid5
  Used Dev Size : 1465135936 (1397.26 GiB 1500.30 GB)
     Array Size : 4395407808 (4191.79 GiB 4500.90 GB)
   Raid Devices : 4
  Total Devices : 5
Preferred Minor : 1

    Update Time : Thu Jun 13 09:02:55 2013
          State : clean
 Active Devices : 4
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 1
       Checksum : ee358c49 - correct
         Events : 1802

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0       8       17        0      active sync   /dev/sdb1

   0     0       8       17        0      active sync   /dev/sdb1
   1     1       8       33        1      active sync   /dev/sdc1
   2     2       8       49        2      active sync
   3     3       8       65        3      active sync   /dev/sde1
   4     4       8       80        4      spare   /dev/sdf



mdadm -E /dev/sdc1
dcerouter_1024641:/home/bruce# mdadm -E /dev/sdc1
/dev/sdc1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 5747d0ac:31c15bff:bd9f1658:0a1d2015 (local to host dcerouter)
  Creation Time : Sun Dec 27 10:14:43 2009
     Raid Level : raid5
  Used Dev Size : 1465135936 (1397.26 GiB 1500.30 GB)
     Array Size : 4395407808 (4191.79 GiB 4500.90 GB)
   Raid Devices : 4
  Total Devices : 5
Preferred Minor : 1

    Update Time : Thu Jun 13 09:02:55 2013
          State : clean
 Active Devices : 4
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 1
       Checksum : ee358c5b - correct
         Events : 1802

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     1       8       33        1      active sync   /dev/sdc1

   0     0       8       17        0      active sync   /dev/sdb1
   1     1       8       33        1      active sync   /dev/sdc1
   2     2       8       49        2      active sync
   3     3       8       65        3      active sync   /dev/sde1
   4     4       8       80        4      spare   /dev/sdf


mdadm -E /dev/sde1
dcerouter_1024641:/home/bruce# mdadm -E /dev/sde1
/dev/sde1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 5747d0ac:31c15bff:bd9f1658:0a1d2015 (local to host dcerouter)
  Creation Time : Sun Dec 27 10:14:43 2009
     Raid Level : raid5
  Used Dev Size : 1465135936 (1397.26 GiB 1500.30 GB)
     Array Size : 4395407808 (4191.79 GiB 4500.90 GB)
   Raid Devices : 4
  Total Devices : 5
Preferred Minor : 1

    Update Time : Thu Jun 13 09:02:55 2013
          State : clean
 Active Devices : 4
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 1
       Checksum : ee358c7f - correct
         Events : 1802

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     3       8       65        3      active sync   /dev/sde1

   0     0       8       17        0      active sync   /dev/sdb1
   1     1       8       33        1      active sync   /dev/sdc1
   2     2       8       49        2      active sync
   3     3       8       65        3      active sync   /dev/sde1
   4     4       8       80        4      spare   /dev/sdf
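
The /proc/mdstat output at the top of this post can be read mechanically. Below is a minimal Python sketch (the function names are made up for illustration and are not part of mdadm) that pulls out each member, its slot number and its (S) flag, then reports which slots are absent. With the line posted above, slot 2 is the one missing, which fits the device table where slot 2 has no device path. On an inactive array the (S) flag on every member most likely just means no roles have been handed out yet, not that all four disks really became spares.

```python
import re

# The three lines posted above, copied verbatim.
MDSTAT = """\
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : inactive sdb1[0](S) sdf[4](S) sde1[3](S) sdc1[1](S)
      5860546304 blocks
"""

def parse_mdstat(text):
    """Map each mdN array to its state and a list of (device, slot, has_S_flag)."""
    arrays = {}
    for line in text.splitlines():
        header = re.match(r"^(md\d+)\s*:\s*(.*)$", line)
        if not header:
            continue  # Personalities line, block counts, blank lines
        tokens = header.group(2).split()
        members = []
        for tok in tokens:
            # Member tokens look like "sdb1[0](S)"; flags after the slot are optional.
            m = re.match(r"^(\w+)\[(\d+)\]((?:\(\w\))*)$", tok)
            if m:
                members.append((m.group(1), int(m.group(2)), "(S)" in m.group(3)))
        state = tokens[0] if tokens else ""
        arrays[header.group(1)] = {"state": state, "members": members}
    return arrays

def missing_slots(members, total_devices):
    """Slots in 0..total_devices-1 that no listed member occupies."""
    present = {slot for _, slot, _ in members}
    return [s for s in range(total_devices) if s not in present]

md1 = parse_mdstat(MDSTAT)["md1"]
print(md1["state"], md1["members"])
# 5 is the "Total Devices" value from the mdadm -E output (4 active + 1 spare).
print("missing slots:", missing_slots(md1["members"], 5))
```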



Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on June 17, 2013, 03:45:42 am
Ok, looks like all the drives are clean.  Your RAID is showing all the drives as spare (like in the pic); I wasn't sure how accurate the GUI is.  This may not be a superblock problem, but let's find out for sure.

Run

mdadm --assemble --scan -v

this will let us know which have bad superblocks and if that is indeed the problem. 

Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on June 17, 2013, 03:49:23 am
wait a sec, don't run that, wish there was an edit option.  gimme a sec
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on June 17, 2013, 03:50:08 am
ok
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on June 17, 2013, 04:24:22 am
ok, i just noticed disk sdd1 did not appear in the fdisk -l report.  This is odd, we need to figure out what is going on there.  run mdadm -E /dev/sdd1 and post what it says.
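
The device table in the mdadm -E output already points at the same disk: slot 2 is listed as major 8, minor 49 with no device path. For the classic sd driver (major 8, 16 minors per disk) that pair can be turned back into a name. A small Python sketch, assuming only the first 16 SCSI/SATA disks (sda to sdp) are in play; the helper name is hypothetical:

```python
def sd_name(major, minor):
    """Translate a (major, minor) pair for block major 8 into an sdXN name."""
    if major != 8 or not 0 <= minor <= 255:
        raise ValueError("only block major 8 (sda..sdp) is handled in this sketch")
    disk, part = divmod(minor, 16)      # 16 minors per disk; minor % 16 == 0 is the whole disk
    name = "sd" + chr(ord("a") + disk)
    return name + (str(part) if part else "")

# The (major, minor) pairs from the device table in the mdadm -E output.
for slot, minor in enumerate([17, 33, 49, 65, 80]):
    print(slot, sd_name(8, minor))
```

Slot 2 comes out as sdd1, the partition that fdisk -l no longer lists.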
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on June 17, 2013, 04:43:17 am
This is what I got

dcerouter_1024641:/home/bruce# mdadm -E /dev/sdd1
mdadm: cannot open /dev/sdd1: No such file or directory
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on June 17, 2013, 04:53:16 am
This is odd, if that drive is or was an active hot spare it should be partitioned and ready to write to if drive failure occurs.  I would think b,c,d would be the active drives and e the hot spare with f your backup hot spare.  Is this correct?
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on June 17, 2013, 05:05:50 am
No, I have 4 x HDDs with one hot spare, so it should be b, c, d & e with f as the hot spare. Either the power or the SATA lead may have come out, so I will check these when I get home and let you know.
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on June 17, 2013, 05:28:28 am
oops, didn't see your post.  so yeah, that is a good start.  once you've got d showing again, give these two commands a go, assuming it doesn't rebuild itself on its own.


mdadm --stop /dev/md1
mdadm --assemble --force /dev/md1 /dev/sd[b,c,d,e,f]  
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on June 17, 2013, 10:49:57 am
I got home and checked all the HDDs and found one making a slow beeping noise, which I can only assume is sdd1, so a couple of questions:

Should I leave it plugged in and try to rebuild the RAID with the commands you gave me, hoping the spare drive will now form part of the RAID? I think the spare drive is just that, a "spare drive" and not a hot spare, so the other choice would be to unplug sdd1 and plug that into sdf, then reassemble the RAID with your commands and hopefully the data will come back.

I really appreciate all your help with trying to fix this, thanks very much.

Best regards
Beeker
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: tschak909 on June 17, 2013, 04:14:42 pm
Once you're back up and running, we need to make a ticket so that we can improve the RAID UI in the web admin, to help deal with these issues.

-Thom
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on June 17, 2013, 09:54:31 pm
Will do, happy to provide any information or help.
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: tschak909 on June 17, 2013, 10:02:05 pm
please make a ticket @ http://svn.linuxmce.org/ ... thanks :)

-Thom
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on June 17, 2013, 10:20:56 pm
Done Thom..........excuse any mistakes as it's the first ticket I have ever created, so I hope it's correct  :)

Please let me know if I need to make any changes

Beeker
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on June 18, 2013, 04:03:56 am
Well, I do not recommend setting up the RAID again without a spare, unless you back up the data.  Remember RAID 5 will survive one drive failure.  If more than that fail, that is it, donezo.  All that data is gone forever.  I recommend replacing the drive.  I know it sucks, but better safe than sorry.  If you do not want to heed this warning, I would leave it as is and run those commands to see if it will rebuild.  I really stress being patient if you need another drive and can't get one right away though.
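
The numbers in the mdadm -E output earlier in the thread line up with the one-failure rule: a RAID 5 array gives up one member's worth of space to parity, so usable size is (members - 1) x member size. A minimal Python sketch of that arithmetic, using the "Raid Devices" and "Used Dev Size" values as posted (the function names are illustrative only):

```python
def raid5_capacity_kib(raid_devices, used_dev_size_kib):
    """Usable RAID 5 capacity: one member's worth of space goes to parity."""
    if raid_devices < 3:
        raise ValueError("RAID 5 needs at least three members")
    return (raid_devices - 1) * used_dev_size_kib

def raid5_survives(failed_members):
    """RAID 5 can reconstruct data as long as at most one member is missing."""
    return failed_members <= 1

# Values from the mdadm -E output: Raid Devices : 4, Used Dev Size : 1465135936
print(raid5_capacity_kib(4, 1465135936))   # 4395407808, the reported Array Size
print(raid5_survives(1), raid5_survives(2))
```

The spare does not add capacity; it only sits ready to take over for a failed member.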
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on June 18, 2013, 04:36:47 am
Just so I fully understand, could I clarify a couple of your suggestions? As you point out, I would rather not lose all the data.

The original RAID was set up with 4 x HDDs, and LinuxMCE automatically marked the 5th one as a spare. Can I buy another 1.5TB HDD, replace the failed one in the RAID, and hopefully it will rebuild? From my understanding, you are saying that if I replace the HDD I can't just have 4 HDDs, I need to have the spare as well. If that is correct, once I replace the faulty HDD, will it rebuild by itself or will I have to use one of the commands that you sent?

I have a backup of some of the data, though as it's a RAID array I assume I can't take out each HDD, put it in my QNAP and copy it.

Really appreciate all your help, and sorry for the dumb questions. I just need to recover the family photos or life won't be worth living. I have certainly learned a lesson here about backing data up.

Regards
Beeker
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on June 18, 2013, 05:54:52 am
RAID does not like changes being made.  Problems can arise in Linux from just trying to rebuild the array with one disk missing.  If the drive is there and, say, Linux sees a write error or read error, it will mark that drive faulty and should start using the spare.  In your case it knows that f is part of the array.  If d just isn't there, this could create a problem, even though it sees f and knows it's a spare.  I don't know why, honestly.  Just my experience.

Your best bet is to replace d and leave f where it is.  Then run the very first command you posted to verify it is rebuilding.  If it does not automagically rebuild, you will need to run the two commands I posted to force it to rebuild.  That would be the smartest way of doing this.  Although I should mention, there is always a chance that something else could go very wrong.

RAID 5 is a piece of garbage.  I do not understand why it was so popular.  It puts the data on the disks like a tic tac toe game, making it difficult to retrieve data.  There are tools that can do this though.  Knoppix has one.  What you do is make images of the disks, put the images on four more disks, then have the tools try and rebuild the data.  RAID 5 spreads the parity across all the member drives along with the data (a single dedicated parity drive is RAID 4), so the tools need images of at least three of your four members to work with.  You could try, there is no harm in making images of the drives before you try and rebuild it.  At least that way, you can always send the images off for recovery if your significant other is making death threats.  Then just blame it on the company, they screwed up, act really mad.   :P
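
For reference, the "Layout : left-symmetric" line in the mdadm -E output earlier in the thread means the parity chunk moves to a different member on every stripe instead of living on one disk. Below is a minimal Python sketch of that rotation and of the XOR step used to rebuild a missing chunk. It follows the left-symmetric rule as the Linux md driver is understood to implement it; the function names are illustrative and the 4-byte chunks are toy data, not anything read from the array.

```python
def left_symmetric(stripe, raid_disks):
    """Return (parity_disk, [disk holding data chunk 0, 1, ...]) for one stripe."""
    data_disks = raid_disks - 1
    parity = data_disks - (stripe % raid_disks)   # parity walks backwards, one disk per stripe
    data = [(parity + 1 + d) % raid_disks for d in range(data_disks)]
    return parity, data

def xor_blocks(*blocks):
    """XOR equally sized byte strings together."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

# Where parity and data land for the first four stripes of a 4-member array.
for stripe in range(4):
    print(stripe, left_symmetric(stripe, 4))

# Toy chunks: parity is the XOR of the data chunks, so any single lost chunk
# can be recomputed from the other two plus parity.
d0, d1, d2 = b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"
parity = xor_blocks(d0, d1, d2)
print(xor_blocks(d0, d2, parity) == d1)
```

This is why one missing member is survivable and two are not: with two chunks of a stripe gone there is nothing left to XOR against.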
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on June 18, 2013, 01:34:38 pm
Thanks, I will purchase a new Seagate 1.5TB HDD, see what it does and let you know............it's going to take a couple of days to organise a new HDD.

Thanks again for your patience and assistance with this issue........just confirming I should first try sudo mdadm -D /dev/md1, and if that does nothing, then use the two commands that you suggested.
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on June 18, 2013, 07:17:02 pm
Yes, that is exactly right.  Good Luck Beeker. 
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on June 18, 2013, 07:18:04 pm
Thanks I think I am going to need it  :)
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on June 29, 2013, 09:26:10 am
how did you make out beeker?
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on June 29, 2013, 09:57:19 am
Hi Crumble,
I finally found a used Seagate 1.5TB on eBay for the right price of $61 and it's on its way, so I should have it next week. I am hoping that when I plug it in it will automatically rebuild the software RAID, and then I can copy all the content over to my new QNAP NAS, which is ready to go with RAID6 and 19TB.

I will have everything crossed next week when I put the replacement HDD in, so wish me luck. I will definitely let you know how it goes, as you have been a fantastic help, and if it all goes according to plan and I can recover the wedding photos and all the other data, then I won't end up being killed  :)

Thanks again

Kind regards
Beeker
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on July 05, 2013, 06:17:22 pm
Hi Crumble,
Installed the new HDD and tried

mdadm --stop /dev/md1
mdadm --assemble --force /dev/md1 /dev/sd[b,c,d,e,f] 


And got

dcerouter_1024641:/home/bruce# mdadm --stop /dev/md1
mdadm: stopped /dev/md1
dcerouter_1024641:/home/bruce# mdadm --assemble --force /dev/md1 /dev/sd[b,c,d,e,f]
mdadm: no recogniseable superblock on /dev/sdb
mdadm: /dev/sdb has no superblock - assembly aborted
dcerouter_1024641:/home/bruce#

Any thoughts

Regards
Beeker
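
One likely reading of that error: the pattern /dev/sd[b,c,d,e,f] in the assemble command expands to whole disks, while the mdadm -E output earlier in the thread found the md superblocks on partitions (sdb1, sdc1, sde1), with only the spare sdf used as a whole disk. A bracket expression matches exactly one character, and the commas inside it are treated as literal characters to match, not as separators. A small Python sketch using fnmatch, which approximates shell globbing closely enough for this pattern (the candidate list is made up for illustration):

```python
from fnmatch import fnmatchcase

PATTERN = "/dev/sd[b,c,d,e,f]"   # the pattern from the assemble command above

# Hypothetical candidate names; "/dev/sd," is included only to show the comma matching.
candidates = ["/dev/sdb", "/dev/sdb1", "/dev/sdc1", "/dev/sde1", "/dev/sdf", "/dev/sd,"]

matched = [path for path in candidates if fnmatchcase(path, PATTERN)]
print(matched)   # ['/dev/sdb', '/dev/sdf', '/dev/sd,']
```

None of the partition names match, so mdadm was handed /dev/sdb itself and looked for a superblock in the wrong place. Naming the members exactly as mdadm -E reported them would avoid that, though whether the assemble then succeeds is a separate question.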
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on July 05, 2013, 06:35:22 pm
Also I tried the first command of mdadm -D /dev/md1

And got

dcerouter_1024641:/home/bruce# mdadm -D /dev/md1
mdadm: md device /dev/md1 does not appear to be active.

Maybe look up Beeker in the Funeral Notices  :-[

Regards
Beeker
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on July 06, 2013, 09:28:10 am
OK, this is good actually.  We finally know what the problem is, besides a bad HDD.  I will show you where to look in a few hours.  Doing some work this Friday.  Don't panic, this is fixable.
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on July 06, 2013, 09:42:02 am
Awesome news, means I can cancel the funeral director :)
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on July 09, 2013, 07:33:09 am
OK, first let's do this.

Run fdisk -l to make sure the partitions are named the same after replacing the drive, and post the output.



Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on July 09, 2013, 07:41:37 am
Here you go

dcerouter_1024641:/home/bruce# fdisk -l

Disk /dev/sda: 750.1 GB, 750156374016 bytes
255 heads, 63 sectors/track, 91201 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x39686ed2

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1       90446   726507463+  83  Linux
/dev/sda2           90447       91201     6064537+   5  Extended
/dev/sda5           90447       91201     6064506   82  Linux swap / Solaris

Disk /dev/sdb: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x0006a5c8

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1      182401  1465136001   83  Linux

Disk /dev/sdc: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0xb217d64b

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1               1      182401  1465136001   83  Linux

Disk /dev/sdd: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x43b7a284

   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1               1      182401  1465136001   83  Linux

Disk /dev/sde: 1500.3 GB, 1500301910016 bytes
240 heads, 63 sectors/track, 193801 cylinders
Units = cylinders of 15120 * 512 = 7741440 bytes
Disk identifier: 0x2848762e


   Device Boot      Start         End      Blocks   Id  System
/dev/sde1   *          14      193802  1465033728    7  HPFS/NTFS




Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on July 09, 2013, 09:16:16 am
haha ok geez, something has changed.  I'm trying to figure out what happened.
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on July 09, 2013, 09:19:04 am
Thanks I will hold off contacting the funeral director :)
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on July 09, 2013, 09:20:11 am
FYI

The 4 x HDD's in the RAID are still showing as removed
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on July 09, 2013, 09:39:41 am
ah ok, so we have d now but no f LOL.  maybe jiggled a wire loose?  and it looks like e is formatted for HPFS/NTFS.  UH, HPFS... o.0  what in the wild world of sports?  I have not seen that before!  That has got to be formatted first and we need /dev/sdf back.  I hope I just never noticed that Linux reports NTFS as HPFS/NTFS.  HPFS has been dead since NT 4.0 came out.  Read this quote from Microsoft LOL!

Disadvantages of HPFS
Because of the overhead involved in HPFS, it is not a very efficient choice for a volume of under approximately 200 MB. In addition, with volumes larger than about 400 MB, there will be some performance degradation. You cannot set security on HPFS under Windows NT.

HPFS is only supported under Windows NT versions 3.1, 3.5, and 3.51. Windows NT 4.0 cannot access HPFS partitions.
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on July 09, 2013, 09:51:25 am
and thats ok if the drives are still showing as removed.  we have a bad superblock to fix, I am assuming we will find some more but we will get to that in a bit.  gotta get those two things sorted first.
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on July 09, 2013, 10:55:17 am
I have removed sdd1 and am just formatting it with ext3 from my Win 7 PC using mini partition wizard, then I will put it back in, run fdisk -l again and post the results.
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on July 09, 2013, 10:58:45 am
NO DON'T DO THAT
E Beeker E
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on July 09, 2013, 11:01:32 am
Sorry, oops, too late. That was the replacement HDD that was showing as no valid partition, so I figured it needed to be formatted as ext3.
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on July 09, 2013, 11:05:05 am
Have I stuffed things ?
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on July 09, 2013, 11:05:56 am
oh, maybe it was e.  can you run fdisk -l again and post
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on July 09, 2013, 11:23:06 am
Here you go after formatting sdd1 with ext3

dcerouter_1024641:/home/bruce# fdisk -l

Disk /dev/sda: 750.1 GB, 750156374016 bytes
255 heads, 63 sectors/track, 91201 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x39686ed2

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1       90446   726507463+  83  Linux
/dev/sda2           90447       91201     6064537+   5  Extended
/dev/sda5           90447       91201     6064506   82  Linux swap / Solaris

Disk /dev/sdb: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x0006a5c8

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1      182401  1465136001   83  Linux

Disk /dev/sdc: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0xb217d64b

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1               1      182401  1465136001   83  Linux

Disk /dev/sdd: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0xc8efaf55

Disk /dev/sdd doesn't contain a valid partition table

Disk /dev/sde: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x43b7a284

   Device Boot      Start         End      Blocks   Id  System
/dev/sde1               1      182401  1465136001   83  Linux

Disk /dev/sdf: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x2848762e

   Device Boot      Start         End      Blocks   Id  System
/dev/sdf1               1      182401  1465134080   83  Linux
dcerouter_1024641:/home/bruce#
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on July 09, 2013, 11:24:47 am
if it was the replacement drive it probably was e. 
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on July 09, 2013, 11:26:05 am
woohoo!  ok, safe.  didn't mean to scare you Beeker, it would be a shame to get this far and then format your data off LOL.
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on July 09, 2013, 11:30:33 am
Phew sdd1 is definitely the replacement HDD that is currently showing no valid partition
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on July 09, 2013, 11:38:11 am
ok, looks like you got f working again.  this is probably why the drive assignment changed and made me pee myself a little when you formatted d.  good stuff.  ok lets work on the superblocks.  gimme a couple to get the right strategy going.  working at the moment too.  
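
Since the drive letters keep moving, the "Disk identifier" lines in the fdisk -l listings are a steadier handle than sdX names. Below is a minimal Python sketch that maps identifiers to device names in two listings and shows which disk moved where. The two sample strings are trimmed down from the first listing in the thread and the latest one, keeping only the /dev/sdd, /dev/sde and /dev/sdf entries; the function names are illustrative.

```python
import re

def disk_ids(fdisk_text):
    """Map each 'Disk identifier' value to the /dev/sdX it was printed under."""
    ids, current = {}, None
    for line in fdisk_text.splitlines():
        m = re.match(r"^Disk (/dev/\w+):", line)
        if m:
            current = m.group(1)
            continue
        m = re.match(r"^Disk identifier:\s*(0x[0-9a-fA-F]+)", line)
        if m and current:
            ids[m.group(1)] = current
    return ids

def renames(before, after):
    """For every identifier seen in either listing, return (old name, new name)."""
    b, a = disk_ids(before), disk_ids(after)
    return {ident: (b.get(ident), a.get(ident)) for ident in sorted(set(b) | set(a))}

FIRST = """\
Disk /dev/sde: 1500.3 GB, 1500301910016 bytes
Disk identifier: 0x43b7a284
Disk /dev/sdf: 1500.3 GB, 1500301910016 bytes
Disk identifier: 0xc8efaf55
Disk /dev/sdf doesn't contain a valid partition table
"""

LATEST = """\
Disk /dev/sdd: 1500.3 GB, 1500301910016 bytes
Disk identifier: 0xc8efaf55
Disk /dev/sdd doesn't contain a valid partition table
Disk /dev/sde: 1500.3 GB, 1500301910016 bytes
Disk identifier: 0x43b7a284
Disk /dev/sdf: 1500.3 GB, 1500301910016 bytes
Disk identifier: 0x2848762e
"""

for ident, (old, new) in renames(FIRST, LATEST).items():
    print(ident, old, "->", new)
```

Going by those identifiers alone, the disk now at /dev/sdd carries the identifier the spare had as /dev/sdf in the first listing, and 0x2848762e (first seen on the NTFS drive) is now /dev/sdf. That is worth double-checking against the drive serial numbers before assuming which letter is the replacement.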
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on July 09, 2013, 11:38:53 am
Thanks much appreciated  :)
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on July 16, 2013, 09:23:30 am
Hey Beeker, sorry I haven't replied in a while.  I have been very busy at work and had no internet at home.  If you still need to fix that RAID, follow the instructions at this link for /dev/sdb.

first run this command just to be safe

mdadm --stop /dev/md1

then follow this
http://linuxexpresso.wordpress.com/2010/03/31/repair-a-broken-ext4-superblock-in-ubuntu/

You're using ext3, I think?
Once you have replaced the bad superblock with a backup run


mdadm --assemble --force /dev/md1 /dev/sd[b,c,d,e,f] 


let me know if you have any problems
if you do get the RAID up and running again you will need to run fsck
but NOT until it is rebuilt and ready to use/online
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on July 16, 2013, 09:32:00 am
On second thought don't run fsck.  Backup your data once you get it running and then run fsck if you want to use that RAID.  Sometimes fsck will throw stuff in the lost and found folder.  Better to keep your organization the way it is for photos i am assuming.
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on July 16, 2013, 09:45:05 am
Thanks Crumble,
Just in the car stuck in traffic and will be home in about 40 mins, so I will let you know how it goes, and thanks again :)
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on July 16, 2013, 11:07:24 am
Hey Crumble,
Reading the wordpress documentation it says to run

sudo fsck.ext3 -v /dev/sdd

Is it ok to run this cmd ?

Regards
Beeker
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on July 16, 2013, 11:11:57 am
you want    sudo fsck.ext3 -v /dev/sdb

do everything to /dev/sdb

If i am correct that is where the bad superblock was

make sure to run mdadm --stop /dev/md1  before anything
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on July 16, 2013, 11:14:18 am
I have stopped it already. I am sure it was sdd, as that was the HDD I replaced, so I assume it won't hurt if it is the wrong HDD, and if it comes back ok then I will check sdb.
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on July 16, 2013, 11:18:31 am
sdd should be blank from the formatting

dcerouter_1024641:/home/bruce# mdadm --stop /dev/md1
mdadm: stopped /dev/md1
dcerouter_1024641:/home/bruce# mdadm --assemble --force /dev/md1 /dev/sd[b,c,d,e,f]
mdadm: no recogniseable superblock on /dev/sdb
mdadm: /dev/sdb has no superblock - assembly aborted
dcerouter_1024641:/home/bruce#
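
A note on where mdadm looks: the members report "Version : 00.90.00", and a v0.90 md superblock sits in a 64 KiB-aligned slot between 64 KiB and 128 KiB from the end of the device it was created on. The Python sketch below works that position out for the partition /dev/sdb1 and for the whole disk /dev/sdb, using sizes from the fdisk -l output. The partition start at sector 63 is an assumption (fdisk only shows cylinders here), and the function name is illustrative.

```python
MD_RESERVED_KIB = 64  # v0.90 keeps its superblock in a 64 KiB-aligned slot near the end

def md090_superblock_offset_kib(device_size_kib):
    """Offset of the v0.90 superblock, in KiB from the start of the device."""
    aligned = (device_size_kib // MD_RESERVED_KIB) * MD_RESERVED_KIB
    return aligned - MD_RESERVED_KIB

SDB1_SIZE_KIB = 1465136001              # "Blocks" column for /dev/sdb1 in fdisk -l
SDB_SIZE_KIB = 1500301910016 // 1024    # whole-disk size in bytes from fdisk -l
PART_START_KIB = 63 * 512 / 1024        # ASSUMPTION: sdb1 starts at sector 63

on_partition = md090_superblock_offset_kib(SDB1_SIZE_KIB)
on_whole_disk = md090_superblock_offset_kib(SDB_SIZE_KIB)

print(on_partition)                     # 1465135936, same as "Used Dev Size" in mdadm -E
print(on_whole_disk)                    # 1465138496
print(PART_START_KIB + on_partition == on_whole_disk)   # False: two different places on the disk
```

The partition figure matching "Used Dev Size" suggests the formula is right, and the mismatch with the whole-disk figure would explain why mdadm finds a superblock on /dev/sdb1 but reports none on /dev/sdb.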
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on July 16, 2013, 11:21:06 am
I started checking sdd and currently it says this

dcerouter_1024641:/home/bruce# fsck.ext3 -v /dev/sdd
e2fsck 1.41.3 (12-Oct-2008)
/dev/sdd has gone 1270 days without being checked, check forced.
Pass 1: Checking inodes, blocks, and sizes

Should I cancel it and check sdb
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on July 16, 2013, 11:22:20 am
no, it won't hurt anything, it just shouldn't have any valid superblocks on it as it was never part of the RAID.  should be blank.
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on July 16, 2013, 11:26:20 am
Thanks, so do you think sdb will have a bad superblock even though sdd was the faulty HDD?
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on July 16, 2013, 11:29:49 am
This is what I got below from sdd so now I am going to check sdb

dcerouter_1024641:/home/bruce# fsck.ext3 -v /dev/sdd
e2fsck 1.41.3 (12-Oct-2008)
/dev/sdd has gone 1270 days without being checked, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information

      11 inodes used (0.00%)
       0 non-contiguous inodes (0.0%)
         # of inodes with ind/dind/tind blocks: 0/0/0
 5798288 blocks used (1.58%)
       0 bad blocks
       1 large file

       0 regular files
       2 directories
       0 character device files
       0 block device files
       0 fifos
       0 links
       0 symbolic links (0 fast symbolic links)
       0 sockets
--------
       2 files
dcerouter_1024641:/home/bruce#
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on July 16, 2013, 11:31:42 am
This is what I get for sdb, sdc and sde

dcerouter_1024641:/home/bruce# fsck.ext3 -v /dev/sdb
e2fsck 1.41.3 (12-Oct-2008)
fsck.ext3: Superblock invalid, trying backup blocks...
fsck.ext3: Bad magic number in super-block while trying to open /dev/sdb

The superblock could not be read or does not describe a correct ext2
filesystem.  If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 <device>
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on July 16, 2013, 11:40:49 am
OK, you will have to do one disk at a time. Start with sdb.

Try to reassemble after you fix the superblock on sdb. You may not have to do all the disks. If that does not work, fix the superblocks on sdc and sde and then try the reassemble command. It will work eventually.
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on July 16, 2013, 11:50:14 am
Sorry to be a pain in the you know where... I have tried a number of the backup superblocks and I am getting the same error, as shown below. I also tried e3fsck, and that didn't work either; it came back as command not found.

dcerouter_1024641:/home/bruce# e2fsck -b 32768 /dev/sdb
e2fsck 1.41.3 (12-Oct-2008)
e2fsck: Bad magic number in super-block while trying to open /dev/sdb

The superblock could not be read or does not describe a correct ext2
filesystem.  If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 <device>
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on July 16, 2013, 11:55:56 am
did you try them all?
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on July 16, 2013, 11:58:11 am
No. Did you want me to try to restore a superblock on sdc and sde as well?
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on July 16, 2013, 12:05:32 pm
No, start with the sdb drive. After you restore one of the superblocks, try the assemble command and see if that works.
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on July 16, 2013, 12:16:11 pm
Tried all the backup superblocks for sdb and none of them will restore.
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on July 16, 2013, 12:45:04 pm
post the output of this again

sudo mdadm --assemble --force /dev/md1 /dev/sd[bcdef]1
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on July 16, 2013, 12:50:23 pm
I was looking through the thread again and there was no 1 on the end of that assemble command. That could cause a problem.
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on July 16, 2013, 12:54:59 pm
Haha, I think that was the problem with the assemble and superblock commands. You always have to use the full partition name, and that includes the number (/dev/sdb1, not /dev/sdb).
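
To see what each pattern actually selects, here is a small sketch that mimics the device nodes with empty files in a scratch directory. The scratch directory, the file names and the expand helper are stand-ins made up for the illustration; nothing here touches a real device.

```shell
#!/bin/sh
# Stand-in for /dev: empty files named like the disks and partitions in this thread.
demo=$(mktemp -d)
for d in b c d e f; do
    : > "$demo/sd$d"      # whole disk, e.g. /dev/sdb
    : > "$demo/sd${d}1"   # first partition, e.g. /dev/sdb1
done

# Print the names a glob pattern expands to inside the scratch directory.
# $1 is left unquoted on purpose so the shell performs the expansion.
expand() {
    ( cd "$demo" && echo $1 )
}

expand 'sd[b,c,d,e,f]'   # whole disks only; the commas are just extra characters in the set
expand 'sd[bcdef]1'      # the partitions, which is where the RAID superblocks live here
expand 'sd[bcde]1'       # the same, leaving out sdf1
```

The first pattern never matches a name ending in 1, which is why the earlier assemble attempt was looking at /dev/sdb rather than /dev/sdb1.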
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on July 16, 2013, 02:16:01 pm
here you go

dcerouter_1024641:/home/bruce# mdadm --assemble --force /dev/md1 /dev/sd[bcdef]1
mdadm: no RAID superblock on /dev/sdf1
mdadm: /dev/sdf1 has no superblock - assembly aborted

Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on July 16, 2013, 03:35:17 pm
Tried putting a 1 on the end and testing sdb, sdc, sdd and sde again, and below is what I got

dcerouter_1024641:/home/bruce# fsck.ext3 -v /dev/sdb1
e2fsck 1.41.3 (12-Oct-2008)
fsck.ext3: Group descriptors look bad... trying backup blocks...
Superblock has an invalid ext3 journal (inode 8).
Clear<y>? no

fsck.ext3: Illegal inode number while checking ext3 journal for /dev/sdb1

dcerouter_1024641:/home/bruce# fsck.ext3 -v /dev/sdc1
e2fsck 1.41.3 (12-Oct-2008)
fsck.ext3: Superblock invalid, trying backup blocks...
fsck.ext3: Bad magic number in super-block while trying to open /dev/sdc1

The superblock could not be read or does not describe a correct ext2
filesystem.  If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 <device>

dcerouter_1024641:/home/bruce# fsck.ext3 -v /dev/sdd1
e2fsck 1.41.3 (12-Oct-2008)
fsck.ext3: No such file or directory while trying to open /dev/sdd1

The superblock could not be read or does not describe a correct ext2
filesystem.  If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 <device>

dcerouter_1024641:/home/bruce# fsck.ext3 -v /dev/sde1
e2fsck 1.41.3 (12-Oct-2008)
/dev/sde1 has unsupported feature(s): FEATURE_I26 FEATURE_R26
e2fsck: Get a newer version of e2fsck!
dcerouter_1024641:/home/bruce#
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on July 16, 2013, 04:37:41 pm
Ok try

mdadm --assemble --force /dev/md1 /dev/sd[bcde]1
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on July 16, 2013, 04:51:03 pm
got this

dcerouter_1024641:/home/bruce# mdadm --assemble --force /dev/md1 /dev/sd[bcde]1
mdadm: /dev/md1 has been started with 3 drives (out of 4).
dcerouter_1024641:/home/bruce#
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on July 16, 2013, 04:57:50 pm
LMCE now reports the RAID status as damaged and when I check the status of each HDD

sdb is ok
sdc is ok
sdd is still saying removed
sde is ok
sdf is still saying removed
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on July 16, 2013, 04:58:30 pm
There you go. You only need the three. But once it's up, get that data off. One more drive failure and it's kablooey for the data. You can check the array status with

cat /proc/mdstat
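
As a rough illustration of what to look for in that output, the sketch below pulls the [configured/working] counter out of mdstat-style text and prints how many members are missing. The helper name missing_members is made up for this example, and the sample text is the md1 status that gets posted later in this thread; on the real machine the input would be /proc/mdstat itself.

```shell
#!/bin/sh
# Report how many array members are missing, based on the "[n/m]" counter
# that /proc/mdstat prints for each array (n = configured, m = working).
# Reads mdstat-style text on stdin and prints one number per array found.
missing_members() {
    sed -n 's/.*\[\([0-9][0-9]*\)\/\([0-9][0-9]*\)\].*/\1 \2/p' |
    while read -r total working; do
        echo $((total - working))
    done
}

# Sample input: the md1 status posted later in this thread.
missing_members <<'EOF'
md1 : active raid5 sdb1[0] sdd1[3] sdc1[1]
      4395407808 blocks level 5, 64k chunk, algorithm 2 [4/3] [UU_U]
EOF
```

On the live system this would be run as missing_members < /proc/mdstat. For RAID5, one missing member means the array is degraded but still readable; a second missing member means it cannot run at all.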
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on July 16, 2013, 05:02:19 pm
Thanks Crumble, you are awesome and I really appreciate your patience and help

Kind regards
Beeker

:)
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on July 16, 2013, 05:04:09 pm
No problem Beeker.   You owe me the life debt now :-p
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on July 16, 2013, 05:06:41 pm
you bet I do...........and there was going to be a special mention to you on my headstone if all that hard work didn't pay off :)

I just browsed to the RAID array and all the data is there so I will be madly copying all the data off tonight
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on July 16, 2013, 05:23:48 pm
Hey Crumble,
Do you know if it will repair itself eventually or will it stay in damaged mode?

I don't want to push my luck, but I can't get any help in getting LMCE to recognise the QNAP NAS. If you see my other recent post, I have tried everything I know, which isn't much :)
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on July 16, 2013, 05:39:36 pm
It will stay in damaged mode until sdd1 is added. I will check the other thread, though I have never used a QNAP. Once you get that data off, we can add sdd1 back in if you like.
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on July 16, 2013, 05:43:45 pm
Thanks............its a circus of HDD lights flashing in my office copying all that data off
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on August 20, 2013, 01:18:43 pm
Hi Crumble,
Sorry, I have been traveling. I finally got all the data off the RAID array and I want to try to add sdd1 back in, so what should we do?

Cheers
Beeker 
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on August 20, 2013, 01:50:30 pm
Hey Beeker, give me a fdisk -l and a df -h
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on August 20, 2013, 01:51:44 pm
No problems, I will get it shortly and come back to you

Cheers
Beeker :)
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on August 24, 2013, 12:19:28 pm
Hi Crumble,
Very sorry been out of action with work and an arm injury...........below is fdisk -l and the df -h

Cheers
Beeker

--------------------------------------------------------------------------------------------------------------
dcerouter_1024641:/home/# fdisk -l

Disk /dev/sda: 750.1 GB, 750156374016 bytes
255 heads, 63 sectors/track, 91201 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x39686ed2

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1       90446   726507463+  83  Linux
/dev/sda2           90447       91201     6064537+   5  Extended
/dev/sda5           90447       91201     6064506   82  Linux swap / Solaris

Disk /dev/sdb: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x0006a5c8

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1      182401  1465136001   83  Linux

Disk /dev/sdc: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0xb217d64b

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1               1      182401  1465136001   83  Linux

Disk /dev/sdd: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x43b7a284

   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1               1      182401  1465136001   83  Linux

Disk /dev/sde: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x2848762e

   Device Boot      Start         End      Blocks   Id  System
/dev/sde1               1      182401  1465134080   83  Linux
dcerouter_1024641:/home/#

---------------------------------------------------------------------------------------------------

dcerouter_1024641:/home/# df -h
Filesystem            Size Used Avail Use% Mounted on
rootfs                682G   44G  604G   7% /
udev                 1012M  2.9M 1009M   1% /dev
/dev/disk/by-uuid/50ef0102-13de-4f96-8c0f-23a2e75ee41f
                      682G   44G  604G   7% /
/dev/disk/by-uuid/50ef0102-13de-4f96-8c0f-23a2e75ee41f
                      682G   44G  604G   7% /dev/.static/dev
tmpfs                1012M     0 1012M   0% /lib/init/rw
varrun               1012M  392K 1012M   1% /var/run
varlock              1012M     0 1012M   0% /var/lock
tmpfs                1012M  2.2M 1010M   1% /lib/modules/2.6.27-17-generic/volatile
tmpfs                1012M     0 1012M   0% /dev/shm
/dev/md1              4.1T  3.2T  703G  83% /mnt/device/31
/dev/md1              4.1T  3.2T  703G  83% /tmp/tmp.RZtaM22685
dcerouter_1024641:/home/#
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on August 25, 2013, 10:56:50 am
ok now i need

mdadm -D /dev/md1
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on August 25, 2013, 10:57:36 am
Will grab it now
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on August 25, 2013, 11:17:36 am
Hi Crumble,
Here you go

dcerouter_1024641:/home/# mdadm -D /dev/md1
/dev/md1:
        Version : 00.90
  Creation Time : Sun Dec 27 10:14:43 2009
     Raid Level : raid5
     Array Size : 4395407808 (4191.79 GiB 4500.90 GB)
  Used Dev Size : 1465135936 (1397.26 GiB 1500.30 GB)
   Raid Devices : 4
  Total Devices : 3
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Sun Aug 25 19:10:32 2013
          State : clean, degraded
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : 5747d0ac:31c15bff:bd9f1658:0a1d2015 (local to host dcerouter)
         Events : 0.625036

    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       8       33        1      active sync   /dev/sdc1
       2       0        0        2      removed
       3       8       49        3      active sync   /dev/sdd1
 
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on August 25, 2013, 11:51:06 am
mdadm --manage /dev/md1 --add /dev/sde1


then watch the rebuild with

watch cat /proc/mdstat


Let me know when it is done. We may need to change mdadm.conf.
Doubtful though; just don't reboot till we check it.
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on August 25, 2013, 12:24:08 pm
Tried that and got this message

dcerouter_1024641:/home/# mdadm --manage /dev/md1 --add /dev/sde1
mdadm: /dev/sde1 not large enough to join array
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on August 25, 2013, 01:13:00 pm
mdadm --add /dev/md1 /dev/sde1

try this, if it says it added it watch with

watch cat /proc/mdstat
 
you should see something like this

 md1 : active raid5 sdb2[4] sdd2[3] sdc2[2] sda2[0]
         1464765696 blocks level 5, 256k chunk, algorithm 2 [4/3] [U_UU]
         [>....................]  recovery =  0.0% (84068/488255232) finish=193.4min speed=42034K/sec

Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on August 25, 2013, 03:18:47 pm
Tried that and still got the same thing as per below

dcerouter_1024641:/home/# mdadm --add /dev/md1 /dev/sde1
mdadm: /dev/sde1 not large enough to join array
dcerouter_1024641:/home/#

Checked this as well just to give you some more info

dcerouter_1024641:/home/# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : active raid5 sdb1[0] sdd1[3] sdc1[1]
      4395407808 blocks level 5, 64k chunk, algorithm 2 [4/3] [UU_U]
unused devices: <none>
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on August 25, 2013, 04:27:04 pm
Derp. I forgot, the superblocks on that drive are bad, so the partition needs to be rebuilt. Since it's only part of one RAID, just format sde1 as ext3. Then run the add command. It will rebuild itself, and mdadm.conf looks good.

Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on August 25, 2013, 04:29:23 pm
Can I format it without taking it out and putting it into another PC?
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on August 25, 2013, 04:39:17 pm
yeah do this

mdadm --fail /dev/md1 /dev/sde1
mdadm: set /dev/sde1 faulty in /dev/md1

then format ext3

then run the add command
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on August 25, 2013, 10:26:47 pm
Sorry Crumble no luck with that command see below

dcerouter_1024641:/home/# mdadm --fail /dev/md1 /dev/sde1
mdadm: set device faulty failed for /dev/sde1:  No such device

dcerouter_1024641:/home/# mdadm: set /dev/sde1 faulty in /dev/md1
bash: mdadm:: command not found

Cheers
Beeker


Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on August 26, 2013, 01:46:38 am
It's already marked faulty then. Just format it as ext3, then run the add command. There are four drives in the config, so it should grow/rebuild itself.
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on August 26, 2013, 01:50:01 am
Just confirming 

mkfs.ext3 /dev/sde1
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on August 26, 2013, 01:55:43 am
That is the one :-)
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on August 26, 2013, 02:11:33 am
Hey Beeker, quick question. Since you got all that data off and have four disks, why not go with RAID 10? You are using that QNAP for your media now, right?
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on August 26, 2013, 05:14:27 am
Not a bad idea. I assume no other data will magically appear after rebuilding the RAID5, and spot on, I moved all the data to my QNAP with 16TB of storage in RAID6.

Can I rebuild the existing RAID5 as RAID10 once I get that extra HDD back into the original RAID array?
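
For a sense of the trade-off, here is a small arithmetic sketch of usable space for four equal members under each level. It assumes the usual rules of thumb (RAID5 gives up one member's worth of space to parity; RAID10 in its default two-copy layout stores every block twice), and raid_capacity_kib is a made-up helper name. The per-member size is the Used Dev Size from the mdadm -D output above.

```shell
#!/bin/sh
# Usable capacity in KiB for n equal members of dev_kib each.
# raid5:  one member's worth of space goes to parity -> (n - 1) * size
# raid10: every block is stored twice (2-copy layout) -> n * size / 2
raid_capacity_kib() {
    level=$1; n=$2; dev_kib=$3
    case $level in
        raid5)  echo $(( (n - 1) * dev_kib )) ;;
        raid10) echo $(( n * dev_kib / 2 )) ;;
        *)      echo "unsupported level: $level" >&2; return 1 ;;
    esac
}

# Used Dev Size reported by mdadm -D: 1465135936 KiB per member.
raid_capacity_kib raid5  4 1465135936   # 4395407808, the Array Size mdadm -D reports
raid_capacity_kib raid10 4 1465135936   # 2930271872, i.e. one member's worth less
```

So with these four disks RAID10 would give up roughly 1.4 TiB of space compared with RAID5, in exchange for mirrored pairs instead of parity.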
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on August 26, 2013, 01:26:55 pm
Hey Crumble,
I tried formatting sde1 and got the output below, then tried the add and still get that same error. I have also included a fdisk -l report as an FYI.

dcerouter_1024641:/home/# mkfs.ext3 /dev/sde1
mke2fs 1.41.3 (12-Oct-2008)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
91578368 inodes, 366283520 blocks
18314176 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=0
11179 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
       4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
        102400000, 214990848

Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 33 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
dcerouter_1024641:/home/#


dcerouter_1024641:/home/# mdadm --add /dev/md1 /dev/sde1
mdadm: /dev/sde1 not large enough to join array


dcerouter_1024641:/home/# fdisk -l

Disk /dev/sda: 750.1 GB, 750156374016 bytes
255 heads, 63 sectors/track, 91201 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x39686ed2

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1       90446   726507463+  83  Linux
/dev/sda2           90447       91201     6064537+   5  Extended
/dev/sda5           90447       91201     6064506   82  Linux swap / Solaris

Disk /dev/sdb: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x0006a5c8

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1      182401  1465136001   83  Linux

Disk /dev/sdc: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0xb217d64b

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1               1      182401  1465136001   83  Linux

Disk /dev/sdd: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x43b7a284

   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1               1      182401  1465136001   83  Linux

Disk /dev/sde: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x2848762e

   Device Boot      Start         End      Blocks   Id  System
/dev/sde1               1      182401  1465134080   83  Linux
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: mkbrown69 on August 26, 2013, 03:01:50 pm
Your disks appear to be identical sizes, but the number of blocks available in the partition on sde is lower than on the others, which is the cause of the error. More than likely, your older disks were partitioned with a starting sector of 63, which was the old standard. The user-land disk partitioning tools had their defaults changed to accommodate the newer disks, where you want to align partitions on a different boundary (usually 2048). Change your units to sectors, and see what the starting sector is for your sde vs sdd. You'll likely have to delete the partition again, and re-create it while forcing the starting sector to 63 to match your existing disks.
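
The numbers already posted in this thread bear that out. A minimal sketch of the comparison, using the block counts from fdisk -l and the Used Dev Size from mdadm -D (shortfall_kib is a made-up helper name, and the check ignores the small amount of space the RAID superblock itself needs, so the real requirement is slightly higher):

```shell
#!/bin/sh
# Sizes in 1 KiB blocks, copied from the fdisk -l and mdadm -D output in this thread.
sdd1_blocks=1465136001      # an existing member partition
sde1_blocks=1465134080      # the partition that mdadm refuses
used_dev_size=1465135936    # "Used Dev Size" the array takes from each member

# Print how many KiB a partition falls short of the requirement (0 if it is big enough).
shortfall_kib() {
    part=$1; needed=$2
    if [ "$part" -ge "$needed" ]; then
        echo 0
    else
        echo $((needed - part))
    fi
}

shortfall_kib "$sdd1_blocks" "$used_dev_size"   # 0: sdd1 is large enough
shortfall_kib "$sde1_blocks" "$used_dev_size"   # 1856: sde1 is too small
echo $((sdd1_blocks - sde1_blocks))             # 1921: gap between the two partitions
```

On the fdisk version of that era, fdisk -lu /dev/sde should list the start and end in sectors rather than cylinders, which makes the differing boundary visible directly.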

Hope that helps!

/Mike
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Crumble on August 26, 2013, 03:54:01 pm
mkbrown is right, the partition on sde is smaller than the others by 1921 blocks (about 1.9 MB). That drive used to be sdf, which was never part of the RAID. I checked the thread again and that has been the case all along for that drive. Looks like we need to delete the partition with fdisk, not just format it. Follow these instructions to delete the partition.
Do not create a new partition; just save and exit once it is deleted.


http://www.howtogeek.com/106873/how-to-use-fdisk-to-manage-partitions-on-linux/

Stop the RAID with mdadm --stop /dev/md1

then run this command, this is the easy way.

 sfdisk -d /dev/sdd | sfdisk /dev/sde

then format it using mkfs

then add
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on August 26, 2013, 04:36:08 pm
Thanks guys will give it a go and let you know how it goes

Kind regards
Beeker
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: mkbrown69 on August 26, 2013, 06:08:05 pm
Quote from: Crumble
then run this command, this is the easy way.

 sfdisk -d /dev/sdd | sfdisk /dev/sde

then format it using mkfs

then add

Actually, you don't need to format the individual members of the raid array.  They contain the striped blocks of the md device, not an actual file system.  It's the /dev/mdX that actually gets a file system, which in this case already exists so DON'T do a mkfs on /dev/md1.

Just copy the partition table as per the sfdisk command above, and then add /dev/sde1 into the array.

Things should work a lot better this time around.
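
Put together, the sequence can be sketched as a dry run that only prints the commands, so they can be checked before anything is executed as root. The helper name readd_plan is hypothetical, and the sketch assumes the array is assembled and running when the add is issued.

```shell
#!/bin/sh
# Dry run: print the commands from this thread in order, without executing anything.
# Arguments: the array, a healthy member disk to copy the partition layout from,
# and the disk that is being put back into the array.
readd_plan() {
    array=$1; good_disk=$2; new_disk=$3
    echo "sfdisk -d $good_disk | sfdisk $new_disk"   # clone the partition table
    echo "mdadm --add $array ${new_disk}1"           # add the partition, with no mkfs
    echo "cat /proc/mdstat"                          # watch the rebuild progress
}

readd_plan /dev/md1 /dev/sdd /dev/sde
```

Printing first is a deliberate choice here: sfdisk overwrites the partition table on the target disk, so the source and target order is worth a second look before running it for real.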

HTH!

/Mike
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on August 26, 2013, 06:17:34 pm
OK thanks Mike :)
Title: Re: URGENT 810 Software RAID failed after power outage
Post by: Beeker on August 27, 2013, 04:48:56 am
All fixed, thanks to Crumble for a superhuman effort. I have avoided an early funeral now that the wedding photos are recovered :)

Thanks to Mike as well

Kind regards
Beeker
Title: Re: [SOLVED]: URGENT 810 Software RAID failed after power outage
Post by: mkbrown69 on August 27, 2013, 05:32:20 am
Beeker,

Glad to hear you recovered your treasured memories!  RAID is meant for availability in the case of a hardware failure in one of the "spinning platters of rust", to quote a colleague.  It won't ever replace a good backup, which is something you may wish to do in the near future...

Take care!

/Mike
Title: Re: [SOLVED]: URGENT 810 Software RAID failed after power outage
Post by: Beeker on August 27, 2013, 05:33:36 am
Thanks Mike... I do have a backup in place now to avoid sudden death in the future :)