RAID Failed, now what?

  • jflynn
  • Mastermind
  • Mastermind
  • User avatar
  • Posts: 2305
  • Loc: Baker City, Oregon

Post 3+ Months Ago

I'm not sure if this is a hardware or software issue, but it started as a hardware problem, so I'll post here.

I have a server running RAID 5 with 4 disks. One of the disks failed and has been replaced.

When first booting, you have the option to enter the BIOS and then the option to enter the RAID BIOS (I think that's what it is). It says that the RAID is bootable and shows that it needs to rebuild (it's supposed to do this after fully booting).

After this screen disappears, it usually goes to the "Windows 2003" screen while it loads the OS. This isn't happening now. It just goes to a black screen. The monitor light stays green, as if it is receiving a signal.

I've tried 3 different monitors, including LCD and CRT. I've tried hitting F8 to get to safe mode.

Before I start to tear this thing apart, do any of you have any ideas?
  • ATNO/TW
  • Super Moderator
  • Super Moderator
  • User avatar
  • Posts: 23456
  • Loc: Woodbridge VA

Post 3+ Months Ago

What manufacturer is it? I had a failed RAID drive a year or so ago. Thought I was going to have to replace it and didn't have the foggiest idea what to do. First thing I did was call Dell Support (my server manufacturer). Nice thing about Enterprise support is you don't get a tier one support person. They immediately got me to a tech that knew what he was doing. Before replacing the drive he actually had me download, install and run several diagnostic tools. Turns out the drive did not have to be replaced. Just rebuilt. Honestly it was so far above me, I can't remember exactly what all he had me do, but following his instructions to the letter took care of it.

My suggestion would be to contact your server manufacturer and get support from them, even if you have to pay for it. It's worth it to have someone who knows what they are doing help you, rather than beating your head against a wall trying to figure it out.
  • grinch2171
  • Moderator
  • Genius
  • User avatar
  • Posts: 6800
  • Loc: Martinsburg, WV

Post 3+ Months Ago

I just had a drive go bad on a RAID-1. Dell sent me my new one, I pulled the bad one and slapped in the good one. No reboot was required for the rebuild to take place; it did it on the fly.

I've also had Dell servers report a drive bad when it ended up being a firmware issue. I updated the firmware and all was well.

Your RAID-5 should have handled a single drive failure without issue, replacing the drive should have fixed it, and the rebuild should have taken place without any interaction from you.

At this point I do think you are in need of support from the server manufacturer.
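
For anyone wondering why a RAID-5 should be able to shrug off a single dead drive: each stripe stores a parity block that is the XOR of its data blocks, so any one missing block can be recomputed from the ones that survive. Here is a minimal Python sketch of that idea. It is a toy illustration only, not how any particular controller implements it; the block contents and the `xor_blocks` helper are made up for the example, and real arrays rotate the parity block across the disks.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a hypothetical 4-disk array: three data blocks plus parity.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Pretend the disk holding the second data block has died.
survivors = [data[0], data[2], parity]

# XOR-ing everything that is left recomputes the missing block.
rebuilt = xor_blocks(survivors)

print(rebuilt == data[1])
```

That recomputation is essentially what a rebuild does for every stripe, which is also why a second drive failing before the rebuild finishes is fatal for the array: with two blocks missing from a stripe there is nothing left to XOR against.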
  • jflynn
  • Mastermind
  • Mastermind
  • User avatar
  • Posts: 2305
  • Loc: Baker City, Oregon

Post 3+ Months Ago

Called the manufacturer and went through several things. It appears that the HD failure also caused the OS to become corrupted.

I tried booting to safe mode and it hangs..... argh!!!!!

Just tried doing a repair. It doesn't seem to want to do that either..... [F-BOMB]
  • ATNO/TW
  • Super Moderator
  • Super Moderator
  • User avatar
  • Posts: 23456
  • Loc: Woodbridge VA

Post 3+ Months Ago

That doesn't sound very good, jflynn. I don't know what else to suggest except to keep getting advice from your support.

And grinch, now that you mention it, that was exactly what happened with me. The firmware upgrade was what fixed it. In fact, I seem to recall I sent you a PM about it when it happened.

//jflynn - was the server working with the failed drive? If so, have you thought about putting it back in to see if you can get it to boot? Then check into the firmware issue grinch suggested.
  • jflynn
  • Mastermind
  • Mastermind
  • User avatar
  • Posts: 2305
  • Loc: Baker City, Oregon

Post 3+ Months Ago

WOOHOO! IT'S FIXED! Now that I have that out of the way, I'll tell you what I did in case it happens to somebody else.

Not being able to do a repair confused me. Lots of files could not be written or deleted, and I didn't know why. I got to thinking that the actual error occurred after the hard drive failed and the array had switched over to using only 3 of the 4 drives.

It didn't know what to make of this 4th drive that was now in the set. On a hunch, I removed the 4th drive from the RAID and retried the repair from the Server '03 disk.

It came up with a strange message that I've never seen before. To paraphrase: "Fixed error on disk". It immediately asked for a reboot and then booted all the way to the login screen.

I logged in and noticed that the RAID manager wanted the fourth hard drive installed. I shut down the server, plugged the 4th disk back in, and booted.

It is currently in the rebuild process :)

Hope this explanation of my actions helps if anybody has this problem too. I know that I never want to have it again.
  • ATNO/TW
  • Super Moderator
  • Super Moderator
  • User avatar
  • Posts: 23456
  • Loc: Woodbridge VA

Post 3+ Months Ago

Nice, and congrats! I think the only thing that wasn't necessary was shutting down; you probably could have just plopped in the 4th drive. It should be hot-swappable. For the record, this post helps me remember what happened with mine and gives me a good reference for if/when it happens again.
  • jflynn
  • Mastermind
  • Mastermind
  • User avatar
  • Posts: 2305
  • Loc: Baker City, Oregon

Post 3+ Months Ago

Yeah, they are supposed to be hot-swappable, but it wasn't necessary to keep the server up.

But now..........

Last night when I left the office, the server was rebuilding the RAID. This morning the server was locked up and unresponsive. I had to do a hard restart for anything to happen. At the RAID configuration window, it showed that another of the hard drives had failed. LMFAO

Fortunately, it did so after the rebuild had been completed so the Server was still bootable.

After speaking with Tech Support, they are sending me a new RAID Cage, and 2 hard drives to swap out all of the “Original” drives.

What a pain in the A$$.
  • grinch2171
  • Moderator
  • Genius
  • User avatar
  • Posts: 6800
  • Loc: Martinsburg, WV

Post 3+ Months Ago

Speaking of bizarre RAID behavior, I just ran into one.

I went to upgrade one of my servers to 64GB of RAM for my virtualization project. I rebooted the server to enable some stuff in the BIOS and saw a message saying there was a configuration issue: press 'C' to enter setup or any key to continue. I let it continue and it said my logical drive was degraded. The server has three drives set up in a RAID-5. So I did my BIOS changes and the server rebooted, but this time I pressed C to get into the setup. In there it said that Drive2 was missing. I looked at the server and saw the drive sitting there, so I pulled it out and put it back in. The drive is currently rebuilding.

That was odd. Not a big deal since Dell will have me a drive within four hours if needed. I wonder what caused the server to think the drive was missing. Once the rebuild is done I plan on updating the BIOS and firmware.
  • jflynn
  • Mastermind
  • Mastermind
  • User avatar
  • Posts: 2305
  • Loc: Baker City, Oregon

Post 3+ Months Ago

That is weird.


© 1998-2014. Ozzu® is a registered trademark of Unmelted, LLC.