Software RAID Howto

Red Hat Enterprise Linux 3 doesn't come with a good guide on how to install and manage a RHEL3 system on a pair of mirrored disks using software RAID, so here is mine. This guide should work equally well for the RHEL clones, e.g. Whitebox Linux, CentOS, Tao Linux, etc.
Installing RHEL

The machine I installed onto was a Pentium 4 with 1GB of RAM and two 80GB Maxtor IDE hard disks. I booted from RHEL disc 1 and started working through the installer.

At the point where disk partitioning takes place, I chose Disk Druid (instead of fdisk or automatic partitioning) to partition the disks. On each disk I created a 100MB software RAID primary partition, a 512MB Linux swap partition, and a roughly 79GB software RAID partition filling the rest of the disk. I made the two 100MB partitions a single RAID 1 device mounted on /boot, and the two large partitions a RAID 1 device mounted on /. The rest of the install proceeds as normal.

When the machine reboots back into RHEL, it will have working software RAID; however, the boot loader will only be installed on the first disk (/dev/hda). To install it on the second disk (/dev/hdc) as well, we need to run grub.

$ grub

grub> device (hd0) /dev/hdc
grub> root (hd0,0)
Filesystem type is ext2fs, partition type 0xfd
grub> setup (hd0)
Checking if "/boot/grub/stage1" exists... no
Checking if "/grub/stage1" exists... yes
Checking if "/grub/stage2" exists... yes
Checking if "/grub/e2fs_stage1_5" exists... yes
Running "embed /grub/e2fs_stage1_5 (hd0)"... 15 sectors are embedded.
succeeded
Running "install /grub/stage1 (hd0) (hd0)1+15 p (hd0,0)/grub/stage2 /grub/grub.conf"... succeeded
Done.

The next thing to do is to take a backup of the partition table on the disk; you will need this to restore onto a replacement disk. You can get it by running fdisk and picking option 'p' (print the partition table).

/dev/hda
   Device Boot   Start      End    Blocks  Id System
/dev/hda1    *       1      203   102280+  fd Linux raid autodetect
/dev/hda2          204     1243    524160  82 Linux swap
/dev/hda3         1244   158816  79416792  fd Linux raid autodetect

Monitoring the RAID array

This is for my setup with two disks, /dev/hda and /dev/hdc, each holding identical data.

$ cat /proc/mdstat
Personalities : [raid1]
read_ahead 1024 sectors
Event: 1
md0 : active raid1 hda3[0] hdc3[1]
119925120 blocks [2/2] [UU]

This gives the status of the RAID array. If both disks are operating, the relevant line looks like this

md0 : active raid1 hdc3[1] hda3[0]

If it’s broken and only one disk is operating it looks like this

md0 : active raid1 hdc3[1]

If it’s recovering from a failed disk it looks like this

md0 : active raid1 hda3[2] hdc3[1]
...
[>....................]  recovery =  3.0% (.../...) finish=128min speed=10000K/sec
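Those status lines are easy to check from a script. The following is a minimal sketch that flags a degraded mirror by looking for a `_` in the `[UU]` field; it runs against a sample of the degraded output from above, but on a live system you would read /proc/mdstat itself, as the comment notes:

```shell
#!/bin/sh
# Minimal degraded-array check. On a real system, replace the sample
# text with:  mdstat=$(cat /proc/mdstat)
mdstat='md0 : active raid1 hdc3[1]
      79416792 blocks [2/1] [U_]'

# A '_' inside the [..] status field means a mirror half is missing.
if printf '%s\n' "$mdstat" | grep -q '\[U*_U*\]'; then
    echo "array degraded"
else
    echo "array healthy"
fi
```

Run from cron, a script like this gives you a crude but effective failure alarm.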

More information comes from

mdadm --query --detail /dev/md0

…. lots of stuff …

   Number   Major   Minor   RaidDevice   State
      0       0       0        0         faulty removed
      1      22       3        1         active sync   /dev/hdc3

This tells us that device 0 is missing – device 1 is working fine.

In theory the mdmonitor service (which runs mdadm in monitor mode) can watch the arrays for you and send mail when a disk fails, so you don't have to poll /proc/mdstat by hand.
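For mdadm's monitor mode to send mail it needs a config file telling it which arrays to watch and where to report. A minimal /etc/mdadm.conf sketch, assuming the array layout described above; the mail address is a placeholder:

```
# /etc/mdadm.conf -- sketch; adjust devices and address to suit
DEVICE /dev/hda* /dev/hdc*
ARRAY /dev/md0 devices=/dev/hda3,/dev/hdc3
ARRAY /dev/md1 devices=/dev/hda1,/dev/hdc1
MAILADDR root@localhost
```

With this in place, starting the mdmonitor service (or running mdadm --monitor --scan in daemon mode) will mail the listed address on failure events.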
How to restore from a broken RAID array

In this case /dev/hda has failed and I'm inserting a replacement disk. I start by booting the machine from install disc 1 and entering rescue mode by typing 'linux rescue' at the CD's boot prompt.

Do not mount any disks, or set up the network. You will be dropped into a command prompt.

Partition the new disk with the same partition table as the old disk. It is very important to make sure you partition the correct disk; you may wish to unplug the working disk during this step to protect yourself against user error.

$ fdisk /dev/hda

n (new)
p 1 (partition #1)
1 203 (start and end cylinders)
t 1 fd (set the partition type to linux raid)

n
p 2
204 1243
t 2 82 (set the partition type to linux swap)

n
p 3
1244 158816
t 3 fd (set the partition type to linux raid)

I then boot the machine from its working disk. I then need to add the replacement disk's partitions back into the arrays and trigger the rebuild.

mdadm --manage /dev/md0 --add /dev/hda3
mdadm --manage /dev/md1 --add /dev/hda1

The new disk has no boot sector; that isn't covered by the RAID array, so we need to write it back to the disk as before.

$ grub

grub> device (hd0) /dev/hda
grub> root (hd0,0)
Filesystem type is ext2fs, partition type 0xfd
grub> setup (hd0)
Checking if "/boot/grub/stage1" exists... no
Checking if "/grub/stage1" exists... yes
Checking if "/grub/stage2" exists... yes
Checking if "/grub/e2fs_stage1_5" exists... yes
Running "embed /grub/e2fs_stage1_5 (hd0)"... 15 sectors are embedded.
succeeded
Running "install /grub/stage1 (hd0) (hd0)1+15 p (hd0,0)/grub/stage2 /grub/grub.conf"... succeeded
Done.

At this point our system is fully restored.
Notes

It's entirely possible to do the recovery by booting from the working disk rather than a rescue CD, but this increases the chance of accidentally destroying all your data. I'd recommend not doing that until you can perform a recovery with a CD without referring to this guide at any point.

Read more at http://ex-parrot.com/~pete
