RAID

From LQWiki

What is RAID?

RAID stands for "Redundant Array of Independent Disks". It allows multiple physical hard drives to be used as one device. There are many different types or "levels" of RAID for different purposes.

Can I use it?

RAID capabilities are considered essential for most server-class computers and are widely implemented on servers. But there is no reason that RAID cannot be used on more basic PCs as well. Any computer capable of supporting more than one hard disk can use RAID.

Why Might I Want it?

RAID capabilities can provide several benefits:

  • Redundancy -- If one disk fails, one or more others have the same data and can prevent data loss until the failed drive is replaced.
  • Performance -- Data can be written to more than one drive at a time, improving overall transfer rates.
  • Convenience -- The space from several physical disks can be addressed as though it were a single device. This can be done without RAID using symlinks and well-designed mount points, but may be easier to set up with RAID.

The different RAID levels provide these benefits in different combinations. A linear RAID (sometimes called "concatenation") provides convenience, but no performance or redundancy benefits. RAID 0 offers performance benefits, but no redundancy. RAID 1 offers redundancy but no performance benefit. Most other RAID types offer some combination of both performance and redundancy benefits.

What Are The Drawbacks?

Redundancy costs extra writes, which can slow down your system. Error-correction involves calculations, and in software RAID these are done by your CPU. A RAID is also more complicated to set up and maintain than simply taking extra backups.

That said, the simpler forms of RAID can be a clear benefit.

What is the Difference Between Software and Hardware RAID?

RAID can be implemented either by a dedicated hardware device or through software.

In hardware RAID, the drives are attached to a controller card with a dedicated processor chip. The controller card handles the creation of the RAID and any parity calculations that must be made and presents the storage to the operating system as though each array were a single drive instead of an array of several physical drives. Using hardware RAID, an operating system does not need to know anything about RAID since it simply sees what it believes to be physical disks. True hardware RAID controllers are based on SCSI controllers or SAS (Serial-Attached-SCSI) controllers. While there are a few IDE-based or SATA-based RAID controllers that are true hardware RAID controllers in the conventional sense, in many cases these cards are actually driver-based RAID as explained below.

In software RAID, the creation of the array and all of the calculations involved are handled by software (most often by the OS itself). This does add a small amount of additional overhead to the system CPU, but in most systems it is a negligible amount.

What is Driver-based RAID or Fake-RAID?

Some "RAID cards", most notably a large number of SATA (serial ATA) RAID controllers, are marketed as though they are true hardware RAID controllers -- when in fact they are little more than plain SATA controllers that are shipped with a device driver (usually Windows-only) that implements software RAID at a driver level instead of in the OS kernel. In these devices, the driver passes the tasks of creating the arrays, calculating parity, and so on to the system CPU -- thus differing little in effect from software RAID as discussed above.

An additional drawback to such cards stems from the fact that most vendors initially provide full RAID functionality only in the Windows versions of their drivers. In Linux, many such cards must be configured as ordinary IDE or SATA controllers and then the OS is used to provide RAID functionality with normal software RAID.

See http://linux-ata.org/faq-sata-raid.html for more information.

Simple RAID Levels

There are only a few basic RAID types although they can be combined together to produce combination types.

Basic RAID types supported by Linux software RAID include Linear, RAID-0, RAID-1, RAID-4, RAID-5, RAID-6 and RAID-10.

Linear

A RAID in Linear mode offers no redundancy benefit and very little performance benefit. It would be used only because it allows the storage space on multiple physical hard drives to be addressed as a single device. As shown below, it fills the first device before writing to the next.

Physical Disk 1

data1
data2
data3

Physical Disk 2

data4
data5
xxxxx

Physical Disk 3

xxxxx
xxxxx
xxxxx

The available space for this type of RAID is the sum of the space available for each of the participating disks. The disks do not have to be of the same size. So, if I form a Linear RAID of a 10 GB and two 20 GB disks, the usable size of the array is the full 50 GB.

In the event of a drive failure, only the files stored on the affected drive would be lost. For example, in the illustration above, if disk 2 failed, only data4 and data5 would be lost.
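The fill-in-order behaviour of a Linear array can be sketched in a few lines of Python. This is only an illustration of the idea described above, not how the kernel does it; the function names are made up for the example and the sizes are treated as plain numbers (e.g. GB).

```python
from bisect import bisect_right
from itertools import accumulate


def linear_capacity(disk_sizes):
    """Usable size of a linear array: the sum of all member sizes."""
    return sum(disk_sizes)


def linear_locate(offset, disk_sizes):
    """Map an array offset to (disk index, offset within that disk).

    Disks are filled in order: the first disk holds the first size0
    offsets, the second disk the next size1 offsets, and so on.
    """
    if not 0 <= offset < sum(disk_sizes):
        raise ValueError("offset outside the array")
    ends = list(accumulate(disk_sizes))   # cumulative end of each disk
    disk = bisect_right(ends, offset)     # first disk whose end is > offset
    start = ends[disk] - disk_sizes[disk]
    return disk, offset - start


sizes = [10, 20, 20]                      # one 10 GB and two 20 GB disks
print(linear_capacity(sizes))
print(linear_locate(10, sizes))
```

With the 10 GB + 20 GB + 20 GB example from the text, offset 10 is the first location on the second disk, so anything stored below that offset is untouched if the second disk fails.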

(Linear mode is sometimes called "just a bunch of disks" -- JBOD).

RAID-0

RAID-0 is also known as "striping". It spreads data across several hard drives so that the system can be reading from several drives at once, increasing performance. Like Linear RAID, RAID-0 offers no redundancy.

RAID-0 is popular amongst gamers, for whom performance is more important than reduced reliability.

Physical Disk 1

data1
data4
xxxxx

Physical Disk 2

data2
data5
xxxxx

Physical Disk 3

data3
xxxxx
xxxxx

The available space for this type of RAID is the sum of the space available for each of the participating disks. So, if I form a RAID-0 of three 20 GB disks, the usable size of the array is the full 60 GB.

It is advisable, but not required, to use disks of identical size (the performance benefit to this RAID is reduced if one drive is substantially larger than the rest). In theory the speed of a RAID-0 array is roughly that of the slowest drive times the number of drives, although it will usually be slower in the real world.

In the event of a drive failure, since files can be spread across multiple disks, most or all of the data in the array will probably be lost.
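As a rough sketch of how striping distributes data, the following Python reproduces the three-disk illustration above. It assumes equally sized disks and round-robin placement of chunks; the function names are invented for this example. The throughput function is only the theoretical rule of thumb mentioned above (slowest drive times number of drives), not a measurement.

```python
def raid0_locate(chunk, n_disks):
    """Return (disk index, row on that disk) for a 0-based chunk number."""
    return chunk % n_disks, chunk // n_disks


def raid0_layout(n_chunks, n_disks):
    """List the chunk labels held by each disk, in on-disk order."""
    disks = [[] for _ in range(n_disks)]
    for chunk in range(n_chunks):
        disk, _row = raid0_locate(chunk, n_disks)
        disks[disk].append(f"data{chunk + 1}")
    return disks


def raid0_ideal_throughput(drive_speeds):
    """Theoretical upper bound: the slowest drive times the number of drives."""
    return min(drive_speeds) * len(drive_speeds)


for number, contents in enumerate(raid0_layout(5, 3), start=1):
    print(f"Physical Disk {number}: {contents}")
```

Because consecutive chunks of one file land on different disks, losing any one disk leaves holes in almost every larger file, which is why a single failure usually costs the whole array.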

RAID-1

We recommend RAID-1 type arrays for home users. RAID-10 gives you better performance (up to twice the speed), but in its conventional form it needs at least 4 disks.

RAID-1 is also known as "mirroring". It creates a duplicate copy of data on another hard drive (or several more) so that if one of the drives fails, no data is lost. This RAID level offers good redundancy, but no performance benefit. In fact, while read performance is equivalent to that of a single drive, write performance is a bit lower than when using a single drive.

RAID-1 is most commonly implemented using only two drives, but Linux software RAID supports the use of more than two drives -- each an exact copy of the others.

Physical Disk 1

data1
data2
data3

Physical Disk 2

data1
data2
data3

Physical Disk 3

data1
data2
data3

It is advisable, but not required, to use disks of identical size. However, the array will only use space equal to the size of the smallest drive in the array.

The available space for this type of RAID is the same as the space available on the smallest of the hard drives being used. So, if I form a RAID-1 of a 10 GB and a 20 GB disk, the usable size of the array is only 10 GB.

Since each disk in the array is an exact copy of the same data, a RAID-1 array can withstand the failure of one or several drives, so long as at least one remains intact.
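The two rules for a mirror (usable size is that of the smallest disk, and the array survives as long as one disk is left) are simple enough to write down directly. A minimal sketch, with illustrative function names:

```python
def raid1_capacity(disk_sizes):
    """Usable size of a mirror: the size of its smallest member."""
    return min(disk_sizes)


def raid1_survives(n_disks, failed_disks):
    """A mirror keeps its data while at least one member is still intact."""
    return len(set(failed_disks)) < n_disks


print(raid1_capacity([10, 20]))        # the 10 GB + 20 GB example above
print(raid1_survives(3, {0, 1}))       # two of three mirrors lost
print(raid1_survives(3, {0, 1, 2}))    # all three lost
```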

RAID-4

RAID-4 is rarely used in the real world. It mostly serves as an academic illustration, as a stepping stone to understanding RAID-5.

RAID-4 is also known as "striping with dedicated parity". It requires at least three disks to create. One of them is used exclusively for parity data and the rest contain striped data.

This RAID level offers both redundancy and performance benefits, but the performance advantage is not as significant as in RAID-0.

Physical Disk 1

data1
data3
data5

Physical Disk 2

data2
data4
data6

Physical Disk 3

parity1-2
parity3-4
parity5-6

It is advisable, but not required, to use disks of identical size. However, the array will only use space equal to the size of the smallest drive in the array.

The available space for this type of RAID is S * (N-1) where 'S' is the size of the smallest of the hard drives being used and 'N' is the number of disks in the array. So, if I form a RAID-4 of two 10 GB disks and one 20 GB disk, the usable size of the array is 20 GB.

A RAID-4 can withstand the failure of any single drive without data loss. If the failed drive is the parity drive, the array can continue to function because all of the actual data is still intact on the other drives. If the failed drive is one of the data drives, the array can continue to function by using the parity information and the data on the remaining drive(s) to calculate what data the failed drive should be storing if it were functioning. Although the array continues to function, performance is significantly reduced and the data is at risk if a second drive should fail before the first failed drive is replaced.
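The parity used by RAID-4 and RAID-5 is a byte-wise XOR of the data blocks in a stripe, which is what makes it possible to recalculate a missing block from the survivors. The following Python sketch shows the idea on tiny three-byte blocks; the block contents and the helper name are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data1 = b"\x12\x34\x56"                  # block on data disk 1
data2 = b"\xab\xcd\xef"                  # block on data disk 2
parity = xor_blocks([data1, data2])      # block on the parity disk

# Disk 1 fails: rebuild its block from the surviving data block and the parity.
rebuilt = xor_blocks([data2, parity])
assert rebuilt == data1
```

The same call works for any number of data disks: XOR-ing all surviving blocks of a stripe (data and parity) yields the one block that is missing. This is also why a degraded array is slow, since every read of the failed disk turns into reads of all the others.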

RAID-5

RAID-5 is also known as "striping with distributed parity". It requires at least three disks to create. Each of them is used for both parity data and striped data.

This RAID level offers both redundancy and performance benefits, but the performance advantage is not as significant as in RAID-0 or RAID-10.

Physical Disk 1

data1
data3
parity5-6

Physical Disk 2

data2
parity3-4
data5

Physical Disk 3

parity1-2
data4
data6

It is advisable, but not required, to use disks of identical size. However, the array will only use space equal to the size of the smallest drive in the array.

The available space for this type of RAID is S * (N-1) where 'S' is the size of the smallest of the hard drives being used and 'N' is the number of disks in the array. So, if I form a RAID-5 of three 10 GB disks, the usable size of the array is 20 GB.

A RAID-5 can withstand the failure of any single drive without data loss. When any single drive fails the array can continue to function because all of the actual data is either still intact on the other drives or able to be reconstructed from the remaining data plus the parity information on the other drives. Although the array continues to function, performance is significantly reduced and the data is at risk if a second drive should fail before the first failed drive is replaced.
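To make "distributed parity" concrete, here is a small Python sketch that generates the same layout as the illustration above: the parity block starts on the last disk and moves one disk to the left with every stripe. Real implementations offer several rotation schemes; this one is simply chosen to match the picture, and the function names are illustrative.

```python
def raid5_capacity(disk_sizes):
    """Usable size S * (N - 1), where S is the smallest member."""
    return min(disk_sizes) * (len(disk_sizes) - 1)


def raid5_layout(n_stripes, n_disks):
    """Per-disk block labels, with parity rotating from the last disk to the first."""
    disks = [[] for _ in range(n_disks)]
    chunk = 1
    for stripe in range(n_stripes):
        parity_disk = (n_disks - 1 - stripe) % n_disks
        first, last = chunk, chunk + n_disks - 2   # data chunks covered by this parity
        for disk in range(n_disks):
            if disk == parity_disk:
                disks[disk].append(f"parity{first}-{last}")
            else:
                disks[disk].append(f"data{chunk}")
                chunk += 1
    return disks


for number, contents in enumerate(raid5_layout(3, 3), start=1):
    print(f"Physical Disk {number}: {contents}")
print(raid5_capacity([10, 10, 10]))
```

Spreading the parity this way means no single disk has to absorb every parity write, which is the main practical difference from RAID-4.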

RAID-6

RAID-6 is also known as "striping with double distributed parity". It requires at least four disks to create. Each of them is used for both parity data and striped data.

This RAID level offers both redundancy and performance benefits, but the performance advantage is not as significant as in RAID-0, RAID-10, or RAID-5.

It is advisable, but not required, to use disks of identical size. However, the array will only use space equal to the size of the smallest drive in the array.

The available space for this type of RAID is S * (N-2) where 'S' is the size of the smallest of the hard drives being used and 'N' is the number of disks in the array. So, if I form a RAID-6 of four 10 GB disks, the usable size of the array is 20 GB.

A RAID-6 can withstand the failure of any two drives without data loss. When any two drives fail, the array can continue to function because all of the actual data is either still intact on the other drives or able to be reconstructed from the remaining data plus the parity information on the other drives. Although the array continues to function, performance is significantly reduced and the data is at risk if a third drive should fail before the first two failed drives are replaced.
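The capacity formulas for the parity-based levels differ only in how many disks' worth of space goes to parity (one for RAID-4 and RAID-5, two for RAID-6). A small sketch, with a made-up function name, that makes the trade-off easy to compare:

```python
def parity_raid_capacity(disk_sizes, parity_disks):
    """Usable size S * (N - P): S is the smallest member, P the parity overhead."""
    n = len(disk_sizes)
    if n < parity_disks + 2:
        raise ValueError("too few disks for this RAID level")
    return min(disk_sizes) * (n - parity_disks)


four_disks = [10, 10, 10, 10]
print(parity_raid_capacity(four_disks, 1))   # as RAID-5
print(parity_raid_capacity(four_disks, 2))   # as RAID-6
```

For the same four 10 GB disks, RAID-6 gives up one more disk of space than RAID-5 in exchange for surviving a second failure.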

RAID-10

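RAID-10 is built on RAID-1. It is a mirrored RAID type, just like RAID-1. A conventional striped-mirror RAID-10 requires at least 4 disks and an even number of drives; the Linux MD raid10 type described here is more flexible and can be created from as few as two drives, and also from an odd number of drives.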

RAID-10 comes in 3 different layouts:

near - in most cases equivalent to RAID-1, but an odd number of drives are allowed

far - the raid is laid out as multiple RAID-0 arrays, and it thus enjoys striping speed for sequential reading and writing (writing is done twice).

offset - a layout that should also be faster than layout=near

This RAID level offers both redundancy and performance benefits. For reading, the performance advantage may be as big as that of RAID-0. For writing, every block has to be written twice, which gives roughly half the performance of RAID-0.

It is advisable, but not required, to use disks of identical size. However, the array will only use space equal to the size of the smallest drive in the array.

The available space for this type of RAID is S * (N/C) where 'S' is the size of the smallest of the hard drives being used, 'N' is the number of disks in the array and C is the number of copies present (normally 2). So, if I form a RAID-10 of two 10 GB disks, the usable size of the array is 10 GB.

A RAID-10 can withstand the failure of any single drive without data loss. If you keep more than two copies of the data, it can withstand the failure of at least as many drives as the number of copies minus one. With more disks it may survive further failures, depending on which combination of disks fails. With N disks and 2 copies, the maximum number of disk failures the array can withstand is N/2, so an array of 8 disks may, with luck, survive four disk failures. Although the array continues to function after a failure, performance is reduced and the data is at risk if the drive holding the remaining copy should fail before the failed drive is replaced.
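Whether a RAID-10 survives a given set of failures depends on where the copies of each chunk live. The sketch below assumes the "near" layout places the copies of a chunk in consecutive positions across the disks, which is my reading of the md documentation; treat it as an illustration rather than a description of the kernel code. The function names are invented for the example.

```python
def raid10_capacity(disk_sizes, copies=2):
    """Usable size S * (N / C), where S is the smallest member."""
    return min(disk_sizes) * len(disk_sizes) / copies


def near_copies(chunk, n_disks, copies=2):
    """Disks holding the copies of a chunk (assumed near-layout placement)."""
    return [(chunk * copies + c) % n_disks for c in range(copies)]


def raid10_survives(failed_disks, n_disks, copies=2):
    """True if every chunk still has at least one copy on a working disk."""
    failed = set(failed_disks)
    # The placement pattern repeats after n_disks chunks, so checking those is enough.
    return all(
        any(disk not in failed for disk in near_copies(chunk, n_disks, copies))
        for chunk in range(n_disks)
    )


print(raid10_capacity([10, 10]))
print(raid10_survives({0, 2, 4, 6}, n_disks=8))   # one disk from each pair
print(raid10_survives({0, 1}, n_disks=8))         # both copies of some chunks
```

This shows the "lucky" case from the text: four failures out of eight disks are survivable only if no two of them hold the same data.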

Combination RAID Levels

Arrays can be built using other arrays just as they can from drives or partitions. This allows the creation of "nested", "multiple", or "combination" RAID types.

Combining simple RAID types with different strengths can often provide the best of both worlds. For example, RAID 0 offers great performance but no redundancy while RAID 1 offers redundancy but no performance advantage. Combining them into a RAID 0+1 or a RAID 1+0 offers both -- with even better redundancy than in a basic RAID 1.

Combinations can be formed of any RAID types supported by Linux, but the most common variations are probably:

  • RAID 0+1 -- Mirroring of stripe sets
  • RAID 1+0 -- Striping across mirror sets
  • RAID 50 -- Striping across RAID 5 sets

Naming Conventions

Although usage of these conventions is not always consistent, the general rule is that the first digit in the name describes the RAID type applied first, or at the lowest level. The second digit describes the RAID type applied second, or at the logically higher level.

For example, a RAID 0+1 (so named to prevent people from assuming that a RAID 01 is the same as a RAID 1) is created by first building two or more stripe sets (identical size and configuration) and then building a mirror from them.

A RAID 1+0 would be created in the opposite sequence -- building multiple mirrors from paired disks and then creating a stripe set across them.

RAID 0+1: Mirroring of Stripe sets

RAID 0+1 provides better performance than simple RAID 1 along with the redundancy that simple RAID 0 lacks. It requires at least four drives to implement and provides usable space equal to S*N/2 where 'S' is the size of the smallest of the hard drives being used and 'N' is the number of disks in the array. So, if I form a RAID 0+1 of four 10 GB disks, the usable size of the array is 20 GB. You are most likely better off with the Linux raid10 type in the far layout (raid10,f2) than with RAID 0+1, as raid10,f2 gives double the sequential read performance and is on par for other I/O operations.

RAID 0+1 (Mirroring of Stripe sets)

First Stripe Set

Disk 1

Data1
Data4
xxx

Disk 2

Data2
Data5
xxx

Disk 3

Data3
Data6
xxx

Second Stripe Set

Disk 4

Data1
Data4
xxx

Disk 5

Data2
Data5
xxx

Disk 6

Data3
Data6
xxx

A RAID 0+1 can withstand the failure of a single drive (or multiple drives, so long as they are all from the same stripe set).
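The survival rule for RAID 0+1 can be checked with a few lines of Python. This sketch numbers the disks from 0 and assumes each stripe set occupies a consecutive run of disks, as in the illustration above; the function name is made up for the example.

```python
from itertools import combinations


def raid01_survives(failed_disks, disks_per_stripe_set, n_sets=2):
    """True if at least one stripe set has no failed member.

    A stripe set (RAID 0) is lost as soon as any one of its disks fails,
    so the mirror only survives while one whole set is untouched.
    """
    failed = set(failed_disks)
    for s in range(n_sets):
        members = range(s * disks_per_stripe_set, (s + 1) * disks_per_stripe_set)
        if not failed.intersection(members):
            return True
    return False


# Count the two-disk failures a six-disk RAID 0+1 (two sets of three) survives.
pairs = list(combinations(range(6), 2))
survived = sum(raid01_survives(pair, disks_per_stripe_set=3) for pair in pairs)
print(f"{survived} of {len(pairs)} two-disk failures survived")
```

For the six-disk example, only the double failures that hit the same stripe set are survivable.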

RAID 1+0: Striping across Mirror sets

RAID 1+0 provides better performance than simple RAID 1 along with the redundancy that simple RAID 0 lacks. It requires at least four drives to implement and provides usable space equal to S*N/2 where 'S' is the size of the smallest of the hard drives being used and 'N' is the number of disks in the array. So, if I form a RAID 1+0 of four 10 GB disks, the usable size of the array is 20 GB. RAID 1+0 is sometimes called RAID 10, but that is easily misunderstood as the Linux MD raid10 array type, and should be avoided.

RAID 0+1 and RAID 1+0 are very similar but, by most measures of both performance and redundancy, RAID 1+0 is considered to be preferable. Even so, you are most likely better off with the Linux raid10 type in the far layout (raid10,f2) than with RAID 1+0, as raid10,f2 gives double the sequential read performance and is on par for other I/O operations.

RAID 1+0 -- Striping across Mirror sets

Mirror 1

Disk 1

Data1
Data4

Disk 2

Data1
Data4

Mirror 2

Disk 3

Data2
Data5

Disk 4

Data2
Data5

Mirror 3

Disk 5

Data3
Data6

Disk 6

Data3
Data6

A RAID 1+0 can withstand the failure of a single drive (or multiple drives, so long as they are NOT in the same mirror set).
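The survival rule for RAID 1+0 is the mirror image of the one for RAID 0+1: every mirror set must keep at least one working disk. A sketch in Python, numbering disks from 0 and assuming each mirror occupies a consecutive pair of disks as in the illustration above (the function name is invented for the example):

```python
from itertools import combinations


def raid10_nested_survives(failed_disks, n_mirrors, disks_per_mirror=2):
    """True if every mirror set still has at least one working member."""
    failed = set(failed_disks)
    return all(
        any(m * disks_per_mirror + i not in failed for i in range(disks_per_mirror))
        for m in range(n_mirrors)
    )


# Count the two-disk failures a six-disk RAID 1+0 (three mirrors of two) survives.
pairs = list(combinations(range(6), 2))
survived = sum(raid10_nested_survives(pair, n_mirrors=3) for pair in pairs)
print(f"{survived} of {len(pairs)} two-disk failures survived")
```

With six disks, the only fatal double failures are the three that take out both halves of the same mirror, which is one reason RAID 1+0 is usually considered the better of the two arrangements for redundancy.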


RAID 50: Striping across RAID 5 sets

RAID 50 provides better write performance and improved redundancy over simple RAID 5 along with the redundancy that simple RAID 0 lacks. It requires at least six drives to implement and (assuming all drives are the same size) provides usable space equal to S * (N-1) * R where 'S' is the size of the smallest of the hard drives being used, 'N' is the number of disks in each RAID 5 array, and 'R' is the number of RAID 5 sets used in the stripe set. So, if I form a RAID 50 of six 10 GB disks, the usable size of the array is 40 GB.

RAID 50 -- Striping across RAID 5 sets

RAID 5 Set 1

Disk 1

Data1
Data5
Parity 9,11

Disk 2

Data3
Parity 5,7
Data9

Disk 3

Parity 1,3
Data7
Data11

RAID 5 Set 2

Disk 4

Data2
Data6
Parity 10,12

Disk 5

Data4
Parity 6,8
Data10

Disk 6

Parity 2,4
Data8
Data12

A RAID 50 can withstand the failure of one drive in each RAID 5.
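Both the capacity formula and the failure rule for RAID 50 can be written down directly. In this sketch disks are numbered from 0 and each RAID 5 set is assumed to occupy a consecutive run of disks, as in the illustration above; the function names are illustrative only.

```python
def raid50_capacity(smallest_disk, disks_per_set, n_sets):
    """Usable size S * (N - 1) * R for R striped RAID 5 sets of N disks each."""
    if disks_per_set < 3 or n_sets < 2:
        raise ValueError("RAID 50 needs at least two RAID 5 sets of three disks")
    return smallest_disk * (disks_per_set - 1) * n_sets


def raid50_survives(failed_disks, disks_per_set, n_sets):
    """True if no RAID 5 set has lost more than one member."""
    failed = set(failed_disks)
    for s in range(n_sets):
        members = range(s * disks_per_set, (s + 1) * disks_per_set)
        if len(failed.intersection(members)) > 1:
            return False
    return True


print(raid50_capacity(10, disks_per_set=3, n_sets=2))    # six 10 GB disks
print(raid50_survives({0, 3}, disks_per_set=3, n_sets=2))
print(raid50_survives({0, 1}, disks_per_set=3, n_sets=2))
```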

Configuring Linux Software RAID

Linux software RAID can be configured in several different ways. Some installation tools allow for the creation of arrays during the OS install. The maintenance tool is mdadm.

Post-installation configuration of Linux software RAID consists of several steps:

  • Create partitions for use. Set their type to Linux raid autodetect (hex code "fd").
  • Use mdadm to create the array.
  • Create the /etc/mdadm.conf file (mdadm --detail --scan is useful to generate template ARRAY lines for mdadm.conf, but the output may require some editing).
  • Create a filesystem on the array.
  • Mount the array and/or add it to /etc/fstab.
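As a rough outline of what these steps look like on the command line, the following Python sketch only builds and prints the commands; it does not run anything. The device names, mount point and filesystem type are hypothetical placeholders, and the partitioning step is left out because it is normally done interactively. Check each command against the mdadm manual page for your version before using it.

```python
import shlex


def raid_setup_commands(array, level, members, mountpoint, fstype="ext4"):
    """Return the commands for the post-partitioning steps, without running them."""
    return [
        # Create the array from the prepared partitions.
        ["mdadm", "--create", array, f"--level={level}",
         f"--raid-devices={len(members)}", *members],
        # Print ARRAY lines that can be edited into mdadm.conf.
        ["mdadm", "--detail", "--scan"],
        # Create a filesystem on the array, then mount it.
        [f"mkfs.{fstype}", array],
        ["mount", array, mountpoint],
    ]


# Hypothetical two-disk mirror; adjust the names to your own system.
for command in raid_setup_commands("/dev/md0", 1, ["/dev/sdb1", "/dev/sdc1"], "/mnt/raid"):
    print(shlex.join(command))
```

Printing the commands first makes it easy to review them before anything touches a disk, since creating an array or a filesystem on the wrong device destroys its contents.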

Partition Drives

Partitioning is always a thorny problem. People have trouble with their root and boot file systems running in software RAID configurations, although both are possible. If hardware is managing your RAID, you probably won't be reading this. NB: Some so-called RAID controllers are nothing more than a BIOS front end for two IDE ports, e.g. the Promise FastTrak. This is software RAID in the same sense that a WinModem is a type of modem. Under the 2.6 kernel, these are typically accessed with dmraid.

You can use the partition type 'FD' (the hex value) for linux RAID autodetect. If you have the right stuff compiled in (or a terribly complicated initrd setup), the arrays should be detected at boot time, then all your configuration is complete and you can refer to the drives as /dev/md*. This is better than having post boot init scripts to sort out the raid, and allows the root partition to be raided. Oddly enough, in some cases you can have both boot and root raided without the 'fd' partition type, assuming you have a firmware 'fakeraid' controller as mentioned above. Only the kernel autodetect needs 'fd'.

Create the RAID

The raidtools/raidtools2 package is deprecated. Use the concise mdadm tool to build your software-managed RAID arrays.

Create a filesystem on the RAID

Or duplicate one. A common strategy in migrating to RAID is to create matching partitions on the new empty disk, copy the data over, then set up 'degraded' arrays.

 mdadm --create /dev/md0 --level=raid10 --raid-devices=2 missing /dev/hdg5

where hdg5 is the newly created partition. The keyword "missing" holds a place for a device that is not present yet; the missing device is added to the RAID array once the configuration of the array is otherwise complete. This allows a lower-risk migration, although backups are still essential.

Add --layout=f2 as well to create faster raid10 arrays. These cannot be used for booting, though.

Create an /etc/mdadm.conf file

This is not necessary if you have the kernel autodetecting the arrays at boot time.

booting off a RAID drive

If you want to be able to boot after your primary disk fails, a few more tips:

  • By default, the Master Boot Record (MBR) is only set up on the primary disk. You need to install the MBR on the other disk as well (with grub). The following applies to both software and firmware RAID:
  • grub and LILO can boot off a RAID-1 or raid10,n2 partition (because it looks just like an ordinary disk during boot). grub-0.97 can boot off a RAID-0 using dmraid. Alas, it can't boot off a RAID-5 partition.

http://linux-raid.osdl.org/index.php/Preventing_against_a_failing_disk

other things to set up ahead of time

alternatives to RAID

  • Linas Vepstas 2003 suggests "If you are a sysadmin contemplating the use of RAID, I strongly encourage you to use EVMS instead."

further reading