Friday, February 17, 2012

RAID, it's not just a bug spray...

Many of us tend to take RAID for granted and don’t fully understand the different RAID types and when each is appropriate to use.  In an effort to explain things a bit more I’m going to go a bit deeper into the topic and share some experiences.
First off, what is RAID?  RAID actually stands for Redundant Array of Inexpensive Disks, which most of us know is anything but inexpensive.  There are also two types of RAID from an implementation standpoint, Soft-RAID & Hardware-RAID.
In Soft-RAID all the RAID functions are typically performed by the OS. This type of RAID impacts performance of the machine since it uses the CPU & RAM for its functions.  Soft-RAID also provides no assurance in the event of corruption, since the OS that runs the Soft-RAID can itself become corrupt, causing a total failure.
Hardware-RAID uses an actual piece of hardware called a RAID-Controller.  These controllers can handle two or more drives directly attached to them and have their own CPU & RAM to perform the necessary RAID functions thus not impacting performance of the system.  These controllers also provide the best defense against failure since they are fully independent of the OS.
Now let’s talk about the different RAID levels.  It’s not just about the level of RAID or the controller you use; a great deal also depends on the type of drive you use.  I’ll go more in depth on drive types in another blog.  When dealing with RAID it is extremely important to consider the following:
1.      How much usable storage do I need?
2.      How important is this data?
3.      Do I need mass storage or do I need performance & mass storage?
Question one is very important: you can’t simply add up the space each disk will provide and determine you have enough.  Say for example you need four terabytes of usable space. In a RAID5 you would need three drives minimum, so assume you buy three two-terabyte drives; that would mean six terabytes, right?  Well, in RAID5 one drive’s worth of capacity is consumed by parity and is not available for use as storage, so out of the three drives you are actually left with two, which would equal your four terabytes.
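To make the arithmetic concrete, here is a minimal sketch of the RAID5 calculation, assuming all drives are the same size. The function name `raid5_usable_tb` is just something made up for illustration.

```python
def raid5_usable_tb(drive_count: int, drive_size_tb: float) -> float:
    """Usable capacity of a RAID 5 set built from equal-sized drives."""
    if drive_count < 3:
        raise ValueError("RAID 5 needs at least three drives")
    # One drive's worth of space goes to parity; the rest holds data.
    return (drive_count - 1) * drive_size_tb


print(raid5_usable_tb(3, 2.0))  # three 2 TB drives -> 4.0
```

Adding a fourth two-terabyte drive to the same array would bring the usable space to six terabytes, since the parity cost stays at one drive’s worth.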
Question two, how important is this data?  Well for most of us it’s very important and for this reason we will go to great lengths and expense to protect it.  Going back to the example in question one on RAID5, your tolerance for drive loss is only one.  Meaning you can lose only one drive and not lose data; lose more than one and the data is gone.  So how do we allow for more than a single drive failure to occur and not lose it all?  Hot-Spares: we would add a fourth drive to the example above as a hot-spare.  This drive will immediately come online and take over the role of the failed drive when the drive fails.  Keep this in mind though: while you can now lose more than one drive, you cannot lose more than one at the same time!  See, the hot-spare has to fully come online and rebuild the data, and only once that rebuild is complete can another drive loss be tolerated.
Mass storage, pure performance, or both?  Very important here: while RAID5 can provide loads of storage, it is typically slow on write operations because parity has to be calculated and written.  Then take into account what type of drive you have (SATA, 10k, 15k, SSD) and things can get really slow.  Generally speaking, for performance you would want a RAID10.   RAID10 can give you the best of both worlds, but it’s also costly since your storage needs double.  If you need four terabytes of usable storage you will need to buy eight terabytes’ worth of drives!  I’ll explain more about RAID10 below.
Now let’s look at the most common RAID levels and their pros and cons.
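The cost difference can be sketched by working backwards from the usable space you need to the number of drives you have to buy. This is only a rough illustration assuming equal-sized drives; the helper `drives_needed` and its `level` strings are hypothetical names, not part of any real tool.

```python
import math


def drives_needed(usable_tb: float, drive_size_tb: float, level: str) -> int:
    """Smallest number of equal-sized drives that yields the usable space."""
    data_drives = math.ceil(usable_tb / drive_size_tb)
    if level == "raid5":
        # One extra drive's worth of space for parity, three drives minimum.
        return max(data_drives + 1, 3)
    if level == "raid10":
        # Every data drive has a mirror, four drives minimum.
        return max(data_drives * 2, 4)
    raise ValueError(f"unsupported level: {level}")


# Four terabytes usable, built from two-terabyte drives.
for level in ("raid5", "raid10"):
    count = drives_needed(4, 2, level)
    print(level, count, "drives,", count * 2, "TB raw")
```

For the four-terabyte example this works out to three drives (six terabytes raw) for RAID5 and four drives (eight terabytes raw) for RAID10.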
JBOD (Just a Bunch of Disks)
            Not truly RAID but this still needs to be covered.  JBOD is also known as a “Concatenated Disk Set” or “spanning disks”.
            Concatenation or spanning of disks is not one of the numbered RAID levels, but it is a popular method for combining multiple physical disk drives into a single virtual disk. It provides no data redundancy. As the name implies, disks are merely concatenated together, end to beginning, so they appear to be a single large disk. It may be referred to as SPAN or BIG (meaning just the words "span" or "big", not as acronyms).
Concatenation may be thought of as the inverse of partitioning. Whereas partitioning takes one physical drive and creates two or more logical drives, concatenation uses two or more physical drives to create one logical drive.
In that it consists of an array of independent disks, it can be thought of as a distant relative of RAID. Concatenation is sometimes used to turn several odd-sized drives into one larger useful drive, which cannot be done with RAID 0. For example, one could combine 80 GB, 80 GB, 160 GB, and 200 GB drives into a logical drive at 520 GB, which is often more useful than the individual drives separately.
For example, in a three-disk set, data are concatenated from the end of disk 0 (block A63) to the beginning of disk 1 (block A64), and from the end of disk 1 (block A91) to the beginning of disk 2 (block A92). If RAID 0 were used, then disk 0 and disk 2 would be truncated to 28 blocks, the size of the smallest disk in the array (disk 1), for a total size of 84 blocks.
Many Linux distributions use the terms "linear mode" or "append mode".
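The concatenation described above can be sketched in a few lines. This is a toy model, not how any particular driver does it: the function names are made up, and the 40-block size used for disk 2 is an assumption, since only the sizes of disk 0 (64 blocks) and disk 1 (28 blocks) follow from the block numbers given.

```python
def locate_block(disk_sizes, logical_block):
    """Map a logical block on a spanned volume to (disk index, block on that disk)."""
    if logical_block < 0:
        raise IndexError("block numbers start at zero")
    for disk_index, size in enumerate(disk_sizes):
        if logical_block < size:
            return disk_index, logical_block
        logical_block -= size  # skip past this disk entirely
    raise IndexError("block is beyond the end of the spanned volume")


def span_capacity(disk_sizes):
    # Spanning uses every block of every disk.
    return sum(disk_sizes)


def raid0_capacity(disk_sizes):
    # Striping truncates every disk to the size of the smallest one.
    return min(disk_sizes) * len(disk_sizes)


sizes_gb = [80, 80, 160, 200]
print(span_capacity(sizes_gb))   # 520
print(raid0_capacity(sizes_gb))  # 320

# Disk 2's size (40 blocks) is assumed for illustration.
print(locate_block([64, 28, 40], 64))  # (1, 0): block A64 is the first block of disk 1
```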


RAID level 0 (RAID0) – Striping
Minimum number of disks: 2
In a RAID 0 system data is split up into blocks that get written across all the drives in the array. By using multiple drives at the same time, this offers superior I/O performance. This performance can be enhanced further by using multiple controllers, ideally one controller per drive.
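As a rough sketch of how striping spreads blocks around, assume the simplest possible layout: one block per stripe unit, handed out to the drives in round-robin order. Real controllers use configurable stripe sizes, so treat `stripe_location` as a hypothetical illustration only.

```python
def stripe_location(logical_block: int, drive_count: int):
    """Return (drive index, block on that drive) for round-robin striping."""
    return logical_block % drive_count, logical_block // drive_count


# Consecutive blocks alternate between the two drives.
for block in range(6):
    print(block, stripe_location(block, 2))
```

Because neighbouring blocks land on different drives, a large read or write keeps all the drives busy at once, which is where the speed comes from.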
 
Advantages  
  • RAID 0 offers great performance, both in read and write operations. There is no overhead caused by parity controls.
  • All storage capacity is used, there is no drive overhead.
  • The technology is easy to implement.
Disadvantages
 
  • RAID 0 is not fault-tolerant. If one drive fails, all data in the RAID 0 array is lost. It should not be used on mission-critical systems.
 
Ideal use
 
RAID 0 is ideal for non-critical storage of data that has to be read/written at a high speed, such as temp file storage or data sources that are easily restored from backup and are used for “research”, such as read-only documents.

RAID level 1 (RAID1) – Mirroring
 
Minimum number of disks: 2
 
Data is stored twice by writing to both the data drive (or set of data drives) and a mirror drive (or set of drives). If a drive fails, the controller uses either the data drive or the mirror drive for data recovery and continues operation.
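Here is a toy model of that behaviour, using two in-memory dictionaries to stand in for the drives. The class `MirrorPair` and its methods are invented for this sketch and do not correspond to any real controller or OS interface.

```python
class MirrorPair:
    """Toy RAID 1 set: two in-memory 'drives' that hold identical blocks."""

    def __init__(self):
        self.drives = [{}, {}]
        self.failed = [False, False]

    def write(self, block: int, data: bytes) -> None:
        # Every write goes to each surviving drive.
        for index, drive in enumerate(self.drives):
            if not self.failed[index]:
                drive[block] = data

    def read(self, block: int) -> bytes:
        # Any surviving drive can serve the read.
        for index, drive in enumerate(self.drives):
            if not self.failed[index]:
                return drive[block]
        raise IOError("both drives have failed; data is lost")

    def fail(self, index: int) -> None:
        self.failed[index] = True
        self.drives[index] = {}

    def replace(self, index: int) -> None:
        # A replacement drive is filled by copying from the survivor.
        if self.failed[1 - index]:
            raise IOError("no surviving drive to copy from")
        self.drives[index] = dict(self.drives[1 - index])
        self.failed[index] = False


pair = MirrorPair()
pair.write(0, b"ledger")
pair.fail(0)               # lose the first drive
print(pair.read(0))        # still readable from the mirror
pair.replace(0)            # swap in a new drive and copy the data back
```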
 
Advantages 
  • RAID 1 offers excellent read speed and a write-speed that is comparable to that of a single drive.
  • In case a drive fails, data does not have to be rebuilt, it is simply copied to the replacement drive.
  • RAID 1 is a very simple technology.
Disadvantages
  
  • The main disadvantage is that the effective storage capacity is only half of the total drive capacity because all data get written twice.
  • Software RAID 1 solutions do not always allow a hot swap of a failed drive (meaning it cannot be replaced while the server keeps running). Ideally a hardware controller is used.  
Ideal use 
 
RAID-1 is ideal for mission critical storage, for instance for accounting systems. It is also suitable for small servers in which only two drives will be used. It is also ideal for SAN connected solutions where only the OS exists on the hardware and all the data exists on the SAN.

RAID level 5 (RAID5)
Minimum number of drives: 3
RAID 5 is the most common secure RAID level.  Data is transferred to the drives by independent read and write operations (not in parallel).  The data chunks that are written are also larger.  Instead of a dedicated parity drive, parity information is spread across all the drives.
            A RAID 5 array can withstand a single drive failure without losing data or access to data.  Although RAID 5 can be achieved in software, a hardware controller is recommended.  Often extra cache memory is used on these controllers to improve the write performance.
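To show why a single lost drive is survivable, here is a minimal sketch of parity using XOR, which is the usual way RAID 5 parity is computed. It covers one stripe across three drives only; a real array also rotates which drive holds the parity from stripe to stripe, and that is not modelled here. The helper name `xor_blocks` is made up.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data_chunks = [b"AAAA", b"BBBB"]      # chunks stored on two of the drives
parity = xor_blocks(data_chunks)      # chunk stored on the third drive

# Pretend the drive holding the first chunk has died:
# XOR of everything that survived gives the missing chunk back.
rebuilt = xor_blocks([data_chunks[1], parity])
print(rebuilt == data_chunks[0])  # True
```

Every write has to update the parity as well as the data, which is the extra work behind the slower write performance.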
Advantages
  •  Read data transactions are very fast while write data transactions are somewhat slower (due to the parity that has to be calculated).
Disadvantages 
  • Drive failures have an effect on throughput, although this is still acceptable.
  • Complex technology.  
Ideal use 
 
RAID 5 is a good all-round system that combines efficient storage with excellent security and decent performance. It is ideal for file and application servers.

RAID level 10 (RAID10) – RAID0+1
Minimum number of drives: 4
RAID 10 combines the advantages (and disadvantages) of RAID 0 and RAID 1 in one single system.  It provides security by mirroring all data on a secondary set of drives (drives 3 and 4 in a four-drive array) while using striping across each set of drives to speed up data transfers and increase capacity.
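A rough sketch of where a block ends up in such an array, assuming the drives are numbered from zero, the first half of the drives is striped, and each of those drives has its mirror at the same position in the second half. Actual controllers may lay things out differently, and `raid10_location` is a hypothetical name.

```python
def raid10_location(logical_block: int, drive_count: int):
    """Return ((primary drive, mirror drive), block index on those drives)."""
    if drive_count < 4 or drive_count % 2:
        raise ValueError("this sketch expects an even number of drives, at least four")
    half = drive_count // 2
    primary = logical_block % half          # striping picks the drive in the first set
    mirror = primary + half                 # same position in the mirrored set
    return (primary, mirror), logical_block // half


for block in range(4):
    print(block, raid10_location(block, 4))
```

Each block is written to two drives, which is why only half of the raw space is usable.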
Advantages  
  •  Read & write transactions are fast
  •  Data is protected in the event of a drive failure
Disadvantages 
  • Only half of the total space is available for use
  • Cost per useable GB / TB is doubled
  • Complex technology.
Ideal use
 
High IO systems that need fail protection such as virtualization or databases.


That’s not all, is it? No, not by a long shot: there is also RAID 2, 3, 4, 6 and 7! Most of these RAID levels are very rarely seen. There are also several nested RAID levels, which are rarely seen.
Now before I wrap this up I want to scream this: RAID IS NO SUBSTITUTE FOR BACKUPS!  You would be amazed how many times I’m asked to attempt to recover data from a failed array and no backups are available.  Please don’t be that person!
 
