RAID (Redundant Array of Inexpensive/Independent Disks)
RAID combines multiple physical disks into one or more logical drives. Depending on the RAID level used, the array can survive one or more drive failures.
The disks in RAID are organized into groups, also called sets or arrays: a combination of drives forms a RAID array (or RAID set). An array needs a minimum of 2 disks connected to a RAID controller, which presents them as a logical volume; more drives can be added to the group. Only one RAID level can be applied to a group of disks. RAID is used when we need better performance, redundancy, or both.
Software RAID has lower performance because it consumes host resources (CPU and memory). The RAID software must be loaded before data can be read from software RAID volumes, which means the OS has to boot far enough to load it first. Software RAID needs no dedicated hardware, so it requires no extra investment.
Hardware RAID has high performance. It uses a dedicated RAID controller, typically built as a PCI Express card, so it does not use host resources. The controller has NVRAM cache for reads and writes, and a battery backup preserves the cached data if power fails, even during a rebuild. It is a costly investment, usually justified only at large scale.
Featured Concepts of RAID
- Parity lets RAID regenerate lost content from saved parity information. RAID 5 and RAID 6 are based on parity.
- Striping splits data into chunks and spreads them evenly across multiple disks, so no single disk holds the full data. If we use 3 disks, roughly one third of our data will be on each disk.
- Mirroring is used in RAID 1 and RAID 10. Mirroring means keeping a copy of the same data: in RAID 1, the same content is also saved to the other disk.
- Hot spare is a spare drive in our server that can automatically replace a failed drive. If any one of the drives in our array fails, the hot spare is brought in and the array is rebuilt onto it automatically.
- Chunk is the unit of data written to one disk before moving on to the next; it can be as small as 4KB or larger. Choosing a suitable chunk size can improve I/O performance.
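To make the parity concept above more concrete, here is a minimal sketch, assuming single XOR parity of the kind RAID 5-style schemes commonly use. The function names `parity` and `rebuild` and the sample values are hypothetical and exist only for illustration; a real array does this per chunk inside the RAID layer, not in shell.

```shell
#!/bin/sh
# Illustrative only: small integers stand in for data blocks.

# parity BLOCK... -> prints the XOR of all arguments
parity() {
    p=0
    for block in "$@"; do
        p=$(( p ^ block ))
    done
    echo "$p"
}

# rebuild PARITY SURVIVOR... -> prints the missing block.
# XOR-ing the parity with every surviving block cancels them out,
# leaving only the lost block.
rebuild() {
    parity "$@"
}

d1=165 d2=60 d3=15
p=$(parity "$d1" "$d2" "$d3")

# Pretend the disk holding d2 failed and recover it from the rest.
echo "recovered: $(rebuild "$p" "$d1" "$d3")"
```

The same XOR operation both creates the parity and recovers a lost block, which is why one parity block per stripe is enough to survive one failed disk.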
RAID comes in various levels. Here we will look only at the RAID levels that are commonly used in real environments.
- RAID 0 = Striping
- RAID 1 = Mirroring
- RAID 5 = Single Disk Distributed Parity
- RAID 6 = Double Disk Distributed Parity
- RAID 10 = Combination of Mirror & Stripe (Nested RAID)
RAID 0 (or) Striping
This level stripes the data equally across all available drives, giving very high read and write performance but offering no fault tolerance or redundancy. If any one drive fails, the whole array is lost, so it is not suitable for an organization looking for redundancy; it is preferred where high performance is required.
Calculation:
- No. of disks: 5
- Size of each disk: 100GB
- Usable disk size: 500GB
RAID 1 (or) Mirroring
This level mirrors the data on drive 1 to drive 2. It offers 100% redundancy, as the array will continue to work even if either one of the disks fails. Organizations looking for better redundancy can opt for this solution, but cost can become a factor because only half of the raw capacity is usable.
Calculation:
- No. of disks: 2
- Size of each disk: 100GB
- Usable disk size: 100GB
RAID 5 (or) Distributed Parity
RAID 5 is widely used in enterprise environments. It works by the distributed parity method: if a drive fails, its data is rebuilt from the data and parity information left on the remaining good drives. This protects our data from a single drive failure.
Assume we have 4 drives. If one of the drives fails, we replace it and rebuild the new drive from the parity information, which is spread across all 4 drives. If we have four 1TB hard drives, the equivalent of one drive is used for parity: about 256GB on each drive holds parity information, and the remaining 768GB on each drive is available for user data. RAID 5 can survive a single drive failure. If more than one drive fails, data will be lost.
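As a rough illustration of what "distributed" means here, the sketch below assumes a simple rotation in which the parity block moves back one disk on every stripe. Real implementations offer several layouts, so treat the formula and the hypothetical `parity_disk` helper as an illustration, not a description of any specific controller.

```shell
#!/bin/sh
# parity_disk STRIPE DISKS -> prints the 0-based index of the disk that holds
# parity for the given stripe, ASSUMING a simple backward rotation.
parity_disk() {
    stripe=$1 disks=$2
    echo $(( disks - 1 - stripe % disks ))
}

# Show parity placement for the first 4 stripes of a 4-disk array.
s=0
while [ "$s" -lt 4 ]; do
    echo "stripe $s: parity on disk $(parity_disk "$s" 4)"
    s=$(( s + 1 ))
done
```

Over four stripes every disk holds parity exactly once, so each disk ends up giving about a quarter of its space to parity.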
RAID 6 (or) Double Distributed Parity
RAID 6 works like RAID 5 but keeps two sets of distributed parity. It is mostly used in arrays with a large number of drives. We need a minimum of 4 drives, and even if 2 drives fail we can rebuild the data after replacing the failed drives with new ones.
Writes are slower than on RAID 5, because two parity blocks must be calculated and written for every stripe. Speed is about average when we use a hardware RAID controller. If we have six 1TB hard drives, the capacity of 4 drives is used for data and the capacity of 2 drives is used for parity, with the parity spread across all six drives.
RAID 10 (or) Mirror & Stripe
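RAID 10 is also written as RAID 1+0; the reverse nesting, 0+1, is RAID 01. Both combine mirroring and striping. In RAID 10 the disks are mirrored first and the mirrored pairs are then striped. In RAID 01 the disks are striped first and the stripe sets are then mirrored. RAID 10 needs a minimum of 4 drives and is generally preferred over RAID 01 because it tolerates more combinations of drive failures.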
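The usable-capacity figures given for the levels above can be reproduced with a small helper. This is a sketch that assumes all disks in the array are the same size; the function name `raid_usable` is hypothetical.

```shell
#!/bin/sh
# raid_usable LEVEL DISKS SIZE_GB -> prints usable capacity in GB.
# Assumes equal-sized disks.
raid_usable() {
    level=$1 disks=$2 size=$3
    case $level in
        0)  echo $(( disks * size )) ;;          # striping: all space usable
        1)  echo "$size" ;;                      # mirror: one disk's worth
        5)  echo $(( (disks - 1) * size )) ;;    # one disk's worth of parity
        6)  echo $(( (disks - 2) * size )) ;;    # two disks' worth of parity
        10) echo $(( disks / 2 * size )) ;;      # half the disks hold copies
        *)  echo "unsupported level: $level" >&2; return 1 ;;
    esac
}

raid_usable 0 5 100     # RAID 0, five 100GB disks
raid_usable 1 2 100     # RAID 1, two 100GB disks
raid_usable 5 4 1024    # RAID 5, four 1TB disks
raid_usable 6 6 1024    # RAID 6, six 1TB disks
```

Comparing the output for the same disk count across levels is a quick way to see how much capacity each level trades for redundancy.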
LAB ON RAID
Configuring RAID 0
# mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdd /dev/sde
# mdadm --detail /dev/md0 (or /dev/md/md0)
Note: The default chunk size is 512K, which means that if your data size is 1MB, 512K will be stored on each of the two disks.
# mkfs -t xfs /dev/md0 (or use ext4 instead of xfs)
# mkdir /raid0
# mount /dev/md0 /raid0
# df -Th
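The chunk placement described in this note can be sketched with a little arithmetic. This is a simplified model of striping, assuming chunks are handed out to the disks in round-robin order; the helper name `chunk_disk` is hypothetical.

```shell
#!/bin/sh
# chunk_disk OFFSET_KB CHUNK_KB DISKS -> prints the 0-based index of the disk
# that holds the chunk containing the given offset in a striped array.
chunk_disk() {
    offset_kb=$1 chunk_kb=$2 disks=$3
    echo $(( offset_kb / chunk_kb % disks ))
}

# A 1MB write on a 2-disk array with 512K chunks touches two chunks:
echo "first 512K  -> disk $(chunk_disk 0 512 2)"
echo "second 512K -> disk $(chunk_disk 512 512 2)"
```

With a smaller chunk size the same 1MB would be split into more pieces, which is why the chunk size affects how I/O is spread over the disks.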
Configuring RAID 1
# mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sde /dev/sdf
Configuring RAID 5
# mdadm --create /dev/md5 --level=5 --raid-devices=3 /dev/sdg /dev/sdh /dev/sdi
# mdadm --detail /dev/md5
# cat /proc/mdstat
# mkfs -t xfs /dev/md5 (or ext4, as for RAID 0)
# mkdir /raid5
# mount /dev/md5 /raid5
# mdadm --detail --scan
# mdadm --detail --scan >> /etc/mdadm.conf
Note: Because the array is now recorded in /etc/mdadm.conf, your RAID will still be active after you reboot the system.
# mdadm /dev/md5 -f /dev/sdg (To make one device faulty)
# mdadm --detail /dev/md5 (To check the device status)
# df -Th
# cd /raid5 (Still all good)
How to recover or remove the faulty disk
# mdadm /dev/md5 -r /dev/sdg (To remove the device)
# mdadm /dev/md5 -a /dev/sdg (To add it again)
# mdadm --detail /dev/md5 (It will show the spare rebuilding)
Note: If you have configured a hot spare disk, it will automatically replace the failed disk.
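For reference, a hot spare could be configured along the following lines. This is a hedged sketch: `/dev/sdj` is a hypothetical fourth device, and the `run` wrapper is an illustrative helper that only prints the commands unless `DRY_RUN=0` is set, since creating an array destroys existing data on the member disks.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 (as root, with real devices) to actually run them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create a 3-disk RAID 5 with one hot spare (/dev/sdj is hypothetical).
run mdadm --create /dev/md5 --level=5 --raid-devices=3 --spare-devices=1 \
    /dev/sdg /dev/sdh /dev/sdi /dev/sdj

# Alternatively, add a spare to an existing, healthy array.
run mdadm /dev/md5 --add /dev/sdj
```

A device added to a healthy array is kept as a spare; mdadm starts rebuilding onto it only when an active member fails.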
Steps to remove the RAID
# umount /raid5 (First unmount the disk)
# mdadm --stop /dev/md5 (This command will stop the device)
# mdadm --detail /dev/md5 (Now it will show no such file or directory)
That’s it for this article; hope you enjoyed it. Please share it if you found it useful.