RAID: Redundant Array of Inexpensive/Independent Disks
- RAID is a technique in which multiple physical hard disks work together and act as a single logical hard disk.
- Initially, RAID stood for Redundant Array of Inexpensive Disks: larger hard disks were costly, so multiple smaller disks were used instead.
- Nowadays we use RAID mainly to increase performance and reliability, so RAID is now read as Redundant Array of Independent Disks.
- Depending on how these multiple hard drives are used, we have different RAID levels.
- In RAID 0 all the data is striped, i.e. divided equally among the available disks.
- Striping can be bit wise, byte wise or block wise.
- There is no redundancy, as the data is not copied anywhere else.
- Performance is better, as more than one disk participates in each read/write operation.
- Consider this example: if one man is asked to write A-Z, he will take longer than 2 men writing A-Z, because one of the 2 men can write A-M while the other writes N-Z at the same time, which speeds up the process.
- Hence, the more disks involved in a read/write operation, the better the performance.
- As there is no redundancy, i.e. no copy of the data is maintained, reliability is low. If any one of the disks fails, we lose the data.
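The block wise striping described above can be sketched in a few lines of Python. This is only an illustration: the names `stripe` and `unstripe`, the 4-byte block size and the use of in-memory lists as "disks" are assumptions made for the example, not part of any real RAID implementation.

```python
def stripe(data, num_disks, block_size=4):
    """Cut data into blocks and deal them round-robin across the disks."""
    disks = [[] for _ in range(num_disks)]
    for start in range(0, len(data), block_size):
        block_index = start // block_size
        disks[block_index % num_disks].append(data[start:start + block_size])
    return disks


def unstripe(disks):
    """Rebuild the original data by reading the blocks back in the same order."""
    out = []
    rows = max(len(disk) for disk in disks)
    for row in range(rows):
        for disk in disks:
            if row < len(disk):
                out.append(disk[row])
    return b"".join(out)


alphabet = b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
two_disks = stripe(alphabet, num_disks=2)
print(two_disks[0])  # blocks held by the first disk
print(two_disks[1])  # blocks held by the second disk
print(unstripe(two_disks) == alphabet)
```

Each disk holds only part of the data, so if either list is lost `unstripe` can no longer rebuild the original, which is the low reliability mentioned above.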
- In RAID 1 we perform mirroring.
- We mirror, i.e. make copies of, all the available data.
- This is expensive, as more disks are required to hold the copies.
- It is reliable, because if one disk is lost we already have another copy of its data.
- It doesn’t help write performance, as every write has to go to all the copies; reads, however, can be served from any copy.
- RAID 2 uses an Error Correcting Code (ECC), such as Hamming code, to restore damaged data.
- More than 2 bits are used for storing the ECC, so a minimum of 3 dedicated disks is required for storing it, with 1 bit on each disk.
- Data is split bit wise across the data disks (here disk 0 to disk 3), while the ECC is stored on the ECC disks (here disk 4 to disk 6).
- This is expensive, as more disks are required.
- It slows down write operations, as the ECC has to be calculated for every write, which is time consuming.
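To make the ECC idea concrete, here is a minimal sketch of a Hamming(7,4) code laid out like the example above: 4 data bits on disks 0 to 3 and 3 check bits on disks 4 to 6. The function names and the syndrome table are assumptions for this illustration, and a real RAID 2 controller would do this in hardware.

```python
def encode(data_bits):
    """Return the 7-bit word [d1, d2, d3, d4, p1, p2, p3], one bit per disk."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return [d1, d2, d3, d4, p1, p2, p3]


# Each single-bit error produces a unique pattern of failed checks (syndrome).
SYNDROME_TO_DISK = {
    (1, 1, 0): 0, (1, 0, 1): 1, (0, 1, 1): 2, (1, 1, 1): 3,  # data disks
    (1, 0, 0): 4, (0, 1, 0): 5, (0, 0, 1): 6,                # ECC disks
}


def correct(word):
    """Fix at most one flipped bit and return the 4 data bits."""
    d1, d2, d3, d4, p1, p2, p3 = word
    syndrome = (p1 ^ d1 ^ d2 ^ d4, p2 ^ d1 ^ d3 ^ d4, p3 ^ d2 ^ d3 ^ d4)
    fixed = list(word)
    if syndrome in SYNDROME_TO_DISK:  # (0, 0, 0) means no error was detected
        fixed[SYNDROME_TO_DISK[syndrome]] ^= 1
    return fixed[:4]


word = encode([1, 0, 1, 1])
damaged = list(word)
damaged[2] ^= 1                       # the bit on disk 2 gets corrupted
print(correct(damaged) == [1, 0, 1, 1])
```

The three XOR computations in `encode` are the extra work every write has to pay for.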
- In RAID 2 we used a minimum of 3 dedicated disks to store the ECC.
- In RAID 3 we use the concept of parity instead of ECC.
- In this case the parity is the XOR of A1, A2 and A3: if they contain an even number of 1s the parity is set to 0, and if they contain an odd number of 1s the parity is set to 1.
- Only 1 parity bit is needed, so this requires only 1 dedicated disk.
- RAID 3 increases the performance, as all the disks participate in every read/write operation.
- It doesn’t increase the number of simultaneous accesses: as all the disks participate in each read/write operation, no disk is free to service another read/write request.
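The XOR parity rule can be checked with a short sketch. The helper name `parity` and the sample byte values for A1, A2 and A3 are made up for the illustration; the point is only that XOR-ing the surviving pieces with the parity gives back the missing piece.

```python
from functools import reduce
from operator import xor


def parity(blocks):
    """XOR equal-length blocks together, position by position."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


a1, a2, a3 = b"\x0f\xa0", b"\x33\x0b", b"\x55\xff"
p = parity([a1, a2, a3])          # stored on the dedicated parity disk

# Suppose the disk holding A2 fails: rebuild it from the others plus the parity.
rebuilt_a2 = parity([a1, a3, p])
print(rebuilt_a2 == a2)
```

The same function serves for both computing the parity and rebuilding a lost piece, because XOR is its own inverse.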
- RAID 4 is similar to RAID 3. The only difference is that in RAID 3 we split the data bit wise (byte wise in many descriptions), but in RAID 4 we split the data block wise.
- As blocks are much bigger than bits, smaller read/write operations involve only one data disk (a write also has to update the parity disk). The other disks are free to service other read/write requests, so RAID 4 increases the number of simultaneous accesses.
- As only one data disk is involved, the speed of such a small operation is not increased.
- It is mainly suitable for larger read/write operations in which many blocks of data have to be read or written, so more than one disk is involved, which boosts the performance.
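The difference between small and large requests under block wise striping can be sketched as follows. The function name `data_disks_touched`, the 4096-byte block size and the 4 data disks are assumptions chosen for the example, and the sketch only counts data disks (for a write, the parity disk would be involved as well).

```python
def data_disks_touched(offset, length, block_size=4096, data_disks=4):
    """Return the data disks that hold any byte of the requested range."""
    first_block = offset // block_size
    last_block = (offset + length - 1) // block_size
    return sorted({block % data_disks for block in range(first_block, last_block + 1)})


print(data_disks_touched(0, 512))          # small request
print(data_disks_touched(0, 4 * 4096))     # large request
```

A small request falls inside a single block and keeps one disk busy, while a request spanning several blocks spreads over several disks and is served in parallel.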
- In RAID 3 and RAID 4 one dedicated disk was used to store the parity information. If this disk fails, we lose all of the parity information, and with it the protection against a further disk failure.
- To overcome this flaw, in RAID 5 we distribute the parity data evenly among all the disks.
- Any formula can be used to distribute it evenly (for example, as we have 5 disks here, the parity of the nth block is stored on disk (n mod 5) + 1).
- Here one parity block per stripe is stored on the disk chosen by the formula, and the actual data is divided among the other disks.
- This reduces the potential overuse of a single disk.
- As only one parity block is stored per stripe, we can recover only if a single disk fails; if more than 1 disk fails, RAID 5 cannot help in recovering the damaged data.
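The example placement formula, disk (n mod 5) + 1, is easy to try out. The function names below are assumptions for this sketch, and disks are numbered 1 to 5 as in the formula.

```python
def parity_disk(n, num_disks=5):
    """Disk (numbered from 1) that holds the parity of the nth block."""
    return (n % num_disks) + 1


def stripe_layout(n, num_disks=5):
    """Mark each disk as holding parity (P) or data (D) for the nth block."""
    chosen = parity_disk(n, num_disks)
    return ["P" if disk == chosen else "D" for disk in range(1, num_disks + 1)]


for n in range(5):
    print(n, stripe_layout(n))
```

Over any 5 consecutive blocks every disk holds the parity exactly once, so no single disk takes all of the parity traffic.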
- RAID 6 is similar to RAID 5.
- In RAID 5 we store only one parity block per stripe, but in RAID 6 we store two independent parity blocks per stripe.
- The extra parity holds additional redundant information, which helps to recover from up to two simultaneous disk failures.
- RAID 10 and RAID 01 are hybrids of RAID 0 and RAID 1.
- So they give better performance as well as reliability.
- RAID 0: striping of data
- RAID 1: mirroring of data
- In RAID 10 (RAID 1+0) we first mirror the available data and form copies of it.
- Then we stripe those copies.
- In RAID 01 (RAID 0+1) we first stripe the data.
- Then we mirror the striped data to form copies.
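The two orderings place the same blocks on different disks. The sketch below assumes 4 disks and hypothetical helper names `raid10` and `raid01`; it only models where each block ends up, not real disk I/O.

```python
def raid10(blocks):
    """Mirror first, then stripe: disks (0, 1) and (2, 3) are mirrored pairs."""
    disks = [[], [], [], []]
    for i, block in enumerate(blocks):
        pair = i % 2                      # stripe across the two mirrored pairs
        disks[2 * pair].append(block)
        disks[2 * pair + 1].append(block)
    return disks


def raid01(blocks):
    """Stripe first, then mirror: disks (2, 3) are a copy of stripe set (0, 1)."""
    disks = [[], [], [], []]
    for i, block in enumerate(blocks):
        disks[i % 2].append(block)        # stripe set
        disks[2 + i % 2].append(block)    # mirror of the stripe set
    return disks


blocks = ["A1", "A2", "A3", "A4"]
print(raid10(blocks))
print(raid01(blocks))
```

In both layouts every block exists on two disks and consecutive blocks sit on different disks, which is where the combined reliability and performance come from.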