RAID DBMS
Why RAID?
Having a large number of disks in a system presents opportunities for improving the rate at
which data can be read or written, if the disks are operated in parallel. Several independent
reads or writes can also be performed at the same time. Furthermore, this setup offers the
potential for improving the reliability of data storage, because redundant information can be
stored on multiple disks. Thus, failure of one disk does not lead to loss of data.
• RAID level 0 refers to disk arrays with striping at the level of blocks, but without any
redundancy (such as mirroring or parity bits). Figure 11.4a shows an array of size 4.
• RAID level 1 refers to disk mirroring with block striping. Figure 11.4b shows a mirrored
organization that holds four disks' worth of data.
• RAID level 4, block-interleaved parity organization, uses block-level striping, like RAID 0,
and in addition keeps a parity block on a separate disk for corresponding blocks from N other
disks. This scheme is shown pictorially in Figure 11.4e. If one of the disks fails, the parity block
can be used with the corresponding blocks from the other disks to restore the blocks of the failed
disk.
A block read accesses only one disk, allowing other requests to be processed by the other disks.
Thus, the data-transfer rate for each individual access is lower than it would be if the block
were spread across all the disks, but multiple read accesses can proceed in parallel, leading to
a higher overall I/O rate. The transfer rate for large reads is high, since all the disks can be
read in parallel; large writes also have high transfer rates, since the data and parity can be
written in parallel.
Small independent writes, on the other hand, cannot be performed in parallel. A write of a block
has to access the disk on which the block is stored, as well as the parity disk, since the parity
block has to be updated. Moreover, both the old value of the parity block and the old value of the
block being written have to be read for the new parity to be computed. Thus, a single write
requires four disk accesses: two to read the two old blocks, and two to write the two blocks.
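The parity mechanics described above can be illustrated with a minimal Python sketch. It assumes that the parity block is the bytewise XOR of the corresponding data blocks, which is the usual choice for block-interleaved parity but is not spelled out in the text. The function names and the tiny 4-byte blocks are hypothetical and serve only as an illustration.

```python
from functools import reduce

def xor_blocks(*blocks: bytes) -> bytes:
    """Bytewise XOR of equal-length blocks (assumed parity function)."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))

def reconstruct(surviving: list[bytes], parity: bytes) -> bytes:
    """Rebuild the block of a failed disk from the surviving blocks and parity."""
    return xor_blocks(parity, *surviving)

def small_write(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """Return the new parity block for a single-block write.

    The caller performs four disk accesses in total: read old data,
    read old parity, write new data, write new parity.
    """
    return xor_blocks(old_parity, old_data, new_data)

# Four data blocks (N = 4) and their parity block.
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40",
        b"\xaa\xbb\xcc\xdd", b"\x0f\x0e\x0d\x0c"]
parity = xor_blocks(*data)

# The disk holding data[2] fails: recover its block from the rest plus parity.
recovered = reconstruct(data[:2] + data[3:], parity)

# Overwrite data[1] and update the parity without reading the other data disks.
new_block = b"\xff\x00\xff\x00"
new_parity = small_write(data[1], parity, new_block)
```

Note that small_write needs only the old data block and the old parity block, which is why a small write touches exactly two disks, each of them twice.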
• RAID level 5, block-interleaved distributed parity, improves on level 4 by partitioning data and
parity among all N + 1 disks, instead of storing data in N disks and parity in one disk. In level 5,
all disks can participate in satisfying read requests, unlike RAID level 4, where the parity disk
cannot participate, so level 5 increases the total number of requests that can be met in a given
amount of time. For each set of N logical blocks, one of the disks stores the parity, and the other
N disks store the blocks.
Figure 11.4f shows the setup. The P’s are distributed across all the disks. For example, with an
array of 5 disks, the parity block, labelled Pk, for logical blocks 4k, 4k+1, 4k+2, 4k+3 is stored in
disk (k mod 5)+1; the corresponding blocks of the other four disks store the 4 data blocks 4k to
4k + 3. The following table indicates how the first 20 blocks, numbered 0 to 19, and their parity
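blocks are laid out. The pattern shown gets repeated on further blocks.

Disk 1   Disk 2   Disk 3   Disk 4   Disk 5
P0       0        1        2        3
4        P1       5        6        7
8        9        P2       10       11
12       13       14       P3       15
16       17       18       19       P4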
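The placement rule can also be expressed as a short Python sketch. The parity position follows the formula in the text, (k mod 5) + 1. The assumption that data blocks fill the remaining disks in ascending disk order is mine, and the function name raid5_layout is hypothetical.

```python
def raid5_layout(num_stripes: int, num_disks: int = 5) -> list[list[str]]:
    """Return one row per stripe k; column d - 1 holds what disk d stores."""
    data_per_stripe = num_disks - 1
    rows = []
    for k in range(num_stripes):
        parity_disk = (k % num_disks) + 1      # disks are numbered from 1
        next_block = data_per_stripe * k       # first data block of stripe k
        row = []
        for disk in range(1, num_disks + 1):
            if disk == parity_disk:
                row.append(f"P{k}")
            else:
                # Assumption: data blocks go to the other disks in ascending order.
                row.append(str(next_block))
                next_block += 1
        rows.append(row)
    return rows

layout = raid5_layout(5)
for row in layout:
    print(" ".join(f"{cell:>3}" for cell in row))
```

Calling raid5_layout with more than five stripes shows the parity position wrapping back to disk 1, which is the repetition of the pattern mentioned above.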
• RAID level 6, the P + Q redundancy scheme, is much like RAID level 5, but stores extra
redundant information to guard against multiple disk failures. Instead of using parity, level 6 uses
error-correcting codes such as the Reed–Solomon codes (see the bibliographical notes). In the
scheme in Figure 11.4g, 2 bits of redundant data are stored for every 4 bits of data (unlike 1
parity bit in level 5), and the system can tolerate two disk failures.
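To make the idea of tolerating two failures concrete, here is a minimal Python sketch of one well-known P + Q construction over GF(2^8): P is the XOR parity, and Q is a weighted sum with Reed–Solomon-style coefficients. This is an illustrative assumption and not necessarily the code used in Figure 11.4g. The field polynomial 0x11D, the generator 2, and all function names are choices made for this example. It handles one byte per disk and only the case where two data disks are lost.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x^2 + 1 (0x11D)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result

def gf_pow(a: int, n: int) -> int:
    """Raise a to the n-th power by repeated multiplication."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result

def gf_inv(a: int) -> int:
    """Inverse of a nonzero element: a^254, since a^255 = 1 in GF(2^8)."""
    return gf_pow(a, 254)

def encode(data: list[int]) -> tuple[int, int]:
    """Return (P, Q) for one byte taken from each data disk."""
    p = q = 0
    for i, d in enumerate(data):
        p ^= d
        q ^= gf_mul(gf_pow(2, i), d)
    return p, q

def recover_two(data: list, x: int, y: int, p: int, q: int) -> tuple[int, int]:
    """Recover the bytes of failed data disks x and y (their entries are ignored)."""
    a, b = p, q
    for i, d in enumerate(data):
        if i not in (x, y):
            a ^= d                          # a ends as d_x ^ d_y
            b ^= gf_mul(gf_pow(2, i), d)    # b ends as g^x*d_x ^ g^y*d_y
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    d_y = gf_mul(b ^ gf_mul(gx, a), gf_inv(gx ^ gy))
    return a ^ d_y, d_y

original = [0x12, 0x34, 0x56, 0x78]         # 4 data bytes, 2 redundant bytes
p, q = encode(original)
damaged = [0x12, None, 0x56, None]          # data disks 1 and 3 have failed
restored = recover_two(damaged, 1, 3, p, q)
```

Failures that involve the P or Q disk itself are simpler: the lost redundancy is recomputed from the data, and a single lost data byte can be rebuilt from P alone as in level 5.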
RAID level 0 is used in high-performance applications where data safety is not critical. Since
RAID levels 2 and 4 are subsumed by RAID levels 3 and 5, the choice of RAID levels is
restricted to the remaining levels. Bit striping (level 3) is rarely used, since block striping (level
5) gives equally good data-transfer rates for large transfers while using fewer disks for small transfers.
For small transfers, the disk access time dominates anyway, so the benefit of parallel reads
diminishes. In fact, level 3 may perform worse than level 5 for a small transfer, since the transfer
completes only when corresponding sectors on all disks have been fetched; the average latency
for the disk array thus becomes very close to the worst-case latency for a single disk, negating
the benefits of higher transfer rates. Level 6 is not currently supported by many RAID
implementations, but it offers better reliability than level 5 and can be used in applications where
data safety is very important.
The choice between RAID level 1 and level 5 is harder to make. RAID level 1 is popular for
applications such as storage of log files in a database system, since it offers the best write
performance. RAID level 5 has a lower storage overhead than level 1, but has a higher time
overhead for writes. For applications where data are read frequently, and written rarely, level 5 is
the preferred choice.
Disk storage capacities have been growing at a rate of over 50 percent per year for many years,
and the cost per byte has been falling at the same rate. As a result, for many existing database
applications with moderate storage requirements, the monetary cost of the extra disk storage
needed for mirroring has become relatively small (the extra monetary cost, however, remains a
significant issue for storage-intensive applications such as video data storage). Access speeds
have improved at a much slower rate (around a factor of 3 over 10 years), while the number of
I/O operations required per second has increased tremendously, particularly for Web application
servers.
RAID level 5, which increases the number of I/O operations needed to write a single logical
block, pays a significant time penalty in terms of write performance. RAID level 1 is therefore
the RAID level of choice for many applications with moderate storage requirements and high
I/O requirements.
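The write penalty can be put into rough numbers with a back-of-the-envelope sketch. It assumes two-way mirroring for level 1 (two disk accesses per logical write) and the read-modify-write sequence for level 5 (four accesses, as described for level 4 above). The function name, the disk count, and the per-disk I/O rate are hypothetical, and the estimate ignores caching and request scheduling.

```python
# Disk accesses needed for one small logical write.
ACCESSES_PER_WRITE = {
    "RAID 1": 2,  # assumption: two-way mirroring, write the block to both copies
    "RAID 5": 4,  # read old data and old parity, write new data and new parity
}

def max_small_writes_per_second(level: str, num_disks: int, iops_per_disk: float) -> float:
    """Rough upper bound on small logical writes per second for the whole array."""
    return num_disks * iops_per_disk / ACCESSES_PER_WRITE[level]

# Hypothetical array: 8 disks, each able to serve 100 accesses per second.
raid1_rate = max_small_writes_per_second("RAID 1", 8, 100)
raid5_rate = max_small_writes_per_second("RAID 5", 8, 100)
```

Under these assumptions the mirrored array sustains twice as many small writes as the level 5 array built from the same disks, at the cost of storing less data.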
RAID system designers have to make several other decisions as well. For example, how many
disks should there be in an array? How many bits should be protected by each parity bit? If there
are more disks in an array, data-transfer rates are higher, but the system would be more
expensive. If there are more bits protected by a parity bit, the space overhead due to parity bits is
lower, but there is an increased chance that a second disk will fail before the first failed disk is
repaired, and that will result in data loss.
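The trade-off in the last two sentences can be sketched numerically. The model below is an assumption, not something given in the text: disks fail independently with exponentially distributed lifetimes, and a group consists of N data disks plus one disk's worth of parity. The function names, the mean time to failure, and the repair time are hypothetical.

```python
import math

def parity_space_overhead(data_disks: int) -> float:
    """Fraction of the group's total capacity that is used for parity."""
    return 1 / (data_disks + 1)

def second_failure_probability(data_disks: int, mttf_hours: float,
                               repair_hours: float) -> float:
    """Probability that one of the surviving disks fails before repair completes.

    After one of the N + 1 disks has failed, N disks remain in the group.
    Assumes independent, exponentially distributed disk lifetimes.
    """
    return 1 - math.exp(-data_disks * repair_hours / mttf_hours)

# Compare a small group with a large one (hypothetical disk parameters).
for n in (4, 8, 16):
    overhead = parity_space_overhead(n)
    risk = second_failure_probability(n, mttf_hours=100_000, repair_hours=10)
    print(f"N={n:2d}  overhead={overhead:.3f}  risk during repair={risk:.6f}")
```

Larger groups lower the space overhead but raise the chance of a second failure during repair, which is the tension the designer has to balance.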