In coding theory, an erasure code is a forward error correction (FEC) code under the assumption of bit erasures (rather than bit errors), which transforms a message of k symbols into a longer message (code word) with n symbols such that the original message can be recovered from a subset of the n symbols. The fraction r = k/n is called the code rate. The fraction k’/k, where k’ denotes the number of symbols required for recovery, is called reception efficiency. The recovery algorithm expects that it is known which of the n symbols are lost.
History
Erasure coding was invented by Irving Reed and Gustave Solomon in 1960.
There are many different erasure coding schemes. The most popular erasure codes are
Reed-Solomon coding, low-density parity-check code (LDPC codes), and turbo codes.
- Replication
- RAID (redundant array of inexpensive disks)
- Erasure coding
While technically RAID can be seen as a kind of erasure code, RAID is generally applied to an array attached to a single host computer (which is a single point of failure), while "erasure coding" generally implies multiple hosts,
Compared to block-level RAID systems, object storage erasure coding has some significant differences that make it more resilient.
Optimal erasure codes
Optimal erasure codes have the property that any k out of the n code word symbols are sufficient to recover the original message (i.e., they have optimal reception efficiency). Optimal erasure codes are maximum distance separable codes (MDS codes).
Parity check
Parity check is the special case where n = k + 1. From a set of k values <math>\{v_i\}_{1\leq i \leq k}</math>, a checksum is computed and appended to the k source values:
:<math>v_{k+1}= - \sum_{i=1}^k v_i.</math>
The set of k + 1 values <math>\{v_i\}_{1\leq i \leq k+1}</math> is now consistent with regard to the checksum.
If one of these values, <math>v_e</math>, is erased, it can be easily recovered by summing the remaining variables:
:<math>v_{e}= - \sum_{i=1 ,i \neq e }^{k+1}v_i.</math>
RAID 5 is a widely used application of the parity check erasure code using XOR. In particular, various implementations of Reed-Solomon erasure coding are used by Apache Hadoop, the RAID-6 built into Linux, Microsoft Azure, Facebook cold storage, and Backblaze Vaults.
The classical way to recover from failures in storage systems was to use replication. However, replication incurs significant overhead in terms of wasted bytes. Therefore, increasingly large storage systems, such as those used in data centers use erasure-coded storage. The most common form of erasure coding used in storage systems is Reed-Solomon (RS) code, an advanced mathematics formula used to enable regeneration of missing data from pieces of known data, called parity blocks. In a (k, r) RS code, a given set of k data blocks, called "chunks", are encoded into (k + r) chunks. The total set of chunks comprises a stripe. The coding is done such that as long as at least k out of (k + r) chunks are available, one can recover the entire data. This means a (k, r) RS-encoded storage can tolerate up to r failures. (This is different from the standard RS(n, k) notation, with n = k + r.)
RS(10,4)
Example: In RS(10, 4) code, which is used in Facebook for their HDFS (now part of Apache Hadoop), 10 MB of user data is divided into ten 1MB blocks. Then, four additional 1 MB parity blocks are created to provide redundancy. This can tolerate up to 4 concurrent failures. The storage overhead here is 14/10 = 1.4×.
In the case of a fully replicated system, the 10 MB of user data will have to be replicated 4 times to tolerate up to 4 concurrent failures. The storage overhead in that case will be 50/10 = 5.0×.
This gives an idea of the lower storage overhead of erasure-coded storage compared to full replication and thus the attraction in today's storage systems.
The Hitchhiker scheme can be combined with RS-coding to reduce the amount of computation and data-transfer required for data block reconstruction. It is also implemented as an HDFS codec, though a policy will need to be manually defined for it to be used. Alternatively, RAID N+M refers to N regular data drives with M redundancy drives, being able to recover all the data when any M drives fail.
- Any other maximum distance separable code
See also
- Forward error correction codes.
- Secret sharing (differs in that the original secret is encrypted and obscured until the decode quorum is reached)
- Spelling alphabet
- Binary erasure channel
References
External links
- Jerasure is a Free Software library implementing Reed-Solomon and Cauchy erasure code techniques with SIMD optimisations.
- Software FEC in computer communications by Luigi Rizzo describes optimal erasure correction codes
- Feclib is a near optimal extension to Luigi Rizzo's work that uses band matrices. Many parameters can be set, like the size of the width of the band and size of the finite field. It also successfully exploits the large register size of modern CPUs. How it compares to the near optimal codes mentioned above is unknown.
- Coding for Distributed Storage wiki for regenerating codes and rebuilding erasure codes.
- ECIP "Erasure Code Internet Protocol" Developed in 1996, was the first use of FEC "Forward Error correction" on the Internet. It was first used commercially to *stream live video of Sir Arthur C. Clarke in Sri Lanka to UIUC in Indiana.
