Keywords: Hamming code, error-correction code, digital communication, data storage, reliable transmission, computer memory systems, satellite communication systems, single-bit error, two-bit errors.
What is a Hamming Code
Hamming codes are a class of error-correcting codes that are commonly employed in digital communication and data storage systems to detect and correct errors that may occur during transmission or storage. They were created by Richard Hamming in the 1950s and bear his name.
The central concept of Hamming codes is to introduce additional (redundant) bits to a message in order to enable the identification and correction of errors. By appending parity bits to the original message, Hamming codes can identify and correct single-bit errors.
One notable characteristic of these codes is their ability to correct any single-bit error and detect any two-bit error, which has contributed to their widespread usage in computer memory systems, satellite communication systems, and other domains where reliable data transmission is crucial.
Technical details of Hamming code
Linear binary Hamming code falls under the category of linear block codes that can correct single bit errors. For every integer p ≥ 3 (the number of parity bits), there is a (2p-1, 2p-p-1) Hamming code. Here, 2p-1 is the number of symbols in the encoded codeword and 2p-p-1 is the number of information symbols the encoder can accept at a time. All such Hamming codes have a minimum Hamming distance dmin=3 and thus they can correct any single bit error and detect any two bit errors in the received vector. The characteristics of a generic (n,k) Hamming code is given below.
With the simplest configuration: p=3, we get the most basic (7, 4) binary Hamming code. The (7,4) binary Hamming block encoder accepts blocks of 4-bit of information, adds 3 parity bits to each such block and produces 7-bits wide Hamming coded blocks.
Systematic & Non-systematic encoding
Block codes like Hamming codes are also classified into two categories that differ in terms of structure of the encoder output:
● Systematic encoding
● Non-systematic encoding
In systematic encoding, just by seeing the output of an encoder, we can separate the data and the redundant
bits (also called parity bits). In the non-systematic encoding, the redundant bits and data bits are interspersed.
Constructing (7,4) Hamming code
Hamming codes can be implemented in systematic or non-systematic form. A systematic linear block code can be converted to non-systematic form by elementary matrix transformations. A non-systematic Hamming code is described next.
This article is part of the book |
Let a codeword belonging to (7, 4) Hamming code be represented by [D7,D6,D5,P4,D3,P2,P1], where D represents information bits and P represents parity bits at respective bit positions. The subscripts indicate the left to right position taken by the data and the parity bits. We note that the parity bits are located at position that are powers of two (bit positions 1,2,4).
Now, represent the bit positions in binary.
Seeing from left to right, the first parity bit (P1) covers the bits at positions whose binary representation has 1 at the least significant bit. We find that P1 covers the following bit positions
Similarly, the second parity bit (P2) covers the bits at positions whose binary representation has 1 at the second least significant bit. Hence, P2 covers the following bit positions.
Finally, the third parity bit (P4) covers the bits at positions whose binary representation has 1 at the most significant bit. Hence, P4 covers the following bit positions
If we follow even parity scheme for parity bits, the number of 1’s covered by the parity bits must add up to an even number. Which implies that the XOR of bits covered by the parity (including the parity bits) must result in 0. Therefore, the following equations hold.
For clarity, let’s represent the subscripts in binary.
Following table illustrates the concept of constructing the Hamming code as described by R.W Hamming in his groundbreaking paper [1].
We note that the parity bits and data columns are interspersed. This is an example of non-systematic Hamming code structure. We can continue our work on the table above as it is. Or, we can also re-arrange the entries of that table using elemental transformations, such that a systematic Hamming code is rendered.
After the re-arrangement of columns, we see that the parity columns are nicely clubbed together at the end. We can also drop the subscripts given to the parity/data locations and re-index them according to our convenience. This gives the following structure to the (7,4) Hamming code.
We will use the above systematic structure in the following discussion.
Encoding process
Given the structure in Figure 3, the parity bits are calculated from the following linearly independent equations using modulo-2 additions.
At the transmitter side, a Hamming encoder implements a generator matrix – . It is easier to construct the generator matrix from the linear equations listed in equation above. The linear equations show that the information bit D1 influences the calculation of parities at P1 and P3 . Similarly, the information bit D2 influences P1 and P2, D3 influences P1, P2 & P3 and D4 influences P2 & P3.
Represented as matrix operations, the encoder accepts 4 bit message block \(\mathbf{m}\), multiplies it with the generator matrix \(\mathbf{G}\) and generates 7 bit codewords \(\mathbf{c}\). Note that all the operations (addition, multiplication etc.,) are in modulo-2 domain.
Given a generator matrix, the Matlab code snippet for generating a codebook containing all possible codewords (\(\mathbf{c} \in \mathbf{C}\)) is given below. The resulting codebook \(\mathbf{C}\) can be used as a Look-Up-Table (LUT) when implementing the encoder. This implementation will avoid repeated multiplication of the input blocks and the generator matrix. The list of all possible codewords for the generator matrix (\(\mathbf{G}\)) given above are listed in table 2.
Program: Generating all possible codewords from Generator matrix
%program to generate all possible codewords for (7,4) Hamming code
G=[ 1 0 0 0 1 0 1;
0 1 0 0 1 1 0;
0 0 1 0 1 1 1;
0 0 0 1 0 1 1];%generator matrix - (7,4) Hamming code
m=de2bi(0:1:15,'left-msb');%list of all numbers from 0 to 2ˆk
codebook=mod(m*G,2) %list of all possible codewords
Decoding process – Syndrome decoding
The three check equations for the given generator matrix (\(\mathbf{G}\)) for the sample (7,4) Hamming code, can be expressed collectively as a parity check matrix – \(\mathbf{H}\). Parity check matrix finds its usefulness in the receiver side for error-detection and error-correction.
According to parity-check theorem, for every generator matrix G, there exists a parity-check matrix H, that spans the null-space of G. Therefore, if c is a valid codeword, then it will be orthogonal to each row of H.
Therefore, if \(\mathbf{H}\) is the parity-check matrix for a codebook \(\mathbf{C}\), then a vector \(\mathbf{c}\) in the received code space is a valid codeword if and only if it satisfies \(\mathbf{c} \mathbf{H}^T=0\).
Consider a vector of received word \(\mathbf{r}=\mathbf{c}+\mathbf{e}\), where \(\mathbf{c}\) is a valid codeword transmitted and \(\mathbf{e}\) is the error introduced by the channel. The matrix product \(\mathbf{r}\mathbf{H}^T\) is defined as the syndrome for the received vector \(\mathbf{r}\), which can be thought of as a linear transformation whose null space is \(\mathbf{C}\) [2].
Thus, the syndrome is independent of the transmitted codeword \(\mathbf{c}\) and is solely a function of the error pattern \(\mathbf{e}\). It can be determined that if two error vectors \(\mathbf{e}\) and \(\mathbf{e}’\) have the same syndrome, then the error vectors must differ by a nonzero codeword.
It follows from the equation above, that decoding can be performed by computing the syndrome of the received word, finding the corresponding error pattern and subtracting (equivalent to addition in \(GF(2)\) domain) the error pattern from the received word. This obviates the need to store all the vectors as in a standard array decoding and greatly reduces the memory requirements for implementing the decoder.
Following is the syndrome table for the (7,4) Hamming code example, illustrated here.
Some properties of generator and parity-check matrices
The generator matrix \(\mathbf{G}\) and the parity-check matrix \(\mathbf{H}\) satisfy the following property
Note that the generator matrix is in standard form where the elements are partitioned as
where Ik is a k⨉k identity matrix and P is of dimension k ⨉ (n-k). When G is a standard form matrix, the corresponding parity-check matrix H can be easily determined as
In Galois Field – GF(2), the negation of a number is simply its absolute value. Hence the H matrix for binary codes can be simply written as
References
[1] R.W Hamming, “Error detecting and error correcting codes”, Bell System Technical Journal. 29 (2): 147–160, 1950.↗
[2] Stephen B. Wicker, “Error Control Systems for digital communication storage”, Prentice Hall, ISBN 0132008092, 1995.
Topics in this Chapter
|
---|
Thanks!
Hello. I like your books and is planning to purchase all three. But when I worked out the problem by hand and refer to https://en.wikipedia.org/wiki/Hamming_code, there are some discrepancies. My work is in line with https://en.wikipedia.org/wiki/Hamming_code.
My method starts with a truth table to get the booleon equations, and the venn diagram, the generator matrix and the H matrix, etc..
Please comments on your approach.
Thank you
Mong
Thanks for commenting. My aim is not to repeat the theoretical approach that you find in authoritative texts on coding theory. Rather, this is an intuitive approach to understand the idea behind code construction.