ECCLectureNotes 2

This document discusses error correcting codes, which are essential for transmitting accurate messages over noisy communication channels. It emphasizes the trade-off between redundancy and error correction capabilities, providing examples and definitions related to coding theory. The document also introduces basic concepts such as Hamming distance, coding problems, and algorithmic efficiency in the context of error correction.

Chapter 1

The main coding problem

This course is mainly about error-correcting codes and the mathematical background
needed to master these algorithms.
Error-correcting codes are used to correct errors when messages are transmitted
through a noisy communication channel, for example when sending an image from deep
space to earth.
The object of an error-correcting code is to encode the data, in general by adding a
certain amount of redundancy to the message, so that the original message can be
recovered even if a certain number of errors have occurred.

1.1 The Fundamental Question


Humans were communicating long before the digital era, and error correction is
something the human brain does almost instantly: parents can understand their
children's incorrect words, and a small number of errors is sometimes corrected
without our even noticing.
Error-correcting codes ( or sometimes just codes) are clever ways of representing data
so that one can recover the original information even if parts of it are corrupted. The
basic idea is to judiciously introduce redundancy so that the original information can be

recovered even when parts of the data have been corrupted.
In this course, we will mainly think of codes in the communication scenario. In this
framework, there is a sender who wants to send (say) k message symbols over a noisy
channel.
The sender first encodes the k message symbols into n symbols (called a codeword)
and then sends it over the channel. The receiver gets a received word consisting of n
symbols.
The receiver then tries to decode and recover the original k message symbols. Thus,
encoding is the process of adding redundancy and decoding is the process of removing
errors.
The fundamental question that will occupy our attention is the trade-off between the
amount of redundancy used and the number of errors that can be corrected by a code.
In particular, we would like to understand how much redundancy we need in order to
correct a given number of errors.

We would like to correct as many errors as possible with as little redundancy as
possible.
Note that maximizing error correction and minimizing redundancy are conflicting
goals.
We are also interested in achieving this goal with efficient encoding and decoding
methods. By efficient we mean methods that run in polynomial time.

1.2 An example: sending a path

Suppose that AWACS and Fighter have identical map grids as shown in the following
figure.
AWACS can transmit data to Fighter describing a safe route by which Fighter can
avoid enemy air defenses.
In this situation reliability is more important than speed of transmission.

If we encode the four directions North (N), South (S), East (E), West (W), the path
to be sent is then

NEEESEENEE

Naive approach

We could encode the four directions North (N), South (S), East (E), West (W) using a
binary code; one possible code is:

   C1 :  00 = N
         01 = W
         10 = E
         11 = S

The path to be sent is then:

00101010111010001010

If we introduce an error:

   00 10 10 10 11 10 11 00 10 10
   N  E  E  E  S  E  S  N  E  E

The receiver is unable to detect the error, since every pair of bits is a valid codeword.

Sending a path (Adding redundancy)

We could add redundancy in order to protect these message vectors against noise;
consider the length-3 code C2 as follows:

   C2 :  000 = N
         011 = W
         101 = E
         110 = S

The path to be sent is NEEESEENEE

   000 101 101 101 110 101 111 000 101 101

If we introduce one error, as in the seventh block above, we notice that 111 is not a
valid codeword.
Notice that if we introduce one error in any position then it will be detected.

   N    One error   Remark        W    One error   Remark
   000  100         Not a code    011  111         Not a code
   000  010         Not a code    011  001         Not a code
   000  001         Not a code    011  010         Not a code

   E    One error   Remark        S    One error   Remark
   101  001         Not a code    110  010         Not a code
   101  111         Not a code    110  100         Not a code
   101  100         Not a code    110  111         Not a code

Notice that while we can detect one error, we are unable to correct it: for example 111
could come from 101 or 011.

Error correction

In this example we consider the following length-5 code (its 4 codewords sit inside
the 2^5 = 32 binary words of length 5):

   C3 :  00000 = N
         01101 = W
         10110 = E
         11011 = S

If a single error occurs in any codeword of C3 we are able not only to detect it but actually
to correct it, since the received vector will still be closer to the transmitted one than to
any other.

   00000 10110 10110 10110 11011 10110 10110 00000 10110 10110

We will check this for 10110; the others are similar.

   E      One error   Remark       Closest vector
   10110  00110       Not a code   10110
   10110  11110       Not a code   10110
   10110  10010       Not a code   10110
   10110  10100       Not a code   10110
   10110  10111       Not a code   10110
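This nearest-codeword decoding rule can be sketched in a few lines of Python (this listing is not part of the original notes; the codewords are those of C3 above):

```python
# A minimal sketch of nearest-codeword decoding for the length-5 code C3
# from the example above (codewords pairwise at Hamming distance >= 3).

C3 = {"00000": "N", "01101": "W", "10110": "E", "11011": "S"}

def hamming(x, y):
    """Number of positions in which the words x and y differ."""
    return sum(a != b for a, b in zip(x, y))

def decode(received):
    """Return the direction of the codeword closest to the received word."""
    best = min(C3, key=lambda c: hamming(c, received))
    return C3[best]

# A single error in any codeword is corrected:
print(decode("10111"))  # E  (one error in 10110)
print(decode("00100"))  # N  (one error in 00000)
```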

1.3 Basic definitions

Definition 1.
1- Let A be a finite set called an alphabet; an element of A is referred to as a letter.
2- A finite word is a sequence of elements of A. The length of a finite word
u = u_0...u_{n-1} ∈ A^n is |u| = n.
3- The set of finite words is A* = ∪_{n=0}^∞ A^n. The set A^0 = {λ}, where λ is the
empty word.
4- We denote by A^N the set of infinite words over A and by A^Z the set of bi-infinite
sequences over A.
5- For two integers i, j with i < j we denote by x(i, j) the word x_i...x_j.
6- The length of an infinite word is denoted by ∞.

For an element of A^Z we may use a decimal notation to avoid confusion, the first
element to the right of the decimal point denoting position 0.
For example x = ...000.100... of {0, 1}^Z has a one at position zero (x_0 = 1).

Definition 2.
1- Let u = u_0...u_{n-1} and v = v_0...v_{m-1} be two finite words; the concatenation of
the words u and v is denoted by uv = u_0...u_{n-1}v_0...v_{m-1}.
2- We say that the word u is a factor of v, and write u ⊑ v, if there are two words x, y
such that xuy = v.
3- If x = λ the word u is said to be a prefix of v and we write u ⊑_p v.
4- If y = λ the word u is said to be a suffix of v and we write u ⊑_s v.
5- If u is a non-empty finite word we denote by u^∞ ∈ A^N the infinite concatenation
of u.

The set of finite words A* equipped with the concatenation operation is a monoid,
i.e. concatenation is associative and λ is the neutral element.

1.3.1 Distances

1. The Hamming distance between two words x and y of A^n is the number of
coordinates in which they differ:

   d_H(x, y) = Σ_{i=1}^{n} δ(x_i, y_i),  where δ(x_i, y_i) = 1 if x_i ≠ y_i and 0 otherwise.

2. The set A^N may be endowed with the distance

   d_C(x, y) = 2^{-n} with n = min{i ≥ 0 : x_i ≠ y_i}   if x ≠ y
   d_C(x, y) = 0                                         if x = y

3. The set A^Z may be endowed with the distance

   d_C(x, y) = 2^{-n} with n = min{i ≥ 0 : x_i ≠ y_i or x_{-i} ≠ y_{-i}}   if x ≠ y
   d_C(x, y) = 0                                                            if x = y

Remark 3. The distance d_C may seem counter-intuitive in the sense that it gives
greater weight to central coordinates. For example

   d_C(10^∞, 0^∞) = 1 > d_C(0001^∞, 0^∞) = 1/2^3

Notice that 10^∞ and 0^∞ differ in only one letter, whereas 0001^∞ and 0^∞ differ at
infinitely many positions.

1.4 The coding problem

Let A be an alphabet; we want to send information written as words of constant
length.¹
The transmission being subject to some form of alteration due to noise, the received
word may be different from the original one.
Decoding is the process we apply to recover the original message.

Let A be an alphabet of q elements and suppose that we want to encode words of
length k using words of length n.
A code C is a subset of A^n; an element of C is often called a codeword (or code
vector) to distinguish it from the other elements of A^n \ C.
Let us go back to our example of sending a path.
The 4 elements to be coded are the four simple directions: N, E, S, W.
It is then possible to use any set with cardinality at least 4. For example we may use
{0, 1}^3 or {0, 1}^5.
We can then view the encoding problem as defining a map from {0, 1}^2 to {0, 1}^n
for n > 2.

¹It is possible to use words of varying lengths, but for now we consider only the
constant-length case.
Example 4.

Notice from the preceding examples that the closest codeword is taken as the decoded
word; this nearest-neighbor method is used systematically. A word which is not a
codeword cannot be decoded if there are ambiguities (several codewords at the same
minimal distance).

Definition 5. Let C ⊂ A^n be a code. We define the minimum distance of C with
respect to d_H by:

   d(C) = min{d_H(x, y) : x, y ∈ C, x ≠ y}

Proposition 6.
1) A code C can detect up to s errors in any codeword if d(C) ≥ s + 1.
2) A code C can correct up to t errors in any codeword if d(C) ≥ 2t + 1.

Proof. 1) Suppose a codeword x is transmitted and that the number of errors is at
least one and at most s.
Denote by x' the received word; we have then 1 ≤ d_H(x, x') ≤ s < d(C), which means
that x' ∉ C, so the error is detected.
2) Suppose d(C) ≥ 2t + 1 and suppose that a codeword x is transmitted and the
vector x' received, in which t or fewer errors have occurred, so that d_H(x, x') ≤ t.
If y is any codeword other than x then we must have d_H(y, x') ≥ t + 1. Otherwise
d_H(x, y) ≤ d_H(x, x') + d_H(x', y) ≤ 2t, contradicting d(C) ≥ 2t + 1.
So x is the nearest codeword to x', and nearest-neighbor decoding corrects the errors.
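Proposition 6 can be checked numerically on the code C3 of Section 1.2; the short Python sketch below (not part of the original notes) computes d(C3) and the resulting detection and correction radii:

```python
# A small sketch checking Proposition 6 on the code C3 from Section 1.2:
# its minimum distance is 3, so it detects up to 2 errors and corrects 1.

from itertools import combinations

def hamming(x, y):
    return sum(a != b for a, b in zip(x, y))

def min_distance(code):
    """d(C) = min Hamming distance over all pairs of distinct codewords."""
    return min(hamming(x, y) for x, y in combinations(code, 2))

C3 = ["00000", "01101", "10110", "11011"]
d = min_distance(C3)
print(d)                 # 3
print(d >= 2 + 1)        # True: detects up to s = 2 errors
print(d >= 2 * 1 + 1)    # True: corrects up to t = 1 error
```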

1.5 Hamming bound

The Hamming bound is a limit on the parameters of an arbitrary code. It gives an
important limitation on the efficiency with which any error-correcting code can utilize
the space in which its codewords are embedded. A code that attains the Hamming
bound is said to be a perfect code.
For any codeword x ∈ C consider the ball

   B_t(x) = {y ∈ A^n : d_H(x, y) ≤ t}

We will count the number of elements of B_t(x):

   Number of different letters   Number of possibilities
   0                             1
   1                             (|A| - 1) C(n,1)
   2                             (|A| - 1)² C(n,2)
   ...                           ...
   t                             (|A| - 1)^t C(n,t)

Hence

   |B_t(x)| = 1 + (|A| - 1) C(n,1) + ... + (|A| - 1)^t C(n,t) = Σ_{k=0}^{t} (|A| - 1)^k C(n,k)

If the code corrects t errors, the balls B_t(x), x ∈ C, are pairwise disjoint, so

   |C| · Σ_{k=0}^{t} (|A| - 1)^k C(n,k) ≤ |A|^n

The last inequality is called the Hamming bound on a code.
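The bound is easy to evaluate in Python; this sketch (not from the notes) checks it on the binary (7, 4) Hamming code studied later, which attains it with equality:

```python
# A quick sketch of the Hamming bound |C| * sum_{k=0}^{t} (q-1)^k C(n,k) <= q^n,
# checked on the binary (7,4) Hamming code, which is perfect for t = 1.

from math import comb

def ball_size(n, q, t):
    """Number of words within Hamming distance t of a fixed word of length n."""
    return sum((q - 1) ** k * comb(n, k) for k in range(t + 1))

n, q, t = 7, 2, 1
num_codewords = 2 ** 4          # the (7,4) Hamming code has 16 codewords
print(ball_size(n, q, t))       # 8
print(num_codewords * ball_size(n, q, t) == q ** n)   # True: the bound is attained
```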

1.6 Algorithmic efficiency and algorithmic complexity

There might be several algorithms or programs for the same problem; the concept of
complexity allows us to compare several methods and to evaluate the least expensive
in computation time.
To illustrate this idea we will start with an example.

Example 7. We wish to calculate the sum of the elements of an upper triangular
matrix. To do this, we compare three programs and their execution times under Maple.
1) The first program SumTrig1 calculates the sum of all elements of the given matrix;
it will give a correct result but will perform unnecessary additions of the zeros found
below the diagonal of the matrix. Its calculation time for a square matrix of dimension
10^4 × 10^4 is 44.938 seconds.

2) The second program SumTrig2 calculates the sum of the elements by carrying out a
test to find out whether or not they are in the upper triangle. Its calculation time for a
square matrix of dimension 10^4 × 10^4 is 126.110 seconds.

3) The third program SumTrig3 calculates the sum of the elements found only in the
upper triangle of the given matrix. Its calculation time for a square matrix of dimension
10^4 × 10^4 is 44.938 seconds.
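The Maple listings are not reproduced here, but the three strategies can be sketched in Python (the function names mirror the ones used above; this is an illustration, not the original programs):

```python
# A sketch in Python (the notes use Maple) of the three summation strategies for
# an upper triangular matrix, to illustrate why SumTrig3 does the least work.

def sum_trig1(M):
    """Sum every entry, including the zeros below the diagonal."""
    return sum(x for row in M for x in row)

def sum_trig2(M):
    """Sum every entry, but test the indices first."""
    n = len(M)
    return sum(M[i][j] for i in range(n) for j in range(n) if j >= i)

def sum_trig3(M):
    """Loop only over the upper triangle: no test, no useless additions."""
    n = len(M)
    return sum(M[i][j] for i in range(n) for j in range(i, n))

M = [[1, 2, 3],
     [0, 4, 5],
     [0, 0, 6]]
print(sum_trig1(M), sum_trig2(M), sum_trig3(M))  # 21 21 21
```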

Evaluating the arithmetic complexity of a method is done using Landau notation.

The arithmetic cost of a method is the number of arithmetic operations necessary for
its completion; it is also called arithmetic complexity. We use the notation O to
indicate the asymptotic behavior of the complexity.
Let us study some basic examples.

1.6.1 Power evaluation

1. Naive method.
We want to write a program to compute x^n for given values of x and n.
A first approach is using successive multiplications:

   x^n = x × x × ... × x   ((n - 1) multiplications)

The cost of this method is (n - 1) multiplications, so the complexity is O(n).

2. Dichotomic exponentiation.
Dichotomic exponentiation is summarized by the following formula:

   x^n = (x^{n/2})²           if n is even
   x^n = x (x^{(n-1)/2})²     if n is odd

Example 8. For example x^16 = (x^8)² = ((x^4)²)² = (((x²)²)²)² and we need 4
multiplications in order to compute x^16 instead of 15 operations for the naive method.

The cost of dichotomic exponentiation may be evaluated as follows.
Denote by CostDicho(n) the cost to evaluate x^n.
If n = 2^m is a pure power of 2 then we have:

   CostDicho(n) = CostDicho(n/2) + 1
                = CostDicho(n/4) + 2
                = ...
                = CostDicho(1) + m

So the cost is O(log₂(n)).

Example 9. Below is a comparison of two Maple programs computing powers. The
first one, named naive, applies the naive method; the second one uses the dichotomic
method. (The program listings and their timing comparison appear as figures in the
original document.)
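The two methods can also be sketched in Python (this is an illustration following the formulas above, not the Maple programs of Example 9):

```python
# A minimal Python sketch of the two exponentiation methods compared above
# (the recursion follows the dichotomic formula).

def power_naive(x, n):
    """x^n by successive multiplications: n - 1 multiplications, O(n)."""
    result = x
    for _ in range(n - 1):
        result *= x
    return result

def power_dicho(x, n):
    """x^n by dichotomic (binary) exponentiation, O(log2 n) multiplications."""
    if n == 1:
        return x
    if n % 2 == 0:
        half = power_dicho(x, n // 2)
        return half * half
    half = power_dicho(x, (n - 1) // 2)
    return x * half * half

print(power_naive(3, 5), power_dicho(3, 5))   # 243 243
print(power_dicho(2, 16))                     # 65536
```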

1.6.2 Polynomial evaluation: Horner's method

The Horner algorithm aims to avoid the evaluation of successive powers of x; the idea
consists of a repeated evaluation of polynomials of degree 1. A polynomial P of degree
n is evaluated according to the following technique:

   P(x) = a_n x^n + a_{n-1} x^{n-1} + ... + a_2 x² + a_1 x + a_0
        = (a_n x^{n-1} + a_{n-1} x^{n-2} + ... + a_2 x + a_1) x + a_0
        = ...
        = ((...((a_n x + a_{n-1}) x + a_{n-2}) x + ... + a_1) x + a_0

This may be written as the recurrence:

   P_0 = a_n
   P_i = P_{i-1} x + a_{n-i},   1 ≤ i ≤ n

The cost of this method can be evaluated as follows:

   P_1 = P_0 x + a_{n-1}       1 addition + 1 multiplication
   P_2 = P_1 x + a_{n-2}       1 addition + 1 multiplication
   P_3 = P_2 x + a_{n-3}       1 addition + 1 multiplication
   ...                         1 addition + 1 multiplication
   P_n = P_{n-1} x + a_0       1 addition + 1 multiplication

The cost of Horner's method is then n multiplications + n additions, i.e. 2n operations.

The complexity is then O(n).
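The recurrence P_i = P_{i-1} x + a_{n-i} translates directly into a loop; this short Python sketch (not from the notes) illustrates it:

```python
# A short sketch of Horner's recurrence P_i = P_{i-1} * x + a_{n-i}:
# evaluating a degree-n polynomial with n multiplications and n additions.

def horner(coeffs, x):
    """Evaluate a_n x^n + ... + a_1 x + a_0; coeffs = [a_n, ..., a_1, a_0]."""
    p = coeffs[0]              # P_0 = a_n
    for a in coeffs[1:]:       # P_i = P_{i-1} x + a_{n-i}
        p = p * x + a
    return p

# P(x) = 2x^3 + 0x^2 - x + 5 at x = 3: 2*27 - 3 + 5 = 56
print(horner([2, 0, -1, 5], 3))   # 56
```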

1.6.3 Matrix computation

A determinant computation method

One method to compute a matrix determinant is cofactor expansion along the first
column; the determinant may then be evaluated using the following formula:

   | a_11 a_12 ... a_1n |
   | a_21 a_22 ... a_2n |  = a_11 M_11 - a_21 M_21 + ... + (-1)^{n+1} a_n1 M_n1
   | ...  ...  ... ...  |
   | a_n1 a_n2 ... a_nn |

where M_i1 is the determinant of the submatrix obtained by deleting row i and
column 1 of the matrix A.
Denote by Cost_n the cost of evaluating a determinant of dimension n; we have then:

   Cost_n = n Cost_{n-1} + n multiplications + (n - 1) additions

Hence the asymptotic complexity is O(n!).

So computing a matrix determinant using this method is algorithmically inefficient.
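The expansion can be sketched recursively in Python (an illustration, not from the notes), which makes the O(n!) blow-up visible: each call of size n makes n calls of size n - 1:

```python
# A sketch of cofactor expansion along the first column, as in the formula above.
# Its O(n!) cost is why this method is only practical for very small matrices.

def det(A):
    """Determinant by expansion along the first column: O(n!) operations."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for i in range(n):
        # Submatrix with row i and column 0 deleted.
        sub = [row[1:] for k, row in enumerate(A) if k != i]
        total += (-1) ** i * A[i][0] * det(sub)
    return total

print(det([[1, 2], [3, 4]]))                    # -2
print(det([[2, 0, 0], [0, 3, 0], [0, 0, 4]]))   # 24
```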

Strassen's algorithm (1969)

In 1969 Strassen introduced an original method to compute the product of two
matrices costing only O(n^{2.8074}) arithmetic operations.
Consider two matrices A, B of dimension 2^n × 2^n.
Decompose each matrix into 4 blocks of dimension 2^{n-1} × 2^{n-1}:

   A = ( A11 A12 )    B = ( B11 B12 )    C = AB = ( C11 C12 )
       ( A21 A22 )        ( B21 B22 )              ( C21 C22 )
Compute then

M1 = (A11 + A22) (B11 + B22)

M2 = (A21 + A22) B11

M3 = A11 (B12 ¡ B22)

M4 = A22 (B21 ¡ B11)

M5 = (A11 + A12) B22

M6 = (A21 ¡ A11 ) (B11 + B12 )

M7 = (A12 ¡ A22 ) (B21 + B22 )

We have then :

C11 = M1 + M4 ¡ M5 + M7

C12 = M3 + M5

C21 = M2 + M4

C22 = M1 ¡ M2 + M3 + M6

Remark 10. If the matrix is not of dimension 2^n × 2^n, it is enough to add the
necessary number of zero rows and columns to achieve this condition.

Strassen's method cost. Let n = 2^m and denote by CostStrassen the cost of
Strassen's product and by CostAdd the cost of the addition of two matrices.

We have then:

   CostStrassen(n) = 7 CostStrassen(n/2) + 18 CostAdd(n/2)
                   = 7 CostStrassen(n/2) + (18/4) n²
                   = 7 (7 CostStrassen(n/4) + 18 CostAdd(n/4)) + (18/4) n²
                   = 7² CostStrassen(n/4) + (99/8) n²
                   = ...
                   = 7^m CostStrassen(1) + O(n²)

We have 2^m = n, so m = ln(n)/ln(2):

   7^m = 7^{ln(n)/ln(2)} = exp(ln(n) ln(7)/ln(2)) = (exp(ln(n)))^{ln(7)/ln(2)} = n^{ln(7)/ln(2)}

The approximate value of ln(7)/ln(2) is 2.8074.
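The seven products M1..M7 and the recombination formulas above can be sketched recursively in Python (an illustration with plain lists, not from the notes), for square matrices whose dimension is a power of 2:

```python
# A compact sketch of Strassen's recursion for square matrices of dimension 2^m,
# following the M1..M7 and C11..C22 formulas above (lists of lists, no NumPy).

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def strassen(A, B):
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    h = n // 2
    # Split into 2x2 blocks.
    A11 = [r[:h] for r in A[:h]]; A12 = [r[h:] for r in A[:h]]
    A21 = [r[:h] for r in A[h:]]; A22 = [r[h:] for r in A[h:]]
    B11 = [r[:h] for r in B[:h]]; B12 = [r[h:] for r in B[:h]]
    B21 = [r[:h] for r in B[h:]]; B22 = [r[h:] for r in B[h:]]
    # The seven block products.
    M1 = strassen(add(A11, A22), add(B11, B22))
    M2 = strassen(add(A21, A22), B11)
    M3 = strassen(A11, sub(B12, B22))
    M4 = strassen(A22, sub(B21, B11))
    M5 = strassen(add(A11, A12), B22)
    M6 = strassen(sub(A21, A11), add(B11, B12))
    M7 = strassen(sub(A12, A22), add(B21, B22))
    # Recombination.
    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(sub(M1, M2), add(M3, M6))
    return ([r1 + r2 for r1, r2 in zip(C11, C12)] +
            [r1 + r2 for r1, r2 in zip(C21, C22)])

print(strassen([[1, 2], [3, 4]], [[5, 6], [7, 8]]))   # [[19, 22], [43, 50]]
```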

Chapter 2

Linear codes

2.1 Introduction

In this section we suppose that the alphabet A has a field structure; this allows us to
treat A^n as a linear space.
A linear code is simply any subspace of A^n. As a consequence of linearity, the zero
word 0_A^n is always a codeword.

Definition 11. Let x = x_0...x_{n-1} ∈ A^n; the weight of the word x, denoted by
w(x), is the number of letters of x different from 0_A:

   w(x) = |{0 ≤ i ≤ n - 1 : x_i ≠ 0_A}|

From the definition we have w(x) = 0 if and only if x is 0_A^n; the weight function
satisfies the triangle inequality w(x + y) ≤ w(x) + w(y); however, the weight function
is not a norm.
There is an interesting relation between the minimum distance of a code and the
concept of weight:

   d_H(x, y) = d_H(x - y, y - y) = d_H(x - y, 0_A^n) = w(x - y)

Definition 12. Let us denote by w(C) the smallest of the weights of the non-zero
codewords of C:

   w(C) = min{w(x) : x ∈ C \ {0_A^n}}

Proposition 13. Let C be a linear code and let w(C) be the smallest of the weights of
the non-zero codewords of C. Then d(C) = w(C).

2.1.1 Generator matrix

As a linear code C is a linear space, we can find a generator matrix for it.

Definition 14. A generator matrix G is a matrix of dimension (n × k) such that:

   C = {G.x : x ∈ A^k}

Example 15. We want to encode elements from {0, 1}^3 using the generator matrix G:

       ( 1 1 1 )
       ( 1 0 1 )
   G = ( 1 0 0 )
       ( 0 0 1 )
       ( 1 1 0 )

Each codeword has the form

         ( 1 1 1 )   (x1)     ( x1 + x2 + x3 )
         ( 1 0 1 )   (x2)     ( x1 + x3      )
   G.x = ( 1 0 0 ) · (x3)  =  ( x1           )
         ( 0 0 1 )            ( x3           )
         ( 1 1 0 )            ( x1 + x2      )

Hence the code C is given by:

   C = {(x1 + x2 + x3, x1 + x3, x1, x3, x1 + x2) : (x1, x2, x3) ∈ {0, 1}^3}

Remark 16. Some references use the notation x.G for the encoding; in this case the
generator matrix is simply the transpose of the one used in the notation G.x, and x is
taken to be a row vector of dimension (1 × k).

Example 17. Consider the linear code generated by G′ using the x.G notation:

        ( 1 1 1 0 1 )
   G′ = ( 1 0 0 0 1 )
        ( 1 1 0 1 0 )

Each codeword has the form

                  ( 1 1 1 0 1 )
   (x1 x2 x3) ·   ( 1 0 0 0 1 )   =   (x1 + x2 + x3, x1 + x3, x1, x3, x1 + x2)
                  ( 1 1 0 1 0 )

Hence the code C is given by:

   C = {(x1 + x2 + x3, x1 + x3, x1, x3, x1 + x2) : (x1, x2, x3) ∈ {0, 1}^3}
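Encoding with a generator matrix is just a matrix-vector product over GF(2); this Python sketch (not from the notes) applies the matrix G of Example 15:

```python
# A small sketch of encoding with the generator matrix G of Example 15,
# computing G.x over GF(2) (arithmetic modulo 2).

G = [[1, 1, 1],
     [1, 0, 1],
     [1, 0, 0],
     [0, 0, 1],
     [1, 1, 0]]

def encode(G, x):
    """Codeword G.x mod 2, for a message x of length k."""
    return [sum(g * xi for g, xi in zip(row, x)) % 2 for row in G]

print(encode(G, [1, 0, 1]))   # [0, 0, 1, 1, 1]
```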

2.1.2 Equivalence of linear codes

Two (n × k) matrices generate equivalent linear codes if one matrix can be obtained
from the other by elementary operations:

1. Permutation of the rows.

2. Multiplication of a row by a non-zero scalar.

3. Addition of a scalar multiple of one row to another.

4. Permutation of the columns.

5. Multiplication of any column by a non-zero scalar.

2.1.3 Gaussian elimination over finite fields

The Gauss method still works over finite fields.

Example 18. Solve the following linear system in Z5 using Gaussian elimination:

   x + 2y + 2z = 3          [ 1 2 2 | 3 ]
   2x      + z = 4    ⟺     [ 2 0 1 | 4 ]
   3x + y + 3z = 1          [ 3 1 3 | 1 ]

Applying l2 ← l2 + 3l1 and l3 ← l3 + 2l1 (note that -2 ≡ 3 and -3 ≡ 2 in Z5):

   [ 1    2    2    3   ]      [ 1 2 2 | 3 ]
   [ 2+3  0+6  1+6  4+9 ]  =   [ 0 1 2 | 3 ]
   [ 3+2  1+4  3+4  1+6 ]      [ 0 0 2 | 2 ]

Back-substitution then gives:

   x + 2y + 2z = 3           x = 3 - 2y - 2z = 3 - 2 - 2 = 4
       y + 2z  = 3    ⟹      y = 3 - 2z = 1
           2z  = 2           z = 1

Remark 19. There is no need to worry about roundoff errors when working over finite
fields.
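The elimination of Example 18 can be sketched in Python (an illustration, not from the notes); the only difference from the real case is that division becomes multiplication by a modular inverse:

```python
# A sketch of Gaussian elimination over Z_p, solving the Z5 system of Example 18.
# Division is multiplication by the modular inverse (pow(a, -1, p)).

def solve_mod_p(A, b, p):
    """Solve A x = b over Z_p for a square system with invertible A."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]   # augmented matrix
    for col in range(n):
        # Find a pivot and move it up.
        pivot = next(r for r in range(col, n) if M[r][col] % p != 0)
        M[col], M[pivot] = M[pivot], M[col]
        inv = pow(M[col][col], -1, p)
        M[col] = [v * inv % p for v in M[col]]
        # Eliminate the column everywhere else.
        for r in range(n):
            if r != col and M[r][col] % p != 0:
                f = M[r][col]
                M[r] = [(v - f * w) % p for v, w in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

A = [[1, 2, 2], [2, 0, 1], [3, 1, 3]]
b = [3, 4, 1]
print(solve_mod_p(A, b, 5))   # [4, 1, 1]
```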

2.2 Standard form

Any generator matrix G can be transformed to the standard form (I_k | A) (resp.
(I_k over A) for the G.x convention), where I_k is the k × k identity matrix and A is a
k × (n - k) matrix (resp. an (n - k) × k matrix).

Example 20. Consider the matrix

   ( 1 1 1 0 1 )
   ( 1 0 0 0 1 )
   ( 1 1 0 1 0 )

Applying l2 ← l2 + l1 and l3 ← l3 + l1:

   ( 1 1 1 0 1 )
   ( 0 1 1 0 0 )
   ( 0 0 1 1 1 )

Applying l1 ← l1 + l2:

   ( 1 0 0 0 1 )
   ( 0 1 1 0 0 )
   ( 0 0 1 1 1 )

Applying l2 ← l2 + l3:

   ( 1 0 0 0 1 )
   ( 0 1 0 1 1 )
   ( 0 0 1 1 1 )

which is in standard form (I_3 | A).

2.2.1 Dual code and parity check matrix

Definition 21. A matrix H is said to be a parity check matrix for C if

   Hx = 0,  ∀x ∈ C

Proposition 22. Suppose that C ⊂ A^n is a linear code with a generator matrix in
standard form (I_k | A); then the matrix (-A^t | I_r) is a parity check matrix for C,
where r = n - k.

Proof. Exercise.

Remark 23. The parity check matrix is a generator matrix of the dual code C⊥.

Definition 24. Suppose that C ⊂ A^n is a code of length n; define the dual code C⊥
by:

   C⊥ = {x ∈ A^n : x.y = 0_A, ∀y ∈ C}

Definition 25. A linear code is self-dual if C⊥ = C.

Example 26. 1) Suppose C = {0000, 1100, 0011, 1111}; then C⊥ = C.

2) Suppose C = {000, 110, 011, 101}; then C⊥ = {000, 111}.

Proposition 27. For any (n, k) code C we have (C⊥)⊥ = C.

Proof. Clearly C ⊂ (C⊥)⊥, since every vector in C is orthogonal to every vector in C⊥.
We also have dim (C⊥)⊥ = n - (n - k) = k = dim(C), and so (C⊥)⊥ = C.

2.3 Decoding of linear codes

2.3.1 Cosets and Lagrange's theorem

Definition 28. Let H be a subgroup of a group (G, +) and let g be any element of G.
The left coset g + H is defined by:

   g + H = {g + h : h ∈ H}

Remark 29. For h ∈ H, the coset h + H is simply the subgroup H itself.

Proposition 30. Let H be a subgroup of G; then the relation R on G defined by

   xRy ⟺ y - x ∈ H

is an equivalence relation.

Proof. Exercise.

Proposition 31. Let H be a subgroup of G and R be the equivalence relation defined
in the previous proposition. Then the equivalence class of an element g ∈ G is the left
coset g + H.

Proof. Exercise.

Corollary 32. Let H be a subgroup of G. Then two left cosets x + H and y + H of H
in G are either equal or disjoint, and each element of G is in some left coset of H.

Example 33. Let G be the additive group of integers Z and let n be any positive
integer; the cosets of the subgroup nZ are nZ, 1 + nZ, ..., (n - 1) + nZ.

Proposition 34. Let H be a subgroup of a group G. For any element g ∈ G, there is a
bijection between H and g + H.

Proof. Define the map f : H → g + H by f(h) = g + h; using the properties of the
group structure this map is bijective.

Theorem 35 (Lagrange). Let G be a group with a finite number of elements and let H
be a subgroup of G. Then the number r of distinct left cosets of H is equal to
|G| / |H|. In particular both |H| and r divide |G|.

Proof. The equivalence classes under the relation R partition G. Each equivalence
class is a left coset and each left coset has the same number of elements as H.
It follows that if r is the number of distinct left cosets then |G| = r |H|.

Definition 36. The number of distinct left cosets of a subgroup H in a group G is the
index of H in G. It is usually denoted by |G : H|.
2.3.2 Cosets and standard array

Definition 37. Suppose that C ⊂ A^n is a code of length n and consider a ∈ A^n.
Define the set a + C by

   a + C = {a + x : x ∈ C}

The set a + C is called a coset of C.

Proposition 38. Properties:
1- If a + C is a coset of C and b ∈ a + C, then b + C = a + C.
2- Every element of A^n is in some coset of C.
3- Every coset contains exactly |C| elements.
4- Two cosets are either disjoint or coincide.

Definition 39. The vector having minimum weight in a coset is called the coset leader.

A standard array for a code C is an array of all the vectors in A^n in which the first
row consists of the code C on the extreme left and the other rows are the cosets
a + C, each arranged in corresponding order, with the coset leader on the left.
A standard array may be constructed as follows:

1. Step 1: List the codewords of C, starting with 0, as the first row.

2. Step 2: Choose any vector a1 of minimum weight not in the first row. List the coset
a1 + C as the second row, starting with a1.

3. Step k: From those vectors not in rows 1 to k - 1, choose a_k of minimum weight
and list the coset a_k + C as in Step 2 to get the k-th row.
Example 40. Consider the code generated by the matrix G:

       ( 1 0 )
       ( 0 1 )
   G = ( 1 1 )
       ( 0 1 )
       ( 1 0 )

The code C is hence given by G.(x1, x2)^t = (x1, x2, x1 + x2, x2, x1)^t:

   C = {00000, 01110, 10101, 11011}

A standard array for C (coset leaders in the first column):

   00000   01110   10101   11011
   00001   01111   10100   11010
   00010   01100   10111   11001
   00100   01010   10001   11111
   01000   00110   11101   10011
   10000   11110   00101   01011
   11000   10110   01101   00011
   10010   11100   00111   01001
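The construction procedure above (repeatedly pick an unlisted vector of minimum weight and write out its coset) can be sketched in Python; this listing is an illustration, not part of the notes:

```python
# A sketch that rebuilds the standard array of Example 40: repeatedly pick a
# minimum-weight vector not yet listed and write out its coset.

from itertools import product

C = ["00000", "01110", "10101", "11011"]

def add_words(a, b):
    return "".join(str((int(x) + int(y)) % 2) for x, y in zip(a, b))

def standard_array(C, n=5):
    all_words = sorted(("".join(w) for w in product("01", repeat=n)),
                       key=lambda w: (w.count("1"), w))   # by weight, then value
    listed, rows = set(), []
    for leader in all_words:
        if leader not in listed:
            row = [add_words(leader, c) for c in C]
            rows.append(row)
            listed.update(row)
    return rows

rows = standard_array(C)
print(len(rows))     # 8 cosets of 4 vectors each: 32 = 2^5 words
print(rows[0])       # ['00000', '01110', '10101', '11011']
```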

Proposition 41. Let C be a linear code with minimum distance d. If x is a vector such
that

   w(x) ≤ ⌊(d - 1)/2⌋

then x is the unique element of minimum weight in its coset of C, and hence is always
a coset leader in a standard array for C.

Proof. Suppose w(x) ≤ ⌊(d - 1)/2⌋ and that x is not the unique vector of minimum
weight in its coset C_i.
Then there exists some vector y ≠ x in C_i such that w(y) ≤ w(x).
Since x and y are in the same coset, x - y is a non-zero codeword, and

   w(x - y) ≤ w(x) + w(y) ≤ ⌊(d - 1)/2⌋ + ⌊(d - 1)/2⌋ ≤ d - 1

This contradicts the fact that the minimum distance of the code is d.

2.3.3 Syndromes

Definition 42. Let H be a parity-check matrix for an (n, k) code over F; for x ∈ A^n
the syndrome s of x is defined by s = Hx (or x.H, depending on the convention).

Proposition 43. Let H be a parity-check matrix for a linear code C. Then two words
x and y are in the same coset of C if and only if they have the same syndrome (i.e.
Hx = Hy).

Proof. If x and y are in the same coset of C, then x = l + c_i and y = l + c_j for some
codewords c_i and c_j and some coset leader l.

   Hx = H(l + c_i) = Hl + Hc_i = Hl = Hl + Hc_j = Hy

Conversely, suppose that Hx = Hy; then H(x - y) = 0, hence x - y is a codeword.
We have then x - y = c_i for some 1 ≤ i ≤ |C|, hence x = y + c_i, and x and y are in
the same coset.
Example 44. Suppose we have a linear code with parity check matrix

       ( 1 0 0 1 0 1 )
   H = ( 0 1 0 1 1 0 )
       ( 0 0 1 0 1 1 )

The syndromes are given by the following table (the syndrome of a weight-one leader
is the corresponding column of H):

   Coset leader   Syndrome
   000000         000
   100000         100
   010000         010
   001000         001
   000100         110
   000010         011
   000001         101
   001100         111

Proposition 45. For every code with parity check matrix H, the minimum distance d
equals the size of the smallest subset of columns of H that are linearly dependent.

Proof. We need to show that the minimum weight of a non-zero codeword in C is the
minimum number of linearly dependent columns.
Let t be the minimum number of linearly dependent columns in H.
Let c ≠ 0 ∈ C be a codeword with w(c) = d.
By definition of the parity check matrix Hc = 0, and by matrix multiplication this
gives us that Σ_{i=1}^{n} c_i H^i = 0, where H^i is the i-th column of the matrix H.
So for Hc to be the zero vector, the columns H^i with non-zero coefficients c_i must
be linearly dependent.
This means that d ≥ t, as the columns corresponding to the non-zero entries of c are
one instance of linearly dependent columns.
For the other direction, consider a minimum set of t columns of H that are linearly
dependent.
This implies that there exist non-zero elements c'_{i1}, ..., c'_{it} ∈ A such that

   c'_{i1} H^{i1} + ... + c'_{it} H^{it} = 0

Now extend c'_{i1}, ..., c'_{it} to the vector c' such that c'_j = 0 for j ∉ {i1, ..., it}.
We have then Hc' = 0, and thus c' is a codeword of weight t; thus d ≤ t.

2.4 Some special codes

2.4.1 The (7, 4, 3) Hamming code

The binary Hamming code is generated by the following generator matrix:

       ( 1 0 0 0 )
       ( 0 1 0 0 )
       ( 0 0 1 0 )
   G = ( 0 0 0 1 )
       ( 0 1 1 1 )
       ( 1 0 1 1 )
       ( 1 1 0 1 )

hence each word (x1, x2, x3, x4) is associated to the codeword

   (x1, x2, x3, x4, x2 + x3 + x4, x1 + x3 + x4, x1 + x2 + x4)

If we denote by c1, ..., c7 the successive letters of the coded word, we have the
following property:

   c2 + c3 + c4 + c5 = 0
   c1 + c3 + c4 + c6 = 0
   c1 + c2 + c4 + c7 = 0

This code has a nice geometric interpretation as three overlapping circles in the
figure, each circle containing four of the bits: the sum of the elements of each circle
should be 0. Let us explain how to use this scheme with an example.
Example 46. Suppose we received the message 1000110.

Here we have a problem with the green circle, because the sum of the four elements
inside it is different from 0.
We also have a problem with the salmon circle; however the blue circle is correct.
If we suppose that only one error occurred, then it must lie at the intersection of the
two failing circles and outside the correct one, hence we have to correct c2.
2.4.2 The general binary Hamming code

Let r be a positive integer; define the parity check matrix H_r as the matrix whose
column H_r^i is the binary representation of i, for 1 ≤ i ≤ 2^r - 1.

Example 47. For the case r = 3 we have 2^r - 1 = 7:

   i            1  2  3  4  5  6  7
   Coeff of 2²  0  0  0  1  1  1  1
   Coeff of 2¹  0  1  1  0  0  1  1
   Coeff of 2⁰  1  0  1  0  1  0  1

Hence the code is given by:

   c4 + c5 + c6 + c7 = 0
   c2 + c3 + c6 + c7 = 0
   c1 + c3 + c5 + c7 = 0

(this code is equivalent, up to a permutation of positions, to the (7, 4, 3) code of the
previous section).

Proposition 48. The general binary Hamming code (2^r - 1, 2^r - r - 1, 3) has
distance 3.

Proof. No two columns in H are linearly dependent over GF(2).
If they were, we would have H^i + H^j = 0, but this is impossible since the binary
representations of integers i ≠ j differ in at least one bit. Thus the distance is at
least 3.
It is at most 3 since H^1 + H^2 + H^3 = 0.

2.4.3 Decoding single error linear codes

Let H be a parity check matrix for a linear code C. Suppose that our channel has a
high probability of introducing at most one error.
A single-error-correcting code is then enough for this kind of channel.
Let us denote by r the received word, c the codeword and e the error introduced.
We have then r = c + e; using the parity check matrix we obtain:

   Hr = H(c + e) = Hc + He = He     (since Hc = 0)

As our channel is supposed to introduce at most one error, the weight of e is at most
one. Then there exist an index i and a scalar α ∈ F such that He = α h_i, where h_i
is the i-th column of H.
Using this remark we can give an algorithm for decoding single error codes, including
the Hamming code.

Algorithm 49. Let H be the parity-check matrix and let r be the received word.
1) Compute Hr.
2) If Hr = 0 then r is the transmitted codeword.
3) If Hr = s ≠ 0 then compare s with the columns of H.
4) If there is some i such that s = α h_i, then e is the n-tuple with α in position i and
0's elsewhere; correct r to c = r - e.
5) Otherwise more than one error has occurred.

Example 50 Suppose that we want to encode the information (1010).
The encoded word is then (1010101); suppose now that the received word, with one error, is r = (1110101).
Let us compute Hr:

       | 0 1 1 1 1 0 0 |                       | 1 |
Hr  =  | 1 0 1 1 0 1 0 | (1 1 1 0 1 0 1)^T  =  | 0 |
       | 1 1 0 1 0 0 1 |                       | 1 |

Hence Hr corresponds to column 2 of the matrix H.
So the error was made at position 2 of the received word, and it is enough to change the value at position 2:

r = (1110101) → c = (1010101)
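Algorithm 49 can be sketched directly for the binary Hamming (7,4) code of this example; over Z/2Z the scalar α is always 1, so step 4 reduces to finding the column of H equal to the syndrome and flipping that bit.

```python
# Syndrome decoding for the binary Hamming (7,4) code of Example 50 --
# a minimal sketch of Algorithm 49 over Z/2Z.

H = [
    [0, 1, 1, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [1, 1, 0, 1, 0, 0, 1],
]

def decode(r):
    """Correct at most one bit error in the received word r (list of 7 bits)."""
    # Step 1: compute the syndrome s = H r (mod 2).
    s = [sum(H[row][j] * r[j] for j in range(7)) % 2 for row in range(3)]
    if s == [0, 0, 0]:
        return r                      # Step 2: r is already a codeword.
    # Steps 3-4: look for a column of H equal to the syndrome.
    for i in range(7):
        if [H[row][i] for row in range(3)] == s:
            c = r[:]
            c[i] ^= 1                 # flip the erroneous bit
            return c
    return None                       # Step 5: more than one error occurred.

r = [1, 1, 1, 0, 1, 0, 1]             # received word of Example 50
print(decode(r))                      # -> [1, 0, 1, 0, 1, 0, 1]
```

Running it on the received word of the example flips position 2, recovering the codeword (1010101).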

2.4.4 Reed Solomon code

Background (Polynomial interpolation)


Let {(x_i, y_i)}_{0≤i≤n} be a set of n + 1 pairs. The main interpolation problem is to find a polynomial P such that

P(x_i) = y_i,  0 ≤ i ≤ n

Proposition 51 For (n + 1) distinct points (x_i, y_i) there is a unique polynomial of degree at most n satisfying P(x_i) = y_i, 0 ≤ i ≤ n.

Remark 52 By distinct points we mean that the x_i, 0 ≤ i ≤ n, are distinct.

Lagrange polynomials

For a set {(x_i, y_i)}_{0≤i≤n} of n + 1 points, we construct the Lagrange polynomials

L_k(x) = ∏_{j=0..n, j≠k} (x − x_j)/(x_k − x_j),  k = 0, ..., n

Lagrange polynomials have the following property:

L_k(x_k) = 1 for all k
L_k(x_j) = 0 for all j ≠ k

Proposition 53 Let {(x_i, y_i)}_{0≤i≤n} be a set of n + 1 points. Then the unique interpolation polynomial is given by:

P(x) = Σ_{k=0}^{n} y_k L_k(x)
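The formula above can be sketched in code. The version below specialises to a prime field Z/pZ (so that the divisions become modular inverses, computed with `pow(., -1, p)`, available in Python 3.8+); this is the setting the Reed Solomon examples below work in.

```python
# Lagrange interpolation over a prime field Z/pZ -- a sketch of the formula
# P(x) = sum_k y_k L_k(x). Returns the coefficient list [c0, c1, ...] of the
# interpolant, lowest degree first.

def lagrange_interpolate(points, p):
    n = len(points)
    coeffs = [0] * n
    for k, (xk, yk) in enumerate(points):
        # build the numerator prod_{j != k} (x - x_j) as a coefficient list
        lk = [1]
        denom = 1
        for j, (xj, _) in enumerate(points):
            if j == k:
                continue
            shifted = [0] + lk                          # lk(x) * x
            scaled = [(-xj) * a % p for a in lk] + [0]  # lk(x) * (-x_j)
            lk = [(s + t) % p for s, t in zip(shifted, scaled)]
            denom *= xk - xj
        inv = pow(denom % p, -1, p)                     # 1 / prod (x_k - x_j)
        for i, a in enumerate(lk):
            coeffs[i] = (coeffs[i] + yk * inv * a) % p
    return coeffs

# the three points of Example 56 below, over Z/5Z:
print(lagrange_interpolate([(0, 3), (1, 2), (2, 4)], 5))   # -> [3, 0, 4]
```

The output [3, 0, 4] is the coefficient list of P(x) = 4x^2 + 3, matching the polynomial computed by hand in the Reed Solomon section.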

Interpolation problem over finite fields

Proposition 54 A polynomial of degree n with coefficients in a field F has at most n roots.

Proposition 55 Every function defined from a finite field to itself is a polynomial.

Proof. Suppose f is a function defined from a finite field of cardinal p^n to itself; we have then

f(x_i) = y_i,  1 ≤ i ≤ p^n

Considering this as an interpolation problem, there is a unique polynomial of degree m ≤ p^n − 1 that interpolates this data.

2.4.5 Reed Solomon code

Reed—Solomon codes are a group of error-correcting codes that were introduced in 1960.
They are used in many applications, such as DVDs and QR codes.
The Reed Solomon code is described via its encoder mapping.
Fix integers k ≤ n ≤ q and n distinct elements x_1, x_2, ..., x_n ∈ GF(q).
To encode the message a_1...a_k, compute P(x), the interpolation polynomial of (x_i, a_i)_{1≤i≤k}.
The encoded message is then

(P(x_1), ..., P(x_k), P(x_{k+1}), ..., P(x_n))

Hence the number of redundant symbols is n − k.

Example 56 We want to send the message [3, 2, 4] using the alphabet Z/5Z.
We start by computing the interpolation polynomial of the points (0, 3), (1, 2), (2, 4).
We can use the Lagrange method:

L_0(x) = (x − 1)(x − 2) / ((0 − 1)(0 − 2)) = 3x^2 + x + 1
L_1(x) = x(x − 2) / ((1)(1 − 2)) = 4x^2 + 2x
L_2(x) = x(x − 1) / ((2)(2 − 1)) = 3x^2 + 2x

P(x) = 3(3x^2 + x + 1) + 2(4x^2 + 2x) + 4(3x^2 + 2x) = 29x^2 + 15x + 3 = 4x^2 + 3

Now we evaluate P(x) at 3 and 4: P(3) = 4 and P(4) = 2.
The encoded message is then [3, 2, 4, 4, 2].
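The encoding of Example 56 can be sketched as follows. The evaluation points are taken to be 0, 1, ..., n − 1, as in the example; the helper evaluates the Lagrange interpolant directly without extracting coefficients.

```python
# Reed Solomon encoding over Z/pZ, reproducing Example 56 (a sketch: the
# evaluation points are assumed to be 0, 1, ..., n-1 as in the example).

def eval_interp(points, x, p):
    """Evaluate at x the Lagrange interpolant of the given points, mod p."""
    total = 0
    for k, (xk, yk) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if j != k:
                num = num * (x - xj) % p
                den = den * (xk - xj) % p
        total = (total + yk * num * pow(den, -1, p)) % p
    return total

def rs_encode(message, n, p):
    # interpolate the message at points 0..k-1, then evaluate at 0..n-1
    points = list(enumerate(message))
    return [eval_interp(points, x, p) for x in range(n)]

print(rs_encode([3, 2, 4], 5, 5))   # -> [3, 2, 4, 4, 2]
```

The first k output symbols reproduce the message (they are interpolation points) and the last n − k are the redundancy.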

2.4.6 Reed Solomon code used for erasures correction

The Reed Solomon code is particularly well suited to recovering data lost to erasures.

Example 57 Suppose we received the message [∗, 2, ∗, 4, 2], where ∗ denotes an erasure.
Our received message has two erasures.
We start by computing the interpolation polynomial of the points (1, 2), (3, 4), (4, 2).
We can use the Lagrange method:

L_0(x) = (x − 3)(x − 4) / ((1 − 3)(1 − 4)) = x^2 + 3x + 2
L_1(x) = (x − 1)(x − 4) / ((3 − 1)(3 − 4)) = 2x^2 + 3
L_2(x) = (x − 1)(x − 3) / ((4 − 1)(4 − 3)) = 2x^2 + 2x + 1

P(x) = 2(x^2 + 3x + 2) + 4(2x^2 + 3) + 2(2x^2 + 2x + 1) = 4x^2 + 3

Now we evaluate P(x) at 0 and 2: P(0) = 3 and P(2) = 4.
The decoded message is then [3, 2, 4].
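Example 57 can be sketched in code. Erasures are represented by `None`; the sketch assumes, as the example does, that the surviving symbols are error-free, so any k of them determine P.

```python
# Decoding erasures with a Reed Solomon code, reproducing Example 57
# (a sketch assuming the non-erased symbols are error-free).

def eval_interp(points, x, p):
    """Evaluate at x the Lagrange interpolant of the given points, mod p."""
    total = 0
    for k, (xk, yk) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if j != k:
                num = num * (x - xj) % p
                den = den * (xk - xj) % p
        total = (total + yk * num * pow(den, -1, p)) % p
    return total

def rs_decode_erasures(received, k, p):
    survivors = [(i, y) for i, y in enumerate(received) if y is not None]
    assert len(survivors) >= k, "too many erasures to recover the message"
    points = survivors[:k]            # any k surviving positions determine P
    return [eval_interp(points, x, p) for x in range(k)]

print(rs_decode_erasures([None, 2, None, 4, 2], 3, 5))   # -> [3, 2, 4]
```

With up to n − k erasures at least k symbols survive, which is exactly why the code tolerates n − k erasures.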

Proposition 58 The Reed Solomon code is linear.

Proof. The proof is straightforward using the definition and construction of the Reed Solomon code.

2.4.7 Minimum distance of Reed Solomon code

Proposition 59 The minimum distance of a Reed Solomon code is d = n − k + 1.

Proof. Let (P(x_1), P(x_2), ..., P(x_n)) be a nonzero codeword.
As P is of degree at most k − 1 it has at most k − 1 roots, hence:

w(P(x_1), P(x_2), ..., P(x_n)) ≥ n − k + 1

So d ≥ n − k + 1, and the Singleton bound gives d ≤ n − k + 1.

Corollary 60 The Reed Solomon code matches the Singleton bound.

2.4.8 Decoding algorithm for RS code (original version)

The original decoding algorithm used the majority rule.
Suppose we received a word of length n containing e errors and we want to decode it.
Pick all possible k-tuples of received symbols, compute the associated interpolation polynomial, and re-evaluate it at all n points.
We end up with C(n, k) candidate codewords and take the majority one as the most probable.
This method needs to run C(n, k) polynomial interpolations and hence is not algorithmically efficient.
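The majority rule can be sketched as follows (the evaluation points 0, ..., n − 1 are the same assumption as in Example 56). It is exponential in general and is shown only to illustrate why a better algorithm is needed.

```python
# Brute-force majority decoding of an RS code: interpolate every k-subset of
# positions, re-evaluate everywhere, keep the most frequent candidate codeword.
from collections import Counter
from itertools import combinations

def eval_interp(points, x, p):
    total = 0
    for k, (xk, yk) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if j != k:
                num = num * (x - xj) % p
                den = den * (xk - xj) % p
        total = (total + yk * num * pow(den, -1, p)) % p
    return total

def rs_decode_majority(received, k, p):
    n = len(received)
    votes = Counter()
    for subset in combinations(range(n), k):   # C(n, k) interpolations
        points = [(i, received[i]) for i in subset]
        candidate = tuple(eval_interp(points, x, p) for x in range(n))
        votes[candidate] += 1
    return list(votes.most_common(1)[0][0])

# the codeword [3, 2, 4, 4, 2] of Example 56 with one error at position 2:
print(rs_decode_majority([3, 2, 1, 4, 2], 3, 5))   # -> [3, 2, 4, 4, 2]
```

Every k-subset avoiding the erroneous position votes for the true codeword, while subsets containing the error scatter their votes over distinct wrong candidates.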

2.4.9 Decoding algorithm for RS code (Berlekamp Welch Algorithm)

Definition 61 (Error locator polynomial) Consider the n pairs {(x_i, y_i)}_{1≤i≤n}, where (y_i)_{1≤i≤n} is the received message of an RS code and P(x) is the polynomial used in the encoding.
Define E(x) = ∏_{P(x_i)≠y_i} (x − x_i), the polynomial having as roots exactly the points x_i corresponding to the transmission errors. E(x) is called the error locator polynomial.

Consider now the polynomial Q(x) = E(x)P(x); this polynomial has the following property:

Q(x_i) = y_i E(x_i),  1 ≤ i ≤ n

Notice that if x_i corresponds to a transmission error then Q(x_i) = E(x_i)P(x_i) = 0 and y_i E(x_i) = 0, while if x_i corresponds to a correct transmission then y_i = P(x_i), so Q(x_i) = y_i E(x_i) by definition.
The main idea of the Berlekamp Welch algorithm is to find a polynomial Q and a polynomial E such that

Q(x_i) = y_i E(x_i),  1 ≤ i ≤ n

Notice that this algorithm may fail.

Algorithm 62 (Berlekamp Welch) .

Input: n ≥ k ≥ 1, 0 < e < (n − k + 1)/2, and n pairs {(x_i, y_i)}_{1≤i≤n}
Output: a polynomial P(x) of degree at most k − 1, or fail
Step 1: Compute a non-zero polynomial E(x) of degree e and a polynomial Q(x) of degree at most e + k − 1 such that

y_i E(x_i) = Q(x_i),  1 ≤ i ≤ n

If such polynomials do not exist, output fail.
Step 2: If E(x) does not divide Q(x) then output fail; else compute P(x) = Q(x)/E(x).
If d_H((y_1, ..., y_n), (P(x_1), ..., P(x_n))) > e then output fail, else output P(x).
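Step 1 is linear algebra: with E(x) = x^e + e_{e−1}x^{e−1} + ... + e_0 taken monic and Q(x) of degree at most e + k − 1, the n conditions y_i E(x_i) = Q(x_i) are linear in the unknown coefficients. The sketch below solves them by Gaussian elimination over Z/pZ. The field GF(7), the polynomial 1 + 2x and the corrupted position used at the bottom are illustrative assumptions, not data from the notes.

```python
# A sketch of the Berlekamp-Welch decoder over a prime field Z/pZ.

def solve_mod(A, b, p):
    """Return one solution of A z = b over Z/pZ, or None if inconsistent."""
    rows, cols = len(A), len(A[0])
    M = [[A[i][j] % p for j in range(cols)] + [b[i] % p] for i in range(rows)]
    pivots, r = [], 0
    for c in range(cols):
        piv = next((i for i in range(r, rows) if M[i][c]), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        inv = pow(M[r][c], -1, p)
        M[r] = [v * inv % p for v in M[r]]
        for i in range(rows):
            if i != r and M[i][c]:
                f = M[i][c]
                M[i] = [(M[i][j] - f * M[r][j]) % p for j in range(cols + 1)]
        pivots.append(c)
        r += 1
        if r == rows:
            break
    if any(M[i][cols] for i in range(r, rows)):
        return None                          # inconsistent system: fail
    z = [0] * cols                           # free variables set to 0
    for i, c in enumerate(pivots):
        z[c] = M[i][cols]
    return z

def poly_eval(cs, x, p):
    v = 0
    for c in reversed(cs):                   # Horner's rule
        v = (v * x + c) % p
    return v

def poly_divmod(num, den, p):
    """Quotient and remainder of num / den (coefficient lists, mod p)."""
    num, inv = num[:], pow(den[-1], -1, p)
    q = [0] * (len(num) - len(den) + 1)
    for i in range(len(q) - 1, -1, -1):
        q[i] = num[i + len(den) - 1] * inv % p
        for j, d in enumerate(den):
            num[i + j] = (num[i + j] - q[i] * d) % p
    return q, num

def berlekamp_welch(xs, ys, k, e, p):
    # unknowns: e_0..e_{e-1} (E monic of degree e), then q_0..q_{e+k-1}
    A = [[yi * pow(xi, j, p) % p for j in range(e)]
         + [(-pow(xi, j, p)) % p for j in range(e + k)]
         for xi, yi in zip(xs, ys)]
    b = [(-yi * pow(xi, e, p)) % p for xi, yi in zip(xs, ys)]
    z = solve_mod(A, b, p)
    if z is None:
        return None                          # Step 1 fails
    E, Q = z[:e] + [1], z[e:]
    P, rem = poly_divmod(Q, E, p)
    if any(rem):
        return None                          # E does not divide Q
    errs = sum(poly_eval(P, xi, p) != yi for xi, yi in zip(xs, ys))
    return P if errs <= e else None

# P(x) = 1 + 2x over GF(7) evaluated at 1..5 gives (3, 5, 0, 2, 4); we corrupt
# position x = 3 and decode with e = 1:
print(berlekamp_welch([1, 2, 3, 4, 5], [3, 5, 6, 2, 4], 2, 1, 7))  # -> [1, 2]
```

Any solution (E, Q) of the linear system satisfies Q = P·E when the number of errors is at most e, which is why taking free variables equal to 0 is harmless.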

Example 63 Suppose we are using a Reed Solomon code over Z/7Z with k = 2 and n = 5; hence our code has distance 4 and can correct one error.
Suppose we received the following message:

x   1 2 3 4 5
y   2 4 3 7 6

Define the error locator polynomial E(x) = 1 + ex and the polynomial Q(x) = a_0 + a_1 x + a_2 x^2. The conditions y_i E(x_i) = Q(x_i) give:

2(1 + e) = a_0 + a_1 + a_2
4(1 + 2e) = a_0 + 2a_1 + 4a_2
3(1 + 3e) = a_0 + 3a_1 + 9a_2
7(1 + 4e) = a_0 + 4a_1 + 16a_2
6(1 + 5e) = a_0 + 5a_1 + 25a_2

Reducing modulo 7:

2 + 2e = a_0 + a_1 + a_2
4 + e = a_0 + 2a_1 + 4a_2
3 + 2e = a_0 + 3a_1 + 2a_2
0 = a_0 + 4a_1 + 2a_2
6 + 2e = a_0 + 5a_1 + 4a_2

Chapter 3

Cyclic codes

3.1 Finite fields

Proposition 64 Z_n is a finite field if and only if n is a prime number.

Proof. Exercise.

3.1.1 Characteristic of a field

Definition 65 The characteristic of a field F is the smallest natural number n such that 1 + 1 + ... + 1 (n terms) = 0. If no such n exists we say that char F = 0.

Proposition 66 If F is a finite field then its characteristic is finite.

Proposition 67 If the characteristic m of a field F is not 0 then m is a prime number.

Proof. Suppose m is not a prime number. Then m = ab where a > 1 and b > 1.
Now consider the field elements t = Σ_{i=1}^{a} 1 and s = Σ_{i=1}^{b} 1; then ts = 0 since the characteristic is m.
As a field has no zero divisors, either t = Σ_{i=1}^{a} 1 = 0 or s = Σ_{i=1}^{b} 1 = 0, and this contradicts the minimality of m.

If F is a finite field of characteristic p, then F contains a subfield having p elements.
To see this, consider the set of elements

1, 1 + 1, ..., Σ_{i=1}^{p} 1.

These elements are all distinct and closed under addition and multiplication, and the additive and multiplicative inverses can be shown to exist for the non-zero elements.
This subfield is in bijection with Z/pZ and is referred to as the ground field of F.
A convenient way to think of F is as a vector space over its ground field; this will be helpful to prove the following result.

Proposition 68 If F is a finite field then the cardinal of F is a pure power of a prime number.

Proof. Since F is finite, as a linear space over its ground field it must have finite dimension n.
There exist then n elements {α_1, α_2, ..., α_n} of F which form a basis for F over its ground field, and hence:

F = { Σ_{i=1}^{n} λ_i α_i : λ_i ∈ Z/pZ }

This implies that F contains p^n elements for some positive integer n.

Notation 69 A field of cardinal q = p^n, where p is a prime number and n is some positive integer, will be denoted by GF(q), where GF stands for Galois Field.

We will now give some methods to construct finite fields of cardinal p^n for n ≥ 2.

3.1.2 Construction involving square roots

Constructing a field with 9 elements. Consider the field Z/3Z; if we compute the squares of all elements of this field we have

0^2 = 0, 1^2 = 1, 2^2 = 1

We notice that no element of Z/3Z is a square root of 2.
Let us denote by √2 a symbol such that (√2)^2 = 2, and define

(Z/3Z)(√2) = { x + y√2 : x, y ∈ Z/3Z }

We can show that this set has a field structure and contains 9 elements. The field structure is inherited from that of Z/3Z.
Denote by (Z/pZ)^2 the set of all squares of elements of Z/pZ:

(Z/pZ)^2 = { x^2 : x ∈ Z/pZ }

For example in Z/5Z: 0^2 = 0, 1^2 = 1, 2^2 = 4, 3^2 = 9 = 4, 4^2 = 1.
We have to notice that in general the cardinal |(Z/pZ)^2| = (p − 1)/2 + 1, because for every x ≠ 0 we have x^2 = (−x)^2.
Hence there are always elements of Z/pZ with no square root in Z/pZ, and it is therefore possible to define fields using extensions similar to that defined by √2.
Notice that extensions involving square roots are limited by the fact that the number of non-squares we can use to extend Z/pZ is limited to (p − 1)/2 for p ≥ 2.
For example using this method we can extend Z/3Z only by one dimension; we cannot then create fields of cardinal 3^n for n > 2.
There is a more convenient way to extend a finite field.

3.1.3 Construction involving polynomials

Definition 70 Let F be a field. A polynomial h(x) is said to be congruent to g(x) modulo f(x) if and only if there exists a polynomial l(x) ∈ F[x] such that

h(x) − g(x) = l(x) f(x)

We write h(x) ≡ g(x) (mod f(x)).

The definition implies that h(x) and g(x) are congruent if f(x) divides their difference. Equivalently, h(x) and g(x) leave the same remainder when divided by f(x).
This definition is valid for any polynomial f(x) in F[x]; it is possible to show that the congruence is an equivalence relation on F[x]. An equivalence relation partitions the set it is defined on into congruence classes.
We define addition and multiplication of congruence classes of polynomials as follows: for g(x), h(x) ∈ Z_p[x] define

[g(x)] + [h(x)] = [g(x) + h(x)]

and

[g(x)] . [h(x)] = [g(x) . h(x)]

If Z_p[x]/f(x) is the set of all equivalence classes in Z_p[x] under congruence modulo the irreducible polynomial f(x), then it is not hard to show that the operations defined above make this set of classes a finite field.
Most of the field axioms follow immediately; we will show only that for any [g(x)] ≠ [0] there exists [h(x)] such that [g(x)][h(x)] = [1].
This requires that g(x) h(x) ≡ 1 (mod f(x)).
Since f(x) is irreducible and f(x) does not divide g(x), the greatest common divisor of g(x) and f(x) is 1.
By the Euclidean algorithm for polynomials we know that there exist polynomials s(x) and t(x) such that s(x) g(x) + t(x) f(x) = 1, hence [s(x)][g(x)] = [1], which establishes that every non-zero class has a multiplicative inverse.

Case n = 2.

Consider the set Z/2Z[x] of all polynomials in x over the field Z/2Z. This set of polynomials has a ring structure; we choose an irreducible polynomial.

Example 71 We construct a field with 4 elements; we use P(x) = x^2 + x + 1 ∈ Z/2Z[x], which is irreducible over Z/2Z. This polynomial defines the field

F = {[0], [1], [x], [1 + x]}

Addition table

+        [0]      [1]      [x]      [1 + x]
[0]      [0]      [1]      [x]      [1 + x]
[1]      [1]      [0]      [x + 1]  [x]
[x]      [x]      [1 + x]  [0]      [1]
[1 + x]  [1 + x]  [x]      [1]      [0]

Product table

.        [0]      [1]      [x]      [1 + x]
[0]      [0]      [0]      [0]      [0]
[1]      [0]      [1]      [x]      [1 + x]
[x]      [0]      [x]      [1 + x]  [1]
[1 + x]  [0]      [1 + x]  [1]      [x]

Notice that by Euclidean division x^2 ≡ 1 + x (mod 1 + x + x^2).
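The product table can be reproduced in a few lines, representing an element a_0 + a_1 x as the bit list [a_0, a_1] and reducing with x^2 = 1 + x:

```python
# Building GF(4) = Z/2Z[x]/(x^2 + x + 1) and reproducing the product table of
# Example 71. An element a0 + a1 x is stored as the bit list [a0, a1].

def mul(a, b):
    """Product of two elements of GF(4)."""
    prod = [0, 0, 0]
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] ^= ai & bj
    if prod[2]:               # reduce with x^2 = 1 + x (from x^2 + x + 1 = 0)
        prod[0] ^= 1
        prod[1] ^= 1
    return prod[:2]

names = {(0, 0): "[0]", (1, 0): "[1]", (0, 1): "[x]", (1, 1): "[1+x]"}
x, one_plus_x = [0, 1], [1, 1]
print(names[tuple(mul(x, x))])                    # -> [1+x]
print(names[tuple(mul(x, one_plus_x))])           # -> [1]
print(names[tuple(mul(one_plus_x, one_plus_x))])  # -> [x]
```

The three printed products are exactly the non-trivial entries of the table above.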

Example 72 We construct a field with 9 elements; we use P(x) = x^2 + 1 ∈ Z/3Z[x], which is irreducible over Z/3Z. This polynomial defines the field

F = {[0], [1], [2], [x], [1 + x], [2 + x], [2x], [1 + 2x], [2 + 2x]}

Addition table

+ 0 1 2 x 2x 1+x 2+x 1 + 2x 2 + 2x
0 0 1 2 x 2x 1+x 2+x 1 + 2x 2 + 2x
1 1 2 0 1+x 1 + 2x 2+x x 2 + 2x 2x
2 2 0 1 2+x 2 + 2x x 1+x 2x 1 + 2x
x x 1+x 2+x 2x 0 1 + 2x 2 + 2x 1 2
2x 2x 1 + 2x 2 + 2x 0 x 1 2 1+x 2+x
1+x 1+x 2+x x 1 + 2x 1 2 + 2x 2x 2 0
2+x 2+x x 1+x 2 + 2x 2 2x 1 + 2x 0 1
1 + 2x 1 + 2x 2 + 2x 2x 1 1+x 2 0 2+x x
2 + 2x 2 + 2x 2x 1 + 2x 2 2+x 0 1 x 1+x

Product table

. 0 1 2 x 2x 1+x 2+x 1 + 2x 2 + 2x
0 0 0 0 0 0 0 0 0 0
1 0 1 2 x 2x 1+x 2+x 1 + 2x 2 + 2x
2 0 2 1 2x x 2 + 2x 1 + 2x 2+x 1+x
x 0 x 2x 2 1 x+2 2x + 2 1+x 1 + 2x
2x 0 2x x 1 2 1 + 2x 1+x 2 + 2x 2+x
1+x 0 1+x 2 + 2x x+2 1 + 2x 2x 1 2 x
2+x 0 2+x 1 + 2x 2 + 2x 1+x 1 x 2x 2
1 + 2x 0 1 + 2x 2+x 1+x 2 + 2x 2 2x x 1
2 + 2x 0 2 + 2x 1+x 1 + 2x 2+x x 2 1 2x

Example 73 From the product table we obtain

(1 + x)^0 = 1        (1 + x)^3 = 1 + 2x    (1 + x)^6 = x
(1 + x)^1 = 1 + x    (1 + x)^4 = 2         (1 + x)^7 = 2 + x
(1 + x)^2 = 2x       (1 + x)^5 = 2 + 2x    (1 + x)^8 = 1

Hence 1 + x generates all non-zero elements of the field.
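The powers of 1 + x can be checked mechanically, representing an element a + bx of GF(9) = Z/3Z[x]/(x^2 + 1) as the pair (a, b) and using x^2 = −1:

```python
# Verifying Example 73: 1 + x generates GF(9)* in Z/3Z[x]/(x^2 + 1).

def mul9(u, v):
    a, b = u
    c, d = v
    # (a + bx)(c + dx) = ac + (ad + bc)x + bd x^2, and x^2 = -1 in this field
    return ((a * c - b * d) % 3, (a * d + b * c) % 3)

g = (1, 1)                       # the element 1 + x
powers, power = [], (1, 0)
for i in range(9):
    powers.append(power)         # powers[i] = (1 + x)^i
    power = mul9(power, g)

print(powers)
# the first 8 powers are pairwise distinct and (1 + x)^8 = 1:
assert len(set(powers[:8])) == 8 and powers[8] == (1, 0)
```

The list of pairs reproduces the table above: (0, 2) is 2x, (2, 0) is 2, (2, 1) is 2 + x, and so on.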

3.1.4 Properties of finite fields

Proposition 74 Let F be a field of characteristic p; then for every a, b ∈ F

(a + b)^p = a^p + b^p

Proof. We have (a + b)^p = Σ_{j=0}^{p} C(p, j) a^{p−j} b^j; as p divides C(p, j) for 0 < j < p, the result follows immediately.

Definition 75 An element a in a finite field F is said to be a generator of F* or a primitive element of F if a generates all non-zero elements of F:

{a^i : i ≥ 0} = F*

Proposition 76 For every non-zero element a ∈ GF(p) we have a^{p−1} = 1. An element a ∈ GF(p^n) lies in GF(p) if and only if a^p = a.

Proof. Let the distinct non-zero elements of GF(p) be x_1, ..., x_{p−1}. Then

{a.x_1, a.x_2, ..., a.x_{p−1}} = {x_1, x_2, ..., x_{p−1}}

since the elements a.x_i are all distinct (otherwise, cancelling a, two of the x_i would be equal). Hence

a.x_1 . a.x_2 ... a.x_{p−1} = x_1 . x_2 ... x_{p−1}
⇒ a^{p−1} . (x_1 . x_2 ... x_{p−1}) = x_1 . x_2 ... x_{p−1}
⇒ a^{p−1} = 1.

Hence every element of GF(p) is a root of the polynomial x^p − x.
Now the elements of GF(p) provide p distinct roots of the polynomial x^p − x, so this polynomial has no other roots in GF(p^n); thus a^p = a implies a ∈ GF(p).

3.1.5 Minimal polynomial

Definition 77 Let F be a field of characteristic p and let α ∈ F*. A minimal polynomial of α with respect to GF(p) is a monic polynomial m(x) of least degree in GF(p)[x] such that m(α) = 0.

Proposition 78 The minimal polynomial of an element α is unique.

Proof. Suppose the cardinal of F is q = p^n, where p is the characteristic of F. As α ∈ F* it satisfies the equation x^{q−1} − 1 = 0.
Hence there is always some polynomial in GF(p)[x] for which α is a root, which establishes the existence of a minimal polynomial.
Suppose there are two monic polynomials m_1(x) and m_2(x) of least degree having α as a root.
By the division algorithm we have

m_1(x) = l(x) m_2(x) + r(x)  where deg r(x) < deg m_2(x)

Since m_1(α) = 0 and m_2(α) = 0 we have r(α) = 0.
As m_2(x) has least degree this implies that r(x) = 0, hence m_2(x) divides m_1(x).
We can show similarly that m_1(x) divides m_2(x); since both are monic, they are equal.

Notation 79 We will denote the minimal polynomial of an element α by m_α(x).

Proposition 80 For α ∈ F*, the minimal polynomial m_α(x) of α is an irreducible polynomial.

Proof. If m_α(x) is reducible then m_α(x) = h(x) l(x) with deg h(x) ≥ 1 and deg l(x) ≥ 1.
Hence at least one of h(x) and l(x) has α as a root, and this contradicts the minimality of degree in the definition of m_α(x).

Definition 81 For α ∈ F let t be the smallest positive integer such that α^{p^t} = α. Then the set of conjugates of α with respect to GF(p) is

C(α) = {α, α^p, ..., α^{p^{t−1}}}

Remark 82 We have C(α) = C(α^{p^i}) for all i, for a field F with characteristic p.

Lemma 83 Let F be a finite field of characteristic p, let α ∈ F* and let C(α) be the set of conjugates of α with respect to GF(p). Then

m(x) = ∏_{β∈C(α)} (x − β)

is a polynomial with coefficients from GF(p).

Proof. Let m(x) = Σ_{i=0}^{t} m_i x^i. The coefficients m_i are in F; we need to prove that they are in fact in the ground field GF(p).
First note that:

m(x)^p = ∏_{β∈C(α)} (x − β)^p = ∏_{β∈C(α)} (x^p − β^p) = ∏_{β∈C(α)} (x^p − β) = m(x^p) = Σ_{i=0}^{t} m_i x^{ip}

where the third equality holds because β ↦ β^p permutes C(α).
Note also that:

m(x)^p = (Σ_{i=0}^{t} m_i x^i)^p = Σ_{i=0}^{t} m_i^p x^{ip}

Hence m_i^p = m_i, implying that m_i ∈ GF(p) for 0 ≤ i ≤ t.
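The lemma can be checked on a concrete field. The sketch below works in GF(8) = Z/2Z[x]/(x^3 + x + 1) (the modulus is a chosen example, not fixed by the notes), computes the conjugates of α = x by repeated squaring, and expands the product to see that its coefficients land in GF(2).

```python
# Conjugates of alpha = x in GF(8) = Z/2Z[x]/(x^3 + x + 1), and the check that
# m(X) = prod_{beta in C(alpha)} (X - beta) has coefficients in GF(2).
# Field elements are 3-bit integers b2 b1 b0.

def gmul(a, b):
    """Product in GF(8) via carry-less multiplication mod x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:
            a ^= 0b1011          # subtract x^3 + x + 1
    return r

alpha = 0b010                    # the class of x
conj, beta = [], alpha
while beta not in conj:
    conj.append(beta)
    beta = gmul(beta, beta)      # Frobenius map: beta -> beta^2
print(conj)                      # -> [2, 4, 6], i.e. {x, x^2, x^2 + x}

# expand m(X) = (X + b1)(X + b2)(X + b3); over characteristic 2, -b = b
m = [1]                          # coefficients of m, lowest degree first
for b in conj:
    shifted = [0] + m                            # m(X) * X
    scaled = [gmul(b, c) for c in m] + [0]       # m(X) * b
    m = [s ^ t for s, t in zip(shifted, scaled)]
print(m)                         # -> [1, 1, 0, 1], i.e. m(X) = 1 + X + X^3
```

Every coefficient of m is 0 or 1, as the lemma predicts, and here m(X) is exactly the modulus x^3 + x + 1 (since x is a root of it, this is its minimal polynomial).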

3.2 Rings and Ideals

3.2.1 Basic definitions and examples

Definition 84 .
(1) A ring R is a set together with two binary operations + and × (called addition and multiplication) satisfying the following axioms:
(i) (R, +) is an abelian group.
(ii) × is associative: (a × b) × c = a × (b × c) for all a, b, c ∈ R.
(iii) the distributive laws hold in R:

(a + b) × c = (a × c) + (b × c) and a × (b + c) = (a × b) + (a × c)

(2) The ring R is commutative if the multiplication is commutative.
(3) The ring R is said to have an identity (or contain a 1) if there is an element 1 ∈ R with

1 × a = a × 1 = a for all a ∈ R

We shall usually write simply ab rather than a × b for a, b ∈ R.
The additive identity of R will always be denoted by 0 and the additive inverse of the ring element a will be denoted by −a.
The condition that R be a group under addition is a natural one, but it may seem artificial to require that this group be abelian.
One motivation for this is that if the ring R has a 1, the commutativity of addition is forced by the distributive laws.

To see this, compute the product (1 + 1)(a + b) in two different ways, using the distributive laws (but not assuming that addition is commutative). One obtains

(1 + 1)(a + b) = 1(a + b) + 1(a + b) = 1a + 1b + 1a + 1b = a + b + a + b
(1 + 1)(a + b) = (1 + 1)(a) + (1 + 1)(b) = 1a + 1a + 1b + 1b = a + a + b + b

Since R is a group under addition, this implies b + a = a + b, i.e. that R under addition is necessarily commutative.

Definition 85 A ring R with identity 1, where 1 ≠ 0, is called a division ring (or skew field) if every nonzero element a ∈ R has a multiplicative inverse, i.e. there exists b ∈ R such that ab = ba = 1.
A commutative division ring is called a field.

Example 86 The simplest examples of rings are the trivial rings obtained by taking R to be any commutative group (denoting the group operation by +) and defining the multiplication × on R by a × b = 0 for all a, b ∈ R.
It is easy to see that this multiplication defines a commutative ring. In particular, if R = {0} is the trivial group, the resulting ring R is called the zero ring, denoted R = 0.

Example 87 The quotient group Z/pZ is a commutative ring with identity under the operations of addition and multiplication of residue classes.

Example 88 Let us consider the set

H = {a + bi + cj + dk : a, b, c, d ∈ R}

Addition is defined componentwise:

(a + bi + cj + dk) + (a' + b'i + c'j + d'k) = (a + a') + (b + b')i + (c + c')j + (d + d')k

and multiplication is defined by expanding (a + bi + cj + dk)(a' + b'i + c'j + d'k) using the distributive law and simplifying with

i^2 = j^2 = k^2 = −1, ij = −ji = k, jk = −kj = i, ki = −ik = j

where the real number coefficients commute with i, j and k.
For example:

(1 + i + 2j)(j + k) = 1(j + k) + i(j + k) + 2j(j + k)
                    = j + k + ij + ik + 2j^2 + 2jk
                    = j + k + k + (−j) + 2(−1) + 2(i) = −2 + 2i + 2k

The Hamilton Quaternions are a noncommutative ring with identity 1 = 1 + 0i + 0j + 0k.
Similarly one can define the ring of rational Hamilton Quaternions by taking a, b, c, d to be rational numbers.
Both rational and real Hamilton Quaternions are division rings, where inverses of nonzero elements are given by

(a + bi + cj + dk)^{−1} = (a − bi − cj − dk) / (a^2 + b^2 + c^2 + d^2)

Definition 89 Let R be a ring.
1) A nonzero element a of R is called a zero divisor if there is a nonzero element b in R such that either ab = 0 or ba = 0.
2) Assume R has an identity 1 ≠ 0. An element u of R is called a unit in R if there is some v in R such that uv = vu = 1. The set of units in R is denoted R^×.

It is easy to see that the units in a ring R form a group under multiplication, so R^× will be referred to as the group of units of R. In this terminology, a field is a commutative ring F with identity 1 ≠ 0 in which every nonzero element is a unit, i.e. F^× = F − {0}.
Observe that a zero divisor can never be a unit. Suppose for example that a is a unit in R and that ab = 0 for some nonzero b in R. Then va = 1 for some v ∈ R, so b = 1b = (va)b = v(ab) = v0 = 0, a contradiction.
Similarly, if ba = 0 for some nonzero b then a cannot be a unit.
This shows in particular that fields contain no zero divisors.

Example 90 The ring Z of integers has no zero divisors and its only units are ±1, i.e. Z^× = {±1}.

Example 91 Let n be an integer ≥ 2. In the ring Z/nZ the elements u for which u and n are relatively prime are units. On the other hand, if a is a nonzero integer and a is not relatively prime to n, then we show that a is a zero divisor in Z/nZ. To see this let d be the gcd of a and n and let b = n/d. By assumption d > 1, so 0 < b < n, i.e. b ≠ 0.
But by construction n divides ab, that is ab = 0 in Z/nZ. This shows that every nonzero element of Z/nZ is either a unit or a zero divisor.
Furthermore, every nonzero element is a unit if and only if every integer a in the range 0 < a < n is relatively prime to n.
This happens if and only if n is a prime, i.e. Z/nZ is a field if and only if n is a prime.

Example 92 If R is the ring of all functions from the closed interval [0, 1] to R, then the units of R are the functions that are not zero at any point (for such an f its inverse is the function 1/f).
If f is not a unit and not zero then f is a zero divisor, because if we define:

g(x) = 0 if f(x) ≠ 0
g(x) = 1 if f(x) = 0

then g is not the zero function but f(x)g(x) = 0 for all x.

Example 93 If R is the ring of all continuous functions from the closed interval [0, 1] to R, then the units of R are still the functions that are not zero at any point.
For example f(x) = x − 1/2 has only one zero, at x = 1/2, so f is not a unit.
On the other hand, if gf = 0 then g must be zero for all x ≠ 1/2, and the only continuous function with this property is the zero function.
Hence f is neither a unit nor a zero divisor. Similarly, any function with only a finite (or countable) number of zeros on [0, 1] is neither a unit nor a zero divisor.
This ring also contains many zero divisors. For instance let:

f(x) = 0         for 0 ≤ x ≤ 1/2
f(x) = x − 1/2   for 1/2 ≤ x ≤ 1

and let g(x) = f(1 − x); then f and g are nonzero continuous functions whose product is the zero function.

Definition 94 A commutative ring with identity 1 ≠ 0 is called an integral domain if it has no zero divisors.

The absence of zero divisors in integral domains gives these rings a cancellation property:

Proposition 95 Assume a, b and c are elements of any ring with a not a zero divisor. If ab = ac then either a = 0 or b = c.

Proof. If ab = ac then a(b − c) = 0, so either a = 0 or b − c = 0.

Remark 96 If R is an integral domain then the proposition is valid for every a, b, c in R.

Corollary 97 Any finite integral domain is a field.

Proof. Let R be a finite integral domain and let a be a nonzero element of R. By the cancellation law the map x ↦ ax is an injective function. Since R is finite this map is also surjective.
In particular there is some b ∈ R such that ab = 1, i.e. a is a unit in R. Since a was an arbitrary nonzero element, R is a field.

Definition 98 A subring of the ring R is a subgroup of R that is closed under multiplication.

3.3 Ideals

Definition 99 Let (R, +, ∗) be a ring. A nonempty subset I of R is called an ideal of the ring if:
i) (I, +) is a subgroup, and
ii) for all i ∈ I and all r ∈ R we have i ∗ r ∈ I.

Example 100 An example of a non-trivial ideal is nZ ⊂ Z for an integer n > 1.
The corresponding quotient ring is the ring Z/nZ of residue classes modulo n.

Example 101 Let (R, +, ∗) be a ring and g ∈ R; define the set

I = {g ∗ r : r ∈ R}

It is easy to verify that I is an ideal; it is called the ideal generated by g.

Definition 102 Let (R, +, ∗) be a ring. R is called a principal ideal ring if for every ideal I of R there exists an element g ∈ I such that I = {g ∗ r : r ∈ R}.

Proposition 103 Let F be a field; then F[x], the set of polynomials over F, is a principal ideal ring.

Proof. Let I be an ideal of F[x]. If I ≠ {0}, let g(x) be a monic polynomial of least degree in I.
Consider any h(x) ∈ I; by the division algorithm for polynomials we have

h(x) = q(x) g(x) + r(x)

where r(x) = 0 or deg(r) < deg(g).
Since g(x) ∈ I it follows from the definition that q(x) g(x) ∈ I, hence r(x) = h(x) − q(x) g(x) ∈ I.
Since g(x) is a polynomial of least degree we must have r(x) = 0, and thus g(x) divides h(x).

Proposition 104 Let F be a field and f(x) ∈ F[x]; then F[x]/f(x) is a principal ideal ring.

Proof. The proof is very similar to the previous one, just using equivalence classes.
Let I be an ideal of F[x]/f(x). If I ≠ {[0]}, let g(x) be a monic polynomial of least degree which represents some class in I.
Consider any [h(x)] ∈ I; by the division algorithm for polynomials we have

h(x) = q(x) g(x) + r(x)

where r(x) = 0 or deg(r) < deg(g).
We have then:

[h(x)] = [q(x) g(x) + r(x)] = [q(x) g(x)] + [r(x)]

Since [q(x) g(x)] ∈ I it follows from the definition that [r(x)] = [h(x)] − [q(x) g(x)] ∈ I.
This implies r(x) = 0, and thus g(x) divides h(x).

3.4 Cyclic codes
A cyclic code is a block code where the cyclic shift of each codeword gives another codeword.

Definition 105 Let F be a field. A subspace S of F^n is a cyclic subspace if for every (a_1, a_2, ..., a_{n−1}, a_n) ∈ S we have (a_n, a_1, a_2, ..., a_{n−1}) ∈ S.

Definition 106 A linear code C is a cyclic code if C is a cyclic subspace.

Example 107 .
1) C = {0000, 0101, 1010, 1111} ⊂ (Z_2)^4 is a cyclic code.
2) C' = {1111, 2222} ⊂ (Z_3)^4 is not a cyclic code (it is not a subspace).

3.5 Cyclic subspaces construction

Let F be a field; we will associate to each element a = (a_0, ..., a_{n−1}) ∈ F^n the polynomial a_0 + a_1 x + ... + a_{n−1} x^{n−1}.
In some way each sequence will be identified with a polynomial; this conveniently gives us a tool to compute the product of sequences by identifying the result with the product of the two associated polynomials.
Let us consider the polynomial f(x) = x^n − 1 and the ring F[x]/f(x).
As f(x) = x^n − 1 we have x^n ≡ 1 (mod f(x)).
Consider now any polynomial P(x) = a_0 + a_1 x + ... + a_{n−1} x^{n−1}; we have then

xP(x) = a_0 x + a_1 x^2 + ... + a_{n−1} x^n
      = a_{n−1} + a_0 x + ... + a_{n−2} x^{n−1}

Hence multiplying by x corresponds to a cyclic shift of a = (a_0, ..., a_{n−1}).

Proposition 108 A non-empty subset S of F^n is a cyclic subspace if and only if the set of polynomials I associated with S is an ideal in the ring R = F[x]/(x^n − 1).

Proof. 1) Suppose S is a cyclic subspace.
As S is a linear subspace, (I, +) is an abelian group.
We will show that I is closed under multiplication by elements from R.
Let (a_0, ..., a_{n−1}) ∈ S; since S is cyclic we know that (a_{n−1}, a_0, ..., a_{n−2}) ∈ S, hence xP(x) ∈ I.
We then obtain that for every 0 ≤ i ≤ n − 1 we have x^i P(x) ∈ I.
As S is a vector space we have Σ_{j=0}^{n−1} λ_j x^j P(x) ∈ I.
Hence I is an ideal.
2) Suppose that I is an ideal of R.
The fact that S is a vector space follows from the facts that (I, +) is a group and that scalar multiplication in S corresponds to multiplication by constant polynomials.
The fact that S is a cyclic subspace follows from the fact that I is closed under polynomial multiplication by all elements of R.
In particular, multiplication by x corresponds to a cyclic shift.

Proposition 109 Let I be a non-zero ideal in R = F[x]/(x^n − 1) and let g(x) be a monic polynomial of least degree which represents some class of I. Then [g(x)] generates I and g(x) divides f(x) = x^n − 1.

Proof. Let f(x) = h(x) g(x) + r(x) with deg(r(x)) < deg(g(x)); then

[f(x)] = [h(x)][g(x)] + [r(x)]

Since [f(x)] = [0] we have [r(x)] = [−h(x)][g(x)] ∈ I; then by definition of g(x) we have r(x) = 0, and then g(x) divides f(x).

Proposition 110 There is a unique monic polynomial of least degree which generates a non-zero ideal I of R.

Proof. Suppose that g_1(x) and g_2(x) are monic polynomials of least degree in an ideal I, that is, deg(g_1(x)) = deg(g_2(x)).
Since g_1(x) generates I and g_2(x) ∈ I, we have g_2(x) = P(x) g_1(x) for some polynomial P(x); since g_1(x) and g_2(x) are monic and have the same degree, P(x) = 1, hence g_1(x) = g_2(x).
The unique monic polynomial of least degree in an ideal I is called the generator polynomial.

Proposition 111 Let g(x) be a monic divisor of f(x) = x^n − 1. Then g(x) is the generator polynomial of the ideal I = {[P(x) g(x)] : P(x) ∈ R} of R = F[x]/(f(x)).

Proof. Let g'(x) be a monic polynomial of least degree representing a class in I; then g'(x) is the generator polynomial of I, hence g'(x) divides f(x).
As [g'(x)] ∈ I there exists a polynomial P(x) such that [g'(x)] = [P(x) g(x)], so

g'(x) = P(x) g(x) + l(x) f(x)

Since g(x) divides f(x), it follows that g(x) divides g'(x).
On the other hand, since g'(x) is the generator polynomial and g(x) represents a class of I, g'(x) divides g(x); since both are monic we conclude that g(x) = g'(x).
We will refer to the unique monic polynomial of least degree in an ideal I as the generator polynomial.

Example 112 Consider F = Z/2Z and f(x) = x^7 − 1; we have:

x^7 − 1 = (x + 1)(x^3 + x^2 + 1)(x^3 + x + 1)

The monic divisors of f(x) are

g_1(x) = 1, g_2(x) = x + 1, g_3(x) = x^3 + x^2 + 1, g_4(x) = x^3 + x + 1
g_5(x) = (x + 1)(x^3 + x^2 + 1), g_6(x) = (x + 1)(x^3 + x + 1)
g_7(x) = (x^3 + x^2 + 1)(x^3 + x + 1), g_8(x) = f(x)

For example g_6(x) will generate the following cyclic subspace:

Operation    Polynomial in Z/2Z[x]     Reduced mod x^7 − 1       Codeword
g_6(x)       x^4 + x^3 + x^2 + 1       x^4 + x^3 + x^2 + 1       1011100
x g_6(x)     x^5 + x^4 + x^3 + x       x^5 + x^4 + x^3 + x       0101110
x^2 g_6(x)   x^6 + x^5 + x^4 + x^2     x^6 + x^5 + x^4 + x^2     0010111
x^3 g_6(x)   x^7 + x^6 + x^5 + x^3     x^6 + x^5 + x^3 + 1       1001011
x^4 g_6(x)   x^8 + x^7 + x^6 + x^4     x^6 + x^4 + x + 1         1100101
x^5 g_6(x)   x^9 + x^8 + x^7 + x^5     x^5 + x^2 + x + 1         1110010
x^6 g_6(x)   x^10 + x^9 + x^8 + x^6    x^6 + x^3 + x^2 + x       0111001
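Because multiplication by x modulo x^7 − 1 is a cyclic shift, the codewords of Example 112 can be generated with a one-line shift:

```python
# Reproducing the table of Example 112: the cyclic subspace generated by
# g6(x) = x^4 + x^3 + x^2 + 1 in Z/2Z[x]/(x^7 - 1). A codeword lists the
# coefficients a0 a1 ... a6.

g6 = [1, 0, 1, 1, 1, 0, 0]           # 1 + x^2 + x^3 + x^4

def times_x(word):
    # multiplication by x modulo x^7 - 1 is a cyclic shift to the right
    return [word[-1]] + word[:-1]

words, word = [], g6
for i in range(7):
    words.append("".join(map(str, word)))
    word = times_x(word)

print(words)
# -> ['1011100', '0101110', '0010111', '1001011', '1100101', '1110010', '0111001']
```

The seven strings are exactly the Codeword column of the table; the full code is their span over Z/2Z.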

Proposition 113 Let g(x) be a monic divisor of x^n − 1 having degree n − k. Then g(x) is the generator polynomial for a cyclic subspace of F^n of dimension k.

Proof. Let S be the cyclic subspace of F^n generated by g(x) and consider the set B of vectors corresponding to

B = {g(x), xg(x), ..., x^{k−1} g(x)}

We will show that B is a basis for S.
Consider any linear combination Σ_{i=0}^{k−1} λ_i x^i g(x) = 0. Each term has degree less than n, so no reduction modulo x^n − 1 occurs, and the only term which can contain x^{n−1} is λ_{k−1} x^{k−1} g(x); hence λ_{k−1} = 0, and by similar arguments we get successively that all λ_i are equal to 0.
Consider any element h(x) ∈ S; since g(x) is the generator of S we have

h(x) = P(x) g(x)

As g(x) is of degree n − k and we are computing modulo x^n − 1, P(x) may be written in the form of a polynomial of degree less than or equal to k − 1:

P(x) = Σ_{i=0}^{k−1} λ_i x^i  ⇒  h(x) = Σ_{i=0}^{k−1} λ_i x^i g(x)

So B is a basis for S.

3.5.1 Finding irreducible polynomials

Given any monic polynomial E(x) of degree n, one can check whether it is irreducible by testing if gcd(E(x), x^{q^n} − x) = E(x).
This is true as every irreducible polynomial in F[x] of degree exactly n divides the polynomial x^{q^n} − x.
The Euclidean algorithm for computing the gcd of two polynomials can be implemented in time polynomial in the minimum of the degrees of the two polynomials.

Algorithm 114 Input: a prime power q and an integer n > 1
Output: a monic irreducible polynomial of degree n over F = GF(q)
b ← 0
while b = 0 do
    F(x) ← x^n + Σ_{i=0}^{n−1} f_i x^i where each f_i is chosen uniformly at random from F
    if gcd(F(x), x^{q^n} − x) = F(x) then b ← 1
Return F(x)
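A sketch of Algorithm 114 for a prime q, so that coefficient arithmetic is plain arithmetic mod q. Instead of forming the huge polynomial x^(q^n) − x, we compute x^(q^n) mod F(x) by repeated exponentiation: gcd(F(x), x^(q^n) − x) = F(x) holds exactly when x^(q^n) ≡ x (mod F(x)). This implements the criterion as stated above; a complete general-purpose test (e.g. Rabin's) adds gcd checks for proper divisors of n, but for the parameters used below the criterion is exact.

```python
# Random search for an irreducible polynomial, following Algorithm 114.
import random

def poly_mulmod(a, b, mod, q):
    """(a * b) mod the monic polynomial mod; coefficient lists over Z/qZ."""
    res = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            res[i + j] = (res[i + j] + ai * bj) % q
    d = len(mod) - 1
    for i in range(len(res) - 1, d - 1, -1):     # reduce high-degree terms
        c = res[i]
        if c:
            for j in range(d + 1):
                res[i - d + j] = (res[i - d + j] - c * mod[j]) % q
    return (res + [0] * d)[:d]

def is_irreducible(F, q):
    """The notes' test: x^(q^n) == x mod F, i.e. gcd(F, x^(q^n) - x) = F."""
    n = len(F) - 1
    x = [0, 1] + [0] * (n - 2)                   # the polynomial x
    def powmod(base, e):
        r = [1] + [0] * (n - 1)
        while e:
            if e & 1:
                r = poly_mulmod(r, base, F, q)
            base = poly_mulmod(base, base, F, q)
            e >>= 1
        return r
    t = x
    for _ in range(n):                           # t -> t^q, applied n times
        t = powmod(t, q)
    return t == x

def random_irreducible(q, n):
    while True:
        F = [random.randrange(q) for _ in range(n)] + [1]   # monic, degree n
        if is_irreducible(F, q):
            return F

print(random_irreducible(2, 3))   # one of the two irreducible cubics over Z/2Z
```

Over Z/2Z the two irreducible cubics are x^3 + x + 1 and x^3 + x^2 + 1, so the loop terminates quickly (each random trial succeeds with probability 2/8).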

Chapter 4

Infinite length codes

Convolutional codes, or infinite length codes, have been widely used in wireless communications (WiFi, cellular, and satellite) and are constituents of the widely used Turbo codes. Unlike block codes, convolutional codes do not have a finite block length.
A convolutional encoder uses linear shift registers to encode k input bits into n output bits.
Let us explain this with an example of a convolutional encoder.
Let us explain this with an example of a convolutional encoder.

The information arrives as a continuous stream of data; suppose that at some moment we can see the finite sequence 010110.

This sequence goes into the encoder and is encoded "continuously":

time step | input data | output data
    1     |    110     |     10
    2     |    011     |     10
    3     |    101     |     00
    4     |    010     |     01

Hence the output stream contains a sequence of the form

....10100001....
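The diagram of the encoder used in this example is not reproduced here, so as an illustrative sketch here is a standard rate-1/2 convolutional encoder with generator polynomials (7, 5) in octal and constraint length 3, a classical choice in wireless systems; the function name and the generator choice are ours, not the text's.

```python
def conv_encode(bits, g1=0b111, g2=0b101):
    """Rate-1/2 convolutional encoder: for each input bit, shift it into a
    3-bit register and emit one parity bit per generator polynomial."""
    state = 0
    out = []
    for b in bits:
        state = ((state << 1) | b) & 0b111          # slide the window
        out.append(bin(state & g1).count("1") % 2)  # parity over taps of g1
        out.append(bin(state & g2).count("1") % 2)  # parity over taps of g2
    return out
```

For a single 1 followed by zeros, `conv_encode([1, 0, 0])` yields `[1, 1, 1, 0, 1, 1]`: the interleaved impulse responses 111 and 101 of the two generators.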

4.1 Symbolic dynamics

A convolutional encoder may be seen as a map from A^Z (resp. A^N) to some set B^Z (resp. B^N); in most cases the alphabets are identical and binary.
The properties of convolutional encoders can be mathematically understood using tools from the theory of symbolic dynamics.
We start with a reminder of basic definitions related to finite and infinite words.

Definition 115 .
1- Let A be a finite set called an alphabet; an element of A is referred to as a letter.
2- A finite word is a finite sequence of elements of A. The length of a finite word u = u_0...u_{n−1} ∈ A^n is |u| = n.
3- The set of finite words is A* = ∪_{n=0}^∞ A^n. The set A^0 = {λ}, where λ is the empty word.
4- We denote by A^N the set of infinite words over A and by A^Z the set of bi-infinite sequences over A.
5- For two integers i, j with i < j we denote by x_{[i,j]} the word x_i...x_j.
6- The length of an infinite word is ∞.

For an element of A^Z we may use a decimal point notation to avoid confusion, the first symbol to the right of the point being at position 0.
For example x = ...000.100... ∈ {0,1}^Z has a one at position zero (x_0 = 1).

Definition 116 .
1- Let u = u_0...u_{n−1} and v = v_0...v_{m−1} be two finite words; the concatenation of u and v is the word uv = u_0...u_{n−1}v_0...v_{m−1}.
2- We say that u is a factor of v, denoted u ⊑ v, if there are two words x, y such that xuy = v.
3- If x = λ the word u is said to be a prefix of v, denoted u ⊑_p v.
4- If y = λ the word u is said to be a suffix of v, denoted u ⊑_s v.
5- If u is a nonempty finite word we denote by u^∞ ∈ A^N the infinite concatenation uuu...

The set of finite words A* equipped with concatenation is a monoid: the operation is associative and λ is the neutral element.

4.1.1 The Bernoulli shift

Definition 117 Let A be an alphabet; the shift map is defined from A^Z (resp. A^N) to itself by

σ(x)_i = x_{i+1}

The name comes from the fact that computing the image of a point amounts to shifting its coordinates one place to the left:

x      ... x_{−2} x_{−1} x_0 x_1 x_2 ...
σ(x)   ... x_{−1} x_0  x_1 x_2 x_3 ...
σ²(x)  ... x_0  x_1  x_2 x_3 x_4 ...

Proposition 118 The shift map is continuous.

Proposition 118 The shift map is continuous.

Proof. Case of A^Z.
If d(x, y) = 2^0 = 1 then d(σ(x), σ(y)) ≤ 1 ≤ 2 d(x, y), and if d(x, y) = 2^{−n} with n ≥ 1 then d(σ(x), σ(y)) ≤ 2^{−n+1}; thus for all x, y ∈ A^Z

d(σ(x), σ(y)) ≤ 2 d(x, y)

so σ is (Lipschitz) continuous.

Proposition 119 The shift map is bijective when defined on the set A^Z, but only surjective when defined on the set A^N.

Proof. Exercise.

4.1.2 Periodic points of the shift map

Definition 120 Let F be a function from X to itself. A point x ∈ X is said to be of period p ∈ N* if

F^p(x) = x  and  F^i(x) ≠ x for all 0 < i < p.

A fixed point of the shift satisfies σ(x) = x, hence by definition:

∀i ∈ N : x_i = σ(x)_i = x_{i+1}  ⇒  ∃a ∈ A, ∀i ∈ N : x_i = a.

Hence the number of fixed points equals the cardinality of A.

Points of period 2 must satisfy the condition:

∀i ∈ N : x_i = σ²(x)_i = x_{i+2}  ⇒  ∃a, b ∈ A, ∀i ∈ N : x_{2i} = a, x_{2i+1} = b.

Hence such points are determined by a pair of elements of A.

In general, points of period q of (A^Z, σ) are written as the infinite concatenation of a block of q letters of A. This is a necessary but not sufficient condition.

Example 121 The point x = (1212)^∞ is not 4-periodic but 2-periodic.

Proposition 122 The set of periodic points of the shift is dense in A^N (resp. A^Z).

Proof. For every x ∈ A^N define the sequence y^(n) = (x_{[0,n]})^∞. By definition of the distance we have lim_{n→∞} d(y^(n), x) = 0.

4.1.3 Topology of symbolic spaces

Proposition 123 On the symbolic space A^Z (resp. A^N), the map

d : A^Z × A^Z → R

d(x, y) = 2^{−n} if x ≠ y, where n = min{i ≥ 0 : x_i ≠ y_i or x_{−i} ≠ y_{−i}} (resp. n = min{i ≥ 0 : x_i ≠ y_i}),
d(x, y) = 0 if x = y,

defines a distance on A^Z (resp. A^N).

Proof. Exercise.
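On one-sided sequences the distance can be computed from the first index of disagreement. The snippet below is our own illustration, working on finite prefixes of elements of A^N; it also checks the Lipschitz bound d(σ(x), σ(y)) ≤ 2 d(x, y) from Proposition 118 on a concrete pair.

```python
def dist(x, y):
    """2^-n with n the first index where the (one-sided) sequences differ."""
    for i, (a, b) in enumerate(zip(x, y)):
        if a != b:
            return 2.0 ** (-i)
    return 0.0  # equal on the common prefix

def shift(x):
    """The one-sided shift: drop the symbol at position 0."""
    return x[1:]

x = [0, 1, 0, 1, 1, 0, 1, 0]
y = [0, 1, 0, 1, 0, 0, 1, 0]   # first disagreement at index 4
assert dist(x, y) == 2.0 ** (-4)
assert dist(shift(x), shift(y)) <= 2 * dist(x, y)
```

Here the bound is attained with equality: shifting moves the first disagreement from index 4 to index 3, doubling the distance.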

Definition 124 Let u ∈ A^n and m ∈ Z; define the cylinder [u]_m by:

[u]_m = {x ∈ A^Z : x_{[m,m+n−1]} = u}.

The cylinder [u]_0 is simply denoted [u].

Proposition 125 Cylinders of A^Z are both open and closed.

Proof. 1. Openness.
Consider the cylinder [u] = {x ∈ A^Z : x_{[0,n−1]} = u} with n = |u|. If x ∈ [u] and d(x, y) < 2^{−n} then y_{[0,n−1]} = x_{[0,n−1]} = u, so the open ball of radius 2^{−n} centered at x is contained in [u]. Hence [u] is open.
For a general cylinder it is enough to note that [u]_m = σ^{−m}([u]).
2. Closedness.
Let (x^n)_{n∈N} be a sequence of elements of [u]; we show that if the limit x of (x^n)_{n∈N} exists then it belongs to [u].

x = lim_{n→∞} x^n ⟺ ∀ε > 0, ∃N_0, ∀n > N_0 : d(x^n, x) < ε

Let ε = 2^{−|u|}; then there exists N_0 such that for all n > N_0 we have d(x^n, x) < 2^{−|u|}, hence x_{[0,|u|−1]} = (x^n)_{[0,|u|−1]} = u, so x ∈ [u].

Example 126 Consider the cylinder [0]_0: we have 0^∞ ∈ [0]_0, 01^∞ ∈ [0]_0, (01)^∞ ∈ [0]_0, and (01)^∞ ∈ [1]_1.

Remark 127 For fixed m and n, the cylinders [u]_m with u ∈ A^n form a partition of the space A^Z.

Proposition 128 The space A^Z is compact.

Proof. Exercise

4.2 Subshifts
Definition 129 A subshift is any closed shift-invariant subset of A^N (resp. A^Z).

A classical way to define a subshift is by its set of forbidden words, which cannot occur in any point of the subshift.
Recall that we say that the word u is a factor of v, denoted u ⊑ v, if there are two words x, y such that xuy = v.

Proposition 130 Let F ⊂ A* be a set of finite words and define

Σ_F = {x ∈ A^N : ∀u ∈ A*, u ⊑ x ⇒ u ∉ F}

If Σ_F is nonempty then it is a subshift.

Proof. 1) Invariance:
Suppose x ∈ Σ_F. For every 0 ≤ i ≤ j we have σ(x)_{[i,j]} = x_{[i+1,j+1]} ∉ F, hence σ(x) ∈ Σ_F and Σ_F is invariant.
2) Σ_F is closed.
Suppose x ∉ Σ_F. Then there are 0 ≤ i ≤ j with x_{[i,j]} ∈ F, and every y in the cylinder [x_{[0,j]}] satisfies y_{[i,j]} = x_{[i,j]} ∈ F, so y ∉ Σ_F. Hence the open set [x_{[0,j]}] ⊂ A^N \ Σ_F, so the complement of Σ_F is open and Σ_F is closed.

Example 131 The subshift Σ_{{01,10}} = {0^∞, 1^∞}.

Example 132 The golden mean subshift Σ_{{11}} ⊂ {0,1}^N.

Example 133 The even subshift Σ_F with F = {01^{2n+1}0 : n ≥ 0}.

Example 134 The set of all binary sequences in which 1's occur infinitely often in each direction, and such that the number of 0's between successive occurrences of a 1 is either 1, 2, or 3. This shift is used in a common data storage method for hard disk drives.

Example 135 For each positive integer c, the charge constrained shift is defined as the set of all points in {−1,+1}^Z such that for every block occurring in the point, the algebraic sum s of the +1's and −1's satisfies −c ≤ s ≤ c. These shifts arise in engineering applications and often go by the name "DC-free sequences."

4.2.1 Subshifts of finite type

Definition 136 A subshift is of finite type (an SFT) if its set of forbidden words is a finite set.
A subshift of finite type is said to be of order p if its set of forbidden words is included in A^p.
A subshift of order 2 is said to be Markovian.

Remark 137 If a word u ∈ A* is forbidden then the words ua are also forbidden for every a ∈ A. So a subshift of finite type always has an order; for example Σ_{{00,111}} = Σ_{{000,001,111}} is an SFT of order 3.

Example 138 The subshift Σ_{{10}} = {0^n 1^∞ : n ≥ 0} ∪ {0^∞} is a subshift of finite type of order 2.

Example 139 Characterize the following subshifts: Σ_{{01,10}}, Σ_{{00,10}}, Σ_{{00,11}}, Σ_{{01,10,11}}.

4.2.2 Subshifts and graphs

A Markov subshift Σ ⊂ A^N can be described by an oriented graph. The vertices of the graph are the letters of A and there is an arrow from a to b if ab ∈ L_2(Σ).
If Σ is a subshift with alphabet L_1(Σ) = A, then for any a ∈ A there exists at least one b ∈ A with ab ∈ L_2(Σ); otherwise no sequence x ∈ A^N containing the letter a would belong to Σ, and Σ would be a subshift over the alphabet A \ {a}.

Definition 140 A transition graph is a pair (A, E), where A is a nonempty finite set and E ⊂ A × A is a set of edges satisfying

∀a ∈ A, ∃b ∈ A, (a, b) ∈ E.

The subshift Σ_E ⊂ A^N of a transition graph (A, E) is defined by:

x ∈ Σ_E iff ∀i ≥ 0, (x_i, x_{i+1}) ∈ E

Example 141

4.2.3 Languages

It is sometimes easier to describe a shift space by specifying which blocks are allowed rather than which are forbidden. This leads naturally to the notion of the language of a shift.

Definition 142 Let X be a subshift and let B_n(X) denote the set of all words of length n that occur in points of X. The language of X is the collection:

B(X) = ∪_{n=0}^∞ B_n(X)

Example 143 The full 2-shift has language {λ, 0, 1, 00, 01, 10, 11, 000, 001, 010, 011, 100, ...}.

Example 144 The golden mean shift has language {λ, 0, 1, 00, 01, 10, 000, 001, 010, 100, 101, 0000, ...}.

The term “language” comes from the theory of automata and formal languages. Think
of the language as the collection of “allowed” blocks in X. For a block u in B(X), we
sometimes use alternative terminology such as saying that u occurs in X or appears in
X or is in X .

Proposition 145 Let X be a subshift and L be its language. If w ∈ L then:
(a) every subblock of w belongs to L;
(b) there are nonempty words u and v in L such that uwv ∈ L.

Proof. If w ∈ L = B(X) then w occurs in some point x of X. But then every subblock of w also occurs in x, so it is in L.
Furthermore, there are clearly nonempty blocks u and v such that uwv occurs in x, so that u, v ∈ L and uwv ∈ L.

Definition 146 A subshift X is irreducible if for every ordered pair of blocks u, v ∈ B(X) there is a w ∈ B(X) such that uwv ∈ B(X).

4.3 Sliding block maps

A map F : A^Z → B^Z is a sliding block map if there exist two integers m, a ≥ 0 and a map f : A^{m+a+1} → B such that:

∀x ∈ A^Z, ∀i ∈ Z : (F(x))_i = f(x_{[i−m,i+a]}).

1. The integers a and m are called the anticipation and the memory of F.
2. The value r = max{m, a} is called the radius of F.

Remark 147 In some books the concepts of memory and anticipation are not used, and one speaks only of the radius.

4.4 Examples

4.4.1 The majority rule ({0,1}^Z, M)

This map is used as a model of opinion change and as an encoder in some cases. The idea is that an individual keeps his opinion (0 or 1) provided he can share it with at least one of his neighbours; the cellular space is successively homogenized so that no isolated 0 or 1 remains.

(M(x))_i = [(x_{i−1} + x_i + x_{i+1})/2]   where [ ] denotes the integer part.

Notice that the cylinders [00]_0 and [11]_0 are invariant, i.e. M([00]_0) ⊂ [00]_0 and M([11]_0) ⊂ [11]_0.

To check this, it is enough to let x ∈ [00]_0 and compute all four possibilities for the pair (x_{−1}, x_2).

4.4.2 The sum rule ({0,1}^Z, S)

The rule is S(x)_i = x_{i−1} + x_{i+1} (mod 2), with rule table:

x_{i−1} x_i x_{i+1} | 000 001 010 011 100 101 110 111
S(x)_i              |  0   1   0   1   1   0   1   0

If we compute (S²(x))_i we have:

position | i−2       | i−1             | i               | i+1             | i+2
x        | x_{i−2}   | x_{i−1}         | x_i             | x_{i+1}         | x_{i+2}
S(x)     | ...       | x_{i−2} + x_i   | x_{i−1}+x_{i+1} | x_i + x_{i+2}   | ...
S²(x)_i = (x_{i−2} + x_i) + (x_i + x_{i+2})

We then obtain S²(x)_i = x_{i−2} + 2x_i + x_{i+2} = x_{i−2} + x_{i+2} (mod 2).
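The identity S²(x)_i = x_{i−2} + x_{i+2} (mod 2) is easy to check numerically. The sketch below (function names are ours) simulates the rule on a finite ring with periodic boundary conditions, where the identity holds exactly at every cell.

```python
import random

def sum_rule(x):
    """One step of S on a finite ring: S(x)_i = x_{i-1} + x_{i+1} mod 2."""
    n = len(x)
    return [(x[(i - 1) % n] + x[(i + 1) % n]) % 2 for i in range(n)]

random.seed(0)
x = [random.randint(0, 1) for _ in range(16)]
s2 = sum_rule(sum_rule(x))
# S^2 depends only on the cells at distance 2:
assert all(s2[i] == (x[(i - 2) % 16] + x[(i + 2) % 16]) % 2 for i in range(16))
```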

Theorem 148 (Hedlund) A map F : A^Z → B^Z is a sliding block map if and only if it is continuous and commutes with the Bernoulli shift: σ ∘ F = F ∘ σ.

A^Z --F--> B^Z
 σ↓         ↓σ
A^Z --F--> B^Z

Proof. Let F be a sliding block map with memory m, anticipation a and radius r = max{m, a}, and let n ∈ N. We have:

d(x, y) < 2^{−(n+r)} ⇒ x_{[−n−r,n+r]} = y_{[−n−r,n+r]}
⇒ x_{[i−m,i+a]} = y_{[i−m,i+a]} for every i ∈ [−n, n], since [i−m, i+a] ⊂ [−n−r, n+r]
⇒ (F(x))_{[−n,n]} = (F(y))_{[−n,n]}
⇒ d(F(x), F(y)) ≤ 2^{−n}
⇒ F is continuous.

For every i ∈ Z:

(F(σ(x)))_i = f(σ(x)_{[i−m,i+a]}) = f(x_{[i−m+1,i+a+1]}) = (F(x))_{i+1} = (σ(F(x)))_i.

Conversely, suppose F is continuous (hence uniformly continuous by the Heine theorem, A^Z being compact) and commutes with the shift.
For ε = 1 there is r > 0 such that:

d(x, y) < 2^{−r} ⇒ d(F(x), F(y)) < 1,
i.e. x_{[−r,r]} = y_{[−r,r]} ⇒ (F(x))_0 = (F(y))_0.

Hence there is f : A^{2r+1} → A such that for all x ∈ A^Z

F(x)_0 = f(x_{[−r,r]}).

As F commutes with the shift we have:

(F(x))_i = (σ^i(F(x)))_0 = (F(σ^i(x)))_0 = f(σ^i(x)_{[−r,r]}) = f(x_{[i−r,i+r]}).

Proposition 149 Let F be a sliding block map on A^Z of radius r. There exists a topologically conjugate sliding block map G over some B^Z with radius 1.

Proof. Exercise.

Definition 150 Let F be a function from X to itself. A point x ∈ X is said to be eventually periodic with preperiod m and period p ∈ N* if F^m(x) is periodic of period p.

Example 151

Proposition 152 Let F be a sliding block map; the set of eventually periodic points of F is dense in A^Z.

Proof. Let x be a p-periodic point for the shift; there exists u with |u| = p and x = u^∞.
Since F commutes with the shift, σ^p(F(x)) = F(σ^p(x)) = F(x), so F(x) is also a periodic point for the shift, of period dividing p; the same holds for every iterate F^k(x).
As the number of points of period dividing p is finite, there exist m > 0 and q > 0 such that F^{m+q}(x) = F^m(x), i.e. x is eventually periodic for F. Since the shift-periodic points are dense in A^Z (Proposition 122), the result follows.

Example 153 Consider the sliding block map ({0, 1, 2}^Z, N_l) of radius 0 defined by the local rule:

n_l(0) = 1, n_l(1) = 2, n_l(2) = 1

Theorem 154 .
1. Every convolutional encoder is a sliding block code.
2. Every convolutional code is an irreducible subshift.

4.5 Computer simulation
In computer simulations of sliding block maps we have to find a way to simulate a sliding block map on a finite list.
This creates the problem of computing at the boundary. To avoid this difficulty we have to choose boundary conditions for the program; below are some of them.

1. Torus or periodic boundary conditions
The neighbour to the right of the last cell is the first cell, and vice versa; in this way we treat the lattice like a torus.

2. Random boundary conditions
The boundary values are chosen randomly.

3. Constant boundary conditions
The boundary values are fixed to some constant.

4. Null boundary conditions
Similar to constant boundary conditions, but the constant is always 0.
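These boundary conventions can be compared on the majority rule of Section 4.4.1. The sketch below (function and parameter names are ours) implements one time step of M on a finite list under each convention.

```python
import random

def step_majority(x, boundary="periodic", const=0):
    """One step of the majority rule on a finite list, with a chosen
    boundary condition supplying the two missing neighbours."""
    n = len(x)

    def cell(i):
        if 0 <= i < n:
            return x[i]
        if boundary == "periodic":    # torus: wrap around
            return x[i % n]
        if boundary == "constant":    # fixed value outside the lattice
            return const
        if boundary == "null":        # constant 0
            return 0
        if boundary == "random":      # fresh random bit at the border
            return random.randint(0, 1)
        raise ValueError(boundary)

    return [(cell(i - 1) + x[i] + cell(i + 1)) // 2 for i in range(n)]
```

An isolated opinion disappears in one step: `step_majority([0, 0, 1, 0, 0])` gives `[0, 0, 0, 0, 0]`, illustrating the homogenization discussed in Section 4.4.1.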

Annex

4.6 The multivariate interpolation problem

In dimension 1 the uniqueness of the interpolation polynomial rests on the fact that a polynomial in one variable of degree p has at most p roots.
This is no longer true for multivariate polynomials; below we give a probabilistic result on the number of roots of a multivariate polynomial in higher dimension.

Theorem 155 (Schwartz, Zippel) Let P ∈ F[x_1, x_2, ..., x_n] be a nonzero polynomial of total degree d ≥ 0 over a field F. Let S be a finite subset of F and let r_1, r_2, ..., r_n be selected at random independently and uniformly from S. Then

Pr[P(r_1, r_2, ..., r_n) = 0] ≤ d/|S|

Equivalently, for any finite subset S of F, if Z(P) is the zero set of P, then

|Z(P) ∩ S^n| ≤ d · |S|^{n−1}

Proof. By induction on n.


For n = 1 a polynomial of degree d has at most d roots.
Assume the theorem holds for all polynomials in n − 1 variables, and write P as a polynomial in x_1:

P(x_1, ..., x_n) = Σ_{i=0}^{d} x_1^i P_i(x_2, ..., x_n)

Since P is not identically 0 there is some i such that P_i is not identically 0. Consider the largest such i; then deg P_i ≤ d − i.
For randomly chosen r_2, ..., r_n from S we have by the induction hypothesis

Pr[P_i(r_2, ..., r_n) = 0] ≤ (d − i)/|S|.

If P_i(r_2, ..., r_n) ≠ 0 then P(x_1, r_2, ..., r_n) is a polynomial in x_1 of degree i, thus not identically zero, so

Pr[P(r_1, ..., r_n) = 0 | P_i(r_2, ..., r_n) ≠ 0] ≤ i/|S|

To make the notation compact, denote the event P(r_1, ..., r_n) = 0 by A and the event P_i(r_2, ..., r_n) = 0 by B. We then have

Pr(A) = Pr(A ∩ B) + Pr(A ∩ B^c)
      = Pr(B) Pr(A|B) + Pr(B^c) Pr(A|B^c)
      ≤ Pr(B) + Pr(A|B^c)
      ≤ (d − i)/|S| + i/|S| = d/|S|
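The second form of the bound can be checked exhaustively on a small example. Below, P(x, y) = x·y·(x − y) has total degree d = 3; over S = {0, ..., 4} the theorem guarantees at most d·|S|^{n−1} = 15 zeros in S², and direct enumeration finds 13. The polynomial and the set are our own illustrative choices.

```python
from itertools import product

P = lambda x, y: x * y * (x - y)    # total degree d = 3
S = range(5)                        # |S| = 5, n = 2 variables

# Count the zeros of P on the grid S x S and compare with the bound.
zeros = sum(1 for x, y in product(S, S) if P(x, y) == 0)
bound = 3 * 5 ** (2 - 1)            # d * |S|^(n-1)
assert zeros == 13 and zeros <= bound
```

The 13 zeros are the pairs with x = 0 (five), y = 0 and x ≠ 0 (four), or x = y ≠ 0 (four).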

4.6.1 Finding a multivariate interpolation polynomial

In general, finding a multivariate interpolation polynomial is achieved by solving a linear system of equations using some direct method such as Gaussian elimination.
Let a_1, ..., a_p be distinct points in F^m. To find a multivariate polynomial M in F[x_1, ..., x_m] of degree ≤ n such that M(a_i) = b_i for all 1 ≤ i ≤ p, we solve the associated linear system.
4.6.2 Multivariate Lagrange method

There is a possible generalization of the Lagrange method to higher dimensions; however, the degree of the interpolation polynomial is not optimal.
Let us consider for example the case of dimension 2.
Let (x_i)_{0≤i≤n} and (y_j)_{0≤j≤m} be interpolation points from a field F.
Consider the two families of Lagrange polynomials:

L_i(x) = Π_{r=0, r≠i}^{n} (x − x_r)/(x_i − x_r)   and   L_j(y) = Π_{s=0, s≠j}^{m} (y − y_s)/(y_j − y_s)

Denote the multivariate Lagrange polynomials by L_{ij}(x, y) = L_i(x) L_j(y), 0 ≤ i ≤ n, 0 ≤ j ≤ m.
We then have:

L_{ij}(x_r, y_s) = 1 if r = i and s = j, and 0 otherwise,   for 0 ≤ r ≤ n, 0 ≤ s ≤ m.

Notice that contrary to dimension 1, the multivariate Lagrange polynomials form a generating set but not a basis of the space of multivariate polynomials.
Let (x_i)_{0≤i≤n} and (y_j)_{0≤j≤m} be interpolation points from a field F, and let (z_{ij})_{0≤i≤n, 0≤j≤m} be a set of values.
The multivariate interpolation polynomial is then given by the formula:

P(x, y) = Σ_{i=0}^{n} Σ_{j=0}^{m} z_{ij} L_{ij}(x, y)   with deg P(x, y) ≤ n + m.
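The formula transcribes directly into code with exact rational arithmetic; the function names below are ours. On the grid used here, the interpolant of f(x, y) = x² + y (degree ≤ 2 in x, ≤ 1 in y) must reproduce f everywhere, which evaluation at an off-grid point confirms.

```python
from fractions import Fraction

def lagrange(pts, i, x):
    """The one-dimensional Lagrange basis polynomial L_i evaluated at x."""
    r = Fraction(1)
    for k, p in enumerate(pts):
        if k != i:
            r *= Fraction(x - p, pts[i] - p)
    return r

def interp2(xs, ys, z, x, y):
    """P(x, y) = sum_ij z[i][j] * L_i(x) * L_j(y)."""
    return sum(z[i][j] * lagrange(xs, i, x) * lagrange(ys, j, y)
               for i in range(len(xs)) for j in range(len(ys)))

f = lambda x, y: x * x + y
xs, ys = [0, 1, 2], [0, 1]
z = [[f(x, y) for y in ys] for x in xs]
assert interp2(xs, ys, z, 3, 5) == 14   # = f(3, 5), exact off the grid
```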

4.7 Hadamard codes
The Hadamard code is an error-correcting code used for error detection and correction when transmitting messages over very noisy or unreliable channels. In 1971, the code was used to transmit photos of Mars back to Earth from the NASA spacecraft Mariner 9.

Definition 156 A Hadamard matrix H of order n is an n × n matrix of 1's and −1's such that H H^t = n I_n (I_n is the n × n identity matrix).

Conjecture 157 (Hadamard conjecture) A Hadamard matrix of order n exists for every n that is 1, 2, or a multiple of 4. (That these are the only possible orders is known.)

Proposition 158 If H is a Hadamard matrix of order n, then

( H  H )
( H −H )

is a Hadamard matrix of order 2n.

Proof. Exercise.
The last proposition allows a recursive construction of Hadamard matrices of order a power of 2: starting from H_1 one can build a Hadamard matrix of any order of the form 2^k.

Example 159

H_1 = (1)

H_2 = ( 1  1 )
      ( 1 −1 )

H_4 = ( 1  1  1  1 )
      ( 1 −1  1 −1 )
      ( 1  1 −1 −1 )
      ( 1 −1 −1  1 )
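The recursive doubling of Proposition 158 is easy to implement and to verify against the definition H Hᵗ = n Iₙ; the function names below are ours.

```python
def sylvester(k):
    """Hadamard matrix of order 2^k via the recursive [[H, H], [H, -H]] step."""
    H = [[1]]
    for _ in range(k):
        H = ([row + row for row in H] +
             [row + [-v for v in row] for row in H])
    return H

def gram(H):
    """Compute H * H^t as a list of lists."""
    n = len(H)
    return [[sum(H[i][t] * H[j][t] for t in range(n)) for j in range(n)]
            for i in range(n)]

H4 = sylvester(2)
n = len(H4)
# By the definition of a Hadamard matrix, the Gram matrix must be n * I_n.
assert gram(H4) == [[n if i == j else 0 for j in range(n)] for i in range(n)]
```

`sylvester(2)` reproduces exactly the matrix H_4 of Example 159.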

Consider the (2n × n) matrix H~ obtained by stacking H on top of −H.
Denote by C_i, 1 ≤ i ≤ 2n, the rows of the matrix H~. We have the following properties:

• Rows C_i and C_j agree in 0 positions if j = i + n.

• Rows C_i and C_j with i ≠ j (and j ≠ i + n) agree in n/2 positions otherwise.
