Lectures Rings and Modules

The document summarizes the key concepts in rings and modules that will be covered in the course. It recaps rings, including definitions and examples. It discusses homomorphisms of rings, quotient rings, ideals, prime and maximal ideals. It also mentions Euclidean domains, principal ideal domains, unique factorization, and modules, including definitions, examples, submodules and quotient modules. It provides examples of algebraic structures that are rings or fields. It concludes with recommended textbooks for the course.

A3 Rings and Modules

Richard Earl

Hilary Term 2018


Syllabus
Recap on rings (not necessarily commutative or with an identity) and examples: Z, fields,
polynomial rings (in more than one variable), matrix rings. Zero-divisors, integral domains.
Units. The characteristic of a ring. Discussion of fields of fractions and their characterization
(proofs non-examinable) [2]
Homomorphisms of rings. Quotient rings, ideals and the first isomorphism theorem and conse-
quences, e.g. Chinese remainder theorem. Relation between ideals in R and R/I. Prime ideals
and maximal ideals, relation to fields and integral domains. Examples of ideals. [3]
Euclidean Domains. Examples. Principal Ideal Domains. EDs are PIDs. Application of
quotients to constructing fields by adjunction of elements; examples to include C ≅ R[x]/⟨x^2 + 1⟩
and some finite fields. Degree of a field extension, the tower law. [2.5]
Unique factorisation for PIDs. Gauss’s Lemma and Eisenstein’s Criterion for irreducibility. [2]
Modules: Definition and examples: vector spaces, abelian groups, vector spaces with an endo-
morphism. Submodules and quotient modules and direct sums. The first isomorphism theorem.
[2]
Row and column operations on matrices over a ring. Equivalence of matrices and canonical
forms of matrices over a Euclidean Domain. [1.5]
Free modules and presentations of finitely generated modules. Structure of finitely generated
modules of a Euclidean domain. [2]
Application to rational canonical form for matrices, and structure of finitely generated Abelian
groups. [1]

Recommended Texts
• M. E. Keating, First Course in Module Theory (Imperial Press, 1998) (Possibly out of
print, but many libraries should have it. Covers much of the course.)

• Joseph Gallian, Contemporary Abstract Algebra (9th edition, CENGAGE 2016) (Excel-
lent text covering material on groups, rings and fields).

• B. Hartley, T. O. Hawkes, Chapman and Hall, Rings, Modules and Linear Algebra. (Pos-
sibly out of print, but many libraries should have it. Relatively concise and covers all the
material in the course).

• Niels Lauritzen, Concrete Abstract Algebra, CUP (2003) (Excellent on groups, rings and
fields, and covers topics in the Number Theory course also. Does not cover material on
modules).

• Michael Artin, Algebra (2nd ed., Pearson, 2010) (Excellent but highly abstract text
covering everything in this course and much more besides).

EXAMPLES OF ALGEBRAIC STRUCTURES

RINGS
Z — the integers under + and ×.
Zn — the integers, modulo n, under + and ×.
H — the quaternions under + and ×.
R[x] — polynomials in x with coefficients in the ring R under + and ×.
R[[x]] — formal power series in x with coefficients in the ring R under + and ×.
R[A] — polynomials in a square matrix A with entries in a CRI R.
R^S = {f : S → R} — the set of maps from a set S to a ring R under pointwise + and ×.
C(R) — the ring of continuous functions f : R → R under pointwise + and ×.
C^1(R) — the ring of continuously differentiable f : R → R under pointwise + and ×.
Z[i] — the Gaussian integers, i.e. the subring of complex numbers a + bi where a, b ∈ Z.
Mn (F ) — the n×n matrices with entries in the field F under matrix addition and multiplication.
End(V ) — endomorphisms (i.e. linear maps V → V ) of a vector space V under + and ◦.
P(X) — the power set of X under ∆ and ∩.
R/I — the quotient (or factor) ring of a ring R by an ideal I of R.
FIELDS
Q, R, C — the rationals/reals/complex numbers under + and ×.
Zp — the integers modulo p (a prime) under + and ×.
Q[i] — {q1 + iq2 : q1 , q2 ∈ Q} under + and ×.
Q[√2] — {q1 + q2 √2 : q1 , q2 ∈ Q}.
F (x) — rational functions in x with coefficients in the field F .
Fq — the finite field with q elements; q = p^n for a prime p and n ≥ 1.

1. A RECAP ON RINGS

The material in this chapter that is covered in A0 Linear Algebra will be only briefly revisited.

Definition 1 A ring (R, +, ×) consists of a set R with + and × being two binary operations
on R

+ : R × R → R, (a, b) → a + b;
× : R × R → R, (a, b) → a × b,

such that (R, +) is an abelian group, that is:

• (A1) associativity: for all a, b, c ∈ R we have a + (b + c) = (a + b) + c;

• (A2) commutativity: for all a, b ∈ R we have a + b = b + a;

• (A3) zero element: there exists 0R ∈ R such that for all a ∈ R we have a+0R = a = 0R +a;

• (A4) inverses: for any a ∈ R there exists −a ∈ R such that a + (−a) = (−a) + a = 0R ;

and further that:

• (M 1) for all a, b, c ∈ R we have a × (b × c) = (a × b) × c;

• (D) for all a, b, c ∈ R we have a × (b + c) = a × b + a × c and (a + b) × c = a × c + b × c.

Notation 2 We will often suppress the × symbol and simply write ab for a × b.

Remark 3 If the operations + and × are clear then we will often refer simply to a ring R.

Definition 4 We say that a ring R is commutative if multiplication is commutative, that
is, if the following axiom holds:

• (M 2) commutativity: for all a, b ∈ R we have ab = ba.

Definition 5 We say that a ring R has a 1 or an identity if there is an element 1R ∈ R such
that

• (M 3) existence of an identity: for all a ∈ R we have a × 1R = a = 1R × a;

• (Z) avoiding collapse: 1R ≠ 0R .

If R has an identity then it is necessarily unique.

Basic rules of algebra following from the ring axioms are:

Proposition 6 Let R be a ring with a 1 and a, b, c ∈ R.
(a) If a + b = a + c then b = c.
(b) −(−a) = a.
(c) a0R = 0R = 0R a.
(d) − (ab) = (−a) b = a (−b) .
(e) (−1R ) a = −a = a (−1R ) .

Proof. Left as exercises.

Notation 7 Most, but not all, of the rings we will study are commutative and also have an
identity; we will use the acronym CRI as shorthand for "commutative ring with an identity".

A ring, then, is a half-way house between a group and a field, in that the second operation ×
is not "fully-developed": it need not be commutative, there need not be an identity, and there
need not be inverses. In a ring, multiplication is permissible but division may well not be.

Example 8 Z, Q, R, C with + and × are all CRIs. In each case 0R = 0 and 1R = 1.

Example 9 Zn — the ring of integers modulo n — is a CRI with standard + and × and 0Zn = 0̄
and 1Zn = 1̄.

Example 10 If R is a ring, then we can consider R[x] the ring of polynomials with coefficients
in R. If R is commutative, then so is R [x] and if R has an identity then so does R[x], namely
the constant polynomial 1R . Explicitly, given two polynomials p, q over R, say

\[ p(x) = \sum_i a_i x^i; \qquad q(x) = \sum_i b_i x^i \]

then the sum p + q and product pq in R[x] are defined by

\[ p(x) + q(x) = \sum_i (a_i + b_i) x^i; \qquad p(x)q(x) = \sum_k \Big( \sum_{i+j=k} a_i b_j \Big) x^k. \]

We can more generally define polynomial rings over several variables, R[x1 , . . . , xn ]. These rings
can be inductively defined by

R[x1 , . . . , xn ] = R[x1 , . . . , xn−1 ][xn ].

Example 11 Note that the rule deg(fg) = deg f + deg g will not apply generally in rings. For
example in Z4 [x] we have

(2̄x + 1̄)(2̄x + 1̄) = 4̄x^2 + 4̄x + 1̄ = 1̄.

Also in Z8 [x] we see that the quadratic x^2 − 1̄ has four roots 1̄, 3̄, 5̄, 7̄ and has two distinct
factorizations
x^2 − 1̄ = (x − 1̄)(x + 1̄) = (x − 3̄)(x − 5̄).
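
The claims in Example 11 can be checked by direct computation. The following is a minimal Python sketch, not part of the course material: the helper poly_mul_mod is a hypothetical name, and polynomials are assumed to be stored as coefficient lists with the constant term first.

```python
def poly_mul_mod(p, q, n):
    """Multiply two polynomials over Z_n (coefficient lists, lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] = (out[i + j] + a * b) % n
    # Drop trailing zero coefficients so that the degree is visible.
    while len(out) > 1 and out[-1] == 0:
        out.pop()
    return out

# (2x + 1)^2 in Z_4[x]: a degree 1 polynomial whose square is a constant.
square = poly_mul_mod([1, 2], [1, 2], 4)

# Roots of x^2 - 1 in Z_8, found by trying every class.
roots = [r for r in range(8) if (r * r - 1) % 8 == 0]

# The two factorizations of x^2 - 1 in Z_8[x].
f1 = poly_mul_mod([-1 % 8, 1], [1, 1], 8)        # (x - 1)(x + 1)
f2 = poly_mul_mod([-3 % 8, 1], [-5 % 8, 1], 8)   # (x - 3)(x - 5)
```

Here square collapses to the constant polynomial, so degrees need not add, and f1 and f2 are the same coefficient list even though the linear factors differ.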

Example 12 Given a ring R and a square matrix A with entries in R, we can consider the
ring R[A] with addition and multiplication defined similarly to those in R[x]. Recall that if

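p(x) = a0 + a1 x + · · · + an x^n then p(A) = a0 I + a1 A + · · · + an A^n .

So, taking the ring R to be the real numbers ℝ, we have

\[ A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \mathbb{R}[A] = \mathbb{R} I_2 \cong \mathbb{R}; \]

\[ B = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad \mathbb{R}[B] = \left\{ \begin{pmatrix} x & y \\ 0 & x \end{pmatrix} : x, y \in \mathbb{R} \right\}; \]

\[ C = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad \mathbb{R}[C] = \left\{ \begin{pmatrix} x & y \\ -y & x \end{pmatrix} : x, y \in \mathbb{R} \right\} \cong \mathbb{C}. \]

The isomorphism with the complex numbers ℂ in the third example is shown in Exercise Sheet 1, #2. Note that
C^2 = −I2 .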

Example 13 Given two rings R and S we can form their direct sum R ⊕ S which has R × S
as its underlying set, and operations

(r1 , s1 ) + (r2 , s2 ) = (r1 +R r2 , s1 +S s2 ), (r1 , s1 ) × (r2 , s2 ) = (r1 ×R r2 , s1 ×S s2 ).

Example 14 C(R), the set of continuous functions from R to R forms a CRI under pointwise
addition and multiplication with 0C(R) and 1C(R) being the constant functions 0 and 1. With the
same operations and identities, C^1(R), the set of continuously differentiable functions from R to R, likewise
forms a CRI.

Example 15 The even integers form a ring 2Z under + and × which is commutative but which
has no identity.

Example 16 The n × n real matrices Mn (R) form a non-commutative ring with an identity
In . We can similarly consider Mn (Q), Mn (C) or Mn (F ) where F is any field or indeed any
ring.

Example 17 If V is a vector space then End(V ), the set of linear maps V → V , forms a ring
under addition and composition. The zero and identity maps are the additive and multiplicative
identities of End(V ).

Example 18 The power set P(X) of a set X, that is, the set of subsets of X, forms a CRI
under symmetric difference ∆ and intersection ∩. That is, for A, B ⊆ X we have

A + B := A ∆ B = (A\B) ∪ (B\A); A × B := A ∩ B.

We further have 0P(X) = ∅ and 1P(X) = X. The additive inverse of A is A itself. Notice that
every element satisfies A^2 = A.
If X = {x1 , . . . , xn } then we can identify P(X) with (Z2 )^n by S ↔ (e1 , . . . , en ) with ei = 1̄ if
and only if xi ∈ S.
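
For a small set X the ring structure on P(X) can be checked exhaustively. The sketch below is illustrative only: the choice X = {1, 2, 3} and the names add, mul and subsets are assumptions made for the example.

```python
from itertools import combinations

X = frozenset({1, 2, 3})
subsets = [frozenset(c) for r in range(len(X) + 1) for c in combinations(sorted(X), r)]

def add(A, B):
    return A ^ B   # symmetric difference

def mul(A, B):
    return A & B   # intersection

zero, one = frozenset(), X

# Every element is its own additive inverse and satisfies A^2 = A.
all_self_inverse = all(add(A, A) == zero for A in subsets)
all_idempotent = all(mul(A, A) == A for A in subsets)

# Distributivity of intersection over symmetric difference.
distributive = all(
    mul(A, add(B, C)) == add(mul(A, B), mul(A, C))
    for A in subsets for B in subsets for C in subsets
)

# The only subset with a multiplicative inverse is X itself.
units = [A for A in subsets if any(mul(A, B) == one for B in subsets)]
```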

Example 19 The following sequence spaces all form commutative rings under coordinate-
wise addition and multiplication:

• l∞ the space of bounded real sequences;

• l1 the space of real sequences which are absolutely summable;

• c the space of convergent real sequences;

• c0 the space of real sequences which converge to 0.

Both l∞ and c have an identity in the constant sequence (1, 1, 1, . . .).

Example 20 The Gaussian integers Z[i] are the complex numbers {a + bi : a, b ∈ Z} and
form a CRI.

Example 21 If R is a ring and S is a set, then we can form a ring R^S = {f : S → R} from
the set of maps from S to R by taking + and × to be pointwise addition and multiplication. R^S
has an identity if and only if R has one, and is commutative if and only if R is.

Example 22 A quaternion is a "four-dimensional" number of the form

q = a + bi + cj + dk

where a, b, c, d are real numbers. The ring of quaternions is denoted H after the mathematician
William Rowan Hamilton (1805-1865) who discovered them in 1843. Two quaternions add
component-wise as one would expect and multiply according to the rules

i^2 = j^2 = k^2 = ijk = −1.

From these rules one can further show that

ij = k = −ji, jk = i = −kj, ki = j = −ik,

so we see that H is non-commutative. The elements

{1, −1, i, −i, j, −j, k, −k}

form a group which is denoted Q8 .
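
The multiplication rules can be checked mechanically. The following is a minimal Python sketch in which a quaternion a + bi + cj + dk is assumed to be stored as the tuple (a, b, c, d); the names qmul and neg are illustrative, and the product formula is the standard expansion obtained from the rules above.

```python
def qmul(p, q):
    """Product of two quaternions stored as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def neg(q):
    return tuple(-x for x in q)

one, i, j, k = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

# The eight elements of Q8 and a closure check under multiplication.
q8 = [one, neg(one), i, neg(i), j, neg(j), k, neg(k)]
closed = all(qmul(p, q) in q8 for p in q8 for q in q8)
```

For instance qmul(i, j) returns k while qmul(j, i) returns neg(k), which exhibits the non-commutativity.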

Definition 23 Let R be a ring. A subset S ⊆ R is a subring of R if the operations + and ×
restrict to make a ring of S. That is, S is a subring of R if:
(a) 0R ∈ S;
(b) whenever r1 , r2 ∈ S then r1 ± r2 ∈ S;
(c) whenever r1 , r2 ∈ S then r1 r2 ∈ S.

Remark 24 Note that the axioms A1, A2, M 1, D automatically apply in S as they apply to all
elements of R, including those of S.

Example 25 (a) Z[x] is a subring of Q[x].
(b) Z[i] is a subring of C.
(c) R[x] is a subring of C(R).
(d) 2Z is a subring of Z.
(e) C = {a + bi : a, b ∈ R} is a subring of H.
(f) Zn is not a subring of Z as it is not a subset of Z.
Example 26 {0R } and R are always subrings of R. Given a subset X of R the subring gen-
erated by X is the smallest subring containing X. It is the intersection of all subrings which
contain X.
Definition 27 In a ring R, we say that a non-zero element a ∈ R is a zero-divisor if there
exist non-zero b, c ∈ R such that ab = 0 = ca. In a commutative ring, if a is a zero-divisor,
then we can assume b = c.
Definition 28 An integral domain R is a CRI which has no zero-divisors.
Definition 29 In a ring R, with an identity 1R , we say that a ∈ R is a unit if there exists
b ∈ R such that ab = 1R = ba. If such b exists then it is unique and we will denote it a^{−1} .
Example 30 (a) Z, Q, R, C are integral domains. In Z the units are only 1 and −1 whilst
every non-zero element is a unit in Q, R, C.
(b) R[x] is an integral domain and the units are the non-zero constant polynomials.
(c) In Mn (R), the non-zero singular matrices are the zero-divisors and the invertible matrices
are the units.
(d) Z[i] is an integral domain and the units are 1, −1, i, −i only.
(e) In P(X), every non-empty proper subset is a zero-divisor and the only unit is X.
(f) In H, there are no zero divisors (but, not being commutative, H is not an integral domain)
and every non-zero element is a unit.
Proposition 31 (Cancellation Law) Let R be an integral domain and a, b, c ∈ R with a ≠ 0.
If ab = ac then b = c.
Proof. As ab = ac then a(b − c) = 0R . As R is an integral domain then a is not a zero-divisor
and so b − c = 0R .
Proposition 32 Let R be a ring with a 1. Then R∗ , the set of units in R, forms a group under
multiplication.
Proof. Suppose that u, v are units in R. Then
(uv)(v^{−1} u^{−1}) = u(v v^{−1})u^{−1} = u u^{−1} = 1;
(v^{−1} u^{−1})(uv) = v^{−1}(u^{−1} u)v = v^{−1} v = 1,
so that uv is a unit. So multiplication is a binary operation on R∗ and is associative by M1.
Further 1R is the identity in R∗ . Also
(u^{−1})u = 1 = u(u^{−1})
so that u^{−1} is a unit and the inverse of u in R∗ .

Example 33 (a) Z∗ = {1, −1}.
(b) (Mn (F ))∗ = GL(n, F ) for any field F.
(c) (F [x])∗ = F ∗ when identified with the non-zero constant polynomials.
(d) (Z8 )∗ = {1̄, 3̄, 5̄, 7̄} and is isomorphic to C2 × C2 .

Proposition 34 Let R be a ring with an identity.
(a) An element cannot be both a zero-divisor and a unit.
(b) If R is finite and commutative then every non-zero element is a zero-divisor or a unit.
(c) In general, a non-zero element may be neither a zero-divisor nor a unit.

Proof. (a) Suppose that a ∈ R and that b, c, d are non-zero and such that ab = 0R = ca, and
ad = 1R = da. Then
b = 1R b = (da)b = d(ab) = d0R = 0R
which is a contradiction.
(b) For x ≠ 0 the map r → xr is either 1-1 or not. If it is 1-1 then it is onto as R is finite,
and hence there exists r such that xr = 1R = rx (as R is commutative) and we have that x is a unit. If the map is not
1-1 then there exist r1 ≠ r2 such that xr1 = xr2 . Hence x(r1 − r2 ) = 0R with r1 − r2 ≠ 0R and we see that x is a
zero-divisor.
(c) In Z we see 2 is neither a unit nor a zero-divisor.
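
Parts (a) and (b) can be illustrated in the finite commutative ring Zn by brute force. This is a minimal sketch rather than a proof: classify is a hypothetical helper, n = 12 is an arbitrary choice, and the gcd criterion for units in Zn is a standard fact used only as a cross-check.

```python
from math import gcd

def classify(n):
    """Split the non-zero classes of Z_n into units and zero-divisors by exhaustive search."""
    units = [a for a in range(1, n) if any(a * b % n == 1 for b in range(1, n))]
    zero_divisors = [a for a in range(1, n) if any(a * b % n == 0 for b in range(1, n))]
    return units, zero_divisors

units, zero_divisors = classify(12)

# Every non-zero class lands in exactly one of the two lists.
partition_ok = (
    sorted(units + zero_divisors) == list(range(1, 12))
    and not set(units) & set(zero_divisors)
)

# Cross-check: a is a unit in Z_n exactly when gcd(a, n) = 1.
gcd_ok = units == [a for a in range(1, 12) if gcd(a, 12) == 1]
```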

In the linear algebra courses, we defined a field to be a triple (R, +, ×) satisfying the field
axioms A1, A2, A3, A4, M 1, M 2, M 3, M4, Z, D. An integral domain satisfies each of these except
(M4): existence of multiplicative inverses. So we can reformulate the field axioms as:

Definition 35 A field is an integral domain in which every non-zero element is a unit.

Theorem 36 A finite integral domain is a field.

Proof. Let R be a finite integral domain and a ∈ R with a ≠ 0. Consider the map ma : R → R
given by r → ar. By the cancellation law, ma is 1-1 and hence is onto as R is finite. In particular,
there is r ∈ R such that ar = 1R . Hence a is a unit and R is a field.

We know that Zp , where p is prime, is a finite field. There are in fact examples of other
finite fields. Below is an example of a finite field with 4 elements.

Example 37 The field of 4 elements can be written as follows. Let F4 = {0, 1, 2, 3} where 0 is
the additive identity and 1 is the multiplicative identity. It must be the case that, as an additive
group, F4 ≅ C2 × C2 , for if F4 ≅ C4 we would find we had zero-divisors. And it must be the case
that (F4 )∗ ≅ C3 as this is the only group of order 3. The addition and multiplication tables for F4
are then

+ | 0 1 2 3          × | 0 1 2 3
0 | 0 1 2 3          0 | 0 0 0 0
1 | 1 0 3 2          1 | 0 1 2 3
2 | 2 3 0 1          2 | 0 2 3 1
3 | 3 2 1 0          3 | 0 3 1 2
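
That these two tables really do define a field can be confirmed by checking the axioms over all elements. The following is a minimal Python sketch; the names ADD, MUL and is_field are illustrative, and the tables are copied from Example 37.

```python
from itertools import product

# Row a, column b holds a + b (respectively a × b) for the field of 4 elements.
ADD = [[0, 1, 2, 3], [1, 0, 3, 2], [2, 3, 0, 1], [3, 2, 1, 0]]
MUL = [[0, 0, 0, 0], [0, 1, 2, 3], [0, 2, 3, 1], [0, 3, 1, 2]]
E = range(4)

def add(a, b):
    return ADD[a][b]

def mul(a, b):
    return MUL[a][b]

def is_field():
    """Check the field axioms A1-A4, M1-M4, Z, D exhaustively on the tables."""
    triples = list(product(E, repeat=3))
    associative = all(
        add(add(a, b), c) == add(a, add(b, c)) and mul(mul(a, b), c) == mul(a, mul(b, c))
        for a, b, c in triples
    )
    commutative = all(add(a, b) == add(b, a) and mul(a, b) == mul(b, a) for a in E for b in E)
    distributive = all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c)) for a, b, c in triples)
    identities = all(add(a, 0) == a and mul(a, 1) == a for a in E)
    negatives = all(any(add(a, b) == 0 for b in E) for a in E)
    inverses = all(any(mul(a, b) == 1 for b in E) for a in E if a != 0)
    return all([associative, commutative, distributive, identities, negatives, inverses])

# By contrast, if 1 had additive order 4 then 1 + 1 would be a zero-divisor, as in Z_4.
z4_has_zero_divisor = (2 * 2) % 4 == 0
```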

Definition 38 Given any ring R we define its characteristic charR as follows:
• If such a positive integer exists, we define the charR to be the smallest positive integer n
such that nr = 0 for all r ∈ R.
• If no such n exists, then we define charR = 0.
In the event that R has an identity, then charR is the additive order of 1R when this is finite,
and 0 when that order is infinite.
Example 39 (a) charZn = n.
(b) charZ = charQ = charR = charC = 0.
(c) F4 has characteristic 2.
(d) For the ring R of 4 elements, with (R, +) ≅ C4 and multiplication defined by ab = 0 for
all a, b, we have charR = 4.
Proposition 40 If R is an integral domain then charR is prime or zero.
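Proof. If the additive order n of 1R is finite then we have

\[ \underbrace{1_R + \cdots + 1_R}_{n \text{ times}} = 0_R, \]

and if n = uv, where neither u nor v were 1, we’d have

\[ \Big( \underbrace{1_R + \cdots + 1_R}_{u \text{ times}} \Big) \times \Big( \underbrace{1_R + \cdots + 1_R}_{v \text{ times}} \Big) = 0_R. \]

As R is an ID then we’d have either that

\[ \underbrace{1_R + \cdots + 1_R}_{u \text{ times}} = 0_R \quad \text{or} \quad \underbrace{1_R + \cdots + 1_R}_{v \text{ times}} = 0_R, \]

both of which would contradict the minimality of n.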


Proposition 41 Let F be a field.
(a) If charF = p, a prime, then there is a smallest subfield of F isomorphic to Zp .
(b) If charF = 0, then there is a smallest subfield of F isomorphic to Q.
This subfield is called the prime subfield.
Proof. In both cases, as 0F and 1F both need to be in any subfield, then the smallest subfield
of F is the one generated by 1F . In the case that charF = p then the map
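\[ \varphi : \mathbb{Z}_p \to F \quad \text{given by} \quad \varphi(\bar{k}) = \underbrace{1_F + \cdots + 1_F}_{k \text{ times}} \]

is well-defined and an isomorphism onto its image. In the case that charF = 0 then the map

\[ \varphi : \mathbb{Q} \to F \quad \text{given by} \quad \varphi\left( \frac{m}{n} \right) = \Big( \underbrace{1_F + \cdots + 1_F}_{m \text{ times}} \Big) \Big( \underbrace{1_F + \cdots + 1_F}_{n \text{ times}} \Big)^{-1} \]

is well-defined and an isomorphism onto its image.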

Proposition 42 Let R be an integral domain. Then we can create a field of fractions F for
R as follows. We define an equivalence relation on R × (R \ {0R }), the pairs with non-zero second entry, by

(m, n) ∼ (M, N ) ⇐⇒ mN = M n,

and define F = (R × (R \ {0R })) / ∼ to be the set of ∼-equivalence classes with the operations

[(m1 , n1 )] + [(m2 , n2 )] = [(m1 n2 + m2 n1 , n1 n2 )] , [(m1 , n1 )] × [(m2 , n2 )] = [(m1 m2 , n1 n2 )] .

Then
(a) F is a well-defined field with 0F = [(0R , 1R )] and 1F = [(1R , 1R )].
(b) F contains a copy of R as an isomorphic subring, namely the elements R ≡ {[(r, 1R )] :
r ∈ R}.
(c) F is uniquely characterized by the "universal property" that any 1-1 homomorphism
φ : R → K to a field K can be uniquely extended to a 1-1 homomorphism φ̃ : F → K.

Remark 43 The above formulae may be a little less surprising when we realize in Q that

\[ \frac{m_1}{n_1} + \frac{m_2}{n_2} = \frac{m_1 n_2 + m_2 n_1}{n_1 n_2} \quad \text{and} \quad \frac{m_1}{n_1} \times \frac{m_2}{n_2} = \frac{m_1 m_2}{n_1 n_2}. \]

We can also see for non-zero n1 , n2 that

\[ m_1 n_2 = m_2 n_1 \iff \frac{m_1}{n_1} = \frac{m_2}{n_2}. \]

Thus the above construction would yield Q as the field of fractions of Z and would identify m/n
with the equivalence class [(m, n)].
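
The construction can be tried out for R = Z. The sketch below is illustrative only: pairs (m, n) are assumed to have non-zero second entry, the names equiv, add and mul are hypothetical, and Python's Fraction type is used purely as an independent check.

```python
from fractions import Fraction

def equiv(p, q):
    """(m, n) ~ (M, N) if and only if mN = Mn."""
    (m, n), (M, N) = p, q
    return m * N == M * n

def add(p, q):
    (m1, n1), (m2, n2) = p, q
    return (m1 * n2 + m2 * n1, n1 * n2)

def mul(p, q):
    (m1, n1), (m2, n2) = p, q
    return (m1 * m2, n1 * n2)

# Two representatives of the same class, and a second class.
a, a_alt = (1, 2), (3, 6)
b = (2, 3)

# Well-definedness on this sample: equivalent inputs give equivalent outputs.
sum_ok = equiv(add(a, b), add(a_alt, b))
product_ok = equiv(mul(a, b), mul(a_alt, b))

# Agreement with ordinary rational arithmetic.
matches_q = Fraction(*add(a, b)) == Fraction(1, 2) + Fraction(2, 3)

# If a zero second entry were allowed, transitivity would fail:
# (1, 2) ~ (0, 0) and (0, 0) ~ (1, 3), but (1, 2) is not related to (1, 3).
transitivity_fails = equiv((1, 2), (0, 0)) and equiv((0, 0), (1, 3)) and not equiv((1, 2), (1, 3))
```

The last check shows why the second entry of each pair has to be non-zero for ∼ to be an equivalence relation.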

Proof. (Non-examinable) We will not prove the details here, but rather just give a list of all
the facts that need checking, none of which are particularly difficult.
(i) ∼ is an equivalence relation.
(ii) + and × are well-defined binary operations on the set of equivalence classes.
(iii) the field axioms hold amongst the equivalence classes with 0F = [(0R , 1R )] and 1F =
[(1R , 1R )].
(iv) r → [(r, 1R )] is a 1-1 homomorphism from R to F.
(v) defining φ̃ ([(m, n)]) = φ(m)φ(n)^{−1} gives a well-defined 1-1 homomorphism from F to K
which extends φ.
(vi) For any other field F̃ with this universal property, extending inclusion ι : R → F yields
an isomorphism ι̃ : F̃ → F.

Example 44 (a) The field of fractions of Z is Q.
(b) The field of fractions of Z[x] is Q(x).
(c) The field of fractions of Z[√2] is Q[√2].
(d) The field of fractions of Z[π] is Q(π).

2. THE ISOMORPHISM THEOREMS

Definition 45 Let R be a ring. A non-empty subset I ⊆ R is said to be an ideal of R if

whenever i1 , i2 ∈ I then i1 ± i2 ∈ I; whenever i ∈ I, r ∈ R then ri ∈ I and ir ∈ I.

This is then written I ⊳R. In particular, ideals are subrings though the converse is not generally
true. [N.B. any text which insists that rings have 1s will then have a convention that ideals are
not generally subrings.]

Definition 46 Let R be a ring and a ∈ R. Then the principal ideal ⟨a⟩ generated by a is the
smallest ideal to contain a. So

\[ \langle a \rangle = \Big\{ \sum_i r_i a s_i : r_i, s_i \in R \Big\}. \]

⟨a⟩ is also commonly written as (a). An ideal I of R is said to be principal if there exists
a ∈ R such that I = ⟨a⟩.

Example 47 {0R } and R are always ideals of R.

Example 48 Let R = C(R) and S = R[x]. Then S is a subring of R but is not an ideal of R,
as x ∈ S, e^x ∈ R but x e^x ∉ S.

Example 49 Let R = C(R) and I = {f ∈ R : f (0) = 0} . Then I is an ideal of R.

Example 50 Let R be a ring with a 1R and a1 , . . . , ak ∈ R. Then the ideal generated by
{a1 , . . . , ak },
I = ⟨a1 , . . . , ak ⟩ = {r1 a1 s1 + · · · + rk ak sk : ri , si ∈ R},
is the smallest ideal of R containing a1 , . . . , ak .

Example 51 Let R = Z[x] and I = ⟨2, x⟩. Then I is an example of an ideal which is not
principal. Note

I = ⟨2, x⟩ = {p(x) ∈ Z[x] : p(x) has an even constant term}.

If I were principal then, for some f (x), we would have I = ⟨f (x)⟩ = f (x)Z[x] and then f (x)
would divide both 2 and x. The only such polynomials are f (x) = ±1 but ⟨±1⟩ = Z[x] ≠ I.

Proposition 52 The ideals of Z are nZ = ⟨n⟩ where n ∈ Z. So every ideal of Z is principal.

Proof. For any n ∈ Z, we have nZ = ⟨n⟩ ⊳ Z. Conversely, suppose that I ⊳ Z. Then, in
particular, I is a subgroup of Z and so I = nZ for some n ∈ Z.



Proposition 53 Let R be a CRI. Then R is a field if and only if the only ideals of R are {0R }
and R.

Proof. Suppose that R is a field and I ⊳ R. If I ≠ {0R } then there exists non-zero a ∈ I. As
R is a field, then a is a unit and so there exists b ∈ R such that ba = 1R . Then, for any r ∈ R,

(rb) a = r ∈ I

and we see I = R. Conversely, suppose that the only ideals are {0R } and R and that a ≠ 0.
Then ⟨a⟩ ≠ {0R }, so 1R ∈ R = ⟨a⟩ and there exists r ∈ R such that ar = 1R . So a is a unit and R is a field.

Remark 54 As with the case of normal subgroups for groups, ideals are the natural subsets a
ring might be "modulo"-ed by. If we are to introduce new rules into the algebra of a ring R and
set i, j ∼ 0R (where ∼ is an equivalence relation) then for reasonable algebra in R/ ∼, we must
expect to have

i ± j ∼ 0R and ri ∼ 0R and ir ∼ 0R for any r ∈ R.

Theorem 55 Let R be a ring and let I ⊳ R. The coset of r ∈ R is

r + I = {r + i : i ∈ I}.

The operations

(r1 + I) ⊕ (r2 + I) = (r1 + r2 ) + I;


(r1 + I) ⊗ (r2 + I) = (r1 r2 ) + I;

lead to well-defined binary operations on the set R/I of cosets of I with (R/I, ⊕, ⊗) being known
as the quotient ring. If R is commutative then R/I is commutative, and if R has an identity
then so does R/I with 1R/I = 1R + I.

Notation 56 The notation r + I will become cumbersome and so we will write r̄ = r + I. This
very much ties in with the notion that R/I is "R mod I" and that r + I contains all those s
such that r ≡ s mod I.

Proof. Suppose that I ⊳ R. Note for r1 , r2 ∈ R that

\overline{r_1} = \overline{r_2} ⇐⇒ r1 − r2 ∈ I.

So suppose that \overline{r_1} = \overline{r_2} and \overline{s_1} = \overline{s_2} . Then

(r1 + s1 ) − (r2 + s2 ) = (r1 − r2 ) + (s1 − s2 ) ∈ I

showing that
\overline{r_1} ⊕ \overline{s_1} = \overline{r_1 + s_1} = \overline{r_2 + s_2} = \overline{r_2} ⊕ \overline{s_2} .
That is ⊕ is well-defined. Similarly

r1 s1 − r2 s2 = r1 (s1 − s2 ) + (r1 − r2 ) s2 ∈ I

showing that
\overline{r_1} ⊗ \overline{s_1} = \overline{r_1 s_1} = \overline{r_2 s_2} = \overline{r_2} ⊗ \overline{s_2}
and that ⊗ is well-defined on the set of cosets R/I.
(R/I, ⊕, ⊗) meets the axioms of a ring because these properties are inherited from the fact
that (R, +, ×) is a ring. R/I has a 1 and/or is commutative as and when R has a 1 and/or is
commutative. In order to have 0 ≠ 1 in R/I then I needs to be a proper ideal.

(A1) : (r̄ ⊕ s̄) ⊕ t̄ = \overline{(r + s) + t} = \overline{r + (s + t)} = r̄ ⊕ (s̄ ⊕ t̄) .
(A2) : r̄ ⊕ s̄ = \overline{r + s} = \overline{s + r} = s̄ ⊕ r̄.
(A3) : 0R/I = \overline{0_R} = I.
(A4) : −r̄ = \overline{−r}.
(M 1) : (r̄ ⊗ s̄) ⊗ t̄ = \overline{(rs)t} = \overline{r(st)} = r̄ ⊗ (s̄ ⊗ t̄) .
(D) : r̄ ⊗ (s̄ ⊕ t̄) = \overline{r(s + t)} = \overline{rs + rt} = (r̄ ⊗ s̄) ⊕ (r̄ ⊗ t̄).
(M 2) : r̄ ⊗ s̄ = \overline{rs} = \overline{sr} = s̄ ⊗ r̄.
(M 3) : 1R/I = \overline{1_R} = 1R + I.

Example 57 Let R = Z, I = ⟨n⟩ = nZ. As a ring, we can naturally identify Z/nZ with Zn .

Example 58 Let R = Q[x] and I = ⟨x^2⟩. Then R/I is not an integral domain as x̄ is a
zero-divisor (because x̄ ≠ 0̄ and x̄x̄ = 0̄). To better understand the ring R/I note, by the division
algorithm, that every polynomial p(x) can uniquely be written

p(x) = a + bx + x^2 f (x) for some a, b ∈ Q, f (x) ∈ Q[x].

So every coset in R/I can uniquely be represented as a + bx̄ where a, b ∈ Q. These cosets then
add and multiply as

(a + bx̄) + (c + dx̄) = (a + c) + (b + d)x̄;
(a + bx̄) (c + dx̄) = ac + (bc + ad)x̄,

as x̄^2 = 0̄. From this we can see that the only zero-divisors in R/I are of the form bx̄ where
b ≠ 0.

Example 59 Write down the addition and multiplication tables of R1 and R2 where

R1 = Z2 [x]/⟨x^2⟩, R2 = Z2 [x]/⟨x^2 + 1⟩.

Show that the rings are isomorphic.

Solution. By the division algorithm, there are four cosets of ⟨x^2⟩ in Z2 [x] and of ⟨x^2 + 1⟩ in
Z2 [x], and we can represent these cosets in both rings with the elements

{0̄, 1̄, x̄, x̄ + 1̄}.

THE ISOMORPHISM THEOREMS 13


The addition and mutliplication tables of R1 are then

+ 0̄ 1̄ x̄ x̄ + 1̄ × 0̄ 1̄ x̄ x̄ + 1̄
0̄ 0̄ 1̄ x̄ x̄ + 1̄ 0̄ 0̄ 0̄ 0̄ 0̄
1̄ 1̄ 0̄ x̄ + 1̄ x̄ , 1̄ 0̄ 1̄ x̄ x̄ + 1̄ ,
x̄ x̄ x̄ + 1̄ 0̄ 1̄ x̄ 0̄ x̄ 0̄ x̄
x̄ + 1̄ x̄ + 1̄ x̄ 1̄ 0̄ x̄ + 1̄ 0̄ x̄ + 1̄ x̄ 1̄

and those for R2 are

+ 0̄ 1̄ x̄ x̄ + 1̄ × 0̄ 1̄ x̄ x̄ + 1̄
0̄ 0̄ 1̄ x̄ x̄ + 1̄ 0̄ 0̄ 0̄ 0̄ 0̄
1̄ 1̄ 0̄ x̄ + 1̄ x̄ , 1̄ 0̄ 1̄ x̄ x̄ + 1̄ .
x̄ x̄ x̄ + 1̄ 0̄ 1̄ x̄ 0̄ x̄ 1̄ x̄ + 1̄
x̄ + 1̄ x̄ + 1̄ x̄ 1̄ 0̄ x̄ + 1̄ 0̄ x̄ + 1̄ x̄ + 1̄ 0̄

We can see that the first set of tables match the second set of tables when we swap every x̄
with x̄ + 1̄ and vice-versa. This is perhaps less of a surprise when we note

x^2 + 1 = (x + 1)^2 in Z2 [x].
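
The swap of x̄ and x̄ + 1̄ can be verified to be a ring isomorphism by checking every pair of elements. This is a minimal Python sketch under the assumption that the coset a + bx̄ is stored as the pair (a, b) with a, b in {0, 1}; the names make_mul, add and swap are illustrative.

```python
from itertools import product

def make_mul(x_squared):
    """Multiplication on cosets a + b*xbar, stored as (a, b), where xbar^2 equals x_squared."""
    def mul(p, q):
        (a, b), (c, d) = p, q
        # (a + bx)(c + dx) = ac + (ad + bc)x + bd x^2, then replace x^2 by x_squared.
        return ((a * c + b * d * x_squared) % 2, (a * d + b * c) % 2)
    return mul

def add(p, q):
    return ((p[0] + q[0]) % 2, (p[1] + q[1]) % 2)

mul1 = make_mul(0)   # R1: xbar^2 = 0
mul2 = make_mul(1)   # R2: xbar^2 = -1 = 1 in Z_2

elements = list(product(range(2), repeat=2))

def swap(p):
    """Candidate isomorphism R1 -> R2 sending a + b*xbar to a + b*(xbar + 1)."""
    a, b = p
    return ((a + b) % 2, b)

is_isomorphism = (
    sorted(map(swap, elements)) == sorted(elements)
    and all(swap(add(p, q)) == add(swap(p), swap(q)) for p in elements for q in elements)
    and all(swap(mul1(p, q)) == mul2(swap(p), swap(q)) for p in elements for q in elements)
)
```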

Definition 60 Let R and S be rings. A (ring) homomorphism φ : R → S is a map satisfying

φ(r1 +R r2 ) = φ(r1 ) +S φ(r2 ), φ(r1 ∗R r2 ) = φ(r1 ) ∗S φ(r2 ) for all r1 , r2 ∈ R.

If further φ is bijective then φ is called a (ring) isomorphism.

Proposition 61 Let R and S be rings and φ : R → S a homomorphism. Let r ∈ R and n be
an integer. Then:
(a) φ(0R ) = 0S .
(b) if S is an integral domain and R has an identity, then either φ ≡ 0 or φ(1R ) = 1S .
(c) φ(nr) = nφ(r).
(d) if n ≥ 1 then φ(r^n ) = φ(r)^n .

Proof. Left as exercises.

Example 62 Let R be a CRI and a ∈ R. Then the map φ : R [x] → R given by φ(p(x)) = p(a)
is a homomorphism because, by the definition of + and × in R[x] and the commutativity of R, we have for polynomials p, q

(p + q)(a) = p(a) + q(a), (pq)(a) = p(a)q(a).

Example 63 In a similar vein, let A be an n × n real matrix. Then the map φ : R[x] → R [A] ,
defined by p(x) → p(A), is a homomorphism.



Example 64 Let n ≥ 2 be an integer. Then φ : Z → Zn given by x → x̄ is a ring homomorphism as
\overline{x + y} = x̄ + ȳ, \overline{xy} = x̄ȳ.
More generally if I ⊳ R then the quotient map π : R → R/I given by π(r) = r̄ is a homomorphism.

Example 65 Let n ≥ 2. The only ring homomorphism φ: Zn → Z is the zero map because
0 = φ (0̄) = φ(n̄) = nφ(1̄) and so φ(1̄) = 0 and φ ≡ 0.

Example 66 The only ring homomorphisms φ : Z → Z are x → 0 for all x and x → x.

Example 67 Find all the ring homomorphisms φ : Q → Q.

Solution. As φ(1)^2 = φ(1) we have φ(1) = 0 or φ(1) = 1. If φ(1) = 0 then φ(x) = φ(x)φ(1) = 0
for all x. If φ(1) = 1 then φ(m) = m for any integer m. Likewise, for n > 0, we have
nφ(m/n) = φ(m) = m and so φ(m/n) = m/n. Hence the only ring homomorphisms are

φ(x) = 0 and φ(x) = x.

Example 68 A ring homomorphism φ : Zn → Zn is of the form φ(x̄) = āx̄ where ā^2 = ā.
Hence the ring homomorphisms φ: Z6 → Z6 are

φ(x̄) = 0̄, φ(x̄) = x̄, φ(x̄) = 3̄x̄, φ(x̄) = 4̄x̄,

and the only ring homomorphisms φ : Zp → Zp (where p is a prime) are

φ(x̄) = 0̄, φ(x̄) = x̄.

Solution. Certainly any homomorphism φ : Zn → Zn is of the form φ(x̄) = āx̄ where ā = φ(1̄).
For such maps we immediately have

φ(\overline{x + y}) = φ(x̄) + φ(ȳ).

Further we need
ā = φ(1̄) = φ(1̄)φ(1̄) = ā^2 .
Finally if ā^2 = ā then we have

φ(x̄ȳ) = āx̄ȳ = ā^2 x̄ȳ = (āx̄) (āȳ) = φ(x̄)φ(ȳ).

We can check that 0̄, 1̄, 3̄, 4̄ are the only roots of x (x − 1) = 0 in Z6 . In Zp , which is a field
when p is prime, there are no zero-divisors and so 0̄ and 1̄ are the only roots of this quadratic.
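
The idempotent criterion is easy to test by exhaustive search. The following is a minimal Python sketch; the helper names idempotents and is_ring_hom are illustrative and the classes of Zn are represented by the integers 0, ..., n − 1.

```python
def idempotents(n):
    """Classes a in Z_n with a^2 = a, i.e. roots of x(x - 1) in Z_n."""
    return [a for a in range(n) if (a * a - a) % n == 0]

def is_ring_hom(a, n):
    """Brute-force check that x -> a*x respects both + and × on Z_n."""
    def phi(x):
        return a * x % n
    return all(
        phi((x + y) % n) == (phi(x) + phi(y)) % n
        and phi(x * y % n) == phi(x) * phi(y) % n
        for x in range(n) for y in range(n)
    )

# Multipliers a for which x -> a*x is a ring homomorphism Z_6 -> Z_6.
homs_6 = [a for a in range(6) if is_ring_hom(a, 6)]
```

The list homs_6 agrees with idempotents(6), and for a prime modulus such as 7 only the classes of 0 and 1 survive.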

Example 69 Show that R is not (ring) isomorphic to C.



Solution. Suppose that φ : C → R is a ring isomorphism. Then φ(1) = 1 and so

φ(i)^2 = φ(i^2 ) = φ(−1) = −1

but there is no element x ∈ R which satisfies x^2 = −1.

Definition 70 Let φ : R → S be a homomorphism between rings. Then the kernel of φ,
written ker φ, is
ker φ = {r ∈ R : φ(r) = 0S } ⊆ R.
The image of φ, written Im φ, is

Im φ = {φ(r) : r ∈ R} ⊆ S.

Proposition 71 Let φ : R → S be a homomorphism between rings. Then:


(a) ker φ is an ideal of R.
(b) Im φ is a subring of S.

Proof. (a) Suppose that i1 , i2 ∈ ker φ and r ∈ R. Then

φ(i1 ± i2 ) = φ(i1 ) ± φ(i2 ) = 0S ± 0S = 0S ;


φ(ri1 ) = φ(r)φ(i1 ) = φ(r)0S = 0S ,

showing that i1 ± i2 ∈ ker φ and that ri1 ∈ ker φ. Similarly i1 r ∈ ker φ.


(b) Suppose that s1 , s2 ∈ Im φ. Then there exist ri ∈ R such that φ(ri ) = si and so

s1 ± s2 = φ(r1 ) ± φ(r2 ) = φ(r1 ± r2 ) ∈ Im φ;


s1 s2 = φ(r1 )φ(r2 ) = φ(r1 r2 ) ∈ Im φ.

Example 72 Let R be a CRI and a ∈ R. The kernel and image of the map φ : R [x] → R
given by φ(p(x)) = p(a) are

ker φ = ⟨x − a⟩; Im φ = R.

Note for the former that we cannot presume the division algorithm applies (which would be the
case if R were a field) but we can work with identities such as

x^n = (x^{n−1} + a x^{n−2} + · · · + a^{n−2} x + a^{n−1})(x − a) + a^n .

Example 73 The kernel and image of the map φ : Z → Zn given by x → x̄ are

ker φ = nZ; Im φ = Zn .

More generally π : R → R/I given by π(r) = r̄ has kernel I and image R/I.



Example 74 The kernel and image of the map φ : R[x] → C given by p(x) → p(i) are

ker φ = ⟨x^2 + 1⟩; Im φ = C.

This is because any real polynomial which has i as a root also has −i as a root (conjugate pairs).

Example 75 The kernels and images of the ring homomorphisms φ : Z6 → Z6 are

φ(x̄)  | 0̄    | x̄    | 3̄x̄        | 4̄x̄
ker φ | Z6   | {0̄}  | {0̄, 2̄, 4̄} | {0̄, 3̄}
Im φ  | {0̄}  | Z6   | {0̄, 3̄}    | {0̄, 2̄, 4̄}
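
The table can be reproduced by listing kernels and images directly. This is a minimal Python sketch; kernel_and_image is a hypothetical helper and each homomorphism is assumed to be given by its multiplier ā, with classes represented by 0, ..., 5.

```python
def kernel_and_image(a, n):
    """Kernel and image of the map x -> a*x on Z_n, as sorted lists of representatives."""
    kernel = [x for x in range(n) if a * x % n == 0]
    image = sorted({a * x % n for x in range(n)})
    return kernel, image

# One entry per ring homomorphism Z_6 -> Z_6, keyed by the multiplier.
table = {a: kernel_and_image(a, 6) for a in (0, 1, 3, 4)}

# In each case the sizes of the kernel and of the image multiply to 6.
sizes_ok = all(len(ker) * len(im) == 6 for ker, im in table.values())
```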

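Example 76 Let

\[ X = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \quad \text{and} \quad Y = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \quad \text{and} \quad Z = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \]

Note that X^2 = 0_2 , Y^2 = I2 , Z^2 = −I2 . The homomorphisms

α : R[x] → R [X] given by α(p(x)) = p(X);
β : R[x] → R [Y ] given by β(p(x)) = p(Y );
γ : R[x] → R [Z] given by γ(p(x)) = p(Z),

can be shown to have kernels

ker α = ⟨x^2⟩, ker β = ⟨x^2 − 1⟩, ker γ = ⟨x^2 + 1⟩.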
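
As a partial check, one can confirm that each stated generator is sent to the zero matrix while x itself is not. The sketch below uses plain nested lists for 2 × 2 matrices; matmul and evaluate are illustrative names, and the check shows only that the generators lie in the kernels, not that they generate them.

```python
I2 = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]
X = [[0, 1], [0, 0]]
Y = [[0, 1], [1, 0]]
Z = [[0, 1], [-1, 0]]

def matmul(A, B):
    """Product of two 2 x 2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def evaluate(coeffs, M):
    """Evaluate p(M) by Horner's rule; coeffs are listed lowest degree first."""
    result = [[0, 0], [0, 0]]
    for c in reversed(coeffs):
        result = matmul(result, M)
        result = [[result[i][j] + c * I2[i][j] for j in range(2)] for i in range(2)]
    return result

x_squared_at_X = evaluate([0, 0, 1], X)     # x^2 at X
x_sq_minus_1_at_Y = evaluate([-1, 0, 1], Y) # x^2 - 1 at Y
x_sq_plus_1_at_Z = evaluate([1, 0, 1], Z)   # x^2 + 1 at Z
```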

As with the case of groups we then have an equivalent version of the isomorphism theorem.

Theorem 77 (First Isomorphism Theorem for Rings) Let φ : R → S be a homomorphism
between CRIs. Then:
(a) ker φ ⊳ R.
(b) Im φ ≤ S.
(c) The map
φ̄ : R/ ker φ → Im φ given by φ̄(r + ker φ) = φ(r)
is a (ring) isomorphism.

Proof. (a) and (b) were proven earlier. To show (c) we note (in a similar fashion to groups)
that

φ̄ (r1 + ker φ) = φ̄ (r2 + ker φ) ⇐⇒ φ(r1 ) = φ(r2 )


⇐⇒ φ(r1 − r2 ) = 0S
⇐⇒ r1 − r2 ∈ ker φ
⇐⇒ r1 + ker φ = r2 + ker φ.



The above implications right-to-left show that φ̄ is well-defined and those left-to-right that φ̄
is injective. We also note φ̄ clearly maps onto Im φ. Finally we note

φ̄ ((r1 + ker φ) + (r2 + ker φ)) = φ̄ ((r1 + r2 ) + ker φ)


= φ (r1 + r2 )
= φ(r1 ) + φ(r2 )
= φ̄ (r1 + ker φ) + φ̄ (r2 + ker φ)

and

φ̄ ((r1 + ker φ) (r2 + ker φ)) = φ̄ ((r1 r2 ) + ker φ)


= φ (r1 r2 )
= φ(r1 )φ(r2 )
= φ̄ (r1 + ker φ) φ̄ (r2 + ker φ) .

If we apply the isomorphism theorem to the following homomorphisms we find:

Example 78 Let a ∈ R and φ : R[x] → R where p(x) ↦ p(a). The isomorphism theorem then says that
R[x]/⟨x − a⟩ ≅ R.

Example 79 The isomorphism theorem applied to the map φ : Z → Zn where x ↦ x̄ shows that
Z/nZ ≅ Zn.

Example 80 The isomorphism theorem applied to the map φ : R[x] → C where p(x) ↦ p(i) shows that
R[x]/⟨x² + 1⟩ ≅ C.

Example 81 The isomorphism theorem applied to the map φ : Q[x] → Q[√2] where p(x) ↦ p(√2) shows that
Q[x]/⟨x² − 2⟩ ≅ Q[√2].

Example 82 The isomorphism theorem applied to the map φ : Z[x] → Q where p(x) ↦ p(1/2) shows that
Z[x]/⟨2x − 1⟩ ≅ {m/2ᵏ : m ∈ Z, k ≥ 0} = Z[1/2].

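Example 83 Let

\[ X = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad Y = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \quad \text{and} \quad Z = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \]

(as in Example 76) so that

\[ R[X] = \{aI + bX : a, b \in R\} = \left\{ \begin{pmatrix} a & b \\ 0 & a \end{pmatrix} : a, b \in R \right\}, \]
\[ R[Y] = \{aI + bY : a, b \in R\} = \left\{ \begin{pmatrix} a & b \\ b & a \end{pmatrix} : a, b \in R \right\}, \]
\[ R[Z] = \{aI + bZ : a, b \in R\} = \left\{ \begin{pmatrix} a & b \\ -b & a \end{pmatrix} : a, b \in R \right\}. \]

The isomorphism theorem applied to the maps

α(p(x)) = p(X), β(p(x)) = p(Y), γ(p(x)) = p(Z),

shows that
\[ \left\{ \begin{pmatrix} a & b \\ 0 & a \end{pmatrix} : a, b \in R \right\} \cong R[x]/\langle x^2 \rangle, \]
that
\[ \left\{ \begin{pmatrix} a & b \\ b & a \end{pmatrix} : a, b \in R \right\} \cong R[x]/\langle x^2 - 1 \rangle, \]
which further we know to be isomorphic to R² (as in Sheet 1, Question 7), and that
\[ \left\{ \begin{pmatrix} a & b \\ -b & a \end{pmatrix} : a, b \in R \right\} \cong R[x]/\langle x^2 + 1 \rangle \cong C \]
(as in Sheet 1, Question 2).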

We conclude this discussion with an important application of the isomorphism theorem, namely the Chinese remainder theorem. Specific instances of its use date back as far as the 3rd century, and a general algorithm for its solution in the integers to the 6th century. Historically it applies to problems such as the following.

Example 84 Say that an integer x satisfies x ≡ 3 mod 7 and x ≡ 7 mod 13. What is x mod 91?

Solution. A slightly ad hoc approach is to note that if x ≡ r mod 91, where 0 ≤ r < 91, then
x = r + 91k and so

r mod 13 = x mod 13 = 7 =⇒ r = 7, 20, 33, 46, 59, 72, 85.

But these numbers r satisfy

7, 20, 33, 46, 59, 72, 85 ≡ 0, 6, 5, 4, 3, 2, 1 mod 7

so that we see r = 59 is the only one satisfying r ≡ 3 mod 7. So we see x ≡ 59 mod 91.
More generally the Chinese remainder theorem is a means of decomposing a quotient ring
R/(I ∩ J) where I and J are coprime ideals.



Definition 85 Given ideals I and J of a ring R, then we can define the following ideals in
terms of them.
(a) Their sum I + J equals

I + J = {i + j : i ∈ I, j ∈ J}.

We say that I and J are coprime if I + J = R.


(b) Their intersection I ∩ J equals

I ∩ J = {r : r ∈ I and r ∈ J}.

(c) Their product IJ equals

\[ IJ = \left\{ \sum_{k=1}^{n} i_k j_k : n \geq 1,\ i_k \in I,\ j_k \in J \right\}. \]

Example 86 If m and n are integers, note that

⟨m⟩ + ⟨n⟩ = ⟨h⟩ where h = hcf(m, n);
⟨m⟩ ∩ ⟨n⟩ = ⟨l⟩ where l = lcm(m, n);
⟨m⟩⟨n⟩ = ⟨mn⟩.

Note that ⟨m⟩ and ⟨n⟩ are coprime ideals if and only if m and n are coprime integers.

Theorem 87 (Chinese Remainder Theorem) Let I and J be coprime ideals of a ring R with an identity 1R. Then the map
φ : R/(I ∩ J) → R/I ⊕ R/J given by r + I ∩ J ↦ (r + I, r + J)
is a well-defined isomorphism.

Proof. Firstly suppose that r1 + I ∩ J = r2 + I ∩ J. Then r1 − r2 = k for some element k lying in both I and J. Hence r1 + I = r2 + I and r1 + J = r2 + J. Hence φ is well-defined. Further

φ ((r1 + r2 ) + I ∩ J) = ((r1 + r2 ) + I, (r1 + r2 ) + J)


= (r1 + I, r1 + J) + (r2 + I, r2 + J)
= φ (r1 + I ∩ J) + φ (r2 + I ∩ J) .

Similarly

φ (r1 r2 + I ∩ J) = (r1 r2 + I, r1 r2 + J)
= (r1 + I, r1 + J) (r2 + I, r2 + J)
= φ (r1 + I ∩ J) φ (r2 + I ∩ J) ,

and so φ is a ring homomorphism. Finally we need to check that φ is 1-1 and onto. If

φ (r1 + I ∩ J) = φ (r2 + I ∩ J)



then r1 + I = r2 + I and r1 + J = r2 + J. So r1 − r2 lies in both I and J and we have

r1 + I ∩ J = r2 + I ∩ J.

Hence φ is 1-1. Finally as I and J are coprime ideals then we know that there exist i0 ∈ I and
j0 ∈ J such that i0 + j0 = 1R . So for any r1 , r2 ∈ R we have

φ(r1 j0 + r2 i0 + I ∩ J) = (r1 j0 + r2 i0 + I, r1 j0 + r2 i0 + J)
= (r1 j0 + I, r2 i0 + J)
= (r1 (1 − i0 ) + I, r2 (1 − j0 ) + J)
= (r1 + I, r2 + J)

and we see φ is also onto. [Note in the above how i0 corresponds to (0, 1) and j0 corresponds
to (1, 0) under φ.]

Example 88 Determine the inverse Z11 × Z13 → Z143 of the map

r mod 143 → (r mod 11, r mod 13).

Solution. We see from the above proof that we first need to determine integers u and v such
that 11u + 13v = 1. By inspection we see that u = 6 and v = −5 work. Under the above map

11u = 66 → (0, 1), 13v = −65 → (1, 0)

so that
−65x + 66y → (x, y)
and the map’s inverse is given by

(x mod 11, y mod 13) → 66y − 65x mod 143.
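
The recipe in the proof (find u, v with mu + nv = 1, then combine) can be sketched in a few lines. This is an illustrative sketch for small coprime integer moduli only; the function name `crt_inverse` and the brute-force search for u are our own choices.

```python
def crt_inverse(x, y, m=11, n=13):
    """Return r mod m*n with r = x mod m and r = y mod n, for coprime m and n."""
    # Find u, v with m*u + n*v = 1 by a small search (fine for small moduli).
    u = next(u for u in range(n) if (m * u) % n == 1)
    v = (1 - m * u) // n
    # m*u maps to (0, 1) and n*v maps to (1, 0), exactly as in the proof above.
    return (n * v * x + m * u * y) % (m * n)

# Round trip for Example 88: every r mod 143 is recovered from its two residues.
print(all(crt_inverse(r % 11, r % 13) == r for r in range(143)))

# Example 84 again: x = 3 mod 7 and x = 7 mod 13.
print(crt_inverse(3, 7, m=7, n=13))
```

For m = 11 and n = 13 the search finds u = 6 and v = −5, so the function computes 66y − 65x mod 143 as in the solution.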

Example 89 Note
R[x]/⟨x³ − x² + x − 1⟩ = R[x]/⟨(x − 1)(x² + 1)⟩ = R[x]/(⟨x − 1⟩ ∩ ⟨x² + 1⟩) ≅ R[x]/⟨x − 1⟩ ⊕ R[x]/⟨x² + 1⟩ ≅ R ⊕ C.

Example 90 How many units are there in Z120 ?

Solution. Using the Chinese remainder theorem note

Z120 ≅ Z/120Z = Z/(8Z ∩ 15Z) ≅ Z/8Z ⊕ Z/(3Z ∩ 5Z) ≅ Z8 ⊕ Z3 ⊕ Z5.

There are 4 units in Z8, 2 units in Z3 and 4 units in Z5 and so there are 4 × 2 × 4 = 32 units in Z120. In fact we can see that

(Z120)* ≅ (Z8)* × (Z3)* × (Z5)* ≅ C2 × C2 × C2 × C4.



3. MORE ON IDEALS AND QUOTIENT RINGS.
FIELD EXTENSIONS.

We now move from the isomorphism theorem to the more general connection between ideals
and quotient rings.

Proposition 91 Let R be a ring and I an ideal of R. Then there is a 1-1 correspondence between the ideals of R that contain I and the ideals of R/I.

Proof. Let π : R → R/I denote the quotient map and let J be an ideal of R which contains I. Then I is also an ideal of J, as it is still closed under addition and under multiplication by elements of J. Further J̄ = π(J) = J/I is an ideal of R/I. Conversely say that J̄ is an ideal of R/I. If we define
J = π⁻¹(J̄) = {r ∈ R : r̄ ∈ J̄},
then this is an ideal of R and contains I as 0̄ ∈ J̄. The above maps J ↦ π(J) and J̄ ↦ π⁻¹(J̄) are inverse processes of one another and the result follows.

Example 92 The ideals of Z6 ≅ Z/6Z are

{0̄}, {0̄, 2̄, 4̄}, {0̄, 3̄}, Z6 .

These correspond to the ideals 6Z, 2Z, 3Z, Z of Z which contain 6Z.

Definition 93 (a) A proper ideal I of a ring R is said to be prime if whenever ab ∈ I then a ∈ I or b ∈ I (or both).
(b) A proper ideal I of a ring R is said to be maximal if the only ideal of R to strictly
contain I is R itself. (Note therefore that R is not maximal.)

Proposition 94 Let I be a proper ideal of a CRI R. Then


(a) I is prime if and only if R/I is an integral domain.
(b) I is maximal if and only if R/I is a field.

Proof. (a) Assume I is prime and say that r̄s̄ = 0̄. Then rs ∈ I and so r ∈ I or s ∈ I. But this is equivalent to r̄ = 0̄ or s̄ = 0̄.
Conversely, say that R/I is an integral domain. If rs ∈ I then r̄s̄ = 0̄ (as r̄s̄ is the coset of rs) and, as R/I is an integral domain, then r̄ = 0̄ or s̄ = 0̄ or equivalently r ∈ I or s ∈ I. Hence I is prime.
(b) Say that I is maximal. Say that r̄ ≠ 0̄ in R/I. Then the ideal J = ⟨I, r⟩ strictly contains I and so by maximality J = R. Consequently we can write 1 = sr + i for some s ∈ R and i ∈ I. We then have s̄r̄ = 1̄. So r̄ is a unit in R/I and R/I is a field.
Conversely say that R/I is a field. Then the only ideals of R/I are {0̄} and R/I. These correspond (under Proposition 91) to I and R being the only ideals of R containing I and hence I is maximal.



Corollary 95 Maximal ideals are prime. (This is something that can also be proved directly
in a relatively straightforward manner.)

Example 96 The prime ideals of Z are pZ where p is prime or p = 0.


The maximal ideals of Z are pZ where p is prime.

Example 97 (See Sheet 1, #7) The ideal ⟨x² − 1⟩ is not prime in Q[x] as Q[x]/⟨x² − 1⟩ ≅ Q² is not an integral domain.
The ideal ⟨x² − 2⟩ is maximal in Q[x] as Q[x]/⟨x² − 2⟩ ≅ Q[√2] is a field.

Example 98 (See Sheet 1, #6) The ideal ⟨x² + x + 1⟩ is maximal in Z2[x] as Z2[x]/⟨x² + x + 1⟩ is a field.

Proposition 99 Let F be a field.


(a) Every ideal of F [x] is principal.
(b) An ideal ⟨p(x)⟩ is maximal if and only if p(x) is irreducible (that is, p(x) cannot be written as a product of two non-constant polynomials).

Proof. (a) Let I be an ideal of F[x]. If I = {0} then we have I = ⟨0⟩. Otherwise there is a non-zero polynomial f(x) in I of least degree. By the division algorithm, for any g(x) in I we can write
g(x) = q(x)f(x) + r(x) where deg r < deg f or r = 0.
As r = g − qf ∈ I then we must have r = 0 and so g(x) = q(x)f(x) ∈ ⟨f(x)⟩. But ⟨f(x)⟩ ⊆ I by definition and hence I = ⟨f(x)⟩.
(b) Say that p(x) is irreducible. Then for any q(x) ∉ ⟨p(x)⟩, p(x) and q(x) are coprime and so by Bézout’s lemma there exist u(x), v(x) ∈ F[x] such that

u(x)p(x) + v(x)q(x) = 1.

But then ⟨p(x), q(x)⟩ = F[x] and so ⟨p(x)⟩ is maximal. Conversely say that p(x) is not irreducible and p(x) = a(x)b(x) where neither a nor b is a constant polynomial. Then ⟨p(x)⟩ is strictly contained in ⟨a(x)⟩ ≠ F[x] and so ⟨p(x)⟩ is not maximal.

Proposition 100 In an integral domain R in which every ideal is principal, then non-zero
prime ideals are maximal.

Proof. Let I ⊳ R with I a non-zero prime ideal. Then there exists x ≠ 0 such that I = ⟨x⟩. For r ∉ I then ⟨x, r⟩ = ⟨y⟩ for some y as every ideal is principal. This in particular means that x = yu for some u. As yu ∈ I and I is prime then either y ∈ I or u ∈ I. Were it the case that y ∈ I = ⟨x⟩ then we’d have y = xv for some v and so x = xvu; by the cancellation law u would be a unit and we’d have I = ⟨y⟩ = ⟨x, r⟩ which is a contradiction as ⟨x, r⟩ strictly contains I. So u ∈ I, say u = xw, giving x = yxw; by the cancellation law yw = 1, so y is a unit and hence ⟨x, r⟩ = R. We have therefore shown that I is maximal.



Example 101 As x³ + x + 1 has no roots in Z2 then the polynomial is irreducible. (Any reduction of a cubic polynomial involves a linear factor.) So

F8 = Z2[x]/⟨x³ + x + 1⟩

is a field with 8 elements. Every coset has a representative a + bx + cx² where a, b, c ∈ Z2.

Similarly x² + 1 is irreducible over Z3 and

F9 = Z3[x]/⟨x² + 1⟩

is a field with 9 elements.

Note in the first case that we can identify Z2 with {0, 1} ⊆ F8, its prime subfield, and similarly we can identify Z3 with {0, 1, 2} ⊆ F9.

The above is part of a more general approach to adjoining a root.

Proposition 102 Let F be a field and f (x) be an irreducible polynomial in F [x]. Then

K = F[x]/⟨f(x)⟩

contains an isomorphic copy of F and there is a root of f(x) in K.

Proof. ⟨f(x)⟩ is maximal and so K is a field. K contains an isomorphic copy of F when we identify F with the cosets
a ↔ a + ⟨f(x)⟩ = ā.
Consequently we can make sense of p(k) where k ∈ K and p(x) ∈ F[x]. In particular we have that
f(x̄) = f(x + ⟨f(x)⟩) = f(x) + ⟨f(x)⟩ = 0K
and we see that f has a root x̄ in K.

Example 103 The polynomial x² + x + 1 is irreducible in Z2[x]. If we set

F4 = Z2[y]/⟨y² + y + 1⟩

then we note that y is a root of x² + x + 1 in F4. We then note

x² + x + 1 = (x − y)(x + y + 1)

as expansion gives
x² + (y + 1 − y)x + (−y² − y) = x² + x + 1.




Example 104 The polynomial x³ − 2 is irreducible over Q. If we set α = ∛2 then we see, with

K = Q[α] = {q1 + q2α + q3α² : qi ∈ Q},

that in K[x] we can factorize as

x³ − 2 = x³ − α³ = (x − α)(x² + αx + α²).

However we cannot factorize further over K — the other two roots are complex and K is a
subfield of R.

Definition 105 Let F be a field. A field extension K of F is a field which contains F as a subfield. We denote this extension as K : F.
K can then be considered as a vector space over F. The degree of the field extension is denoted |K : F| and equals dimF K.

Example 106 (a) C is a field extension of R of degree 2. A basis of C over R is {1, i}.
(b) R is a field extension of Q of infinite degree. To see this note that any finite degree field extension of Q is isomorphic as a vector space to Qⁿ for some n and in particular is countable. As R is uncountable this cannot be the case.
(c) Q[√2] is a field extension of Q of degree 2. A basis of Q[√2] over Q is {1, √2}.
(d) Q[∛2] is a field extension of Q of degree 3. A basis of Q[∛2] over Q is {1, ∛2, ∛4}.
(e) F8 = Z2[x]/⟨x³ + x + 1⟩ is a field extension of Z2 of degree 3 with a basis being {1, x, x²}.
(f) F9 = Z3[x]/⟨x² + 1⟩ is a field extension of Z3 of degree 2 with a basis being {1, x}.

Proposition 107 A finite field has order pⁿ for some prime p and a positive integer n.

Proof. Let F be a finite field. As a field of characteristic 0 contains a prime subfield isomorphic to Q, which is infinite, F has characteristic p for some prime p and contains a copy of Zp as its prime subfield. F is then a field extension of some degree n over its prime subfield and so F is isomorphic to (Zp)ⁿ as a vector space. Hence |F| = |(Zp)ⁿ| = pⁿ.

Remark 108 (Off-syllabus) There exists a field of order pⁿ for each prime p and positive integer n. One can show this by demonstrating that, for each n, there is an irreducible polynomial over Zp of degree n. It can further be shown that, up to isomorphism, there is a unique field of order pⁿ, denoted Fq where q = pⁿ. Finite fields are commonly called Galois fields in honour of Évariste Galois who first proved these results. The multiplicative group of any finite field is cyclic though it can be computationally difficult to find a generator, and consequently finite fields can be used in this way in cryptography.

Example 109 Give an example of a field F16 with 16 elements. How many generators does its
multiplicative group have?



Solution. We can produce such a field by finding an irreducible quartic over Z2. Consider x⁴ + x + 1. Certainly neither 0 nor 1 is a root; it remains to show that x⁴ + x + 1 cannot be written as a product of two irreducible quadratics. The only irreducible quadratic over Z2 is x² + x + 1 and we note that

(x² + x + 1)² = x⁴ + 2x³ + 3x² + 2x + 1 = x⁴ + x² + 1,

which is not equal to our polynomial. Hence

F16 = Z2[x]/⟨x⁴ + x + 1⟩

is a field of 16 elements. Given what we’ve been told above, (F16)* ≅ Z15 which has 8 generators, 1, 2, 4, 7, 8, 11, 13, 14. Note in F16 that x³ ≠ 1 and x⁵ = x² + x ≠ 1 and so x does not have order 3 or 5 and so must have order 15. So the 8 generators of (F16)* are

x, x², x⁴, x⁷, x⁸, x¹¹, x¹³, x¹⁴,

whose cosets have "preferred" representatives

x, x², x + 1, x³ + x + 1, x² + 1, x³ + x² + x, x³ + x² + 1, x³ + 1.
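
The powers of x can be tabulated mechanically. The following sketch is one possible encoding, not the notes' own: a coset representative of degree at most 3 is stored as a 4-bit integer whose bit i is the coefficient of xⁱ, and multiplication by x is a shift followed by a reduction using x⁴ = x + 1.

```python
from math import gcd

MOD = 0b10011  # x^4 + x + 1, bit i holding the coefficient of x^i

def times_x(p):
    """Multiply a coset representative by x and reduce modulo x^4 + x + 1."""
    p <<= 1
    if p & 0b10000:
        p ^= MOD
    return p

def to_text(p):
    """Render a bitmask as a polynomial in x, highest power first."""
    terms = ["1" if i == 0 else "x" if i == 1 else f"x^{i}"
             for i in range(3, -1, -1) if (p >> i) & 1]
    return " + ".join(terms)

# powers[k] is the preferred representative of x^k for k = 1, ..., 15.
powers = {}
p = 1
for k in range(1, 16):
    p = times_x(p)
    powers[k] = p

order = min(k for k, v in powers.items() if v == 1)
generators = [k for k in range(1, 15) if gcd(k, 15) == 1]
for k in generators:
    print(f"x^{k} = {to_text(powers[k])}")
```

Here `order` is the multiplicative order of x, and the exponents in `generators` are those coprime to 15.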

Definition 110 Given a field extension K : F and α ∈ K, we say that α is algebraic over F if there exists a non-zero polynomial f(x) ∈ F[x] such that f(α) = 0.

Proposition 111 Let K : F be a field extension and α ∈ K be algebraic over F.


(a) There exists a unique monic polynomial m(x) in F [x] of least degree such that m(α) = 0.
The polynomial m(x) is known as the minimal polynomial of α.
(b) For f(x) ∈ F [x], we have f (α) = 0 if and only if m(x) divides f(x). Further m(x) is
irreducible.
(c) F [α] is a subfield of K of degree equal to the degree of m(x).

Proof. (a) As there exists a polynomial f (x) such that f (α) = 0 then there exists such a
polynomial m(x) of least degree. By dividing by its leading coefficient if necessary we can
assume m(x) to be monic. If m1 (x) and m2 (x) were monic polynomials of least degree such
that m1 (α) = m2 (α) = 0 then m1 − m2 would be a polynomial of strictly lower degree with α
as a root — a contradiction.
(b) By the division algorithm we can write f (x) = q(x)m(x) + r(x) where deg r < deg m
or r = 0. We then have 0 = f (α) = r(α) and by the minimality of m’s degree it must be
that r = 0. Hence f (x) = q(x)m(x) and we see m(x) divides f (x). The converse is obvious.
Finally if m(x) were reducible as m(x) = g(x)h(x) then we’d have g(α)h(α) = m(α) = 0. As
K is a field, and so an integral domain, g(α) = 0 or h(α) = 0 and either would contradict the
minimality of m.
(c) Certainly F[α] is a CRI; it only remains to show that non-zero elements are units. Say that g(α) ∈ F[α] and g(α) ≠ 0. Then m(x) does not divide g(x) and as m(x) is



irreducible then it follows that g(x) and m(x) are coprime. By Bézout’s Lemma in F [x] we
know that there are polynomials u(x) and v(x) such that

u(x)g(x) + v(x)m(x) = 1.

But then u(α)g(α) = 1 and we see that g(α) is a unit. Finally, if n = deg m(x) and f(α) ∈ F[α] we see from the division algorithm that f(α) = r(α) for some r(α) in the span of 1, α, …, αⁿ⁻¹. Further 1, α, …, αⁿ⁻¹ are independent by the minimality of m(x) and so form a basis for F[α] over F. Hence n = |F[α] : F| as well.

Example 112 Let F16 be as in Example 109. Find the minimal polynomials of (i) x, (ii) x + 1, (iii) x² + x.

Solution. (i) We know that x is a root of y⁴ + y + 1 which is irreducible, so this must be its minimal polynomial.
(ii) We similarly then have that x + 1 is a root of y⁴ + y + 1, as substituting y + 1 for y leaves this polynomial unchanged:

(y + 1)⁴ + (y + 1) + 1 = y⁴ + 4y³ + 6y² + 4y + 1 + y + 1 + 1 = y⁴ + y + 1.

This again is irreducible and so x + 1 has this as its minimal polynomial.


(iii) Note with α = x² + x we have

α² = x⁴ + 2x³ + x² = (x + 1) + 0 + x² = α + 1.

Hence the minimal polynomial of α is y² + y + 1. In particular this means that

Z2[α] = {0, 1, α, α + 1} = {0, 1, x² + x, x² + x + 1}

is a subfield of F16 .

Proposition 113 (Tower Law) Let L, K, F be fields with F ⊆ K ⊆ L. Then L has finite
degree over F if and only if |L : K| and |K : F | are finite. In this case

|L : F | = |L : K| |K : F | .

Proof. Say that |L : K| = m and |K : F | = n are finite and l1 , . . . , lm are a basis for L over K
and k1 , . . . , kn are a basis for K over F. Then we will show that

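{li kj : 1 ≤ i ≤ m, 1 ≤ j ≤ n}

is a basis for L over F. Independence: say that

\[ 0 = \sum_{i,j} f_{ij} l_i k_j = \sum_i \Big( \sum_j f_{ij} k_j \Big) l_i \quad \text{for some } f_{ij} \in F. \]

As the li are independent over K then it follows, for each i, that

\[ \sum_j f_{ij} k_j = 0. \]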



And as the kj are independent over F then it follows that fij = 0 for all i, j. Consequently the li kj are linearly independent elements of L over F. Spanning: say that l ∈ L. Then there exist κi ∈ K such that
\[ l = \sum_i \kappa_i l_i. \]
Similarly there exist φij ∈ F such that
\[ \kappa_i = \sum_j \phi_{ij} k_j \]
so that
\[ l = \sum_i \sum_j \phi_{ij} k_j l_i \]
and we see that the li kj are spanning.

Conversely if dimF L is finite then so is dimF K as K is a subspace of L (over F) and dimK L ≤ dimF L as any set that spans L over F also spans L over K.

Example 114 Find the degrees of the following extensions.

Q[√2, i] : Q and Q[√2, √3] : Q.

Solution. The former is somewhat easier. The minimal polynomial of √2 over Q is x² − 2. It cannot be linear as √2 is irrational. So |Q[√2] : Q| = 2. Similarly the minimal polynomial of i over Q[√2] is x² + 1; it cannot be linear as Q[√2] is a real subfield and i is not real. So

|Q[√2, i] : Q| = |Q[√2, i] : Q[√2]| |Q[√2] : Q| = 2 × 2 = 4.

Again |Q[√2] : Q| = 2. We need to take a little care to show that

|Q[√2, √3] : Q[√2]| = 2.

Certainly x² − 3 has √3 as a root, but is this polynomial irreducible over Q[√2]? A general element of Q[√2] is q1 + q2√2 where q1, q2 ∈ Q. If x² − 3 reduced in Q[√2] then we’d have

(q1² + 2q2²) + 2q1q2√2 = (q1 + q2√2)² = 3

for some q1, q2. As 1 and √2 are independent over Q then we have q1 = 0 or q2 = 0, both of which lead to contradictions (neither 2q2² = 3 nor q1² = 3 has a rational solution). Hence x² − 3 is the minimal polynomial of √3 over Q[√2] and we have

|Q[√2, √3] : Q| = |Q[√2, √3] : Q[√2]| |Q[√2] : Q| = 2 × 2 = 4.

Example 115 Show that F16 does not contain a subfield of order 8.

Solution. If F16 has a subfield K of order 8 we would have |K : Z2 | = 3 and


4 = |F16 : Z2 | = |F16 : K| |K : Z2 | = |F16 : K| × 3
which is a contradiction as 3 does not divide 4.



4. FACTORIZATION. EDS. PIDS. UFDS.

Throughout this section, rings will be assumed to be CRIs (commutative and having an identity)
unless otherwise stated.

Definition 116 Let R be a ring and let a, b, c ∈ R.
(i) If a ≠ 0, we say that a divides b if there exists r ∈ R such that ra = b. This is written

a|b.

Equivalently we say that a is a factor of b or that b is a multiple of a.
(ii) If c ≠ 0, we say that c is a common factor of a and b if c|a and c|b.
(iii) If c is a common factor of a and b, then we say that c is a highest common factor (or hcf) of a and b if whenever d is a common factor of a and b then d|c.
(iv) If a, b ≠ 0 we say that c is a common multiple of a and b if a|c and b|c.
(v) If c is a common multiple of a and b, then we say that c is a least common multiple (or lcm) of a and b if whenever d is a common multiple of a and b then c|d.

Proposition 117 In an integral domain, the hcf of two elements, if it exists, is unique up to
multiplication by a unit. The corresponding result also holds for lcms.

Proof. Let R be an integral domain and a, b be non-zero elements of R. Say that h1 and h2
are two hcfs of a and b. In particular h1 and h2 are common factors of a and b, so that h1 |h2 as
h2 is an hcf and h2 |h1 as h1 is an hcf . So there exist r1 , r2 ∈ R such that

h1 r1 = h2 , h2 r2 = h1 .

Hence h1(r1r2 − 1R) = 0 and, as h1 ≠ 0, by the cancellation law r1r2 = 1R and r1 and r2 are both units in R.

Example 118 (i) Let R = Z. Let a = 105 and b = 441. Then 21 and −21 are hcfs of a and b.
(ii) Let R = Q[x]. Let a = x² − 3x + 2 and b = x² − 2x + 1. Then the hcfs of a and b are of the form c(x − 1) where c is a non-zero rational.
(iii) Let R = Z[x]. Let a = 2x − 2 and b = 4x2 − 2x + 6. Then the hcfs are 2 and −2.
(iv) Let R = Z[i] = {x + yi : x, y ∈ Z}. Let a = 2 and b = −1 + 3i. Then the hcfs are 1 + i, −1 − i, −1 + i, 1 − i.
(v) Let R = Z[√−3] = {x + y√−3 : x, y ∈ Z}. Then a = 4 and b = 2 + 2√−3 have no highest common factor. To appreciate this:

if 4 = (α + β√−3)(γ + δ√−3) then 16 = (α² + 3β²)(γ² + 3δ²)

and, knowing the factorizations of 16 in Z, we see that the factors of 4 are ±1, ±2, ±1 ± √−3, ±4 (with independent choices of sign), and

if 2 + 2√−3 = (α + β√−3)(γ + δ√−3) then 16 = (α² + 3β²)(γ² + 3δ²)

and so we see that the factors of 2 + 2√−3 are ±1, ±2, ±1 ± √−3, ±(2 + 2√−3). So the common factors of a and b are
±1, ±2, ±1 ± √−3.

1 cannot be a highest common factor as it is not divisible by the other common factors. However none of 2, 1 + √−3 and 1 − √−3 is a highest common factor as none of them divides another.
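
The lists of factors in part (v) can be confirmed by a finite search, since any factor of an element of norm 16 has norm at most 16. The sketch below is illustrative only: it stores a + b√−3 as the pair (a, b) and tests divisibility by multiplying by the conjugate and dividing by the norm; the names are our own.

```python
def divides(d, z):
    """Does d = (a, b), meaning a + b*sqrt(-3), divide z = (c, e) in Z[sqrt(-3)]?"""
    (a, b), (c, e) = d, z
    norm = a * a + 3 * b * b
    if norm == 0:
        return False
    # z / d = z * conj(d) / norm; both components must be integers.
    real = c * a + 3 * e * b
    imag = e * a - c * b
    return real % norm == 0 and imag % norm == 0

# Any factor of 4 or of 2 + 2*sqrt(-3) has norm at most 16, so |a| <= 4 and |b| <= 2.
candidates = [(a, b) for a in range(-4, 5) for b in range(-4, 5)]
common = [d for d in candidates if divides(d, (4, 0)) and divides(d, (2, 2))]

# A highest common factor would be a common factor divisible by every common factor.
hcfs = [h for h in common if all(divides(d, h) for d in common)]
print(sorted(common))
print(hcfs)
```

The second list comes out empty, which is the statement that 4 and 2 + 2√−3 have no highest common factor.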

Definition 119 In an integral domain, non-zero elements are said to be coprime if 1R (or
equivalently a unit) is an hcf of theirs.

Proposition 120 Let R be an integral domain and a, b ∈ R. If there exist u, v ∈ R such that
ua + vb = 1R then a and b are coprime.

Proof. Let c be a common factor of a and b. Then there exist r, s ∈ R such that a = cr
and b = cs. Hence
1R = ua + vb = (ur + vs)c
and we see that c is a unit. Finally, by definition, units are factors of all elements in a ring.

Definition 121 Given a ring R we say that a non-zero, non-unit x ∈ R is


(a) a prime element if whenever x|yz then x|y or x|z.
(b) an irreducible element if whenever x = yz then y is a unit or z is a unit.

Proposition 122 (a) Given a non-zero, non-unit x, then the principal ideal ⟨x⟩ is prime if and only if x is a prime element.
(b) In an integral domain, prime elements are irreducible.
(c) 2 is irreducible but not prime in Z[√−3].

Proof. (a) Say that x is a prime element and ab ∈ ⟨x⟩. Then x|ab and as x is prime we have x|a or x|b. That is a ∈ ⟨x⟩ or b ∈ ⟨x⟩. Conversely say that ⟨x⟩ is prime with x non-zero and a non-unit. If x|ab then ab ∈ ⟨x⟩ and so a ∈ ⟨x⟩ or b ∈ ⟨x⟩, i.e. x|a or x|b. As x is a non-zero non-unit then x is a prime element.
(b) Let R be an integral domain and x be prime. If x = yz then x|yz and so x|y or x|z. If x|y then y = ux for some u and then x = uxz. By the cancellation rule uz = 1 and so z is a unit; similarly if x|z then y is a unit. This shows that x is irreducible.
(c) Note that 2|(1 + √−3)(1 − √−3) = 4 but that 2 divides neither of 1 ± √−3. So 2 is not prime. Now suppose we could write

2 = (a + b√−3)(c + d√−3).

Taking the modulus squared we see

4 = (a² + 3b²)(c² + 3d²).

If we have a² + 3b² = 1 then a + b√−3 is a unit. So a genuine reduction of 2 would mean

a² + 3b² = c² + 3d² = 2

and these equations have no solution. Hence 2 is irreducible.



4.1 Euclidean Domains (EDs)

Definition 123 A Euclidean domain R is essentially an integral domain which admits the division algorithm. That is, R is an integral domain for which there exists a function, often called a norm, d : R\{0} → N = {0, 1, 2, 3, …} such that
(a) d(a) ≤ d(ab) for all a, b ∈ R\{0}.
(b) given a, b ∈ R with b ≠ 0, there exist q, r ∈ R such that a = qb + r with d(r) < d(b) or r = 0.

Example 124 (a) The integers Z form an ED with d(x) = |x| for x ≠ 0.
(b) Given a field F, the polynomial ring F[x] forms an ED with d(f) = deg f for any f ≠ 0.

Proposition 125 The Gaussian integers Z[i] form an ED with

d(a + bi) = |a + bi|² = a² + b² where a + bi ≠ 0.

Proof. We immediately have for non-zero α, β in Z[i] that

d(αβ) = |αβ|² = |α|²|β|² ≥ |α|² = d(α).

Further we have that α/β ∈ C and, noting the Gaussian integers form a grid of unit squares in C, there exists q ∈ Z[i] such that
|α/β − q| ≤ 1/√2.
If we set r = α − qβ ∈ Z[i] then we have
d(r) = |α − qβ|² = |α/β − q|² |β|² ≤ (1/2)|β|² = (1/2)d(β) < d(β).
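
The proof is constructive: q is obtained by rounding the real and imaginary parts of α/β to nearest integers. A minimal sketch of that step, using exact integer arithmetic and with a + bi stored as the pair (a, b), might look as follows; the function name `gaussian_divmod` is hypothetical.

```python
def gaussian_divmod(alpha, beta):
    """Return (q, r) in Z[i] with alpha = q*beta + r and d(r) < d(beta), for beta != 0."""
    (a, b), (c, e) = alpha, beta
    norm = c * c + e * e
    # alpha / beta = alpha * conj(beta) / norm
    real, imag = a * c + b * e, b * c - a * e
    # Round each coordinate to a nearest integer: floor(t + 1/2) with t = real/norm, imag/norm.
    q = ((2 * real + norm) // (2 * norm), (2 * imag + norm) // (2 * norm))
    r = (a - (q[0] * c - q[1] * e), b - (q[0] * e + q[1] * c))
    return q, r

# Dividing -1 + 3i by 1 + i.
print(gaussian_divmod((-1, 3), (1, 1)))
```

Rounding each coordinate leaves an error of at most 1/2 in each direction, which is exactly the bound |α/β − q| ≤ 1/√2 used in the proof.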


Example 126 The ring Z[√2] is an ED with

d(a + b√2) = |a² − 2b²|.

Show that the equations x2 − 2y 2 = 1 and x2 − 2y 2 = −1 each have infinitely many integer
solutions.
Solution. Firstly note that d is multiplicative. If α = a1 + a2√2 and β = b1 + b2√2 then

d((a1 + a2√2)(b1 + b2√2)) = d((a1b1 + 2a2b2) + (a2b1 + a1b2)√2)
= |(a1b1 + 2a2b2)² − 2(a2b1 + a1b2)²|
= |a1²b1² + 4a1a2b1b2 + 4a2²b2² − 2a2²b1² − 2a1²b2² − 4a1a2b1b2|
= |(a1² − 2a2²)(b1² − 2b2²)|
= d(a1 + a2√2) d(b1 + b2√2).




Let α, β ∈ Z[√2]\{0}. So we have d(α) ≤ d(α)d(β) = d(αβ) as d(β) ≥ 1. Also we have α/β = q1 + q2√2 ∈ Q[√2] and so we can find γ = c1 + c2√2 ∈ Z[√2] such that |qi − ci| ≤ 1/2, so that (extending d to Q[√2] by the same formula)

d(α/β − γ) = |(q1 − c1)² − 2(q2 − c2)²|
≤ (q1 − c1)² + 2(q2 − c2)²
≤ 1/4 + 2/4 = 3/4.

If we set δ = α − βγ then

d(δ) = d(α − βγ) = d(α/β − γ) d(β) ≤ (3/4) d(β) < d(β).

If instead we had defined D(a + b√2) = a² − 2b² then we see as above that D is multiplicative. As D(1 + √2) = 1² − 2 × 1² = −1 it then follows, writing (1 + √2)ⁿ = xn + √2 yn with xn, yn ∈ Z, that

D(xn + √2 yn) = D((1 + √2)ⁿ) = (−1)ⁿ

and so (x2n, y2n) is a solution of x² − 2y² = 1 and (x2n+1, y2n+1) is a solution of x² − 2y² = −1. As the xn are strictly increasing, each equation has infinitely many solutions.
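
The pairs (xn, yn) are easy to generate, since multiplying x + y√2 by 1 + √2 gives (x + 2y) + (x + y)√2. The following sketch, with a generator name of our own choosing, lists the first few pairs together with the value of x² − 2y².

```python
def pell_pairs(count):
    """Yield (n, x_n, y_n) with x_n + y_n*sqrt(2) = (1 + sqrt(2))**n for n = 1..count."""
    x, y = 1, 1
    for n in range(1, count + 1):
        yield n, x, y
        # (x + y*sqrt(2)) * (1 + sqrt(2)) = (x + 2y) + (x + y)*sqrt(2)
        x, y = x + 2 * y, x + y

for n, x, y in pell_pairs(6):
    print(n, x, y, x * x - 2 * y * y)
```

The last column alternates between −1 and 1, as the multiplicativity of D predicts.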

Proposition 127 Let R be an ED and I ⊳ R. Then I is principal.

Proof. If I = {0R} then we are done as I = ⟨0R⟩. Otherwise there exists x ∈ I with x ≠ 0 and d(x) minimal. Certainly ⟨x⟩ ⊆ I. Conversely if y ∈ I then there exist q, r ∈ R such that y = qx + r where d(r) < d(x) or r = 0. As r = y − qx ∈ I then r = 0 by the minimality of d(x). So y = qx and hence I ⊆ ⟨x⟩. Hence I is principal.

Corollary 128 In an ED, irreducible elements are prime.

Proof. Say that x is irreducible and ⟨x⟩ ⊆ I ⊳ R. There exists y such that ⟨y⟩ = I and so there exists z such that x = yz. By the irreducibility of x we either have y is a unit and I = ⟨y⟩ = R or z is a unit and we have I = ⟨y⟩ = ⟨x⟩. Hence ⟨x⟩ is maximal, and so prime, and hence x is a prime element.

Example 129 Z[x] and Q[x, y] are not Euclidean domains.

Solution. In the first case note that ⟨2, x⟩ is not principal, and in the second case ⟨x, y⟩ is not principal.



4.2 Aside — the Euclidean Algorithm (Off-syllabus)

The Euclidean algorithm employs the division algorithm repeatedly to find the hcf of two integers a and b. The algorithm first appeared in Euclid’s Elements Book VII in around 300 B.C. First we describe an example in Z to see how the algorithm works.

Example 130 Find the highest common factor of 53714 and 30281.

Solution. Let n1 = 53714 (the greater of the two numbers) and n2 = 30281.
n1 /n2 = 1.77 . . . and so q1 = 1 and we set n3 = r1 = n1 − 1 × n2 = 23433;
n2 /n3 = 1.29 . . . and so q2 = 1 and n4 = r2 = n2 − 1 × n3 = 6848;
n3 /n4 = 3.42 . . . and so q3 = 3 and n5 = r3 = n3 − 3 × n4 = 2889;
n4 /n5 = 2.37 . . . and so q4 = 2 and n6 = r4 = n4 − 2 × n5 = 1070;
n5 /n6 = 2.7 and so q5 = 2 and n7 = r5 = n5 − 2 × n6 = 749;
n6 /n7 = 1.42 . . . and so q6 = 1 and n8 = r6 = n6 − 1 × n7 = 321;
n7 /n8 = 2.33 . . . and so q7 = 2 and n9 = r7 = n7 − 2 × n8 = 107;
n8 /n9 = 3 and so q8 = 3 and n10 = r8 = n8 − 3 × n9 = 0.
At this point the algorithm terminates and the output is the last positive number namely 107.
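
The same steps can be carried out mechanically. The short sketch below is an illustration for positive integers only; the function name `euclid` is our own. It records each quotient qi and returns the last non-zero remainder.

```python
def euclid(a, b):
    """Return (hcf, quotients) for positive integers a and b using repeated division."""
    quotients = []
    while b != 0:
        q, r = divmod(a, b)   # a = q*b + r with 0 <= r < b
        quotients.append(q)
        a, b = b, r
    return a, quotients

print(euclid(53714, 30281))
```

Applied to 53714 and 30281 this reproduces the quotients q1, …, q8 found above.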

Algorithm 131 (Euclidean algorithm) Let a1 and a2 be non-zero elements of an ED with d(a1) ≥ d(a2). Then the Euclidean algorithm defines two sequences qi and ai of elements of the ED by

a1 = q1a2 + a3 where d(a3) < d(a2) or a3 = 0
a2 = q2a3 + a4 where d(a4) < d(a3) or a4 = 0
a3 = q3a4 + a5 where d(a5) < d(a4) or a5 = 0

and so on. The algorithm terminates if ak = 0 for some k.

Theorem 132 The Euclidean algorithm always terminates with ak = 0 for some k and

ak−1 = hcf (a1 , a2 ) .

Proof. The sequence of non-negative integers d(ai) is strictly decreasing and bounded below, and so can only be finite in length. If ai ≠ 0 then it is possible to run (at least) one further application of the division algorithm and so the Euclidean algorithm terminates when ak = 0 for some k.
Firstly, I claim ak−1 divides ai for 1 ≤ i ≤ k. The proof follows by reverse induction. Certainly ak−1 |ak−1 and ak−1 |ak = 0. Suppose (as an inductive hypothesis) that ak−1 divides ar and ar+1. Then
ar−1 = qr−1 ar + ar+1
is also divisible by ak−1. Hence by induction ak−1 is a common factor of a1 and a2.



Secondly, we need to show that ak−1 = hcf (a1 , a2 ) . Suppose that m is a common factor
of a1 and a2 . I claim that m is also a factor of ak−1 with the proof following by induction.
Suppose (as an inductive hypothesis) that m divides ar and ar+1 . Then m divides
ar+2 = ar − qr ar+1
also. By induction m divides ak−1 also completing the proof.
Theorem 133 (Bézout’s Lemma) Let a, b be non-zero elements of an ED R with a highest
common factor h. Then there exist u and v in R such that
ua + vb = h.
Proof. Set a1 = a and a2 = b. The proof uses reverse induction working backwards through the calculations performed in the Euclidean Algorithm. We will show that, for each i = 1, 2, …, k − 2, there exist ui and vi such that
ui ai + vi ai+1 = h. (4.1)
As h = ak−1 then we can set uk−2 = 0 and vk−2 = 1 to see that (4.1) is true for i = k − 2. Now suppose, as an inductive hypothesis, that (4.1) holds true for i = I, where 1 < I ≤ k − 2, and we shall aim to show it’s true for i = I − 1. We have
h = uI aI + vI aI+1
= uI aI + vI (aI−1 − qI−1 aI )
= vI aI−1 + (uI − qI−1 vI ) aI
thus proving that (4.1) holds true for i = I − 1 with uI−1 = vI and vI−1 = uI − qI−1 vI . By
induction (4.1) holds true when i = 1 which is the required result for the case of a and b.
Example 134 Find integers u and v such that 53714u + 30281v = 107.
Solution. Recall from our earlier calculations that
n3 = n1 − n2 ; n4 = n2 − n3 ; n5 = n3 − 3n4 ;
n6 = n4 − 2n5 ; n7 = n5 − 2n6 ; n8 = n6 − n7 ;
with n9 = n7 − 2n8 = 107 being the highest common factor. We will "reverse" the above equations to write n9 in terms of n7 and n8, then in terms of n6 and n7, and so on repeatedly until we have n9 in terms of n1 and n2. We see
107 = n7 − 2n8
= n7 − 2(n6 − n7) = −2n6 + 3n7
= −2n6 + 3(n5 − 2n6) = 3n5 − 8n6
= 3n5 − 8(n4 − 2n5) = −8n4 + 19n5
= −8n4 + 19(n3 − 3n4) = 19n3 − 65n4
= 19n3 − 65(n2 − n3) = −65n2 + 84n3
= −65n2 + 84(n1 − n2) = 84n1 − 149n2.
Hence u = 84 and v = −149 will do.



Example 135 Find the multiplicative inverse of 2167 in mod 65537 arithmetic.

Solution. We first find integers u and v such that 2167u + 65537v = 1. Applying first the
Euclidean algorithm we set:
Let n1 = 65537 and n2 = 2167.
As n1 /n2 = 30.24 . . . then q1 = 30 and we set n3 = r1 = n1 − 30 × n2 = 527.
As n2 /n3 = 4.11 . . . then q2 = 4 and n4 = r2 = n2 − 4 × n3 = 59.
As n3 /n4 = 8.93 . . . then q3 = 8 and n5 = r3 = n3 − 8 × n4 = 55.
As n4 /n5 = 1.07 . . . then q4 = 1 and n6 = r4 = n4 − n5 = 4.
As n5 /n6 = 13.75 then q5 = 13 and n7 = r5 = n5 − 13 × n6 = 3.
As n6 /n7 = 1.33 . . . then q6 = 1 and n8 = r6 = n6 − n7 = 1.
Hence hcf (65537, 2167) = 1 as required and we can work backwards to find

1 = n6 − n7
= n6 − (n5 − 13n6 ) = −n5 + 14n6
= −n5 + 14 (n4 − n5 ) = 14n4 − 15n5
= 14n4 − 15 (n3 − 8n4 ) = −15n3 + 134n4
= −15n3 + 134 (n2 − 4n3 ) = 134n2 − 551n3
= 134n2 − 551 (n1 − 30n2 ) = −551n1 + 16664n2 .

Hence we have that


(−551) × (65537) + (16664) × (2167) = 1.
So in mod 65537 arithmetic we have

16664 × 2167 = 1 mod 65537 or equally 2167^(−1) = 16664 mod 65537.
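
The back-substitution used in the two examples above can also be carried out in a single forward pass. The following is a minimal sketch, assuming we work in Z; the function names `extended_gcd` and `inverse_mod` are our own and are not part of the notes.

```python
def extended_gcd(a, b):
    """Return (h, u, v) with u*a + v*b == h, where h is an hcf of a and b."""
    # Invariants: old_r == old_u*a + old_v*b and r == u*a + v*b throughout.
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r  # the quotient q_i of the Euclidean algorithm
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    return old_r, old_u, old_v


def inverse_mod(x, n):
    """Return the inverse of x modulo n; x and n are assumed coprime."""
    h, u, _ = extended_gcd(x, n)
    if h != 1:
        raise ValueError("x is not invertible modulo n")
    return u % n
```

Calling `extended_gcd(53714, 30281)` recovers the coefficients of Example 134, and `inverse_mod(2167, 65537)` the inverse found in Example 135.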

4.3 Unique Factorization in PIDs.

Definition 136 An integral domain R is said to be a principal ideal domain (or PID) if
every ideal is principal.

Example 137 We know that every Euclidean domain is a PID. These include (with F a field)

Z, Z[i], F[x], Z[√2]

but not Z[x] nor F [x, y].

Definition 138 An integral domain R is said to be a unique factorization domain (or


UFD) if every non-zero, non-unit element x can be written

x = p1 p2 · · · pr



where p1 , . . . , pr are irreducible elements and further this factorization is unique in the sense
that if
x = p1 p2 · · · pr = q1 q2 . . . qs
are two factorizations into irreducible elements then r = s and (with a possible reordering of
the factors) we have
pi = ui qi
for each i where each ui is a unit.

Remark 139 Though we shall not prove it explicitly, the above definition is equivalent to
requiring that (i) non-zero, non-unit elements can be written as the product of irreducible elements and
(ii) irreducible elements are prime.

Example 140 (i) 4 = (1 + √−3)(1 − √−3) = 2 × 2 are two essentially different factorizations
in Z[√−3].
(ii) 5 = (1 + 2i)(1 − 2i) = (2 − i)(2 + i) are essentially the same factorizations in Z[i].
(iii) 2x^2 − 6x + 4 = (2x − 2)(x − 2) = (x − 1)(2x − 4) are essentially the same factorization
in Q[x].
(iv) x^2 − 1 = (x − 1)(x + 1) = (x − 3)(x − 5) are two essentially different factorizations in
Z_8[x].

Example 141 Factorize 17 − i into irreducibles in Z[i].

Solution. Note that d(17 − i) = 17^2 + 1^2 = 290 = 2 × 5 × 29 so that any factorization into
irreducibles can involve at most three factors. Up to multiplication by units, the only elements
with a norm of 2 are 1 ± i. Now note that
(17 − i)/(1 + i) = (17 − i)(1 − i)/2 = (16 − 18i)/2 = 8 − 9i.
Likewise the only Gaussian integers with a norm of 5 are 1 ± 2i up to units. Note that
(8 − 9i)/(1 + 2i) = (8 − 9i)(1 − 2i)/5 = (−10 − 25i)/5 = −2 − 5i.
Hence
17 − i = (1 + i)(1 + 2i)(−2 − 5i)
is a factorization into irreducibles, each factor having prime norm.
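
The divisions in this solution are easy to check by machine. Below is a small sketch, assuming a Gaussian integer a + bi is stored as the pair (a, b); the helper names `norm`, `mul` and `exact_div` are illustrative only.

```python
def norm(z):
    """The norm d(a + bi) = a^2 + b^2."""
    a, b = z
    return a * a + b * b


def mul(z, w):
    """Product of two Gaussian integers."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)


def exact_div(z, w):
    """Return z / w if w divides z in Z[i], and None otherwise."""
    a, b = z
    c, d = w
    n = norm(w)
    # z / w = z * conj(w) / d(w), with z * conj(w) = (ac + bd) + (bc - ad)i
    re, im = a * c + b * d, b * c - a * d
    if re % n == 0 and im % n == 0:
        return (re // n, im // n)
    return None
```

For instance `exact_div((17, -1), (1, 1))` gives the quotient 8 − 9i as the pair `(8, -9)`, while a failed division such as `exact_div((17, -1), (1, -2))` returns `None`.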

Proposition 142 In a UFD two non-zero elements have a unique hcf.

Proof. Let x, y be non-zero elements of a UFD R. By considering the irreducible elements
which divide x or y we may write
x = u p_1^{α_1} · · · p_r^{α_r} and y = v p_1^{β_1} · · · p_r^{β_r}
where u and v are units, the p_i are pairwise non-associate irreducibles and α_i, β_i ≥ 0. Then it can be shown that
hcf(x, y) = p_1^{γ_1} · · · p_r^{γ_r} where γ_i = min{α_i, β_i}
and that this hcf is unique up to multiplication by a unit. Details are omitted here.
We know that in an ID a prime element is irreducible, but that the converse need not hold
(e.g. 2 in Z[√−3]). We see now that in UFDs and PIDs irreducibles are prime. As we
shall see, PIDs are UFDs and so the latter should not be surprising. But we shall need the
equivalence of primes and irreducibles in PIDs to actually show that PIDs are UFDs.
Proposition 143 (a) In a UFD an irreducible element is prime.
(b) In a PID an irreducible element is prime.
Proof. (a) Say that x is irreducible and that x|yz. We then have that xv = yz for some
v. The elements v, y, z can each be factorized into irreducibles, and by the uniqueness of the
factorizations of xv = yz it must be that x (up to a unit) is present in the factorizations of y
or z (or both). So x|y or x|z and we see that x is prime.
(b) Say that x is irreducible and ⟨x⟩ ⊆ I ⊳ R. There exists y such that ⟨y⟩ = I and so there
exists z such that x = yz. By the irreducibility of x we either have y is a unit and I = ⟨y⟩ = R
or z is a unit and we have I = ⟨y⟩ = ⟨x⟩. Hence ⟨x⟩ is maximal, and so prime, and hence x is
a prime element.
Proposition 144 PIDs are Noetherian. That is given an increasing sequence of ideals
I_1 ⊆ I_2 ⊆ I_3 ⊆ · · ·
in a PID R, there exists N such that I_n = I_N for n ≥ N.
Proof. This is left to Sheet 2, Exercise 5(i).
Theorem 145 A PID is a UFD.
Proof. Existence: Let R be a PID and x be a non-zero, non-unit element of R. If x is irreducible then we are done;
otherwise we may write x = yz where y and z are not units. If y and z are irreducible then we
are done, and otherwise we may continue factorizing the composite elements that arise and, say,
write next y = ab. If the process terminates (through lack of any genuinely composite remaining
factors) then we have written x as a product of irreducibles and we are done. However if the
process does not terminate then we could produce an infinite strictly increasing sequence of
ideals
⟨x⟩ ⊊ ⟨y⟩ ⊊ ⟨a⟩ ⊊ · · · .
However as PIDs are Noetherian then this cannot occur and the above factorization will ter-
minate.
Uniqueness: Say that we have
x = p1 p2 · · · pr = q1 q2 · · · qs
are two factorizations of x into irreducible elements. As p1 is irreducible, and so prime, then
p1 |q1 q2 · · · qs implies that p1 |qi for some i. As qi is irreducible and p1 is not a unit we then have
qi = up1 for some unit u. By renumbering q1 , . . . , qs and incorporating the unit u into one of
the other factors we may assume that p1 = q1 and then by the cancellation law we have that
p2 · · · pr = q2 · · · qs
and may proceed along similar lines again, ultimately showing that r = s and that the two
factorizations are essentially the same.



4.4 Factorization in Z[x] and Q[x]

Factorization in C[x] is, in principle at least, very straightforward. The fundamental theorem
of algebra tells us that any non-zero complex polynomial can be written, uniquely up to the order of the factors, as
c(x − α_1)(x − α_2) · · · (x − α_n)
for some α_1, . . . , α_n, c ∈ C. These linear factors are irreducible and so no further factorization
is possible.
The above also helps us appreciate how factorization works in R[x] when one recalls that a
real polynomial’s non-real roots arise in conjugate pairs. Hence any non-zero real
polynomial can be uniquely factorized as
c(x − a_1)(x − a_2) · · · (x − a_r) q_1(x) q_2(x) · · · q_s(x)
where a_1, . . . , a_r, c are real and q_1, . . . , q_s are irreducible monic quadratics.
We know from our earlier discussion that Q[x] is a UFD but we are yet to prove this for
Z[x] (though this is indeed the case). But we have no general theorem like that above to help us
determine these rings’ irreducible elements. We begin with a naive treatment of the following
example.

Example 146 Show that the cubic x^3 − 2x + 3 is irreducible in Z[x].

Solution. If the cubic did factorize then its factors would include a linear factor ax + b. Note
that a would have to divide 1 and that b would have to divide 3. So up to units, the only linear
factors could be x − 1, x + 1, x − 3, x + 3. However as none of 1, −1, 3, −3 is a root of the cubic
then we can see that the cubic is irreducible over Z.

How might we have approached this problem in Q[x]? Certainly not so naively treating a
finite number of possibilities. Reassuringly Gauss showed these problems to be largely equiv-
alent. We first note the following useful and quite general approach involving the polynomial
rings Zp [x].

Proposition 147 Let f (x) be an integer polynomial whose leading coefficient is not divisible
by the prime p. If f (x) is irreducible in Zp [x] then f(x) is irreducible in Z[x].

Proof. Suppose that f = gh is a proper factorization in Z[x]. Then g and h are both of
positive degree and we also have deg f = deg g + deg h. Writing f̄, ḡ, h̄ for the reductions mod p, in Zp[x] we have f̄ = ḡh̄ and
deg f = deg f̄ = deg ḡ + deg h̄ ≤ deg g + deg h = deg f
as the leading coefficient of f is not divisible by p, as Zp is a field and as deg q̄ ≤ deg q for
any integer polynomial q. Hence it must be the case that deg ḡ = deg g and deg h̄ = deg h are both positive and f̄ is reducible in
Zp[x].

Example 148 Show that the following polynomials are irreducible in Z[x].
f_1(x) = 17x^3 + 7x + 3, f_2(x) = 2x^3 + 3x^2 + x − 2, f_3(x) = 7x^4 + 5x − 3.



Solution. f_1(x) mod 2 equals x^3 + x + 1 which is irreducible over Z_2 as it is a cubic with no roots in Z_2.
Hence f_1(x) is irreducible over Z.
We cannot use Z_2 for f_2(x) as 2 divides the leading coefficient. However mod 3 we obtain
the polynomial
2x^3 + x + 1
which is a cubic without a root in Z_3 and so f_2(x) is irreducible over Z.
f_3(x) mod 2 equals x^4 + x + 1 which is irreducible over Z_2 as it has no roots in Z_2 and
as x^2 + x + 1, the only irreducible quadratic mod 2, is not a factor. So f_3(x) is irreducible over
Z.
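
For small primes and degrees, irreducibility in Zp[x] can simply be tested by trial division. The sketch below is a brute-force check and not an efficient algorithm; polynomials are stored as coefficient lists with the constant term first, and the function names are our own. It needs Python 3.8 or later for the modular inverse `pow(a, -1, p)`.

```python
from itertools import product


def reduce_mod(coeffs, p):
    """Reduce integer coefficients mod p and drop zero leading terms."""
    out = [c % p for c in coeffs]
    while out and out[-1] == 0:
        out.pop()
    return out


def remainder(f, g, p):
    """Remainder of f on division by the non-zero polynomial g in Z_p[x]."""
    f = f[:]
    inv = pow(g[-1], -1, p)  # inverse of the leading coefficient of g
    while len(f) >= len(g):
        factor = (f[-1] * inv) % p
        shift = len(f) - len(g)
        for i, c in enumerate(g):
            f[shift + i] = (f[shift + i] - factor * c) % p
        while f and f[-1] == 0:  # the leading term has been cancelled
            f.pop()
    return f


def is_irreducible_mod_p(coeffs, p):
    """Try every monic divisor of degree 1, ..., deg(f) // 2."""
    f = reduce_mod(coeffs, p)
    n = len(f) - 1
    if n < 1:
        return False
    for d in range(1, n // 2 + 1):
        for lower in product(range(p), repeat=d):
            g = list(lower) + [1]
            if not remainder(f, g, p):
                return False
    return True
```

With this encoding f_1 is `[3, 7, 0, 17]`, f_2 is `[-2, 1, 3, 2]` and f_3 is `[-3, 5, 0, 0, 7]`; the check succeeds for f_1 and f_3 with p = 2 and for f_2 with p = 3. Note that the function only tests irreducibility mod p; the hypothesis of Proposition 147 on the leading coefficient still has to be checked separately.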
Whilst thinking along these lines we will also introduce the following criterion for irreducibil-
ity. The criterion may seem somewhat contrived but can in fact be very useful particularly for
cyclotomic polynomials.

Proposition 149 (Eisenstein’s Criterion) Let f(x) = a_n x^n + a_{n−1} x^{n−1} + · · · + a_1 x + a_0 be
an integer polynomial and p be a prime such that

p does not divide a_n; p divides a_0, a_1, . . . , a_{n−1}; p^2 does not divide a_0.

Then f (x) is irreducible in Z[x].

Proof. Note that mod p we have
f̄(x) = ā_n x^n
where ā_n ≠ 0̄. If f(x) = g(x)h(x) is a genuine reduction into integer polynomials of positive degree, then, as Zp[x]
is a UFD, we have
ḡ(x) = b̄x^r, h̄(x) = c̄x^s
where r + s = n and r, s ≥ 1. So p divides the constant coefficients of g(x) and h(x) and hence
p^2 divides the constant coefficient of f(x), but this is the required contradiction.

Example 150 Show that x^4 + x^3 + x^2 + x + 1 is irreducible over Z.

Solution. A standard trick for such "cyclotomic" polynomials (irreducible factors of x^n − 1)
is to set x = u + 1. The new polynomial in u will be irreducible if and only if the original one
in x is. Now

(u + 1)^4 + (u + 1)^3 + (u + 1)^2 + (u + 1) + 1 = u^4 + 5u^3 + 10u^2 + 10u + 5.

We see that the new polynomial is irreducible — by Eisenstein with p = 5 — and so the original
polynomial also is.
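
The substitution x = u + 1 and the three divisibility conditions are mechanical, so they can be checked as follows. This is only an illustrative sketch: coefficients are listed with the constant term first, the polynomial is assumed to have positive degree, and the names `eisenstein_applies` and `shift_by_one` are our own. `math.comb` requires Python 3.8 or later.

```python
from math import comb


def eisenstein_applies(coeffs, p):
    """Check Eisenstein's conditions for coeffs = [a_0, ..., a_n] and a prime p."""
    *lower, leading = coeffs
    return (
        leading % p != 0                      # p does not divide a_n
        and all(a % p == 0 for a in lower)    # p divides a_0, ..., a_{n-1}
        and lower[0] % (p * p) != 0           # p^2 does not divide a_0
    )


def shift_by_one(coeffs):
    """Coefficients of f(u + 1) given those of f(x)."""
    shifted = [0] * len(coeffs)
    for k, a in enumerate(coeffs):
        # a * (u + 1)^k contributes a * C(k, j) to the coefficient of u^j
        for j in range(k + 1):
            shifted[j] += a * comb(k, j)
    return shifted
```

Here `shift_by_one([1, 1, 1, 1, 1])` returns `[5, 10, 10, 5, 1]`, matching the expansion above, and `eisenstein_applies` accepts this list with p = 5 while rejecting the unshifted polynomial.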
We return now to discussion relating factorization over Q to factorization over Z.

Definition 151 Let f(x) be a polynomial over the integers. The content c(f) is the hcf of the
coefficients of f. A polynomial is said to be primitive if it has content equal to 1.

Note that any integer polynomial can be written uniquely as f(x) = cp(x) where c = c(f)
and p(x) is a primitive polynomial.



Lemma 152 (Gauss Lemma) The product of two primitive polynomials is primitive. Con-
sequently
c(f g) = c(f )c(g)
for any two integer polynomials f and g.

Proof. Say that f and g are primitive polynomials and suppose for a contradiction that fg is
not primitive. Let p be a prime factor of c(fg). We then have that

f (x)g(x) = 0 mod p.

As Zp [x] is an ID then either f (x) = 0 or g(x) = 0 in Zp [x]. But then p divides the coefficients
of f or divides the coefficients of g. This would contradict f and g being primitive.
More generally for integer polynomials f and g we have

f = c(f)p_f, g = c(g)p_g, fg = c(fg)p_{fg},

where p_f, p_g and p_{fg} are primitive. Then
c(fg)p_{fg} = c(f)c(g)p_f p_g.
As p_f p_g is primitive, then by the uniqueness of the above expressions we have c(fg) = c(f)c(g)
and p_{fg} = p_f p_g.
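
The multiplicativity of the content is easy to test numerically. The following sketch, with our own helper names and coefficient lists written constant term first, computes contents and products of integer polynomials.

```python
from functools import reduce
from math import gcd


def content(coeffs):
    """The hcf of the coefficients, taken to be non-negative."""
    return reduce(gcd, coeffs, 0)  # gcd(0, a) == |a|


def poly_mul(f, g):
    """Product of two non-zero integer polynomials."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out
```

For example, with f = 6x^2 + 4x + 10 and g = 9x + 15 we have `content([10, 4, 6]) == 2`, `content([15, 9]) == 3`, and the content of `poly_mul([10, 4, 6], [15, 9])` is 6, as Gauss' lemma predicts.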

Gauss’ lemma plays a key role in relating factorization in Z[x] with factorization in Q[x].
Note

• 2x − 6 is irreducible in Q[x] but not in Z[x] as 2x − 6 = 2(x − 3) and 2 is not a unit in


Z[x].
• But if f(x) in Z[x] reduces into positive degree polynomials in Z[x] then it is reducible in
Q[x] as Z[x] ⊆ Q[x].

• In general polynomials in Q[x] are not in Z[x] but can be multiplied by a non-zero integer
to form a polynomial in Z[x].

Given the example that started this section we would much rather work with irreducibility
in Z[x] rather than in Q[x] if at all possible. The following result addresses precisely this point.

Theorem 153 Let f be a primitive non-constant polynomial in Z[x]. Then f (x) is irreducible
in Z[x] if and only if f (x) is irreducible in Q[x].

Proof. If f is irreducible over Q then it is irreducible over Z, given the comments above.
Say now that f (x) is genuinely reducible in Q[x], say f (x) = q1 (x)q2 (x) for polynomials in
Q[x]. We then have g1 = d1 q1 and g2 = d2 q2 are integer polynomials where di is the lcm of
the denominators of the coefficients in qi . Finally we have gi = c(gi )pi where pi are primitive
integer polynomials. So
d1 d2 f = g1 g2 = c(g1 )c(g2 )p1 p2 .
As f and p1 p2 are both primitive then we have, by uniqueness, that f = p1 p2 and so f is
reducible in Z[x].



Proposition 154 Z[x] is a UFD.

Proof. Any f (x) in Z[x] can be written uniquely as f (x) = c(f )p(x) where p is primitive. As
Z is a UFD then c(f) can be uniquely written as a product of prime integers. As Q[x] is a
UFD then p(x) can be written as a product of irreducible rational polynomials. As seen in the
proof of Theorem 153 any reduction of p(x) into rational polynomials can be replaced with a
reduction of p(x) into integer polynomials, again in a unique fashion (up to the units of Z).

In fact the above theorems generalize quite naturally to demonstrate the following result.
As UFDs have well defined hcfs then we can introduce the notion of content to polynomials in
R[x]. Replacing Z with R and Q with the field of fractions of R then we can rewrite the above
arguments to show:

Proposition 155 If R is a UFD then so is R[x].



5. MODULES

Modules are essentially the equivalent of vector spaces when the scalars come from a ring rather
than a field. Throughout we will consider only modules over Euclidean domains. (The results
that follow can all be proved over PIDs, though some of the proofs can be somewhat more
laborious and in any case the important examples we will meet are all over EDs.)

Definition 156 Let R be an ED. An R-module M is an abelian group together with scalar
multiplication R × M → M, denoted as (r, m) → rm, such that

• r(m1 + m2 ) = rm1 + rm2 for all r ∈ R and m1 , m2 ∈ M.

• (r1 + r2 )m = r1 m + r2 m for all r1 , r2 ∈ R and m ∈ M.

• (r1 r2 )m = r1 (r2 m) for all r1 , r2 ∈ R and m ∈ M.

• 1_R m = m for all m ∈ M.

Example 157 Modules over Z. A Z-module is an abelian group and vice versa an abelian
group is a Z-module. Scalar multiplication by integers is entirely determined by the abelian
group structure as for any positive integer n we have

n.m = (1 + · · · + 1).m = (1.m) + · · · + (1.m) = m + · · · + m

and (−n).m = (−1).(n.m) = −(n.m).

Example 158 Modules over fields. A module over a field is a vector space.

Example 159 Modules over polynomial rings. Say that M is a module over F [x] where
F is a field. Then M is a vector space over F when considered as the constant polynomials in
F [x].
Multiplication by x, that is T (m) = xm has the effect of an F -linear map T : M → M.
Note that the entire F[x]-module structure on M is entirely determined by this map T as by
the module axioms we have
p(x).m = p(T )m
for any polynomial p(x) in F [x].
Conversely given a vector space M over F and a linear map T : M → M we can define M
as a F [x]-module by defining scalar multiplication as

x.m = T (m).

Example 160 Given a field F and a square matrix A over F then F [A] is an F [x]-module
with
x.p(A) = (xp)(A).
N.B. In general though F [A] is a different module to the F [x]-module defined by A.
For example, consider the real matrix
\[ A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}. \]
The R[x]-module defined by A is a 3-dimensional real vector space. By comparison
\[ \mathbb{R}[A] \cong \frac{\mathbb{R}[x]}{\langle (x-1)^2 \rangle}, \]
because m_A(x) = (x − 1)^2, and is a two-dimensional real vector space. R[A] is spanned by I, A.

Example 161 Given two R-modules M and N then we can form the direct sum M ⊕ N as
an R-module in the natural way by component-wise addition and scalar multiplication.
Generally for a positive integer n we can define the R-module R^n = R ⊕ · · · ⊕ R (n times).
These are the free modules over R.

Example 162 If R is an ED and I is an ideal of R then R/I is naturally an R-module. For
example with R = Q[x] and I = ⟨x^2 + 1⟩ then
\[ \frac{\mathbb{Q}[x]}{\langle x^2 + 1 \rangle} \]
also has the structure of a two-dimensional vector space over Q with basis 1, x. Scalar multiplication
is defined by
x.1 = x, x.x = x^2 = −1
so that multiplication by x is represented by the matrix
\[ B = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \]
with respect to the above basis. Hopefully unsurprisingly
\[ \frac{\mathbb{Q}[x]}{\langle x^2 + 1 \rangle} \quad\text{and}\quad \mathbb{Q}[B] \]
are isomorphic as Q[x]-modules, with such an isomorphism being a + bx ↦ a + bB.

Here we make rigorous the idea of being isomorphic as R-modules and also introduce the
idea of a module homomorphism.

Definition 163 Let M and N be R-modules. A map φ : M → N is a module homomor-
phism if
φ(m1 + m2 ) = φ(m1 ) + φ(m2 ), φ(rm1 ) = rφ(m1 )
where r ∈ R, m1 , m2 ∈ M. We say that φ is a module isomorphism if φ is a bijection.

Example 164 A module homomorphism between Z-modules is a group homomorphism.

Example 165 If R is a field then the module homomorphisms are precisely the linear maps.

Example 166 Let
\[ C = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}, \qquad D = \begin{pmatrix} 1 & 0 \\ 0 & 3 \end{pmatrix}. \]
Find all module homomorphisms from R[C] to R[D].

Solution. Every element of R[C] can be written as αI + βC and every element of R[D] can be written
αI + βD. So if we take initial basis {I, C} and final basis {I, D} then a module homomorphism
φ must be represented by a 2 × 2 real matrix. Say
\[ \varphi = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \]
so that φ(I) = aI + cD and φ(C) = bI + dD. Our further requirements on φ are that

bI + dD = φ(C) = φ(x.I) = x.φ(I) = D(aI + cD) = −3cI + (a + 4c)D;

(−2a + 3b)I + (−2c + 3d)D = φ(3C − 2I) = φ(C^2)
= φ(x.C) = x.φ(C) = D(bI + dD) = −3dI + (b + 4d)D,

noting that C^2 = 3C − 2I and D^2 = 4D − 3I. Comparing coefficients we have that

b = −3c, d = a + 4c, −2a + 3b = −3d, −2c + 3d = b + 4d.

Solving these we see there is a one-parameter family of solutions with

a = b = −3c and d = c.

So
\[ \varphi = c \begin{pmatrix} -3 & -3 \\ 1 & 1 \end{pmatrix}. \]
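We might have done the above calculation in a slightly more systematic way. Under the
identification (α, β)^T ↔ αI + βC of R^2 with R[C] we note
\[ x.\begin{pmatrix} \alpha \\ \beta \end{pmatrix} \leftrightarrow x.(\alpha I + \beta C) = \alpha C + \beta C^2 = -2\beta I + (\alpha + 3\beta) C \leftrightarrow \begin{pmatrix} 0 & -2 \\ 1 & 3 \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \end{pmatrix} \]
so that multiplication by x in R[C] is given by the matrix
\[ \begin{pmatrix} 0 & -2 \\ 1 & 3 \end{pmatrix} \]
with respect to our basis, and similarly multiplication by x in R[D] is given by the matrix
\[ \begin{pmatrix} 0 & -3 \\ 1 & 4 \end{pmatrix} \]
with respect to {I, D}. So the matrices for φ that we found are precisely those that satisfy
\[ \begin{pmatrix} 0 & -3 \\ 1 & 4 \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} 0 & -2 \\ 1 & 3 \end{pmatrix} \]
as a consequence of requiring x.φ(v) = φ(x.v) for all v in R[C]. (We will see in due course that
these two matrices are the companion matrices for m_C(x) and m_D(x).)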
Why are there limited module homomorphisms here? And how might we better understand
them? The second question will have a clearer answer once we better understand the structure
of modules. However, to answer the first question, note that scalar multiplication by x^2 − 3x + 2
is the same as multiplication by zero in R[C] as C^2 − 3C + 2I = 0. However this is not the case
in R[D] as D^2 − 3D + 2I ≠ 0.
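
The requirement x.φ(v) = φ(x.v) from this example can be checked by a direct matrix computation. The sketch below hard-codes the two matrices for multiplication by x found above; the names `X_C`, `X_D`, `matmul` and `is_module_hom` are our own.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


X_C = [[0, -2], [1, 3]]  # multiplication by x on R[C] with respect to {I, C}
X_D = [[0, -3], [1, 4]]  # multiplication by x on R[D] with respect to {I, D}


def is_module_hom(phi):
    """A linear map phi is a module homomorphism iff X_D phi == phi X_C."""
    return matmul(X_D, phi) == matmul(phi, X_C)
```

The matrix `[[-3, -3], [1, 1]]` and its multiples pass this test, whereas for instance the identity matrix does not.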

Definition 167 A non-empty subset N of a module M is a submodule if N is closed under


addition and scalar multiplication.

Example 168 When we consider R as an R-module the submodules of R are the ideals.

Example 169 The submodules of a Z-module are the subgroups.

Example 170 Let V be a vector space over a field F and T : V → V be a linear map defining
V as an F[x]-module. Then the submodules of V are the T-invariant subspaces, i.e. those
subspaces U of V such that T (U ) ⊆ U.

Example 171 Given a module homomorphism φ : M → N then

ker φ = {m ∈ M : φ(m) = 0} is a submodule of M ;


Im φ = {φ(m) : m ∈ M } is a submodule of N.

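Example 172 The rational matrix
\[ F = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \]
defines a Q[x]-module structure on Q^2. The submodules of Q^2 are the F-invariant subspaces
and so include the eigenspaces

E_2 = ⟨(1, 1)^T⟩, E_0 = ⟨(1, −1)^T⟩

and we see that, as a direct sum of Q[x]-modules,
\[ \mathbb{Q}^2 = E_2 \oplus E_0, \qquad \begin{pmatrix} a \\ b \end{pmatrix} = \frac{a+b}{2} \begin{pmatrix} 1 \\ 1 \end{pmatrix} + \frac{a-b}{2} \begin{pmatrix} 1 \\ -1 \end{pmatrix}. \]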

Definition 173 (a) We say that elements x1 , x2 , . . . , xn of an R-module M are linearly in-
dependent if the only solution of the equation

r1 x1 + r2 x2 + · · · + rn xn = 0, ri ∈ R

is r1 = r2 = · · · = rn = 0.
(b) We say that elements x1 , x2 , . . . , xn of an R-module M generate or span M if every
element x of M can be written

x = r1 x1 + r2 x2 + · · · + rn xn

for some r1 , r2 , . . . , rn in R.
(c) We say that elements x1 , x2 , . . . , xn of an R-module M form a basis for M if the elements
are linearly independent and span M.
(d) We say that a module M is finitely generated if there is a finite subset of M that
generates M.
(e) A module with a basis is called a free module. N.B. Most modules don’t have
bases.

Example 174 The elements e_1 = (1, 0, . . . , 0), e_2 = (0, 1, 0, . . . , 0), . . . , e_n = (0, . . . , 0, 1) are a
basis for R^n.

Example 175 The set {2} in Z is linearly independent but not a basis.

Example 176 Z_2 is not free as a Z-module as 2.1̄ = 0̄ but 2 ≠ 0. But Z_2 is free as a Z_2-module:
of course it remains the case that 2̄.1̄ = 0̄ but now the scalar 2̄ is the zero scalar.

Proposition 177 Any basis of R^n contains n elements; n is known as the rank of R^n. Note
an R-module with a basis containing n elements is isomorphic as an R-module to R^n.

Proof. Say that b_1, . . . , b_m is a basis for R^n. Then we may write each b_i as

b_i = b_{1i} e_1 + b_{2i} e_2 + · · · + b_{ni} e_n

so that the n × m matrix B = (b_{ij}) is the change of basis matrix from the e_i to the b_i. However
as the b_i are also a basis, we can write the e_i in terms of the b_i and hence there is an m × n
change of basis matrix A. We then have that

BA = I_n and AB = I_m.

If m ≠ n then we may assume without loss of generality that m > n. We can introduce the
m × m matrices
\[ A' = (A \mid 0_{m \times (m-n)}), \qquad B' = \begin{pmatrix} B \\ 0_{(m-n) \times m} \end{pmatrix} \]
which still satisfy A′B′ = I_m. However we would then have

1 = det I_m = det A′ det B′ = 0 × 0 = 0,

a contradiction. (Note that the veracity of the determinant product rule relies only on the
commutativity of the matrices’ entries.)
If a module M has a basis b1 , . . . , bn then every element x of M can be uniquely written as

x = r1 b1 + r2 b2 + · · · + rn bn

and the identification x ↔ (r_1, r_2, . . . , r_n) is a module isomorphism with R^n.

Proposition 178 A submodule M of the free module R^n is finitely generated.

Proof. We shall prove this by induction on n. The submodules of R are the ideals, which are
principal, and so generated by a single element, proving the case n = 1. Say now that the result
holds for R^{n−1} and let W be a submodule of R^n. Then

W_0 = {w = (w_1, . . . , w_n) ∈ W : w_n = 0}

is isomorphic to a submodule of R^{n−1} and so is finitely generated. If W = W_0 then we are
done. Otherwise there is x ∈ W with x_n ≠ 0 and with norm d(x_n) being minimal. Say that
y_1, . . . , y_k generate W_0. Given any w ∈ W, write w_n = r x_n + s where s = 0 or d(s) < d(x_n);
as w − rx ∈ W has last entry s, the minimality of d(x_n) forces s = 0. So w_n = r x_n
and hence w − rx ∈ W_0. It follows that y_1, . . . , y_k, x generate W and so W is finitely generated
and the result follows by induction.

Corollary 179 A submodule M of R^n is free and rank(M) ≤ n.

Proof. From the proof of the proposition, we can also see how one might produce a basis for
W with n or fewer elements. We can create a nested sequence of submodules

W_0^{(1)} = W_0, W_0^{(k)} = (W_0^{(k−1)})_0, and {0} = W_0^{(n)} ⊆ W_0^{(n−1)} ⊆ · · · ⊆ W_0^{(1)} ⊆ W,

so that W_0^{(k)} consists of those elements of W whose last k entries are 0. From the previous
proof we see that at each stage we need add at most one generator to the generators of W_0^{(k)} to
produce a set of generators for W_0^{(k−1)}, and won’t need to add any generator if W_0^{(k)} = W_0^{(k−1)}.
Further by the nature of how these generators are constructed they are linearly independent.
Hence we can produce a basis for W containing at most n elements.

Remark 180 Do make note of the important differences between the theory of vector spaces
and the more general theory of modules.

• Most modules don’t have bases — only the free modules.

• A linearly independent set cannot always be extended to a basis (even in a free module).

• A spanning set need not contain a basis (even in a free module).

• A proper submodule of a free module can have the same rank.

Definition 181 A module M is said to be cyclic if it is generated by a single element. For


an F [x]-module such a generator is called a cyclic vector.

Example 182 The cyclic Z-modules are the cyclic groups.

Example 183 Given a square matrix A over a field F then F [A] is a cyclic module as it is
generated by I.

Example 184 The R[x]-module defined on R^3 by
\[ A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix} \]
is not cyclic. Note that m_A(x) = (x − 1)^2 and so for any v ∈ R^3 we see v generates ⟨v, Av⟩ ≠ R^3
as A^2 v = 2Av − v and so there is no cyclic vector.
The x-axis M1 and yz-plane M2 however are submodules (they are A-invariant subspaces)
and are cyclic as they are respectively generated by i and k. So we can decompose the module

R3 = M1 ⊕ M2

as the direct sum of cyclic submodules.

Proposition 185 Let V be an n-dimensional vector space over F with an F[x]-module structure
defined by T : V → V. Say that V is cyclic as an F[x]-module, and that v ∈ V is a cyclic
vector. Then the vectors
v, Tv, T^2 v, . . . , T^{n−1} v
form a basis for V as a vector space. With respect to this basis T has matrix
\[ C(f) = \begin{pmatrix} 0 & 0 & \cdots & 0 & -a_0 \\ 1 & 0 & \cdots & 0 & -a_1 \\ 0 & 1 & \ddots & \vdots & -a_2 \\ \vdots & \ddots & \ddots & 0 & \vdots \\ 0 & \cdots & 0 & 1 & -a_{n-1} \end{pmatrix} \]
where the minimal and characteristic polynomials of T both equal

f(x) = x^n + a_{n−1} x^{n−1} + · · · + a_1 x + a_0.

The matrix C(f) is called the companion matrix of f(x).

Proof. As v generates V as an F[x]-module then the set

v, Tv, T^2 v, . . .

spans V as a vector space. If k is the first occasion that

v, Tv, T^2 v, . . . , T^k v

are linearly dependent as vectors, then we have that v, Tv, T^2 v, . . . , T^{k−1} v are independent and
spanning and so k = dim V = n.
Hence v, Tv, T^2 v, . . . , T^{n−1} v is a basis for V as a vector space. That T(T^k v) = T^{k+1} v for
0 ≤ k ≤ n − 2 accounts for the first n − 1 columns of C(f). Then for some a_0, a_1, . . . , a_{n−1} ∈ F
we have
T^n v = −a_0 v − a_1 Tv − · · · − a_{n−1} T^{n−1} v.
If we define f(x) as above then we have f(T)v = 0. But as polynomials in T commute we
also have f(T)T^k v = 0 for all k and hence f(T) = 0. It must be the case that deg m_T ≥ n as
v, Tv, T^2 v, . . . , T^{n−1} v are independent and also the case that m_T | f. As f is monic then m_T = f.
As c_T has degree n and m_T | c_T then we also have m_T = c_T.
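
As a quick sanity check on the shape of C(f), the sketch below builds the companion matrix from the coefficients a_0, . . . , a_{n−1} and evaluates f(C(f)) by Horner's rule. The function names are our own and the entries are assumed to be integers for simplicity.

```python
def companion(a):
    """Companion matrix of x^n + a[n-1] x^(n-1) + ... + a[1] x + a[0]."""
    n = len(a)
    C = [[0] * n for _ in range(n)]
    for i in range(1, n):
        C[i][i - 1] = 1        # T sends each basis vector to the next one
    for i in range(n):
        C[i][n - 1] = -a[i]    # the last column records T^n v
    return C


def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def evaluate(a, C):
    """Evaluate the monic polynomial with lower coefficients a at the matrix C."""
    n = len(C)
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    result = identity          # the leading coefficient is 1
    for coeff in reversed(a):
        result = matmul(result, C)
        result = [[result[i][j] + coeff * identity[i][j] for j in range(n)] for i in range(n)]
    return result
```

For f(x) = x^2 − 3x + 2, `companion([2, -3])` is `[[0, -2], [1, 3]]`, the matrix met in Example 166, and `evaluate([2, -3], companion([2, -3]))` is the zero matrix.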

Definition 186 Given an R-module M and submodule N then we can form the quotient
R-module M/N by
M/N = {m + N : m ∈ M }
with
(m_1 + N) + (m_2 + N) := (m_1 + m_2) + N, r.(m + N) := r.m + N.

Example 187 Whilst somewhat cumbersome in notation, expressing a module as a quotient
module can make transparent the module structure. For example
\[ \frac{\mathbb{R}[x]}{\langle x + 1 \rangle} \]
is isomorphic to the R[x]-module defined on R by −I_1 = (−1). All this information is captured
in writing the module as above, namely that scalar multiplication by x acts in the same way
as multiplication by −1.

Example 188 Note that for any polynomial p(x) in F[x] that
\[ \frac{F[x]}{\langle p(x) \rangle} \]
is a cyclic F[x]-module as it is generated by 1.

Example 189 The Chinese Remainder Theorem still holds for modules and so we have for
example that
\[ \frac{\mathbb{R}[x]}{\langle x^2 - 1 \rangle} \cong \frac{\mathbb{R}[x]}{\langle x - 1 \rangle} \oplus \frac{\mathbb{R}[x]}{\langle x + 1 \rangle} \]
with a module isomorphism being given by

a + bx + ⟨x^2 − 1⟩ ↦ (a + bx + ⟨x − 1⟩, a + bx + ⟨x + 1⟩) = (a + b + ⟨x − 1⟩, a − b + ⟨x + 1⟩).

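Theorem 190 (First Isomorphism Theorem) Let φ : M → N be a module homomorphism.
Then the map
\[ \tilde{\varphi} : \frac{M}{\ker \varphi} \to \operatorname{Im} \varphi \quad\text{where}\quad \tilde{\varphi}(m + \ker \varphi) = \varphi(m) \]
is a module isomorphism.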

Proof. This proof is almost identical to the first isomorphism theorem for rings, with an extra
line to check the R-module structures align.

Example 191 We return here to the modules R[C] and R[D] introduced earlier where
\[ C = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}, \qquad D = \begin{pmatrix} 1 & 0 \\ 0 & 3 \end{pmatrix}. \]
We can decompose these modules in such a way that should make clearer the nature of the
homomorphisms we found between them. We shall show that
\[ \mathbb{R}[C] \cong \frac{\mathbb{R}[x]}{\langle x^2 - 3x + 2 \rangle} \cong \frac{\mathbb{R}[x]}{\langle x - 1 \rangle} \oplus \frac{\mathbb{R}[x]}{\langle x - 2 \rangle}. \]
These isomorphisms are given by

p(C) ↦ p(x) + ⟨x^2 − 3x + 2⟩ and by p(x) + ⟨x^2 − 3x + 2⟩ ↦ (p(x) + ⟨x − 1⟩, p(x) + ⟨x − 2⟩).

In a similar fashion
\[ \mathbb{R}[D] \cong \frac{\mathbb{R}[x]}{\langle x - 1 \rangle} \oplus \frac{\mathbb{R}[x]}{\langle x - 3 \rangle}. \]
Whilst these may look somewhat cumbersome decompositions they contain explicitly and transparently
the R[x]-module structure. Thus finding the module homomorphisms between these two
modules is a lot more straightforward and the answer a lot clearer. Note for example that
\[ x.(\alpha, \beta) = (\alpha, 2\beta) \quad\text{in}\quad \frac{\mathbb{R}[x]}{\langle x - 1 \rangle} \oplus \frac{\mathbb{R}[x]}{\langle x - 2 \rangle}. \]
A module homomorphism
\[ \varphi : \frac{\mathbb{R}[x]}{\langle x - 1 \rangle} \oplus \frac{\mathbb{R}[x]}{\langle x - 2 \rangle} \to \frac{\mathbb{R}[x]}{\langle x - 1 \rangle} \oplus \frac{\mathbb{R}[x]}{\langle x - 3 \rangle} \]
is, in particular, an R-linear map and so is determined by φ(1, 0) and φ(0, 1). Say that φ(1, 0) =
(a, b) and φ(0, 1) = (c, d); then, as φ is a module homomorphism, we have

(a, 3b) = x.(a, b) = x.φ(1, 0) = φ(x.(1, 0)) = φ(1, 0) = (a, b);

(c, 3d) = x.(c, d) = x.φ(0, 1) = φ(x.(0, 1)) = φ(0, 2) = (2c, 2d).

So we see that b = c = d = 0 and that the only homomorphisms are of the form

(α, β) ↦ (aα, 0).

These module homomorphisms correspond to a scaling in the R[x]/⟨x − 1⟩ factor and a collapsing
of the R[x]/⟨x − 2⟩ factor, hopefully not surprisingly as scalar multiplication by x in R[D] never
corresponds to multiplication by 2 as it does in the second summand of R[C].

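Example 192 We return to the R[x]-module R^3 defined by the matrix
\[ A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}. \]
We have already seen that this module is not cyclic but can be decomposed as the direct sum of the
x-axis M_1 and the yz-plane M_2 which are cyclic submodules. Somewhat more transparently
we can represent the module structure by writing
\[ M_1 \cong \frac{\mathbb{R}[x]}{\langle x - 1 \rangle} \cong \mathbb{R}[I_1] \quad\text{and}\quad M_2 \cong \frac{\mathbb{R}[x]}{\langle (x - 1)^2 \rangle} \cong \mathbb{R}\left[ \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \right]. \]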

6. SMITH NORMAL FORM. PRESENTATIONS.

Throughout let R denote a Euclidean domain.

Definition 193 (a) A square matrix P with entries in R is said to be invertible if there is a
matrix Q with entries in R such that PQ = I = QP.
(b) Two m×n matrices A and B are said to be equivalent if there exist an invertible m×m
matrix P and an invertible n × n matrix Q such that P AQ = B. A simple check shows that
equivalence is indeed an equivalence relation.

Example 194 The matrix diag(1, 2) is not invertible over Z but would be, say, over Q[x].

Theorem 195 (Smith Normal Form) Let A be an m × n matrix with entries in R. Then
there exist elements d_1, d_2, . . . , d_r, known as invariant factors, and unique up to multiplication
by units, such that A is equivalent to
\[ \begin{pmatrix} \operatorname{diag}(d_1, d_2, \ldots, d_r) & 0_{r \times (n-r)} \\ 0_{(m-r) \times r} & 0_{(m-r) \times (n-r)} \end{pmatrix} \]
where
d_1 | d_2 | d_3 | · · · | d_r.

Proof. Existence: Our first aim is to employ EROs and ECOs to show that A is equivalent to
\[ \begin{pmatrix} d_1 & 0_{1 \times (n-1)} \\ 0_{(m-1) \times 1} & M \end{pmatrix} \]
where d_1 divides every entry of M.


Step One: By permuting rows and columns we can move an entry d of least norm to the
first-row-first-column entry. We now aim to clear out the first row using ECOs: note every
entry of the first row can be written in the form a1j = qj d + rj and if a remainder rj of smaller
norm is produced we permute it to the top left entry and begin again. This process must
terminate as the norm we are continually decreasing takes a positive integral value.
Step Two: We then proceed similarly seeking to clear out the first column using EROs.
Should we produce an entry of smaller norm then we move that row to the first row and return
to Step One.
Step Three: Eventually the first row and first column have been cleared and it follows that
the matrix A can be put in the above form but it may not be the case that d divides every
entry of M. If all of the entries of M are divisible by d then we are done. If not then by EROs
and ECOs we can again produce an element of smaller norm than d and we return to Step One.
The process must ultimately terminate as at each stage we are producing entries of strictly
smaller norm. Thus we have demonstrated that A is equivalent to a matrix of the above form
and by repeating this process on M and so on we have shown the existence of the Smith normal
form.
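The three steps of the existence proof can be sketched in code for the special case R = Z, with the absolute value as norm. This is a minimal illustration and not part of the original notes; the function name smith_normal_form is chosen here for convenience.

```python
def smith_normal_form(rows):
    """Smith normal form of an integer matrix, following Steps One to Three."""
    a = [list(r) for r in rows]
    m, n = len(a), len(a[0])
    for t in range(min(m, n)):
        while True:
            # Step One: move a non-zero entry of least norm to position (t, t).
            block = [(abs(a[i][j]), i, j)
                     for i in range(t, m) for j in range(t, n) if a[i][j]]
            if not block:
                break  # the remaining block is zero
            _, i, j = min(block)
            a[t], a[i] = a[i], a[t]
            for row in a:
                row[t], row[j] = row[j], row[t]
            d = a[t][t]
            # Clear row t with ECOs, leaving remainders of smaller norm.
            for j in range(t + 1, n):
                q = a[t][j] // d
                for row in a:
                    row[j] -= q * row[t]
            # Step Two: clear column t with EROs.
            for i in range(t + 1, m):
                q = a[i][t] // d
                a[i] = [x - q * y for x, y in zip(a[i], a[t])]
            if any(a[t][t + 1:]) or any(a[i][t] for i in range(t + 1, m)):
                continue  # a remainder of smaller norm appeared: begin again
            # Step Three: d must divide every entry of the remaining block.
            bad = next((i for i in range(t + 1, m)
                        if any(x % d for x in a[i][t + 1:])), None)
            if bad is None:
                break
            a[t] = [x + y for x, y in zip(a[t], a[bad])]  # forces a smaller norm
        a[t][t] = abs(a[t][t])  # normalise by a unit
    return a


example = [[12, 6, 4, 8], [3, 9, 6, 12], [2, 16, 14, 28], [20, 10, 10, 20]]
snf = smith_normal_form(example)
```

Each return to Step One happens only after the norm of the pivot has strictly decreased, which is the termination argument used in the proof.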



Uniqueness: To prove uniqueness, we introduce the notion of determinantal divisors. The
ith determinantal divisor Di (A) is the highest common factor of the determinants of all i × i
submatrices of A. Note that Di (A) is only defined up to multiplication by units and is invariant
under EROs and ECOs. When the matrix is in Smith normal form, as above, then we see that
Di (A) = d1 d2 · · · di for i ≤ r and Di (A) = 0 for i > r.
Hence dk = Dk (A)/Dk−1 (A) is invariant under EROs and ECOs and so the invariant factors
are unique.
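As an illustration of the uniqueness argument over Z, the determinantal divisors can be computed directly as gcds of minors. The sketch below is not from the original notes; the helper names det and determinantal_divisor are assumptions, and the Laplace expansion is only sensible for small matrices.

```python
from itertools import combinations
from math import gcd


def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([r[:j] + r[j + 1:] for r in m[1:]])
               for j in range(len(m)))


def determinantal_divisor(a, i):
    """Highest common factor of all i x i minors of the integer matrix a."""
    g = 0
    for rows in combinations(range(len(a)), i):
        for cols in combinations(range(len(a[0])), i):
            g = gcd(g, det([[a[r][c] for c in cols] for r in rows]))
    return g


A = [[12, 6, 4, 8], [3, 9, 6, 12], [2, 16, 14, 28], [20, 10, 10, 20]]
D = [determinantal_divisor(A, i) for i in range(1, 5)]
# Invariant factors d_k = D_k / D_{k-1} for as long as D_k is non-zero (D_0 = 1).
invariants = [D[0]] + [D[k] // D[k - 1] for k in range(1, 4) if D[k]]
```

For this matrix the quotients of successive determinantal divisors give the invariant factors 1, 10, 30.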
Corollary 196 (Submodules of Free Modules) Let M be a submodule of Rn . Then there
exist elements d1 , d2 , . . . , dr , where r is the rank of M, and a basis f1 , . . . , fn for Rn such that
d1 f1 , d2 f2 , . . . , dr fr is a basis for M and d1 |d2 |d3 | · · · |dr .
To see why this corollary follows we will apply the method to the following matrix.
Example 197 Put the following matrix in Smith normal form
\[
\begin{pmatrix}
12 & 6 & 4 & 8 \\
3 & 9 & 6 & 12 \\
2 & 16 & 14 & 28 \\
20 & 10 & 10 & 20
\end{pmatrix}.
\]
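Solution. We will do more than simply put the matrix into Smith normal form: we will also keep
track of the ECOs that we are using, by recording the current basis as column headings. We find
\[
\begin{array}{cccc}
e_1 & e_2 & e_3 & e_4 \\ \hline
12 & 6 & 4 & 8 \\
3 & 9 & 6 & 12 \\
2 & 16 & 14 & 28 \\
20 & 10 & 10 & 20
\end{array}
\sim
\begin{array}{cccc}
e_1 & e_2 & e_3 + 2e_4 & e_4 \\ \hline
12 & 6 & 4 & 0 \\
3 & 9 & 6 & 0 \\
2 & 16 & 14 & 0 \\
20 & 10 & 10 & 0
\end{array}
\sim
\begin{array}{cccc}
e_1 & e_2 & e_3 + 2e_4 & e_4 \\ \hline
12 & 6 & 4 & 0 \\
1 & -7 & -8 & 0 \\
2 & 16 & 14 & 0 \\
20 & 10 & 10 & 0
\end{array}
\]
\[
\sim
\begin{array}{cccc}
e_1 & e_2 & e_3 + 2e_4 & e_4 \\ \hline
0 & 90 & 100 & 0 \\
1 & -7 & -8 & 0 \\
0 & 30 & 30 & 0 \\
0 & 150 & 170 & 0
\end{array}
\sim
\begin{array}{cccc}
e_1 & e_2 & e_3 + 2e_4 & e_4 \\ \hline
0 & 0 & 10 & 0 \\
1 & -7 & -8 & 0 \\
0 & 30 & 30 & 0 \\
0 & 0 & 20 & 0
\end{array}
\]
\[
\sim
\begin{array}{cccc}
e_1 & e_2 & e_3 + 2e_4 & e_4 \\ \hline
0 & 0 & 10 & 0 \\
1 & -7 & -8 & 0 \\
0 & 30 & 0 & 0 \\
0 & 0 & 0 & 0
\end{array}
\sim
\begin{array}{cccc}
e_1 - 7e_2 - 8e_3 - 16e_4 & e_3 + 2e_4 & e_2 & e_4 \\ \hline
1 & 0 & 0 & 0 \\
0 & 10 & 0 & 0 \\
0 & 0 & 30 & 0 \\
0 & 0 & 0 & 0
\end{array}
\]
Hence the Smith normal form is
\[
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 10 & 0 & 0 \\
0 & 0 & 30 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}.
\]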

But we are also able to answer the following.



Example 198 Find a basis f1 , f2 , f3 , f4 for Z^4 such that d1 f1 , . . . , dr fr is a basis for

M = ⟨(12, 6, 4, 8), (3, 9, 6, 12), (2, 16, 14, 28), (20, 10, 10, 20)⟩

where r = rank(M ) and d1 |d2 | · · · |dr .

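Solution. In our previous calculation we saw that
\[
\begin{array}{cccc}
e_1 & e_2 & e_3 & e_4 \\ \hline
12 & 6 & 4 & 8 \\
3 & 9 & 6 & 12 \\
2 & 16 & 14 & 28 \\
20 & 10 & 10 & 20
\end{array}
\sim
\begin{array}{cccc}
e_1 - 7e_2 - 8e_3 - 16e_4 & e_3 + 2e_4 & e_2 & e_4 \\ \hline
1 & 0 & 0 & 0 \\
0 & 10 & 0 & 0 \\
0 & 0 & 30 & 0 \\
0 & 0 & 0 & 0
\end{array}.
\]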

The column headings are still a basis for Z^4 as we began with a basis e1 , e2 , e3 , e4 and each ECO
is invertible. The non-zero rows (with co-ordinates understood with respect to this new basis) still
span M and are now clearly independent. So we have shown that

f1 = e1 − 7e2 − 8e3 − 16e4 , f2 = e3 + 2e4 , f3 = e2 , f4 = e4

is a basis for Z^4 and f1 , 10f2 , 30f3 is a basis for M.
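The conclusion can be checked numerically. The sketch below is illustrative rather than part of the original solution. It rewrites each generator of M in the basis f1, f2, f3, f4 using e1 = f1 + 8f2 + 7f3, e2 = f3, e3 = f2 − 2f4, e4 = f4, and confirms that f1, 10f2, 30f3 are integer combinations of the generators. The combination coefficients were read off from the row operations above.

```python
rows = [(12, 6, 4, 8), (3, 9, 6, 12), (2, 16, 14, 28), (20, 10, 10, 20)]


def coords_in_f(v):
    """Coordinates of v = a1 e1 + ... + a4 e4 with respect to f1, f2, f3, f4."""
    a1, a2, a3, a4 = v
    return (a1, 8 * a1 + a3, 7 * a1 + a2, a4 - 2 * a3)


def combo(coeffs):
    """Integer combination of the four generators of M."""
    return tuple(sum(c * r[j] for c, r in zip(coeffs, rows)) for j in range(4))


coords = [coords_in_f(r) for r in rows]
# Every generator lies in the span of f1, 10 f2, 30 f3:
contained = all(c[1] % 10 == 0 and c[2] % 30 == 0 and c[3] == 0 for c in coords)

f1 = combo((0, 1, -1, 0))          # row2 - row3
ten_f2 = combo((1, -6, 3, 0))      # row1 - 6 row2 + 3 row3
thirty_f3 = combo((-3, 16, -6, 0))
```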

Before we move on to the Structure Theorem we will need to introduce the ideas of gener-
ators, relations and presentations. We begin with some motivational examples.

Example 199 Consider the following modules.
\[
M_1 = \mathbb{Z}_2 \oplus \mathbb{Z}_4 \oplus \mathbb{Z}_5, \qquad
M_2 = \mathbb{R}[x] \oplus \frac{\mathbb{R}[x]}{\langle x-1 \rangle} \oplus \frac{\mathbb{R}[x]}{\langle x^2+1 \rangle}.
\]
The Z-module M1 is generated by

a = (1, 0, 0), b = (0, 1, 0), c = (0, 0, 1).

The module is not free and we see, for example, the "relations" 2a = 4b = 5c = 0. We could have
chosen other generators for M1 such as a and b + c which satisfy the relations 2a = 20(b + c) = 0.
These facts are represented in the isomorphisms
\[
M_1 \cong \frac{\mathbb{Z}^3}{\langle (2,0,0), (0,4,0), (0,0,5) \rangle}
\cong \frac{\mathbb{Z}^2}{\langle (2,0), (0,20) \rangle}.
\]
Note that every relation involving a and b + c can be deduced from 2a = 20(b + c) = 0; that is,
these relations generate all relations.
For M2 we note again that a, b, c are generators, now with (x − 1)b = (x^2 + 1)c = 0, or that M2
might instead have been generated by a and b + c. Again we have
\[
M_2 \cong \frac{\mathbb{R}[x]^3}{\langle (0,x-1,0), (0,0,x^2+1) \rangle}
\cong \frac{\mathbb{R}[x]^2}{\langle (0,0), (0,(x-1)(x^2+1)) \rangle}.
\]



For each module, and for each different set of generators, the relations these generators satisfy
can be captured in a "presentation" matrix. In the case of M1 these would be the matrices
\[
\begin{pmatrix} 2 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 5 \end{pmatrix}
\quad \text{and} \quad
\begin{pmatrix} 2 & 0 \\ 0 & 20 \end{pmatrix}
\]
where the columns in the first matrix relate to a, b, c and in the second to a, b + c. For M2 the
two presentation matrices would be
\[
\begin{pmatrix} 0 & x-1 & 0 \\ 0 & 0 & x^2+1 \end{pmatrix}
\quad \text{and} \quad
\begin{pmatrix} 0 & 0 \\ 0 & (x-1)(x^2+1) \end{pmatrix}.
\]

Definition 200 (a) As we have defined already the elements x1 , . . . , xn in the R-module M
are generators if every element x of M can be written in the form

x = r1 x1 + r2 x2 + · · · + rn xn .

(b) A relation in the generators x1 , . . . , xn is any (trivial or non-trivial) combination that adds
to 0, that is
r1 x1 + r2 x2 + · · · + rn xn = 0.

Proposition 201 Given a finitely generated R-module M with generators x1 , . . . , xn there is
an onto module homomorphism φ : R^n → M and the relations in x1 , . . . , xn correspond to ker φ.

Proof. There is a module homomorphism

φ : R^n → M given by ei = (0, . . . , 0, 1, 0, . . . , 0) ↦ xi .

As the xi generate M then φ is onto and we have, by the first isomorphism theorem, that
\[
\frac{R^n}{\ker \varphi} \cong \operatorname{Im} \varphi = M.
\]
Further
r1 x1 + r2 x2 + · · · + rn xn = 0
is a relation if and only if (r1 , r2 , . . . , rn ) ∈ ker φ.

Corollary 202 A finitely generated module is isomorphic to a quotient of a free module.

Proof. This is immediate from the above by the First Isomorphism Theorem as M ≅ R^n / ker φ.

Corollary 203 A submodule of a finitely generated module is finitely generated.

Proof. If M is finitely generated then M ≅ R^n /K for some submodule K of R^n . There is a
correspondence between the submodules of R^n /K and the submodules of R^n which contain K
(as with rings). So a submodule N of M is of the form P/K where P is a submodule of R^n .
As submodules of the free module R^n are free of rank at most n (Corollary 196), and so finitely
generated, then N = P/K is finitely generated, namely by the images of the generators of P.



Definition 204 Let M be an R-module generated by x1 , . . . , xn and φ : Rn → M be as above.
Then the relation module ker φ is finitely generated, say by the relations

a11 x1 + a12 x2 + · · · + a1n xn = 0, . . . , am1 x1 + am2 x2 + · · · + amn xn = 0.

Then the presentation matrix for M with respect to these generators and relations is A =
(aij ). As ker φ is free the relations can be chosen to be a basis (but do not need to be).

Example 205 Say that A is the abelian group generated by a, b, c subject to the relations

6a + 18b + 12c = 0, 12a − 9b + 15c = 0, 9a − 12b + 24c = 0.

Show that A is isomorphic to Z_3 ⊕ Z_3 ⊕ Z_414 and find, in terms of a, b, c, all elements of order
3.

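Solution. This module is
\[
A = \frac{\mathbb{Z}^3}{\langle (6, 18, 12), (12, -9, 15), (9, -12, 24) \rangle}
\]
though this is not a particularly informative or transparent presentation. Rather we put the
presentation matrix into Smith normal form as follows:
\[
\begin{array}{ccc}
a & b & c \\ \hline
6 & 18 & 12 \\
12 & -9 & 15 \\
9 & -12 & 24
\end{array}
\sim
\begin{array}{ccc}
a & b & c \\ \hline
6 & 18 & 12 \\
0 & -45 & -9 \\
3 & -30 & 12
\end{array}
\sim
\begin{array}{ccc}
a & b & c \\ \hline
3 & -30 & 12 \\
0 & 45 & 9 \\
0 & 78 & -12
\end{array}
\]
\[
\sim
\begin{array}{ccc}
a - 10b + 4c & b & c \\ \hline
3 & 0 & 0 \\
0 & 45 & 9 \\
0 & 78 & -12
\end{array}
\sim
\begin{array}{ccc}
a - 10b + 4c & b & 5b + c \\ \hline
3 & 0 & 0 \\
0 & 0 & 9 \\
0 & 138 & -12
\end{array}
\sim
\begin{array}{ccc}
a - 10b + 4c & 5b + c & b \\ \hline
3 & 0 & 0 \\
0 & -3 & 138 \\
0 & 9 & 0
\end{array}
\]
\[
\sim
\begin{array}{ccc}
a - 10b + 4c & 5b + c & b \\ \hline
3 & 0 & 0 \\
0 & 3 & -138 \\
0 & 0 & 414
\end{array}
\sim
\begin{array}{ccc}
a - 10b + 4c & c - 41b & b \\ \hline
3 & 0 & 0 \\
0 & 3 & 0 \\
0 & 0 & 414
\end{array}
\]
Hence if we set α = a − 10b + 4c, β = c − 41b, γ = b then we see

A = ⟨α, β, γ : 3α = 3β = 414γ = 0⟩ ≅ Z_3 ⊕ Z_3 ⊕ Z_414 .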

We can see then that there are 26 elements of order three and that these are of the form

ε1 α + ε2 β + 138ε3 γ where ε1 , ε2 , ε3 ∈ {0, 1, 2} without all being zero.
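The count of 26 can be confirmed by brute force in the group Z_3 ⊕ Z_3 ⊕ Z_414, whose elements are written here as coordinate triples with respect to α, β, γ. This is a small illustrative check, not part of the original solution; the helper element_order is an assumption of the sketch.

```python
from itertools import product
from math import gcd

moduli = (3, 3, 414)


def element_order(x):
    """Order of x in the direct sum: the lcm of the orders of its coordinates."""
    result = 1
    for xi, n in zip(x, moduli):
        o = n // gcd(xi, n)              # order of xi in Z_n
        result = result * o // gcd(result, o)
    return result


order_three = {x for x in product(*(range(n) for n in moduli))
               if element_order(x) == 3}
predicted = {(e1, e2, 138 * e3)
             for e1, e2, e3 in product(range(3), repeat=3)} - {(0, 0, 0)}
```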



Example 206 Consider the ℝ[x]-module M defined on ℝ^3 by
\[
A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}.
\]

We have already noted that this module is not cyclic; but we can see that e1 , e3 are generators
as x.e3 = Ae3 = e2 + e3 .
Now note that x.e1 = e1 so that (x − 1).e1 = 0 and we also have

x^2 .e3 = x.(e2 + e3 ) = 2e2 + e3 = 2x.e3 − e3

so that (x − 1)^2 .e3 = 0. So

(x − 1).e1 = 0, (x − 1)^2 .e3 = 0

are relations. In fact these relations are sufficient to generate all relations. To appreciate this
note that dim_ℝ M = 3 and that the map
\[
\varphi : M \to \frac{\mathbb{R}[x]^2}{\langle (x-1, 0), (0, (x-1)^2) \rangle}
\cong \frac{\mathbb{R}[x]}{\langle x-1 \rangle} \oplus \frac{\mathbb{R}[x]}{\langle (x-1)^2 \rangle}
\]
given by φ(e1 ) = (1, 0), φ(e2 ) = (0, x − 1), φ(e3 ) = (0, 1) is an ℝ[x]-homomorphism, and is
bijective as it is onto and the LHS and RHS are both 3-dimensional real vector spaces.

Given a vector space V over a field F with basis e1 , . . . , en and defined as an F [x]-module
via a matrix A, then V is generated by e1 , e2 , . . . , en and we have the relations xei − Aei = 0
for each i. In fact these relations generate the relation module as the entire module structure
is a consequence of linearity and inductive use of these relations. Thus the presentation matrix
for V with respect to generators e1 , e2 , . . . , en and the relations xei − Aei = 0 is the matrix

xIn − A.

Example 207 Let V = C^3 and
\[
T = \begin{pmatrix} 1 & -1 & 1 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}.
\]
Put the presentation matrix xI − T into Smith normal form and find the invariant factors.
Find generators for V. What is the minimal polynomial of T ?

Solution. We now put the generators on the left like so
\[
\begin{array}{c|ccc}
e_1 & x-1 & 1 & -1 \\
e_2 & 0 & x & -1 \\
e_3 & 0 & -1 & x
\end{array}
\]
so that the first column represents (x − 1).e1 = 0 and the second that x.e2 + e1 − e3 = 0 etc.
So now EROs will change the generators and ECOs have no effect on them. Then, placing this in
Smith normal form we find
\[
\begin{array}{c|ccc}
e_1 & x-1 & 1 & -1 \\
e_2 & 0 & x & -1 \\
e_3 & 0 & -1 & x
\end{array}
\sim
\begin{array}{c|ccc}
e_1 & x-1 & 0 & x-1 \\
e_2 & 0 & 0 & x^2-1 \\
e_3 - e_1 - xe_2 & 0 & -1 & x
\end{array}
\]
\[
\sim
\begin{array}{c|ccc}
e_1 & x-1 & 0 & 0 \\
e_2 & 0 & 0 & x^2-1 \\
e_1 + xe_2 - e_3 & 0 & 1 & 0
\end{array}
\sim
\begin{array}{c|ccc}
e_1 + xe_2 - e_3 & 1 & 0 & 0 \\
e_1 & 0 & x-1 & 0 \\
e_2 & 0 & 0 & x^2-1
\end{array}
\]
Hence the invariant factors are x − 1 and x^2 − 1. This also means that the module defined by
T is isomorphic to
\[
V \cong \frac{\mathbb{C}[x]}{\langle x-1 \rangle} \oplus \frac{\mathbb{C}[x]}{\langle x^2-1 \rangle}
\]
with generators being e1 and e2 and relations being

(x − 1).e1 = 0, (x^2 − 1).e2 = 0.

It also follows that the minimal polynomial of T is x^2 − 1.
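The conclusions of this example are easy to sanity-check by direct matrix arithmetic. The sketch below is illustrative only. It confirms that T^2 = I while T is not a scalar matrix, so that the minimal polynomial is x^2 − 1, and that the redundant generator satisfies e3 = e1 + x.e2.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


T = [[1, -1, 1], [0, 0, 1], [0, 1, 0]]
I3 = [[int(i == j) for j in range(3)] for i in range(3)]
minus_I3 = [[-x for x in row] for row in I3]

T_squared = matmul(T, T)
# x.e2 is the second column of T; e1 + x.e2 should equal e3.
x_e2 = [row[1] for row in T]
e1_plus_x_e2 = [1 + x_e2[0], x_e2[1], x_e2[2]]
```

Since T is neither I nor −I, no polynomial of degree one annihilates T, which is consistent with the largest invariant factor x^2 − 1.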



7. STRUCTURE THEOREM. APPLICATIONS.

Definition 208 Let M be a finitely-generated R-module. An element m ∈ M is said to be a
torsion element if r.m = 0 for some r ≠ 0. M is said to have torsion if it has any non-zero
torsion elements. M is said to be torsion-free if 0 is the only torsion element.

Proposition 209 (a) Let M be a finitely-generated R-module. The torsion elements form a
submodule T of M.
(b) A finitely generated torsion-free R-module is free.
(c) The module F = M/T is a free R-module and M ≅ F ⊕ T.

Proof. (a) Note that 0 ∈ T. If m1 , m2 ∈ T then there exist non-zero r1 , r2 such that r1 .m1 =
r2 .m2 = 0. Then r1 r2 ≠ 0, as R is an integral domain, and we have
r1 r2 .(m1 + m2 ) = 0.
Further for r ∈ R we have that r.m1 ∈ T as
r1 .(r.m1 ) = r1 r.m1 = rr1 .m1 = r.(r1 .m1 ) = 0.
Hence T is a submodule.
(b) Let N be a finitely generated torsion-free R-module. Say that x1 , . . . , xn generate N
and that (by reordering if necessary) x1 , . . . , xm is a maximal linearly independent subset of
the xi . We set F to be the span of x1 , . . . , xm , noting that F is free. Note that for each i there
exist ri , rij ∈ R, not all zero, such that
\[
r_i x_i + \sum_{j=1}^{m} r_{ij} x_j = 0.
\]
This is clear for i ≤ m and follows by the maximality of m for i > m. Further as x1 , . . . , xm
are independent then ri ≠ 0 for all i. We set
r = r1 r2 · · · rn ≠ 0.
Note that ri xi is in F for all i and hence rxi ∈ F for all i. But then rN ⊆ F. Note that the map
n ↦ r.n is an injective module homomorphism as N is torsion-free. So N is isomorphic to
its image, which is a submodule of the free module F and so free itself.
(c) As M is finitely generated then F = M/T is finitely generated and is torsion-free by
construction, and so free by (b). Let x1 + T, . . . , xn + T be a basis for F. Then any element of
M/T can be uniquely written as
r1 x1 + · · · + rn xn + T.
Then the map φ : M/T ⊕ T → M given by
φ(r1 x1 + · · · + rn xn + T, t) = r1 x1 + · · · + rn xn + t
is a module isomorphism.



Example 210 A linear map T : V → V on a finite dimensional vector space V over F defines
an F [x]-module. In this module every element is a torsion element as mT (x).v = 0 for all v
yet mT (x) ≠ 0 ∈ F [x].

Theorem 211 (Structure Theorem for Finitely Generated Modules) Let M be a finitely-
generated R-module. Then there exists a non-negative integer r, called the (torsion-free) rank
of M, and non-zero, non-unit elements di ∈ R, known as the invariant factors, such that
d1 |d2 |d3 | · · · |dk
and such that
\[
M \cong R^r \oplus \frac{R}{\langle d_1 \rangle} \oplus \frac{R}{\langle d_2 \rangle} \oplus \cdots \oplus \frac{R}{\langle d_k \rangle}.
\]
The rank r is unique and d1 , . . . , dk are unique up to multiplication by units.

Proof. Say that M , which is finitely generated, is generated by x1 , . . . , xn . There is then a
module homomorphism
φ : R^n → M given by ei = (0, . . . , 0, 1, 0, . . . , 0) ↦ xi .
As the xi generate M then φ is onto and we have, by the first isomorphism theorem, that
\[
\frac{R^n}{\ker \varphi} \cong \operatorname{Im} \varphi = M.
\]
Now by the corollary to the Smith normal form, we know that there is a basis f1 , . . . , fn for R^n
and di as above with
\[
\ker \varphi = 0 \oplus \cdots \oplus 0 \oplus \langle d_1 \rangle \oplus \cdots \oplus \langle d_k \rangle
\]
and hence we have
\[
M \cong \frac{R^n}{0 \oplus \cdots \oplus 0 \oplus \langle d_1 \rangle \oplus \cdots \oplus \langle d_k \rangle}
\cong R^{n-k} \oplus \frac{R}{\langle d_1 \rangle} \oplus \frac{R}{\langle d_2 \rangle} \oplus \cdots \oplus \frac{R}{\langle d_k \rangle}.
\]
Had any of the di been units then we would have R/⟨di⟩ = R/R ≅ 0 and we can just omit such
factors.
Now note that dk M ≅ dk R^r is a free module of rank r and so the rank of M is uniquely determined.
However we will postpone for now proving the uniqueness of invariant factors until we have
introduced the notion of elementary divisors.
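
Example 212 Present each of the following modules as described in the structure theorem.
(a) Z_6 ⊕ Z_12 ⊕ Z_16 ≅ Z_2 ⊕ Z_3 ⊕ Z_3 ⊕ Z_4 ⊕ Z_16 ≅ Z_2 ⊕ Z_12 ⊕ Z_48 .
(b)
\[
\frac{\mathbb{Q}[x]}{\langle (x^2-4)(x^3-8) \rangle}.
\]
This is already cyclic (it is generated by 1) and so is in the required form.
(c)
\[
\frac{\mathbb{Z}[i]}{\langle 2 \rangle} \oplus \frac{\mathbb{Z}[i]}{\langle 4 \rangle} \oplus \frac{\mathbb{Z}[i]}{\langle 5 \rangle}
\cong \frac{\mathbb{Z}[i]}{\langle 2 \rangle} \oplus \frac{\mathbb{Z}[i]}{\langle 20 \rangle}.
\]
Note that the above three modules all have zero rank.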



There is an alternative form of the structure theorem. The above decomposition might be
viewed as a minimal decomposition into cyclic submodules. Alternatively we can decompose
some of the summands yet further by using the Chinese remainder theorem applied to the
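coprime factors of the invariant factors. Let p1 , . . . , pn be the primes which divide at least one of
the di . Then we may write
\[
d_i = p_1^{\alpha_{1i}} p_2^{\alpha_{2i}} \cdots p_n^{\alpha_{ni}}
\]
where 0 ≤ α_{ji} ≤ α_{j(i+1)} for each j, as di divides d(i+1) . Then we have
\[
\frac{R}{\langle d_i \rangle} \cong \frac{R}{\langle p_1^{\alpha_{1i}} \rangle} \oplus \frac{R}{\langle p_2^{\alpha_{2i}} \rangle} \oplus \cdots \oplus \frac{R}{\langle p_n^{\alpha_{ni}} \rangle}
\]
so that, with r the rank of M,
\[
M \cong R^r \oplus \bigoplus_{i=1}^{k} \frac{R}{\langle p_1^{\alpha_{1i}} \rangle} \oplus \bigoplus_{i=1}^{k} \frac{R}{\langle p_2^{\alpha_{2i}} \rangle} \oplus \cdots \oplus \bigoplus_{i=1}^{k} \frac{R}{\langle p_n^{\alpha_{ni}} \rangle}.
\]
The elements p_j^{α_{ji}}, where α_{ji} > 0, are known as the elementary divisors.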

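Example 213 Applying this alternative decomposition to the previous three examples we would
write
\[
\mathbb{Z}_6 \oplus \mathbb{Z}_{12} \oplus \mathbb{Z}_{16} \cong \mathbb{Z}_2 \oplus \mathbb{Z}_4 \oplus \mathbb{Z}_{16} \oplus \mathbb{Z}_3 \oplus \mathbb{Z}_3,
\]
\[
\frac{\mathbb{Q}[x]}{\langle (x^2-4)(x^3-8) \rangle} \cong \frac{\mathbb{Q}[x]}{\langle (x-2)^2 \rangle} \oplus \frac{\mathbb{Q}[x]}{\langle x+2 \rangle} \oplus \frac{\mathbb{Q}[x]}{\langle x^2+2x+4 \rangle},
\]
\[
\frac{\mathbb{Z}[i]}{\langle 2 \rangle} \oplus \frac{\mathbb{Z}[i]}{\langle 4 \rangle} \oplus \frac{\mathbb{Z}[i]}{\langle 5 \rangle}
\cong \frac{\mathbb{Z}[i]}{\langle (1+i)^2 \rangle} \oplus \frac{\mathbb{Z}[i]}{\langle (1+i)^4 \rangle} \oplus \frac{\mathbb{Z}[i]}{\langle 1+2i \rangle} \oplus \frac{\mathbb{Z}[i]}{\langle 1-2i \rangle}.
\]
In the last case note that 1 − i = −i(1 + i) is an associate of 1 + i, so that 2 = −i(1 + i)^2 and
4 = −(1 + i)^4 are, up to units, powers of the single irreducible 1 + i, whilst 5 = (1 + 2i)(1 − 2i) is a
product of two non-associate irreducibles.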
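For Z-modules the passage from cyclic summands to elementary divisors is just prime-power factorisation together with the Chinese remainder theorem. The sketch below illustrates this for the first example; the function names are chosen here and are not from the notes.

```python
def prime_power_factors(n):
    """Prime-power factors of n in order of increasing prime, e.g. 12 -> [4, 3]."""
    out, p = [], 2
    while p * p <= n:
        if n % p == 0:
            q = 1
            while n % p == 0:
                n //= p
                q *= p
            out.append(q)
        p += 1
    if n > 1:
        out.append(n)
    return out


def elementary_divisors(orders):
    """Elementary divisors of Z_{n1} + ... + Z_{nk}, sorted numerically."""
    return sorted(q for n in orders for q in prime_power_factors(n))


divisors = elementary_divisors([6, 12, 16])
```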

Proposition 214 The elementary divisors and invariant factors are unique (up to multiplica-
tion by units).

Proof. We can separately consider the different irreducibles p for which there is some non-zero
element x of M satisfying p^n x = 0 for a power of p, because different (non-associate) irreducibles
will be coprime. So say that M is a module with p^n M = 0 for some positive integer n, and
choose n to be the least such n. Then we have that p^(n−1) M is a non-zero vector space over the
field R/⟨p⟩, for if r1 = r2 in R/⟨p⟩ then r1 − r2 = rp for some r and we have

r1 .p^(n−1) m = (r2 + rp).p^(n−1) m = r2 .p^(n−1) m.

The dimension of p^(n−1) M as an R/⟨p⟩-vector space is the number of copies of R/⟨p^n⟩ in the
decomposition and so recoverable from M. In a similar fashion p^(n−2) M/p^(n−1) M is an R/⟨p⟩-
vector space and its dimension is the total number of copies of R/⟨p^(n−1)⟩ and R/⟨p^n⟩ in
the decomposition, and subtracting our previously found dimension we now know the number
of summands of R/⟨p^(n−1)⟩ in the decomposition. Continuing in this fashion we are able to
determine the number of each different summand. The invariant factors are then recoverable
from the elementary divisors in a straightforward, but notationally painful, manner. Begin by
putting the highest power of each irreducible among the elementary divisors into dk and keep
repeating this process, with the remaining elementary divisors, to produce all the invariant factors.
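The recipe at the end of the proof can be sketched for Z-modules as follows. This is an illustrative implementation only, assuming the elementary divisors are given as a list of prime powers; the function name is chosen here.

```python
from collections import defaultdict


def invariant_factors(elementary):
    """Recover d1 | d2 | ... | dk from a list of prime-power elementary divisors."""
    if not elementary:
        return []
    by_prime = defaultdict(list)
    for q in elementary:
        p = next(p for p in range(2, q + 1) if q % p == 0)  # the prime under q
        by_prime[p].append(q)
    for powers in by_prime.values():
        powers.sort(reverse=True)
    k = max(len(powers) for powers in by_prime.values())
    factors = []
    for i in range(k):
        # The i-th largest power of each prime goes into d_{k-i}.
        d = 1
        for powers in by_prime.values():
            if i < len(powers):
                d *= powers[i]
        factors.append(d)
    return factors[::-1]


example = invariant_factors([2, 4, 16, 3, 3])
```

Applied to the elementary divisors 2, 4, 16, 3, 3 of Z_6 ⊕ Z_12 ⊕ Z_16 this returns the invariant factors of Example 212(a).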



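Example 215 To help understand the previous proposition, here is the argument made for a
specific Z-module
M = Z_2 ⊕ Z_2 ⊕ Z_4 ⊕ Z_8 ⊕ Z_8 .
Note that 8M = 0 and that
\[
4M = 0 \oplus 0 \oplus 0 \oplus \langle 4 \rangle \oplus \langle 4 \rangle \cong \mathbb{Z}_2^2.
\]
This dimension, 2, is the number of Z_8 summands. Now
\[
\frac{2M}{4M} \cong \frac{0 \oplus 0 \oplus \langle 2 \rangle \oplus \langle 2 \rangle \oplus \langle 2 \rangle}{0 \oplus 0 \oplus 0 \oplus \langle 4 \rangle \oplus \langle 4 \rangle} \cong \mathbb{Z}_2^3.
\]
This dimension, 3, is the number of Z_4 and Z_8 summands in total. Finally
\[
\frac{M}{2M} = \frac{\mathbb{Z}_2 \oplus \mathbb{Z}_2 \oplus \langle 1 \rangle \oplus \langle 1 \rangle \oplus \langle 1 \rangle}{0 \oplus 0 \oplus \langle 2 \rangle \oplus \langle 2 \rangle \oplus \langle 2 \rangle} \cong \mathbb{Z}_2^5
\]
and this dimension, 5, is the total number of Z_2 , Z_4 , Z_8 summands. So we see that the number
of each elementary divisor is recoverable from these dimensions.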
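These dimensions can be recovered by brute force from the group itself, without looking at the decomposition. The following is a small illustrative check in which elements of M are stored as coordinate tuples; it is not part of the original example.

```python
from itertools import product

moduli = (2, 2, 4, 8, 8)
M = list(product(*(range(n) for n in moduli)))


def multiples(c):
    """The submodule cM as a set of coordinate tuples."""
    return {tuple(c * x % n for x, n in zip(m, moduli)) for m in M}


# Orders of M, 2M, 4M, 8M.
sizes = [len(multiples(c)) for c in (1, 2, 4, 8)]
# dim over Z_2 of M/2M, 2M/4M, 4M/8M: log2 of the order of each quotient.
dims = [(sizes[j] // sizes[j + 1]).bit_length() - 1 for j in range(3)]
# Successive differences count the summands Z_2, Z_4, Z_8.
counts = {2: dims[0] - dims[1], 4: dims[1] - dims[2], 8: dims[2]}
```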

Corollary 216 (Classification Theorem for Finitely Generated Abelian Groups) Let
A be a finitely generated abelian group. Then there exist a unique non-negative integer r and
unique integers di ≥ 2 with d1 |d2 | · · · |dk such that

A ≅ Z^r ⊕ Z_d1 ⊕ Z_d2 ⊕ · · · ⊕ Z_dk .

Proof. This is simply a statement of the Structure Theorem for Z-modules.

Example 217 Find, up to isomorphism, all abelian groups of order 360. What are their ele-
mentary divisors?
If it is not explicitly on your list, determine which of your groups is isomorphic to Z_4 ⊕ Z_90 .

Solution. Note that 360 = 2^3 × 3^2 × 5. Hence k ≤ 3 and we must have 2 × 3 × 5 = 30 | dk . We see
that the only abelian groups, up to isomorphism, are

Z_360 , Z_2 ⊕ Z_180 , Z_3 ⊕ Z_120 , Z_6 ⊕ Z_60 , Z_2 ⊕ Z_2 ⊕ Z_90 , Z_2 ⊕ Z_6 ⊕ Z_30 .

The elementary divisors of these groups are respectively

8, 9, 5; 2, 4, 9, 5; 8, 3, 3, 5; 2, 4, 3, 3, 5; 2, 2, 2, 9, 5; 2, 2, 2, 3, 3, 5.

Now note that

Z_4 ⊕ Z_90 ≅ Z_4 ⊕ Z_2 ⊕ Z_45 ≅ Z_2 ⊕ Z_180

or we might have noted its elementary divisors to be 4, 2, 5, 9 and so it is the second listed
group.
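The list can be generated mechanically: for each prime the exponents in the elementary divisors form a partition of that prime's exponent in 360, and each choice of partitions gives one group. The sketch below is illustrative, with function names chosen here, and returns the groups as tuples of invariant factors.

```python
from itertools import product


def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def abelian_groups(prime_exponents):
    """Invariant-factor tuples (d1, ..., dk) for the given prime factorisation."""
    choices = [[(p, part) for part in partitions(e)]
               for p, e in prime_exponents.items()]
    groups = []
    for choice in product(*choices):
        k = max(len(part) for _, part in choice)
        ds = [1] * k
        for p, part in choice:
            for i, e in enumerate(part):
                ds[k - 1 - i] *= p ** e  # the largest power goes into d_k
        groups.append(tuple(ds))
    return groups


groups_360 = abelian_groups({2: 3, 3: 2, 5: 1})
```

The number of groups is the product of the partition numbers of the exponents, here 3 × 2 × 1 = 6.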

Example 218 Identify the abelian group generated by four elements a, b, c, d subject to the
relations

12a+6b+4c+8d = 0, 3a+9b+6c+12d = 0, 2a+16b+14c+28d = 0, 20a+10b+10c+20d = 0.



Solution. We know that we can use EROs and ECOs to put the matrix into Smith normal
form, and from our previous calculation we have
\[
\begin{array}{cccc}
a & b & c & d \\ \hline
12 & 6 & 4 & 8 \\
3 & 9 & 6 & 12 \\
2 & 16 & 14 & 28 \\
20 & 10 & 10 & 20
\end{array}
\sim
\begin{array}{cccc}
a - 7b - 8c - 16d & c + 2d & b & d \\ \hline
1 & 0 & 0 & 0 \\
0 & 10 & 0 & 0 \\
0 & 0 & 30 & 0 \\
0 & 0 & 0 & 0
\end{array}.
\]
Hence if we set

α = a − 7b − 8c − 16d, β = c + 2d, γ = b, δ = d

then the described abelian group is

⟨α, β, γ, δ : α = 10β = 30γ = 0⟩ ≅ Z_10 ⊕ Z_30 ⊕ Z.

There are two elements of order 3, for example, namely (0, ±10, 0), or in terms of the generators
these are
±10γ = ±10b.

Corollary 219 (Rational Canonical Form) Let A be an n × n matrix over a field F. Then
A is similar to a matrix of the form

diag(C(d1 ), C(d2 ), . . . C(dk ))

where di ∈ F [x] are monic polynomials, C(di ) denotes the companion matrix of di and

d1 |d2 | · · · |dk .

The di are unique up to multiplication by units. Note further that

mA (x) = dk (x) and that cA (x) = d1 (x)d2 (x) · · · dk (x).

The above matrix representative of A is known as its rational canonical form or Frobenius
normal form.

Remark 220 Equivalently, in terms of the F [x]-module structure defined on F^n by A, the
above says that
F^n ≅ F [C(d1 )] ⊕ F [C(d2 )] ⊕ · · · ⊕ F [C(dk )],
hence decomposing the F [x]-module defined by A, which in general will not be cyclic, into cyclic
F [x]-modules defined by the above companion matrices.



Proof. Consider the F [x]-module structure defined on F^n by the linear map T (v) = Av. This
module has rank zero and every element is a torsion element. By the Structure Theorem we
know that there exist invariant factors di ∈ F [x] such that
d1 |d2 | · · · |dk
with
\[
F^n \cong \frac{F[x]}{\langle d_1(x) \rangle} \oplus \frac{F[x]}{\langle d_2(x) \rangle} \oplus \cdots \oplus \frac{F[x]}{\langle d_k(x) \rangle}.
\]
If deg d1 = n1 then 1, x, . . . , x^(n1−1) is a basis for F [x]/⟨d1 (x)⟩ as a vector space and with respect
to this basis multiplication by x, or equivalently by A, on F [x]/⟨d1 (x)⟩ is given by the companion
matrix C(d1 ). Taking such a basis for each summand, the union of these bases is a basis for F^n
and with respect to this basis T has matrix representative
diag(C(d1 ), C(d2 ), . . . , C(dk ))
to which A is similar. (In choosing the above basis we have found a change of basis matrix
P such that P^(−1) AP equals the above matrix representative.) Conversely any such matrix
representation leads to a decomposition of the F [x]-module as above and we know that the
invariant factors are unique up to multiplication by units.
For elements in the final summand F [x]/⟨dk (x)⟩ we know that the minimal polynomial
is dk (x). As di |dk for all i then dk (A) also annihilates the other summands and we see that
mA (x) = dk (x). We also have that the characteristic polynomial of C(di ) is di (x); as the char-
acteristic polynomial of the above matrix representative equals the product of the characteristic
polynomials of the blocks we have
cA (x) = d1 (x)d2 (x) · · · dk (x).
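The companion matrix used in the proof, for the basis 1, x, . . . , x^(n−1) with vectors written as columns, can be built and tested as follows. This is an illustrative sketch in which the function names and the coefficient convention (constant term first, leading coefficient 1 omitted) are choices made here.

```python
def companion(coeffs):
    """Companion matrix of x^n + c[n-1] x^(n-1) + ... + c[0], coeffs = [c0, ..., c_{n-1}]."""
    n = len(coeffs)
    c = [[0] * n for _ in range(n)]
    for i in range(1, n):
        c[i][i - 1] = 1            # x . x^(i-1) = x^i
    for i in range(n):
        c[i][n - 1] = -coeffs[i]   # x . x^(n-1) = -(c0 + c1 x + ... + c_{n-1} x^(n-1))
    return c


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def evaluate(coeffs, mat):
    """Evaluate the monic polynomial at a square matrix by Horner's rule."""
    n = len(mat)
    ident = [[int(i == j) for j in range(n)] for i in range(n)]
    result = ident  # leading coefficient 1
    for c in reversed(coeffs):
        result = matmul(result, mat)
        result = [[result[i][j] + c * ident[i][j] for j in range(n)]
                  for i in range(n)]
    return result


C = companion([4, -4])             # (x - 2)^2 = x^2 - 4x + 4
annihilated = evaluate([4, -4], C)
```

The check that d(C(d)) = 0 reflects the fact that d annihilates the cyclic module F [x]/⟨d⟩.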

Corollary 221 (Jordan Normal Form) Let A be a complex n × n matrix with distinct
eigenvalues λ1 , λ2 , . . . , λk . Then A is similar to a matrix of the form
diag(J(λ1 , r11 ), J(λ1 , r12 ), . . . , J(λ1 , r1n1 ), . . . , J(λk , rk1 ), J(λk , rk2 ), . . . , J(λk , rknk ))
where J(λ, r) denotes the r × r Jordan block matrix
\[
J(\lambda, r) = \begin{pmatrix}
\lambda & 0 & 0 & \cdots & 0 \\
1 & \lambda & 0 & \ddots & \vdots \\
0 & 1 & \lambda & \ddots & 0 \\
\vdots & \ddots & \ddots & \ddots & 0 \\
0 & \cdots & 0 & 1 & \lambda
\end{pmatrix}
\]
and
ri1 ≤ ri2 ≤ · · · ≤ rini for each i.
Note that
nullity((A − λi I)^j ) − nullity((A − λi I)^(j−1) ) = number of J(λi , r) blocks with r ≥ j.



Remark 222 Equivalently, in terms of the C[x]-module structure defined on Cn by A, the
above says that

C^n ≅ C[J(λ1 , r11 )] ⊕ C[J(λ1 , r12 )] ⊕ · · · ⊕ C[J(λk , rknk )],

hence decomposing the C[x]-module defined by A, which in general will not be cyclic, into cyclic
C[x]-modules defined by the above Jordan block matrices.

Proof. Consider the C[x]-module structure defined on Cn by A. We have that

cA (x) = d1 (x)d2 (x) · · · dk (x)

and by the Fundamental Theorem of Algebra the elementary divisors are all of the form (x − λ)^r
for some r and λ an eigenvalue. So the alternative statement of the Structure Theorem gives
us
\[
\mathbb{C}^n \cong \frac{\mathbb{C}[x]}{\langle (x-\lambda_1)^{r_{11}} \rangle} \oplus \frac{\mathbb{C}[x]}{\langle (x-\lambda_1)^{r_{12}} \rangle} \oplus \cdots \oplus \frac{\mathbb{C}[x]}{\langle (x-\lambda_1)^{r_{1n_1}} \rangle} \oplus \cdots \oplus \frac{\mathbb{C}[x]}{\langle (x-\lambda_k)^{r_{k1}} \rangle} \oplus \cdots \oplus \frac{\mathbb{C}[x]}{\langle (x-\lambda_k)^{r_{kn_k}} \rangle}.
\]
Note that 1, (x − λ), (x − λ)^2 , . . . , (x − λ)^(r−1) is a basis for C[x]/⟨(x − λ)^r⟩ as a vector space,
and with respect to this basis we see that multiplication by x is represented by the matrix J(λ, r).
To see this note that

x(x − λ)^s = λ(x − λ)^s + (x − λ)^(s+1) for 0 ≤ s < r − 1

and that x(x − λ)^(r−1) = λ(x − λ)^(r−1) as (x − λ)^r = 0. Hence with respect to the union of these
bases multiplication by A is represented by the above matrix of Jordan blocks.
We know that the elementary divisors are unique (up to units) and so the Jordan normal
form is unique. However we can further appreciate that the number of J(λ, r) blocks equals
nullity(A − λI), a basis for ker(A − λI) consisting of the last basis vector associated with each
such block. A basis for ker(A − λI)^2 consists of those vectors just described and the penultimate
basis vectors of those J(λ, r) blocks for which r ≥ 2, and so on.
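The nullity formula gives a practical way of reading off block sizes. The sketch below is illustrative: it computes ranks exactly over the rationals by Gaussian elimination, and the helper names are chosen here. It is applied to the matrix with rows (2, 0, 0), (0, 2, 0), (0, 1, 2), which has eigenvalue 2 only.

```python
from fractions import Fraction


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def rank(mat):
    """Rank by Gaussian elimination with exact rational arithmetic."""
    m = [[Fraction(x) for x in row] for row in mat]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r


def blocks_of_size_at_least(a, lam, j):
    """nullity((A - lam I)^j) - nullity((A - lam I)^(j-1))."""
    n = len(a)
    b = [[a[i][k] - (lam if i == k else 0) for k in range(n)] for i in range(n)]
    power = [[int(i == k) for k in range(n)] for i in range(n)]
    nullities = [0]
    for _ in range(j):
        power = matmul(power, b)
        nullities.append(n - rank(power))
    return nullities[j] - nullities[j - 1]


Z = [[2, 0, 0], [0, 2, 0], [0, 1, 2]]
counts = [blocks_of_size_at_least(Z, 2, j) for j in (1, 2, 3)]
```

Two blocks in total, of which one has size at least 2 and none has size 3, means the Jordan normal form is diag(J(2, 1), J(2, 2)).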

Remark 223 If we instead worked with the (ordered) basis (x − λ)^(r−1) , . . . , (x − λ)^2 , (x − λ), 1,
then the Jordan blocks would have the form
\[
\begin{pmatrix}
\lambda & 1 & 0 & \cdots & 0 \\
0 & \lambda & 1 & \ddots & \vdots \\
0 & 0 & \lambda & \ddots & 0 \\
\vdots & \ddots & \ddots & \ddots & 1 \\
0 & \cdots & 0 & 0 & \lambda
\end{pmatrix}
\]
which is a form just as commonly used.

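Example 224 Find the RCFs and JNFs of the following complex matrices.
\[
X = \begin{pmatrix} 0 & 1 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \quad
Y = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}, \quad
Z = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 1 & 2 \end{pmatrix}.
\]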



Solution. Note that mX (x) = cX (x) = x^3 . Hence the rational canonical form is C(x^3 ). The
geometric multiplicity of 0 is 1 and so we have just one Jordan block J(0, 3). So the RCF and
JNF of X are both
\[
\begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}.
\]
Note that mY (x) = cY (x) = x^3 − 1. This has three distinct complex roots 1, ω and ω^2 where
ω = cis(2π/3). So the RCF equals C(x^3 − 1) and the JNF equals diag(J(1, 1), J(ω, 1), J(ω^2 , 1)).
Explicitly these are
\[
\begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}, \quad
\begin{pmatrix} 1 & 0 & 0 \\ 0 & \omega & 0 \\ 0 & 0 & \omega^2 \end{pmatrix}.
\]
Finally note that Z is already in JNF, equalling diag(J(2, 1), J(2, 2)). We have mZ (x) = (x − 2)^2
and cZ (x) = (x − 2)^3 so that the invariant factors are (x − 2), (x − 2)^2 and the RCF equals
diag(C(x − 2), C((x − 2)^2 )). Explicitly the RCF and JNF are
\[
\begin{pmatrix} 2 & 0 & 0 \\ 0 & 0 & -4 \\ 0 & 1 & 4 \end{pmatrix}, \quad
\begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 1 & 2 \end{pmatrix}.
\]

Example 225 (a) Find the RCF and JNF of
\[
T = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{pmatrix}
\]
by finding the minimal and characteristic polynomials.
(b) What are the decompositions of the module defined by T that are associated with the
RCF and JNF?
(c) Rederive the invariant factors by finding the Smith normal form of xI − T.

Solution. (a) We have cT (x) = x^3 (x − 4) and mT (x) = x(x − 4). So the invariant factors are

x, x, x(x − 4).

Hence the RCF and JNF are respectively

diag(C(x), C(x), C(x(x − 4))) and diag(J(0, 1), J(0, 1), J(0, 1), J(4, 1)).

Explicitly these are
\[
\begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 4 \end{pmatrix}, \quad
\begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 4 \end{pmatrix}.
\]



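(b) Bases with respect to which T has the above two matrix representatives are, respectively,

{(1, −1, 0, 0)^T , (0, 1, −1, 0)^T , (0, 0, 1, 0)^T , (1, 1, 1, 1)^T };
{(1, −1, 0, 0)^T , (0, 1, −1, 0)^T , (0, 0, 1, −1)^T , (1, 1, 1, 1)^T }.

These correspond respectively to the following decompositions of C^4 into submodules as
\[
\mathbb{C}^4 = \langle (1,-1,0,0)^T \rangle \oplus \langle (0,1,-1,0)^T \rangle \oplus \langle (0,0,1,0)^T, (1,1,1,1)^T \rangle
\]
\[
\phantom{\mathbb{C}^4} = \langle (1,-1,0,0)^T \rangle \oplus \langle (0,1,-1,0)^T \rangle \oplus \langle (0,0,1,-1)^T \rangle \oplus \langle (1,1,1,1)^T \rangle
\]
or we might write these as
\[
\mathbb{C}^4 \cong \frac{\mathbb{C}[x]}{\langle x \rangle} \oplus \frac{\mathbb{C}[x]}{\langle x \rangle} \oplus \frac{\mathbb{C}[x]}{\langle x(x-4) \rangle}
\cong \frac{\mathbb{C}[x]}{\langle x \rangle} \oplus \frac{\mathbb{C}[x]}{\langle x \rangle} \oplus \frac{\mathbb{C}[x]}{\langle x \rangle} \oplus \frac{\mathbb{C}[x]}{\langle x-4 \rangle}
\]
or as further alternatives as
\[
\mathbb{C}^4 \cong \mathbb{C}[0_1] \oplus \mathbb{C}[0_1] \oplus \mathbb{C}\!\left[ \begin{pmatrix} 0 & 0 \\ 1 & 4 \end{pmatrix} \right]
\cong \mathbb{C}[0_1] \oplus \mathbb{C}[0_1] \oplus \mathbb{C}[0_1] \oplus \mathbb{C}[4I_1].
\]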

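(c) If we put xI − T into Smith normal form then we find
\[
\begin{pmatrix}
x-1 & -1 & -1 & -1 \\
-1 & x-1 & -1 & -1 \\
-1 & -1 & x-1 & -1 \\
-1 & -1 & -1 & x-1
\end{pmatrix}
\sim
\begin{pmatrix}
1 & 1 & 1 & 1-x \\
-1 & x-1 & -1 & -1 \\
-1 & -1 & x-1 & -1 \\
x-1 & -1 & -1 & -1
\end{pmatrix}
\sim
\begin{pmatrix}
1 & 1 & 1 & 1-x \\
0 & x & 0 & -x \\
0 & 0 & x & -x \\
0 & -x & -x & x^2-2x
\end{pmatrix}
\]
\[
\sim
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & x & 0 & -x \\
0 & 0 & x & -x \\
0 & -x & -x & x^2-2x
\end{pmatrix}
\sim
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & x & 0 & -x \\
0 & 0 & x & -x \\
0 & 0 & 0 & x^2-4x
\end{pmatrix}
\sim
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & x & 0 & 0 \\
0 & 0 & x & 0 \\
0 & 0 & 0 & x^2-4x
\end{pmatrix}.
\]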
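The bases in part (b) can be checked directly: if P has the basis vectors as its columns then P^(−1) T P equals the stated normal form exactly when T P = P N, which avoids computing an inverse. The following sketch is illustrative, with helper names chosen here.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def columns(vectors):
    """Matrix whose columns are the given vectors."""
    return [list(row) for row in zip(*vectors)]


T = [[1] * 4 for _ in range(4)]
RCF = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 1, 4]]
JNF = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 4]]

P = columns([(1, -1, 0, 0), (0, 1, -1, 0), (0, 0, 1, 0), (1, 1, 1, 1)])
Q = columns([(1, -1, 0, 0), (0, 1, -1, 0), (0, 0, 1, -1), (1, 1, 1, 1)])

rcf_ok = matmul(T, P) == matmul(P, RCF)
jnf_ok = matmul(T, Q) == matmul(Q, JNF)

# The minimal polynomial x(x - 4) annihilates T, i.e. T^2 = 4T.
min_poly_ok = matmul(T, T) == [[4 * x for x in row] for row in T]
```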

Example 226 Let V = C^3 and
\[
T = \begin{pmatrix} 1 & -1 & 1 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}.
\]
Find the RCF and JNF of T.

Solution. In Example 207 we put xI − T into Smith normal form and found the invariant
factors to be x − 1 and x^2 − 1. So the RCF and JNF are therefore respectively
\[
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \quad
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix}.
\]
