Symmetry
N. P. STRICKLAND
1. Symmetry groups in Rn
1.1. General linear groups. We write Mn or Mn (R) for the set of n × n matrices over the real numbers.
Recall that an n × n matrix A is invertible if there is a matrix B such that AB = I = BA. This holds iff
det(A) ≠ 0, and in that case the matrix B is unique, and we call it A^{-1}.
We write GLn or GLn (R) for the set of invertible n × n matrices over R.
Recall that a group is a set G equipped with a binary operation ∗ and an element e ∈ G such that
• The set G is closed under ∗, in other words a ∗ b ∈ G whenever a, b ∈ G.
• The operation is associative, in other words a ∗ (b ∗ c) = (a ∗ b) ∗ c whenever a, b, c ∈ G.
• e is a neutral element, in other words e ∗ a = a = a ∗ e for all a ∈ G.
• The operation has inverses: for any a ∈ G there exists an element a^{-1} ∈ G with a ∗ a^{-1} = e = a^{-1} ∗ a.
For most groups in this course, we will write ab for a ∗ b and 1 for e.
It is easy to check that GLn is a group under matrix multiplication; it is called the general linear group.
1.2. Orthogonal groups. Given vectors x = (x_1, . . . , x_n) and y = (y_1, . . . , y_n) in Rn , we define
⟨x, y⟩ = \sum_{i=1}^{n} x_i y_i
‖x‖ = √⟨x, x⟩ = the length of x
d(x, y) = ‖x − y‖ = the distance from x to y.
Proposition 1.1 (The Cauchy-Schwarz inequality). For any x, y ∈ Rn we have |⟨x, y⟩| ≤ ‖x‖ ‖y‖.
Proof. (This is included for completeness but is not examinable.)
For any t ∈ R we define
f(t) = ‖x + ty‖^2
= ⟨x + ty, x + ty⟩
= ⟨x, x⟩ + 2t⟨x, y⟩ + t^2 ⟨y, y⟩
= ‖x‖^2 + t^2 ‖y‖^2 + 2t⟨x, y⟩.
From the first line of the definition we see that f(t) ≥ 0 for all t. If y = 0 the claim is trivial, so we may
assume that y ≠ 0. We now take t = −⟨x, y⟩/‖y‖^2; the
geometric interpretation is that in this case x + ty is the projection of x perpendicular to y. Then
t^2 ‖y‖^2 = ⟨x, y⟩^2 ‖y‖^2 /‖y‖^4 = ⟨x, y⟩^2 /‖y‖^2
and
2t⟨x, y⟩ = −2⟨x, y⟩^2 /‖y‖^2
so
f(t) = ‖x‖^2 + t^2 ‖y‖^2 + 2t⟨x, y⟩ = ‖x‖^2 − ⟨x, y⟩^2 /‖y‖^2.
As f(t) ≥ 0, this gives ⟨x, y⟩^2 ≤ ‖x‖^2 ‖y‖^2, and taking square roots gives the claim.
We then have det(D_t) = t; this shows that the function det : GLn → R× is surjective. Moreover, R× is a
group under multiplication, and properties (a) and (b) can be restated as follows:
Proposition 1.6. The determinant gives a surjective homomorphism det : GLn → R×.
We next recall the First Isomorphism Theorem:
Theorem 1.7. If φ : G → H is a surjective homomorphism of groups and N = {g ∈ G | φ(g) = 1} is the
kernel of φ, then:
(a) N is a normal subgroup of G; in other words, it contains 1, is closed under multiplication and
inversion, and satisfies gNg^{-1} = N for all g ∈ G.
(b) It follows that there is a quotient group G/N. The elements of G/N are the cosets of N. For each
coset C we can choose g ∈ G such that C = gN, but there will usually be many choices for g.
(c) There is a unique function φ̄ : G/N → H with φ̄(gN) = φ(g) for all g ∈ G.
(d) The function φ̄ is actually an isomorphism of groups.
Definition 1.8. We write
SLn = ker(det : GLn → R×) = {n × n matrices A such that det(A) = 1},
and call this the special linear group.
The First Isomorphism Theorem implies:
Proposition 1.9. SLn is a normal subgroup of GLn , and there is a natural isomorphism
\overline{det} : GLn/SLn → R×.
1.4. Orthogonal determinants.
Lemma 1.10. If A ∈ On then det(A) ∈ {1, −1} = {±1}.
Proof. det(A)^2 = det(A) det(A^T) = det(AA^T) = det(I) = 1.
Clearly {±1} is a subgroup of R×, and det gives a homomorphism from On to {±1}. Clearly
D_{-1}^T D_{-1} = D_{-1}^2 = I, so D_{-1} ∈ On , and det(D_{-1}) = −1, so our homomorphism det : On → {±1} is surjective.
Definition 1.11. We write
SOn = ker(det : On → {±1}) = {n × n orthogonal matrices A such that det(A) = 1},
and call this the special orthogonal group.
The First Isomorphism Theorem gives:
Proposition 1.12. SOn is a normal subgroup of On , and there is a natural isomorphism
\overline{det} : On/SOn → {±1}.
1.5. One dimension. A 1 × 1 matrix is just a number. Thus GL1 = R× , and O1 = {±1}. The determinant
map is just the identity, so SL1 = SO1 = {1}, the trivial group.
1.6. Two dimensions. Given an angle θ, we write c = cos(θ) and s = sin(θ) (so s^2 + c^2 = 1) and define
matrices as follows:
R_\theta = \begin{pmatrix} c & -s \\ s & c \end{pmatrix}, \qquad S_\theta = \begin{pmatrix} c & s \\ s & -c \end{pmatrix}.
It is easy to see that these are orthogonal, and that det(Rθ ) = 1 and det(Sθ ) = −1. Thus Rθ ∈ SO2 and
Sθ ∈ O2 \ SO2 .
Theorem 1.13. Any matrix A ∈ SO2 has the form Rθ for some θ. Any matrix A ∈ O2 \ SO2 has the form
Sθ for some θ.
Proof. Suppose A ∈ O2 . We have A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} for some a, b, c, d. As A is orthogonal we have I = A^T A, so
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} a & c \\ b & d \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} a^2 + c^2 & ab + cd \\ ab + cd & b^2 + d^2 \end{pmatrix},
so a^2 + c^2 = b^2 + d^2 = 1 and ab + cd = 0. In other words, the vectors u = (a, c) and v = (b, d) have
length one and are orthogonal to each other. As u is a unit vector, we have u = (cos(θ), sin(θ)) for some θ,
so a = cos(θ) and c = sin(θ). It is geometrically clear (see the diagram below) that the only unit vectors
orthogonal to u are (−c, a) = (− sin(θ), cos(θ)) and (c, −a) = (sin(θ), − cos(θ)). If v = (−c, a) we find that
A = R_θ, and if v = (c, −a) we find that A = S_θ. By equating determinants, we see that the first case must
occur if A ∈ SO2 , and the second case must occur if A ∈ O2 \ SO2 .
[Figure: the unit vector u at angle θ, together with the two unit vectors orthogonal to it.]
Proposition 1.14. We have Rθ .[r, φ] = [r, θ + φ], so Rθ represents an anticlockwise rotation through an
angle θ.
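Proof.
R_\theta . [r, \phi] = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} r\cos\phi \\ r\sin\phi \end{pmatrix}
= r \begin{pmatrix} \cos\theta\cos\phi - \sin\theta\sin\phi \\ \sin\theta\cos\phi + \cos\theta\sin\phi \end{pmatrix} = \begin{pmatrix} r\cos(\theta+\phi) \\ r\sin(\theta+\phi) \end{pmatrix} = [r, \theta + \phi].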
We also want to characterise Sθ geometrically. Firstly, a very similar calculation shows that Sθ .[r, φ] =
[r, θ − φ]. Now define u+ = [1, θ/2] and u− = [1, (θ + π)/2], so that u+ and u− are unit vectors and are
orthogonal to each other. Note also that −[r, φ] = [r, φ − π], so −u− = [1, (θ − π)/2]. We have
Sθ .u+ = [1, θ − θ/2] = [1, θ/2] = u+
Sθ .u− = [1, θ − θ/2 − π/2] = [1, (θ − π)/2] = −u− .
Thus u+ and u− are eigenvectors of Sθ with eigenvalues +1 and −1 respectively. This means that Sθ
represents reflection across the line through 0 and u+ . We summarise our conclusions as follows:
Proposition 1.15. We have Sθ .[r, φ] = [r, θ − φ], and Sθ represents reflection across a line L through 0 at
angle θ/2 to the x-axis.
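[Figure: the line L through 0 at angle θ/2 to the x-axis, with u_+ = S_θ u_+ on L and u_− perpendicular to L, sent to −u_− = S_θ u_−.]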
By working in polar coordinates, it is now easy to check the following facts:
R_θ = R_φ iff θ − φ ∈ 2πZ
S_θ = S_φ iff θ − φ ∈ 2πZ
R_θ R_φ = R_{θ+φ}
R_θ S_φ = S_{θ+φ}
S_θ R_φ = S_{θ−φ}
S_θ S_φ = R_{θ−φ}
R_θ^{-1} = R_{−θ}
S_θ^{-1} = S_θ
R_θ S_φ R_θ^{-1} = S_{φ+2θ}.
In particular, we have Rθ Rφ = Rφ Rθ , so the group SO2 is Abelian.
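The composition rules listed above can also be tested numerically. The following Python sketch is an illustration that is not part of the original notes: the helper names rot, refl, mul and close are our own, the two test angles are arbitrary, and equality is only checked up to rounding error.

```python
import math

def rot(theta):
    """Matrix of the rotation R_theta, as a tuple of rows."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))

def refl(theta):
    """Matrix of the reflection S_theta, as a tuple of rows."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, s), (s, -c))

def mul(A, B):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def close(A, B, tol=1e-12):
    """Entrywise comparison up to rounding error."""
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

theta, phi = 0.7, 1.9  # arbitrary test angles (an assumption of this sketch)
checks = {
    "R_t R_p = R_(t+p)": close(mul(rot(theta), rot(phi)), rot(theta + phi)),
    "R_t S_p = S_(t+p)": close(mul(rot(theta), refl(phi)), refl(theta + phi)),
    "S_t R_p = S_(t-p)": close(mul(refl(theta), rot(phi)), refl(theta - phi)),
    "S_t S_p = R_(t-p)": close(mul(refl(theta), refl(phi)), rot(theta - phi)),
    "R_t S_p R_t^-1 = S_(p+2t)": close(
        mul(mul(rot(theta), refl(phi)), rot(-theta)), refl(phi + 2 * theta)
    ),
}
for name, ok in checks.items():
    print(name, ok)
```

A numerical check of this kind is of course no substitute for the polar-coordinate argument, but it is a quick way to catch sign errors in the indices.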
We also have Sθ Sθ = Rθ−θ = R0 = I, so all reflections have order 2. A rotation Rθ has order dividing m
iff mθ is an integer multiple of 2π, iff θ = 2πr/m (mod 2πZ) for some r ∈ {0, 1, . . . , m − 1}. It has order
exactly m iff (r, m) = 1. Most rotations have infinite order.
Symm(X) = {A ∈ On | AX = X}.
It is clearly invariant under the reflections S0 (across the x-axis) and Sπ (across the y-axis), and also under
a half-turn (which is Rπ ). We have S0 Sπ = Sπ S0 = Rπ = R−π and R0 = I. The symmetry groups are
Symm(X) = {I, S0 , Sπ , Rπ }
Dir(X) = {I, Rπ }.
Now suppose that A is not a symmetry of X. Then AX is different from X, but it has the same
shape and thus is “just as symmetrical” as X. However, it is not true (as one might naively think) that
Symm(AX) = Symm(X); instead, Symm(AX) is conjugate to Symm(X). The slogan is that “conjugacy is
doing the same thing somewhere else”.
Proposition 1.17. For any X ⊆ Rn and A ∈ On we have Symm(AX) = A Symm(X) A^{-1} and Dir(AX) =
A Dir(X) A^{-1}.
Proof. If B ∈ Symm(X) then BX = X so (ABA^{-1})(AX) = ABX = AX, which shows that ABA^{-1} ∈
Symm(AX). Thus A Symm(X) A^{-1} ⊆ Symm(AX). Conversely, suppose that C ∈ Symm(AX). If we put
B = A^{-1}CA, then a similar argument shows that B ∈ Symm(X). Thus, the matrix C = ABA^{-1} lies in
A Symm(X) A^{-1}, proving that Symm(AX) ⊆ A Symm(X) A^{-1} as required.
The argument for Dir(X) is the same. It works even if A ∉ SOn , because
det(ABA^{-1}) = det(A) det(B) det(A)^{-1} = det(B).
For another example of this sort of phenomenon let L be a line through the origin, and let S_L be the
reflection across L. If L has angle φ to the x-axis, then S_L = S_{2φ}. The line R_θ L has angle θ + φ to the x-axis,
so S_{R_θ L} = S_{2(θ+φ)}. On the other hand, from our formulae for compositions of reflections and rotations, we
see that R_θ S_{2φ} R_θ^{-1} = S_{2φ+2θ}. In summary, we have:
Proposition 1.18. For any rotation R ∈ SO2 and any line L through the origin in R2 we have R S_L R^{-1} = S_{RL}.
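Example 1.19.
[Figure: the set X with the axes of the reflections S_0 and S_π, and the rotated set R_{π/6} X with the axes of S_{π/3} and S_{4π/3}.]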
We can see directly that
Symm(X) = {I, S0 , Sπ , Rπ }
Symm(Rπ/6 X) = {I, Sπ/3 , S4π/3 , Rπ }
We also have
R_{π/6} I R_{π/6}^{-1} = I
R_{π/6} S_0 R_{π/6}^{-1} = S_{π/3}
R_{π/6} S_π R_{π/6}^{-1} = S_{4π/3}
R_{π/6} R_π R_{π/6}^{-1} = R_π.
This shows that Symm(R_{π/6} X) = R_{π/6} Symm(X) R_{π/6}^{-1}, illustrating Proposition 1.17. Also, if we let L denote
the long axis of X we see that S_0 = S_L and S_{π/3} is the reflection in the long axis of R_{π/6} X, which is R_{π/6} L.
This illustrates Proposition 1.18.
2. Polygons
2.1. Cyclic and dihedral groups. Fix an integer n > 0. For k = 0, . . . , n − 1 we put
vk = [1, 2πk/n] = (cos(2πk/n), sin(2πk/n)).
We then let Xn be the regular n-gon with vertices v0 , . . . , vn−1 . In the case n = 1 this is to be interpreted
as the line segment from (0, 0) to v0 = (1, 0).
[Figure: the polygons X1, X2, X3, X4 and X5.]
We also define
Cn = Dir(Xn )
Dn = Symm(Xn )
R = R2π/n = 1/n-turn around the origin
S = S0 = reflection across the x-axis .
We call Cn the cyclic group, and Dn the dihedral group.
Theorem 2.1. We have
Cn = {R^i | 0 ≤ i < n}
Dn = {R^i | 0 ≤ i < n} ∪ {R^i S | 0 ≤ i < n}.
Proof. First, it is clear that R ∈ Cn and S ∈ Dn , so Cn ⊇ {R^i | 0 ≤ i < n} and Dn ⊇ {R^i | 0 ≤ i <
n} ∪ {R^i S | 0 ≤ i < n}. Suppose that A ∈ Cn . Then Av_0 ∈ Xn and ‖Av_0‖ = ‖v_0‖ = 1. However, it is
easy to see that the only vectors in Xn of length 1 are the vertices, so Av_0 = v_i for some i with 0 ≤ i < n.
This means that the matrix A′ := R^{-i} A satisfies A′v_0 = v_0. Also, A′ is a rotation, and the only way a
rotation of the plane can have a nonzero fixed point is if it is the identity. Thus A′ = I, so A = R^i. Thus
Cn = {R^i | 0 ≤ i < n} as claimed.
Now suppose that B ∈ Dn . If B ∈ Cn then B = R^i for some i by the above. If B ∉ Cn then det(B) = −1,
so BS ∈ Dn and det(BS) = det(B) det(S) = 1, so BS = R^i for some i. This means that B = BSS = R^i S,
which proves the claim about Dn .
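To make the theorem concrete, the following Python sketch lists the 2n matrices R^i and R^i S and checks that each of them permutes the vertices of Xn. It is an illustration only: the function names are our own, the rule R^i S_0 = S_{2πi/n} is taken from the composition formulae of Section 1.6, and vertices are compared up to a small tolerance.

```python
import math

def rot(theta):
    """Matrix of the rotation R_theta."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))

def refl(theta):
    """Matrix of the reflection S_theta."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, s), (s, -c))

def apply(A, v):
    """Apply a 2x2 matrix to a vector."""
    return (A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1])

def dihedral(n):
    """The 2n matrices R^i and R^i S, where R = R_{2 pi / n} and S = S_0."""
    step = 2 * math.pi / n
    # R^i = R_{i*step}, and R^i S_0 = S_{i*step} by the composition rules.
    return [rot(i * step) for i in range(n)] + [refl(i * step) for i in range(n)]

def vertices(n):
    """The vertices v_0, ..., v_{n-1} of the regular n-gon X_n."""
    return [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
            for k in range(n)]

def permutes_vertices(A, n, tol=1e-9):
    """True if A sends every vertex of X_n to a vertex of X_n."""
    vs = vertices(n)
    return all(any(math.dist(apply(A, v), w) < tol for w in vs) for v in vs)

print(len(dihedral(5)), all(permutes_vertices(A, 5) for A in dihedral(5)))
```

As each matrix is invertible, sending vertices to vertices is the same as permuting them. A generic rotation such as rot(0.3) fails the test for n = 5, as one would expect.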
Remark 2.2. Because SOn is normal in On , we see that Cn is normal in Dn . It is easy to see that
Dn/Cn ≅ {±1}.
2.2. The classification of subgroups.
Proposition 2.3. Let G be a finite subgroup of SO2 . Then G = Cn for some n.
Proof. Let θ be the smallest angle in the range (0, 2π] such that R_θ ∈ G. I claim that θ = 2π/n for some
n, and that G = Cn . To see this, let φ be any angle such that φ ≥ 0 and R_φ ∈ G. Let k be the largest
integer such that kθ ≤ φ and put ψ = φ − kθ. We then have 0 ≤ ψ < θ ≤ 2π, and R_ψ = R_φ R_θ^{-k} ∈ G. If
ψ were positive, it would be an angle in (0, 2π] smaller than θ with R_ψ ∈ G, contradicting our definition
of θ, so we must have ψ = 0. Thus φ = kθ
and R_φ = R_θ^k. This shows that the elements of G are precisely the powers of R_θ.
In particular, we have R_{2π} = I ∈ G, so we can apply the above argument with φ = 2π and deduce that
2π = nθ for some n > 0, so θ = 2π/n. Thus G consists of the powers of R_{2π/n}, in other words G = Cn .
Theorem 2.4. Let G be a finite subgroup of O2 . Then either G = Cn = Dir(Xn) for some n, or G =
R_θ Dn R_θ^{-1} = Symm(R_θ Xn) for some n and θ.
Proof. Put H = G ∩ SO2 ; Proposition 2.3 tells us that H = Cn for some n. If G ≤ SO2 then G = H = Cn .
Otherwise G contains some reflection, say S_{2θ} ∈ G. If A ∈ G then either
(a) det(A) = 1, so A ∈ H and A = R_{2π/n}^k for some k; or
(b) det(A) = −1 so A S_{2θ} ∈ G and det(A S_{2θ}) = 1 so A S_{2θ} = R_{2π/n}^k for some k so A = R_{2π/n}^k S_{2θ}.
Next, note that R_θ^{-1} R_{2π/n}^k R_θ = R_{2π/n}^k and R_θ^{-1} S_{2θ} R_θ = S_0. It follows that the group G′ := R_θ^{-1} G R_θ
consists of the elements R_{2π/n}^k and R_{2π/n}^k S_0, or in other words G′ = Dn . Thus G = R_θ G′ R_θ^{-1} = R_θ Dn R_θ^{-1},
as required.
3. Affine isometries
Definition 3.1. An isometry of Rn is a function f : Rn → Rn of the form f(x) = Ax + a for some orthogonal
matrix A ∈ On and some vector a ∈ Rn . We write Isomn for the set of all such functions.
Remark 3.2. If we have an isometry f (x) = Ax + a as above, then d(f (x), f (y)) = d(x, y) for all x and y
in Rn , or in other words, f preserves distances. To see this, note that
d(f(x), f(y)) = ‖(Ax + a) − (Ay + a)‖ = ‖A(x − y)‖ = ‖x − y‖ = d(x, y).
(At the third step we used the fact that A is an orthogonal matrix, so ‖Az‖ = ‖z‖ for any vector z ∈ Rn .)
Remark 3.3. Let f : Rn → Rn be any function that preserves distances. It can be shown that there is a
matrix A ∈ On and a vector a ∈ Rn such that f (x) = Ax + a for all x, so f ∈ Isomn . (The proof takes
about a page and a half, but we will not give it here.) This means that Definition 3.1 is compatible with the
general definition of isometries for metric spaces.
Remark 3.4. Suppose we have isometries f (x) = Ax + a and g(x) = Bx + b. Then
f(g(x)) = (AB)x + (Ab + a)
f^{-1}(x) = A^{-1}x + (−A^{-1}a).
We have AB ∈ On and Ab + a ∈ Rn so f ◦ g ∈ Isomn . Similarly A^{-1} ∈ On and −A^{-1}a ∈ Rn so f^{-1} ∈ Isomn .
This shows that Isomn is a group under composition. We will usually write f g instead of f ◦ g, and write 1
for the identity map.
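The formulae of Remark 3.4 translate directly into code. The following Python sketch is an illustration rather than part of the notes: the class name Isometry and its methods are our own, matrices are stored as lists of rows, and orthogonality of A is assumed rather than checked (the inverse uses A^{-1} = A^T).

```python
def mat_vec(A, x):
    """Matrix times vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def mat_mul(A, B):
    """Matrix times matrix."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

class Isometry:
    """The map x -> A x + a, with A assumed to be orthogonal."""

    def __init__(self, A, a):
        self.A, self.a = A, a

    def __call__(self, x):
        return [y + ai for y, ai in zip(mat_vec(self.A, x), self.a)]

    def compose(self, g):
        """The composite f o g, with linear part AB and translation part Ab + a."""
        shift = [y + ai for y, ai in zip(mat_vec(self.A, g.a), self.a)]
        return Isometry(mat_mul(self.A, g.A), shift)

    def inverse(self):
        """The inverse x -> A^{-1} x - A^{-1} a, using A^{-1} = A^T."""
        At = transpose(self.A)
        return Isometry(At, [-y for y in mat_vec(At, self.a)])

f = Isometry([[0, -1], [1, 0]], [2, 3])    # a quarter turn followed by a shift
g = Isometry([[1, 0], [0, -1]], [-1, 5])   # a reflection followed by a shift
x = [4, 7]
print(f.compose(g)(x), f(g(x)))            # both give [4, 6]
print(f.inverse()(f(x)))                   # gives back [4, 7]
```

Integer entries are used in the example so that the comparisons are exact; with general angles one would compare up to rounding error instead.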
We will not distinguish between a matrix A ∈ On and the corresponding isometry f (x) = Ax. We thus
think of On as a subgroup of Isomn .
For any a ∈ Rn we have an isometry Ta defined by Ta (x) = x + a; this is called a translation. We clearly
have T_a T_b = T_{a+b} and T_a^{-1} = T_{−a}. It follows that the translations form an abelian subgroup Transn ≤ Isomn .
Using the correspondence Ta ↔ a we can identify Transn with Rn .
Definition 3.5. Given an isometry f(x) = Ax + a, we define ψ(f) = A ∈ On and det(f) = det(A) =
det(ψ(f)) ∈ {1, −1}. This gives functions ψ : Isomn → On and det : Isomn → {1, −1}.
Proposition 3.6. The map ψ is a surjective homomorphism with kernel Transn , and thus it induces an
isomorphism Isomn/Transn ≅ On . Moreover, det : Isomn → {±1} is also a homomorphism.
Proof. Suppose that f(x) = Ax + a and g(x) = Bx + b. Remark 3.4 gives f(g(x)) = (AB)x + (Ab + a),
which shows that ψ(fg) = AB = ψ(f)ψ(g). This shows that ψ is a homomorphism, and it follows that
det : Isomn → {±1} is also a homomorphism. For any A ∈ On we can define an isometry f by f(x) = Ax
and then ψ(f) = A, which shows that ψ is surjective. We have ψ(f) = I iff f(x) = x + a for all x, iff f
is a translation, so ker(ψ) is the translation subgroup Transn . It now follows from the First Isomorphism
Theorem that Isomn/Transn ≅ On .
Thus, f T_b = T_{Ab} f, and we can multiply by f^{-1} on the right to get f T_b f^{-1} = T_{Ab} = T_{ψ(f)b}.
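The conjugation rule f T_b f^{-1} = T_{Ab} can be checked numerically in the plane. The Python sketch below is only an illustration under stated assumptions: it takes f(y) = R_θ y + a with a rotation as linear part, the helper names are our own, and the particular values of θ, a, b and the sample points are arbitrary.

```python
import math

def rot(theta):
    """Matrix of the rotation R_theta."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))

def apply(A, x):
    """Apply a 2x2 matrix to a vector."""
    return (A[0][0] * x[0] + A[0][1] * x[1], A[1][0] * x[0] + A[1][1] * x[1])

def conjugated_translation(theta, a, b, x):
    """Evaluate f T_b f^{-1} at x, where f(y) = R_theta y + a."""
    A, A_inv = rot(theta), rot(-theta)
    y = apply(A_inv, (x[0] - a[0], x[1] - a[1]))   # f^{-1}(x)
    y = (y[0] + b[0], y[1] + b[1])                 # then T_b
    z = apply(A, y)                                # then f
    return (z[0] + a[0], z[1] + a[1])

theta, a, b = 1.1, (0.5, -2.0), (3.0, 1.0)         # arbitrary sample data
Ab = apply(rot(theta), b)
shifts = []
for x in [(0.0, 0.0), (1.0, 2.0), (-4.0, 0.5)]:
    fx = conjugated_translation(theta, a, b, x)
    shifts.append((fx[0] - x[0], fx[1] - x[1]))
print(shifts, Ab)   # every shift should agree with Ab up to rounding
```

The displacement does not depend on the point x or on the translation part a of f, only on the linear part A, which is exactly what the formula says.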
Example 3.9. Let X be the subset of R2 illustrated below. It extends infinitely in all directions, and the
distance between adjacent faces is one unit.
If we shift X by n units to the right and m units up, we just get X again (assuming that n and m are
integers). In other words, T(n,m) X = X, so T(n,m) ∈ Isom(X). In fact, one can check that these are the only
symmetries, so Isom(X) = {T(n,m) | (n, m) ∈ Z2 }.
We conclude this section by giving a simple criterion for when an isometry is the identity.
Definition 3.10. A list u0 , . . . , un of n+1 points in Rn is in general position if the vectors u1 −u0 , . . . , un −u0
form a basis of Rn .
Proposition 3.11. If u0 , . . . , un are in general position, f ∈ Isomn and f (ui ) = ui for all i, then f = 1.
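Proof. We have f(x) = Ax + b for some A ∈ On and b ∈ Rn . As f fixes each ui , we have A(ui − u0) = f(ui) − f(u0) = ui − u0
for all i. As the vectors ui − u0 form a basis, we deduce that A = I, so f(x) = x + b for all x. In particular,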
u0 = f (u0 ) = u0 + b, so b = 0. Thus f (x) = x for all x as claimed.
4. Plane isometries
We next define some special types of isometries of R2 .
(1) For any a ∈ R2 and any angle θ, we put R_{θ,a} = T_a R_θ T_{−a}, so that
R_{θ,a}(x) = a + R_θ(x − a).
Note that R_{θ,a}(a + x) = a + R_θ x, which means that R_{θ,a} is the rotation through angle θ around a.
If θ is not a multiple of 2π then for all x ≠ 0 we have R_θ x ≠ x; it follows that a is the unique fixed
point of R_{θ,a}. Note also that ψ(R_{θ,a}) = R_θ.
(2) For any line L ⊂ R2 (not necessarily passing through the origin) we let S_L be the reflection across L.
If L has angle θ/2 to the x-axis and a ∈ L one checks that S_L = T_a S_θ T_{−a}. We also have ψ(S_L) = S_{2α},
where α is the angle between L and the x-axis.
(3) For any line L ⊂ R2 and any vector b that is parallel to L, we define G_{L,b} = T_b S_L. It is not hard to
check geometrically that G_{L,b} = S_L T_b also, and it follows that
G_{L,b}^2 = T_b S_L S_L T_b = T_{2b}.
[Figure: a point x, the vector b along L, and the point T_b x.]
Clearly G_{L,0} = S_L. Maps of the form G_{L,b} with b ≠ 0 are called glide-reflections. We have
ψ(G_{L,b}) = S_{2α}, where α is the angle between L and the x-axis.
Proposition 4.1. For any f ∈ Isom2 , precisely one of the following holds:
(a) f = 1
(b) f = Ta for some a ∈ R2 \ {0}
(c) f = Rθ,a for some a ∈ R2 and θ ∈ (0, 2π)
(d) f = S_L for some line L ⊂ R2
(e) f = GL,b for some L and some nonzero vector b parallel to L.
Proof. We know that there exists a matrix A ∈ O2 and a vector b such that f(x) = Ax + b for all x. If A = I
then we are in case (a) (if b = 0) or case (b) (if b ≠ 0). We may thus assume that A ≠ I.
If A is a rotation we have A = R_θ for some θ ∈ (0, 2π). As A is a nontrivial rotation, for all x ≠ 0 we have
x ≠ Ax so (I − A)x ≠ 0. Thus, the kernel of I − A is zero, so I − A is invertible. Put a = (I − A)^{-1} b, so
that b = a − Aa. Then
Rθ,a x = Ta AT−a x = A(x − a) + a = Ax + a − Aa = Ax + b = f (x),
so f = Rθ,a .
Now suppose instead that A is a reflection, say A = S_θ. As before we put u_+ = [1, θ/2] and u_− =
[1, (θ + π)/2], so u_+ and u_− are unit vectors and are orthogonal to each other. We can write any vector x
in the form x_+ + x_−, where x_± is a multiple of u_±, and then Ax = x_+ − x_−. It follows that
f(x) = Ax + b = x_+ − x_− + b_+ + b_− = (b_+ + x_+) + (b_− − x_−).
Now let L be the line through b_−/2 at angle θ/2 to the x-axis. We can write any vector x as (x_+ + ½b_−) +
(x_− − ½b_−), where (x_+ + ½b_−) ∈ L and (x_− − ½b_−) is orthogonal to L. It follows that
S_L x = (x_+ + ½b_−) − (x_− − ½b_−) = x_+ + b_− − x_−,
and thus that
G_{L,b_+} x = b_+ + x_+ + b_− − x_− = f(x).
Thus, f = G_{L,b_+}, so f is a reflection (if b_+ = 0) or a glide-reflection (if b_+ ≠ 0).
Remark 4.2. Suppose we have an isometry f , and we want to know where it falls in the above classification.
One can check using the above proof that the following method will work.
(a) Find the matrix A = ψ(f ) ∈ O2 .
(b) If A is the identity, then f = Tu for some u. To find u, let x be any point for which one can easily
find f (x), and then u = f (x) − x.
(c) Now suppose that ψ(f ) = Rθ for some angle θ ∈ (0, 2π). Then there is a unique point a such that
f (a) = a, and it works out that f = Rθ,a .
(d) Suppose instead that ψ(f ) = Sθ for some θ. Then we choose a point x for which we can easily
calculate f (f (x)), and put u = (f (f (x)) − x)/2. We then put L = {x | f (x) = x + u}. It works out
that L is always a line parallel to u, and that f = GL,u (if u 6= 0) or f = SL (if u = 0).
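The method of Remark 4.2 can be sketched in Python as follows. This is an illustration under our own conventions, not code from the notes: the isometry is given as a pair (A, b) with f(x) = Ax + b, the function name classify and the shape of the returned tuples are hypothetical, and A is assumed to be orthogonal. In step (d) we take x = 0, so that f(f(x)) = Ab + b, and we use the observation that the point b/2 satisfies f(b/2) = b/2 + u and therefore lies on L.

```python
import math

def classify(A, b, tol=1e-9):
    """Classify the plane isometry x -> A x + b, where A is a 2x2 orthogonal matrix."""
    (p, q), (r, s) = A
    det = p * s - q * r
    if det > 0:
        if abs(p - 1) < tol and abs(r) < tol:          # A is the identity
            if math.hypot(*b) < tol:
                return ("identity",)
            return ("translation", b)
        theta = math.atan2(r, p) % (2 * math.pi)       # A = R_theta
        # The fixed point a solves (I - A) a = b; we use Cramer's rule.
        m = ((1 - p, -q), (-r, 1 - s))
        d = m[0][0] * m[1][1] - m[0][1] * m[1][0]
        a = ((b[0] * m[1][1] - m[0][1] * b[1]) / d,
             (m[0][0] * b[1] - m[1][0] * b[0]) / d)
        return ("rotation", theta, a)
    # Here det < 0, so A = S_theta, and with x = 0 we get u = (A b + b) / 2.
    Ab = (p * b[0] + q * b[1], r * b[0] + s * b[1])
    u = ((Ab[0] + b[0]) / 2, (Ab[1] + b[1]) / 2)
    theta = math.atan2(r, p) % (2 * math.pi)
    point_on_L = (b[0] / 2, b[1] / 2)                  # satisfies f(x) = x + u
    kind = "reflection" if math.hypot(*u) < tol else "glide-reflection"
    return (kind, theta / 2, point_on_L, u)

# A quarter turn about (1, 0): f(x) = A(x - a) + a with a = (1, 0), so b = (1, -1).
print(classify(((0, -1), (1, 0)), (1, -1)))
# The map (x, y) -> (x + 2, -y): a glide along the x-axis.
print(classify(((1, 0), (0, -1)), (2, 0)))
```

In the reflection cases the returned data are the angle of L to the x-axis, one point on L, and the glide vector u.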
4.1. Subgroups with no translations.
Theorem 4.3. Let H be a subgroup of Isom2 , and suppose that H contains no translations (other than
the trivial translation T0 = 1). Then there is a point a ∈ R2 such that f (a) = a for all f ∈ H, and thus
H ≤ T_a O2 T_a^{-1}.
This theorem implies a classification of finite subgroups of Isom2 , as will be explained in Corollary 4.5.
The proof relies on the following lemma.
We will first prove the theorem using the lemma, then we will prove the lemma.
Proof of Theorem 4.3. I first claim that H contains no glide-reflections. Indeed, if G_{L,b} ∈ H then G_{L,b}^2 ∈ H
but G_{L,b}^2 = T_{2b} and 2b ≠ 0, contrary to our assumption about H. Thus every element of H is either the
identity, a nontrivial rotation, or a reflection.
Now suppose that H contains a nontrivial rotation R_{a,θ}. Because this is nontrivial we have R_θ(x) ≠ x
for all x ≠ 0, so (1 − R_θ)(x) ≠ 0, so 1 − R_θ is invertible. I claim that f(a) = a for all f ∈ H. This is clear if
f = 1. If f is a nontrivial rotation, say f = R_{b,φ}, then we note that the element g = R_{a,θ} R_{b,φ} R_{a,θ}^{-1} R_{b,φ}^{-1} also
lies in H. Part (b) of the lemma tells us that g = T_d, where d = (1 − R_φ)(1 − R_θ)(a − b). As H contains no
nontrivial translations, we have d = 0. As 1 − R_θ and 1 − R_φ are invertible, we must have a − b = 0, and so
a = b, so f = R_{a,φ}. Thus the element f = R_{a,φ} has f(a) = a as claimed.
Now suppose instead that f is a reflection, say f = S_L. We then note that the element h = (R_{a,θ} S_L)^2
also lies in H. Part (a) of the lemma tells us that h = T_c, where c = (1 − R_θ)(a − S_L(a)). It follows that
c = 0, and 1 − R_θ is invertible so a − S_L(a) = 0, so S_L(a) = a. Thus f(a) = S_L(a) = a as required.
This proves the theorem when H contains a nontrivial rotation. Now suppose instead that H contains
only reflections and the identity map. I claim that H contains at most one reflection. If not, let SK and SL
be two different reflections in H, so SL SK also lies in H. We see from parts (c) and (d) of the lemma that
SL SK is either a nontrivial translation or a nontrivial rotation, giving a contradiction. It follows that H is
either the trivial group {1} or a group of the form {1, SL } for some line L. In the first case we can take a
to be any point at all, and in the second case a can be any point on L.
Proof of Lemma 4.4. We first check the general type of the various isometries considered, using the method
described in Remark 4.2.
(a) Clearly ψ(R_{a,θ} S_L) is a rotation times a reflection, which is another reflection. Every reflection in O2
squares to the identity, so ψ((R_{a,θ} S_L)^2) = 1, so (R_{a,θ} S_L)^2 = T_c for some c.
(b) We have
ψ(R_{a,θ} R_{b,φ} R_{a,θ}^{-1} R_{b,φ}^{-1}) = R_θ R_φ R_{−θ} R_{−φ} = R_{θ+φ−θ−φ} = 1,
so
R_{a,θ} R_{b,φ} R_{a,θ}^{-1} R_{b,φ}^{-1} = T_d
for some d.
(c) As L and K are parallel, they have the same angle with the x-axis, say α. We thus have
ψ(S_L S_K) = S_{2α} S_{2α} = 1,
so S_L S_K = T_e for some e.
(d) Here ψ(S_L S_K) is a product of two different reflections in O2 , so it is a rotation, say R_φ. This means
that S_L S_K = R_{a,φ} for some a and φ.
We next find the details.
(a) To find c, we choose any convenient point x, and then c will be c = (R_{a,θ} S_L)^2(x) − x. We will take
x = S_L(a). We then have S_L(x) = a so R_{a,θ} S_L(x) = R_{a,θ}(a) = a so
(R_{a,θ} S_L)^2(x) = R_{a,θ} S_L(a)
= R_θ S_L(a) + (1 − R_θ)a
c = (R_{a,θ} S_L)^2(x) − x
= R_θ S_L(a) + (1 − R_θ)a − S_L(a)
= (1 − R_θ)a + (R_θ − 1)S_L(a)
= (1 − R_θ)(a − S_L(a)).
(b) Put f = R_{a,θ} R_{b,φ} R_{a,θ}^{-1} R_{b,φ}^{-1}. We have seen that ψ(f) = 1, so f = T_d for some d. To find d, we choose
any convenient point x, and then d will be f(x) − x. We will take x = R_{b,φ} R_{a,θ}(b), so
f(x) = R_{a,θ} R_{b,φ} R_{a,θ}^{-1} R_{b,φ}^{-1} R_{b,φ} R_{a,θ}(b)
= R_{a,θ} R_{b,φ}(b)
= R_{a,θ}(b)
= R_θ b + (1 − R_θ)a.
(At the third step we used the fact that R_{b,φ} is a rotation around b, so it sends b to itself.)
We also have
x = R_{b,φ} R_{a,θ}(b)
= R_{b,φ}(R_θ b + (1 − R_θ)a)
= R_φ R_θ b + R_φ(1 − R_θ)a + (1 − R_φ)b.
By subtracting these, we get
d = f(x) − x
= R_θ b + (1 − R_θ)a − R_φ R_θ b − R_φ(1 − R_θ)a − (1 − R_φ)b
= (1 − R_φ)R_θ b + (1 − R_φ)(1 − R_θ)a − (1 − R_φ)b
= (1 − R_φ)(1 − R_θ)(a − b).
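The formulae for c and d in parts (a) and (b) can be tested numerically. In the Python sketch below we identify R2 with the complex numbers, which is an assumption of the sketch and not the notation of the notes: the rotation R_{a,θ} becomes z ↦ a + e^{iθ}(z − a), the reflection in the line through p at angle α becomes z ↦ p + e^{2iα} conj(z − p), and 1 − R_θ acts as multiplication by 1 − e^{iθ}. All function names and sample values are our own.

```python
import cmath

def rotation(a, theta):
    """R_{a,theta}: rotation through theta about the point a."""
    w = cmath.exp(1j * theta)
    return lambda z: a + w * (z - a)

def reflection(p, alpha):
    """S_L for the line L through p at angle alpha to the x-axis."""
    w = cmath.exp(2j * alpha)
    return lambda z: p + w * (z - p).conjugate()

# Arbitrary sample data.
a, theta = 1 + 2j, 0.8
b, phi = -0.5 + 0.3j, 2.1
S_L = reflection(0.7 - 1j, 0.4)
R_a, R_b = rotation(a, theta), rotation(b, phi)
R_a_inv, R_b_inv = rotation(a, -theta), rotation(b, -phi)

def h(z):
    """The square (R_{a,theta} S_L)^2."""
    return R_a(S_L(R_a(S_L(z))))

def g(z):
    """The commutator R_{a,theta} R_{b,phi} R_{a,theta}^{-1} R_{b,phi}^{-1}."""
    return R_a(R_b(R_a_inv(R_b_inv(z))))

c = (1 - cmath.exp(1j * theta)) * (a - S_L(a))
d = (1 - cmath.exp(1j * phi)) * (1 - cmath.exp(1j * theta)) * (a - b)

points = [0j, 1 + 1j, -3 + 0.5j]
part_a_ok = all(abs(h(z) - z - c) < 1e-12 for z in points)
part_b_ok = all(abs(g(z) - z - d) < 1e-12 for z in points)
print(part_a_ok, part_b_ok)
```

Both maps displace every sample point by the same vector, as they must if they are the translations T_c and T_d.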
(c) We know that SL SK = Te for some e. Choose any point x on the line K, so SK (x) = x. We then
have e = SL SK (x) − x = SL (x) − x.
If we move away from x towards L in a direction perpendicular to K and L, we will eventually
reach L. In other words, there is a vector u perpendicular to K and L such that the point y := x + u
lies in L. As L is parallel to K, it is easy to see that L = K + u. Moreover, SL (x) is the reflection
of x across L, which is just x + 2u. It follows that e = SL (x) − x = 2u as claimed.
[Figure: the parallel lines K and L, with x on K, y = x + u on L, and S_L x = x + 2u.]
(d) Let K and L be lines that are not parallel. It is geometrically clear that they meet in a unique point,
which we call a. Let α be the angle between the x-axis and K, measured anticlockwise from the axis.
Let θ be the angle between K and L, measured anticlockwise from K, so that θ ∈ (0, π). Clearly L
is obtained by rotating K around a through an angle of θ, in other words L = R_{a,θ} K. We also have
S_K = T_a S_{2α} T_{−a} and S_L = T_a S_{2α+2θ} T_{−a} and S_{2α+2θ} S_{2α} = R_{2θ} so S_L S_K = T_a R_{2θ} T_{−a} = R_{a,2θ}.
Corollary 4.5. Let H be a finite subgroup of Isom2 . Then either H = T_a Cn T_a^{-1} for some a and n, or
H = T_a R_θ Dn R_θ^{-1} T_a^{-1} for some a, n and θ.
Proof. Every element of H has finite order, and thus cannot be a nontrivial translation. It follows from
Theorem 4.3 that H ≤ T_a O2 T_a^{-1} for some a, so the group H′ := T_a^{-1} H T_a is contained in O2 . Theorem 2.4 tells
us that H′ has the form Cn or R_θ Dn R_θ^{-1} and clearly H = T_a H′ T_a^{-1}. The claim follows.
5. Wallpaper
In this section we study symmetry groups of “wallpaper patterns”, which for our purposes will mean
“reasonable” subsets of R2 which are translationally symmetric in two different directions. (I say “reasonable”
to exclude sets like Q2 ; we will be more precise later.) The real importance of this study (and its three-
dimensional analogue) is in the physical chemistry of crystals: the symmetry group of a crystal is a useful
tool in studying the way it vibrates, refracts X-rays, and so on.
The simplest wallpaper group was discussed in Example 3.9. It turns out that there are precisely 17 types
of wallpaper up to a suitable notion of equivalence. Here we will analyse a small selection of these types,
and prove some of the key results in the general classification.
We start with some general concepts.
Definition 5.1. For any subgroup H ≤ Isom2 , the point group of H is the subgroup ψ(H) = {ψ(h) | h ∈
H} ≤ O2 , where ψ is as in Section 3.1. We also write Trans(H) = {a ∈ R2 | Ta ∈ H} and call this the
translation subgroup of H.
For any point a ∈ R2 , we also define σ_a(H) = {A ∈ O2 | T_a A T_a^{-1} ∈ H}, which is a subgroup of O2 . This
is the part of H that encodes the rotational and reflectional symmetry about a.
Proposition 5.2. For any a ∈ R2 we have σ_a(H) ⊆ ψ(H).
Proof. If A ∈ σ_a(H) then T_a A T_a^{-1} ∈ H, so ψ(T_a)ψ(A)ψ(T_a)^{-1} ∈ ψ(H). We have ψ(T_a) = I and ψ(A) = A,
so A ∈ ψ(H), as required.
Definition 5.3. Let G be a group, and let x1 , . . . , xr be elements of G. We say that these elements generate
G if every element g ∈ G can be expressed in terms of the elements xi , say
g = x_{i_1}^{n_1} x_{i_2}^{n_2} · · · x_{i_s}^{n_s}
for some indices i_1, . . . , i_s and integers n_1, . . . , n_s.
Equivalently, the xi generate G iff the only subgroup of G containing all the xi is G itself.
We will be interested in finding small sets of generators for some of the wallpaper groups.
5.1. The group p4g. Let M be the figure shown on the left below, and let M′ be its mirror image, as
shown on the right.
[Figure: the motif M and its mirror image M′.]
Remark 5.6. We have ψ(T1 ) = ψ(T2 ) = I and ψ(R) = R and ψ(G) = S0 . The group Isom(X) is generated
by T1 , T2 , R and G, so ψ(Isom(X)) is generated by R and S0 , so ψ(Isom(X)) = D4 . On the other hand,
one checks that for each a ∈ R2 , the group σa (Isom(X)) is either C1 , C2 or C4 . In particular, there is no
point a for which σa (Isom(X)) = ψ(Isom(X)).
5.2. The group p4m. Let C_{n,m} denote the circle of radius 1/3 centred at (n, m), and let X denote the
union of all the circles C_{n,m} for (n, m) ∈ Z2 .
[Figure: the set X, drawn with coordinate axes.]
5.3. The group p6m. Put u = (1, 0) and v = R_{π/3}(u) = (1/2, √3/2), so that 0, u and v are the vertices
of an equilateral triangle of side 1. Let C_{n,m} be a circle of radius 1/3 centred at nu + mv, and let X be the
union of the circles C_{n,m}.
[Figure: the set X, drawn with coordinate axes.]
This can be analysed in much the same way as the previous example. We find that Isom(X) is generated
by T_u, T_v, R_{π/3} and S_0. The point group is D6 , which is the same as σ_0(Isom(X)).
5.4. Steps towards the classification. We will adopt the following definition.
Definition 5.7. A wallpaper group or two-dimensional crystallographic group is a subgroup H ≤ Isom2 such
that
(a) ψ(H) is finite.
(b) There exist linearly independent vectors u, v ∈ Trans(H) such that every vector in Trans(H) can be
written as nu + mv for some n, m ∈ Z.
It is usual to use a somewhat different definition, which can be shown to be equivalent to that given above.
Let H be a wallpaper group. We say that H is oriented if ψ(H) ≤ SO2 ; if so, we know from Theorem 2.4
that ψ(H) = Cn for some n. We call n the rotational order of H.
Now suppose that H is not oriented, so ψ(H) = R_θ Dn R_θ^{-1} for some n and θ. We again call n the
rotational order of H.
Lemma 5.8. If A ∈ ψ(H) and b ∈ Trans(H) ⊂ R2 then Ab ∈ Trans(H).
Proof. As b ∈ Trans(H) we have Tb ∈ H. As A ∈ ψ(H), there is an element f ∈ H of the form f (x) =
Ax + c for some c. It follows that f T_b f^{-1} ∈ H, and we see from Proposition 3.7 that f T_b f^{-1} = T_{Ab}, so
Ab ∈ Trans(H).
To explain what the next lemma is about, consider the group V ≤ R2 consisting of vectors of the form
n(−1, 0) + m(√2, 0) with n, m ∈ Z. We can choose a rational number n/m which is a very good (but not
perfect) rational approximation to √2, and we find that n(−1, 0) + m(√2, 0) is very small (but nonzero).
By making this precise, we find that for any ε > 0 there exists v ∈ V \ {0} such that ‖v‖ < ε. Thus, there
is no shortest vector in V \ {0}. This phenomenon can only happen because (−1, 0) and (√2, 0) are linearly
dependent vectors; in particular, it does not occur in Trans(H). The point of the next lemma is to prove
this.
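The phenomenon described above is easy to observe numerically. The following Python sketch is an illustration only; it uses the standard continued-fraction convergents n/m = 1/1, 3/2, 7/5, 17/12, . . . of √2, generated by the recurrence (n, m) ↦ (n + 2m, n + m), and the function name small_vectors is our own. It records the first coordinate m√2 − n of the vector n(−1, 0) + m(√2, 0).

```python
import math

def small_vectors(count):
    """Triples (n, m, m*sqrt(2) - n) for the first few convergents n/m of sqrt(2)."""
    n, m = 1, 1
    out = []
    for _ in range(count):
        out.append((n, m, m * math.sqrt(2) - n))
        n, m = n + 2 * m, n + m      # next convergent of sqrt(2)
    return out

for n, m, value in small_vectors(8):
    print(n, m, abs(value))
```

The printed lengths decrease steadily but never reach zero, so V \ {0} contains vectors that are as short as we please. Floating point arithmetic limits how far the experiment can be pushed, so only the first few convergents are used.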
Lemma 5.9. If H is a wallpaper group then there exists w ∈ Trans(H) \ {0} such that ‖b‖ ≥ ‖w‖ for all
b ∈ Trans(H) \ {0}.
Proof. Let u and v be as in Definition 5.7. We claim that there is a constant K > 0 such that
‖nu + mv‖ ≥ √(n^2 + m^2)/K
for all n, m ∈ Z.
To see this, define f : [0, 2π] → R by f(θ) = ‖cos(θ)u + sin(θ)v‖. As u and v are linearly independent we
have cos(θ)u + sin(θ)v ≠ 0 and thus f(θ) > 0 for all θ. It follows that 1/f is a positive continuous function
on the closed interval [0, 2π], so 1/f is bounded by some number K > 0, so f(θ) ≥ 1/K for all θ. Now, for
any n and m we can write (n, m) = r(cos(θ), sin(θ)) for some θ, where r = √(n^2 + m^2). This means that
‖nu + mv‖ = r‖cos(θ)u + sin(θ)v‖ = r f(θ) ≥ r/K = √(n^2 + m^2)/K,
as claimed.
Now consider a disc D of radius R centred at the origin, and put S = (Trans(H) \ {0}) ∩ D, the set of
nonzero vectors in Trans(H) of length at most R. We choose R large enough that D contains at least one
of the nonzero points in Trans(H), so S ≠ ∅. If nu + mv ∈ S then R ≥ ‖nu + mv‖ ≥ √(n^2 + m^2)/K, so
|n|, |m| ≤ RK. This means that there are only finitely many possibilities for n and m, so there are only
finitely many points in S. Among this finite list of points, we choose one that is as close as possible to zero,
and call it w. This clearly has the required property.
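The proof is effectively an algorithm, and the following Python sketch follows it step by step. It is an illustration under stated assumptions: the function name shortest_vector is our own, u and v are assumed to be linearly independent, we take R = ‖u‖ so that the disc certainly contains a nonzero lattice point, and instead of an abstract bound K we use the minimum of f, which is the square root of the smaller eigenvalue of the Gram matrix of u and v.

```python
import math

def shortest_vector(u, v):
    """Return (n, m, w) with w = n*u + m*v a shortest nonzero vector of the lattice.

    u and v are assumed to be linearly independent vectors in R^2.
    """
    uu = u[0] * u[0] + u[1] * u[1]
    vv = v[0] * v[0] + v[1] * v[1]
    uv = u[0] * v[0] + u[1] * v[1]
    # f(theta)^2 is a quadratic form in (cos, sin) whose matrix is the Gram
    # matrix of u and v, so min f is the square root of its smaller eigenvalue.
    # This number plays the role of 1/K in the proof.
    tr, det = uu + vv, uu * vv - uv * uv
    f_min = math.sqrt((tr - math.sqrt(max(tr * tr - 4 * det, 0.0))) / 2)
    R = math.sqrt(uu)                      # the disc of radius R contains u
    bound = math.floor(R / f_min) + 1      # |n|, |m| <= R*K; the +1 absorbs rounding
    best = None
    for n in range(-bound, bound + 1):
        for m in range(-bound, bound + 1):
            if n == 0 and m == 0:
                continue
            w = (n * u[0] + m * v[0], n * u[1] + m * v[1])
            if best is None or math.hypot(*w) < math.hypot(*best[2]):
                best = (n, m, w)
    return best

n, m, w = shortest_vector((3, 1), (5, 2))
print(n, m, w, math.hypot(*w))
```

The vectors (3, 1) and (5, 2) generate all of Z2, so the shortest nonzero vector found has length 1. The shortest vector is never unique (w and −w have the same length), so the particular n and m returned depend on the search order.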
Proof. Let n be the rotational order, so the element R := R_{2π/n} lies in ψ(H). Let w ∈ Trans(H) be as in
Lemma 5.9. Lemma 5.8 tells us that R(w) ∈ Trans(H) and Trans(H) is a subgroup of R2 so R(w) − w ∈
Trans(H). If n > 1 this vector is nonzero, so ‖R(w) − w‖ ≥ ‖w‖ by the definition of w. However, for any x
and any θ ∈ [0, 2π] we have ‖R_θ(x) − x‖ =
2 sin(θ/2)‖x‖, as we see from the diagram below.
[Figure: the isosceles triangle with vertices 0, x and R_θ x, with two sides of length ‖x‖ and the angle θ at 0 bisected into two angles θ/2.]
It follows that ‖R(w) − w‖ = 2 sin(π/n)‖w‖, so we must have 2 sin(π/n) ≥ 1, so sin(π/n) ≥ 1/2 = sin(π/6),
so n ≤ 6.
All that is left is to show that the case n = 5 leads to a contradiction, which we do by a variation of the
preceding argument. Clearly w + R^{-2}w ∈ Trans(H), but if n = 5 then −R^{-2}w = R_π R_{−4π/5} w = R_{π/5} w so
‖w + R^{-2}w‖ = ‖w − R_{π/5} w‖ = 2 sin(π/10)‖w‖ < ‖w‖, which contradicts our choice of w, as required.
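The two length comparisons used in this proof can be tabulated for small n. The Python sketch below is an illustration, not part of the notes: it takes w = 1 in the complex plane (an assumption of the sketch), so that the rotation R = R_{2π/n} is multiplication by e^{2πi/n}, and it applies both tests to every n, using the fact that a vector of Trans(H) must be either zero or at least as long as w. The function names and the tolerance are our own.

```python
import cmath
import math

def first_ratio(n):
    """The ratio |R(w) - w| / |w| = 2 sin(pi/n), for R = R_{2 pi / n}."""
    return abs(cmath.exp(2j * math.pi / n) - 1)

def second_ratio(n):
    """The ratio |w + R^{-2} w| / |w|, for R = R_{2 pi / n}."""
    return abs(1 + cmath.exp(-4j * math.pi / n))

def consistent(ratio, tol=1e-12):
    """A vector of Trans(H) is zero or has length at least |w|."""
    return ratio < tol or ratio >= 1 - tol

allowed = [n for n in range(1, 25)
           if consistent(first_ratio(n)) and consistent(second_ratio(n))]
print(allowed)   # the rotational orders that survive both tests
```

The tolerance is needed because the borderline cases, such as 2 sin(π/6) = 1, are computed in floating point. Only the orders up to 24 are tested here; for larger n the first ratio 2 sin(π/n) is even smaller, as in the proof.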
Proposition 5.11. Suppose that H has rotational order n, where n ∈ {3, 4, 6}. Let w be as in Lemma 5.9,
and put x = R2π/n (w). Then Trans(H) = {pw + qx | p, q ∈ Z}.
Proof. Put L = {pw + qx | p, q ∈ Z} ≤ Trans(H) and r = kwk. I claim that for each a ∈ R2 , there exists
b ∈ L such that d(a, b) < r. Assuming this, when a ∈ Trans(H) we have a − b ∈ Trans(H) and ‖a − b‖ < r
so a − b = 0 by our choice of w, so a = b; this proves that Trans(H) = L as required.
To prove the claim, we first consider the case n = 4, where w and x are orthogonal. After a suitable
change of coordinates we have w = (r, 0) and x = (0, r), and the claim is that every point in R2 lies in
the open ball of radius r centred at (pr, qr) for some p, q ∈ Z. This should be geometrically clear from the
following diagram.
For an algebraic proof, note that {w, x} is a basis for R2 , so any vector a ∈ R2 can certainly be written in
the form pw + qx for some p, q ∈ R. We can choose p′, q′ ∈ Z with |p − p′| ≤ 1/2 and |q − q′| ≤ 1/2, and then
put b = p′w + q′x ∈ L. We then have a − b = (p − p′)w + (q − q′)x and w and x are orthogonal so
‖a − b‖^2 = (p − p′)^2 r^2 + (q − q′)^2 r^2 ≤ r^2/4 + r^2/4 = r^2/2,
so ‖a − b‖ < r as required.
We next turn to the case n = 6. It should be clear from the way the previous case worked that the value
of r is irrelevant, so we assume that r = 1. We may also change coordinates and assume that w = (1, 0), so
x = R2π/6 w = (1/2, √3/2). The lattice L consists of the dots in the following diagram:
Each of the triangles is equilateral with side 1, and every point in such a triangle lies at distance < 1 from
at least one of the vertices. (In fact, if T is an equilateral triangle of side 1 with vertices A, B and C and
X ∈ T then the distances d(A, X), d(B, X) and d(C, X) are all less than one unless X is itself a vertex; in
the exceptional case, of course X lies at distance 0 from one of the vertices.) This settles the case n = 6.
Finally, we treat the case n = 3. With assumptions as in the case n = 6, we have w = (1, 0) and
x = R2π/3 (w) = (−1/2, √3/2). Put y = R2π/6 (w) = (1/2, √3/2) and notice that y = x + w and x = y − w.
This shows that every integer combination of w and x is an integer combination of w and y, and vice versa.
This means that the lattice for the n = 3 case is exactly the same as for the n = 6 case, so again every point
in R2 is at distance < 1 from a lattice point.
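The claim in this proof, that every point of the plane lies at distance less than r from a lattice point, can be tested numerically with r = 1. The sketch below is only an illustration under stated assumptions: it samples a grid in the fundamental parallelogram (enough by periodicity), and the function names and the grid size are hypothetical choices rather than anything from the notes.

```python
import math

def min_distance_to_lattice(a, w, x, reach=3):
    """Distance from the point a to the nearest p*w + q*x with |p|, |q| <= reach."""
    return min(
        math.hypot(a[0] - (p * w[0] + q * x[0]), a[1] - (p * w[1] + q * x[1]))
        for p in range(-reach, reach + 1)
        for q in range(-reach, reach + 1)
    )

def worst_case(w, x, steps=60):
    """Largest distance to the lattice over grid points s*w + t*x with 0 <= s, t < 1."""
    worst = 0.0
    for i in range(steps):
        for j in range(steps):
            s, t = i / steps, j / steps
            a = (s * w[0] + t * x[0], s * w[1] + t * x[1])
            worst = max(worst, min_distance_to_lattice(a, w, x))
    return worst

w = (1.0, 0.0)
square = worst_case(w, (0.0, 1.0))                  # the case n = 4
hexagonal = worst_case(w, (0.5, math.sqrt(3) / 2))  # the cases n = 6 and n = 3
```

On this grid the worst distances are about 0.707 for the square lattice and about 0.577 for the hexagonal lattice, both comfortably below 1.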
We now see that ψ(H) is conjugate to Cn or Dn where n ∈ {1, 2, 3, 4, 6}, which gives twelve possibilities
for ψ(H). In the cases n ≥ 3 we have strong information about Trans(H). Even if we know ψ(H) and
Trans(H) there may be more than one possibility for H, as exemplified by the difference between p4g and
p4m. Nonetheless, we are well on the way to the complete classification of wallpaper groups.
6. Polyhedra
We now turn to the study of symmetries in three dimensions. In this context we will not consider
translations, so we are really just looking at subgroups of O3 . It will turn out that this is strongly related
to the theory of regular polyhedra, otherwise known as Platonic solids.
6.1. Actions of groups on sets. In our study of subgroups of O3 (and in later sections of the course) it
will be helpful to think about actions of groups on sets.
Definition 6.1. Let G be a group and X a set. An action of G on X is a rule which assigns to each element
g ∈ G and each element x ∈ X an element g ∗ x ∈ X, such that
A1 1 ∗ x = x for all x ∈ X
A2 g ∗ (h ∗ x) = (gh) ∗ x for all g, h ∈ G and x ∈ X.
We will often write gx for g ∗ x.
Example 6.3. Consider the group G = D4 = {1, R, R2 , R3 , S, RS, R2 S, R3 S}, where R = Rπ/2 and S = S0 .
Let L0 be the line with equation x = y, and let L1 be the line with equation x = −y. One checks that
S(L0 ) = L1 and S(L1 ) = L0 , and similarly R(L0 ) = L1 and R(L1 ) = L0 . It follows that for each g ∈ D4 we
either have g(L0 ) = L0 or g(L0 ) = L1 , and similarly we either have g(L1 ) = L1 or g(L1 ) = L0 . Thus, if we
put X = {L0 , L1 } then D4 acts on X.
Example 6.4. Let G be any group. For any g, x ∈ G we define g ∗ x = gxg⁻¹. This satisfies 1 ∗ x = x and
g ∗ (h ∗ x) = g(hxh⁻¹)g⁻¹ = (gh)x(gh)⁻¹ = (gh) ∗ x.
Thus, we have an action of G on itself, called the conjugation action. In this case it would of course be a
mistake to write gx instead of g ∗ x.
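As a concrete check of Example 6.4, the following sketch verifies axioms A1 and A2 for the conjugation action in the small group S3. Permutations are stored as tuples, and all function names here are illustrative assumptions rather than notation from the notes.

```python
from itertools import permutations

def compose(g, h):
    """Composite permutation: apply h first, then g."""
    return tuple(g[h[i]] for i in range(len(h)))

def inverse(g):
    """Inverse permutation of g."""
    inv = [0] * len(g)
    for i, gi in enumerate(g):
        inv[gi] = i
    return tuple(inv)

def conj(g, x):
    """The conjugation action g * x = g x g^{-1}."""
    return compose(compose(g, x), inverse(g))

G = list(permutations(range(3)))  # the symmetric group S3
identity = tuple(range(3))

axiom_A1 = all(conj(identity, x) == x for x in G)
axiom_A2 = all(conj(g, conj(h, x)) == conj(compose(g, h), x)
               for g in G for h in G for x in G)

# The product gx usually differs from g * x, so the shorthand gx would be misleading.
differs = any(compose(g, x) != conj(g, x) for g in G for x in G)
```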
so φ(gh) = φ(g) ◦ φ(h). In particular, we have φ(g)φ(g⁻¹) = φ(1) = 1, and similarly φ(g⁻¹)φ(g) = 1. Thus
φ(g) is a bijection, with inverse φ(g⁻¹). We have thus defined a homomorphism φ : G → S(X). Conversely,
if we start with a homomorphism φ : G → S(X) we can define an action by g ∗ x = φ(g)(x). Thus, actions
of G on X are essentially the same as homomorphisms from G to S(X).
Example 6.6. Let V = {v0 , v1 , v2 , v3 , v4 } be the set of vertices of the standard pentagon, so the group D5
acts on V, giving a homomorphism φ : D5 → S(V). If we write R = R2π/5 and S = S0 as usual then
φ(R)(v0 ) = v1
φ(R)(v1 ) = v2
φ(R)(v2 ) = v3
φ(R)(v3 ) = v4
φ(R)(v4 ) = v0 .
We can write this in cycle notation as φ(R) = (v0 v1 v2 v3 v4 ). If we identify V with {0, 1, 2, 3, 4} in the
obvious way then φ(R) becomes the permutation (0 1 2 3 4). Similarly, we have φ(S) = (1 4)(2 3).
[Diagram: the standard pentagon with vertices v0, . . . , v4, the rotation angle 2π/5 and the axis of the reflection S0.]
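The permutations φ(R) and φ(S) of Example 6.6 can be recomputed from coordinates. This is a minimal sketch, assuming that the standard pentagon has vertices vk = (cos(2kπ/5), sin(2kπ/5)) and that S0 is the reflection across the x-axis; the helper `induced_permutation` is a hypothetical name.

```python
import math

N = 5
# Assumed coordinates of the vertices v_0, ..., v_4 of the standard pentagon.
V = [(math.cos(2 * math.pi * k / N), math.sin(2 * math.pi * k / N)) for k in range(N)]

def R(p):
    """Rotation through 2*pi/5 about the origin."""
    c, s = math.cos(2 * math.pi / N), math.sin(2 * math.pi / N)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])

def S(p):
    """Reflection across the x-axis (assumed to be S_0)."""
    return (p[0], -p[1])

def induced_permutation(g):
    """Return the list [j_0, ..., j_4] with g(v_k) = v_{j_k}."""
    def index_of(p):
        return next(j for j, v in enumerate(V) if math.dist(p, v) < 1e-9)
    return [index_of(g(v)) for v in V]

phi_R = induced_permutation(R)  # the 5-cycle (0 1 2 3 4)
phi_S = induced_permutation(S)  # the product (1 4)(2 3)
```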
6.2. Rotations and axes. We have already seen a very simple and concrete description of the elements of
SO2; they are just the rotations Rθ for 0 ≤ θ < 2π. Our next task is to see how far this generalises to SO3,
or to SOn for n > 3.
Proposition 6.7. If A ∈ SOn and n is odd then 1 is an eigenvalue of A.
Proof. We have AT = A−1 , so
AT (A − I) = I − AT = −(A − I)T .
For any n × n matrix B we have det(B T ) = det(B) and det(−B) = (−1)n det(B) = − det(B) (as n is odd).
We can thus take determinants in the displayed equation to get
det(A) det(A − I) = − det(A − I).
As A ∈ SOn we have det(A) = 1 so det(A − I) = − det(A − I), so det(A − I) = 0 as required.
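Proposition 6.7 can be illustrated numerically for n = 3. The sketch below builds a sample element of SO3 as a product of rotations about coordinate axes (an arbitrary choice made here for illustration) and checks that det(A) is 1 and det(A − I) is 0 up to rounding.

```python
import math

def matmul(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def rot_x(t):
    """Rotation through t about the x-axis."""
    c, s = math.cos(t), math.sin(t)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_z(t):
    """Rotation through t about the z-axis."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

# A sample element of SO3; the three angles are arbitrary.
A = matmul(rot_z(0.7), matmul(rot_x(1.9), rot_z(-2.3)))
A_minus_I = [[A[i][j] - (1 if i == j else 0) for j in range(3)] for i in range(3)]

det_A = det3(A)                  # approximately 1
det_A_minus_I = det3(A_minus_I)  # approximately 0, so 1 is an eigenvalue of A
```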
Corollary 6.8. If A ∈ SO3 then there is an orthonormal basis {u, v, w} of R3 and an angle θ such that
Au = u
Av = cos(θ)v + sin(θ)w
Aw = − sin(θ)v + cos(θ)w.
Thus, A is conjugate in O3 to a matrix of the form
     [ 1     0         0     ]
Uθ = [ 0   cos(θ)   −sin(θ)  ]
     [ 0   sin(θ)    cos(θ)  ].
Proof. As 1 is an eigenvalue, there is a vector u0 ≠ 0 such that Au0 = u0. Put u = u0/‖u0‖, so ‖u‖ = 1
and Au = u. Let v be any unit vector perpendicular to u, and let w be either of the two unit vectors that
are perpendicular to the plane spanned by u and v. As Au = u and A preserves inner products, we have
⟨Av, u⟩ = ⟨Av, Au⟩ = ⟨v, u⟩ = 0, so Av is perpendicular to u. It is clear that v and w form a basis for the
plane perpendicular to u, so Av = cv + sw for some c, s ∈ R. Moreover, we have
1 = ‖v‖² = ‖Av‖² = ⟨cv + sw, cv + sw⟩ = c² + s²,
so we have (c, s) = (cos(θ), sin(θ)) for some θ. Similarly, we have Aw = c′v + s′w for some c′, s′ with
(c′)² + (s′)² = 1. As ⟨v, w⟩ = 0 we have ⟨Av, Aw⟩ = 0 and thus cc′ + ss′ = 0. Thus (c′, s′) is a unit
vector in R2 which is orthogonal to (c, s); one sees easily that the only possibilities are (c′, s′) = (−s, c) and
(c′, s′) = (s, −c). For the moment we simply assume that (c′, s′) = (−s, c); we will explain later why the
other case is impossible. Define β : R3 → R3 by β(x, y, z) = xu + yv + zw. As u, v and w are orthonormal
we see that
‖β(x, y, z)‖² = ⟨xu + yv + zw, xu + yv + zw⟩ = x² + y² + z² = ‖(x, y, z)‖².
Thus β is a norm-preserving linear map, so the corresponding matrix B is orthogonal. We have
AB(x, y, z) = A(xu + yv + zw)
= xAu + yAv + zAw
= xu + y(cv + sw) + z(−sv + cw)
= xu + (cy − sz)v + (sy + cz)w
= BUθ (x, y, z),
so B⁻¹AB = Uθ. Thus A is conjugate to Uθ in O3, as claimed.
Now suppose instead that (c′, s′) = (s, −c). Then we would have B⁻¹AB = U′θ, where U′θ is obtained from
Uθ by multiplying the last column by −1. However, we have A ∈ SO3 by assumption, so det(B⁻¹AB) =
det(B)⁻¹ det(A) det(B) = 1. We see by direct calculation that det(U′θ) = −1, and this gives a contradiction.
Thus we must have (c′, s′) = (−s, c) after all.
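Corollary 6.8 suggests a practical way to recover the axis u and the angle θ from a matrix A ∈ SO3. The sketch below relies on two standard facts that are not stated in the notes and are flagged here as assumptions: the trace is unchanged by conjugation, so tr(A) = tr(Uθ) = 1 + 2 cos(θ), and the antisymmetric part of A encodes the vector 2 sin(θ)u when B ∈ SO3. It assumes sin(θ) ≠ 0, and the function names are hypothetical.

```python
import math

def matmul(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    """Transpose of a matrix given as nested lists."""
    return [list(row) for row in zip(*A)]

def U(theta):
    """The matrix U_theta: rotation through theta about the x-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_z(t):
    """Rotation through t about the z-axis, used here as the matrix B."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def axis_and_angle(A):
    """Unit axis u and angle theta in (0, pi) of a rotation matrix A with sin(theta) != 0."""
    trace = A[0][0] + A[1][1] + A[2][2]
    theta = math.acos(max(-1.0, min(1.0, (trace - 1) / 2)))
    k = (A[2][1] - A[1][2], A[0][2] - A[2][0], A[1][0] - A[0][1])  # equals 2 sin(theta) u
    norm = math.sqrt(sum(c * c for c in k))
    return tuple(c / norm for c in k), theta

# Conjugate U_theta by B; the axis should be B applied to (1, 0, 0).
B = rot_z(0.4)
A = matmul(B, matmul(U(1.1), transpose(B)))
axis, theta = axis_and_angle(A)
```

Here `axis` comes out close to (cos 0.4, sin 0.4, 0) and `theta` close to 1.1, matching the construction.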
Proposition 6.9. Suppose that A ∈ SO3 and that there are two linearly independent vectors u and v such
that Au = u and Av = v. Then A = I.
Proof. If A is not the identity, then it must be a nontrivial rotation, around an axis L say. This means that
A fixes all the points on L, and moves all other points. As Au = u and Av = v, we see that u and v must
both lie on the line L. This is impossible, because they are linearly independent.
Proposition 6.10. If G ≤ O3 and −1 ∈ G and H = G ∩ SO3 then G = H × {±1}.
Proof. Define µ : H × {±1} → G by µ(A, t) = tA. As multiplication by any number commutes with
multiplication by any matrix, we have
µ(A, t)µ(A′, t′) = tAt′A′ = tt′AA′ = µ(AA′, tt′),
so µ is a homomorphism. Suppose that µ(A, t) = I; then either A = I and t = 1 or A = −I and t = −1,
but the second case is impossible because −I ∉ SO3. This shows that ker(µ) is the trivial group, so µ is
injective. Next consider an element B ∈ G. If det(B) = 1 then B ∈ H so B = µ(B, 1) so B is in the image
of µ. If det(B) = −1 then −B ∈ G (because B and −1 both lie in G) and det(−B) = 1 so −B ∈ H. We
also have B = µ(−B, −1), so we again see that B is in the image of µ. This shows that µ is surjective as
well as injective, so it is an isomorphism of groups.
Corollary 6.11. In particular, we have O3 = SO3 × {±1} as groups.
6.3. Symmetries of the tetrahedron. Let Tet be a regular tetrahedron centred at the origin, whose edges
have length 1, and let v1 , . . . , v4 be the vertices of Tet.
[Diagram: the tetrahedron Tet with vertices labelled 1 to 4.]
The action of Symm(Tet) on the vertices gives rise to a homomorphism φ : Symm(Tet) → S4. For example,
let g be a 1/3-twist about the z-axis, anticlockwise as seen from above. Then g fixes v1 and sends v2 to v3 ,
v3 to v4 and v4 back to v2 . Thus φ(g) is the 3-cycle (2 3 4).
Theorem 6.12. The homomorphism φ : Symm(Tet) → S4 is an isomorphism, and it also gives an
isomorphism Dir(Tet) → A4.
Proof. Given any pair of vertices vi , vj , let vk and vl be the two remaining vertices, and let P be the plane
through vk, vl and (vi + vj)/2. Let A be the reflection across P (if n is a unit normal to P then A is given
by Ax = x − 2⟨n, x⟩n). We find that Avi = vj and Avj = vi, and that vk and vl are fixed by A. Thus φ(A)
is the transposition (i j). The diagram below illustrates the case i = 3, j = 4.
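[Diagram: the tetrahedron with vertices labelled 1 to 4 and the reflection plane P for the case i = 3, j = 4.]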
The image of φ is a subgroup of S4 containing all the transpositions, and any permutation can be written
as a product of transpositions, so the image is all of S4 , so φ is surjective. If A ∈ ker(φ) then Avi = vi for
all i. It is easy to see that {v1 , v2 , v3 } is a basis of R3 so we can conclude that A = I. This proves that φ
is injective as well as surjective, so it is an isomorphism. We have seen that φ⁻¹ sends each transposition
to a reflection, so it sends any product of n transpositions to a product of n reflections, and we see that
det(φ⁻¹(σ)) = sgn(σ) for all σ ∈ S4, so φ⁻¹ carries A4 to Dir(Tet). By putting σ = φ(g) we deduce that
sgn(φ(g)) = det(g), so φ carries Dir(Tet) to A4. Thus φ gives an isomorphism Dir(Tet) ≅ A4 as claimed.
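The reflections used in this proof can be computed exactly. The sketch below is an illustration under an explicit assumption: it uses the tetrahedron whose vertices are four alternate vertices of the cube (±1, ±1, ±1), which has edge length 2√2 rather than 1, and it indexes the vertices from 0. The normal n = vi − vj is not a unit vector, so the formula divides by ⟨n, n⟩.

```python
from fractions import Fraction
from itertools import combinations

# Assumed coordinates: a regular tetrahedron centred at the origin.
V = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]

def dot(a, b):
    """Standard inner product."""
    return sum(s * t for s, t in zip(a, b))

def reflect(n, x):
    """Reflection x - 2<n, x>n / <n, n> across the plane through 0 with normal n."""
    coeff = Fraction(2 * dot(n, x), dot(n, n))
    return tuple(xi - coeff * ni for xi, ni in zip(x, n))

def vertex_permutation(i, j):
    """Permutation of vertex indices induced by the reflection swapping V[i] and V[j]."""
    n = tuple(a - b for a, b in zip(V[i], V[j]))
    return [V.index(reflect(n, v)) for v in V]

def transposition(i, j):
    """The transposition (i j) as a list of images of 0, 1, 2, 3."""
    images = list(range(4))
    images[i], images[j] = j, i
    return images

all_transpositions = all(vertex_permutation(i, j) == transposition(i, j)
                         for i, j in combinations(range(4), 2))
```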
Remark 6.13. If g is a half turn around the axis shown on the left, then φ(g) is the permutation (1 2)(3 4).
If h is a one-third turn around the axis shown on the right, turning anticlockwise as seen from above, then
φ(h) = (2 3 4). Note that this rotation looks clockwise when seen from below.
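[Diagrams: two copies of the tetrahedron with vertices labelled 1 to 4, showing the axis of the half turn g (left) and the axis of the one-third turn h (right).]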
6.4. Symmetries of the cube. We now study the symmetries of a cube. We take our standard cube to
have vertices (±1, ±1, ±1), so the centre is at (0, 0, 0) and the edges have length 2.
[Diagram: the cube with vertices labelled 1 to 8.]
We have marked the vertices so that the vertex labelled i is opposite the one labelled i + 4, which will be
convenient later.
Note that (x, y, z) lies in the cube if and only if (−x, −y, −z) does, so −1 ∈ Symm(Cube) (the corresponding
thing is not true for the tetrahedron). We therefore see from Proposition 6.10 that Symm(Cube) =
{±1} × Dir(Cube), so we will focus attention on Dir(Cube).
The action of Dir(Cube) on the eight vertices gives rise to an injective homomorphism Dir(Cube) → S8,
but it turns out that this is far from being surjective. In fact, | Dir(Cube)| = 4! = 24 whereas |S8 | = 8! =
40320, so the image of our homomorphism is a rather small subgroup of S8 . We therefore use a different
approach to study Dir(Cube). Let L1 , L2 , L3 and L4 be the four long diagonals of the cube, as shown below.
[Diagram: the cube with its four long diagonals L1, . . . , L4.]
Proof. Suppose that g ∈ Dir(Cube) and that φ(g) = 1; we must show that g = 1. Because φ(g) = 1 we
have g(Li ) = Li for all i. In particular, we have v1 ∈ L1 so g(v1 ) ∈ g(L1 ) = L1 , so either g(v1 ) = v1 or
g(v1) = v5 = −v1. Thus g(v1) = ε1 v1 for some ε1 ∈ {1, −1}, and similarly we have g(vi) = εi vi for some
εi ∈ {1, −1} for i = 2, 3, 4.
Now suppose that ε1 = ε2 = ε3 = −1, so g(v1) = −v1, g(v2) = −v2 and g(v3) = −v3. As v1, v2 and v3 are
linearly independent (they do not all lie in any plane through the origin), they form a basis of R3. Given
this, it is clear that g = −1, so det(g) = −1, contradicting the assumption that g ∈ Dir(Cube) ≤ SO3. So
we cannot have ε1 = ε2 = ε3 = −1 after all.
More generally, any three of {v1, v2, v3, v4} form a basis, so no three of the ε's can be −1. Thus at most
two of the ε's are −1, so at least two of them are +1, say εi = εj = 1 with i ≠ j. This means that g(vi) = vi
and g(vj ) = vj , so g has two linearly independent fixed points. If g were a nontrivial rotation then all
the fixed points would lie on the axis and thus any two would be linearly dependent. Thus g must be the
identity.
Proof. Let g be a half turn around the axis shown on the left below. It is clear that g exchanges L3 and L4 .
The line L1 is perpendicular to the axis of g, so when we perform the half turn we send L1 to itself, just
reversing the direction. Thus g(L1 ) = L1 . Similarly, we have g(L2 ) = L2 and so φ(g) = (3 4).
[Diagrams: three copies of the cube with its diagonals labelled 1 to 4, each showing the axis of a half turn.]
Similarly, if we do a half turn about the other two axes we get the transpositions (2 3) and (1 2). The
transpositions (1 2), (2 3) and (3 4) lie in the image of φ and generate S4 , so φ is surjective. We have already
seen that it is injective, so it must be an isomorphism.
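The isomorphism between Dir(Cube) and S4 can be confirmed by brute force. The sketch below rests on an assumption not made explicit in the notes: for the cube with vertices (±1, ±1, ±1), the rotational symmetries are exactly the signed permutation matrices of determinant 1. The labelling of the diagonals in `DIAGONALS` is a hypothetical choice and need not match the labelling in the diagrams.

```python
from itertools import permutations, product

def signed_permutation_matrices():
    """All 48 matrices with exactly one entry ±1 in each row and each column."""
    for perm in permutations(range(3)):
        for signs in product((1, -1), repeat=3):
            yield tuple(tuple(signs[i] if perm[i] == j else 0 for j in range(3))
                        for i in range(3))

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def apply(M, v):
    """Matrix times vector."""
    return tuple(sum(M[i][j] * v[j] for j in range(3)) for i in range(3))

# One vertex on each long diagonal; the diagonal also contains the negative vertex.
DIAGONALS = [(1, 1, 1), (-1, 1, 1), (1, -1, 1), (1, 1, -1)]

def diagonal_permutation(M):
    """Tuple (j_0, ..., j_3) such that M sends the i-th diagonal to the j_i-th one."""
    images = []
    for d in DIAGONALS:
        w = apply(M, d)
        if w not in DIAGONALS:
            w = tuple(-c for c in w)  # use the other end of the same diagonal
        images.append(DIAGONALS.index(w))
    return tuple(images)

rotations = [M for M in signed_permutation_matrices() if det3(M) == 1]
images = {diagonal_permutation(M) for M in rotations}
```

There are 24 rotations and they induce 24 distinct permutations of the diagonals, so the map to S4 is a bijection in this model.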
Remark 6.16. Let h be a one-third turn about L4 , rotating clockwise as seen from above.
[Diagram: the cube with its diagonals labelled 1 to 4 and the vertices v1, v2, v5, v6 and v7 marked.]
v1 ↦ v6 ↦ v3 ↦ v1
v5 ↦ v2 ↦ v7 ↦ v5
We then have φ(k) = (1 2 3 4) and φ(k 2 ) = (1 3)(2 4). We have thus found rotations giving representatives
of all the cycle types in S4 .
It turns out that there is a close relationship (called “duality”) between the cube and the octahedron. As
illustrated in the picture on the left, the vertices of the octahedron are the centres of the faces of the cube.
To see this algebraically, note that the vertices of the top face of the cube are (1, 1, 1), (−1, 1, 1), (1, −1, 1)
and (−1, −1, 1). Thus, the centre of the top face is
(1/4)((1, 1, 1) + (−1, 1, 1) + (1, −1, 1) + (−1, −1, 1)) = (0, 0, 1).
This is just the top vertex of the octahedron. The calculation for the other faces follows the same pattern.
On the other hand, the centres of the faces of the octahedron are the vertices of a cube one third as big
as the one we started with, as illustrated in the picture on the right.
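The second half of the duality statement can be checked with exact arithmetic. In the sketch below the octahedron is assumed to have vertices ±e1, ±e2, ±e3 (the face centres of the cube computed above), with one triangular face in each octant; the function name is an illustrative assumption.

```python
from fractions import Fraction
from itertools import product

def octahedron_face_centres():
    """Centres of the eight faces of the octahedron with vertices ±e1, ±e2, ±e3."""
    centres = []
    for sx, sy, sz in product((1, -1), repeat=3):
        # The face in the octant with signs (sx, sy, sz).
        face = [(sx, 0, 0), (0, sy, 0), (0, 0, sz)]
        centres.append(tuple(Fraction(sum(v[i] for v in face), 3) for i in range(3)))
    return centres

centres = octahedron_face_centres()
cube_vertices = set(product((1, -1), repeat=3))

# Scaling the face centres by 3 should recover the vertices of the original cube.
scaled = {tuple(3 * c for c in centre) for centre in centres}
```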
Proposition 7.1. The group Symm(Oct) is the same as Symm(Cube) (and thus is isomorphic to S4 ×{±1}).
Proof. Suppose that g ∈ Symm(Cube). Let w be a vertex of the octahedron. Then w is the centre of some
face F of the cube. As g is a symmetry of the cube, gF is another face, and gw is the centre of gF , so gw is
a vertex of the octahedron. Thus g sends vertices of the octahedron to vertices, and it follows that it sends
the octahedron to itself. Thus Symm(Cube) ⊆ Symm(Oct).
Now suppose that h ∈ Symm(Oct). Let v be a vertex of the large cube, so v/3 is a vertex of the small
cube, so v/3 is the centre of some face F′ of the octahedron. As h is a symmetry of the octahedron, hF′ is
another face, and h(v/3) is the centre of hF′, so h(v)/3 = h(v/3) is a vertex of the small cube, so h(v) is a
vertex of the large cube. Thus h sends vertices of the cube to vertices, and it follows that it sends the cube
to itself. Thus Symm(Oct) ⊆ Symm(Cube).
Remark 7.2. You might hope that a similar picture would give interesting information about the tetrahedron.
However, the centres of the faces of a tetrahedron are just the vertices of a smaller tetrahedron, as
illustrated below, so we just conclude that the two different tetrahedra have the same symmetry group.
For an algebraic approach to this, note that the centre of the tetrahedron is (v1 + v2 + v3 + v4 )/4, but by
assumption the centre is at the origin, so we must have v1 + v2 + v3 + v4 = 0. The vertices of the face
opposite v1 are v2 , v3 and v4 so the centre of the face is (v2 + v3 + v4 )/3 = −v1 /3. More generally, the centre
of the face opposite vk is −vk /3, and the points −v1 /3, −v2 /3, −v3 /3 and −v4 /3 clearly form a tetrahedron
one third as big as the one we started with.
The rest of this section will constitute the proof; a picture of the dodecahedron is shown below.
We will construct the dodecahedron by attaching “tents” to a cube as shown below. We have only shown
two tents here but eventually we will use six tents, one for each face.
The edges of the cube will have length d; later on we will work out exactly what d has to be. The tents
will be as shown below, with dotted edges of length d and solid edges of length 1.
The next diagram shows the result of attaching tents to cubes of three different sizes.
Note that we have a bent pentagon with the thick line cutting across it. If d is small as shown on the
left, then the pentagon is bent outwards along the thick line. If d is too large as shown on the right, then
the pentagon is bent inwards along the thick line. If we choose exactly the right value of d as shown in the
middle, we get a flat pentagon.
On the other hand, for any value of d we can flatten out the pentagon and lay it out in the plane.
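[Diagram: the flattened pentagon for three values of d, each with sides of length 1 and a diagonal of length d.]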
If d is too small or too large then the pentagon will not be regular. The miraculous thing is that the value
of d that makes the pentagon flat is the same value that makes it regular; our next task is to prove this.
[Diagram: a regular pentagon with sides of length 1 and a diagonal of length τ.]
Proof. We can divide the pentagon into right angled triangles as shown on the left below. All the angles in
the middle are equal to θ and there are ten of them so θ = 2π/10 = π/5. As the angles of any triangle add
up to π, we have φ = π/2 − θ = 3π/10.
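[Diagrams: on the left, the pentagon divided into ten right angled triangles with angles θ and φ; on the right, the triangle cut off by the diagonal, with two sides of length 1, base angles ψ and a base made of two segments of length cos(ψ).]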
Now consider the picture on the right, which shows that τ = 2 cos(ψ). We have a triangle with angles π/2,
φ and ψ so ψ = π − π/2 − φ = π/2 − φ = θ = π/5, so we conclude that τ = 2 cos(π/5).
We next claim that τ² − τ − 1 = 0. To see this, put ξ = e^{πi/5} = cos(π/5) + i sin(π/5), so ξ⁻¹ = e^{−πi/5} =
cos(π/5) − i sin(π/5), so ξ + ξ⁻¹ = 2 cos(π/5) = τ. We find that
τ² − τ − 1 = (ξ² + 2 + ξ⁻²) − (ξ + ξ⁻¹) − 1
= ξ² − ξ + 1 − ξ⁻¹ + ξ⁻²,
so
(1 + ξ)(τ² − τ − 1) = (1 + ξ)(ξ² − ξ + 1 − ξ⁻¹ + ξ⁻²)
= ξ³ + ξ⁻² = ξ⁻²(ξ⁵ + 1).
As ξ⁵ = e^{πi} = −1 the right hand side is zero, and 1 + ξ ≠ 0, so τ² − τ − 1 = 0 as claimed.
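The identities for τ and ξ are easy to confirm in floating point. The sketch below is only a numerical sanity check of the algebra above; the comparison with (1 + √5)/2 uses the standard closed form for the positive root of t² − t − 1, which the notes do not state at this point.

```python
import cmath
import math

tau = 2 * math.cos(math.pi / 5)
xi = cmath.exp(1j * math.pi / 5)

residual = tau ** 2 - tau - 1                  # should vanish
lhs = (1 + xi) * (tau ** 2 - tau - 1)
rhs = xi ** -2 * (xi ** 5 + 1)                 # vanishes because xi^5 = -1

golden = (1 + math.sqrt(5)) / 2                # positive root of t^2 - t - 1
reciprocal_gap = abs(1 / (tau - 1) - tau)      # 1/(tau - 1) = tau follows from tau(tau - 1) = 1
```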
Now let T be a tent whose base is a square of side τ and whose other edges have length 1. We place T
with its base in the xy-plane parallel to the axes with the centre of the base at the origin and with the ridge
parallel to the x-axis.
[Diagram: the tent T with base vertices A, B, C, D in the xy-plane, ridge EF, and the coordinate axes x, y, z.]
Moreover, the line EF is horizontal and lies in the xz plane and it crosses the z-axis at its midpoint. This
means that the y coordinates of E and F are zero, their z-coordinates are the same, and the x coordinate
of E is minus the x coordinate of F . Thus for some a, b we have E = (−a, 0, b) and F = (a, 0, b).
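Next, recall that the edges FA, FC, EB, ED and EF have length 1. As the vector from E to F is (2a, 0, 0) and EF has
length 1 we must have a = 1/2. Thus E = (−1/2, 0, b) and F = (1/2, 0, b). The condition that FA has length 1 then forces b = 1/2 (using τ² = τ + 1), so E = (−1/2, 0, 1/2) and F = (1/2, 0, 1/2).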
Lemma 8.3. The angles α and β indicated below are the same.
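[Diagram: the tent T with the angles α and β marked. The left hand triangle has vertices (E + F)/2 = (0, 0, 1/2), (C + D)/2 = (0, −τ/2, 0) and (0, 0, 0), sides of length 1/2 and τ/2, and the angle α at the top vertex.]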
The top vertex is the midpoint of EF which is (E + F)/2 and using our formulae for E and F we see that
this is just (0, 0, 1/2). The bottom right vertex is in the xy-plane directly underneath (0, 0, 1/2), so it must be
(0, 0, 0). The bottom left vertex is the midpoint of CD, which is (C + D)/2 = (0, −τ/2, 0). It follows easily
that the sides have length 1/2 and τ/2 as shown, and thus that tan(α) = (τ/2)/(1/2) = τ.
In a similar way, we see that the right hand triangle is as follows:
[Diagram: the triangle with vertices F = (1/2, 0, 1/2), (1/2, 0, 0) and (A + C)/2 = (τ/2, 0, 0), with sides of length 1/2 and (τ − 1)/2 and the angle β marked.]
This shows that tan(β) = (1/2)/((τ − 1)/2) = 1/(τ − 1). We know from Lemma 8.2 that 1/(τ − 1) = τ so tan(β) = tan(α) so β = α.
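The geometry of the tent can be verified from coordinates. In the sketch below the positions of A, B, C and D are assumptions inferred from the midpoints quoted in the text, namely (C + D)/2 = (0, −τ/2, 0) and (A + C)/2 = (τ/2, 0, 0), together with the base being a square of side τ centred at the origin; E and F are taken to be (∓1/2, 0, 1/2).

```python
import math

tau = (1 + math.sqrt(5)) / 2   # equals 2 cos(pi/5)

# Assumed base vertices of the tent, inferred from the quoted midpoints.
A = (tau / 2, tau / 2, 0.0)
B = (-tau / 2, tau / 2, 0.0)
C = (tau / 2, -tau / 2, 0.0)
D = (-tau / 2, -tau / 2, 0.0)
# Ends of the ridge.
E = (-0.5, 0.0, 0.5)
F = (0.5, 0.0, 0.5)

# The five solid edges should all have length 1.
edge_lengths = {
    "FA": math.dist(F, A),
    "FC": math.dist(F, C),
    "EB": math.dist(E, B),
    "ED": math.dist(E, D),
    "EF": math.dist(E, F),
}

# The two slopes compared in Lemma 8.3.
tan_alpha = (tau / 2) / (1 / 2)
tan_beta = (1 / 2) / ((tau - 1) / 2)
```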
Now suppose we attach two tents to a cube of side τ as shown on the left.
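[Diagrams: on the left, a cube of side τ with two tents attached and the points P, Q and R marked; on the right, the side view showing P, Q and R with the angles β and α at Q.]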
Looking from the side we see the picture on the right. As α = β we see that P , Q and R lie on a straight
line, so the pentagon is flat as required.
We now attach a tent to each face, giving the following picture.
The same argument as before shows that all the pentagons are flat. You can now just look at the picture
to see that we have twelve regular pentagonal faces, as required.
Proposition 8.4. The dodecahedron has 20 vertices and 30 edges.
Proof. There are 12 faces each with 5 edges, apparently giving 5 × 12 = 60 edges. However, each edge is
an edge of two different faces, so we have counted each edge twice; there are really only 60/2 = 30 edges.
Similarly, there are 12 faces each with 5 vertices, but each vertex occurs on three different faces, so there are
12 × 5/3 = 20 vertices altogether.
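The double counting argument in this proof applies to any solid whose faces all have the same number of sides and whose vertices all lie on the same number of faces. The sketch below packages it as a small function (a hypothetical helper, not from the notes) and also checks Euler's formula V − E + F = 2, which is a standard fact not used in the text.

```python
def count_vertices_and_edges(faces, sides_per_face, faces_per_vertex):
    """Count vertices and edges by counting (face, edge) and (face, vertex) incidences."""
    incidences = faces * sides_per_face
    edges, edge_remainder = divmod(incidences, 2)              # each edge lies on 2 faces
    vertices, vertex_remainder = divmod(incidences, faces_per_vertex)
    if edge_remainder or vertex_remainder:
        raise ValueError("these parameters do not describe a consistent solid")
    return vertices, edges

# (faces, sides per face, faces meeting at each vertex)
solids = {
    "tetrahedron": (4, 3, 3),
    "cube": (6, 4, 3),
    "octahedron": (8, 3, 4),
    "dodecahedron": (12, 5, 3),
    "icosahedron": (20, 3, 5),
}

counts = {name: count_vertices_and_edges(*params) for name, params in solids.items()}
euler = {name: counts[name][0] - counts[name][1] + solids[name][0] for name in solids}
```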
[Diagram showing gE, g²E, g³E and g⁴E.]
It is a bit more difficult to show that there are exactly five cubes by this method.
The action of G on X gives a homomorphism φ : G → S5. If g is as above then φ(g) is clearly a 5-cycle,
and thus an even permutation. If h is a one-third twist about a vertex of C, then φ(h)³ = φ(h³) = 1, so
φ(h) is a permutation of {1, . . . , 5} of order dividing 3. One checks that the only possibilities are the identity
and the 3-cycles, and by inspecting a model we see that φ(h) is not the identity so it must be a 3-cycle. In
particular, it is again an even permutation. Now let kx , ky and kz be the half-twists about the x, y and
z-axes. By inspecting a model again we see that φ(kx ), φ(ky ) and φ(kz ) are distinct elements of S5 of the
form (a b)(c d), so they are again even permutations. It follows that {1, φ(kx ), φ(ky ), φ(kz )} is a subgroup
of A5 of order 4. As elements of the three types just considered generate G, we see that φ(x) is an even
permutation for all x ∈ G, so φ(G) ≤ A5 . We have also seen that φ(G) contains a group of order 4 and
elements of orders 3 and 5, so |φ(G)| is divisible by 3 × 4 × 5 = 60. However, we also have |A5 | = 5!/2 = 60,
so we must have φ(G) = A5. Thus φ : G → A5 is a surjective map between two sets that both have exactly
60 elements, so φ must be a bijection. Thus φ gives an isomorphism G ≅ A5 as claimed.
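The step "one checks that the only possibilities are the identity and the 3-cycles" can be carried out by enumerating S5. The sketch below does this with permutations stored as tuples; the helper names are illustrative assumptions.

```python
from itertools import permutations

def compose(g, h):
    """Composite permutation: apply h first, then g."""
    return tuple(g[h[i]] for i in range(len(h)))

def sign(g):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(1 for i in range(len(g))
                     for j in range(i + 1, len(g)) if g[i] > g[j])
    return -1 if inversions % 2 else 1

def cycle_type(g):
    """Sorted list of the cycle lengths of g."""
    seen, lengths = set(), []
    for start in range(len(g)):
        if start not in seen:
            length, i = 0, start
            while i not in seen:
                seen.add(i)
                i = g[i]
                length += 1
            lengths.append(length)
    return sorted(lengths)

S5 = list(permutations(range(5)))
identity = tuple(range(5))

# Elements of order dividing 3: the identity and the twenty 3-cycles.
order_divides_3 = [g for g in S5 if compose(g, compose(g, g)) == identity]
types = {tuple(cycle_type(g)) for g in order_divides_3}
all_even = all(sign(g) == 1 for g in order_divides_3)
size_of_A5 = sum(1 for g in S5 if sign(g) == 1)
```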
vk = (cos(2kπ/n), sin(2kπ/n), 0)
uk = (cos((2k + 1)π/n), sin((2k + 1)π/n), 0).
These are defined for all k ∈ Z but vk+n = vk and uk+n = uk so there are really only n v’s and n
u’s. We show the case n = 5 below.
[Diagram: the points v0, . . . , v4 and u0, . . . , u4 on the unit circle in the case n = 5.]
It is geometrically clear that a half twist around vk or uk preserves the standard n-gon Xn and thus
lies in D̃n, so the u's and v's are poles of D̃n. We also need to think about the points −ui and −vj.
When n is odd (as in the above picture) we see that −ui = vk for some k and −vj = ul for some l.
If n is even we see instead that −ui has the form uk and −vj has the form vl . Either way, we get no
new poles. Also, any symmetry of Xn sends vertices to vertices, and we can move any vertex to any
other vertex, so the v’s form an orbit. Similarly, the u’s form an orbit. Thus, the u’s and v’s give
2 orbits, each consisting of n poles of degree 2. Moreover, we have Rk (w) = w and S(w) = −w so
{w, −w} is an orbit consisting of 2 poles of degree n. Thus we again have three orbits altogether.
We now let G be an arbitrary finite subgroup of SO3 . Our next task is to show that the number of
poles and orbits for G matches one of the possibilities discussed above. Our main tool is the orbit counting
theorem:
Theorem 11.9. Let H be a finite group that acts on a finite set X. For each h ∈ H put Fix(h) = {x ∈
X | hx = x}, the set of fixed points of h. Then the number of orbits of H in X is |H|⁻¹ Σ_{h∈H} |Fix(h)|, or
in other words the average number of fixed points of an element of H.
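A small example may make the orbit counting theorem concrete. The sketch below is an illustration chosen here, not an example from the notes: the cyclic group of order 4 acts by rotation on the 16 colourings of four positions with two colours, and the number of orbits is computed both directly and by averaging fixed points.

```python
from itertools import product

def rotate(colouring, k):
    """Cyclically rotate a tuple by k places."""
    k %= len(colouring)
    return colouring[k:] + colouring[:k]

X = list(product((0, 1), repeat=4))  # the set acted on
H = range(4)                         # the group: rotation by k places, k = 0, 1, 2, 3

def count_orbits_directly():
    """Count orbits by collecting the images of each new point."""
    seen, orbits = set(), 0
    for x in X:
        if x not in seen:
            orbits += 1
            seen.update(rotate(x, k) for k in H)
    return orbits

def count_orbits_by_averaging():
    """Average number of fixed points, as in the orbit counting theorem."""
    total_fixed = sum(1 for k in H for x in X if rotate(x, k) == x)
    orbits, remainder = divmod(total_fixed, len(H))
    if remainder:
        raise ValueError("the average number of fixed points should be an integer")
    return orbits

direct = count_orbits_directly()
averaged = count_orbits_by_averaging()
```

The fixed point counts are 16, 2, 4 and 2, giving an average of 24/4 = 6, which agrees with the direct count.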
Proposition 11.10. Let G be a nontrivial finite subgroup of SO3 , and let P be the set of poles of G. Put
n = |G| and p = |P |, and let m be the number of orbits of G in P . Let dk be the degree of the poles in the
k'th orbit; we can order the orbits in such a way that d1 ≤ d2 ≤ . . . ≤ dm. Then
m = (p + 2n − 2)/n
p = Σ_{k=1}^{m} n/dk.
Proof. The orbit counting theorem says that m = n⁻¹ Σ_{g∈G} |Fix(g)|. If g ≠ 1 then Fix(g) consists of
the two unit vectors on the axis of g, so |Fix(g)| = 2. There are n − 1 elements g ∈ G with g ≠ 1, so
Σ_{g≠1} |Fix(g)| = 2(n − 1) = 2n − 2. In the remaining case g = 1 we have Fix(g) = P and thus |Fix(g)| = p.
This means that Σ_{g∈G} |Fix(g)| = p + 2n − 2 and thus m = (p + 2n − 2)/n.
Next, choose a point xk in the k’th orbit for each k. Then stabG (xk ) has order dk . The orbit-stabiliser
theorem says that |G| = | stabG (xk )|| orbG (xk )|, so the size of the k’th orbit is |G|/| stabG (xk )| = n/dk .
As P is the disjoint union of the orbits we see that p = |P| is the sum of the orders of all the orbits, so
p = Σ_k n/dk, as claimed.
Proposition 11.11. With notation as above we have either
(1) m = 3, d1 = 2, d2 = d3 = 3 and n = 12; or
(2) m = 3, d1 = 2, d2 = 3, d3 = 4 and n = 24; or
(3) m = 3, d1 = 2, d2 = 3, d3 = 5 and n = 60; or
(4) m = p = 2 and d1 = d2 = n; or
(5) there is an integer d ≥ 2 such that m = 3, n = 2d, d1 = d2 = 2 and d3 = d.
Proof. First note that dk is the order of the stabiliser group of xk . As xk is a pole, this stabiliser group is
nontrivial, so dk ≥ 2.
Next, we can rearrange the equation m = (p + 2n − 2)/n as m = 2 + (p − 2)/n. By assumption G is a
nontrivial group, and any nontrivial element has two poles, so p ≥ 2, which implies that (p − 2)/n ≥ 0 and
m ≥ 2.
Alternatively, we can rearrange to get p = mn − 2n + 2. We also know that p = Σ_{k=1}^{m} n/dk. As each dk
is at least 2, each term on the right hand side is at most n/2, and there are m terms, so p ≤ mn/2.
After feeding this back into the equation p = mn − 2n + 2 we find that mn/2 ≤ 2n − 2 < 2n so mn < 4n so
m < 4. As m ≥ 2 and m < 4 we must have m = 2 or m = 3.
If m = 2 then the equation m = 2 + (p − 2)/n implies that p = 2. The equation p = Σ_k n/dk now says
that 2 = n/d1 + n/d2 . As dk divides n for all k the terms n/d1 and n/d2 are positive integers, so the only
way their sum can be 2 is if n/d1 = n/d2 = 1, so d1 = d2 = n, so case (4) holds.
Now suppose instead that m = 3. The equation m = 2 + (p − 2)/n then simplifies to give p = n + 2, and
we can feed this into the equation p = Σ_k n/dk = n/d1 + n/d2 + n/d3 and rearrange to give
2/n = 1/d1 + 1/d2 + 1/d3 − 1.
Recall that 2 ≤ d1 ≤ d2 ≤ d3 . If the d’s are reasonably large then 1/d1 , 1/d2 and 1/d3 will be small and
so 1/d1 + 1/d2 + 1/d3 − 1 will be negative, which is absurd because 2/n is certainly positive. Thus, the d’s
must be fairly small. We can complete the proof by making this argument more precise.
We first claim that d1 = 2. Indeed, if not then 3 ≤ d1 ≤ d2 ≤ d3, so 1/d1, 1/d2 and 1/d3 are all less than
or equal to 1/3 so 1/d1 + 1/d2 + 1/d3 − 1 ≤ 3/3 − 1 = 0, which contradicts the equation 2/n = (Σ_k 1/dk) − 1.
We thus have d1 = 2 as claimed. Suppose we also have d2 = 2. Then 2/n = 1/2 + 1/2 + 1/d3 − 1 = 1/d3 ,
so n = 2d3 . We are thus in case (5).
Now suppose instead that d2 > 2. We claim that in fact d2 = 3. Indeed, if not then 4 ≤ d2 ≤ d3 so
1/d2 and 1/d3 are at most 1/4, so 1/d1 + 1/d2 + 1/d3 − 1 ≤ 1/2 + 1/4 + 1/4 − 1 = 0, which contradicts the
equation 2/n = (Σ_k 1/dk) − 1.
We thus have d1 = 2 and d2 = 3 and d3 ≥ 3, so 2/n = 1/2 + 1/3 + 1/d3 − 1 = 1/d3 − 1/6. If d3 = 3 this
gives 2/n = 1/3 − 1/6 = 1/6 so n = 12 and we are in case (1). If d3 = 4 then 2/n = 1/4 − 1/6 = 1/12 so
n = 24 and we are in case (2). If d3 = 5 then 2/n = 1/5 − 1/6 = 1/30 so n = 60 and we are in case (3). If
d3 ≥ 6 then 2/n = 1/d3 − 1/6 ≤ 0, which is absurd.
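The case analysis for m = 3 can be double checked by a finite search with exact fractions. The sketch below enumerates triples 2 ≤ d1 ≤ d2 ≤ d3 up to a bound (the bound 12 is an arbitrary choice made here), keeps those with 1/d1 + 1/d2 + 1/d3 > 1, and computes n from 2/n = 1/d1 + 1/d2 + 1/d3 − 1. It also confirms p = n + 2 for each solution.

```python
from fractions import Fraction

def solutions(max_degree=12):
    """Tuples (d1, d2, d3, n) with 2 <= d1 <= d2 <= d3 <= max_degree and 2/n = sum(1/dk) - 1 > 0."""
    found = []
    for d1 in range(2, max_degree + 1):
        for d2 in range(d1, max_degree + 1):
            for d3 in range(d2, max_degree + 1):
                excess = Fraction(1, d1) + Fraction(1, d2) + Fraction(1, d3) - 1
                if excess > 0:
                    found.append((d1, d2, d3, 2 / excess))
    return found

def pole_count(n, degrees):
    """The number of poles p, as the sum of n/dk over the orbits."""
    return sum(Fraction(n) / d for d in degrees)

found = solutions()
dihedral = [s for s in found if s[1] == 2]   # case (5): (2, 2, d) with n = 2d
sporadic = [s for s in found if s[1] > 2]    # cases (1), (2) and (3)
p_equals_n_plus_2 = all(pole_count(n, (d1, d2, d3)) == n + 2
                        for d1, d2, d3, n in found)
```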
Proposition 11.12. If case (1) holds in Proposition 11.11 then G is conjugate to G1 = Dir(Tet).
Proof. Let V be the third orbit, which has order n/d3 = 12/3 = 4, so V = {v1 , v2 , v3 , v4 } say. Let g be a
one-third turn around v4 , which lies in G because v4 is a pole of degree 3. Clearly g gives a permutation
of {v1 , v2 , v3 }. The only way that a one-third turn can permute a set of three points is if they form an
equilateral triangle perpendicular to the axis of rotation, with the centre of the triangle on the axis. It
follows that the distances d(v1 , v4 ), d(v2 , v4 ) and d(v3 , v4 ) are all the same. Similarly, by rotating around
v3 we see that d(v1 , v3 ) = d(v2 , v3 ) = d(v4 , v3 ). We can also rotate around v1 or v2 and we find that all
the distances d(vi, vj) (for i ≠ j) are the same. This means that v1, v2, v3 and v4 are the vertices of a
regular tetrahedron T. As G permutes these vertices, it is a subgroup of Dir(T), but |G| = 12 = |Dir(T)| so
G = Dir(T). Let r be the distance from the origin to the vertices of the standard tetrahedron Tet, and put
T′ = rT; it is not hard to see that Dir(T′) = Dir(T) = G. As T′ is a regular tetrahedron the same size as
Tet, we can choose an isometry f ∈ SO3 with f(Tet) = T′. It follows that
Dir(T′) = Dir(f(Tet)) = f Dir(Tet) f⁻¹ = f G1 f⁻¹,
so G is conjugate to G1 .
Proposition 11.13. If case (2) holds in Proposition 11.11 then G is conjugate to G2 = Dir(Oct).
Proof. Let V be the third orbit, which has order n/d3 = 24/4 = 6. If v ∈ V then v is a pole of degree 4
so −v is also a pole of degree 4. The poles in the other two orbits have degree 2 or 3, so we must have
−v ∈ V . It follows that V has the form {v1 , v2 , v3 , −v1 , −v2 , −v3 } for some v1 , v2 and v3 . Let g be a quarter
turn around v3 , which lies in G because v3 is a pole of degree 4. Clearly g(v3 ) = v3 and g(−v3 ) = −v3 so
g must permute the remaining vertices {v1 , v2 , −v1 , −v2 }. The only way that a quarter turn can permute a
set of four points is if they form a square perpendicular to the axis of rotation. It follows that the distances
d(v3 , v1 ), d(v3 , v2 ), d(v3 , −v1 ) and d(v3 , −v2 ) are all the same, equal to r say. We also have
d(−v3 , v1 ) = kv1 − (−v3 )k = kv3 + v1 k = kv3 − (−v1 )k = d(v3 , −v1 ) = r,
and by the same method we find that d(−v3 , −v1 ) = d(−v3 , v2 ) = d(−v3 , −v2 ) = r. We can also rotate
about v1 or v2 instead, and we find that
d(±v1 , ±v2 ) = d(±v1 , ±v3 ) = d(±v2 , ±v1 ) = d(±v2 , ±v3 ) = r.
Using this, we find that the points in V are the vertices of a regular octahedron O, so G ≤ Dir(O), but
|G| = 24 = | Dir(O)| so G = Dir(O). By the same method as in the previous proposition we find that G is
conjugate to G2 .
Proposition 11.14. If case (3) holds in Proposition 11.11 then G is conjugate to G3 = Dir(Icos).
Proof. Let V be the third orbit, which has order n/d3 = 60/5 = 12. We will show that the points in V are
the vertices of an icosahedron.
If v ∈ V then v is a pole of degree 5 so −v is also a pole of degree 5. The poles in the other two orbits
have degree 2 or 3, so we must have −v ∈ V. Put V′ = V \ {v, −v} so |V′| = 10, let g be a one-fifth turn
around v, and let H be the subgroup of order 5 generated by g. We see geometrically that for any x ∈ R3
that does not lie on the axis of g, the orbit Hx has order 5. None of the points in V′ lie on the axis, so V′
must split into two orbits of order 5 under the action of H, say V′ = W1 ∪ W2. All the points in W1 lie at
the same distance (say r1 ) from v, and all the points in W2 lie at some other distance r2 from v. We may
assume that r1 ≤ r2 (otherwise rename W1 as W2 and W2 as W1 ). We will actually assume that r1 < r2 ;
one can check by going through the following argument more carefully that the equation r1 = r2 leads to a
contradiction.
Now let u be any point in V. As V is an orbit there exists an element h ∈ G with hv = u. For any point
w′ ∈ W1 we then have d(u, hw′) = d(hv, hw′) = d(v, w′) = r1. Thus all the points in hW1 lie at distance r1
from u, and similarly the points in hW2 lie at distance r2. This means that u has 5 nearest neighbours, and
they all lie at distance r1 from u.
Choose a point w ∈ W1. I claim that −w ∈ W2. Indeed, −w is certainly a pole of degree 5 and
−w ≠ ±v so −w ∈ V′ = W1 ∪ W2. It will thus be enough to show that −w ∉ W1. If −w ∈ W1 we have
−w = g^k w for some k, which means that −w = (−1)⁵w = g^{5k}w = w (because g⁵ = 1). This means that
w = 0, which is impossible as w is a unit vector. We must therefore have −w ∈ W2 as required. Note that
d(−v, −w) = d(v, w) = r1 and d(−v, w) = d(v, −w) = r2, so all the points in W2 lie at distance r1 from −v
and all the points in W1 lie at distance r2 from −v.
We thus have a picture like this. The points in W1 are the vertices of the top pentagon, and the points
in W2 are the vertices of the bottom pentagon.
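[Diagram: the icosahedron with top vertex v and bottom vertex −v, the upper pentagon W1 containing w, the lower pentagon W2 containing −w, and edges of length r1.]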
Because the nearest neighbours of any vertex lie at distance r1 from that vertex, we see that all the edges
have length r1 so we have a regular icosahedron, which we call I. Clearly G ≤ Dir(I) but |G| = 60 = | Dir(I)|,
so G = Dir(I). It follows as usual that G is conjugate to G3 .
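The nearest-neighbour property used in this proof can be checked numerically on a concrete icosahedron. The sketch below is only an illustration: it assumes the standard coordinates (0, ±1, ±φ) and their cyclic permutations (φ the golden ratio), which are not unit vectors, and the function names are hypothetical. Rescaling to the unit sphere would not change the neighbour counts.

```python
import itertools
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio (assumed standard coordinates)

def icosahedron_vertices():
    """Return the 12 points (0, ±1, ±PHI) together with their cyclic permutations."""
    verts = []
    for s1, s2 in itertools.product((1, -1), repeat=2):
        a, b = s1 * 1.0, s2 * PHI
        verts.extend([(0.0, a, b), (b, 0.0, a), (a, b, 0.0)])
    return verts

def nearest_neighbours(v, verts, tol=1e-9):
    """Return (r1, neighbours): the minimal distance from v and the points attaining it."""
    others = [u for u in verts if u != v]
    r1 = min(math.dist(v, u) for u in others)
    return r1, [u for u in others if abs(math.dist(v, u) - r1) < tol]

verts = icosahedron_vertices()

# For every vertex, record (nearest distance, number of nearest neighbours).
summary = {(round(r, 9), len(nb))
           for r, nb in (nearest_neighbours(v, verts) for v in verts)}

# The vertex set is closed under v -> -v, as in the argument for -v and -w.
closed_under_negation = all(tuple(-c for c in v) in verts for v in verts)

print(summary, closed_under_negation)
```

Every vertex should report the same pair (distance, 5), matching the claim that each point of V has exactly five nearest neighbours at a common distance r1.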
Proposition 11.16. If case (5) holds in Proposition 11.11 and d > 2 then G is conjugate to $\tilde{D}_d$.
Proof. Let P1 , P2 and P3 be the three orbits in P . Choose w in P3 , so w has degree d. Then −w also has
degree d and the poles in the first two orbits have degree 2 so −w ∈ P3 . We also have |P3 | = n/d3 = 2d/d = 2,
so P3 = {w, −w}. Let g be the rotation through 2π/d around w, and let U be the plane through the origin
perpendicular to w.
Now suppose that v ∈ P2 , and let h be the half turn around v, which lies in G because v is a pole of
degree 2. As P3 is an orbit we have hP3 = P3 so hw = ±w. The only unit vectors fixed by h are ±v, and w ≠ ±v because v ∈ P2 and ±w ∈ P3, so hw = −w. The only way that a half twist around v can
send w to −w is if v is perpendicular to w. Thus all the points in P2 lie in the plane U, and similarly all the
points in P1 lie in U .
Note also that the points $v, gv, \dots, g^{d-1}v$ are all different and all lie in P2, and |P2| = n/d2 = 2d/2 = d,
so we must have $P_2 = \{v, gv, \dots, g^{d-1}v\}$. We can choose coordinates so that w = (0, 0, 1) and v = (1, 0, 0).
Then U is the xy-plane and P2 consists of the vertices of the standard polygon Xd , with polar coordinates
[1, 2kπ/d].
By a similar argument, the set P1 consists of d equally spaced points on the unit circle in the xy-plane,
and these points are all different from the points $g^k v$. Thus there must be precisely one of the points
in P1 lying in the gap between v and gv; call this point u. Let α be the angle between u and v, and let β
be the angle between u and gv. The half twist around u must send P2 to itself, and clearly this can only
happen if v and gv are exchanged, and this means that α = β. As α + β is the angle between v and gv,
which is 2π/d, we have α = β = π/d. We also have $P_1 = \{u, gu, \dots, g^{d-1}u\}$, which is the set of points with
polar coordinates [1, (2k + 1)π/d]. The group G consists of the rotations $g^k$ together with the half-twists
around the points in P1 and P2, so $G = \tilde{D}_d$. This refers to $\tilde{D}_d$ as defined with respect to our new coordinate
system: if $\tilde{D}_d$ is defined using the original coordinate system, then G is merely conjugate to $\tilde{D}_d$.
Proposition 11.17. If case (5) holds in Proposition 11.11 and d = 2 then G is conjugate to $\tilde{D}_2$.
Proof. In this case we have m = 3, d1 = d2 = d3 = 2 and |G| = n = 4. Let Pi be the i’th orbit, so
|Pi | = n/di = 2, so Pi = {vi , wi }, say. Let gi be a half twist around vi , which lies in G because di = 2. Note
that every element of G sends Pi = {vi , wi } to itself and gi sends vi to vi so it must send wi to wi . Thus, wi
is a fixed point of gi and the only two fixed points are vi and −vi so we must have wi = −vi .
Now consider g2 v1 . As the orbit of v1 is {v1 , −v1 } we must have g2 v1 = ±v1 and v1 is not one of the fixed
points of g2 so we must have g2 v1 = −v1 .
You should be able to see from the following picture that g2 b = −b if and only if b is perpendicular to v2 .
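[Figure: the half twist g2 about the axis through v2. A general vector a is sent to g2 a, while a vector b perpendicular to v2 is sent to g2 b = −b.]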
As g2 v1 = −v1 , the vectors v1 and v2 must be orthogonal to each other. By a similar argument, they are
both orthogonal to v3 . Thus G consists of the identity together with half twists around three orthogonal
axes, whereas $\tilde{D}_2$ consists of the identity together with half-twists about the standard x, y and z-axes. It
follows that G is conjugate to $\tilde{D}_2$.
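As a small sanity check, one can verify that the identity together with the half twists about the three standard axes really is closed under composition. The sketch below assumes the usual diagonal matrices for these half twists; the helper names are hypothetical.

```python
import itertools

def diag(a, b, c):
    """A 3x3 diagonal matrix, stored as a tuple of rows so that it is hashable."""
    return ((a, 0, 0), (0, b, 0), (0, 0, c))

def matmul(A, B):
    """Product of two 3x3 matrices stored as tuples of rows."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )

IDENTITY = diag(1, 1, 1)
# A half twist about an axis fixes that axis and negates the two perpendicular directions.
HALF_TWISTS = {"x": diag(1, -1, -1), "y": diag(-1, 1, -1), "z": diag(-1, -1, 1)}
GROUP = {IDENTITY, *HALF_TWISTS.values()}

# Closure: the product of any two elements is again one of the four matrices.
closed = all(matmul(A, B) in GROUP for A, B in itertools.product(GROUP, repeat=2))
print(len(GROUP), closed)
```

The half twist about the y-axis negates the x-direction, which is the matrix form of the relation g2 v1 = −v1 used above.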
On the other hand, an element x ∈ X lies in Fix(P) if and only if the orbit Px consists of the single element
x, so |Fix(P)| is the number of orbits of size 1, which is r. Thus |X| = |Fix(P)| (mod p), as claimed.
Now suppose that |X| ≠ 0 (mod p). Then |Fix(P)| ≠ 0 (mod p), so |Fix(P)| ≠ 0, so Fix(P) ≠ ∅.
Proof of (c) and (d). Let P be a Sylow p-subgroup of G, and let Q be any p-subgroup of G. Note that
$|P| = p^v$ and $|Q| = p^w$ for some w ≤ v. As usual we write G/P for the set of left cosets of P, so a typical
element of G/P has the form xP for some x ∈ G. Note that |G/P| = |G|/|P| = m, which is not divisible by
p. We let Q act on G/P by g ∗ (xP) = gxP for g ∈ Q. As |G/P| = m ≠ 0 (mod p) and Q is a p-group,
Lemma 12.4 shows that there is a fixed point, in other words a coset xP such that gxP = xP for all g ∈ Q. This
means that $x^{-1}gxP = P$, so $x^{-1}gx \in P$, so $g = x(x^{-1}gx)x^{-1} \in xPx^{-1}$. This proves that $Q \subseteq xPx^{-1}$. Now
$xPx^{-1}$ is conjugate to P so it is a subgroup of G with the same order as P, in other words it is another
Sylow p-subgroup. This shows that Q is contained in a Sylow p-subgroup, as claimed in (d).
Now suppose that Q itself is a Sylow p-subgroup. Then $Q \le xPx^{-1}$ but $|Q| = |xPx^{-1}| = p^v$ so
$Q = xPx^{-1}$. Thus Q is conjugate to P, as claimed in (c).
Proof of (b). Let $\mathcal{P}$ be the set of all Sylow p-subgroups of G, so $n_p = |\mathcal{P}|$. Let G act on $\mathcal{P}$ by conjugation,
so $g * P = gPg^{-1}$. Choose a Sylow p-subgroup $P \in \mathcal{P}$, and put
$N = \mathrm{stab}_G(P) = \{g \in G \mid gPg^{-1} = P\}$.
It is clear that P ≤ N ≤ G so $p^v$ divides |N| and |N| divides $p^v m$, so $|N| = p^v k$ for some k dividing m.
As all Sylow p-subgroups are conjugate to P, we have $\mathcal{P} = \mathrm{orb}_G(P)$ and so $|G| = |N||\mathcal{P}|$, so $p^v m = p^v k n_p$,
so $m = k n_p$. Thus $n_p$ divides m.
Note also that P can be thought of as a Sylow p-subgroup of N. Part (c) of the Theorem works for any
finite group, in particular it works for the group N, so any other Sylow p-subgroup Q of N is conjugate in
N to P. This means that $Q = gPg^{-1}$ for some g ∈ N. By the definition of N, this means that Q = P.
Thus, P is the unique Sylow p-subgroup of N.
Before we considered the action of all of G on $\mathcal{P}$; now we restrict attention to the action of the subgroup
P. Lemma 12.4 tells us that $|\mathrm{Fix}(P, \mathcal{P})| = |\mathcal{P}| = n_p \pmod p$. We want to prove that $n_p = 1 \pmod p$,
so it will be enough to show that $\mathrm{Fix}(P, \mathcal{P}) = \{P\}$. Clearly if g ∈ P then $gPg^{-1} = P$, which shows that
$P \in \mathrm{Fix}(P, \mathcal{P})$. Conversely, suppose that $Q \in \mathrm{Fix}(P, \mathcal{P})$, so Q is a Sylow p-subgroup and $gQg^{-1} = Q$ for all
g ∈ P. This means that P is contained in $\mathrm{stab}_G(Q)$, so P is a Sylow p-subgroup of $\mathrm{stab}_G(Q)$. The previous paragraph,
applied to Q in place of P, shows that Q is the unique Sylow p-subgroup of $\mathrm{stab}_G(Q)$, so Q = P.
Thus $\mathrm{Fix}(P, \mathcal{P}) = \{P\}$ and $|\mathrm{Fix}(P, \mathcal{P})| = 1$ as required.
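The two counting statements (the number of Sylow p-subgroups divides m and is congruent to 1 mod p) can be tested by brute force in small symmetric groups. The following is a minimal sketch under a simplifying assumption that is not part of the text: p divides n! exactly once, so that the Sylow p-subgroups of S_n are exactly the cyclic subgroups of order p. All function names are hypothetical.

```python
from itertools import permutations
from math import factorial

def compose(s, t):
    """Return the permutation s∘t (apply t first); permutations are tuples."""
    return tuple(s[t[i]] for i in range(len(t)))

def order(s):
    """Order of the permutation s."""
    identity = tuple(range(len(s)))
    k, power = 1, s
    while power != identity:
        power = compose(power, s)
        k += 1
    return k

def sylow_count_prime_order(n, p):
    """Count subgroups of order p in S_n, assuming p divides n! exactly once."""
    assert factorial(n) % p == 0 and factorial(n) % (p * p) != 0
    subgroups = set()
    for s in permutations(range(n)):
        if order(s) == p:
            # Collect the cyclic subgroup generated by s.
            H, power = {tuple(range(n))}, s
            while power not in H:
                H.add(power)
                power = compose(power, s)
            subgroups.add(frozenset(H))
    return len(subgroups)

for n, p in [(4, 3), (5, 3), (5, 5)]:
    n_p = sylow_count_prime_order(n, p)
    m = factorial(n) // p
    print(n, p, n_p, m % n_p == 0, n_p % p == 1)
```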
Proposition 12.5. If $n_p = 1$ then the Sylow p-subgroup of G is a normal subgroup. If $n_p > 1$ then none of
the Sylow p-subgroups is normal.
Proof. Suppose that $n_p = 1$, so there is a unique Sylow p-subgroup, which we call P. If g ∈ G then $gPg^{-1}$
is a Sylow p-subgroup so it must be equal to P; this says that P is normal.
Now suppose that $n_p > 1$. If P is any Sylow p-subgroup, we can choose a different Sylow p-subgroup, say
Q. As all such subgroups are conjugate, there is some g ∈ G such that $gPg^{-1} = Q \ne P$. This means that
P is not normal.
i + j < n then
$\phi(R^i R^j) = \phi(R^{i+j}) = g^{i+j} = g^i g^j = \phi(R^i)\phi(R^j)$.
Suppose instead that i + j ≥ n. By assumption we have 0 ≤ i, j < n, so i + j < 2n. Thus, if we put
k = i + j − n then 0 ≤ k < n, so $\phi(R^k) = g^k$. We also have
$R^i R^j = R^{i+j} = R^{k+n} = R^k R^n = R^k$
$g^i g^j = g^{i+j} = g^{k+n} = g^k g^n = g^k$
so
$\phi(R^i R^j) = \phi(R^{i+j}) = \phi(R^k) = g^k = g^i g^j = \phi(R^i)\phi(R^j)$.
We thus have $\phi(R^i R^j) = \phi(R^i)\phi(R^j)$ in all cases, showing that φ is a homomorphism.
Lemma 13.2. If n and m are coprime, then $C_{nm} \simeq C_n \times C_m$.
Proof. We write $r_k$ for the generator of $C_k$, and define $\phi : C_{nm} \to C_n \times C_m$ by $\phi(r_{nm}^i) = (r_n^i, r_m^i)$. It is easy to
see that this is a homomorphism. Suppose that $r_{nm}^i$ lies in the kernel of φ. This means that $(r_n^i, r_m^i) = (1, 1)$,
which means that $r_n^i = 1$ in $C_n$ and $r_m^i = 1$ in $C_m$. As $r_n^i = 1$ we see that i must be divisible by n, and
as $r_m^i = 1$ we see that i must be divisible by m. As n and m are coprime this means that i is divisible by
nm, so $r_{nm}^i = 1$. This shows that the kernel of φ is the trivial group, so φ is injective. This means that
$|\phi(C_{nm})| = |C_{nm}| = nm = |C_n \times C_m|$, so $\phi(C_{nm})$ must be all of $C_n \times C_m$, so φ is surjective as well as
injective, so φ is an isomorphism.
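The role of coprimality in this proof is easy to see experimentally. The sketch below records the element $r_k^i$ by its exponent i mod k (an additive model chosen here for illustration, with hypothetical function names), so that φ becomes the map i ↦ (i mod n, i mod m).

```python
from math import gcd

def phi(i, n, m):
    """Exponent form of phi: r_nm^i goes to (r_n^i, r_m^i), stored as (i mod n, i mod m)."""
    return (i % n, i % m)

def kernel(n, m):
    """Exponents i in Z_nm that phi sends to the identity (0, 0)."""
    return [i for i in range(n * m) if phi(i, n, m) == (0, 0)]

def is_bijection(n, m):
    """True when phi hits every element of Z_n x Z_m exactly once."""
    return len({phi(i, n, m) for i in range(n * m)}) == n * m

for n, m in [(3, 4), (2, 4), (5, 6), (6, 4)]:
    print(n, m, gcd(n, m) == 1, kernel(n, m), is_bijection(n, m))
```

When n and m share a factor, the kernel contains the nonzero multiples of lcm(n, m) below nm, so the map fails to be injective.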
We next recall the basic result about groups of prime order.
Proposition 13.3. If G is a group whose order is a prime number p, then G is isomorphic to Cp .
Proof. Let g be any element of G other than the identity. Then the order of g is not equal to 1 and it divides
p so it must be equal to p. The subgroup generated by g is thus equal to the whole group, and it follows
that G is cyclic of order p. More precisely, we can define a homomorphism $\phi : C_p \to G$ by $\phi(R^i) = g^i$, and
we find that φ is an isomorphism.
We would next like to study groups of order p2 , where p is prime. We will first need a result about general
p-groups.
Definition 13.4. The centre of a group G is the set Z(G) = {z ∈ G | zg = gz for all g ∈ G}, so an element
z lies in the centre if and only if it commutes with all other elements. One checks that Z(G) is a normal
subgroup of G and that it is Abelian.
Example 13.5. The centre of the symmetric group Sn is the trivial group (provided that n > 2). To see
this, suppose that σ lies in the centre. For each i, let ρi be the (n − 1)-cycle formed by the numbers 1, . . . , n
with i missing, so ρi(i) = i and ρi(j) ≠ j if j ≠ i. Now ρi σ = σρi so ρi(σ(i)) = σ(ρi(i)) = σ(i), so σ(i) is
fixed by the action of ρi . The only fixed point is i, so σ(i) = i. This holds for all i, so σ = 1.
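For small n the claim can be confirmed by listing all permutations and keeping those that commute with everything. This is a brute-force sketch with hypothetical helper names, intended only as an illustration; it also shows why the condition n > 2 is needed.

```python
from itertools import permutations

def compose(s, t):
    """Return the permutation s∘t (apply t first); permutations are tuples."""
    return tuple(s[t[i]] for i in range(len(t)))

def centre_of_symmetric_group(n):
    """All elements of S_n that commute with every element of S_n."""
    elements = list(permutations(range(n)))
    return [z for z in elements
            if all(compose(z, g) == compose(g, z) for g in elements)]

for n in range(2, 6):
    print(n, centre_of_symmetric_group(n))
```

For n = 2 the group is Abelian, so both elements are central; for n = 3, 4, 5 only the identity survives.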
Proposition 13.6. If P is a nontrivial p-group then Z(P) ≠ {1}.
Proof. Let P act on itself by conjugation, so $g * x = gxg^{-1}$. Note that g ∗ x = x if and only if gx = xg, or in
other words g commutes with x. Thus x is fixed under the action of P iff g ∗ x = x for all g, iff x ∈ Z(P).
Thus Lemma 12.4 tells us that |Z(P)| = |P| (mod p). As P is a nontrivial p-group we have $|P| = p^v$ for
some v > 0 so |P| = 0 (mod p), so |Z(P)| is divisible by p. Moreover, 1 ∈ Z(P) so |Z(P)| > 0. It follows
that |Z(P)| ≥ p, so Z(P) ≠ {1}.
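To see the proposition in action on a non-Abelian example, the sketch below uses the group of upper unitriangular 3×3 matrices over $\mathbb{Z}_p$ (often called the Heisenberg group mod p). This example is an assumption introduced here and does not appear in the text; the group has order $p^3$, and its elements are stored as triples (a, b, c) of above-diagonal entries.

```python
from itertools import product

def heisenberg_mul(x, y, p):
    """Multiply the matrices [[1,a,c],[0,1,b],[0,0,1]] mod p, stored as (a, b, c)."""
    a, b, c = x
    a2, b2, c2 = y
    return ((a + a2) % p, (b + b2) % p, (c + c2 + a * b2) % p)

def centre(p):
    """Elements commuting with the whole group of order p**3."""
    elements = list(product(range(p), repeat=3))
    return [z for z in elements
            if all(heisenberg_mul(z, g, p) == heisenberg_mul(g, z, p)
                   for g in elements)]

for p in (2, 3, 5):
    Z = centre(p)
    # The centre is nontrivial and its order is divisible by p.
    print(p, len(Z), len(Z) % p == 0)
```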
Lemma 13.7. Let G be a finite group, and let P and Q be subgroups of G. Define a function $\phi : P \times Q \to G$
by φ(x, y) = xy.
(a) If every element of P commutes with every element of Q, then φ is a homomorphism.
(b) If we also have P ∩ Q = {1}, then φ is injective.
Proof. (a) Suppose that every element of P commutes with every element of Q. Consider elements
x0 , x1 ∈ P and y0 , y1 ∈ Q, so (x0 , y0 ) and (x1 , y1 ) are elements of P × Q. We then have
φ((x0 , y0 )(x1 , y1 )) = φ((x0 x1 , y0 y1 )) (definition of P × Q)
= x0 x1 y0 y1 (definition of φ)
= x0 y0 x1 y1 (because x1 commutes with y0 )
= φ((x0 , y0 ))φ((x1 , y1 )) (definition of φ)
This shows that φ is a homomorphism.
(b) Now suppose as well that P ∩ Q = {1}. Consider an element (x, y) ∈ ker(φ). This means that
(x, y) ∈ P × Q and φ((x, y)) = 1, or in other words, x ∈ P and y ∈ Q and xy = 1. This means
that $x = y^{-1}$, and y ∈ Q, so x ∈ Q. We are also given that x ∈ P, so x ∈ P ∩ Q = {1}, so x = 1.
This means that $y^{-1} = 1$, so y = 1, so (x, y) = (1, 1). This proves that the kernel of φ is the trivial
group, so φ is injective.
Proposition 13.8. Let G be a group of order $p^2$. Then G is isomorphic either to $C_p \times C_p$ or to $C_{p^2}$ (and
so G is always Abelian).
Proof. If G has an element of order $p^2$ then it is cyclic and thus isomorphic to $C_{p^2}$. Suppose instead that all
nontrivial elements of G have order p. By Proposition 13.6, we can choose a nontrivial element z ∈ Z(G).
This generates a subgroup P ≤ Z(G) ≤ G of order p. Let g be any element of G not lying in P, and let Q
be the subgroup generated by g, which again has order p. As P ≤ Z(G), every element of P commutes with
every element of Q, so by Lemma 13.7 we can define a homomorphism
$\phi : P \times Q \to G$ by φ(x, y) = xy. Let H be the image of φ, so H is a subgroup of G, so |H| divides $|G| = p^2$,
so |H| is 1, p or $p^2$. It is clear that P ≤ H and g ∈ H \ P, so |H| ≥ p + 1, so |H| must be $p^2$. This
means that H = G, so φ is surjective. As P × Q and G have the same size, any surjective function between
them must be bijective, so φ is an isomorphism. We also know that P and Q are both isomorphic to $C_p$, so
$G \simeq P \times Q \simeq C_p \times C_p$.
Proposition 13.9. Let G be a finite group, and let P and Q be normal subgroups of orders p and q. Suppose
that p and q are coprime, and that pq = |G|. Then $G \simeq P \times Q$.
Proof. First, put r = |P ∩ Q|. As P ∩ Q is a subgroup of P , we see that r divides p. As P ∩ Q is a subgroup
of Q, we see that r also divides q. As p and q are coprime, this means that r = 1, so P ∩ Q is the trivial
group.
Now consider elements x ∈ P and y ∈ Q, and put $z = xyx^{-1}y^{-1}$. We will show that z ∈ P ∩ Q, so
that z = 1. Indeed, we have y ∈ Q and Q is normal, so $xyx^{-1} \in Q$. As $y^{-1}$ also lies in Q, we deduce that
$z = (xyx^{-1})y^{-1} \in Q$. Similarly, we know that $x^{-1} \in P$ and P is normal so $yx^{-1}y^{-1} \in P$. We also know
that x ∈ P, so $z = x(yx^{-1}y^{-1}) \in P$. This gives z ∈ P ∩ Q, so z = 1, or in other words $1 = xyx^{-1}y^{-1}$. If we
multiply this on the right by yx, we get $yx = xyx^{-1}y^{-1}yx = xy$, so x commutes with y. Lemma 13.7 now
tells us that we can define an injective homomorphism $\phi : P \times Q \to G$ by φ(x, y) = xy. As this is injective,
we have
|φ(P × Q)| = |P × Q| = pq = |G|,
so φ(P × Q) = G, so φ is also surjective. This means that φ is an isomorphism of groups.
Proposition 13.10. Let G be a group of order pq where p and q are primes and p < q. Suppose also that
q − 1 is not divisible by p. Then $G \simeq C_p \times C_q$.
Proof. We know that $n_p$ divides q and q is prime so $n_p = 1$ or $n_p = q$. However we also know that $n_p = 1$
(mod p) so p divides $n_p - 1$. We are told that p does not divide q − 1, so $n_p$ cannot be equal to q, so $n_p = 1$.
It follows that there is a unique Sylow p-subgroup, which we call P. Note that |P| = p and so $P \simeq C_p$, and
also that P is normal.
Next, we know that $n_q$ divides p, so $n_q = 1$ or $n_q = p$. We also know that $n_q = 1$ (mod q), so $n_q - 1$ is
divisible by q. Note that 0 < p − 1 < q, so p − 1 cannot be divisible by q, so we must have $n_q = 1$. We
therefore have a unique Sylow q-subgroup, which we call Q. We note that |Q| = q, so $Q \simeq C_q$, and that Q is normal.
It is now clear that the conditions of Proposition 13.9 are satisfied, so $G \simeq P \times Q \simeq C_p \times C_q$.
Proposition 13.11. Let p be a prime number with p > 2, and let G be a group with |G| = 2p. Then either
$G \simeq C_p \times C_2 \simeq C_{2p}$ or $G \simeq D_p$.
Proof. We know that $n_p$ divides 2 and that $n_p - 1$ is divisible by p. It follows that $n_p = 1$, so there is a
unique Sylow p-subgroup, which we call P. We choose a nontrivial element g ∈ P, and define an isomorphism
$\phi : C_p \to P$ by $\phi(R^i) = g^i$.
Next, let Q be a Sylow 2-subgroup, so |Q| = 2, so Q = {1, h} for some element h with $h^2 = 1$. As P is
normal, we know that $hgh^{-1} \in P$, so $hgh^{-1} = g^a$ for some a. It follows that
$h^2 g h^{-2} = h g^a h^{-1} = (hgh^{-1})^a = (g^a)^a = g^{a^2}$.
On the other hand, we have $h^2 = 1$, so $h^{-2} = 1$, so $h^2 g h^{-2} = g$. We thus have $g = g^{a^2}$, so $g^{a^2 - 1} = 1$, so
$a^2 - 1$ must be divisible by p. As p is prime and $a^2 - 1 = (a + 1)(a - 1)$, we see that either a + 1 or a − 1
must be divisible by p.
If a − 1 is divisible by p then $g^a = g$, so $hgh^{-1} = g$, so g commutes with h. In this case, the conditions of
Lemma 13.7 are satisfied and we find that $G \simeq P \times Q \simeq C_p \times C_2 \simeq C_{2p}$.
Suppose instead that a + 1 is divisible by p. This means that $g^a = g^{-1}$, so $hgh^{-1} = g^{-1}$. We then define
a function $\phi : D_p \to G$ by $\phi(R^i) = g^i$ and $\phi(R^i S) = g^i h$ for 0 ≤ i < p. It is straightforward to check that
this is a homomorphism, the key point being that $SRS^{-1} = R^{-1}$ in $D_p$, which corresponds to the relation
$hgh^{-1} = g^{-1}$ in G. The image of φ contains both P and Q, so the order of the image is divisible by p and
2 and thus by 2p. It follows that φ is surjective, but the groups $D_p$ and G have the same order, so φ must
actually be an isomorphism.
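The arithmetic step in this proof, that $a^2 = 1$ (mod p) forces a = ±1 (mod p) when p is prime, can be illustrated with a short search. The function names below are hypothetical, and the second function merely restates the two cases of the proof.

```python
def square_roots_of_one(n):
    """All a in Z_n with a*a = 1 (mod n)."""
    return [a for a in range(n) if (a * a - 1) % n == 0]

def conjugation_type(a, p):
    """Classify the relation h g h^-1 = g^a in a group of order 2p (p an odd prime)."""
    if (a - 1) % p == 0:
        return "cyclic"    # g and h commute: the case C_p x C_2
    if (a + 1) % p == 0:
        return "dihedral"  # h g h^-1 = g^-1: the case D_p
    raise ValueError("a*a must be 1 mod p")

for p in (3, 5, 7, 11, 13):
    print(p, [(a, conjugation_type(a, p)) for a in square_roots_of_one(p)])

# Contrast: for a non-prime modulus there can be more than two square roots of 1.
print(8, square_roots_of_one(8))
```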
Remark 13.12. The above proposition can be extended to show that when p and q are distinct primes,
any group of order pq is a “semidirect product” of Cp and Cq .
Now consider groups of order at most 40. Using Propositions 13.3, 13.8, 13.10 and 13.11, we can classify
all groups of the following orders:
1, 2, 3, 4, 5, 6, 7, 9, 10, 11, 13, 14, 15, 17, 19, 22, 23, 25, 26, 29, 31, 33, 34, 35, 37, 38.
More work is needed to classify groups of the remaining orders:
8, 12, 16, 18, 20, 21, 24, 27, 28, 30, 32, 36, 39, 40.
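The orders covered by the four cited propositions can be enumerated mechanically from their hypotheses. The sketch below does this for a given bound; the function names are hypothetical, and order 1 is added by hand since the trivial group needs no proposition.

```python
def is_prime(k):
    """Trial-division primality test, adequate for small k."""
    return k >= 2 and all(k % d for d in range(2, int(k ** 0.5) + 1))

def classified_orders(limit):
    """Orders up to limit handled by Propositions 13.3, 13.8, 13.10 and 13.11."""
    primes = [p for p in range(2, limit + 1) if is_prime(p)]
    covered = {1}                                    # the trivial group
    covered.update(primes)                           # 13.3: prime order
    covered.update(p * p for p in primes)            # 13.8: order p^2
    covered.update(p * q for p in primes for q in primes
                   if p < q and (q - 1) % p != 0)    # 13.10: pq with p not dividing q - 1
    covered.update(2 * p for p in primes if p > 2)   # 13.11: order 2p with p odd
    return sorted(k for k in covered if k <= limit)

covered = classified_orders(40)
remaining = [k for k in range(1, 41) if k not in covered]
print(covered)
print(remaining)
```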
Proof. Suppose $a \in \mathbb{Z}_n$. If x ∈ G then $x^n = 1$ so the powers of x depend only on the exponent modulo
n, so it makes sense to talk about $x^a$. We define a function $\alpha_a : G \to G$ by $\alpha_a(x) = x^a$. Because G is
cyclic it is Abelian and this implies that $(xy)^a = x^a y^a$ or in other words $\alpha_a(xy) = \alpha_a(x)\alpha_a(y)$, so $\alpha_a$ is
a homomorphism. Clearly also $\alpha_a\alpha_b(x) = \alpha_a(x^b) = (x^b)^a = x^{ab} = \alpha_{ab}(x)$, so $\alpha_a\alpha_b = \alpha_{ab}$. Similarly
$\alpha_1(x) = x$, so $\alpha_1$ is the identity map, and $\alpha_0(x) = 1$, so $\alpha_0$ is the trivial homomorphism.
We next claim that every homomorphism $\beta : G \to G$ has the form $\beta = \alpha_a$ for a unique element $a \in \mathbb{Z}_n$.
Indeed, we can choose a generator g ∈ G so that $G = \{1, g, \dots, g^{n-1}\} = \{g^a \mid a \in \mathbb{Z}_n\}$. We then have
β(g) ∈ G so $\beta(g) = g^a$ for a unique element $a \in \mathbb{Z}_n$. As β is a homomorphism we have
$\beta(g^i) = \beta(g)^i = (g^a)^i = g^{ia} = \alpha_a(g^i)$
for all i, so $\beta = \alpha_a$.
Now, if $a \in \mathbb{Z}_n^\times$ then a has an inverse $b \in \mathbb{Z}_n^\times$ and then $\alpha_a\alpha_b = \alpha_{ab} = \alpha_1 = 1$ and similarly $\alpha_b\alpha_a = 1$ so
$\alpha_b$ is an inverse for $\alpha_a$. This means that $\alpha_a$ is an automorphism of G, in other words $\alpha_a \in \mathrm{Aut}(G)$.
Conversely, suppose that β is an automorphism of G. Then β and $\beta^{-1}$ are homomorphisms from G to
itself, say $\beta = \alpha_a$ and $\beta^{-1} = \alpha_b$ for some $a, b \in \mathbb{Z}_n$. We then have $\alpha_{ab} = \beta\beta^{-1} = 1 = \alpha_1$, so ab = 1, so
$a \in \mathbb{Z}_n^\times$. This shows that the automorphisms of G are precisely the maps $\alpha_a$ with $a \in \mathbb{Z}_n^\times$.
We can now define a map $\phi : \mathbb{Z}_n^\times \to \mathrm{Aut}(G)$ by $\phi(a) = \alpha_a$, and we find that this is an isomorphism.
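The conclusion can be checked for small n by writing the cyclic group additively, so that $g^i$ is recorded by its exponent i in $\mathbb{Z}_n$ and $\alpha_a$ becomes multiplication by a. This additive model and the function names are assumptions made for the illustration.

```python
from math import gcd

def alpha(a, n):
    """The map g^i -> g^(a*i), stored as the tuple of image exponents for i = 0..n-1."""
    return tuple((a * i) % n for i in range(n))

def compose(f, h):
    """Composite map f∘h on exponents (apply h first)."""
    return tuple(f[h[i]] for i in range(len(h)))

def automorphism_exponents(n):
    """Exponents a for which alpha_a is a bijection of the cyclic group of order n."""
    return [a for a in range(n) if len(set(alpha(a, n))) == n]

def units(n):
    """Elements of Z_n that are invertible under multiplication."""
    return [a for a in range(n) if gcd(a, n) == 1]

n = 12
print(automorphism_exponents(n), units(n))
# alpha_a composed with alpha_b equals alpha_(ab), as in the proof.
print(compose(alpha(5, n), alpha(7, n)) == alpha((5 * 7) % n, n))
```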
Construction 14.4. Suppose that G is a group and N is a normal subgroup. For any g ∈ G we know that
$N = gNg^{-1}$. We can therefore define $\gamma_g : N \to N$ by $\gamma_g(x) = gxg^{-1}$. This is a homomorphism because
$\gamma_g(x)\gamma_g(y) = gxg^{-1}gyg^{-1} = gxyg^{-1} = \gamma_g(xy)$.
We also have
$\gamma_g(\gamma_h(x)) = \gamma_g(hxh^{-1}) = ghxh^{-1}g^{-1} = (gh)x(gh)^{-1} = \gamma_{gh}(x)$,
so $\gamma_g\gamma_h = \gamma_{gh}$. In particular, this means that $\gamma_{g^{-1}}$ is an inverse for $\gamma_g$, so $\gamma_g$ is an automorphism of N, in
other words $\gamma_g \in \mathrm{Aut}(N)$.
Construction 14.5. Now suppose that G is a semidirect product of N and Q. We define a function
$\phi : Q \to \mathrm{Aut}(N)$ by $\phi(g) = \gamma_g$. We then have $\phi(g)\phi(h) = \gamma_g\gamma_h = \gamma_{gh} = \phi(gh)$, so φ is a homomorphism.
Example 14.6. Consider the group $G = \mathbb{Z}_n \rtimes_a \mathbb{Z}_m$ as the semidirect product of $N = \{(v, 0) \mid v \in \mathbb{Z}_n\}$ and
$Q = \{(0, w) \mid w \in \mathbb{Z}_m\}$. We then have $\mathrm{Aut}(N) \simeq \mathbb{Z}_n^\times$ and $Q \simeq \mathbb{Z}_m$. We also have
$\gamma_{(0,w)}(v, 0) = (0, w)(v, 0)(0, -w) = (a^w v, w)(0, -w) = (a^w v, 0)$.
This means that the automorphism $\gamma_{(0,w)} \in \mathrm{Aut}(N)$ corresponds to the element $a^w \in \mathbb{Z}_n^\times$ and that the
homomorphism $\phi : \mathbb{Z}_m \to \mathbb{Z}_n^\times$ is given by $\phi(w) = a^w$.
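The conjugation formula can be verified directly on a small case. The sketch below assumes the multiplication rule $(v, w)(v', w') = (v + a^w v', w + w')$, which is consistent with the computation in the example but is not restated in this section, and it requires $a^m = 1$ (mod n). The function names and the choice n = 7, m = 3, a = 2 are purely illustrative.

```python
def make_semidirect(n, m, a):
    """Return (mul, inv) for Z_n ⋊_a Z_m, assuming a^m = 1 (mod n)."""
    assert pow(a, m, n) == 1 % n

    def mul(x, y):
        (v1, w1), (v2, w2) = x, y
        # Assumed rule: (v1, w1)(v2, w2) = (v1 + a^w1 * v2, w1 + w2).
        return ((v1 + pow(a, w1, n) * v2) % n, (w1 + w2) % m)

    def inv(x):
        v, w = x
        # (v, w)^-1 = (-a^(-w) v, -w), and a^(-w) = a^(m-w) because a^m = 1.
        return ((-pow(a, (-w) % m, n) * v) % n, (-w) % m)

    return mul, inv

n, m, a = 7, 3, 2  # 2^3 = 8 = 1 (mod 7)
mul, inv = make_semidirect(n, m, a)

def gamma(g, x):
    """Conjugation gamma_g(x) = g x g^-1."""
    return mul(mul(g, x), inv(g))

# gamma_(0,w) acts on N = {(v, 0)} as multiplication by a^w.
formula_holds = all(
    gamma((0, w), (v, 0)) == ((pow(a, w, n) * v) % n, 0)
    for v in range(n) for w in range(m)
)
print(formula_holds)
```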