Math 124 Notes Fall 2023

Math 124: Number Theory, taught by Mark Kisin with Hahn Lheem as the course assistant in Fall 2023, covers topics such as unique prime factorizations, congruences, and Diophantine equations. The course has no prerequisites and includes weekly problem sets, a midterm, and a final exam based on homework problems. The main textbook is Ireland-Rosen's 'A Classical Introduction to Modern Number Theory,' and students are encouraged to ask questions if they encounter difficulties.

Uploaded by

Chí Vũ

Math 124: Number Theory

Hahn Lheem (as Course Assistant)


Taught by Mark Kisin
Fall 2023

Contents

0 Preface

1 09/08 - Unique Prime Factorizations for Integers
  1.1 Notation
  1.2 Integers have unique prime factorization
  1.3 Proving Uniqueness
  1.4 Unique Factorization of k[x]

2 9/11 - Generalizing Unique Factorization
  2.1 Unique Factorization of k[x], cont.
  2.2 Euclidean Domains

3 9/15 - Results on Primes
  3.1 Infinitely Many Primes in the Integers
  3.2 Infinitely Many Primes for Polynomials

4 9/18 - Proving Weaker Version of Prime Number Theorem
  4.1 Upper Bound on π(x)
  4.2 Lower Bound on π(x)
  4.3 Modular Congruence

5 9/22 - Euler’s Totient
  5.1 Proving Euler’s Totient Formula
  5.2 Möbius Inversion
  5.3 Euler’s Totient Theorem

6 09/25 - Unit Groups
  6.1 Proving Existence of Primitive Root
  6.2 Structure of Unit Groups

7 09/29 - Quadratic Reciprocity
  7.1 Motivation
  7.2 Quadratic Residues
  7.3 Proof of Quadratic Reciprocity, Step 1
  7.4 Proof of Quadratic Reciprocity, Step 2
  7.5 Proof of Quadratic Reciprocity, Step 3

8 10/02 - Algebraic Numbers & Integers
  8.1 Algebraic Numbers
  8.2 Algebraic Numbers (Integers) form a Field (Ring)
  8.3 Properties of Algebraic Numbers
  8.4 Quadratic Character of 2

9 10/06 - Quadratic Gauss Sums
  9.1 Gauss Sum
  9.2 Second Proof of Quadratic Reciprocity
  9.3 Kronecker’s Result for Quadratic Extensions

10 10/13 - Finite Fields
  10.1 Construction of Finite Fields
  10.2 Existence of Fq

11 10/16 - Finite Fields, continued
  11.1 Completing Proof of Existence
  11.2 Uniqueness of Fq
  11.3 Interlude: Galois theory preview
  11.4 Proof 2.5 of Quadratic Reciprocity

12 10/23 - Diophantine Equations
  12.1 Gaussian Integers, a review
  12.2 Irreducible Elements in Gaussian Integers
  12.3 Pythagorean Triples

13 10/27 - More Diophantine Equations
  13.1 Method of Infinite Descent
  13.2 Fermat’s Last Theorem for n = 4
  13.3 Sophie Germain’s Theorem

14 10/30 - Fermat’s Last Theorem for n = 3
  14.1 Eisenstein Integers
  14.2 Properties of λ = 1 − ω
  14.3 Proving Theorem 14.1

15 11/03 - Pell’s Equations
  15.1 Approximating with Fractions
  15.2 Proving Solutions to Pell’s Equation

16 11/06 - More on Pell’s Equation
  16.1 Motivation for (1 + √d)/2
  16.2 Units of Ring of Integers
  16.3 Finding Solutions when d ≡ 5 (mod 8)

17 11/10 - Dirichlet’s Theorem, an Introduction
  17.1 Riemann Zeta Function
  17.2 Euler Factorization
  17.3 Dirichlet Density
  17.4 Dirichlet L-functions
  17.5 Dirichlet’s Theorem for m = 4

18 11/13 - Dirichlet Characters
  18.1 Dual Group
  18.2 Orthogonality Relations
  18.3 Dirichlet L-functions

19 11/20 - Dirichlet’s Theorem, Part II
  19.1 Proof of Dirichlet’s Theorem

20 11/27 - Proving Proposition 19.3
  20.1 Reducing to Analytic Continuation
  20.2 Analytic Continuation for Riemann Zeta

21 12/01 - Proving Theorem 20.1
  21.1 Proof of Analytic Continuation of L(s, χ)
  21.2 Evaluating L(1, χ)

22 12/04 - Last Lecture
  22.1 So... what is an L-function?

0 Preface
This class is from the Fall 2023 semester. Meeting times are Monday and Friday from
12-1:15pm in SC221. The main textbook is Ireland-Rosen’s A Classical Introduction
to Modern Number Theory.[1] There are no prerequisites to this course, so “if you don’t
understand something, it’s [Kisin’s] fault, not yours.” Bottom line: don’t be afraid to
ask questions!
Problem sets will be assigned approximately weekly. Kisin’s office hours will be 2-
3pm on Wednesdays in SC232. Office hours and section times for the course assistants
can be found on Canvas. There will be both a midterm and final exam, which will be
“absolutely routine if you do all of the homework.” (The problems will be taken from
the homework.)
If you see anything wrong or unclear, let me know at [email protected]!

1 09/08 - Unique Prime Factorizations for Integers


A little bit more review of the syllabus. From Ireland-Rosen, the basic plan is to cover
chapters 1-8, 10, 11, 16, and 17. The topics include:

• unique factorization,

• congruences,

• Quadratic Reciprocity,

• equations over finite fields,

• Diophantine equations.

The magic of number theory is that very sophisticated tools and results can be
developed from the most simple techniques. This means that the beginning of the
course may feel slow, perhaps even underwhelming, but it comes together very nicely
(and quickly!) in the latter half of the semester. So hold your horses.

1.1 Notation
Always good to establish from the get-go.

• Z := {0, ±1, ±2, . . . }

• N+ = {1, 2, 3, . . . }

• ∈ indicates an element of a set, e.g. 5 ∈ N+

• For a, b ∈ Z, a | b means “a divides b”

[1] (Footnote from the Preface.) Kisin believes this textbook should be titled “An Elementary Introduction to Classical Number
Theory”; I share this sentiment.

1.2 Integers have unique prime factorization


A longstanding theme of number theory is that integers are understood from the primes.
This makes sense: every number can be factorized into primes. Thus, it is important
to have an unambiguous definition of prime numbers from the start.

Definition 1.1 (Prime Number). An integer p ∈ N+ is a prime number if p > 1
and its only divisors are ±1, ±p, i.e. if a ∈ Z satisfies a | p, then a is either ±1
or ±p.

As mentioned already, primes are important because every integer factors into primes:

Theorem 1.2 (Unique factorization of integers)
If n ∈ N+, then n is a product of primes in a unique way. In other words, we
can uniquely express $n = p_1^{a_1} p_2^{a_2} \cdots p_s^{a_s}$, where $p_1 < p_2 < \cdots < p_s$ are prime and
each $a_i$ ∈ N+.

Remark 1.3. For n = 1, we take the empty product, i.e. when s = 0.

We’re all familiar with this theorem; perhaps we use it every day. But just how
“obvious” is this theorem? Is it immediate from the definition?
The theorem statement has two parts. First, we need to show
that a prime factorization actually exists, and then we need to show that it is
unique. This distinction is important: there are some worlds where existence holds,
but the factorization is not unique! (For those who have taken Math 122, this may be
familiar. Otherwise, an example I like to give is that you can write 6 = 2 · 3, but if you
allow for terms with √−5 then suddenly you can also write 6 = (1 + √−5)(1 − √−5).
So the factorization is not unique.)

1.3 Proving Uniqueness


Because this uniqueness condition is harder to satisfy, it is harder to prove. The ex-
istence condition is almost immediate from the definition, which we now demonstrate
with the following lemma:

Lemma 1.4
Every n ∈ N+ is a product of primes.

Proof. We prove by (strong) induction on n. For n = 1, we take the empty product,
aka the product of no primes, so our base case is satisfied.
Now suppose the statement is true for all 1 ≤ m < n. Consider n. If n is prime,
take the obvious factorization n = n, done. Otherwise, if n is not prime, then n must
have some divisor a where 1 < a < n. The quotient n/a must also be an integer, call it b,
so n = a · b, and 1 < b < n because 1 < a < n. Since a, b < n, by our inductive hypothesis, both a and b are products
of primes! Therefore, n = a · b is also a product of primes, as desired.
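
The induction in this proof can be read as a recursive procedure. Below is a minimal Python sketch under that reading; the function name `factor` and the choice to search for the smallest divisor are illustrative assumptions, not something from the lecture.

```python
def factor(n: int) -> list[int]:
    """Return a list of primes whose product is n (for n >= 1).

    Mirrors the strong induction of Lemma 1.4: either n is prime, or
    n = a * b with 1 < a, b < n and we recurse on both factors.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    if n == 1:
        return []  # the empty product
    a = 2
    # If n has a divisor strictly between 1 and n, it has one with a * a <= n.
    while a * a <= n:
        if n % a == 0:
            b = n // a  # the quotient, also strictly between 1 and n
            return factor(a) + factor(b)
        a += 1
    return [n]  # no proper divisor was found, so n is prime


print(factor(36))  # a list of primes multiplying to 36
```

The recursion terminates for the same reason the induction works: both a and b are strictly smaller than n.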

So the subtle thing we are using here that makes this work is the very convenient
fact that the positive integers are well-ordered: we can compare the sizes of any two integers a, b,
and every nonempty set of positive integers has a smallest element (which is what justifies induction).
This is used in the fact that the two divisors a, b are both less than
n, which allows us to activate the inductive hypothesis.
Again, this may not seem the most enlightening stuff now, but when we grow up
from N+ to some other set of numbers and try to prove similar statements, the work
we’ve done here will become very handy.
Now, we want to prove that the factorization is unique, up to ordering. For instance,
we can factor 36 = 6 · 6 = 2 · 3 · 2 · 3, or 36 = 4 · 9 = 2 · 2 · 3 · 3, which are the same up
to ordering of the factors. We want to show this is always the case, no matter which n
we choose.
We first develop a definition for greatest common divisor, an important concept that
will come up again and again.

Definition 1.5. If a1, a2, . . . , an ∈ Z, define (a1, a2, . . . , an) = {a1 x1 + a2 x2 + · · · +
an xn | xi ∈ Z}.

Remark 1.6. For those familiar with ring theory, this is the ideal in Z generated
by a1 , a2 , . . . , an .

Definition 1.7. If a, b ∈ Z, an integer d is called a greatest common divisor
(gcd) of a, b if

1. d | a and d | b, and

2. whenever c ∈ Z satisfies c | a and c | b, then c | d.

It might seem weird at first, but if you think about it for a moment, it makes perfect sense.

Remark 1.8. You might be wondering why we don’t replace the second condition
of gcd with the more familiar condition “whenever c | a and c | b, then c ≤ d.” There
are a few reasons why we don’t want to do this, the first being that this takes
away some factors which would qualify as gcd. For instance, gcd(36, 60) = ±12,
but using the ≤ condition would remove the −12 possibility. This may not be a
concern now, but it would cause problems when we go beyond the integers.
The second reason is that it’s just not necessary. Why invoke
the ordering of the integers when we don’t need to? This is a good thing to
practice in math: if you don’t need something, don’t use it. You’ll thank yourself
in the long run.

Okay, we have this definition. Now we want to know this gcd always exists.

Lemma 1.9 (Existence of gcd)


If a, b ∈ Z, then (a, b) = dZ = {d · n | n ∈ Z} for some d ∈ Z, and d is a gcd for a
and b.

Proof. Let’s take care of the stupid case first. If a = b = 0, then (a, b) = {0} = 0 · Z and d = 0 is a gcd. Now
assume a and b are not both 0. Let d be the smallest positive element of (a, b); one exists because
(a, b) contains |a| and |b|, at least one of which is positive. We will show (a, b) = dZ.
When we want to show equality of two sets, it is customary to show that each contains
the other. One inclusion is immediate: since d ∈ (a, b) and (a, b) is closed under multiplication by integers, it
is clear that d · Z ⊆ (a, b). (⊆ means “is contained in”.) So we are left to prove (a, b) ⊆ d · Z.
Suppose e ∈ (a, b). We invoke the Division Algorithm, which tells us that e = q · d + r
where q, r ∈ Z and 0 ≤ r < d. (This is just saying when you divide e by d, you get a
quotient q with a remainder r < d.) Then, r = e − qd, so r ∈ (a, b) since d, e ∈ (a, b).
But since we chose d to be the smallest positive element of (a, b), this forces r = 0,
which means e = q · d ∈ d · Z. Since this works for any e ∈ (a, b), we conclude
(a, b) ⊆ d · Z, and equality of the two sets follows.
Now we must address the last part of the result: showing that d is a gcd of a, b. Showing
that d is a common factor is immediate: a, b ∈ (a, b) = dZ, so d | a and d | b. Also note
that d ∈ (a, b), so there are integers x, y such that d = ax + by. Thus, if an integer
c ∈ Z satisfies c | a and c | b, then c | ax + by = d, so d is a gcd of a, b.
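
The proof above is not constructive: it only says a smallest positive element of (a, b) exists. To actually compute d together with integers x, y such that d = ax + by, one standard route is the extended Euclidean algorithm, which applies the Division Algorithm repeatedly. The sketch below uses that swapped-in technique, not the argument of the proof; the name `extended_gcd` is an illustrative assumption.

```python
def extended_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (d, x, y) with d = a*x + b*y, where d >= 0 is a gcd of a and b."""
    # Invariants: r0 == a*x0 + b*y0 and r1 == a*x1 + b*y1.
    r0, x0, y0 = a, 1, 0
    r1, x1, y1 = b, 0, 1
    while r1 != 0:
        q = r0 // r1  # quotient from the Division Algorithm
        r0, r1 = r1, r0 - q * r1
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    if r0 < 0:  # normalise the sign; -d is a gcd whenever d is
        r0, x0, y0 = -r0, -x0, -y0
    return r0, x0, y0


d, x, y = extended_gcd(36, 60)
print(d, x, y, 36 * x + 60 * y)  # the last value equals d
```

Every element of (a, b) is then a multiple of the returned d, matching (a, b) = dZ.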

Just as a cook must know the ingredients in a recipe and not just the steps, we
should understand what are the key things being used in our proofs. Here, the main
content of the proof unravels from the Division Algorithm, a simple/intuitive yet very
important result in number theory.
We continue to develop some more definitions and results for our proof of unique
prime factorization.

Definition 1.10 (Coprime). If a, b ∈ Z, we say a, b are coprime if (a, b) = Z.
(Note: from above, this means gcd(a, b) = 1, i.e. if c | a and c | b, then c = ±1.)

Remark 1.11. Notation: if (a, b) = dZ, i.e. d = gcd(a, b), then we abuse notation
by suppressing the gcd and writing (a, b) = d.

Lemma 1.12
If a | bc and (a, b) = 1, then a | c.

Proof. This follows from previous definitions. As (a, b) = Z (in particular, 1 ∈ (a, b)),
there exist r, s ∈ Z such that 1 = ra + sb. Thus, c = rac + sbc. Clearly, a | rac, and
by assumption, a | s · bc, so a | c as desired.

What we will use in our proof of unique factorization is a specific version of this
where a = p is a prime.

Corollary 1.13
If p is prime and p | bc, then p | b or p | c.

Proof. As the only factors of p are ±1, ±p, we either have (p, b) = 1, in which case p | c
from above, or (p, b) = p, in which case p | b.

One more definition, for ease of notation.

Definition 1.14 (Order). If p is prime and n ∈ Z is nonzero, then $\mathrm{ord}_p\, n$ is the largest integer
a such that $p^a \mid n$.

A fact about orders:

Corollary 1.15
If p is prime, and a, b ∈ Z are nonzero, then $\mathrm{ord}_p(ab) = \mathrm{ord}_p(a) + \mathrm{ord}_p(b)$.

Proof. Let $\alpha = \mathrm{ord}_p\, a$, $\beta = \mathrm{ord}_p\, b$. By definition, $a = p^{\alpha} \cdot a'$ and $b = p^{\beta} \cdot b'$, where
p ∤ a′, b′. We see $ab = p^{\alpha+\beta} a' b'$. Because p ∤ a′ and p ∤ b′ together mean p ∤ a′b′ (Corollary 1.13), we get
$\mathrm{ord}_p(ab) = \alpha + \beta = \mathrm{ord}_p\, a + \mathrm{ord}_p\, b$, as desired.

Note that we can apply the above corollary repeatedly, e.g. $\mathrm{ord}_p(abc) = \mathrm{ord}_p(ab) + \mathrm{ord}_p\, c = \mathrm{ord}_p\, a + \mathrm{ord}_p\, b + \mathrm{ord}_p\, c$. Now, finally, we are ready to prove uniqueness of prime
factorization.

Proof of second part of Theorem 1.2. Let n ∈ N+. Write $n = p_1^{a_1} p_2^{a_2} \cdots p_s^{a_s}$, where the
$p_i$’s are distinct primes and $a_i > 0$. (Note that we already proved a prime factorization
exists, so this is fair game.) We want to show that the exponent of any prime p in the
factorization depends only on n, not on the choice of factorization.
Let $a := \mathrm{ord}_p\, n$. By the above corollary, we have
$\mathrm{ord}_p\, n = \mathrm{ord}_p(p_1^{a_1}) + \mathrm{ord}_p(p_2^{a_2}) + \cdots + \mathrm{ord}_p(p_s^{a_s})$.
If $p_i \neq p$, then p does not divide $p_i$, so $\mathrm{ord}_p\, p_i^{a_i} = 0$. On the other
hand, if $p_i = p$, then $\mathrm{ord}_p\, n = \mathrm{ord}_p\, p_i^{a_i} = a_i$, so $a = a_i$. Since a is determined by n alone
($a = \mathrm{ord}_p\, n$), $a_i$ is only dependent on n as well. In particular, we can write

\[ n = \prod_{p \text{ prime}} p^{\mathrm{ord}_p n}, \]

and this is the unique factorization.
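
As a small sanity check, the order function, its additivity, and the product formula can be tried out numerically. This is only a sketch; the helper names `ord_p`, `is_prime` and `rebuild` are made up for illustration, and primality is tested by naive trial division.

```python
from math import isqrt, prod


def ord_p(p: int, n: int) -> int:
    """Largest a with p**a dividing n, for a prime p and a nonzero integer n."""
    if n == 0:
        raise ValueError("ord_p is only defined here for nonzero n")
    a = 0
    while n % p == 0:
        n //= p
        a += 1
    return a


def is_prime(p: int) -> bool:
    """Naive trial division; fine for small inputs."""
    return p > 1 and all(p % a != 0 for a in range(2, isqrt(p) + 1))


def rebuild(n: int) -> int:
    """Product of p**ord_p(n) over primes p <= n; should give back n for n >= 1."""
    return prod(p ** ord_p(p, n) for p in range(2, n + 1) if is_prime(p))


# Additivity (Corollary 1.15) and the product formula on one example.
print(ord_p(3, 36 * 60), ord_p(3, 36) + ord_p(3, 60))
print(rebuild(360))
```

Only primes p ≤ n need to be considered in `rebuild`, since larger primes have order 0 and contribute a factor of 1.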

1.4 Unique Factorization of k[x]


I warned you. We graduate from the integers and now move on to polynomials.
Let k be a field. If you don’t know what a field is, think either Q (the rational
numbers), R (the real numbers), or even C (the complex numbers). A field, very
briefly, is a structure with addition and multiplication, where all nonzero elements have
a multiplicative inverse. (So for example, Z is not a field, since the multiplicative inverse
of 2 is 1/2 which is not an integer.)
Denote k[X] as the set of polynomials in X over k (i.e. with coefficients in k), so
formally

\[ k[X] = \{ f(X) := a_0 + a_1 X + \cdots + a_n X^n \mid a_i \in k,\ a_n \neq 0 \text{ if } n > 0 \}. \]

For example, if k = R, then $2 + \pi X + eX^2$ is a polynomial in R[X]. If $f(X) = a_0 + a_1 X + \cdots + a_n X^n$, where $a_n \neq 0$, we say the degree of f is deg f = n.

2 9/11 - Generalizing Unique Factorization


2.1 Unique Factorization of k[x], cont.
Today, we will try to transfer our work in Z (namely, unique factorization) to our new set
k[X]. Let’s lay out some definitions beforehand to ease our work moving forward:

Definition 2.1 (Irreducible). Let f ∈ k[x], and deg f > 0. Then, f is called
irreducible if whenever f = gh, either deg g = 0 or deg h = 0.

Definition 2.2 (Monic). A polynomial $f(X) = a_0 + a_1 X + \cdots + a_m X^m$ with
m = deg f > 0 is called monic if $a_m = 1$.

Finally, we specify the units of our ring k[x]. In Z, the units are {±1}, as they are the
only two integers with a multiplicative inverse. In k[x], try to convince yourself of the
following:

Exercise 2.3. The units of k[x] are $(k[x])^\times = k^\times = k - \{0\}$.

Given this, we have the following theorem, analogous to unique prime factorization in
Z.

Theorem 2.4 (analogous to Theorem 1.2)
Every nonzero f ∈ k[x] has a factorization into irreducibles $f = c \cdot f_1 \cdots f_n$ with $c \in k^\times$, unique
up to reordering the factors and rescaling them by elements of $k^\times$.

Turns out the proof of this is almost identical to what we did last time, which is why we
went through it in the first place! So our approach will be, like last time, to first prove
existence of such a factorization, then prove its uniqueness. We continue expanding our
analogy between Z and k[x] in order to prove this theorem.

Set      Z       k[x]
Units    {±1}    k×
Size     |n|     deg f

Proof of existence. We may assume f is monic and nonconstant. Like with our proof last time, we
show a factorization exists via induction, this time on the degree. Any linear polynomial (i.e.
deg f = 1) is irreducible, so the base case is satisfied. Now suppose a factorization
exists for all h such that 0 < deg h < n. Take a degree-n polynomial f. If f is irreducible,
great, we have the trivial factorization f = f. Otherwise, we can write f = gg′, where
0 < deg g, deg g′ < deg f. By the inductive hypothesis, each of g, g′ has a factorization into
irreducibles, so the product gg′ = f does as well.

Recall the proof of uniqueness in the integer scenario was solo carried by the wonderful
thing called the Division Algorithm: given a, b ∈ Z (b ≠ 0), there exist integers
q, r ∈ Z such that a = bq + r and 0 ≤ r < |b|. To have the same proof apply to k[x],
we need an analogue of this result. Because this was pretty intuitive for the integers, we
omitted a proof there, but here we will need to put in the work.

Lemma 2.5 (Division Algorithm for Polynomials)


If f, g ∈ k[x], g ≠ 0, then there exist q, r ∈ k[x] such that f = qg + r and either
deg r < deg g or r = 0.

Proof. Consider the set S := {f − ζg : ζ ∈ k[x]}. If 0 ∈ S, take r = 0; otherwise, let r be an element of S with minimal
degree. By definition, we can write r = f − qg for some q ∈ k[x]. If r = 0 we are done, so assume r ≠ 0. We have two cases
for g.
First, suppose g ∈ k×. Then, we can set q = f/g, in which case r = 0. That’s good.
Now, suppose deg g > 0. We wish to show deg r < deg g. Write $r = ax^{\ell} + \ldots$ and
$g = bx^{m} + \ldots$ with a, b ≠ 0, so deg r = ℓ and deg g = m. Suppose for the sake of contradiction
that deg r ≥ deg g, i.e. ℓ ≥ m. Then, we have $r - ab^{-1}x^{\ell-m}g = f - (q + ab^{-1}x^{\ell-m})g$, so it is in S,
but the leading terms cancel, so $\deg(r - ab^{-1}x^{\ell-m}g) < \deg r$ (or the difference is 0), contradicting the minimality of r in S. Thus,
deg r < deg g, and we conclude.
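
The cancellation step in this proof (subtracting $ab^{-1}x^{\ell-m}g$ to kill the leading term) is exactly one step of polynomial long division. Here is a minimal sketch over k = Q, assuming polynomials are stored as coefficient lists `[a0, a1, ...]` with the zero polynomial as `[]`; the representation and the name `poly_divmod` are illustrative choices, not from the lecture.

```python
from fractions import Fraction


def poly_divmod(f, g):
    """Divide f by g in Q[x]. Returns (q, r) with f = q*g + r and r == [] or deg r < deg g."""
    g = [Fraction(c) for c in g]
    while g and g[-1] == 0:  # strip zero leading coefficients
        g.pop()
    if not g:
        raise ZeroDivisionError("g must be nonzero")
    r = [Fraction(c) for c in f]
    while r and r[-1] == 0:
        r.pop()
    q = [Fraction(0)] * max(len(r) - len(g) + 1, 0)
    while len(r) >= len(g):          # i.e. r != 0 and deg r >= deg g
        shift = len(r) - len(g)      # this is l - m in the proof
        coeff = r[-1] / g[-1]        # this is a * b^(-1) in the proof
        q[shift] += coeff
        for i, c in enumerate(g):    # r <- r - coeff * x^shift * g
            r[shift + i] -= coeff * c
        while r and r[-1] == 0:      # the leading term has cancelled
            r.pop()
    return q, r


# (x^3 - 1) divided by (x - 1)
print(poly_divmod([-1, 0, 0, 1], [-1, 1]))
```

Using `Fraction` keeps the arithmetic exact, so the leading coefficient really does cancel to 0 at each step and the loop terminates.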

This stuff may look familiar to you, because it is from last lecture.

Definition 2.6. If f1 , . . . , fn ∈ k[x], denote

(f1 , . . . , fn ) := {f1 h1 + · · · + fn hn : hi ∈ k[x]} ⊂ k[x].

(Again, this is the ideal generated by f1 , . . . , fn .)

Definition 2.7 (GCD). If f, g ∈ k[x], then d ∈ k[x] is a greatest common
divisor (gcd) for f, g if

1. d | f and d | g, and

2. if c ∈ k[x] satisfies c | f and c | g, then c | d.

How do we get the GCD? Well...

Lemma 2.8 (analogous to Lemma 1.9)


If f, g ∈ k[x], then (f, g) = (d) = d · k[x] for some d ∈ k[x], and d is a gcd for f, g.

Proof. If f = g = 0, take d = 0. Otherwise, let d be a nonzero element of minimal degree in (f, g). It is clear (d) ⊆ (f, g), so we
now prove the reverse inclusion. If c ∈ (f, g), then by the Division Algorithm (Lemma 2.5),
we can write c = qd + r where deg r < deg d or r = 0. Rewrite as r = c − qd. Because
c, d ∈ (f, g), we have r ∈ (f, g) as well. This forces r = 0, because otherwise we would
have deg r < deg d, contradicting the minimality of d. Thus r = 0 and
c = qd, namely c ∈ (d). This means (f, g) ⊆ (d), so equality follows. This completes
the first part of this lemma.
Now we show that this choice of d is a gcd of f, g. First, note that (f, g) = (d),
so in particular both f ∈ (d) and g ∈ (d), meaning d | f and d | g, respectively. Let
c ∈ k[x] where c | f and c | g. Then, c | af + bg for any a, b ∈ k[x]. But note d ∈ (f, g),
so d = a′ f + b′ g for some a′ , b′ ∈ k[x] by definition! Thus, c | d, as desired.

We continue to make headway.

Definition 2.9 (Relatively Prime). We say f, g ∈ k[x] are relatively prime if
(f, g) = (1) = k[x].

Lemma 2.10 (analogous to Lemma 1.12)


If f, g ∈ k[x] are relatively prime, and h ∈ k[x] such that f | gh, then f | h.

Proof. Since (f, g) = (1), we may write 1 = af + bg for some a, b ∈ k[x]. Then,
h = af h + bgh. Clearly, f | af h, and f | b · gh, so f | h, as desired.

Corollary 2.11
If p ∈ k[x] is irreducible and p | f g, then p | f or p | g.

Proof. Let (p, f) = (d), so p = d · d′ for some d′ ∈ k[x]. By definition of irreducible,
either d or d′ is a unit, i.e. either (d) = (1), or (d′) = (1) and hence (d) = (p). In the latter case, f ∈ (d) = (p), so p | f.
Otherwise, if (d) = (1), then p | g by the above lemma, and we conclude.

Definition 2.12 (Order). For p ∈ k[x] irreducible and f ∈ k[x] nonzero, define the order
$\mathrm{ord}_p f$ as the largest integer a such that $p^a \mid f$.

Corollary 2.13 (analogous to Corollary 1.15)
If p ∈ k[x] is irreducible and f, g ∈ k[x] are nonzero, then $\mathrm{ord}_p(fg) = \mathrm{ord}_p f + \mathrm{ord}_p g$.

This is left as an exercise, but you should just follow the proof for what we did in Z.
Now we are ready to complete the proof of Theorem 2.4 by proving uniqueness of
factorization.

Proof of uniqueness. Let f ∈ k[x] be nonzero. We already proved it has a factorization into irreducibles,
so write $f = c \cdot f_1^{a_1} \cdots f_n^{a_n}$ where the $f_i$’s are distinct irreducibles, $a_i$ ∈ N+, and c ∈ k×.
We may assume that f and the $f_i$’s are monic and c = 1. The exponents $a_i$ are
uniquely determined by orders: if $p = f_i$, then $a_i = \mathrm{ord}_p f$ (by Corollary 2.13). But the
order is determined solely by f, so the factorization is indeed only dependent on f, i.e.
it is unique.

Hooray! Time to celebrate.

2.2 Euclidean Domains


But what’s so special about Z and k[x]? Surely, if this entire thread of results holds
for these two rings/sets, then shouldn’t we be able to apply the same reasoning to any
ring given sufficient properties?
This is why we care about not just the steps of the proofs, but also what we use,
because then we can extract exactly what information we are using, and set that to be
our definition. It turns out that the specific set of properties we need for all of these
proofs to translate nicely gives rise to a ring called a Euclidean domain. We will now
define this.
First, a Euclidean domain is a type of integral domain. Most things in life are
integral domains, so Kisin does not define this in class, but I will add a short definition.

Definition 2.14 (Integral Domain). A commutative ring R with 1 ≠ 0 is an integral domain if ab = 0 ⟹
a = 0 or b = 0.

Again, most things are integral domains. For instance, Z[i] = {a + bi | a, b ∈ Z} is an
integral domain, as are Z[√2] = {a + b√2 | a, b ∈ Z} and $\mathbb{Z}[e^{2\pi i/3}] = \{a + b \cdot e^{2\pi i/3} \mid a, b \in \mathbb{Z}\}$.
A non-example of an integral domain is Z/6Z, since 2 · 3 = 0 but 2, 3 ≠ 0.

Remark 2.15. (bypassing rings) If you don’t know what a ring is, think of an
integral domain R as a subset of C such that 0, 1 ∈ R and the set is closed under
addition, subtraction, and multiplication. This will be sufficient most of the time for this
class, I think.

Definition 2.16 (Euclidean Domain). An integral domain R is called Euclidean
if there exists a map λ : R → Z≥0 (the non-negative integers) such that for any
a, b ∈ R, b ≠ 0, there exist q, r ∈ R such that a = qb + r and either λ(r) < λ(b)
or r = 0.

Let’s look at an example of a Euclidean domain.



Claim 2.17. Z[i] = {a + bi | a, b ∈ Z} (here, i = √−1) is a Euclidean domain.

Proof. Let us take the function $\lambda(a + bi) = a^2 + b^2 = (a + bi)(a - bi)$. (We will use this formula
even when a, b ∈ R, not just in Z.) It is clear that λ outputs non-negative integers on Z[i], as
$a^2, b^2$ ∈ Z≥0. Furthermore, λ is multiplicative, i.e. λ((a+bi)(c+di)) = λ(a+bi)λ(c+di).
(You can prove this by expanding everything out directly, or by using basic facts about
conjugation together with λ(a + bi) = (a + bi)(a − bi).)
Now we want to show that we can construct a Division Algorithm using λ. Take
α, γ ∈ Z[i], γ ≠ 0, and let α/γ = r + si, where r, s ∈ R. We can choose integers
m, n ∈ Z such that |r − m| ≤ 1/2 and |s − n| ≤ 1/2 (just choose the closest integers to
r and s respectively). Let δ = m + ni ∈ Z[i] and ζ = α − γδ.
If ζ = 0, then α = γδ, which satisfies the form we want. Otherwise, we have

\[ \lambda(\zeta) = \lambda(\alpha - \gamma\delta) = \lambda(\gamma)\,\lambda(\alpha/\gamma - \delta) \le \tfrac{1}{2}\lambda(\gamma) < \lambda(\gamma) \]

when γ ≠ 0. The middle inequality follows from the observation

\[ \lambda(\alpha/\gamma - \delta) = \lambda((r - m) + (s - n)i) = (r - m)^2 + (s - n)^2 \le \tfrac{1}{4} + \tfrac{1}{4} = \tfrac{1}{2}. \]

The inequality λ(ζ) < λ(γ), with quotient δ and remainder ζ, is exactly what the definition of a Euclidean domain requires, as desired.
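
The rounding recipe in this proof can be carried out with exact integer arithmetic. The sketch below assumes a Gaussian integer a + bi is stored as a pair `(a, b)`; the names `norm`, `gauss_mul` and `gauss_divmod` are illustrative, not from the lecture.

```python
def norm(z):
    """lambda(a + bi) = a^2 + b^2."""
    a, b = z
    return a * a + b * b


def gauss_mul(z, w):
    """(a + bi)(c + di) = (ac - bd) + (ad + bc)i."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)


def gauss_divmod(alpha, gamma):
    """Return (delta, zeta) with alpha = gamma*delta + zeta and 2*norm(zeta) <= norm(gamma)."""
    n = norm(gamma)
    if n == 0:
        raise ZeroDivisionError("gamma must be nonzero")
    a, b = alpha
    c, d = gamma
    # alpha / gamma = alpha * conj(gamma) / n = (re + im*i) / n
    re = a * c + b * d
    im = b * c - a * d
    # Nearest integers to re/n and im/n, i.e. floor(x + 1/2), computed exactly.
    m = (2 * re + n) // (2 * n)
    k = (2 * im + n) // (2 * n)
    delta = (m, k)
    gd = gauss_mul(gamma, delta)
    zeta = (a - gd[0], b - gd[1])
    return delta, zeta


delta, zeta = gauss_divmod((11, 10), (4, 1))
print(delta, zeta, norm(zeta), norm((4, 1)))
```

Avoiding floating point here keeps the bound λ(ζ) ≤ λ(γ)/2 exact rather than approximate.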

3 9/15 - Results on Primes


Last time, we wrapped up our discussion of unique factorization into irreducibles. We
continue our study of primes, as they can be thought of as the “building blocks” of
everything in number theory, but with a slightly different focus.

3.1 Infinitely Many Primes in the Integers


The first question we’ll answer is

How many primes are there in Z?

We can answer this from a remarkably simple result given by Euclid.

Theorem 3.1 (Euclid)


There are infinitely many primes.

Proof. Suppose there are only finitely many primes, so {p1, . . . , pn} is our complete list
of distinct primes. Consider z = p1 p2 · · · pn + 1. If some pi | z, then pi | z − p1 p2 · · · pn = 1, which is impossible,
so pi ∤ z for every i. But we know z > 1 has a prime factorization, so there must exist a prime dividing
z but not contained in our list {p1, . . . , pn}. This contradicts our assumption of finitely
many primes.
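
Note that the proof does not claim z itself is prime, only that z has a prime factor outside the list. A quick sketch with the first six primes as the (pretend) complete list illustrates this; the helper name `smallest_prime_factor` is an illustrative assumption.

```python
from math import prod


def smallest_prime_factor(z: int) -> int:
    """Smallest prime dividing z (for z > 1), by trial division."""
    a = 2
    while a * a <= z:
        if z % a == 0:
            return a
        a += 1
    return z  # z itself is prime


primes = [2, 3, 5, 7, 11, 13]      # pretend this is the complete list
z = prod(primes) + 1               # z = p1 * p2 * ... * pn + 1
p = smallest_prime_factor(z)

print(z, p, p in primes)           # p is a prime missing from the list
print([z % q for q in primes])     # every listed prime leaves remainder 1
```

Here z is composite, yet its prime factors are all new, which is all the argument needs.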

There are many proofs of this result, but this was the first one, and I find it partic-
ularly nice because of its simplicity. Even better, one can use this technique to prove
even stronger statements. For instance, I challenge you to prove:

Exercise 3.2. Prove there are infinitely many primes congruent to 1 mod 4. (You
can also prove this for 3 mod 4.)

The fact about infinitely many primes in Z also comes as a consequence of the following
result:

Theorem 3.3
The sum $\sum_{p \text{ prime}} \frac{1}{p}$ diverges.

Before we prove this, we will prove the following lemma:

Lemma 3.4
If s ∈ N+, then the sum $\sum_{n=1}^{\infty} \frac{1}{n^s} = 1 + \frac{1}{2^s} + \frac{1}{3^s} + \cdots$ diverges if s = 1 and converges
if s ≥ 2.

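Proof. We invoke some stuff from high school calculus. Since $1/x^s$ is decreasing, the tail $\sum_{n=2}^{\infty} \frac{1}{n^s}$ is the left Riemann
sum for $\int_1^\infty \frac{1}{(x+1)^s}\,dx$ and the right Riemann sum for $\int_1^\infty \frac{1}{x^s}\,dx$ (both with intervals of width 1), so it is bounded
by

\[ \int_1^\infty \frac{1}{(x+1)^s}\,dx \le \sum_{n=2}^{\infty} \frac{1}{n^s} \le \int_1^\infty \frac{1}{x^s}\,dx. \]

The full sum differs from this tail only by the first term 1. The left integral when s = 1 is $\lim_{z\to\infty} \int_1^z (x+1)^{-1}\,dx = \lim_{z\to\infty} \log(x+1)\big|_1^z = +\infty$,
so the sum diverges for s = 1. For s ≥ 2, the right integral converges:

\[ \int_1^z x^{-s}\,dx = -\frac{1}{s-1}\,\frac{1}{x^{s-1}}\Big|_1^z = \frac{1}{s-1}\left(1 - \frac{1}{z^{s-1}}\right), \]

which is bounded as z → ∞, so the sum converges.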
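
As a numerical sanity check (not a proof), one can compare partial sums with the finite versions of the same integral comparisons: for s = 1 the partial sum up to N is at least log(N + 1), and for s = 2 it is at most 2 − 1/N. The function name `partial_sum` is an illustrative assumption.

```python
from math import log


def partial_sum(s: int, N: int) -> float:
    """1 + 1/2^s + ... + 1/N^s."""
    return sum(1 / n**s for n in range(1, N + 1))


for N in (10, 1000, 100000):
    # s = 1 keeps growing like log N; s = 2 stays below 2.
    print(N, partial_sum(1, N), log(N + 1), partial_sum(2, N))
```

The s = 1 column grows without bound (slowly), while the s = 2 column levels off.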

Now we prove Theorem 3.3.

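Proof. Let $p_1, \ldots, p_{\ell(n)}$ be all primes ≤ n. Then, consider the product

\[ \lambda(n) = \prod_{i=1}^{\ell(n)} \left(1 - \frac{1}{p_i}\right)^{-1}. \]

Each $\left(1 - \frac{1}{p_i}\right)^{-1}$ is the sum of a geometric series in disguise (with initial term 1 and
common ratio $\frac{1}{p_i}$), so we can write

\[
\begin{aligned}
\lambda(n) &= \prod_{i=1}^{\ell(n)} \left(1 - \frac{1}{p_i}\right)^{-1} \\
&= \prod_{i=1}^{\ell(n)} \left(1 + \frac{1}{p_i} + \frac{1}{p_i^2} + \cdots\right) \\
&= \left(1 + \frac{1}{p_1} + \frac{1}{p_1^2} + \cdots\right)\left(1 + \frac{1}{p_2} + \frac{1}{p_2^2} + \cdots\right)\cdots.
\end{aligned}
\]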

Clearly, the prime factorization of any integer $m \le n$ uses only the primes $\le n$, i.e. the primes in $\{p_1, \ldots, p_{\ell(n)}\}$. Thus, in the expansion of this product, $\frac{1}{m}$ appears for all $m \le n$. In particular,
\[ \lambda(n) \ge 1 + \frac{1}{2} + \cdots + \frac{1}{n}, \]
which we know from Lemma 3.4 diverges as $n \to \infty$.
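Now we do some clever manipulations. Using the Taylor series expansion of log, which recall is
\[ \log(1 - x) = -\sum_{m=1}^{\infty} \frac{x^m}{m}, \]
we can now write
\[ \begin{aligned}
\log \lambda(n) &= -\sum_{i=1}^{\ell(n)} \log\left(1 - \frac{1}{p_i}\right) \\
&= \sum_{i=1}^{\ell(n)} \sum_{m=1}^{\infty} (m p_i^m)^{-1} \\
&= \frac{1}{p_1} + \frac{1}{p_2} + \cdots + \frac{1}{p_{\ell(n)}} + \sum_{i=1}^{\ell(n)} \sum_{m=2}^{\infty} (m p_i^m)^{-1}.
\end{aligned} \]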

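Call the remaining double sum at the end $S$. We will now show $S$ stays bounded as $n \to \infty$. For each $i$,
\[ \sum_{m=2}^{\infty} (m p_i^m)^{-1} < \sum_{m=2}^{\infty} (p_i^m)^{-1} = p_i^{-2} + p_i^{-3} + \cdots = \frac{p_i^{-2}}{1 - p_i^{-1}} \le 2 p_i^{-2}, \]
where the last step uses $p_i \ge 2$. Therefore
\[ S = \sum_{i=1}^{\ell(n)} \sum_{m=2}^{\infty} (m p_i^m)^{-1} < \sum_{i=1}^{\ell(n)} 2 p_i^{-2} < 2 \sum_{n=1}^{\infty} \frac{1}{n^2}, \]
which we know converges by Lemma 3.4.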


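Now, we are equipped to complete the proof. We know
\[ \log \lambda(n) = \frac{1}{p_1} + \frac{1}{p_2} + \cdots + \frac{1}{p_{\ell(n)}} + S, \]
and as $n \to \infty$, $S$ converges (aka it is bounded) but $\log \lambda(n) \to \infty$, so the sum $\frac{1}{p_1} + \cdots + \frac{1}{p_{\ell(n)}} \to \infty$, i.e. the sum $\sum_{p \text{ prime}} \frac{1}{p}$ diverges, as desired.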
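
As a numerical sanity check of the argument (not part of the proof), here is a small Python sketch that computes the sum of 1/p over primes p ≤ n, the quantity log λ(n), and their difference S. The helper names `primes_up_to` and `prime_sum_and_log_lambda` and the sample values of n are illustrative choices, not anything from lecture.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: list of all primes <= n."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, is_p in enumerate(sieve) if is_p]

def prime_sum_and_log_lambda(n):
    """Return (sum of 1/p over primes p <= n, log lambda(n))."""
    ps = primes_up_to(n)
    recip_sum = sum(1.0 / p for p in ps)
    log_lam = -sum(math.log(1.0 - 1.0 / p) for p in ps)
    return recip_sum, log_lam

for n in (10, 10**3, 10**5):
    recip_sum, log_lam = prime_sum_and_log_lambda(n)
    harmonic = sum(1.0 / m for m in range(1, n + 1))
    # S = log lambda(n) - sum 1/p stays bounded while both terms keep growing.
    print(n, round(recip_sum, 4), round(log_lam, 4),
          round(log_lam - recip_sum, 4), math.exp(log_lam) >= harmonic)
```

The growth of the prime sum is very slow, so do not expect the printed values to look "divergent" at this scale; the point is only that the difference S barely moves.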


3.2 Infinitely Many Primes for Polynomials


Just like what we did in the first two days, we will try to extend this question to
polynomial rings. So we will pose the question:

Are there infinitely many primes in k[x], where k is a field?

We’re going to revise this question a little bit. Note that if k is an infinite field, say
k = Q or C, then I could take any irreducible polynomial of the form ax + b, and there
are infinitely many choices for (a, b), so the question is obvious. Thus, we will focus on
when k is a finite field.

Example 3.5 (Finite Field)


Perhaps the simplest example of a finite field is the integers modulo a prime. For in-
stance, p = 5 is prime, and Z/5Z = {0, 1, 2, 3, 4}, under addition and multiplication
modulo 5, is a field.

We actually know how to classify all finite fields. This is not straightforward to prove,
but we will provide the fact:

Fact 3.6. A finite field has $q = p^r$ elements for some prime $p$ and $r \in \mathbb{N}^+$. (We denote such a field as $\mathbb{F}_q$.)

Now, we can provide an analogous statement to Theorem 3.3 for k[x]. This highlights
the usefulness of this theorem, because then we can conclude the infinitude of primes
for other rings beyond just the integers.

Theorem 3.7
If $|k| = q$, then the sum
\[ \sum_{\substack{p(x) \in k[x] \\ p(x) \text{ irreducible}}} q^{-\deg p(x)} = \sum_{p \text{ prime}} \frac{1}{q^{\deg p(x)}} \]
diverges.

The proof of this is very similar in flavor to the proof we provided for Theorem 3.3.
Someone asked about the density of primes, so we’ll provide a brief interlude here
to address this question.


Definition 3.8. If x ∈ R+ , then let π(x) be the number of primes p such that
1 < p ≤ x.

We have a remarkable theorem, so important, it is called the Prime Number Theorem.

Theorem 3.9 (Prime Number Theorem)
$\pi(x) \sim \frac{x}{\log x}$, i.e.
\[ \lim_{x \to \infty} \frac{\pi(x)}{x / \log x} = 1. \]

This was conjectured by Gauss at the (impressive) age of 16.² It was proven as a theorem by Hadamard and de la Vallée Poussin (independently) in 1896 using the Riemann zeta function. Proving the strongest expected bounds on the error term for the Prime Number Theorem requires assuming the Riemann Hypothesis, just one of many reasons why the elusive conjecture is so important in math.
We won’t prove the Prime Number Theorem, as it is very involved, but we will
prove a pretty strong, related result:

Theorem 3.10
There exist constants $c_1, c_2 > 0$ such that
\[ c_2 \cdot \frac{x}{\log x} < \pi(x) < c_1 \cdot \frac{x}{\log x}. \]

Definition 3.11. For $x \in \mathbb{R}^+$, let $\theta(x) = \sum_{p \le x} \log p$.

We can provide a pretty nice bound for θ(x).

Proposition 3.12
For $x \in \mathbb{R}^+$, $\theta(x) < (4 \log 2) \cdot x$.

Proof. Consider the binomial coefficient $\binom{2n}{n} = \frac{(2n)!}{n!\,n!}$. (If you have never seen this before, this is the number of ways to choose n objects from 2n total objects.) We see that this is divisible by all primes p with n + 1 ≤ p ≤ 2n. (They appear in the numerator, but not in the denominator.) Also note $\binom{2n}{n}$ appears in the binomial expansion of $(1 + 1)^{2n}$, so we have
\[ 2^{2n} = (1 + 1)^{2n} > \binom{2n}{n} \ge \prod_{\substack{n < p \le 2n \\ p \text{ prime}}} p. \]

² If you ever feel yourself having too big of an ego, just think of Gauss. On the other hand, if you feel yourself having pretty low self-esteem, just think of the time when Grothendieck needed to use a particular prime in a lecture and said, “Alright, take 57,” which became so famous that 57 is now called the Grothendieck prime.

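Taking the log on both sides, we have
\[ 2n \log 2 > \sum_{\substack{n < p \le 2n \\ p \text{ prime}}} \log p = \theta(2n) - \theta(n). \tag{1} \]

Because $\theta(1) = 0$, we can write
\[ \theta(2^m) = \theta(2^m) - \theta(2^0) = \sum_{i=1}^{m} \left(\theta(2^i) - \theta(2^{i-1})\right). \]

Using Equation 1 (with $2n = 2^i$), we have
\[ \begin{aligned}
\theta(2^m) &= \sum_{i=1}^{m} \left(\theta(2^i) - \theta(2^{i-1})\right) \\
&< \log 2 \cdot (2^m + 2^{m-1} + \cdots + 2) \\
&< (\log 2) \cdot 2^{m+1}.
\end{aligned} \]

Thus, for any $x \ge 1$ (for $0 < x < 1$ the bound is trivial, since $\theta(x) = 0$), we can find $m \in \mathbb{N}^+$ such that $2^{m-1} \le x \le 2^m$, which gives us
\[ \theta(x) \le \theta(2^m) < (\log 2) \cdot 2^{m+1} \le 4 (\log 2) \cdot x, \]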

as desired.

4 9/18 - Proving Weaker Version of Prime Number Theorem

Let’s pick up where we left off last time. We wanted to prove a weaker version of the Prime Number Theorem, given by Theorem 3.10, and to do this we had a nice result bounding $\theta(x) = \sum_{p \le x} \log p$ (Proposition 3.12). We proved the proposition at the end of the last lecture.
Well, Proposition 3.12 gives us an upper bound related to the primes less than a given x, so with a little bit more work, we can determine a constant $c_1 > 0$ such that $\pi(x) < c_1 \cdot \frac{x}{\log x}$.

4.1 Upper Bound on π(x)


Corollary 4.1
There exists a constant $c_1 > 0$ such that $\pi(x) < c_1 \cdot \frac{x}{\log x}$ for $x \ge 2$.

Proof. We have
\[ \theta(x) = \sum_{1 \le p \le x} \log p, \qquad \pi(x) = \sum_{1 \le p \le x} 1. \]

How can we relate π(x) with θ(x)? One (less fruitful) observation one could make is log p ≤ log x, so θ(x) ≤ π(x) log(x). But we will do something a bit more useful, because after all, we want an upper bound for π(x), not a lower one.
We will consider the sum $\sum_{\sqrt{x} \le p \le x} \log p$. This may seem a bit out of the blue, but $\log \sqrt{x} = \frac{1}{2} \log x$, so we are just taking the sum from the logarithmically top half of the range 1 ≤ p ≤ x. We now have a really nice lower bound for θ(x) (equivalently, an upper bound for π(x)):
\[ \begin{aligned}
\theta(x) &\ge \sum_{\sqrt{x} \le p \le x} \log p \\
&\ge \log \sqrt{x} \cdot \pi(x) - \log \sqrt{x} \cdot \pi(\sqrt{x}) \\
&\ge \log \sqrt{x} \cdot \pi(x) - \sqrt{x} \log \sqrt{x} \\
&= \frac{1}{2} (\log x) \cdot (\pi(x) - \sqrt{x}),
\end{aligned} \]
\[ \implies \pi(x) \le \frac{2\theta(x)}{\log x} + \sqrt{x} < 8 \log 2 \cdot \frac{x}{\log x} + \sqrt{x}. \]

So this is basically what we want. To make the extra $\sqrt{x}$ term disappear, observe that $\sqrt{x}$ grows slower than $\frac{x}{\log x}$ (more specifically, one can show $\sqrt{x} < \frac{2x}{\log x}$ for x ≥ 2), so
\[ \pi(x) < 8 \log 2 \cdot \frac{x}{\log x} + \sqrt{x} < (8 \log 2 + 2) \frac{x}{\log x}, \]
so we can take $c_1 = 2 + 8 \log 2$.

To recap the above proof, because it is a bit technical: we have this wonderful result from Proposition 3.12, and we want to turn this into an upper bound for π(x). This means we have to bound θ(x) from below by some expression in terms of π(x). The trick we employ is to consider the sum that only takes the “top half” of the range 1 ≤ p ≤ x, and the magic happens in the computations.
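
To see how much room the constant leaves, here is a hedged Python sketch that tabulates π(x) with a sieve and compares the ratio π(x) / (x / log x) against 2 + 8 log 2 over integers up to 10^5. The function name `prime_counts` and the cutoff 10^5 are illustrative assumptions; this checks the inequality numerically on a finite range and proves nothing.

```python
import math

def prime_counts(limit):
    """Return a list pi where pi[x] is the number of primes <= x, for 0 <= x <= limit."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            for j in range(i * i, limit + 1, i):
                sieve[j] = False
    counts, running = [], 0
    for is_p in sieve:
        running += is_p          # True counts as 1
        counts.append(running)
    return counts

C1 = 2 + 8 * math.log(2)         # the constant produced by the proof
pi = prime_counts(10**5)

# Largest observed value of pi(x) / (x / log x) for integers 2 <= x <= 10^5.
worst = max(pi[x] * math.log(x) / x for x in range(2, 10**5 + 1))
print(worst, "<", C1)
```

The observed ratio stays far below the constant, which is consistent with the proof being generous at every step.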

4.2 Lower Bound on π(x)


We are left with constructing a constant c2 for the lower bound, which we address now.


Proposition 4.2
There exists a constant $c_2 > 0$ such that $\pi(x) > c_2 \cdot \frac{x}{\log x}$.

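Proof. We have the nice (combinatorial) result³
\[ \operatorname{ord}_p n! = \left\lfloor \frac{n}{p} \right\rfloor + \left\lfloor \frac{n}{p^2} \right\rfloor + \left\lfloor \frac{n}{p^3} \right\rfloor + \cdots. \]
To see why this is the case: $\left\lfloor \frac{n}{p^i} \right\rfloor$ counts the number of multiples of $p^i$ at most n. Let’s consider the sum of the first two terms. All numbers only divisible by p (and not $p^2$) are counted once from the first term, but the multiples of $p^2$ are counted twice, once from the first term and once from the second. In general, the sum will count each number divisible by $p^i$ but not $p^{i+1}$ exactly i times, which by definition is what we want for the order.⁴
Now let us consider $\binom{2n}{n}$. We can compute
\[ \operatorname{ord}_p \binom{2n}{n} = \operatorname{ord}_p \frac{(2n)!}{n!\,n!} = \sum_{j=1}^{t_p} \left( \left\lfloor \frac{2n}{p^j} \right\rfloor - 2 \left\lfloor \frac{n}{p^j} \right\rfloor \right), \]
where $t_p$ is the largest integer such that $p^{t_p} \le 2n$, i.e., $t_p = \left\lfloor \frac{\log 2n}{\log p} \right\rfloor$. Let $\left\{ \frac{a}{b} \right\}$ be the fractional part of $\frac{a}{b}$, so $\left\{ \frac{5}{3} \right\} = \frac{2}{3}$. Then, we have
\[ \left\lfloor \frac{2n}{p^j} \right\rfloor - 2 \left\lfloor \frac{n}{p^j} \right\rfloor = \begin{cases} 1 & \text{iff } \left\{ \frac{n}{p^j} \right\} \ge \frac{1}{2} \\ 0 & \text{otherwise.} \end{cases} \]

In particular each of the $t_p$ terms is at most 1, so $\operatorname{ord}_p \binom{2n}{n} \le t_p$. Thus, we have
\[ 2^n \le \binom{2n}{n} \le \prod_{p \le 2n} p^{t_p} \]
(the left inequality holds because $\binom{2n}{n} = \prod_{k=1}^{n} \frac{n+k}{k}$ and each factor is at least 2), and so
\[ \begin{aligned}
n \log 2 &\le \sum_{p \le 2n} t_p \log p = \sum_{p \le 2n} \left\lfloor \frac{\log 2n}{\log p} \right\rfloor \log p \\
&= \sum_{p \le \sqrt{2n}} \left\lfloor \frac{\log 2n}{\log p} \right\rfloor \log p + \sum_{\sqrt{2n} < p \le 2n} \left\lfloor \frac{\log 2n}{\log p} \right\rfloor \log p.
\end{aligned} \]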

³ Kisin: “I always remember this formula, because it was needed in an olympiad problem while I was in high school, but I had never seen it before, so I just proved it on the spot.” Absolute chad.
⁴ If this still concerns you, do this explicitly for a specific prime p and integer n. Doing examples will help!


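We now consider the latter sum separately. If $p > \sqrt{2n}$, then
\[ \frac{\log 2n}{\log p} < \frac{\log 2n}{\log \sqrt{2n}} = \frac{\log 2n}{\frac{1}{2} \log 2n} = 2, \]
so $\left\lfloor \frac{\log 2n}{\log p} \right\rfloor = 1$ (it is at least 1 because p ≤ 2n). Substituting this back into our inequality above, we have
\[ n \log 2 \le \sum_{p \le \sqrt{2n}} \left\lfloor \frac{\log 2n}{\log p} \right\rfloor \log p + \sum_{\sqrt{2n} < p \le 2n} \log p. \]

Noting that ⌊a/b⌋ · b ≤ a, each term of the first sum is at most log(2n), and there are at most $\sqrt{2n}$ such terms; the second sum is at most θ(2n). So we can extend the inequality to
\[ n \log 2 \le \sqrt{2n} \log(2n) + \theta(2n) \implies \theta(2n) \ge n \log 2 - \sqrt{2n} \log(2n). \]
Like in the above proof where we used that $\sqrt{x}$ grows slower than $\frac{x}{\log x}$ (specifically, $\sqrt{x} < \frac{2x}{\log x}$), rearranging tells us that $\sqrt{2n} \log(2n)$ grows slower than n, i.e. $\frac{\sqrt{2n} \log(2n)}{n} \to 0$ as $n \to \infty$. Thus, the right side of the above inequality is dominated by n log 2, and in particular we have θ(2n) > T · n for some T > 0 when n >> 0. If 2n ≤ x < 2n + 2, then
\[ \theta(x) \ge \theta(2n) > T \cdot n > T \cdot \frac{x - 2}{2}, \]
which exceeds $c_2 \cdot x$ for a suitable $c_2 > 0$ once x is large. Since θ(x) ≥ log 2 > 0 for every x ≥ 2, shrinking $c_2$ if necessary takes care of the remaining bounded range, so $c_2 \cdot x < \theta(x)$ for all x ≥ 2. Thus, $c_2 \cdot x < \theta(x) \le \pi(x) \cdot \log x$, as desired.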
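
Following the footnote's advice to try examples, here is a hedged Python sketch that checks the formula for ord_p n! against a direct computation, and also prints ord_p of the central binomial coefficient next to t_p. The function names `ord_p` and `legendre` and the choice n = 50 are illustrative assumptions.

```python
import math

def ord_p(p, m):
    """Exponent of the prime p in the factorization of the positive integer m."""
    e = 0
    while m % p == 0:
        m //= p
        e += 1
    return e

def legendre(n, p):
    """ord_p(n!) computed as floor(n/p) + floor(n/p^2) + ..."""
    total, q = 0, p
    while q <= n:
        total += n // q
        q *= p
    return total

n = 50
for p in (2, 3, 5, 7):
    direct = ord_p(p, math.factorial(n))
    # t_p is the largest integer with p^t_p <= 2n.
    t_p = 0
    while p ** (t_p + 1) <= 2 * n:
        t_p += 1
    in_binom = ord_p(p, math.comb(2 * n, n))
    # Columns: p, ord_p(n!) directly, ord_p(n!) by the formula, ord_p C(2n, n), t_p.
    print(p, direct, legendre(n, p), in_binom, t_p)
```

The second and third columns should agree, and the fourth should never exceed the fifth, which is exactly the inequality the proof uses.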

Whew, okay that was a very involved proof with lots of big steps. Let’s get back to
ground level and deal with things a bit less stressful.

4.3 Modular Congruence


Definition 4.3 (Modular Congruence). Let n ∈ N+ . If a, b ∈ Z, we say a ≡ b (mod n), also written a ≡ b (n), if n | a − b.

Sometimes, when n is understood, we will suppress the (n) or the mod n and just
write a ≡ b.
This is what we call an equivalence relation. It is especially nice because it complies
with all the algebraic operations we’d want in life: if a ≡ b and c ≡ d, then a + c ≡ b + d
and a · c ≡ b · d. (The latter can be seen via ac − bd = c(a − b) + b(c − d).) You are
probably familiar with all of this already from the integers modulo n, denoted Z/nZ.


Lemma 4.4
If (a, n) = (1), then ∃ x ∈ Z such that ax ≡ 1 (mod n).

Proof. This follows very quickly from definitions. If (a, n) = (1) = Z, then 1 ∈ (a, n),
so 1 can be expressed as a linear combination of a and n. In other words, ∃ x, y ∈ Z
such that ax + ny = 1. Equivalently, this means ax ≡ 1 (mod n), as desired.
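
The proof is constructive once you can actually produce x and y. Here is a minimal Python sketch using the extended Euclidean algorithm, assuming positive inputs; the function names `extended_gcd` and `inverse_mod` are illustrative, not notation from the course.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b), for positive a, b."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

def inverse_mod(a, n):
    """Return x in [0, n) with a*x ≡ 1 (mod n); requires gcd(a, n) == 1."""
    g, x, _ = extended_gcd(a, n)
    if g != 1:
        raise ValueError("a is not invertible modulo n")
    return x % n

print(inverse_mod(5, 12))    # 5, since 5 * 5 = 25 = 2 * 12 + 1
```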

One can think, therefore, of a such that (a, n) = 1 as an invertible element modulo
n. In Z/nZ, the element corresponding to a has a multiplicative inverse.

Definition 4.5 (Units of Z/nZ). The units of Z/nZ, denoted (Z/nZ)× , are the congruence classes of the integers a with (a, n) = 1.

Example 4.6 (Units of Z/pZ)


Let p be a prime. Then, every nonzero element of Z/pZ is a unit: if p ∤ a, then
(a, p) = 1, so the congruence class of a is a unit in Z/pZ. In particular, (Z/pZ)× =
(Z/pZ) \ {0}. (By definition, this means Z/pZ is a field. Additionally, (Z/pZ)× is a
(multiplicative) group, because the product of any two nonzero elements (p ∤ a, p ∤ b)
stays nonzero (p ∤ ab).)

A result we will prove next time is Fermat’s Little Theorem, which states that if $x \in (\mathbb{Z}/p\mathbb{Z})^\times$, then $x^{p-1} \equiv 1 \pmod{p}$.
A natural question one might ask is how we can count the number of units of Z/nZ, or equivalently count the number of 1 ≤ a < n such that (a, n) = 1. This is an important quantity in number theory, so it has a specific function related to it, called the Euler totient function ϕ. For an integer n > 1, ϕ(n) = |(Z/nZ)×| = |{1 ≤ a < n : (a, n) = 1}|. We have a nice way of counting this:

Lemma 4.7
If $n = p_1^{a_1} p_2^{a_2} \cdots p_r^{a_r}$, then
\[ \phi(n) = n \left(1 - \frac{1}{p_1}\right) \cdots \left(1 - \frac{1}{p_r}\right) = \prod_{i=1}^{r} p_i^{a_i - 1} (p_i - 1). \]

This gives rise to a more general version of Fermat’s Little Theorem (called Euler’s
Totient Theorem), which we will state and prove next time.
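
Before the proofs, the formula is easy to test by machine. Here is a hedged Python sketch comparing a brute-force count with the product formula; the names `phi_bruteforce` and `phi_formula` are illustrative. The count runs over 1 ≤ a ≤ n rather than 1 ≤ a < n, which only matters for n = 1.

```python
from math import gcd

def phi_bruteforce(n):
    """Count 1 <= a <= n with gcd(a, n) == 1."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

def phi_formula(n):
    """Product of p^(a-1) * (p - 1) over the prime powers p^a exactly dividing n."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            power = 1
            while n % p == 0:
                n //= p
                power *= p
            result *= (power // p) * (p - 1)
        p += 1
    if n > 1:            # leftover prime factor with exponent 1
        result *= n - 1
    return result

print([phi_formula(n) for n in range(1, 13)])
print(all(phi_bruteforce(n) == phi_formula(n) for n in range(1, 500)))
```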


5 9/22 - Euler’s Totient


5.1 Proving Euler’s Totient Formula
We will start off today with proving Lemma 4.7 above. We will give two proofs, one more direct, and the other one using a nifty trick called Möbius inversion, which is used often in number theory.

Proof 1. We wish to compute the size of the set {1 ≤ m ≤ n | (m, n) = 1}. This is now just a counting problem. We first start with all integers from 1 to n; there are n of them. Now, we subtract all multiples of $p_i$ for each $p_i$. This leaves us with
\[ n - \frac{n}{p_1} - \frac{n}{p_2} - \cdots - \frac{n}{p_r}. \]
But we have subtracted too much! For instance, we remove the number $p_1 p_2$ twice: once for the multiples of $p_1$, and the other for $p_2$. Thus, we must add back in all multiples of $p_i p_j$ for i ≠ j. This gives us now
\[ n - \frac{n}{p_1} - \cdots - \frac{n}{p_r} + \frac{n}{p_1 p_2} + \cdots + \frac{n}{p_{r-1} p_r}. \]

Continuing this process of correcting for under/overcounting (with alternating signs), we eventually get
\[ \begin{aligned}
& n - \frac{n}{p_1} - \cdots - \frac{n}{p_r} + \frac{n}{p_1 p_2} + \cdots + \frac{n}{p_{r-1} p_r} - \frac{n}{p_1 p_2 p_3} - \cdots \\
&= n \left(1 - \frac{1}{p_1}\right) \left(1 - \frac{1}{p_2}\right) \cdots \left(1 - \frac{1}{p_r}\right),
\end{aligned} \]

as desired.

Remark 5.1. Note that this is not actually a proof, this is more of a general
argument. There needs to be some work to formalize this. For those familiar with
some combinatorics, this is essentially invoking the Principle of Inclusion-Exclusion.

5.2 Möbius Inversion


The next proof will be a lot more formal. First, we will introduce a new operation on two
functions on the integers. This operation is sometimes called (Dirichlet) convolution.

Definition 5.2 (Convolution). Let f, g : N → C be two functions. We define the convolution f ∗ g of f and g as
\[ (f * g)(n) = \sum_{\substack{d_1 d_2 = n \\ d_1, d_2 \in \mathbb{N}}} f(d_1) g(d_2). \]

One can show that this operation is both associative and commutative (just write
it out, it’s not bad at all).
We introduce three particular functions.

1. Consider the function I : N → C where I(n) = 1 for all n ∈ N. Then, for any f : N → C, convolution gives
\[ (f * I)(n) = (I * f)(n) = \sum_{d \mid n} f(d). \]

2. Let φ : N → C send φ(1) = 1 and φ(n) = 0 for all n ≠ 1. Then, for any f : N → C, we have f ∗ φ = f .
3. The last is called the Möbius function, denoted µ, and it is defined by
\[ \mu(n) = \begin{cases} (-1)^s & n = p_1 \cdots p_s, \text{ each } p_i \text{ distinct} \\ 0 & \text{otherwise.} \end{cases} \]

Some examples: µ(1) = 1 (the empty product, s = 0), µ(p) = −1 for any prime p, $\mu(10) = (-1)^2 = 1$, and µ(28) = 0 since it has two factors of 2. In particular, if n is divisible by a perfect square greater than 1, then µ(n) = 0.

Considering that we want to use a technique called Möbius inversion, it makes sense
that the Möbius function will be an object of interest. We begin with one nice property
of it:

Lemma 5.3
If n > 1, then $\sum_{d \mid n} \mu(d) = 0$.

Proof. Let $n = p_1^{a_1} \cdots p_s^{a_s}$, where the $p_i$’s are distinct. Then,
\[ \begin{aligned}
\sum_{d \mid n} \mu(d) &= \sum_{0 \le b_i \le a_i} \mu(p_1^{b_1} \cdots p_s^{b_s}) \\
&= \sum_{b_i \in \{0, 1\}} \mu(p_1^{b_1} \cdots p_s^{b_s}) \\
&= 1 - s + \binom{s}{2} - \binom{s}{3} + \cdots + (-1)^s \binom{s}{s} \\
&= (1 - 1)^s = 0.
\end{aligned} \]

Here, the second equality follows from the fact that µ evaluates to 0 on any integer divisible by the square of a prime, the third line follows from simply counting how many tuples $(b_1, \ldots, b_s)$ have i 1’s for 0 ≤ i ≤ s, and the last line follows from the Binomial Theorem (using s ≥ 1, since n > 1).

Now, we introduce Möbius Inversion.

Theorem 5.4 (Möbius Inversion)
If f : N → C and F = f ∗ I, i.e., $F(n) = \sum_{d \mid n} f(d)$, then
\[ f = F * \mu, \quad \text{i.e.,} \quad f(n) = \sum_{d \mid n} F(d) \mu(n/d). \]

Proof. The statement may come as a surprise, but the proof is actually not bad at all.
We will use all three functions I, φ, µ that we had before. First, note that
\[ (\mu * I)(n) = \sum_{d \mid n} \mu(d) I(n/d) = \sum_{d \mid n} \mu(d), \]

which by the previous lemma is 0 for n > 1. One can easily compute it is 1 for n = 1, so in fact µ ∗ I = φ. In this case, we have
\[ F = f * I \implies F * \mu = f * I * \mu = f * \mu * I = f * \varphi = f, \]

as desired.

Recall we want to prove Lemma 4.7 in a different way using Möbius inversion. Here
is the first step towards this proof, which gives the flavor of being related to µ.

Proposition 5.5
$\sum_{d \mid n} \phi(d) = n$. In other words, ϕ ∗ I is the identity map id : n ↦ n.

Proof. Consider the fractions $\frac{1}{n}, \frac{2}{n}, \ldots, \frac{n-1}{n}, \frac{n}{n}$. Write all of them in lowest terms. Given any d | n, there must be exactly ϕ(d) fractions with denominator d. (For example, if n = 15, then the fractions with denominator 3 are 5/15 = 1/3 and 10/15 = 2/3. Indeed, ϕ(3) = 3 − 1 = 2.) But there are n fractions in total, so $\sum_{d \mid n} \phi(d) = n$ follows as a counting argument.

Now, Lemma 4.7 follows as a consequence of Proposition 5.5.


Corollary 5.6
If $n = p_1^{a_1} \cdots p_s^{a_s}$, then $\phi(n) = n \left(1 - \frac{1}{p_1}\right) \cdots \left(1 - \frac{1}{p_s}\right)$.

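Proof. Proposition 5.5 gives ϕ ∗ I = id. Möbius Inversion now tells us ϕ = id ∗ µ, so
\[ \begin{aligned}
\phi(n) &= \sum_{d \mid n} \mu(d) \operatorname{id}(n/d) \\
&= \sum_{d \mid n} \mu(d) \frac{n}{d} \\
&= n - \frac{n}{p_1} - \cdots - \frac{n}{p_s} + \frac{n}{p_1 p_2} + \cdots \\
&= n \left(1 - \frac{1}{p_1}\right) \cdots \left(1 - \frac{1}{p_s}\right)
\end{aligned} \]

as we had before, so we conclude.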
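
All of the convolution identities in this section can be checked by brute force. Here is a hedged Python sketch; the names `divisors`, `convolve`, `mobius`, `one`, `delta`, and `ident` are illustrative, with `one` playing the role of I, `delta` the role of the convolution identity (written φ in these notes), and `ident` the role of id.

```python
from math import gcd

def divisors(n):
    """All positive divisors of n, by trial division."""
    return [d for d in range(1, n + 1) if n % d == 0]

def convolve(f, g):
    """Dirichlet convolution: (f * g)(n) = sum of f(d) g(n/d) over d | n."""
    return lambda n: sum(f(d) * g(n // d) for d in divisors(n))

def mobius(n):
    """(-1)^s if n is a product of s distinct primes, else 0; mobius(1) = 1."""
    sign, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p^2 divides the original n
                return 0
            sign = -sign
        p += 1
    return -sign if n > 1 else sign

one = lambda n: 1                       # the function I
delta = lambda n: 1 if n == 1 else 0    # identity for convolution
ident = lambda n: n                     # the map id
phi = lambda n: sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

N = range(1, 200)
print(all(convolve(mobius, one)(n) == delta(n) for n in N))   # Lemma 5.3
print(all(convolve(phi, one)(n) == n for n in N))             # Proposition 5.5
print(all(convolve(ident, mobius)(n) == phi(n) for n in N))   # phi = id * mu
```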

5.3 Euler’s Totient Theorem


Disregarding this explicit formula for ϕ(n), we can provide some nice results involving
ϕ(n).

Theorem 5.7 (Euler’s Totient Theorem)


If $h \in (\mathbb{Z}/n\mathbb{Z})^\times$, then $h^{\phi(n)} \equiv 1 \pmod{n}$.

This is actually a consequence of a more general fact in group theory called Lagrange’s Theorem. If you know some group theory, you can replace (Z/nZ)× with any finite group G, h with some element of G, and ϕ(n) with the size |G| of G, and the statement would still be true as an equality in the group.
Actually, it looks like Kisin wants to talk about this result in its group-theoretic generality, so we will talk a little bit about groups. For our purposes, we will consider a subgroup of (Z/nZ)× to be a subset H ⊆ (Z/nZ)× such that 1 ∈ H and for any h1 , h2 ∈ H, their product h1 h2 is also in H. (Because (Z/nZ)× is finite, such an H automatically contains the inverse of each of its elements.)
A few more definitions, unfortunately all called order. (This is not to be confused
with the more strictly number-theoretic definition of order we provided way back in
Definition 1.14.)

Definition 5.8 (Order of an element). Let h ∈ H. The smallest $a \in \mathbb{N}^+$ such that $h^a = 1$ is called the order of h.

If we consider all the powers of h in a set $\{1, h, h^2, \ldots, h^i, \ldots\}$, then if H is finite, the set must start repeating elements at some point. (The set is contained in H, so it can only contain finitely many distinct elements.) The order is just the smallest positive exponent at which the powers return to 1 and begin to repeat.

Definition 5.9 (Order of a subgroup). If H ⊆ G is a subgroup (here, G = (Z/nZ)× ), then |H| is called the order of H.

One can see that these two definitions of order are related. Consider the subgroup generated by an element h ∈ H, that is, the set of all powers of h. We denote this by $\langle h \rangle = \{1, h, h^2, \ldots\}$. It follows that the order of h agrees with the order of ⟨h⟩.
Now we prove Theorem 5.7 via the following Proposition.

Proposition 5.10
|H| divides ϕ(n) = |(Z/nZ)× |.

Proof. This proof might seem a little weird; I think this is because Kisin is trying to
explain a proof in group theory without defining new terminology. For the more purely
group theory explanation, see the last paragraph.
We can define an equivalence relation between elements in (Z/nZ)× as follows. If g1 , g2 ∈ (Z/nZ)× , then we say g1 ∼ g2 if there exists some h ∈ H such that g1 = g2 h. For example, if −1 ∈ H, then taking h = −1 gives a ∼ −a for any a ∈ (Z/nZ)× . Another example: all elements of H are equivalent to each other.
Any equivalence relation produces equivalence classes. We can consider the equiva-
lence class of some a ∈ (Z/nZ)× ; this is given by the set {ah | h ∈ H}. Each element
in this set is distinct (if ah1 = ah2 , then h1 = h2 ), so this equivalence class has exactly
|H| elements. Note that if b is in this equivalence class, i.e., b = ah′ for some h′ ∈ H,
then the equivalence class of b is the same as the equivalence class of a. (I will leave
this as an exercise, but reach out if you have questions.)
Now the finish line is in sight. Each element of (Z/nZ)× belongs to an equivalence
class (namely, its own), and each equivalence class has size |H|. Therefore, ϕ(n) =
|(Z/nZ)× | is equal to |H| times the number of equivalence classes. The latter number
is clearly an integer, so it follows that |H| divides ϕ(n), as desired.
For people familiar with group theory, we are simply considering the cosets of H in
(Z/nZ)× . Each coset has size |H|, the number of cosets is clearly an integer, and every
element in (Z/nZ)× is contained in a coset of H. It follows that |H| times the number
of cosets is ϕ(n).

Now the proof of Theorem 5.7 comes easily.

Proof of Theorem 5.7. Let a be the order of h, so $h^a = 1$. Then, $H = \langle h \rangle = \{1, h, \ldots, h^{a-1}\}$ is a subgroup, and a = |H| divides ϕ(n) by the above Proposition. Thus, $h^{\phi(n)} = (h^a)^{\phi(n)/a} = 1$, done.
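
Both halves of this argument are easy to watch in action. Here is a hedged Python sketch that, for one modulus, lists each unit h with its order and checks that the order divides ϕ(n) and that h^ϕ(n) ≡ 1; the names `units` and `order` and the choice n = 20 are illustrative.

```python
from math import gcd

def units(n):
    """Representatives 1 <= a <= n of the unit group (Z/nZ)^x."""
    return [a for a in range(1, n + 1) if gcd(a, n) == 1]

def order(h, n):
    """Smallest a >= 1 with h^a ≡ 1 (mod n); h must be a unit mod n."""
    a, power = 1, h % n
    while power != 1 % n:
        power = (power * h) % n
        a += 1
    return a

n = 20
phi_n = len(units(n))
for h in units(n):
    a = order(h, n)
    # The order a divides phi(n), hence h^phi(n) ≡ 1 (mod n).
    print(h, a, phi_n % a == 0, pow(h, phi_n, n) == 1)
```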

Restricting our attention to when n = p is prime, we get Fermat’s Little Theorem.

Corollary 5.11 (Fermat’s Little Theorem)


If p is prime and p ∤ h, then $h^{p-1} \equiv 1 \pmod{p}$.

Proof. Directly follows from Euler’s Totient Theorem using ϕ(p) = p − 1.

Note that what Fermat’s Little Theorem is telling us, in terms of orders, is that the
order of any element in (Z/pZ)× divides p − 1. But it is a neat fact about (Z/pZ)×
that there actually exists an element whose order is exactly p − 1:

Theorem 5.12
If p is prime, then there exists some h ∈ (Z/pZ)× such that h has order p − 1. In other words, $(\mathbb{Z}/p\mathbb{Z})^\times = \{1, h, h^2, \ldots, h^{p-2}\}$ has one generator, so it is cyclic.

Definition 5.13 (Primitive Root). An element h ∈ (Z/pZ)× with order p − 1 (i.e., a generator of the group of units) is a primitive root mod p.

Example 5.14 (Primitive Root)


Let’s identify primitive roots mod p for small values of p. For p = 3, we have (Z/3Z)× = {1, 2}. The element 2 generates the set, since $2^1 = 2$ and $2^2 = 4 = 1$. For p = 5, one can see 2 is also a primitive root; 2 is also a primitive root for p = 11. (I promise 2 is not always a primitive root; for instance, for p = 7, the powers of 2 are $2^1 = 2$, $2^2 = 4$, $2^3 = 8 = 1$, so it does not cover everything in (Z/7Z)× .)
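
If you want to hunt for primitive roots beyond these hand examples, here is a brute-force Python sketch. The names `order_mod_p` and `primitive_roots` are illustrative, and p is assumed to be prime; nothing here is efficient, it just applies the definition.

```python
def order_mod_p(h, p):
    """Order of h in (Z/pZ)^x, for a prime p not dividing h."""
    a, power = 1, h % p
    while power != 1:
        power = (power * h) % p
        a += 1
    return a

def primitive_roots(p):
    """All h in 1..p-1 whose order is exactly p - 1 (p assumed prime)."""
    return [h for h in range(1, p) if order_mod_p(h, p) == p - 1]

for p in (3, 5, 7, 11, 13):
    print(p, primitive_roots(p))
```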

A natural question one may ask after seeing the above examples is whether 2 is a
primitive root for infinitely many primes. This is actually an unsolved problem! There
is a conjecture by Artin, though, which claims that if a ̸= −1 is not a square, then a is
a primitive root mod p for infinitely many p. So we expect 2 to be a primitive root
for infinitely many primes.

6 09/25 - Unit Groups


6.1 Proving Existence of Primitive Root
Theorem 5.12 is quite remarkable, as it gives (Z/pZ)× perhaps the nicest structure
possible. We will work towards proving this amazing fact.


Lemma 6.1
If k is a field (e.g., Z/pZ, Q, R, C) and f (x) ∈ k[x] is a monic polynomial of degree
n, then f (x) = 0 has at most n solutions in k.

Proof. We will induct on n = deg f . If n = 1, then f (x) is of the form f (x) = x − a for
some a ∈ k. Clearly, the only root is x = a, so there is exactly one solution.
Before we move on to the inductive step, we will develop a useful condition for
divisibility. Suppose f (x) is a polynomial and α ∈ k such that f (α) = 0. Then, since
k[x] has a Division Algorithm (recall Lemma 2.5), we can write f (x) = (x−α)q(x)+r(x),
where deg r < deg(x−α) = 1. This forces deg r = 0, i.e., r(x) = c is a constant function.
But then 0 = f (α) = r(α) = c, which means (x − α) | f (x).
Now we proceed with the inductive step. Suppose the statement is true for polynomials of degree n − 1, and let deg f = n. If f has no roots, then the Lemma is clearly satisfied. Otherwise, choose some α such that f (α) = 0. By the above, (x − α) | f (x), so f (x) = (x − α) · f1 (x). But now deg f1 = n − 1, so our inductive hypothesis tells us that f1 has at most n − 1 roots. Since k is a field, any root of f is either α or a root of f1 , so f has at most n roots. The conclusion follows.

Exercise 6.2. (For fun) Out of the fields Z/pZ, Q, R, C, can you find which ones produce exactly n roots for a degree n polynomial? (Such fields are called algebraically closed, and they are useful because, well, we can always factor a polynomial into linear factors.)

Note that the above lemma tells us that the polynomial $x^{p-1} - 1$ has at most p − 1 roots, but Fermat’s Little Theorem (5.11) tells us that it actually has exactly p − 1 distinct roots. We can strengthen this observation:

Corollary 6.3
If d | p − 1, then $x^d \equiv 1 \pmod{p}$ has exactly d solutions in Z/pZ.

Proof. Fermat’s Little Theorem tells us that every element a ∈ (Z/pZ)× satisfies $a^{p-1} = 1$, so we can factor
\[ x^{p-1} - 1 = \prod_{a \in (\mathbb{Z}/p\mathbb{Z})^\times} (x - a). \]

(We can do this because from the proof above, each $(x - a) \mid x^{p-1} - 1$, and as the (x − a) linear factors are coprime, their product must collectively divide $x^{p-1} - 1$. Comparing degrees and leading coefficients, it follows that the two are in fact equal, hence the factorization.)
Note that if d | p − 1, then $x^d - 1 \mid x^{p-1} - 1$. (This is a strictly algebraic fact; for example, $x^{15} - 1 = (x^5 - 1)(x^{10} + x^5 + 1)$.) Write $x^{p-1} - 1 = (x^d - 1) \cdot g(x)$, where deg g = (p − 1) − d. Now we invoke the above lemma. Since $\deg(x^d - 1) = d$, the equation $x^d - 1 = 0$ has at most d roots. But deg g = (p − 1) − d, so g has at most (p − 1) − d roots. Each of the p − 1 roots of $x^{p-1} - 1$ is a root of $x^d - 1$ or of g, which means $x^d - 1 = 0$ has at least (p − 1) − ((p − 1) − d) = d roots. Therefore, $x^d - 1 = 0$ must have exactly d roots, and we conclude.
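
Here is a quick brute-force check of the corollary in Python, counting solutions for every divisor d of p − 1. The function name `count_roots_of_unity` and the prime p = 31 are illustrative choices.

```python
def count_roots_of_unity(d, p):
    """Number of x in Z/pZ with x^d ≡ 1 (mod p)."""
    return sum(1 for x in range(p) if pow(x, d, p) == 1 % p)

p = 31
for d in range(1, p):
    n_roots = count_roots_of_unity(d, p)
    if (p - 1) % d == 0:
        print(d, n_roots)     # the corollary predicts n_roots == d on these lines
```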

Recall we’re doing all of this to prove Theorem 5.12: there exists an element in
(Z/pZ)× of order p−1. It turns out that the above Corollary gives us enough “restricting
conditions” to force this to be true.
Let’s elaborate more on this big-picture argument. Suppose no such primitive root (element with order p − 1) exists. We already know the order of any element must divide |(Z/pZ)× | = p − 1. (This is the point of Proposition 5.10, which is also what gave us Euler’s Totient Theorem, Theorem 5.7.) But Corollary 6.3 gives us, for any divisor d | p − 1, an exact number for how many elements have order dividing d. A quick computation, invoking some of the work from §5.2, will show us that counting over all such elements for all divisors d < p − 1 is not enough, so there must be an element with order p − 1.

Proof of Theorem 5.12. For d | p − 1, let ψ(d) be the number of elements of (Z/pZ)× of order d. Note that an element x satisfies $x^d - 1 = 0$ exactly when it has order dividing d, i.e., if c is the smallest positive integer such that $x^c - 1 = 0$, then c | d. Thus, by Corollary 6.3, we have $d = \sum_{c \mid d} \psi(c)$. (At this point, you can see we are set up nicely to use Möbius Inversion, Theorem 5.4.)
Recall (the remarkably clever) Proposition 5.5, which tells us $d = \sum_{c \mid d} \phi(c)$, where ϕ is the Euler Totient function. Our desire now is to show ψ = ϕ. By Möbius Inversion, taking f to be either ψ or ϕ and F = id, we have
\[ \phi(d) = \sum_{c \mid d} \mu(c) \cdot d/c = \psi(d), \]

so ψ(p − 1) = ϕ(p − 1) > 0. The conclusion follows.

Remark 6.4. Note that not only does this show the existence of a primitive root, it also tells us exactly how many primitive roots there are! This is given by ψ(p − 1), which at the end we saw is just ϕ(p − 1). One can see this more directly, though: once we know (Z/pZ)× has a primitive root, say a, then we know

$(\mathbb{Z}/p\mathbb{Z})^\times = \{a, a^2, \ldots, a^{p-1} = 1\}$.

Then, it is not hard to show directly that $a^m$ is also a primitive root mod p exactly when (m, p − 1) = 1. The number of 1 ≤ m ≤ p − 1 such that (m, p − 1) = 1 is ϕ(p − 1) by definition.


6.2 Structure of Unit Groups


The punchline of the first part of this lecture is that for prime p, the group of units (Z/pZ)× has a cyclic structure. We can now ask the following natural question:

What is the structure of the unit group (Z/nZ)× for any n?

The answer comes from a significant result in number theory called the Chinese Remainder Theorem.⁵

Theorem 6.5 (Chinese Remainder Theorem/Sunzi’s Theorem)


Let m1 , m2 , . . . , ms ∈ N+ be pairwise coprime, i.e., (mi , mj ) = 1 for all i ̸= j.
Choose some ai ∈ Z/mi Z for each 1 ≤ i ≤ s. Then, there exists some a ∈ Z
such that a ≡ ai (mod mi ) for each i, and this solution is unique as an element of
Z/(m1 · · · ms )Z.

Proof. Since all mi ’s are pairwise coprime, for any prime p, p divides at most one mi .
Denote n = m1 · · · ms and, for each i, denote ni = n/mi = m1 · · · mi−1 mi+1 · · · ms . By
construction, $(m_i, n_i) = 1$, so there exist integers $(r_i, s_i)$ such that $r_i m_i + s_i n_i = 1$. Let $e_i = s_i n_i$. Again, by construction, observe
\[ e_i \equiv \begin{cases} 0 \bmod m_j & \text{if } j \ne i \\ 1 \bmod m_i. \end{cases} \]

Now, we can prove the existence of such an a ∈ Z satisfying all congruences $a \equiv a_i \pmod{m_i}$. We can simply construct $a = \sum_{i=1}^{s} a_i e_i$; by the above congruences on $e_i$, we have $a \equiv a_i \pmod{m_i}$.
Now, we prove uniqueness modulo n. If a′ is another such solution, then a − a′ ≡
aj − aj ≡ 0 (mod mj ), so mj | a − a′ . Since the mj ’s are pairwise coprime, it follows
that n = m1 · · · ms | a − a′ , i.e., a ≡ a′ (mod n) as desired.
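
The existence half of the proof is an algorithm, so it translates directly into code. Here is a hedged Python sketch following the construction with the e_i; the function name `crt` is illustrative, and it relies on Python 3.8+ for `math.prod` and for `pow` with exponent −1 (which computes the modular inverse s_i).

```python
from math import prod

def crt(residues, moduli):
    """Solve a ≡ residues[i] (mod moduli[i]) for pairwise coprime moduli.

    Mirrors the proof: e_i = s_i * n_i, where s_i inverts n_i modulo m_i,
    so e_i ≡ 1 (mod m_i) and e_i ≡ 0 (mod m_j) for j != i.
    """
    n = prod(moduli)
    a = 0
    for a_i, m_i in zip(residues, moduli):
        n_i = n // m_i
        s_i = pow(n_i, -1, m_i)   # raises ValueError if n_i is not invertible mod m_i
        a += a_i * s_i * n_i
    return a % n

print(crt([1, 2, 3], [2, 3, 5]))   # 23: 23 ≡ 1 (2), 23 ≡ 2 (3), 23 ≡ 3 (5)
```

Reducing the answer modulo n at the end picks out the unique representative in the range 0 to n − 1, matching the uniqueness statement.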

Now we can describe the structure of (Z/nZ)× by first decomposing Z/nZ following
Sunzi’s Theorem above.
⁵ Note the subtle discrimination going on in the name: any result created by a European/American is credited by name (e.g., Fermat’s Little Theorem), but here they fail to give a specific name despite knowing its founder. There is some push in the math community to rename it Sunzi’s Theorem, since the result is first known to be stated by Sunzi.


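Corollary 6.6 (Structure of (Z/nZ)× )
If $n = p_1^{b_1} \cdots p_s^{b_s}$ where the $p_i$ are distinct primes, then
\[ \mathbb{Z}/n\mathbb{Z} \simeq \mathbb{Z}/p_1^{b_1}\mathbb{Z} \times \cdots \times \mathbb{Z}/p_s^{b_s}\mathbb{Z}, \qquad a \leftrightarrow (a_1, a_2, \ldots, a_s) \]
and, by taking units,
\[ (\mathbb{Z}/n\mathbb{Z})^\times \simeq (\mathbb{Z}/p_1^{b_1}\mathbb{Z})^\times \times \cdots \times (\mathbb{Z}/p_s^{b_s}\mathbb{Z})^\times, \qquad a^{-1} \leftrightarrow (a_1^{-1}, \ldots, a_s^{-1}). \]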

There is no new content in here; the first isomorphism is just a reformulation of Sunzi’s Theorem. Note that a unit of the product on the right must be a unit in each component (if it were non-invertible in some $\mathbb{Z}/p_i^{b_i}\mathbb{Z}$, then it could not be invertible in the product), and so we get the second isomorphism.
We can elaborate even more on this result by describing the structure of these $(\mathbb{Z}/p^b\mathbb{Z})^\times$ unit groups. This is definitely doable, it just takes a little time, so we will leave the following as just a fact and move on to our next topic.

Fact 6.7. If p is an odd prime (i.e., p > 2) and b ∈ N+ , then $(\mathbb{Z}/p^b\mathbb{Z})^\times$ is always cyclic.

7 09/29 - Quadratic Reciprocity


We technically started this topic in the last 15 minutes of the last lecture, but because this is the next main topic of the course, it felt fitting to just start a new section.

7.1 Motivation
Here is the premise of the topic.⁶ In general, we are interested in solving polynomial equations. Doing this over R, even C, for linear and quadratic equations is the whole point of the algebra sequence in middle/high school. Doing this in generality is the birthplace of algebraic geometry, one of the most prominent fields in modern mathematics. (Take Math 137 or 232A/B if this piques your interest.)
Number theory cares about things modulo n. We can do even better: by Sunzi’s Theorem, to study something modulo n, it suffices to study it modulo the prime powers dividing n, and the natural place to start is modulo the primes p | n. For instance, if we want to solve $x^3 - 3 \equiv 0 \pmod{30}$, we can solve it in mod 2, 3, and 5, then combine our findings to find solutions modulo 30.
⁶ This was not covered in class, I’m just adding this for more context.


Solving linear equations modulo p is easy in some cases and doable in all cases. (We
might talk more about this later in the course.) If I give you something like x − 3 = 0
(mod p), it is obvious what x can be mod p. An equation like 3x ≡ 1 (mod 7) is also
completely doable. (Do it!)
Even if I give you something clunky like $Ax \equiv B \pmod{C}$ (take something ridiculous
like $A = 11^{1234}$, $B = 420^{420}$, and $C = 7^{7^7}$), we know in general that we can find
integers x and y such that $Ax + Cy = (A, C)$ using the Euclidean algorithm/the process
you did on your homework. Thus, so long as $(A, C) \mid B$ (which in our example is true,
since $(A, C) = (11, 7) = 1$), we have $B = (A, C) \cdot d$, so $A(dx) + C(dy) = (A, C) \cdot d = B$,
which gives $A \cdot (dx) \equiv B \pmod{C}$. The point is that solving linear equations is completely
understood, and pretty efficient.
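As a sketch of the procedure just described (my own illustration, with made-up function names), the following Python runs the extended Euclidean algorithm to get x, y with Ax + Cy = (A, C), then scales by d = B/(A, C).

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

def solve_linear_congruence(A, B, C):
    """One solution of A*x = B (mod C), or None if (A, C) does not divide B."""
    g, x, _ = extended_gcd(A, C)
    if B % g != 0:
        return None
    d = B // g
    # A*(d*x) + C*(d*y) = g*d = B, so d*x works modulo C.
    return (d * x) % C

print(solve_linear_congruence(3, 1, 7))  # 5, since 3 * 5 = 15 = 1 (mod 7)
```

The same function handles much larger inputs, since the number of Euclidean steps grows only with the number of digits of A and C.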
The next step is solving quadratic equations. The simplest such equation is of the
form $x^2 \equiv a \pmod{p}$. (In fact, for odd p, every quadratic can be reduced to this form by
completing the square. For instance, if $2x^2 + 3x - 1 \equiv 0 \pmod 7$, then
$2x^2 - 4x \equiv 1 \equiv 8 \pmod 7 \implies x^2 - 2x \equiv 4 \implies (x-1)^2 \equiv 5 \pmod 7$.)
Quadratic reciprocity allows us to answer these questions
in a marvelously efficient way. For example, once we lay out the main result, then we
can compute problems like these very easily:

Exercise 7.1. Does there exist an x ∈ Z such that $x^2 \equiv 37 \pmod{67}$?

Before reading the next section, I invite you to play around with these two baby
exercises:
Exercise 7.2. Take the first nine odd primes p ∈ {3, 5, 7, 11, 13, 17, 19, 23, 29}.
For each of these primes, determine whether −1 is a quadratic residue. I’ll start:
in mod 3, −1 = 2, but $1^2 = 2^2 = 1$ modulo 3, so −1 is not a quadratic residue. On
the other hand, in mod 5, $-1 = 4 = 2^2$, so −1 is a quadratic residue. Do this for
all primes, and try to see a pattern! The answer may come as a surprise.
For the even more curious, do the same for 2. We saw above that 2 = −1 is
not a quadratic residue modulo 3, and it turns out that 2 is also not a quadratic
residue modulo 5. Can you find any patterns?

7.2 Quadratic Residues


In light of the above discussion, we proceed with a natural definition.

Definition 7.3 (Quadratic Residue). We say a ∈ Z/pZ is a quadratic residue
modulo p if there exists some x ∈ Z/pZ such that $x^2 \equiv a \pmod{p}$.

We will define a funny-looking, half $\binom{a}{b}$-looking, half fraction-looking symbol as a
kind of indicator function on whether or not a is a quadratic residue mod p.


Definition 7.4 (Legendre Symbol). Let a ∈ Z and p prime such that (a, p) = 1.
Then, the Legendre symbol is defined as
\[
\left(\frac{a}{p}\right) =
\begin{cases}
1 & \text{if } a \text{ is a quadratic residue mod } p \\
-1 & \text{otherwise.}
\end{cases}
\]

Most of the time, we will restrict our attention to when $a \not\equiv 0 \pmod p$, i.e., (a, p) = 1,
because if a = 0, then we can do the obvious $0^2 = 0$, which is not so interesting.
From Theorem 5.12, we know that $(\mathbb{Z}/p\mathbb{Z})^\times$ is cyclic. Let h be a primitive root
modulo p. Thus, for any $a \in (\mathbb{Z}/p\mathbb{Z})^\times$, we can write $a = h^i$ for some integer i. One can
show that a is a quadratic residue if and only if i is even (in which case $a = (h^{i/2})^2$).
Let us elaborate on this a little bit more via the following result.

Lemma 7.5
For a ∈ Z and p an odd prime such that (a, p) = 1, we have
\[
\left(\frac{a}{p}\right) \equiv a^{\frac{p-1}{2}} \pmod{p}.
\]

This is quite an exciting result! For those who entertained Exercise 7.2, you probably
figured there must be a better way to check if a number is a quadratic residue. Well,
here we go.
 
Initially, the fastest way to compute $\left(\frac{a}{p}\right)$ is by going through all $x \in (\mathbb{Z}/p\mathbb{Z})^\times$ and
seeing if $x^2 \equiv a \pmod{p}$. In particular, if a is not a quadratic residue, that would
require us to go through all $x \in (\mathbb{Z}/p\mathbb{Z})^\times$. For each x, we have two computations
(square x, then reduce mod p), giving a total of 2(p − 1) computations. Now, we can
compute it just by multiplying a by itself $\frac{p-1}{2}$ times, which is less work computationally
(and with repeated squaring, only about $\log_2 p$ multiplications are needed).
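Here is a minimal Python sketch comparing the two approaches (the function names are made up for illustration). One assumption to flag: instead of literally multiplying a by itself $\frac{p-1}{2}$ times, it uses Python's built-in three-argument `pow`, which performs modular exponentiation.

```python
def legendre_bruteforce(a, p):
    """Legendre symbol for p prime, (a, p) = 1, by trying every x."""
    a %= p
    return 1 if any(x * x % p == a for x in range(1, p)) else -1

def legendre_euler(a, p):
    """Legendre symbol for p an odd prime, (a, p) = 1, via Lemma 7.5."""
    r = pow(a, (p - 1) // 2, p)  # r is either 1 or p - 1
    return -1 if r == p - 1 else r

for p in (3, 5, 7, 11, 13):
    row = [legendre_euler(a, p) for a in range(1, p)]
    print(p, row)
```

For each p, the printed row lists the symbol for a = 1, ..., p − 1, and the two functions agree on every entry.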

Proof. We will first show $a = h^i$ is a quadratic residue if and only if i is even. If i is
even, then $a = (h^{i/2})^2$, as demonstrated above. On the other hand, if $x^2 \equiv a \pmod{p}$,
then we can also write $x = h^j$ for some j since $x \in (\mathbb{Z}/p\mathbb{Z})^\times$. But then
\[
x^2 \equiv a \pmod p \implies h^{2j} \equiv h^i \pmod p \implies h^{2j-i} \equiv 1 \pmod p.
\]
Since the order of h mod p is p − 1, this implies $p - 1 \mid 2j - i$. But for odd p, p − 1
is even, so 2j − i must be even as well. In particular, i must be even.
We continue with this equivalence. We have i is even if and only if $p - 1 \mid i \cdot \frac{p-1}{2}$,
which is equivalent to
\[
h^{i \cdot \frac{p-1}{2}} = (h^i)^{\frac{p-1}{2}} = a^{\frac{p-1}{2}} \equiv 1 \pmod{p}.
\]


 
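Thus, to summarize, there exists an x such that $x^2 \equiv a \pmod{p}$ (i.e., $\left(\frac{a}{p}\right) = 1$)
if and only if $a^{\frac{p-1}{2}} \equiv 1 \pmod{p}$. When no such x exists, i.e., when $\left(\frac{a}{p}\right) = -1$, then
$a^{\frac{p-1}{2}} \not\equiv 1$. But note that $\left(a^{\frac{p-1}{2}}\right)^2 = a^{p-1} \equiv 1$ by Fermat’s Little Theorem, and this is
only possible if $a^{\frac{p-1}{2}} \equiv \pm 1$, so $\left(\frac{a}{p}\right) = -1 \iff a^{\frac{p-1}{2}} \equiv -1 \pmod{p}$. We have now
covered both cases, so the conclusion follows.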

This lemma now gives us a really nice characterization of when −1 is a quadratic
residue modulo p, just by plugging in a = −1. This answers the first part of Exercise
7.2.

Corollary 7.6 (Criterion for $\left(\frac{-1}{p}\right)$)
$\left(\frac{-1}{p}\right) = (-1)^{\frac{p-1}{2}}$. In other words, $\left(\frac{-1}{p}\right) = 1$ if $p \equiv 1 \pmod 4$ and −1 if $p \equiv 3 \pmod 4$.

So −1 being a quadratic residue mod p is determined by p mod 4, which is a condition
that kind of comes out of nowhere at first glance. The condition for a = 2 requires
a little more care, and we will prove it next time, but considering p mod 8 provides
sufficient information.

Proposition 7.7
$\left(\frac{2}{p}\right) = (-1)^{\frac{p^2-1}{8}}$. In other words
\[
\left(\frac{2}{p}\right) =
\begin{cases}
1 & p \equiv 1, 7 \pmod 8 \\
-1 & p \equiv 3, 5 \pmod 8.
\end{cases}
\]
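For anyone who wants to sanity-check Corollary 7.6 and Proposition 7.7 against the primes from Exercise 7.2, here is a brute-force Python sketch (the helper name `is_qr` is made up for illustration). It simply tabulates p mod 4, p mod 8, and whether −1 and 2 are squares mod p.

```python
def is_qr(a, p):
    """True if a is a nonzero square modulo the prime p (brute force)."""
    return any(x * x % p == a % p for x in range(1, p))

odd_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29]
print("p  p%4  -1 is QR?  p%8  2 is QR?")
for p in odd_primes:
    print(p, p % 4, is_qr(-1, p), p % 8, is_qr(2, p))
```

Reading down the table, −1 is a residue exactly in the rows with p ≡ 1 (mod 4), and 2 is a residue exactly in the rows with p ≡ 1, 7 (mod 8).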

Finally, we state this truly remarkable result, first proved by none other than Gauss.
It may look a bit complicated at first, but it makes this question regarding quadratic
residues very simple.

Theorem 7.8 (Quadratic Reciprocity)
Let p and q be distinct odd primes. Then,
\[
\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}}.
\]

This makes answering questions like Exercise 7.1 not only doable, but even doable
by hand.


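Answer to Exercise 7.1. We compute
\[
\begin{aligned}
\left(\frac{37}{67}\right) &= \left(\frac{67}{37}\right) = \left(\frac{30}{37}\right) \\
&= \left(\frac{2}{37}\right)\left(\frac{3}{37}\right)\left(\frac{5}{37}\right) \\
&= (-1)\left(\frac{37}{3}\right)\left(\frac{37}{5}\right) \\
&= (-1)\left(\frac{1}{3}\right)\left(\frac{2}{5}\right) \\
&= (-1) \cdot 1 \cdot (-1) = 1,
\end{aligned}
\]
so indeed, 37 is a quadratic residue mod 67.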
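The hand computation above can be mechanized. The following Python sketch (my own illustration, not an algorithm given in lecture; `prime_factors` and `legendre` are made-up names) follows the same steps: reduce the top mod p, factor it, handle the factor 2 with Proposition 7.7, and flip each odd prime factor with Theorem 7.8.

```python
def prime_factors(n):
    """Prime factors of n >= 1 with multiplicity, by trial division."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, using reciprocity."""
    a %= p
    if a == 0:
        return 0
    result = 1
    for q in prime_factors(a):
        if q == 2:
            # Proposition 7.7: (2/p) depends only on p mod 8.
            result *= 1 if p % 8 in (1, 7) else -1
        else:
            # Theorem 7.8: (q/p) = (p/q), up to a sign that is -1
            # exactly when both primes are 3 mod 4.
            sign = -1 if (p % 4 == 3 and q % 4 == 3) else 1
            result *= sign * legendre(p, q)
    return result

print(legendre(37, 67))  # 1
```

The recursion terminates because each flip replaces the bottom prime p by a strictly smaller prime q. Factoring by trial division is only reasonable for small inputs like these.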

7.3 Proof of Quadratic Reciprocity, Step 1


We will work towards proving Theorem 7.8. The proof we follow here is a slick one;
the most “elementary” one involving Gauss sums will be demonstrated next week. We
use the word “elementary” here with caution, though, because elementary does not
necessarily mean easy. As we are using less powerful tools, we have to be more creative,
and we’ll see that next week’s proof of Quadratic Reciprocity requires a lot of jumping
through hoops.
Today, we are bargaining a little with elementary methods and higher-power tools.
All the steps in each proof, albeit scary-looking, follow pretty smoothly, but we do have
to reference a result that goes beyond standard number theory (this is Lemma 7.10).
This makes the proof a little less grounded, and thus feel a bit more magical, but bear
with us for a little while.
We first lay out a few details. Let $S = \left\{-\frac{p-1}{2}, -\frac{p-3}{2}, \ldots, -1, 1, 2, \ldots, \frac{p-1}{2}\right\}$ be the
set of non-zero residues mod p. Let a ∈ Z be an integer coprime to p, so (p, a) = 1.
Consider the set $\left\{a, 2a, \ldots, \frac{p-1}{2} \cdot a\right\}$. Again, this may seem weird (why are we doing
this?), but observe that this set has no duplicates: if $ai \equiv aj \pmod{p}$, then $a(i-j) \equiv 0 \pmod{p}$,
which is only possible if i = j (as (p, a) = 1 and $1 \le i, j \le \frac{p-1}{2}$).
For $1 \le i \le \frac{p-1}{2}$, let $m_i \in \left\{1, \ldots, \frac{p-1}{2}\right\}$ such that $ia \equiv \pm m_i \pmod{p}$. We will
count all instances where we take the negative sign in this equivalence, i.e., define
\[
\mu := \mu(a) = \#\left\{1 \le i \le \frac{p-1}{2} : ia \equiv -m_i \pmod p,\ 1 \le m_i \le \frac{p-1}{2}\right\}.
\]
Let’s get to the point:

Lemma 7.9 (Gauss)
$\left(\frac{a}{p}\right) = (-1)^{\mu}$.


Proof. As proven above, all elements of $\left\{a, 2a, \ldots, \frac{p-1}{2}a\right\}$ are distinct modulo p, for if
$ai \equiv aj \pmod{p}$, then $p \mid a(i-j) \implies i \equiv j \pmod{p}$, which is not possible when
$1 \le i, j \le \frac{p-1}{2}$ unless i = j.
Even better, we can show all $m_i$’s are distinct. Suppose $m_i = m_j$. If the signs of $m_i$ and $m_j$ are the same
in $ia \equiv \pm m_i$, $ja \equiv \pm m_j$, then we can use the argument before to show
$ai \equiv aj \implies i = j$. Likewise, if the signs for $m_i$ and $m_j$ are distinct, then we have
$ai \equiv -aj \implies i \equiv -j \pmod{p}$, which is impossible as $1 \le i, j \le \frac{p-1}{2}$.
Thus, the $m_i$’s all take on values from 1 to $\frac{p-1}{2}$, and they are all distinct, so
\[
\left\{m_1, m_2, \ldots, m_{\frac{p-1}{2}}\right\} = \left\{1, 2, \ldots, \frac{p-1}{2}\right\}.
\]
Recall $ai \equiv \pm m_i \pmod{p}$, and the number of times the sign is negative is µ by
definition. We now multiply all the $ai$’s together to get
\[
a^{\frac{p-1}{2}} \left(\frac{p-1}{2}\right)! = \prod_{i=1}^{\frac{p-1}{2}} ai \equiv (-1)^{\mu} \prod_{i=1}^{\frac{p-1}{2}} m_i \pmod p.
\]
But note from the equality of sets $\left\{m_1, m_2, \ldots, m_{\frac{p-1}{2}}\right\} = \left\{1, 2, \ldots, \frac{p-1}{2}\right\}$, we have
$\prod_i m_i = \prod_i i = \left(\frac{p-1}{2}\right)!$. Canceling this out on both sides (it is coprime to p), we conclude
\[
(-1)^{\mu} \equiv a^{\frac{p-1}{2}} \pmod p,
\]
so $(-1)^{\mu} = \left(\frac{a}{p}\right)$ by Lemma 7.5, as desired.
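A quick way to get a feel for Gauss's Lemma is to compute µ directly. In this Python sketch (the name `gauss_mu` is made up for illustration), a multiple ia takes the negative sign exactly when its least positive residue exceeds $\frac{p-1}{2}$.

```python
def gauss_mu(a, p):
    """Count i in 1..(p-1)/2 for which i*a = -m_i (mod p) with 1 <= m_i <= (p-1)/2."""
    half = (p - 1) // 2
    mu = 0
    for i in range(1, half + 1):
        r = (i * a) % p
        # r > half means r = p - m with 1 <= m <= half, i.e. i*a = -m (mod p).
        if r > half:
            mu += 1
    return mu

p = 7
for a in range(1, p):
    print(a, gauss_mu(a, p), (-1) ** gauss_mu(a, p))
```

For p = 7 the last column is +1 for a = 1, 2, 4 (the squares mod 7) and −1 for a = 3, 5, 6.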

As a corollary, we can prove Proposition 7.7, which says that 2 is a quadratic residue
when $p \equiv 1, 7 \pmod 8$, and not a quadratic residue otherwise.

Proof of Proposition 7.7. We use Lemma 7.9, which we just proved, with a = 2. Here, we may compute
µ explicitly. Given i between 1 and $\frac{p-1}{2}$, note 2i < p, so we have $2i \equiv -m_i$ for some
$1 \le m_i \le (p-1)/2$ if and only if $2i > \frac{p-1}{2}$, i.e. $i > \frac{p-1}{4}$. Thus, µ is equal to the
number of $1 \le j \le (p-1)/2$ such that $j > \frac{p-1}{4}$, which by complementary counting is
just $\frac{p-1}{2} - m$ where $m = \left\lfloor \frac{p-1}{4} \right\rfloor$. We do this by casework:
• ($p \equiv 1 \pmod 8$) We can write p = 8k + 1. Then, m = 2k, so $\mu = \frac{p-1}{2} - m = 4k - 2k = 2k$. Thus, µ is even.
• ($p \equiv 3 \pmod 8$) We can write p = 8k + 3. Then, m = 2k again, so $\mu = \frac{p-1}{2} - m = 4k + 1 - 2k = 2k + 1$ is odd.
• ($p \equiv 5 \pmod 8$) Write p = 8k + 5. Then, m = 2k + 1, so $\mu = \frac{p-1}{2} - m = (4k + 2) - (2k + 1) = 2k + 1$ is odd.
• ($p \equiv 7 \pmod 8$) Write p = 8k + 7. Then, m = 2k + 1, so $\mu = \frac{p-1}{2} - m = (4k + 3) - (2k + 1) = 2k + 2$ is even.
Collecting all of these computations, the result follows.


7.4 Proof of Quadratic Reciprocity, Step 2


More magic to ensue. The next lemma is the step where we go beyond just working
with the integers, which relieves future computations but requires us to prove something
slightly more difficult at the onset.
Consider the function $f(z) = e^{2\pi i z} - e^{-2\pi i z}$. Observe $f(z) = f(z+1)$ and $f(-z) = -f(z)$.
If you know $e^{i\theta} = \cos\theta + i\sin\theta$, then see that $f(z) = 2i\sin(2\pi z)$.

Lemma 7.10
If n > 0 is odd, then
\[
\frac{f(nz)}{f(z)} = \prod_{m=1}^{(n-1)/2} f\left(z + \frac{m}{n}\right) f\left(z - \frac{m}{n}\right).
\]

Proof. We first prove a smaller lemma.

Lemma 7.11
If n > 0 is odd, then
\[
x^n - y^n = \prod_{k=0}^{n-1} \left(\zeta^k x - \zeta^{-k} y\right),
\]
where $\zeta = e^{2\pi i/n}$.

Proof. Note that for $\zeta = e^{2\pi i/n}$, we have $(\zeta^k)^n = e^{2\pi i k} = 1$, so all $\zeta^k$’s are roots of the
polynomial $z^n - 1$. But there are n such powers of ζ, and $\deg(z^n - 1) = n$, so they
each appear as a root exactly once. In particular,
\[
z^n - 1 = \prod_{k=0}^{n-1} (z - \zeta^k).
\]
If z = x/y, then (multiplying through by $y^n$) we can write $x^n - y^n = \prod_{k=0}^{n-1} (x - \zeta^k y)$.
Now for odd n, the map $x \mapsto -2x$ is a bijection between Z/nZ and itself (it is
injective, as $-2a = -2b \implies a = b$, and its inverse is $y \mapsto -1/2 \cdot y$, and −1/2 exists
mod n since n is odd). Thus, we have
\[
\begin{aligned}
x^n - y^n &= \prod_{k=0}^{n-1} (x - \zeta^k y) \\
&= \prod_{k=0}^{n-1} (x - \zeta^{-2k} y) \\
&= (\zeta^n)^{\frac{n-1}{2}} \prod_{k=0}^{n-1} (x - \zeta^{-2k} y) \\
&= \zeta^{1 + 2 + \cdots + (n-1)} \prod_{k=0}^{n-1} (x - \zeta^{-2k} y) \\
&= \prod_{k=0}^{n-1} (\zeta^k x - \zeta^{-k} y),
\end{aligned}
\]
as desired.

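Great, let us return to the main lemma at hand. We apply the lemma above with
$x = e^{2\pi i z}$ and $y = e^{-2\pi i z}$ to get
\[
\begin{aligned}
f(nz) &= e^{2\pi i n z} - e^{-2\pi i n z} \\
&= \prod_{k=0}^{n-1} \left( e^{2\pi i \left(z + \frac{k}{n}\right)} - e^{2\pi i \left(-\frac{k}{n} - z\right)} \right) \\
&= \prod_{k=0}^{n-1} f\left(z + \frac{k}{n}\right).
\end{aligned}
\]
From $f(z) = f(z+1)$, we have $f(z + k/n) = f(z + k/n - 1) = f\left(z - \frac{n-k}{n}\right)$. Now, we
can rewrite our product above: for $\frac{n+1}{2} \le k \le n-1$, we have $1 \le n - k \le \frac{n-1}{2}$, so
\[
\begin{aligned}
f(nz) &= f(z + 0/n) \prod_{k=1}^{(n-1)/2} f(z + k/n) \prod_{k=\frac{n+1}{2}}^{n-1} f(z + k/n) \\
&= f(z) \prod_{k=1}^{\frac{n-1}{2}} f\left(z + \frac{k}{n}\right) f\left(z - \frac{k}{n}\right),
\end{aligned}
\]
which is equivalent to what we want.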
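Since this identity looks a bit like magic, here is a small numerical sanity check in Python (my own sketch, not part of the lecture; `f` matches the function above and `product_side` is a made-up name). It evaluates both sides at an arbitrarily chosen complex z for a few odd n and prints the difference, which should be zero up to floating-point error.

```python
import cmath

def f(z):
    """f(z) = e^{2 pi i z} - e^{-2 pi i z}."""
    return cmath.exp(2j * cmath.pi * z) - cmath.exp(-2j * cmath.pi * z)

def product_side(n, z):
    """Right-hand side of Lemma 7.10 for odd n > 0."""
    prod = 1
    for m in range(1, (n - 1) // 2 + 1):
        prod *= f(z + m / n) * f(z - m / n)
    return prod

z = 0.1234 + 0.05j  # arbitrary test point with f(z) != 0
for n in (1, 3, 5, 7, 9):
    lhs = f(n * z) / f(z)
    print(n, abs(lhs - product_side(n, z)))
```

This is only a floating-point check at one point, of course, not a proof.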

7.5 Proof of Quadratic Reciprocity, Step 3


The payout for this lemma is high, as promised.


Proposition 7.12
If p is an odd prime and a ∈ Z such that (p, a) = 1, then
\[
\prod_{\ell=1}^{(p-1)/2} f\left(\frac{\ell a}{p}\right) = \left(\frac{a}{p}\right) \prod_{\ell=1}^{(p-1)/2} f\left(\frac{\ell}{p}\right).
\]

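Proof. We can write $\ell a \equiv \pm m_\ell \pmod{p}$ for some $1 \le m_\ell \le (p-1)/2$. This tells us
that $\frac{\ell a \mp m_\ell}{p}$ is an integer, so using the relations $f(z) = f(z+1)$ and $f(z) = -f(-z)$,
\[
f(\ell a/p) = f(\pm m_\ell/p) = \pm f(m_\ell/p).
\]
Multiplying across all $1 \le \ell \le (p-1)/2$ on both sides, and using that the $m_\ell$ are a
rearrangement of $1, \ldots, \frac{p-1}{2}$ with exactly µ negative signs, we have
\[
\prod_{\ell=1}^{\frac{p-1}{2}} f(\ell a/p) = (-1)^{\mu} \prod_{\ell=1}^{\frac{p-1}{2}} f(\ell/p) = \left(\frac{a}{p}\right) \prod_{\ell=1}^{\frac{p-1}{2}} f(\ell/p),
\]
where the last equality invokes Lemma 7.9.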

Now why is this useful? Well, we can now prove Quadratic Reciprocity.

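Proof of Theorem 7.8. Take p, q distinct odd primes, and apply the above Proposition to a = q.
(Note that p, q are “symmetric” in the sense that they are interchangeable, so whatever
we do for q, we can do the same for p.) The Proposition tells us
\[
\prod_{\ell=1}^{\frac{p-1}{2}} f(\ell q/p) = \left(\frac{q}{p}\right) \prod_{\ell=1}^{\frac{p-1}{2}} f(\ell/p),
\]
so by Lemma 7.10 (here z = ℓ/p and n = q),
\[
\left(\frac{q}{p}\right) = \prod_{\ell=1}^{\frac{p-1}{2}} \frac{f(\ell q/p)}{f(\ell/p)} = \prod_{\ell=1}^{\frac{p-1}{2}} \prod_{m=1}^{\frac{q-1}{2}} f\left(\frac{\ell}{p} + \frac{m}{q}\right) f\left(\frac{\ell}{p} - \frac{m}{q}\right).
\]
Switching q and p, we can go through the same process by applying the Proposition for
a = p to get
\[
\begin{aligned}
\left(\frac{p}{q}\right) &= \prod_{\ell=1}^{\frac{q-1}{2}} \prod_{m=1}^{\frac{p-1}{2}} f\left(\frac{\ell}{q} + \frac{m}{p}\right) f\left(\frac{\ell}{q} - \frac{m}{p}\right) \\
&= \prod_{m=1}^{\frac{q-1}{2}} \prod_{\ell=1}^{\frac{p-1}{2}} f\left(\frac{m}{q} + \frac{\ell}{p}\right) f\left(\frac{m}{q} - \frac{\ell}{p}\right) \\
&= (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}} \left(\frac{q}{p}\right),
\end{aligned}
\]
where the second line just renames the indices, and the last line uses
$f\left(\frac{m}{q} - \frac{\ell}{p}\right) = -f\left(\frac{\ell}{p} - \frac{m}{q}\right)$ once for each of the $\frac{p-1}{2} \cdot \frac{q-1}{2}$ pairs (ℓ, m).
This, quite remarkably, is what we wanted.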


8 10/02 - Algebraic Numbers & Integers


We will use something called “quadratic Gauss sums” (discussed in next lecture) to
provide another proof of Quadratic Reciprocity. The upshot of these techniques is that
we can generalize our results to higher dimensions. We begin with some definitions.

8.1 Algebraic Numbers


Definition 8.1 (Algebraic Numbers/Integers). An algebraic number is a number
α ∈ C which is a solution to a polynomial equation of the form

$\alpha^n + a_{n-1}\alpha^{n-1} + \cdots + a_0 = 0$

for $a_{n-1}, \ldots, a_0 \in \mathbb{Q}$. (Equivalently, we can clear denominators and say α satisfies
$c_n\alpha^n + c_{n-1}\alpha^{n-1} + \cdots + c_0 = 0$ for integers $c_i$.)
We say α is an algebraic integer if $a_{n-1}, \ldots, a_0 \in \mathbb{Z}$, or equivalently if $c_n = 1$.

Another way to think about this is that α is an algebraic number if it satisfies some
polynomial relation $p(X) = X^n + a_{n-1}X^{n-1} + \cdots + a_0 = 0$ for $a_i \in \mathbb{Q}$, and likewise
$a_i \in \mathbb{Z}$ for algebraic integers.

Example 8.2 (Algebraic Numbers/Integers)
$\sqrt{2}$ is an algebraic integer, since $(\sqrt{2})^2 - 2 = 0$. On the other hand, $\sqrt{3}/4$ is an
algebraic number, since $(\sqrt{3}/4)^2 - 3/16 = 0$, but this is not an algebraic integer.

In general, we can always find algebraic numbers which are not algebraic integers. We
in fact have a nice characterization of when algebraic numbers are algebraic integers.
(Spoiler: it’s given in the name.)

Proposition 8.3
Let r ∈ Q. Then,

1. r is an algebraic number.

2. if r is an algebraic integer, then r ∈ Z.

Proof. The first is trivial: r satisfies p(X) = X − r = 0. We now focus on the second
statement. Assume there are $a_i \in \mathbb{Z}$ such that $r^n + a_{n-1}r^{n-1} + \cdots + a_0 = 0$. Write
$r = \frac{p}{q}$, where p, q ∈ Z and (p, q) = 1. Clearing denominators in our polynomial relation,
we have
\[
p^n + a_{n-1}p^{n-1}q + \cdots + a_0 q^n = 0
\implies -q\left(a_{n-1}p^{n-1} + a_{n-2}p^{n-2}q + \cdots + a_0 q^{n-1}\right) = p^n.
\]
But this means q divides $p^n$, which is only possible when (p, q) = 1 if q = ±1. This
means r = ±p ∈ Z, as desired.

We have provided characterizations of algebraic numbers and integers as elements.
Now, let us consider the set of algebraic numbers (resp. integers). We will show that
these have very reasonable and familiar structures: the set of algebraic numbers is a
field, and the set of algebraic integers is a ring.
If you think about it for a little bit, this is quite difficult to do just from the
definitions! Even if I give you really simple algebraic integers, say like $\sqrt{2}$ and $\sqrt{3}$, it
requires a lot of brainpower to construct a polynomial p(X) such that $p(\sqrt{2} + \sqrt{3}) = 0$.
We will find a way to prove this without providing explicit constructions for our
polynomial relations.

Definition 8.4 (Module). Let V ⊆ C be a subset. Then, V is a Q-vector space
(equivalently, a Q-module) of finite dimension if

1. ∀ x, y ∈ V , x + y ∈ V ;

2. ∀ r ∈ Q, x ∈ V , r · x ∈ V ;

3. ∃ $\gamma_1, \ldots, \gamma_n \in V$ such that ∀ x ∈ V , ∃! $(r_1, \ldots, r_n) \in \mathbb{Q}^n$ such that $x = \sum_{i=1}^{n} r_i\gamma_i$. (The $\gamma_i$’s are the generators of V .)

So we have entered the land of linear algebra, which is great, because we know linear
algebra really well.7

Proposition 8.5
Let V be a Q-module. Let α ∈ C such that α · V ⊆ V . Then, α is an algebraic
number.

Proof. Let $\gamma_1, \ldots, \gamma_n$ be a basis of V . Consider the map
\[
m_\alpha : V \to V, \qquad x \mapsto \alpha x.
\]
It is easy to see that this is a Q-linear map. Let $M \in M_n(\mathbb{Q})$ be its matrix with
respect to the basis $\gamma_1, \ldots, \gamma_n$. Take the characteristic polynomial
$P(X) = \det(XI_n - M) = X^n + a_{n-1}X^{n-1} + \cdots + a_0$; since $M \in M_n(\mathbb{Q})$, the coefficients here live in Q.
7
“We” means the math community at large. I myself am pretty bad at linear algebra, oops.


Now we invoke the Cayley-Hamilton Theorem, one of the most important results in
linear algebra. The theorem states that plugging in the matrix into the characteristic
polynomial of the matrix gives 0, so here, we have $P(m_\alpha) = m_\alpha^n + a_{n-1}m_\alpha^{n-1} + \cdots + a_0 = 0$.
But we know exactly what $m_\alpha^k$ is from definition: $m_\alpha^k(x) = \alpha^k x$. Thus, $P(m_\alpha)(x) =
(\alpha^n + a_{n-1}\alpha^{n-1} + \cdots + a_0)x = 0$ for all x. Setting x ≠ 0, this forces the sum in
the parentheses to be 0, so $\alpha^n + a_{n-1}\alpha^{n-1} + \cdots + a_0 = 0$. Hence, α is algebraic, as
desired.

8.2 Algebraic Numbers (Integers) form a Field (Ring)


We now put this sick result to use:

Proposition 8.6
The set of algebraic numbers is a field.

Proof. Let α, β ∈ C be two algebraic numbers. We want to show three things are
algebraic: (1) 1/α for α ≠ 0 (so the set has inverses), (2) α · β (closed under multiplication), and
(3) α + β (closed under addition).
We start with (1). We know by definition that α, β satisfy the relations
\[
\alpha^n + a_{n-1}\alpha^{n-1} + \cdots + a_0 = 0, \qquad
\beta^m + b_{m-1}\beta^{m-1} + \cdots + b_0 = 0
\]
for $a_i, b_i \in \mathbb{Q}$. Assume that $a_0 \neq 0$ (otherwise we can divide the first equation by α).
Dividing by $a_0\alpha^n$, we get a new equation
\[
\frac{1}{a_0} + \frac{a_{n-1}}{a_0} \cdot \frac{1}{\alpha} + \cdots + \frac{a_{n-i}}{a_0} \cdot \left(\frac{1}{\alpha}\right)^i + \cdots + \left(\frac{1}{\alpha}\right)^n = 0,
\]
so 1/α is an algebraic number.
We will prove (2) and (3) with the same approach. Let V be the Q-module spanned by
$\{\alpha^k\beta^j : 0 \le k < n,\ 0 \le j < m\}$. By construction, this has finite dimension.
Furthermore, any element in V is of the form $v = \sum_{i,j} r_{ij} \cdot \alpha^i\beta^j$, and we have
$\alpha v = \sum_{i,j} r_{ij} \cdot \alpha^{i+1}\beta^j$, which is still an element of V . (The only thing we have to check is
that $\alpha \cdot \alpha^{n-1}$ still lives in V , which is true since $\alpha^{1+(n-1)} = \alpha^n = -\sum_{i<n} a_i\alpha^i \in V$.)
Likewise, βv ∈ V .
But this means α · V ⊆ V and β · V ⊆ V . Now, we can take sums and products to
get (α + β) · V ⊆ V and (αβ)V ⊆ V . Proposition 8.5 tells us then that α + β and αβ
are algebraic numbers, as desired.
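The proof never writes down a polynomial, but the Cayley-Hamilton argument in Proposition 8.5 does give a recipe for finding one. As an illustration (my own sketch, not from lecture), take $\alpha = \sqrt{2} + \sqrt{3}$ acting on the space with basis $1, \sqrt{2}, \sqrt{3}, \sqrt{6}$, write down the matrix of multiplication by α, and compute its characteristic polynomial. The helper `char_poly` below uses the Faddeev-LeVerrier recurrence, which is a choice made here for convenience and is not something the notes rely on.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def char_poly(A):
    """Coefficients of det(X*I - A), highest degree first (Faddeev-LeVerrier)."""
    n = len(A)
    coeffs = [Fraction(1)]
    M = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        # M_k = A * M_{k-1} + (previous coefficient) * I
        M = mat_mul(A, M)
        for i in range(n):
            M[i][i] += coeffs[-1]
        AM = mat_mul(A, M)
        trace = sum(AM[i][i] for i in range(n))
        coeffs.append(-trace / k)
    return coeffs

# Columns are alpha*1, alpha*sqrt2, alpha*sqrt3, alpha*sqrt6 written in the
# basis (1, sqrt2, sqrt3, sqrt6), e.g. alpha*sqrt2 = 2 + sqrt6.
M_alpha = [
    [0, 2, 3, 0],
    [1, 0, 0, 3],
    [1, 0, 0, 2],
    [0, 1, 1, 0],
]

coeffs = char_poly(M_alpha)
print([int(c) for c in coeffs])  # [1, 0, -10, 0, 1]

alpha = 2 ** 0.5 + 3 ** 0.5
print(abs(alpha**4 - 10 * alpha**2 + 1))  # tiny, up to rounding
```

So $\sqrt{2} + \sqrt{3}$ is a root of $X^4 - 10X^2 + 1$, a monic polynomial with integer coefficients.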

To show that the set of algebraic integers forms a ring, we now provide the proof given
in the textbook. It is basically going to be the same, except instead of working over Q,
we will work over Z.


Remark 8.7. We called a Q-vector space a Q-module because modules are more
general than vector spaces. For example, there is no such thing as a Z-vector
space since Z is not a field. However, modules are defined over rings, so it makes
sense to talk about a Z-module.

Definition 8.8 (Z-module). Let W ⊆ C. We say that W is a Z-module if

1. W is an abelian subgroup of C;

2. if n ∈ Z, α ∈ W , then n · α ∈ W ;

3. ∃ $\gamma_1, \ldots, \gamma_n \in W$ such that every element x ∈ W can be written as $x = \sum_{i=1}^{n} n_i\gamma_i$ for some $n_i \in \mathbb{Z}$.

Akin to Proposition 8.5, we have this analogous result, which uses a similar Cayley-
Hamilton argument.

Proposition 8.9
Let W be a Z-module. Let α ∈ C such that αW ⊆ W . Then, α is an algebraic
integer.

Proof. Let $\gamma_1, \ldots, \gamma_n$ be as in the definition (a set of generators of W ). Since $\alpha \cdot \gamma_i \in W$,
we have that for any 1 ≤ i ≤ n, there exist $c_{ij} \in \mathbb{Z}$ such that
\[
\alpha \cdot \gamma_i = \sum_{j=1}^{n} c_{ij}\gamma_j
\iff \sum_{j=1}^{n} (\alpha\delta_{ij} - c_{ij})\gamma_j = 0,
\]
where $\delta_{ij}$ is the Kronecker delta function that gives 1 when i = j and 0 otherwise.
Consider the n × n-matrix $M = (\alpha\delta_{ij} - c_{ij})_{i,j}$. Note the above equality indicates
$M \cdot (\gamma_1 \cdots \gamma_n)^{\intercal} = 0$, which forces det M = 0 since the $\gamma_i$’s are not all zero. But det M
is a polynomial in α with integer coefficients since each $c_{ij} \in \mathbb{Z}$. Note also that each
nonzero coefficient of α in M is $\delta_{ii} = 1$, so det M is a monic polynomial in α. Hence, α
is an algebraic integer.

The proof that the set of algebraic integers forms a ring follows straight from this
Proposition, similar to what we did for algebraic numbers.

Corollary 8.10
The set of algebraic integers is a ring.


Proof. Take W to be the Z-module generated by $\{\alpha^i\beta^j \mid 0 \le i < n,\ 0 \le j < m\}$, and
proceed as in the proof of Proposition 8.6.

8.3 Properties of Algebraic Numbers


Denote Ω as the set of algebraic integers. If $w_1, w_2, \gamma \in \Omega$ with γ ≠ 0, then we say
$w_1 \equiv w_2 \pmod{\gamma}$ if we can find some δ ∈ Ω such that $w_1 - w_2 = \delta \cdot \gamma$. (Note this is
basically how we defined modular congruence in the integers as well: $a \equiv b \pmod{q}$ in
the integers if a − b = qm for some m ∈ Z.)
Suppose we have a, b, c ∈ Z such that $a \equiv b \pmod{c}$ in Ω. Thinking of these as elements
of Ω ⊇ Z, this means a − b = cδ for some δ ∈ Ω. Thus,
$\delta = \frac{a-b}{c} \in \mathbb{Q} \cap \Omega = \mathbb{Z}$ by Proposition 8.3, so in fact over the integers, these two modular
congruences agree.
Given this, we can show that the Freshman’s Dream $(a + b)^p \equiv a^p + b^p \pmod{p}$ also
holds when a, b ∈ Ω.

Proposition 8.11
Let $w_1, w_2 \in \Omega$ and p ∈ Z be a prime number. Then, $(w_1 + w_2)^p \equiv w_1^p + w_2^p \pmod{p}$.

Proof. The proof follows exactly like the proof for the integers. By the Binomial Theorem,
\[
(w_1 + w_2)^p = w_1^p + w_2^p + \sum_{k=1}^{p-1} \binom{p}{k} w_1^k w_2^{p-k}.
\]
Since $p \mid \binom{p}{k}$ for 1 ≤ k ≤ p − 1, the result follows.
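The key fact is that p divides every interior binomial coefficient $\binom{p}{k}$, and that this really needs p to be prime. A tiny Python check (the function name is made up for illustration):

```python
from math import comb

def interior_binomials_divisible(n):
    """True if n divides C(n, k) for every 1 <= k <= n - 1."""
    return all(comb(n, k) % n == 0 for k in range(1, n))

for n in range(2, 14):
    print(n, interior_binomials_divisible(n))
```

For instance, n = 4 fails because $\binom{4}{2} = 6$ is not divisible by 4, so the Freshman's Dream is not available modulo 4.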

Well, we know an algebraic number satisfies some polynomial relation. Can we find
this polynomial relation? I mentioned earlier that this is a bit difficult to do by hand;
for instance, $\sqrt{2}$ satisfies $X^2 - 2 = 0$, and $\sqrt{3}$ corresponds to $X^2 - 3 = 0$, but given
these two polynomials, it is hard to find the minimal polynomial that has $\sqrt{2} + \sqrt{3}$ as
a root. We will “find” this minimal polynomial through some algebra.
Let α ∈ C be an algebraic number. Then, S = {P ∈ Q[X] | P (α) = 0} is an ideal
of Q[X]. But Q[X] is a principal ideal domain (it has a Euclidean algorithm, recall our
work for k[x] in Lecture 2), so S = (f ) for some irreducible monic f ∈ Q[X].
In fact, f is the polynomial of minimal degree such that f (α) = 0 and f is monic.
We call f the minimal polynomial of α, and the degree of f is called the degree of
α.
Note that we could have also come to this more directly without considering Q[X]
as a PID. Take the set S, and take the element f ∈ S which is monic and has minimal
degree. Suppose g ∈ S as well. Then, the Division Algorithm in Q[X] tells us that
g(X) = f (X)q(X) + r(X) for some q, r ∈ Q[X] such that deg r < deg f . But then
r(α) = g(α) − f (α)q(α) = 0, so r ∈ S as well. This is only possible if r = 0 by
minimality of f , which implies f | g. Hence, S = (f ), and f is the minimal polynomial
of α by construction.
Define the sets
\[
\mathbb{Q}[\alpha] = \{P(\alpha) \mid P \in \mathbb{Q}[X]\} \subset \mathbb{C}, \qquad
\mathbb{Q}(\alpha) = \left\{ \frac{P(\alpha)}{Q(\alpha)} : P, Q \in \mathbb{Q}[X],\ Q(\alpha) \neq 0 \right\} \subset \mathbb{C}.
\]

Note Q[α] ⊆ Q(α). The natural question is, then, when does equality hold? We
have one answer to this:

Proposition 8.12
If α is an algebraic integer, then Q[α] = Q(α) and it is a Q-vector space of dimension
equal to the degree of α.

Proof. Let f be the minimal polynomial of α, of degree n. Let $\frac{P(\alpha)}{Q(\alpha)} \in \mathbb{Q}(\alpha)$, where P, Q ∈ Q[X] and
Q(α) ≠ 0. This means f ∤ Q, and since f is irreducible, this means (f, Q) = 1. Thus,
the Euclidean algorithm in Q[X] tells us that ∃ h, k ∈ Q[X] such that f h + Qk = 1.
Substituting X = α, we get f (α)h(α) + Q(α)k(α) = 1 =⇒ k(α) = 1/Q(α), so
P (α)/Q(α) = P (α)k(α) ∈ Q[α]. This proves the first part of the statement.
Notice that $\alpha^n = -a_{n-1}\alpha^{n-1} - \cdots - a_0$, so Q[α] is generated by $1, \alpha, \ldots, \alpha^{n-1}$.
These elements are linearly independent: if not, then α would satisfy some polynomial
relation of the form
\[
b_{n-1}\alpha^{n-1} + \cdots + b_0 = 0
\]
where $b_i \in \mathbb{Q}$. But $g(X) = \sum_{i=0}^{n-1} b_i X^i$ satisfies g(α) = 0 and deg g < deg f , which only
complies with the minimality of f if g = 0, meaning $b_i = 0$ for all i. The conclusion
follows.

8.4 Quadratic Character of 2


Recall that this entire discussion was to provide another proof of Quadratic Reciprocity.
We return to the land of Quadratic Reciprocity now by providing a new proof of
Proposition 7.7, which characterizes when 2 is a quadratic residue mod p. Recall we had
\[
\left(\frac{2}{p}\right) = (-1)^{\frac{p^2-1}{8}}.
\]
We will prove this statement using roots of unity. In general, the nth roots of unity
are the complex numbers z ∈ C satisfying $z^n = 1$. (If you think about it enough, you
can see that z must be of the form $e^{2\pi i k/n}$ for some integer k.)


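Proof. Let $\zeta = e^{2\pi i/8}$, so $\zeta^8 = 1$ and ζ is a (primitive) eighth root of unity. We can
factor $\zeta^8 - 1 = (\zeta^4 - 1)(\zeta^4 + 1)$; since $\zeta^4 - 1 \neq 0$, we have $\zeta^4 + 1 = 0 \implies \zeta^4 = -1$.
Dividing $\zeta^4 + 1 = 0$ by $\zeta^2$, we get $\zeta^2 + 1/\zeta^2 = 0$. Observe, then, that
\[
\left(\zeta + \frac{1}{\zeta}\right)^2 = \zeta^2 + 2 + \frac{1}{\zeta^2} = 2.
\]
Let $\tau = \zeta + \zeta^{-1}$. Note ζ ∈ Ω since $\zeta^8 - 1 = 0$, and τ ∈ Ω since $\tau^2 - 2 = 0$ from above.
Let p be an odd prime. Using $\left(\frac{a}{p}\right) \equiv a^{\frac{p-1}{2}} \pmod p$, we have
\[
\tau^{p-1} = (\tau^2)^{\frac{p-1}{2}} = 2^{\frac{p-1}{2}} \equiv \left(\frac{2}{p}\right) \pmod{p},
\]
so $\tau^p \equiv \left(\frac{2}{p}\right)\tau \pmod{p}$. Since $\zeta, \zeta^{-1} \in \Omega$, we apply Freshman’s Dream (Proposition 8.11)
to see $\tau^p = (\zeta + \zeta^{-1})^p \equiv \zeta^p + \zeta^{-p} \pmod{p}$. But $\zeta^8 = 1$, so in fact we have
\[
\zeta^p + \zeta^{-p} =
\begin{cases}
\zeta + \zeta^{-1} & p \equiv \pm 1 \pmod 8 \\
\zeta^3 + \zeta^{-3} & p \equiv \pm 3 \pmod 8.
\end{cases}
\]
We can write these sums in terms of τ . The first one is just $\tau = \zeta + \zeta^{-1}$ by definition,
and the second can be written as $\zeta^3 + \zeta^{-3} = -\zeta^{-1} + (-\zeta^{-1})^{-1} = -(\zeta + \zeta^{-1}) = -\tau$ since
$\zeta^4 = -1 \implies \zeta^3 = -\zeta^{-1}$. To summarize,
\[
\tau^p \equiv \zeta^p + \zeta^{-p} =
\begin{cases}
\tau & p \equiv \pm 1 \pmod 8 \\
-\tau & p \equiv \pm 3 \pmod 8
\end{cases}
\;= (-1)^{\varepsilon}\tau \pmod p,
\]
where $\varepsilon = \frac{p^2-1}{8}$.
Now we put all of this together. From above, we have
\[
(-1)^{\varepsilon}\tau \equiv \tau^p \equiv \left(\frac{2}{p}\right)\tau \pmod{p}.
\]
Multiplying by τ on both sides gives us factors of $\tau^2 = 2$ on the left and the right.
But since p is odd, 2 ∤ p, so we can cancel out the 2’s on both sides to be left with
$(-1)^{\varepsilon} \equiv \left(\frac{2}{p}\right) \pmod{p}$. But since both $(-1)^{\varepsilon}$ and the Legendre symbol only take on values
of ±1, and p > 2, this equivalence mod p translates to equality as integers. This concludes
the proof.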

I think this proof is a bit nicer than the proof we provided previously for this result,
since it gives a little more explanation as to why considering p mod 8 is the correct
thing to do. Before, we were like “oh, haha, if you look at p mod 8 and go through all
the cases then it works, what a happy coincidence har har har” but here the 8 comes
naturally from this whole business with considering the eighth root of unity ζ.


9 10/06 - Quadratic Gauss Sums


Recall our marvelous work from the end of last class where we took an eighth root
of unity $\zeta = e^{2\pi i/8}$ and defined $\tau = \zeta + \zeta^{-1}$, so that $\tau^2 = 2$. Using these values, we proved the
result $\left(\frac{2}{p}\right) = (-1)^{\frac{p^2-1}{8}}$. We will do something in greater generality in order to achieve
Quadratic Reciprocity once more.
Let p be an odd prime. We will now let $\zeta = e^{2\pi i/p}$, so ζ is a pth root of unity. In
particular, $\zeta^p - 1 = 0$.

Lemma 9.1
Let a ∈ Z. Then,
\[
\sum_{t=0}^{p-1} \zeta^{at} =
\begin{cases}
p & \text{if } p \mid a \\
0 & \text{if } (p, a) = 1.
\end{cases}
\]

Proof. We first consider the case when p | a. Then, since $\zeta^p = 1$, it follows that $\zeta^a = 1$
as well, so every term of the sum is 1 and the result follows.
Now suppose p ∤ a. This is similarly easy: this sum is just the sum of a finite
geometric series with starting term 1 and common ratio $\zeta^a \neq 1$, so we have
\[
\sum_{t=0}^{p-1} \zeta^{at} = \frac{(\zeta^a)^p - 1}{\zeta^a - 1} = 0
\]
again because $(\zeta^a)^p = (\zeta^p)^a = 1$.

Corollary 9.2
Let x, y ∈ Z. Then,
\[
\frac{1}{p} \sum_{t=0}^{p-1} \zeta^{t(x-y)} =
\begin{cases}
0 & \text{if } p \nmid x - y \\
1 & \text{if } x \equiv y \pmod p.
\end{cases}
\]

Now, we will prove a useful lemma regarding sums of the Legendre symbol.

Lemma 9.3
$\sum_{t=0}^{p-1} \left(\frac{t}{p}\right) = 0$.

Proof. Ha syke, this is a homework problem. (Proof was given in class, though, so come
to class!) The idea, though, is to count the number of times $\left(\frac{t}{p}\right) = \pm 1$ respectively.


9.1 Gauss Sum


We now introduce the main ingredient in this approach towards Quadratic Reciprocity:
the Gauss sum.
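Definition 9.4 (Gauss Sum). Let a ∈ Z. The sum
\[
g_a = \sum_{t=0}^{p-1} \left(\frac{t}{p}\right) \zeta^{at}
\]
is called a Gauss sum.

We now prove some fundamental properties of this Gauss sum.

Proposition 9.5
$g_a = \left(\frac{a}{p}\right) g_1$.

Proof. If p | a, then $\zeta^a = 1$, so $g_a = \sum_{t=0}^{p-1} \left(\frac{t}{p}\right) = 0$ by Lemma 9.3. Now, assume that
p ∤ a. We have a bijection from Z/pZ → Z/pZ where $t \mapsto at$, so we can change
variables to get
\[
\begin{aligned}
\left(\frac{a}{p}\right) g_a &= \sum_{t=0}^{p-1} \left(\frac{a}{p}\right)\left(\frac{t}{p}\right) \zeta^{at} \\
&= \sum_{t=0}^{p-1} \left(\frac{at}{p}\right) \zeta^{at} \\
&= \sum_{s=0}^{p-1} \left(\frac{s}{p}\right) \zeta^{s} = g_1, \\
\implies g_a &= \left(\frac{a}{p}\right)\left(\frac{a}{p}\right) g_a = \left(\frac{a}{p}\right) g_1,
\end{aligned}
\]
as desired.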

Author’s Note 9.6. Some notation remarks. We will let $g = g_1$, and I will
denote Z/pZ as $\mathbb{F}_p$ (the F stands for “field” because this is a field with p elements).
Whenever I suppress the indices of a sum (e.g., if I just write $\sum_x$), then assume x
goes from 0 to p − 1.

Moving forward,


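Proposition 9.7
$g^2 = (-1)^{\frac{p-1}{2}} p$.

Proof. Let $a \not\equiv 0 \pmod{p}$. Then,
\[
g_a g_{-a} = \left(\frac{a}{p}\right)\left(\frac{-a}{p}\right) g^2 = \left(\frac{-1}{p}\right) g^2 = (-1)^{\frac{p-1}{2}} g^2.
\]
Now taking the sum over all a (the a = 0 term contributes nothing, since $g_0 = 0$ by Lemma 9.3), we have
\[
\begin{aligned}
(p-1) \cdot (-1)^{\frac{p-1}{2}} g^2 &= \sum_{a=0}^{p-1} g_a g_{-a} \\
&= \sum_{a=0}^{p-1} \sum_x \sum_y \left(\frac{x}{p}\right)\left(\frac{y}{p}\right) \zeta^{a(x-y)} \\
&= \sum_{0 \le x, y < p} \left(\frac{xy}{p}\right) \sum_{a=0}^{p-1} \zeta^{a(x-y)} \\
&= \sum_x \left(\frac{x^2}{p}\right) p = p(p-1),
\end{aligned}
\]
from which we conclude our desired result. The second-to-last equality follows from
Corollary 9.2.
We will denote $(-1)^{\frac{p-1}{2}} p = p^*$ to make things less clunky. So, to rewrite the above
Proposition, $g^2 = p^*$.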
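If you want to see $g^2 = p^*$ with your own eyes, here is a floating-point sketch in Python (my own illustration; `legendre` and `gauss_sum` are made-up names, and the symbol is computed via Lemma 7.5). It computes $g_a$ directly from Definition 9.4 and prints $g^2$ for a few small primes.

```python
import cmath

def legendre(t, p):
    """Legendre symbol (t/p) for an odd prime p, with value 0 when p | t."""
    t %= p
    if t == 0:
        return 0
    return 1 if pow(t, (p - 1) // 2, p) == 1 else -1

def gauss_sum(a, p):
    """g_a = sum over t of (t/p) * zeta^(a t), with zeta = e^{2 pi i / p}."""
    total = 0
    for t in range(p):
        # Reduce the exponent mod p first to keep the rounding error small.
        angle = 2 * cmath.pi * ((a * t) % p) / p
        total += legendre(t, p) * cmath.exp(1j * angle)
    return total

for p in (3, 5, 7, 11, 13):
    g = gauss_sum(1, p)
    p_star = (-1) ** ((p - 1) // 2) * p
    print(p, p_star, g * g)
```

Up to rounding, the last column matches $p^*$: for example, p = 3 gives $g = i\sqrt{3}$ and $g^2 = -3$.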

9.2 Second Proof of Quadratic Reciprocity


Quadratic Reciprocity relates two primes, so let’s now introduce q ̸= p another odd
prime. We begin our second proof of Quadratic Reciprocity.

Proof of Quadratic Reciprocity, Theorem 7.8. Let q ≠ p be another odd prime. We
then have
\[
g^{q-1} = (g^2)^{\frac{q-1}{2}} = (p^*)^{\frac{q-1}{2}} \equiv \left(\frac{p^*}{q}\right) \pmod{q}
\implies g^q \equiv \left(\frac{p^*}{q}\right) g \pmod{q}.
\]


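On the other hand, we can use the definition of g to explicitly expand (using Freshman’s
Dream, which states $(a + b)^q \equiv a^q + b^q \pmod q$):
\[
g^q = \left( \sum_t \left(\frac{t}{p}\right) \zeta^t \right)^q \equiv \sum_t \left(\frac{t}{p}\right)^q \zeta^{qt} \equiv g_q \pmod{q},
\]
where the last equivalence follows because $\left(\frac{t}{p}\right)^q = \left(\frac{t}{p}\right)$ since the symbol is either −1, 0, 1
and q is odd.
But Proposition 9.5 tells us that $g_q = \left(\frac{q}{p}\right) g$, so $g^q \equiv g_q \equiv \left(\frac{q}{p}\right) g \pmod q$. At the
same time, we wrote above that $g^q \equiv \left(\frac{p^*}{q}\right) g$, so we have
\[
\begin{aligned}
\left(\frac{p^*}{q}\right) g &\equiv \left(\frac{q}{p}\right) g \pmod{q} \\
\implies \left(\frac{p^*}{q}\right) g \cdot g &\equiv \left(\frac{q}{p}\right) g \cdot g \\
\implies \left(\frac{p^*}{q}\right) p^* &\equiv \left(\frac{q}{p}\right) p^* && \text{(Proposition 9.7)} \\
\implies \left(\frac{q}{p}\right) &= \left(\frac{p^*}{q}\right) && \text{(cancel } p^*\text{, coprime to } q\text{; both sides are } \pm 1) \\
&= \left(\frac{(-1)^{\frac{p-1}{2}} p}{q}\right) && \text{(definition of } p^*) \\
&= \left(\frac{-1}{q}\right)^{\frac{p-1}{2}} \left(\frac{p}{q}\right) \\
&= (-1)^{\frac{q-1}{2} \cdot \frac{p-1}{2}} \left(\frac{p}{q}\right), && \text{(Corollary 7.6)}
\end{aligned}
\]
which is exactly the statement for Quadratic Reciprocity.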

9.3 Kronecker’s Result for Quadratic Extensions


We now shift our focus back to these Gauss sums. Our end goal for this section will be
to pin down g itself. Recall that
\[
g^2 = p^* =
\begin{cases}
p & p \equiv 1 \pmod 4 \\
-p & p \equiv 3 \pmod 4,
\end{cases}
\]
or equivalently
\[
g =
\begin{cases}
\pm\sqrt{p} & p \equiv 1 \pmod 4 \\
\pm i\sqrt{p} & p \equiv 3 \pmod 4.
\end{cases}
\]
We step back a little bit and, even before concerning ourselves with specific values, we
ask: what is the sign of g?

Proposition 9.8
Let $P(X) = X^{p-1} + X^{p-2} + \cdots + 1 = \frac{X^p - 1}{X - 1}$. Then, P is an irreducible polynomial
in Q[X].

Remark 9.9. We call P (X) above the pth cyclotomic polynomial. Note also
that if we take $\zeta = e^{2\pi i/p}$ again, then P (ζ) = 0, so P is the minimal polynomial of
ζ by the Proposition, as it is irreducible.

Proof. By contradiction, assume that P is reducible. Then, we can write P (X) =
f (X)g(X) where f, g ∈ Q[X] are monic and deg f, deg g > 0. We now have a really
strong result:

Exercise 9.10. For such f, g ∈ Q[X] above, it follows that f, g ∈ Z[X].

This is Exercise 4 in Chapter 6 of Ireland-Rosen, but one way to approach this is
noting that (1) the coefficients of f and g are polynomials in the $\zeta^k$ (these are the roots of
P (X)), (2) each $\zeta^k$ is an algebraic integer, and the set of algebraic integers Ω forms a
ring, and (3) Ω ∩ Q = Z.
Using this exercise, though, we can write P (X + 1) = f (X + 1)g(X + 1), but we
can explicitly write P (X + 1) as
\[
P(X + 1) = \frac{(X+1)^p - 1}{X} = \sum_{k=1}^{p} \binom{p}{k} X^{k-1}.
\]
Taking this modulo p, we have $X^{p-1} \equiv f(X+1)g(X+1) \pmod p$, which means in mod
p, we have $f(X+1) \equiv X^r$ and $g(X+1) \equiv X^s$ for some r, s > 0. Taking X = 0,
we see that p | f (1), g(1), so $p^2 \mid f(1)g(1) = P(1) = p$, which is a contradiction. The
conclusion follows.

Proposition 9.11
Taking $\zeta = e^{2\pi i/p}$ again, [a]
\[
\prod_{k=1}^{\frac{p-1}{2}} \left(\zeta^{2k-1} - \zeta^{-(2k-1)}\right)^2 = (-1)^{\frac{p-1}{2}} p.
\]
[a] Apparently Gauss thought about this almost every day for four years before being able to prove it. Look at Gauss, man, so inspirational.


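Proof. Consider again $P(X)$ from the above Proposition; since every $\zeta^t$ with $1 \le t \le p-1$ is a root of $P$, we can write $P(X) = X^{p-1} + \cdots + 1 = \prod_{t=1}^{p-1} (X - \zeta^t)$. Plugging in $X = 1$, we have
\[
p = P(1) = \prod_{t=1}^{p-1} (1 - \zeta^t).
\]
Now we do something that is a bit ad hoc, which is only reasonable if it took a chad like Gauss four years to come up with this. Observe that the set $\left\{\pm(4k - 2) : 1 \le k \le \frac{p-1}{2}\right\}$ is a complete set of nonzero residues mod $p$. One way to see this: $4k - 2 = 2(2k - 1)$, and as $k$ runs over $1 \le k \le \frac{p-1}{2}$, the numbers $2k - 1$ are the odd residues $1, 3, \ldots, p - 2$, while the numbers $-(2k - 1) \equiv p - (2k - 1)$ are the even residues $2, 4, \ldots, p - 1$. So $\pm(2k - 1)$ hits every nonzero residue exactly once, and multiplying everything by $2$ (which is invertible mod $p$) just permutes the nonzero residues.
Given this, though, we can now rewrite our product, splitting along when we take $+(4k - 2)$ or $-(4k - 2)$:
\begin{align*}
p &= \prod_{k=1}^{\frac{p-1}{2}} \left(1 - \zeta^{4k-2}\right) \prod_{k=1}^{\frac{p-1}{2}} \left(1 - \zeta^{-(4k-2)}\right) \\
&= \prod_{k=1}^{\frac{p-1}{2}} \zeta^{2k-1}\left(\zeta^{-(2k-1)} - \zeta^{2k-1}\right) \zeta^{-(2k-1)}\left(\zeta^{2k-1} - \zeta^{-(2k-1)}\right) \\
&= (-1)^{\frac{p-1}{2}} \prod_{k=1}^{\frac{p-1}{2}} \left(\zeta^{2k-1} - \zeta^{-(2k-1)}\right)^2.
\end{align*}
Rearranging gives the desired result.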
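Since this identity looks like it came out of nowhere, it is reassuring to check it numerically. The following is a small floating-point sketch (the function names `squared_product` and `p_star` are invented for this example); it evaluates the left-hand side with `cmath` and compares against $(-1)^{(p-1)/2} p$ up to rounding error.

```python
import cmath


def squared_product(p: int) -> complex:
    """Evaluate prod_{k=1}^{(p-1)/2} (zeta^(2k-1) - zeta^-(2k-1))^2 numerically."""
    zeta = cmath.exp(2j * cmath.pi / p)
    result = 1 + 0j
    for k in range(1, (p - 1) // 2 + 1):
        term = zeta ** (2 * k - 1) - zeta ** (-(2 * k - 1))
        result *= term ** 2
    return result


def p_star(p: int) -> int:
    """The signed prime p* = (-1)^((p-1)/2) * p."""
    return (-1) ** ((p - 1) // 2) * p


# Largest deviation from the claimed value over a few small odd primes.
max_error = max(abs(squared_product(p) - p_star(p)) for p in (3, 5, 7, 11, 13, 17))
```

For $p = 3$ there is a single factor, $(\zeta - \zeta^{-1})^2 = (i\sqrt{3})^2 = -3$, which matches $p^* = -3$.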

This essentially takes us to the promised land.

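Proposition 9.12
Let $p$ be an odd prime, and $\zeta = e^{2\pi i/p}$. Then,
\[
\prod_{k=1}^{\frac{p-1}{2}} \left(\zeta^{2k-1} - \zeta^{-(2k-1)}\right) = \begin{cases} \sqrt{p} & p \equiv 1 \pmod{4} \\ i\sqrt{p} & p \equiv 3 \pmod{4}. \end{cases}
\]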

Proof. From the above Proposition, we know that the magnitude of the product is going to be $\sqrt{p}$, so we are only concerned about the sign/what power of $i$ we multiply.
We need to get a little bit more involved in the complex numbers this time. You may have seen before $e^{i\theta} = \cos\theta + i\sin\theta$; thus, we have $e^{ix} - e^{-ix} = 2i \cdot \sin(x)$, so we

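have
\begin{align*}
\prod_{k=1}^{\frac{p-1}{2}} \left(\zeta^{2k-1} - \zeta^{-(2k-1)}\right) &= \prod_{k=1}^{\frac{p-1}{2}} \left(e^{\frac{2i\pi(2k-1)}{p}} - e^{-\frac{2i\pi(2k-1)}{p}}\right) \\
&= \prod_{k=1}^{\frac{p-1}{2}} 2i \sin\left(\frac{(4k-2)\pi}{p}\right) \\
&= i^{\frac{p-1}{2}} \prod_{k=1}^{\frac{p-1}{2}} 2 \sin\left(\frac{(4k-2)\pi}{p}\right).
\end{align*}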

Let’s look at when $\sin$ is negative. In general, $\sin x$ is negative when $\pi < x < 2\pi$, so $\sin\frac{(4k-2)\pi}{p}$ is negative if $\frac{p+2}{4} < k \le \frac{p-1}{2}$. Counting by case, this gives $\frac{p-1}{4}$ negative terms if $p \equiv 1 \pmod{4}$ and $\frac{p-3}{4}$ negative terms if $p \equiv 3 \pmod{4}$.
Now we proceed by case mod 4. If $p \equiv 1 \pmod{4}$, then the sign of the product is
\[
i^{\frac{p-1}{2}} (-1)^{\frac{p-1}{4}} = (-1)^{\frac{p-1}{4}} (-1)^{\frac{p-1}{4}} = (-1)^{\frac{p-1}{2}} = 1.
\]
This agrees with what we have in our proposition statement.
If $p \equiv 3 \pmod{4}$, then the sign is equal to
\[
i^{\frac{p-1}{2}} (-1)^{\frac{p-3}{4}} = i(-1)^{\frac{p-3}{4}} (-1)^{\frac{p-3}{4}} = i,
\]
which is also what we have in the proposition. This concludes the proof.
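The sign bookkeeping in this proof is easy to get wrong, so here is a hedged numerical check in Python. The names `negative_sine_count`, `half_product`, and `expected_value` are made up for this sketch; the first counts the negative sine factors directly, and the second evaluates the product in floating point.

```python
import cmath
import math


def negative_sine_count(p: int) -> int:
    """Count k in 1..(p-1)/2 with sin((4k-2)*pi/p) < 0."""
    return sum(
        1
        for k in range(1, (p - 1) // 2 + 1)
        if math.sin((4 * k - 2) * math.pi / p) < 0
    )


def half_product(p: int) -> complex:
    """Evaluate prod_{k=1}^{(p-1)/2} (zeta^(2k-1) - zeta^-(2k-1)) numerically."""
    zeta = cmath.exp(2j * cmath.pi / p)
    result = 1 + 0j
    for k in range(1, (p - 1) // 2 + 1):
        result *= zeta ** (2 * k - 1) - zeta ** (-(2 * k - 1))
    return result


def expected_value(p: int) -> complex:
    """The value claimed in the proposition: sqrt(p) or i*sqrt(p)."""
    return math.sqrt(p) if p % 4 == 1 else 1j * math.sqrt(p)
```

Note that both counts in the proof, $\frac{p-1}{4}$ and $\frac{p-3}{4}$, are just $\lfloor p/4 \rfloor$ in the respective cases, which is what the count is compared against when testing.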

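We seem to have gone a long way from our temporary home of Gauss sums, but now we have found a road to circle back around. Recall Proposition 9.7 tells us that $g^2 = (-1)^{\frac{p-1}{2}} p$. But this is exactly the expression we have from Proposition 9.11, so we have
\begin{align*}
g^2 = (-1)^{\frac{p-1}{2}} p &= \prod_{k=1}^{\frac{p-1}{2}} \left(\zeta^{2k-1} - \zeta^{-(2k-1)}\right)^2 \\
\implies g &= \varepsilon \prod_{k=1}^{\frac{p-1}{2}} \left(\zeta^{2k-1} - \zeta^{-(2k-1)}\right)
\end{align*}
for some $\varepsilon \in \{\pm 1\}$.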

You are probably on the edge of your seat at this point whether ε is positive or
negative. Kronecker gives us the answer.

Theorem 9.13 (Kronecker)


ε = 1.


Proof. Let
\[
f(x) = \sum_{j=1}^{p-1} \left(\frac{j}{p}\right) x^j - \varepsilon \prod_{k=1}^{\frac{p-1}{2}} \left(x^{2k-1} - x^{p-(2k-1)}\right).
\]
Note that when $x = \zeta$, we are just computing $f(\zeta) = g - g = 0$. One can also use Lemma 9.3 to deduce $f(1) = 0$. This means that $f$ is divisible by the minimal polynomials of $\zeta$ and $1$, respectively; in particular, $X^{p-1} + \cdots + 1 \mid f$ and $X - 1 \mid f$. This means
\[
(X - 1)(X^{p-1} + \cdots + 1) = X^p - 1 \mid f,
\]
so we can write $f(X) = (X^p - 1)g(X)$ for some $g(X)$. Substituting $x = e^z$ in the above expression, we have
\[
\sum_{j=1}^{p-1} \left(\frac{j}{p}\right) e^{jz} - \varepsilon \prod_{k=1}^{(p-1)/2} \left(e^{z(2k-1)} - e^{z(p-(2k-1))}\right) = (e^{pz} - 1)g(e^z). \tag{2}
\]
We can identify $e^{jz}$ with its Taylor series expansion $\sum_{k=0}^{\infty} \frac{(jz)^k}{k!}$, in which case the sum can be expressed as
\[
\sum_{j=1}^{p-1} \left(\frac{j}{p}\right) e^{jz} = \sum_{k=0}^{\infty} \frac{z^k}{k!} \sum_{j=1}^{p-1} \left(\frac{j}{p}\right) j^k.
\]
We will now identify the $z^{(p-1)/2}$ coefficient in Equation (2). The sum contributes a coefficient of $\frac{1}{((p-1)/2)!} \sum_{j=1}^{p-1} \left(\frac{j}{p}\right) j^{\frac{p-1}{2}}$. But note that mod $p$, we have $j^{\frac{p-1}{2}} \equiv \left(\frac{j}{p}\right)$, so the coefficient mod $p$ reduces to
\[
\frac{1}{((p-1)/2)!} \sum_{j=1}^{p-1} \left(\frac{j}{p}\right)^2 = \frac{p-1}{\left(\frac{p-1}{2}\right)!}.
\]
Meanwhile, if we consider the Taylor series of the difference $e^{z(2k-1)} - e^{z(p-(2k-1))}$, the constant term of $1$ cancels out so the minimal term is $z(2k - 1) - z(p - (2k - 1))$. Taking the product over $1 \le k \le (p-1)/2$, we see that the coefficient of $z^{(p-1)/2}$ in the product must be the product of the linear part of each term, namely
\[
\prod_{k=1}^{(p-1)/2} \big((2k - 1) - (p - (2k - 1))\big) = \prod_{k=1}^{(p-1)/2} (4k - p - 2).
\]


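Reducing this mod $p$ as well, we have
\begin{align*}
\prod_{k=1}^{(p-1)/2} (4k - p - 2) &\equiv \prod_{k=1}^{(p-1)/2} (4k - 2) = 2^{(p-1)/2} \prod_{k=1}^{(p-1)/2} (2k - 1) \\
&= \frac{2^{(p-1)/2} (p - 1)!}{2 \cdot 4 \cdots (p - 1)} = \frac{2^{(p-1)/2} (p - 1)!}{2^{(p-1)/2} \left(\frac{p-1}{2}\right)!} \\
&\equiv -\frac{1}{\left(\frac{p-1}{2}\right)!} \pmod{p},
\end{align*}
where the last line follows from Wilson’s Theorem.
We now consider the coefficient of $z^{(p-1)/2}$ for $(e^{pz} - 1)g(e^z)$. Note that $f(X) \in \mathbb{Z}[X]$, so $g(X) \in \mathbb{Z}[X]$ as well. But the coefficients of $e^{pz} - 1 = \sum_{k \ge 1} (pz)^k / k!$ are $0$ mod $p$ for every power $z^k$ with $k < p$ (where $k!$ is invertible mod $p$), hence the coefficients of $z^k$ for $k < p$ in the expansion of $(e^{pz} - 1)g(e^z)$ vanish mod $p$. In particular this holds for $k = (p-1)/2$.

Collecting our results, this means the coefficient of $z^{(p-1)/2}$ for the sum and $\varepsilon$ times the product must agree mod $p$, i.e.,
\[
\frac{p-1}{\left(\frac{p-1}{2}\right)!} \equiv -\varepsilon \frac{1}{\left(\frac{p-1}{2}\right)!} \pmod{p}.
\]
Since $p - 1 \equiv -1$, this says $\varepsilon \equiv 1 \pmod{p}$, and as $\varepsilon = \pm 1$ with $p$ odd, it is clear now that $\varepsilon = +1$.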
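Combining Kronecker's theorem with Proposition 9.12 pins down the Gauss sum exactly: $g = \sqrt{p}$ when $p \equiv 1 \pmod 4$ and $g = i\sqrt{p}$ when $p \equiv 3 \pmod 4$. Here is a quick floating-point sketch of that conclusion; the helper names are invented for this example, and the Legendre symbol is computed by Euler's criterion.

```python
import cmath
import math


def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def gauss_sum(p: int) -> complex:
    """g = sum_{t=1}^{p-1} (t/p) * zeta^t with zeta = e^(2*pi*i/p)."""
    zeta = cmath.exp(2j * cmath.pi / p)
    return sum(legendre(t, p) * zeta ** t for t in range(1, p))


def predicted_gauss_sum(p: int) -> complex:
    """The value with epsilon = +1: sqrt(p) or i*sqrt(p)."""
    return math.sqrt(p) if p % 4 == 1 else 1j * math.sqrt(p)
```

For instance, for $p = 5$ the sum is $\zeta - \zeta^2 - \zeta^3 + \zeta^4 = 2\cos(72^\circ) - 2\cos(144^\circ) = \sqrt{5}$, with a plus sign as promised.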

10 10/13 - Finite Fields


We now study a really beautiful and fruitful topic: finite fields. We are quite familiar
with perhaps the simplest kind of finite field: Fp = Z/pZ, the field with p elements.
One can show that any two fields with p elements are isomorphic to each other. This
gives us a way of classifying all finite fields of a certain prime order – namely, there
is only one up to isomorphism. Looking outward at all possible sizes, we pose the
question

Can we classify all finite fields?

Just as a note, a field is a set closed under addition and multiplication, and it has the important property that every nonzero element has a multiplicative inverse. For example, $\mathbb{Q}$ and $\mathbb{Z}/7\mathbb{Z}$ are fields because, for instance, $2^{-1} = 1/2 \in \mathbb{Q}$ and $2^{-1} = 4$ in $\mathbb{Z}/7\mathbb{Z}$ (since $2 \cdot 4 \equiv 1 \pmod{7}$), but $2^{-1} = 1/2 \notin \mathbb{Z}$ so $\mathbb{Z}$ is not a field.
We first provide a result that tells us exactly what the size of a finite field can be.

Lemma 10.1 (Finite fields have prime power order)


If $k$ is a finite field, then $\mathbb{F}_p \subseteq k$ is a subfield for some prime $p$ and $|k| = p^r = q$ for some integer $r \ge 1$. (In this case, we often write $k = \mathbb{F}_q$.)


Proof. As $k$ is finite, the set under addition is a finite abelian group, so for any $x$,
\[
|k| \cdot x = \underbrace{x + \cdots + x}_{|k| \text{ times}} = 0.
\]
Denote $q := |k|$. Let $m$ be the minimal positive integer such that $m \cdot 1 = 0$. We claim that $m$ is prime. If $m$ is not prime, then we can write $m = a \cdot b$ for some integers $a, b > 1$. But then
\begin{align*}
m \cdot 1 &:= \underbrace{1 + \cdots + 1}_{m \text{ times}} \\
&= \big(\underbrace{1 + \cdots + 1}_{a \text{ times}}\big)\big(\underbrace{1 + \cdots + 1}_{b \text{ times}}\big) \\
&= (a \cdot 1)(b \cdot 1),
\end{align*}
and in general if $rs = 0$ in a field, then either $r = 0$ or $s = 0$. But then we have either $a \cdot 1 = 0$ or $b \cdot 1 = 0$, which contradicts the minimality of $m$. Thus, $m = p$ is a prime, so $k$ contains any $\underbrace{1 + \cdots + 1}_{\ell \text{ times}}$ for $\ell \le p$, i.e. it contains $\mathbb{F}_p$.
The fact that $|k|$ is some power of $p$ comes from the fact that $k$ is an $\mathbb{F}_p$-vector space. We can check the axioms of a vector space manually: given $x, y \in k$ and $a, b \in \mathbb{F}_p$, we have (1) $a \cdot x$ is just multiplication in $k$; (2) $a \cdot (x + y) = a \cdot x + a \cdot y$; (3) $a(b \cdot x) = (a \cdot b) \cdot x$; (4) $(a + b) \cdot x = a \cdot x + b \cdot x$. [8]
Now, thinking of $k$ as an $\mathbb{F}_p$-vector space, we will find an $\mathbb{F}_p$-basis for $k$. In other words, we will find a minimal set of elements $r_1, \ldots, r_n \in k$ such that any $a \in k$ can be written as $a = a_1 r_1 + \cdots + a_n r_n$ for some $a_i \in \mathbb{F}_p$. We can show that the minimality of this set implies these $a_i$ are unique. [9] If $a = a_1 r_1 + \cdots + a_n r_n = b_1 r_1 + \cdots + b_n r_n$ with not all $a_i = b_i$ (WLOG rearrange basis elements so that $a_1 \neq b_1$), then we have
\begin{align*}
0 &= (a_1 - b_1) r_1 + (a_2 - b_2) r_2 + \cdots + (a_n - b_n) r_n \\
(b_1 - a_1) r_1 &= (a_2 - b_2) r_2 + \cdots + (a_n - b_n) r_n \\
\implies r_1 &= (b_1 - a_1)^{-1} \big((a_2 - b_2) r_2 + \cdots + (a_n - b_n) r_n\big),
\end{align*}
which is legal because we assumed $a_1 \neq b_1 \implies b_1 - a_1 \neq 0$, and it is an element of the field $\mathbb{F}_p$, so there exists an inverse. But then we can express $r_1$ in terms of $r_2, \ldots, r_n$, so the set $r_1, \ldots, r_n$ is no longer minimal. Thus, the $a_i$’s are unique.
This provides us with a bijection of sets (in fact, an isomorphism of additive groups) $k = \underbrace{\mathbb{F}_p \times \cdots \times \mathbb{F}_p}_{n \text{ times}}$ where $a \mapsto (a_1, \ldots, a_n)$. [10] Thus, $|k| = |\mathbb{F}_p \times \cdots \times \mathbb{F}_p| = p^n$ as desired.

[8] Honestly I’m not sure if this covers all the vector space axioms, but the main point is that everything is sunshine and rainbows because $\mathbb{F}_p$ literally lives in $k$, so $k$ is an $\mathbb{F}_p$-vector space.
[9] This is just saying basis elements are linearly independent, for those who have seen this stuff before.
[10] Note this is not a field isomorphism; in fact, $\mathbb{F}_p \times \cdots \times \mathbb{F}_p$ not only fails to be a field, but it fails to be an integral domain. Convince yourself of this!


This is really great news and a strong result off the bat. However, this tells us practically nothing else about the field $k$. The only elements we have a hold on are those in the copy of $\mathbb{F}_p$ living inside $k$, but otherwise we are a bit lost. (What even are the basis elements of $k$?) We will now prove some things which tell us more information about $\mathbb{F}_q$ (where $q = p^n$).

Lemma 10.2
Suppose $k$ is a finite field with $q = p^n$ elements. Then, the polynomial $X^q - X \in \mathbb{F}_p[X]$ factors in $k[X]$ as
\[
X^q - X = \prod_{\alpha \in k} (X - \alpha).
\]

Proof. Note $k^\times = k \setminus \{0\}$ has size $q - 1$. Since $k^\times$ is itself a multiplicative group, if $y \in k^\times$, then $y^{q-1} = 1$. Equivalently, $\alpha^q - \alpha = 0$ for any $\alpha \in k$ (including $\alpha = 0$). Thus, $\alpha$ is a root of $X^q - X = 0$, so we have $(X - \alpha) \mid X^q - X$. All of the $(X - \alpha)$ terms are relatively prime, so it follows that
\[
\prod_{\alpha \in k} (X - \alpha) \;\Big|\; X^q - X.
\]
But both the left and right hand sides are monic of degree $q$, so they must be equal.
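In the simplest case $k = \mathbb{F}_p$ (so $n = 1$), the lemma says $X^p - X = \prod_{a=0}^{p-1} (X - a)$ in $\mathbb{F}_p[X]$, and this is easy to check by brute force. The sketch below uses a made-up representation (coefficient lists, lowest degree first) and hypothetical helper names; it only covers prime fields.

```python
def poly_mul_mod(a: list, b: list, p: int) -> list:
    """Multiply two coefficient lists (lowest degree first), reducing mod p."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % p
    return out


def product_of_linear_factors(p: int) -> list:
    """Expand prod_{alpha in F_p} (X - alpha) over F_p."""
    prod = [1]
    for alpha in range(p):
        prod = poly_mul_mod(prod, [(-alpha) % p, 1], p)
    return prod


def x_to_the_p_minus_x(p: int) -> list:
    """Coefficient list of X^p - X over F_p."""
    coeffs = [0] * (p + 1)
    coeffs[1] = (-1) % p
    coeffs[p] = 1
    return coeffs
```

For $p = 3$ this is the computation $X(X-1)(X-2) = X^3 - 3X^2 + 2X \equiv X^3 - X \pmod 3$.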

Whenever we talk about a new object, we always care about maps related to the
object. Finite fields have a really, really useful map that comes with them, called
the Frobenius automorphism. An automorphism is just a bijection which is also a
(field) homomorphism: if σ is an automorphism, then σ is a bijection and it satisfies
both σ(x + y) = σ(x) + σ(y) and σ(xy) = σ(x)σ(y).

Lemma 10.3 (Frobenius automorphism)


Let $k$ be a finite field with $q = p^n$ elements. Then, the map
\begin{align*}
\sigma : k &\to k \\
\alpha &\mapsto \alpha^p
\end{align*}
is an automorphism, called the Frobenius automorphism. Furthermore, the set of elements fixed by $\sigma$, given by $k^\sigma = \{x \in k \mid \sigma(x) = x\}$, is exactly $\mathbb{F}_p \subset k$.

Proof. We first verify that $\sigma$ is a homomorphism. We easily observe $\sigma(xy) = (xy)^p = x^p y^p = \sigma(x)\sigma(y)$. We also have
\begin{align*}
\sigma(x + y) = (x + y)^p &= x^p + \binom{p}{1} x^{p-1} y + \cdots + \binom{p}{p-1} x y^{p-1} + y^p \\
&= x^p + y^p = \sigma(x) + \sigma(y),
\end{align*}


where the second line follows from the fact $p \mid \binom{p}{m}$ for $1 \le m \le p - 1$, so all of the middle terms vanish in $k$.

Now we show $\sigma$ is bijective. It suffices to show that $\sigma$ is injective, because then it shows $|\operatorname{Im} \sigma| = |k|$, but the codomain of $\sigma$ is just $k$, so $\operatorname{Im} \sigma = k$, i.e. $\sigma$ is also surjective. If $\sigma(x) = \sigma(y)$, then $\sigma(x - y) = (x - y)^p = 0$. But this is only possible if $x - y = 0$, i.e. $x = y$.
Finally, we show $k^\sigma = \mathbb{F}_p$. If $\beta \in k^\sigma$, then by definition $\sigma(\beta) - \beta = \beta^p - \beta = 0$. But we know any $\alpha \in \mathbb{F}_p$ satisfies $X^p - X = 0$ by Fermat’s Little Theorem, so as in the above lemma, we have
\[
\prod_{\alpha \in \mathbb{F}_p} (X - \alpha) \;\Big|\; X^p - X.
\]
Comparing degrees, we see that elements of $\mathbb{F}_p$ are the only roots of $X^p - X$, so $\beta \in \mathbb{F}_p$ as desired.
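To see the Frobenius map on something bigger than $\mathbb{F}_p$, here is a small sketch on a field with $9$ elements. It assumes the model $\mathbb{F}_3[X]/(X^2 + 1)$ (the kind of quotient constructed in Section 10.1; $X^2 + 1$ has no roots mod $3$, so it is irreducible), and represents $a + b\alpha$ with $\alpha^2 = -1$ as the pair `(a, b)`. All names here are made up for the example.

```python
from itertools import product

P = 3  # characteristic; elements are pairs (a, b) meaning a + b*alpha, alpha^2 = -1


def add(x, y):
    """Add two elements of F_9."""
    return ((x[0] + y[0]) % P, (x[1] + y[1]) % P)


def mul(x, y):
    """Multiply two elements of F_9 using alpha^2 = -1."""
    a, b = x
    c, d = y
    return ((a * c - b * d) % P, (a * d + b * c) % P)


def frobenius(x):
    """The Frobenius map x -> x^P, computed by repeated multiplication."""
    out = (1, 0)
    for _ in range(P):
        out = mul(out, x)
    return out


ELEMENTS = list(product(range(P), repeat=2))

# Elements fixed by Frobenius; the lemma predicts exactly the copy of F_3.
fixed = [x for x in ELEMENTS if frobenius(x) == x]
```

In this model $\sigma(a + b\alpha) = a - b\alpha$, so Frobenius looks exactly like complex conjugation, and its fixed points are the elements with $b = 0$.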

10.1 Construction of Finite Fields


Now we work towards constructing these finite fields, since again, we have no hold on
its elements yet besides the elements of Fp . This discussion will develop in conjunction
with working towards the following result:

Theorem 10.4 (Finite fields of order q are unique)


Let $q = p^n$. Then, there exists a field $k$ with $q$ elements, and $k$ is unique up to isomorphism.

Note that because of the existence of the Frobenius automorphism, k is not unique
up to canonical isomorphism. However, it turns out that any automorphism of k is
simply some power of the Frobenius automorphism. (This is why we say the Frobenius
automorphism is really important.) This is a taste of a beautiful study called Galois
theory, which on the ground is about these automorphisms of fields but can be used to
prove wildly vast things! (In fact, the study arose from proving that there is no quintic
formula.)
We now proceed with the construction of a finite field with $q$ elements. Let $k$ be a field and let $f(x) \in k[x]$ be an irreducible monic polynomial. We will consider $k[x]$ modulo $f$: we say $g, h \in k[x]$ are equivalent (denoted $g \sim_f h$) if $f \mid g - h$. (Think of this as just $g \equiv h \pmod{f}$.) This now gives us a construction:

Proposition 10.5
The set of equivalence classes of k[x] under the equivalence relation ∼f is a field.

Proof. We first show that the set has addition and multiplication. I will also suppress


the $f$ in $\sim_f$ so I can type faster lol. If $h_1 \sim g_1$ and $h_2 \sim g_2$, then $f \mid h_1 - g_1$ and $f \mid h_2 - g_2$, so $f \mid (h_1 + h_2) - (g_1 + g_2) \implies h_1 + h_2 \sim g_1 + g_2$. Similarly, $f \mid h_1(h_2 - g_2) + g_2(h_1 - g_1) = h_1 h_2 - g_1 g_2 \implies h_1 h_2 \sim g_1 g_2$.
Now we show that the set is a field. Take some nonzero $g \in k[x]/{\sim}$. This just means $f \nmid g$. Consider the ideal $(f, g) = \{af + bg \mid a, b \in k[x]\}$. Recall from the first few lectures (somewhere around Theorem 2.4) that $k[x]$ is a principal ideal domain (a GCD exists between any two elements), so $(f, g) = (h)$ for some $h \in k[x]$, which we may take to be monic. But this means $f \in (h)$, or equivalently $h \mid f$, which is only possible given $f$ is irreducible when $\deg h = 0$ or $h = f$. If $h = f$, then $g \in (h) \implies f = h \mid g$, a contradiction to our choice of $g$. So $\deg h = 0$, i.e. $h = 1$. Thus, $\exists\, a, b \in k[x]$ such that $af + bg = 1$, so $bg = 1 - af \sim 1$. This shows $k[x]/{\sim}$ is a field.
(If this was a lot to handle, this is the tl;dr of the argument. If $g$ lives in a nonzero equivalence class, then since $f$ is irreducible, $f$ and $g$ are coprime, so their gcd is $1$. That means we have some polynomials $a, b$ such that $af + bg = 1$, so $bg \equiv 1 \pmod{f}$, i.e. $b$ is the inverse of $g$.)
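The tl;dr is really an algorithm: run the extended Euclidean algorithm on $f$ and $g$ to find $b$ with $bg \equiv 1 \pmod f$. Below is a minimal sketch for $k = \mathbb{F}_p$, assuming polynomials are stored as coefficient lists with the lowest degree first; the function names are hypothetical, and `pow(x, -1, p)` needs Python 3.8 or later.

```python
def trim(a):
    """Drop trailing zero coefficients (lists are lowest degree first)."""
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a


def poly_sub(a, b, p):
    """Return a - b with coefficients mod p."""
    n = max(len(a), len(b))
    a = list(a) + [0] * (n - len(a))
    b = list(b) + [0] * (n - len(b))
    return trim([(x - y) % p for x, y in zip(a, b)])


def poly_mul(a, b, p):
    """Return a * b with coefficients mod p."""
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % p
    return trim(out)


def poly_divmod(a, b, p):
    """Return (q, r) with a = q*b + r and deg r < deg b, for nonzero b."""
    a, b = trim(a), trim(b)
    q = [0] * max(len(a) - len(b) + 1, 1)
    lead_inv = pow(b[-1], -1, p)
    while len(a) >= len(b):
        shift = len(a) - len(b)
        c = a[-1] * lead_inv % p
        q[shift] = c
        a = poly_sub(a, poly_mul([0] * shift + [c], b, p), p)
    return trim(q), a


def inverse_mod(g, f, p):
    """Return b with b*g = 1 (mod f), via the extended Euclidean algorithm."""
    r0, r1 = trim(f), poly_divmod(g, f, p)[1]
    b0, b1 = [], [1]  # invariant: r_i is congruent to b_i * g modulo f
    while r1:
        q, r = poly_divmod(r0, r1, p)
        r0, r1 = r1, r
        b0, b1 = b1, poly_sub(b0, poly_mul(q, b1, p), p)
    if len(r0) != 1:  # the gcd is not a nonzero constant
        raise ValueError("g is not invertible modulo f")
    c = pow(r0[0], -1, p)
    return poly_divmod(poly_mul([c], b0, p), f, p)[1]
```

As a usage example, with $p = 3$ and $f = X^2 + 1$, calling `inverse_mod([1, 1], [1, 0, 1], 3)` inverts $1 + X$ and gives `[2, 1]`, i.e. $2 + X$; indeed $(1 + X)(2 + X) = 2 + 3X + X^2 \equiv 2 - 1 = 1$.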

Okay, we constructed a field. What does this field look like? How many elements
does it have?

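Corollary 10.6
If $f$ has degree $d$, then $k[X]/{\sim_f}$ is a vector space over $k$ with dimension $d$, with a basis given by $1, X, X^2, \ldots, X^{d-1}$.

Proof. Since $k[X]$ is a Euclidean domain, it has a Division Algorithm. Thus, for any $g \in k[X]$, we can write $g = q \cdot f + r$ for some $q, r \in k[X]$ and $\deg r < \deg f$. This means $g \sim_f r$, but $r$ is written using just $1, X, \ldots, X^{d-1}$ (as $\deg r < \deg f = d$), so these elements span. They are also linearly independent: a nonzero combination of them is a nonzero polynomial of degree less than $d$, which $f$ cannot divide. So we are done.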

This notation $k[X]/{\sim_f}$ is a little clunky. Sometimes, we write $k[X]/{\sim_f}$ as $k(\alpha)$, where $\alpha$ is a “solution” of $f(X) = 0$. I find this a particularly useful perspective, because polynomials are a bit clunky to actually work with, but I can work with numbers. For instance, if we expand our view a little to consider polynomial rings over $\mathbb{Z}$, then we can consider $f(X) = X^2 + 1$. Then, we could show in a similar fashion that $\mathbb{Z}[X]/{\sim_f}$ is a ring. But what we’re actually doing here is we’re adjoining an element $X$ satisfying $X^2 + 1 = 0$. We have another name for such an element! We call it $i = \sqrt{-1}$. So $\mathbb{Z}[X]/{\sim_f}$ is actually just $\mathbb{Z}[i]$, written in a way where you don’t have to write down some element out of thin air.
Another perspective, for those who know stuff about rings and ideals, is that this
is actually just the quotient k[X]/(f ), where (f ) is again the ideal generated by f . If
f is irreducible, then (f ) is maximal, which is equivalent to saying k[X]/(f ) is a field.
You probably never showed that when k is a finite field, this quotient field is actually
finite; this is what we did today.


10.2 Existence of Fq
Now we prove another quite remarkable fact. We showed earlier that $X^q - X$ factors into $q$ linear factors in $k[X]$, since every $\alpha \in k$ satisfies $\alpha^q - \alpha = 0$. But what if we just considered the factorization of $X^q - X$ in $\mathbb{F}_p[X]$? A remarkable thing happens:

Theorem 10.7
Let $q = p^n$. Then,
\[
X^q - X = X^{p^n} - X = \prod_{d \mid n} F_d(X) \in \mathbb{F}_p[X],
\]
where $F_d$ is the product of all monic irreducible polynomials in $\mathbb{F}_p[X]$ of degree $d$.

This gives us access to irreducible polynomials! This may not seem so cool at first,
but I challenge you to find an irreducible polynomial of, say, F7 [X] of degree 4 and
see who’s laughing by the end of your computation. We care about these irreducible
polynomials because, well, we use them to construct our finite fields. In line with this,
we have the existence of a finite field of order pn :

Corollary 10.8
There exists an irreducible polynomial f ∈ Fp [X] of degree n.

Proof. Assume Theorem 10.7. Let $N_d$ be the number of monic irreducible polynomials in $\mathbb{F}_p[X]$ of degree $d$. Then, looking at the degrees of both sides of the equation in the Theorem, we have $q = p^n = \sum_{d \mid n} d \cdot N_d$. We will now apply Möbius inversion to $f(d) = d \cdot N_d$; this gives us
\[
f(n) = n N_n = \sum_{d \mid n} \mu(n/d)\, p^d.
\]
This gives us an explicit form for $N_n$, and one can check that the sum on the right is nonzero (it is a sum of distinct powers of $p$, each with coefficient $\pm 1$, so the smallest power of $p$ that appears divides it exactly), so $N_n \neq 0$ and indeed there exists an irreducible polynomial of degree $n$ in $\mathbb{F}_p[X]$.
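The formula $N_n = \frac{1}{n} \sum_{d \mid n} \mu(n/d)\, p^d$ is easy to test against a brute-force count for small $p$ and $n$. The sketch below is only illustrative: the helper names are invented, polynomials are coefficient lists (lowest degree first), and irreducibility is tested by naive trial division, which is hopeless for large inputs.

```python
from itertools import product


def mobius(n: int) -> int:
    """Moebius function mu(n), by trial division."""
    result = 1
    d = 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0  # a square divides the original n
            result = -result
        d += 1
    if n > 1:
        result = -result
    return result


def count_by_formula(p: int, n: int) -> int:
    """N_n = (1/n) * sum over d | n of mu(n/d) * p^d."""
    total = sum(mobius(n // d) * p ** d for d in range(1, n + 1) if n % d == 0)
    return total // n


def poly_rem(a, b, p):
    """Remainder of a divided by a monic polynomial b, coefficients mod p."""
    a = list(a)
    while len(a) >= len(b):
        c = a[-1]
        if c:
            shift = len(a) - len(b)
            for i, y in enumerate(b):
                a[shift + i] = (a[shift + i] - c * y) % p
        a.pop()
    return a


def monic_polys(p: int, d: int):
    """Yield all monic polynomials of degree d over F_p."""
    for tail in product(range(p), repeat=d):
        yield list(tail) + [1]


def is_irreducible(f, p: int) -> bool:
    """Trial division by every monic polynomial of degree 1..deg(f)//2."""
    n = len(f) - 1
    for d in range(1, n // 2 + 1):
        for g in monic_polys(p, d):
            if not any(poly_rem(f, g, p)):
                return False
    return True


def count_by_brute_force(p: int, n: int) -> int:
    """Count monic irreducible polynomials of degree n over F_p directly."""
    return sum(1 for f in monic_polys(p, n) if is_irreducible(f, p))
```

For the challenge above, `count_by_formula(7, 4)` evaluates $(7^4 - 7^2)/4 = 588$, so there are plenty of irreducible quartics over $\mathbb{F}_7$; the formula just doesn't hand you one.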

Corollary 10.9
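There exists a finite field of order $p^n$.

Proof. From the above corollary, there is an irreducible polynomial $f \in \mathbb{F}_p[X]$ of degree $n$. Then, $\mathbb{F}_p[X]/{\sim_f}$ is your desired field, since by Corollary 10.6 it is an $n$-dimensional $\mathbb{F}_p$-vector space and so has $p^n$ elements.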


We will give a proof for Theorem 10.7 in the next lecture.

11 10/16 - Finite Fields, continued


11.1 Completing Proof of Existence
As promised, we will now prove Theorem 10.7.

Proof of Theorem 10.7. First, we will show that no factor of $X^q - X$ appears twice, namely if $f \in \mathbb{F}_p[X]$ with $\deg f > 0$ such that $f \mid X^{p^n} - X$, then $f^2 \nmid X^{p^n} - X$. Suppose the contrary, so $f^2 \cdot g = X^{p^n} - X$. We will reach a contradiction shortly after a brief interlude on derivatives.
In finite fields, we have a notion of a formal derivative. In high school calculus, you learn the derivative via the limit definition, but afterwards, you forget about the limit and just manipulate symbols. For example, you can prove $\frac{d}{dx} x^2 = 2x$ via the limit definition, but you probably know this as just “oh, the rule for the derivative of $x^n$ is just $n x^{n-1}$.”
For finite fields, we define the derivative as just this symbolic manipulation. So in $\mathbb{F}_p[X]$, we still write $\frac{d}{dx} x^2 = 2x$, except now it doesn’t mean anything more than what we just wrote down. Limits don’t make sense here, anyways! But this derivative still obeys all the things we expect from calculus (e.g. Chain Rule, Product Rule, etc.), so we can work with this.
To summarize, we have a map $\frac{d}{dx} : \mathbb{F}_p[X] \to \mathbb{F}_p[X]$ where $X^n \mapsto n \cdot X^{n-1}$. Taking the “derivative” on both sides of $f^2 \cdot g = X^{p^n} - X$, we get
\[
2 f'(X) f(X) g(X) + f^2(X) g'(X) = p^n X^{p^n - 1} - 1 = -1,
\]
where the last equality follows because we are working over $\mathbb{F}_p$. But $f$ divides the left hand side, so we must have $f \mid -1$, which is our contradiction. Thus, any factor of $X^{p^n} - X$ has multiplicity one.
Now, it remains to show that the irreducible factors of $X^{p^n} - X$ are exactly the monic irreducibles of degree $d \mid n$. In other words, if $f \in \mathbb{F}_p[X]$ is an irreducible monic, then $f \mid X^{p^n} - X$ if and only if $d = \deg f \mid n$. We will approach this via the following lemma:

Lemma 11.1
Let $\ell, m \in \mathbb{N}^+$. If $F$ is a field, then $X^\ell - 1 \mid X^m - 1$ in $F[X]$ if and only if $\ell \mid m$. Likewise, if $a \in \mathbb{N}_{>1}$, then $a^\ell - 1 \mid a^m - 1$ if and only if $\ell \mid m$.

Proof. If $\ell \mid m$, then clearly $X^\ell - 1 \mid X^m - 1$. We prove the reverse implication. Let


$m = q\ell + r$ where $0 \le r < \ell$. Then,
\begin{align*}
\frac{X^m - 1}{X^\ell - 1} &= \frac{X^{q\ell + r} - 1}{X^\ell - 1} \\
&= X^r \cdot \frac{X^{q\ell} - 1}{X^\ell - 1} + \frac{X^r - 1}{X^\ell - 1}.
\end{align*}
Note $X^\ell - 1 \mid X^{q\ell} - 1$ (we have $X^{q\ell} - 1 = (X^\ell)^q - 1 \equiv 1^q - 1 = 0 \pmod{X^\ell - 1}$), so the first term on the right is a polynomial. However, the last term on the right is not a polynomial since $r < \ell$, unless $r = 0$ in which case $\ell \mid m$, as desired.
The proof for the second part of the statement follows from our work above (the proof is exactly the same).
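The integer half of the lemma can be checked directly for small values. This is just a brute-force sketch with made-up helper names; it tests both directions of the "if and only if" at once.

```python
def divides(a: int, b: int) -> bool:
    """True when a divides b (a must be nonzero)."""
    return b % a == 0


def lemma_holds(a: int, ell: int, m: int) -> bool:
    """Check that (a^ell - 1 | a^m - 1) is equivalent to (ell | m), for a > 1."""
    return divides(a ** ell - 1, a ** m - 1) == divides(ell, m)


# Exhaustive check over a small box of parameters.
all_ok = all(
    lemma_holds(a, ell, m)
    for a in range(2, 7)
    for ell in range(1, 13)
    for m in range(1, 13)
)
```

For example, $2^3 - 1 = 7$ divides $2^6 - 1 = 63$ but not $2^4 - 1 = 15$, matching $3 \mid 6$ and $3 \nmid 4$.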

We return to the problem at hand. Suppose $f \in \mathbb{F}_p[X]$ is a monic irreducible polynomial with $\deg f = d$. Consider the field constructed from $f$, namely $K = \mathbb{F}_p[X]/{\sim_f} = \mathbb{F}_p(\alpha)$ (where $\alpha$ is a root of $f$, and more specifically $f$ is the minimal polynomial of $\alpha$). We showed last time (Corollary 10.6) that $K$ is a finite field of dimension $d$ over $\mathbb{F}_p$, so $|K| = p^d$. Since $\alpha \in K^\times$ and $|K^\times| = |K| - 1 = p^d - 1$, it follows that $\alpha^{p^d - 1} - 1 = 0$, or $\alpha^{p^d} - \alpha = 0$. But $f$ is the minimal polynomial of $\alpha$, so we have $f(X) \mid X^{p^d} - X$.
This sets us up nicely to complete the proof from here. If $d \mid n$, then Lemma 11.1 tells us that $p^d - 1 \mid p^n - 1$; the Lemma again tells us now that $X^{p^d - 1} - 1 \mid X^{p^n - 1} - 1$, or $X^{p^d} - X \mid X^{p^n} - X$. Thus, since $f$ divides the left hand side, we have $f(X) \mid X^{p^n} - X$.
Now we wish to show the converse, namely that $f(X) \mid X^{p^n} - X$ implies $d \mid n$. If $f(X) \mid X^{p^n} - X$, then we can write $X^{p^n} - X = f(X) \cdot g(X)$. Then, $\alpha^{p^n} - \alpha = f(\alpha)g(\alpha) = 0$. Recall (basically Corollary 10.6) that $1, \alpha, \ldots, \alpha^{d-1}$ is an $\mathbb{F}_p$-basis of $K = \mathbb{F}_p(\alpha)$, so any $\beta \in K$ can be written as $\beta = a_0 + a_1 \alpha + a_2 \alpha^2 + \cdots + a_{d-1} \alpha^{d-1}$ for some $a_i \in \mathbb{F}_p$. Taking both sides to the $p^n$-th power, the Freshman’s Dream (which just says $(a + b)^p = a^p + b^p$ in characteristic $p$) tells us that
\begin{align*}
\beta^{p^n} &= a_0^{p^n} + a_1^{p^n} \alpha^{p^n} + \cdots + a_{d-1}^{p^n} \alpha^{(d-1)p^n} \\
&= a_0 + a_1 \alpha + \cdots + a_{d-1} \alpha^{d-1} = \beta,
\end{align*}
using $a_i^{p^n} = a_i$ for $a_i \in \mathbb{F}_p$ and $\alpha^{p^n} = \alpha$. So $\beta^{p^n} - \beta = 0$ for any $\beta \in \mathbb{F}_p(\alpha)$. But at the same time, Lemma 10.2 tells us that
\[
X^{p^d} - X = \prod_{\beta \in K} (X - \beta),
\]
so in particular $\beta$ satisfies $\beta^{p^d} - \beta = 0$. But then any $\beta \in K$ satisfies both $X^{p^d} - X = 0$ and $X^{p^n} - X = 0$, and the only roots of the former are elements of $K$, so we have $X^{p^d} - X \mid X^{p^n} - X$. In other words, $X^{p^d - 1} - 1 \mid X^{p^n - 1} - 1$, in which case Lemma 11.1 tells us that $p^d - 1 \mid p^n - 1$ and hence $d \mid n$.

The crux of the above proof relies on Lemma 11.1, which basically gives us a nice
divisibility criterion on the exponents given divisibility of polynomials.


11.2 Uniqueness of Fq
We have almost completed the story of finite fields. We will now complete the promise
given by Theorem 10.4, which not only guarantees existence, but says that the field Fq
is unique up to isomorphism. We now prove the uniqueness.

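Proof of uniqueness, Theorem 10.4. Let $q = p^n$ and suppose $F$ is a finite field of order $q$. Fix a monic irreducible $f \in \mathbb{F}_p[X]$ of degree $n$ (one exists by Corollary 10.8). We will show that $F$ is isomorphic to $\mathbb{F}_p[X]/{\sim_f}$; since this applies to every field of order $q$ with the same $f$, any two such fields are isomorphic. Note Theorem 10.7 tells us that $f(X) \mid X^q - X$, but $X^q - X = \prod_{\alpha \in F} (X - \alpha)$, so $\exists\, \alpha \in F$ such that $f(\alpha) = 0$. We can now identify $F$ as $\mathbb{F}_p(\alpha)$ via the following isomorphism:
\begin{align*}
\mathbb{F}_p[X]/(f) &\to F \\
X &\mapsto \alpha.
\end{align*}
One should be more careful than Kisin here by showing this is actually an isomorphism, but if you know the First Isomorphism Theorem, this is not too bad. Since $f$ is irreducible, it is the minimal polynomial of $\alpha$. Thus, the kernel of the map $\mathbb{F}_p[X] \to F$ sending $X \mapsto \alpha$ is exactly the ideal $(f)$, so $\mathbb{F}_p[X]/(f) \to F$ is injective. Both sides have $p^n$ elements, so it is also surjective, hence an isomorphism, as desired.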

Proposition 11.2
If $|F| = q = p^n$, then the subfields of $F$ are in bijection with the divisors of $n$.

Proof. Let $E \subseteq F$ be a subfield, and denote $d = \dim_{\mathbb{F}_p} E$. We have $E^\times \subseteq F^\times$ as a multiplicative subgroup, which means $|E^\times| = p^d - 1$ divides $p^n - 1 = |F^\times|$, which means (again, Lemma 11.1) that $d \mid n$.
To construct a subfield given a divisor $d$ of $n$, consider $E = \left\{\alpha \in F \mid \alpha^{p^d} - \alpha = 0\right\}$. In other words, $E$ is the set of solutions to $X^{p^d} - X = 0$ in $F$, which we know has $p^d$ distinct solutions. (This is true because $X^{p^d} - X \mid X^{p^n} - X$, and the latter splits into distinct linear factors over $F$.) We now show $E$ has a field structure: it is closed under addition because $(\alpha + \beta)^{p^d} = \alpha^{p^d} + \beta^{p^d}$, it is closed under multiplication since $(\alpha\beta)^{p^d} = \alpha^{p^d} \beta^{p^d}$, and it has multiplicative inverses because if $\alpha^{p^d} = \alpha$, then we can take the inverses on both sides and $\alpha^{-1}$ still satisfies $X^{p^d} = X$.

11.3 Interlude: Galois theory preview


Let us give you a little preview of Galois theory, because what we’re doing here when talking about subfields is really discussing the simplest scenario in Galois theory. Recall the Frobenius automorphism $\sigma : F \to F$ sending $\alpha \mapsto \alpha^p$. We saw in Lemma 10.3 that the subfield fixed by $\sigma$, notated $F^\sigma$, is just $\mathbb{F}_p$. If $E \subseteq F$ is a subfield with $|E| = p^d$ (we


just showed then that $d \mid n$), then we can identify $E$ as the subfield of $F$ fixed by $\sigma^d$. In short,
\[
E = F^{\sigma^d} = \{\alpha \in F \mid \sigma^d(\alpha) = \alpha^{p^d} = \alpha\}.
\]
What Galois theory does is it relates these subfields to subgroups of what we call the Galois group, which is just the set of automorphisms of $F$ fixing $\mathbb{F}_p$. In the case of $F$ a finite field over $\mathbb{F}_p$, it turns out that the only automorphisms of $F$ fixing $\mathbb{F}_p$ are powers of the Frobenius automorphism, so $\operatorname{Aut}(F/\mathbb{F}_p) = \{1, \sigma, \ldots, \sigma^{n-1}\} \simeq \mathbb{Z}/n\mathbb{Z}$. (You may see this as $\operatorname{Gal}(F/\mathbb{F}_p)$; this is because $F$ is what we call a Galois extension over $\mathbb{F}_p$. If you’re curious to learn more, take Kisin’s Math 123 next semester!)
But what are the subgroups of $\mathbb{Z}/n\mathbb{Z}$? They are simply the multiples of $d$ in $\mathbb{Z}/n\mathbb{Z}$ for $d \mid n$. (For instance, $\{0, 3, 6\}$ is a subgroup of $\mathbb{Z}/9\mathbb{Z}$.) The multiples of $d$ correspond to the subgroup $\{1, \sigma^d, \sigma^{2d}, \ldots, \sigma^{n-d}\}$, or the subgroup generated by $\sigma^d$. But we showed that the field fixed by $\sigma^d$ (hence fixed by this subgroup) is $E$! So there is a bijection between subgroups of $\operatorname{Aut}(F/\mathbb{F}_p)$ and subfields of $F/\mathbb{F}_p$ where a subgroup corresponds to the subfield it fixes. This is really nice, because we have a lot of results in group theory that we can now use.
To conclude this discussion on finite fields, we provide the following nice result:

Lemma 11.3
If $|F| = q = p^n$, then $F^\times$ is cyclic. (Hence it is isomorphic to $\mathbb{Z}/(q - 1)\mathbb{Z}$.)

The proof follows the proof for when $q = p$ is just a prime (Theorem 5.12), so we omit it for brevity.
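Since the proof is omitted, here is at least a concrete instance. The sketch below assumes the model $\mathbb{F}_9 = \mathbb{F}_3[X]/(X^2 + 1)$, with $a + b\alpha$ stored as the pair `(a, b)` and $\alpha^2 = -1$, and searches for generators of $\mathbb{F}_9^\times$ by computing orders directly. The names are made up for this example.

```python
from itertools import product

P = 3  # elements are pairs (a, b) meaning a + b*alpha with alpha^2 = -1


def mul(x, y):
    """Multiply two elements of F_9."""
    a, b = x
    c, d = y
    return ((a * c - b * d) % P, (a * d + b * c) % P)


def order(x):
    """Multiplicative order of a nonzero element x of F_9."""
    n, y = 1, x
    while y != (1, 0):
        y = mul(y, x)
        n += 1
    return n


nonzero = [x for x in product(range(P), repeat=2) if x != (0, 0)]

# Generators of the multiplicative group are the elements of order 9 - 1 = 8.
generators = [x for x in nonzero if order(x) == 8]
```

Notice that $\alpha$ itself only has order $4$, so the "obvious" element is not a generator here, while $1 + \alpha$ is: $(1 + \alpha)^2 = 2\alpha$ and $(1 + \alpha)^4 = -1$.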

11.4 Proof 2.5 of Quadratic Reciprocity


We shift back to Quadratic Reciprocity, using our new information on finite fields. Let $p, q$ be distinct odd primes. We want to prove again Quadratic Reciprocity. Choose some $n \in \mathbb{N}^+$ such that $q^n \equiv 1 \pmod{p}$ (e.g., $n = p - 1$). Let $F$ be a finite field of order $q^n$, so $|F^\times| = q^n - 1$. By Lemma 11.3 above, $F^\times$ is cyclic; let $\gamma \in F^\times$ be a generator. Denote $\lambda := \gamma^{(q^n - 1)/p}$, so $\lambda$ has order exactly $p$. Now consider the Gauss sums
\[
\tau_a = \sum_{t=0}^{p-1} \left(\frac{t}{p}\right) \lambda^{at}.
\]
Denote $\tau := \tau_1$ for ease of notation. Recall the following results we proved back in §9: Proposition 9.5, which says $\tau_a = \left(\frac{a}{p}\right) \tau$, and Proposition 9.7, which says $\tau^2 = (-1)^{(p-1)/2} p$. Denote $p^* := (-1)^{(p-1)/2} \cdot p$. We now track through the following equivalent statements:

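We have $p^*$ is a square mod $q$ if and only if $\tau \in \mathbb{Z}/q\mathbb{Z}$ (viewed as the subfield $\mathbb{F}_q \subseteq F$), as $\tau^2 = p^*$ and the square root of $p^*$ is unique up to sign. But this is equivalent to $\tau^q = \tau$, or equivalently
\[
\tau = \left(\sum_{t=0}^{p-1} \left(\frac{t}{p}\right) \lambda^t\right)^q = \sum_{t=0}^{p-1} \left(\frac{t}{p}\right)^q \lambda^{qt} = \sum_{t=0}^{p-1} \left(\frac{t}{p}\right) \lambda^{qt} = \tau_q = \left(\frac{q}{p}\right) \tau,
\]
so $\left(\frac{q}{p}\right) = 1$. To recap, we have $\left(\frac{p^*}{q}\right) = 1 \iff \left(\frac{q}{p}\right) = 1$, so their product must be $1$. This gives
\begin{align*}
\left(\frac{p^*}{q}\right)\left(\frac{q}{p}\right) &= \left(\frac{(-1)^{(p-1)/2}\, p}{q}\right)\left(\frac{q}{p}\right) = 1 \\
\implies \left(\frac{(-1)^{(p-1)/2}}{q}\right)\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) &= \left(\frac{-1}{q}\right)^{(p-1)/2}\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) \\
&= (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}} \left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = 1,
\end{align*}
and Quadratic Reciprocity follows.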

12 10/23 - Diophantine Equations


We now transition to the next main part of the class: Diophantine equations. As we go
along, the equations that we consider may seem ad hoc. And this is a fair reflection of
how math has developed over time: big results are a product of literally many, many
years of intense thought. Rather, it is surprising that we are able to assign such a nice
narrative to the development of mathematics.

12.1 Gaussian Integers, a review


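We will begin by reviewing the Gaussian integers, given by $\mathbb{Z}[i]$. This is the set $\{a + bi : a, b \in \mathbb{Z}\}$. We can look at the field of fractions of $\mathbb{Z}[i]$, which are elements of the form $\frac{a + bi}{c + di}$. Rationalizing the denominator, we can express this as $a' + b'i$ for $a', b' \in \mathbb{Q}$. Thus, $\operatorname{Frac} \mathbb{Z}[i] = \mathbb{Q}[i]$, which is a 2-dimensional $\mathbb{Q}$-vector space with basis elements $1$ and $i$. (Like with $\mathbb{Z}[i]$, we have $\mathbb{Q}[i] = \{r + is : r, s \in \mathbb{Q}\}$.)
We will review that $\mathbb{Z}[i]$ is a Euclidean domain. We have a norm function $N : \mathbb{Q}[i] \to \mathbb{Q}$ where $N(r + is) = (r + is)\overline{(r + is)} = (r + is)(r - is) = r^2 + s^2$. We can check that this is multiplicative: $N(\alpha\beta) = \alpha\beta\,\overline{\alpha\beta} = \alpha\overline{\alpha}\,\beta\overline{\beta} = N(\alpha)N(\beta)$. (Here, we are using the fact $\overline{\alpha\beta} = \overline{\alpha} \cdot \overline{\beta}$, which is easy to check.)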

Proposition 12.1
N : Z[i] → Z is a Euclidean function.


Proof. Let $m, n \in \mathbb{Z}[i]$ with $m, n \neq 0$. Write $n/m = a + bi$ where $a, b \in \mathbb{Q}$. Choose $x, y \in \mathbb{Z}$ such that $|a - x|, |b - y| \le 1/2$. Letting $q = x + iy$ and $r = (a - x) + (b - y)i$, we have $n/m = q + r$ so $n = m \cdot q + m \cdot r$. By construction, both $m \cdot q$ and $m \cdot r = n - m \cdot q$ are in $\mathbb{Z}[i]$. We can bound $N(m \cdot r) = N(m)N(r) \le N(m) \cdot ((1/2)^2 + (1/2)^2) = N(m)/2 < N(m)$, so the norm is a Euclidean function.
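The proof is constructive, so it translates directly into a division-with-remainder routine. Here is a hedged sketch using exact integer arithmetic: Gaussian integers are pairs `(a, b)` for $a + bi$, the quotient $n/m$ is computed as $n\overline{m}/N(m)$, and each coordinate is rounded to the nearest integer. The function names (`gdivmod` and friends) are invented for this example, and the returned remainder is what the proof calls $m \cdot r$.

```python
def norm(z):
    """N(a + bi) = a^2 + b^2."""
    return z[0] * z[0] + z[1] * z[1]


def gmul(z, w):
    """Product of two Gaussian integers given as pairs."""
    return (z[0] * w[0] - z[1] * w[1], z[0] * w[1] + z[1] * w[0])


def round_div(num, den):
    """Nearest integer to num/den for den > 0 (ties round up)."""
    return (2 * num + den) // (2 * den)


def gdivmod(n, m):
    """Return (q, rem) with n = m*q + rem and N(rem) <= N(m)/2, for m != 0."""
    a, b = gmul(n, (m[0], -m[1]))  # n * conj(m), so n/m = (a + bi) / N(m)
    d = norm(m)
    q = (round_div(a, d), round_div(b, d))
    mq = gmul(m, q)
    rem = (n[0] - mq[0], n[1] - mq[1])
    return q, rem
```

For example, `gdivmod((5, 0), (2, 1))` returns `((2, -1), (0, 0))`, reflecting $5 = (2 + i)(2 - i)$.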

As a consequence, we obtain Z[i] is a UFD, since any Euclidean domain is a UFD.


(This was the whole point of the first two or so lectures of the course.)
Now we review a result that you all proved on the problem set, which is an incredible
theorem attributed to Fermat.

Theorem 12.2 (Fermat)


If $p \equiv 1 \pmod{4}$ is prime, then $p = a^2 + b^2$ for some $a, b \in \mathbb{Z}$.

Proof. We use the fact that $-1$ is a quadratic residue mod $p$ when $p \equiv 1 \pmod{4}$. (In general, $\left(\frac{-1}{p}\right) = (-1)^{(p-1)/2}$, which is just $1$ when $p \equiv 1 \pmod{4}$.) Therefore, this means there exist some integers $s, k \in \mathbb{Z}$ such that $s^2 + 1 = pk$.
Now we claim $p$ is reducible in $\mathbb{Z}[i]$. Suppose not. Then, since irreducibles are prime in a UFD, $p \mid s^2 + 1 = (s + i)(s - i)$ implies either $p \mid s + i$ or $p \mid s - i$. Clearly this cannot be possible (the imaginary part of $(s \pm i)/p$ is $\pm 1/p \notin \mathbb{Z}$), so $p$ is indeed reducible, meaning we can write $p = \alpha \cdot \beta$ where $\alpha, \beta \in \mathbb{Z}[i]$ are not units. Write $\alpha = a + bi$. Then,
\[
p^2 = N(p) = N(\alpha)N(\beta) = (a^2 + b^2) \cdot N(\beta).
\]
Then, as $a^2 + b^2 \mid p^2$, we have $a^2 + b^2$ is either $1$, $p$, or $p^2$. If it is $1$, then $(a + bi)(a - bi) = 1$, which means $\alpha$ is a unit, contradicting our assumption that $\alpha$ is a non-unit. Likewise, if $a^2 + b^2 = p^2$, then $N(\beta) = \beta \cdot \overline{\beta} = 1$, which again means $\beta$ is a unit, giving us a contradiction. It follows $p = a^2 + b^2$, and we conclude.
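The proof can be turned into a computation: find $s$ with $s^2 \equiv -1 \pmod p$, then take a gcd of $p$ and $s + i$ in $\mathbb{Z}[i]$ using the Euclidean algorithm. That gcd is a proper non-unit divisor $a + bi$ of $p$, so $a^2 + b^2 = p$. The sketch below makes a few assumptions worth flagging: the function names are invented, $s$ is found by the standard trick $s = c^{(p-1)/4}$ for a non-residue $c$, and the input is assumed to be a prime $p \equiv 1 \pmod 4$.

```python
def gmul(z, w):
    """Product of two Gaussian integers given as pairs (a, b) = a + bi."""
    return (z[0] * w[0] - z[1] * w[1], z[0] * w[1] + z[1] * w[0])


def grem(n, m):
    """Remainder of n divided by m in Z[i], rounding the quotient to the nearest point."""
    a, b = gmul(n, (m[0], -m[1]))        # n * conj(m)
    d = m[0] * m[0] + m[1] * m[1]        # N(m)
    q = ((2 * a + d) // (2 * d), (2 * b + d) // (2 * d))
    mq = gmul(m, q)
    return (n[0] - mq[0], n[1] - mq[1])


def sqrt_of_minus_one(p):
    """Some s with s^2 = -1 (mod p), assuming p is a prime with p = 1 (mod 4)."""
    for c in range(2, p):
        s = pow(c, (p - 1) // 4, p)
        if s * s % p == p - 1:
            return s
    raise ValueError("no square root of -1 found; is p a prime that is 1 mod 4?")


def two_squares(p):
    """Return (a, b) with a^2 + b^2 = p via gcd(p, s + i) in Z[i]."""
    x, y = (p, 0), (sqrt_of_minus_one(p), 1)
    while y != (0, 0):
        x, y = y, grem(x, y)
    return abs(x[0]), abs(x[1])
```

For $p = 5$ we get $s = 2$, and the very first division already shows $2 + i$ divides $5$, giving $5 = 2^2 + 1^2$.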

12.2 Irreducible Elements in Gaussian Integers


Okay, this is great – in fact, really great. But this is saying that primes 1 mod 4 are
not irreducible. This begs the question,

What are the irreducible elements in Z[i]?

We actually have a very good starting point to answer this question.

Lemma 12.3
If N (a + bi) is a prime integer, then a + bi is irreducible.


Proof. We prove the contrapositive. If $a + bi$ is reducible, i.e., $a + bi = \alpha\beta$ for non-units $\alpha, \beta$, then $N(a + bi) = N(\alpha)N(\beta)$ where $N(\alpha), N(\beta) \neq 1$. But then $N(a + bi)$ is not prime, as desired.

Here, the fact that N(α) ≠ 1 follows from the fact that α is not a unit. We can
actually use this kind of idea to find all units of Z[i].

Lemma 12.4
The units in Z[i] are {±1, ±i}.

Proof. If α = a + bi is a unit, then there exists some β ∈ Z[i] such that α · β = 1.
Taking the norm, we have N(α)N(β) = 1, which forces N(α) = 1 as the norm is always
nonnegative. (It is the sum of two squares.) But a^2 + b^2 = 1 is only possible when one
of them is 0 and the other is ±1, which corresponds to the units we listed. (Each of
these is indeed a unit: for instance, i · (−i) = 1.)

Going back to our original question, we determined that any Gaussian integer with
prime norm must be irreducible. But suppose I have an element with norm that is not
prime. Then, can it be irreducible? (Basically, is the converse of Lemma 12.3 true?)
To hint at the answer of the above question, let me ask a different question. If p ∈ Z
is prime, can p be irreducible in Z[i]? And if so, when? Fermat showed that p is not
irreducible when p ≡ 1 (mod 4), and we just proved it, so we only need to consider
primes 3 mod 4. It turns out that p ≡ 3 (mod 4) is always irreducible.

Proposition 12.5
Any prime p ≡ 3 (mod 4) is irreducible in Z[i].

Proof. Suppose p = α · β for non-units α, β, with α = a + bi (a, b ∈ Z). Taking norms
gives N(α)N(β) = N(p) = p^2, and neither norm is 1 since α, β are non-units, so
N(α) = N(β) = p. Then α · ᾱ = p = α · β, so β = ᾱ, i.e., α and β must be complex
conjugates. Thus, p = α · ᾱ = a^2 + b^2. But squares can only be 0, 1 mod 4, so
a^2 + b^2 ≢ 3 (mod 4). In particular, it cannot be equal to p, so p must be irreducible.

Now, we have primes 3 mod 4 and Gaussian integers with prime norm as our irre-
ducibles. How many more are there? It turns out that these cover all irreducibles!

Lemma 12.6
If a + bi ∈ Z[i] is irreducible, where a, b ≠ 0, then N(a + bi) = p is prime.

Note that if one of a, b = 0, then we are just dealing with integers, from which we
concluded that the only integers which stay irreducible in Z[i] are primes 3 mod 4. I


guess we have to be a little careful and make sure to include p = 2 in our discussion,
but we can factor 2 = (1 + i)(1 − i) = i(1 − i)^2, so it is reducible in Z[i].

Proof. Let π = a + bi be irreducible, and suppose N(π) = π · π̄ is not prime. Note
N(π) ≠ 1 since π is not a unit, so π · π̄ = N(π) = α · β for some non-units α, β ∈ Z
(i.e., α, β ≠ ±1). Then either π | α and π̄ | β or the other way around. Without loss of
generality, assume the former. By unique factorization of N(π) in Z[i] (note π̄ is
irreducible because π is), we have α = π · u and β = π̄ · u′ for units u, u′ ∈ Z[i]. But
then π ∈ {±α, ±iα} with α ∈ Z, so one of a, b is 0; we assumed a, b ≠ 0 in π = a + bi,
so we have reached a contradiction. Hence, N(π) is prime.
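
Putting Lemma 12.3, Proposition 12.5, and Lemma 12.6 together gives a simple test for irreducibility in Z[i]. Here is a rough Python sketch of that test (the function names are mine, and the primality check is plain trial division, so this is only meant for small inputs).

```python
def is_prime(n):
    """Trial-division primality test for ordinary integers."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def is_gaussian_irreducible(a, b):
    """Decide whether a + bi is irreducible in Z[i]."""
    if a == 0 or b == 0:
        # a + bi is an integer up to a unit: irreducible iff it is a prime ≡ 3 (mod 4)
        n = abs(a + b)
        return is_prime(n) and n % 4 == 3
    # both coordinates nonzero: irreducible iff the norm is a prime integer
    return is_prime(a * a + b * b)

for z in [(3, 0), (5, 0), (2, 0), (1, 1), (2, 1), (3, 1), (0, 7)]:
    print(z, is_gaussian_irreducible(*z))
```

For example, 5 = (2 + i)(2 − i) is reported as reducible, while 2 + i (norm 5) and 3 are reported as irreducible.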

12.3 Pythagorean Triples


Let’s put our hard work to some good use. Let’s solve the Diophantine equation (ah,
there’s the magic word) x^2 + y^2 = z^2 for integers x, y, z. One can do this using elementary
methods, and in fact it is not too bad, but the x^2 + y^2 expression is just begging us to
consider this equation in the Gaussian integers. So let’s do that.
Let us assume gcd(x, y, z) = 1 (otherwise we can just divide through by the com-
mon factor). We consider the equation in Z[i] and use the fact that Z[i] is a unique
factorization domain. We have (x + iy)(x − iy) = z 2 . The key trick now, which will
help us take advantage of unique factorization, is the fact that x + iy and x − iy are
coprime.
Claim 12.7. If x, y ∈ Z are coprime as integers, then x + iy and x − iy are coprime
in Z[i].

Proof. Let π be an irreducible dividing x + iy. Suppose for the sake of contradiction
that π | x − iy. Then, π | (x + iy) + (x − iy) = 2x = (1 + i)(1 − i)x, so π must divide
one of those factors. If π | 1 + i, then N(π) | N(1 + i) = 2; as π is a non-unit, we have
N(π) = 2. But we have π | x + iy, and conjugating, this means π̄ | \overline{x + iy} = x − iy;
combining gives 2 = N(π) = π · π̄ | (x + iy)(x − iy) = x^2 + y^2 = z^2. Thus, 2 | z^2, so z is even.
But then x^2 + y^2 = z^2 ≡ 0 (mod 4), which is only possible if x^2 ≡ y^2 ≡ 0 (mod 4)
(if they were both odd, then x^2 + y^2 ≡ 1 + 1 = 2 (mod 4)). In particular, that means
x, y are both even. This contradicts the assumption that x, y are coprime, so π ∤ 1 + i.
Likewise, we can make the same argument for 1 − i to show π ∤ 1 − i.
Thus, π | (1 + i)(1 − i)x but π ∤ 1 + i, 1 − i, so π | x. Likewise, π | (x + iy) − (x − iy) =
2iy = i(1 + i)(1 − i)y, so π | y as well. But x, y are coprime as integers, so π cannot be
a prime integer itself. Lemma 12.6 tells us then that N(π) is some prime p. Taking the
norm of our divisibility conditions, we have p = N(π) | N(x) = x^2 and p | y^2 similarly.
This contradicts again our assumption that x, y are coprime, so indeed π ∤ x − iy and
thus x + iy, x − iy are coprime.

Now we will use this claim to classify all Pythagorean triples. Recall we have
x^2 + y^2 = (x + iy)(x − iy) = z^2, and the two terms in the middle are coprime. We can


factor z into irreducibles in Z[i]: let z = u · π_1^{a_1} · · · π_r^{a_r} where u is a unit and
the π_j’s are irreducibles. Then,
(x + iy)(x − iy) = (u · π_1^{a_1} · · · π_r^{a_r})^2.
But since x + iy and x − iy are coprime and their product is a square, they must
each be squares! (Up to units, of course.) Thus, x + iy = w · β^2 for some unit w and
β ∈ Z[i]. Write β = a + bi, and suppose w = 1. (It turns out that if you choose some
other unit for w, then we would get the same answer we are about to obtain, so we will
just do the w = 1 case here.) Then,

x + iy = (a + bi)^2 = (a^2 − b^2) + (2ab)i,

so any primitive Pythagorean triple is of the form (x, y, z) = (a^2 − b^2, 2ab, a^2 + b^2) for
integers a, b ∈ Z (up to swapping x and y and changing signs).
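
As a sanity check on this parametrization, here is a short Python sketch that lists primitive triples up to a bound on z. The conditions gcd(a, b) = 1 and a, b of opposite parity are the standard extra conditions that make the triple primitive; they are not spelled out in the text above, so treat them as my addition. The function name is made up.

```python
from math import gcd

def primitive_triples(limit):
    """Primitive triples (a^2 - b^2, 2ab, a^2 + b^2) with hypotenuse <= limit."""
    triples = []
    a = 2
    while a * a + 1 <= limit:
        for b in range(1, a):
            # coprime with opposite parity gives a primitive triple
            if (a - b) % 2 == 1 and gcd(a, b) == 1 and a * a + b * b <= limit:
                triples.append((a * a - b * b, 2 * a * b, a * a + b * b))
        a += 1
    return sorted(triples, key=lambda t: t[2])

print(primitive_triples(30))
```

With limit 30 this produces the familiar (3, 4, 5), (5, 12, 13), (15, 8, 17), (7, 24, 25), (21, 20, 29).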

13 10/27 - More Diophantine Equations


Today, we’ll look at some special (note: easier) cases of Fermat’s Last Theorem. The
story of Fermat’s Last Theorem, first claimed by Fermat in 1637,[11] is truly an incredible
one, which was resolved by Andrew Wiles’s famous proof in 1995. His proof sparked
some of the most important mathematics in modern number theory (most importantly,
elliptic curves and modular forms – these two are related by something called the
Modularity Conjecture, a key case of which Wiles proved), and much of it is being
developed to this day. Okay, let me state the theorem, which is an incredibly, deceptively
simple one.

Theorem 13.1 (Fermat’s Last Theorem)


For n > 2, there do not exist integers x, y, z such that x^n + y^n = z^n and xyz ≠ 0.

Given that the proof of this took more than 350 years to arise, it is clearly out of
the scope of this class, but we can investigate this problem for small values of n.

13.1 Method of Infinite Descent


We will use an argument called the method of infinite descent, whose mantra is
basically: given a solution, find a smaller solution. This is used for proofs by contra-
diction, as if we can perform descent on integer solutions, then we will continuously
get smaller and smaller integer solutions. But the integers can only get so small in
magnitude, so the descent must break at some point. This is best illustrated by an
example:
[11] He claimed “I have discovered a truly marvelous proof of this, which this margin is too narrow to
contain.” In Latin, of course, which makes it even more pretentious. He was definitely lying here.


Exercise 13.2. Let p be a prime. Show that x^3 + py^3 + p^2 z^3 = 0 has no integer solutions
with xyz ≠ 0.

Proof. Note p | x^3, so p | x. Write x = px′. Substituting gives p^3 x′^3 + py^3 + p^2 z^3 = 0;
clear p to get p^2 x′^3 + y^3 + pz^3 = 0. Now, we have p | y. Write y = py′ and repeat the same
process: we’ll get px′^3 + p^2 y′^3 + z^3 = 0, so p | z. Repeat to get x′^3 + py′^3 + p^2 z′^3 = 0. But
this means from a solution (x, y, z), we can always generate a smaller solution (x′, y′, z′),
and we can do this ad infinitum. But clearly this is not possible for the integers, so
there are no solutions.
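
If you want some reassurance before trusting the descent, a brute-force search over a small box is easy to write. This Python sketch (my own, with a made-up function name) looks for any solution other than (0, 0, 0); of course an empty search proves nothing by itself, it just agrees with the proof.

```python
def nonzero_solutions(p, bound):
    """All (x, y, z) != (0, 0, 0) with |x|, |y|, |z| <= bound
    satisfying x^3 + p*y^3 + p^2*z^3 == 0."""
    r = range(-bound, bound + 1)
    return [(x, y, z) for x in r for y in r for z in r
            if (x, y, z) != (0, 0, 0)
            and x**3 + p * y**3 + p * p * z**3 == 0]

for p in (2, 3, 5, 7):
    print(p, nonzero_solutions(p, 15))
```

Each printed list comes out empty, as the descent argument predicts.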

Remark 13.3. Note we could technically prove this by induction, or a similar
argument, by being like “Suppose there are no solutions less than n; we will show
there are still no solutions less than n+1 by performing descent once.” For instance,
we could make the above proof cleaner by starting with “Assume (x, y, z) is the
solution to the equation that minimizes |x|+|y|+|z|”; then, we could have stopped
once we got x = px′, as the first step gave us y^3 + pz^3 + p^2 x′^3 = 0, so (y, z, x′) is
a smaller solution. So this idea is not really super new. The real upshot, in my
opinion, is that saying you proved something by method of descent gives people the
impression that you’re some cool mad mathematician, which is fun.

13.2 Fermat’s Last Theorem for n = 4


Now we are going to make some reductions to Fermat’s Last Theorem. We will first
prove a neat theorem by Fermat, which he actually did prove:

Theorem 13.4 (Fermat’s Last Theorem, n = 4)


The equation x^4 + y^4 = z^2 has no integer solutions x, y, z with xyz ≠ 0.

Suppose this is true. Then, Fermat’s Last Theorem is reduced to the case when
n = p is an odd prime. To see why, let n = mp for some odd prime p. If x^p + y^p = z^p
has no solutions, then neither does (x^m)^p + (y^m)^p = (z^m)^p, which is just x^n + y^n = z^n.
The only case not covered here is when n is not divisible by an odd prime, i.e., when
n > 2 is a power of 2. Then 4 | n, and x^n + y^n = z^n can be written as
(x^{n/4})^4 + (y^{n/4})^4 = (z^{n/2})^2, so this is covered by the n = 4 case above, which we
will now prove.

Proof. We may assume (x, y, z) are pairwise coprime, as if p divided two of them, then
the equation forces p to divide the third, and then we could consider the smaller solution
(x/p, y/p, z/p^2).
Suppose (x, y, z) is the smallest solution to this equation; by smallest, we mean |z| is
minimized. If (x, y, z) satisfies the equation, then (x^2, y^2, z) is a primitive Pythagorean


triple. We found a general form for Pythagorean triples! This means that there exist
coprime k, ℓ ∈ N such that

x^2 = k^2 − ℓ^2, y^2 = 2kℓ, z = k^2 + ℓ^2.

The first equation gives x^2 + ℓ^2 = k^2, which is – you guessed it – yet another
Pythagorean triple! (So our work on finding Pythagorean triples at the end of last class
wasn’t all that capricious after all.) This means that there exist coprime a, b ∈ N such
that

x = a^2 − b^2, ℓ = 2ab, k = a^2 + b^2.

Note that there is also the case where x = 2ab and ℓ = a^2 − b^2, but this cannot hold
since from our choice x^2 = k^2 − ℓ^2 and y^2 = 2kℓ we are declaring x to be odd.
Therefore, y^2 = 2kℓ = 4ab(a^2 + b^2), which means y is even and (y/2)^2 = ab(a^2 + b^2).
Observe that for (a, b) = 1, we have (ab, a^2 + b^2) = 1. So in fact, we have three pairwise
coprime elements: a, b, and a^2 + b^2. But their product is a square! So in the spirit
of the work that we did for finding irreducible elements in Z[i] (from last class), this
means a, b, and a^2 + b^2 are each perfect squares themselves. Write a = x′^2, b = y′^2, and
a^2 + b^2 = z′^2. This means x′^4 + y′^4 = z′^2. But

z′ ≤ (z′)^2 = a^2 + b^2 = k ≤ k^2 < z,

so we found a smaller solution (x′, y′, z′), and the proof concludes by an infinite descent
argument.
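
Again, nothing replaces the proof, but a quick search is a cheap way to convince yourself the statement is plausible. This Python sketch (the function name is mine) checks whether x^4 + y^4 is ever a perfect square for small positive x ≤ y.

```python
from math import isqrt

def fourth_power_solutions(bound):
    """All (x, y, z) with 1 <= x <= y <= bound and x^4 + y^4 == z^2."""
    found = []
    for x in range(1, bound + 1):
        for y in range(x, bound + 1):
            s = x**4 + y**4
            z = isqrt(s)            # exact integer square root
            if z * z == s:
                found.append((x, y, z))
    return found

print(fourth_power_solutions(200))
```

The search comes back empty, consistent with Theorem 13.4.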

From our remarks above, now we are just left with the cases of Fermat’s Last
Theorem when n = p is an odd prime. Again, we won’t do this in full generality, but
we have the tools to prove it for n = 3. We will do this next time.

13.3 Sophie Germain’s Theorem


Here is another neat Diophantine equation. This is known as the “first case” of Fermat’s
Last Theorem.

Theorem 13.5 (Sophie Germain)


Let p be an odd prime such that 2p + 1 is also prime. Then, the equation x^p + y^p = z^p
has no integer solutions with p ∤ xyz.

Remark 13.6. We can list some primes p such that 2p + 1 is also prime: 3,5,11,
23, ... It is actually an open problem whether or not there are infinitely many such
primes, yet another exhibit for why primes are so elusive.
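
For the curious, here is a tiny Python sketch that extends the list in the remark (trial division only, and the function names are my own).

```python
def is_prime(n):
    """Trial-division primality test."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def sophie_germain_primes(limit):
    """Odd primes p <= limit such that 2p + 1 is also prime."""
    return [p for p in range(3, limit + 1)
            if is_prime(p) and is_prime(2 * p + 1)]

print(sophie_germain_primes(100))
```

Up to 100 this gives 3, 5, 11, 23, 29, 41, 53, 83, 89.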


Proof. Note that if x^p + y^p = z^p, then x^p + y^p + (−z)^p = 0, so we will consider the
equation x^p + y^p + z^p = 0 instead. Like in our previous arguments, we may assume
x, y, z are pairwise coprime. We can factor x^p + y^p = (x + y)(x^{p−1} − x^{p−2}y + · · · + y^{p−1}).
We will prove that the two factors on the right are relatively prime. First, observe that
p ∤ xyz =⇒ p ∤ z^p = −(x^p + y^p), which means p cannot divide either factor. If r ≠ p is
a prime dividing both factors, then r | x + y =⇒ x ≡ −y (mod r), in which case

0 ≡ x^{p−1} − x^{p−2}y + · · · + y^{p−1}
  ≡ (−y)^{p−1} − (−y)^{p−2}y + · · · + y^{p−1}
  ≡ py^{p−1} (mod r),

but r ≠ p, so y^{p−1} ≡ 0 (mod r) which is only possible if r | y. But then this means
r | z as well (as r divides x + y, which divides −z^p), contradicting y, z being relatively prime.
Thus, since the two factors are coprime, and their product is (−z)^p, a perfect pth
power, each of the factors must be a perfect pth power. Write x + y = A^p and
x^{p−1} − x^{p−2}y + · · · + y^{p−1} = T^p for A, T ∈ Z. But we can do this same process for the
equivalent equations x^p + z^p = (−y)^p and y^p + z^p = (−x)^p to get x + z = B^p and
y + z = C^p for some B, C ∈ Z. Letting q := 2p + 1, which by assumption is prime, we have
p = (q − 1)/2, so x^p + y^p + z^p = 0 implies

x^{(q−1)/2} + y^{(q−1)/2} + z^{(q−1)/2} ≡ 0 (mod q).

If q ∤ xyz, then each term in the sum is ±1 mod q (recall x^{(q−1)/2} ≡ (x/q), the Legendre
symbol, which is ±1 for q ∤ x), so the left hand side is a sum of three ±1’s and cannot
possibly be 0 mod q, as q ≥ 7. Hence q divides one of x, y, z (exactly one, by pairwise
coprimality). WLOG suppose q | z but q ∤ x, y. Using x + y = A^p, x + z = B^p, y + z = C^p,
we have B^p + C^p − A^p = 2z, which means

B^{(q−1)/2} + C^{(q−1)/2} − A^{(q−1)/2} ≡ 0 (mod q).

By the same argument as in the above paragraph (each term is ±1 mod q), this
forces q | ABC. But looking at the definitions of A, B, C tells us q ∤ B and q ∤ C (since
B^p = x + z ≡ x and C^p = y + z ≡ y mod q), so we must have q | A. So now we return to A:
we have x + y = A^p ≡ 0 (mod q), meaning x ≡ −y (mod q). We haven’t talked about T yet,
so let’s bring that in now: we see T^p = x^{p−1} − x^{p−2}y + · · · + y^{p−1} ≡ py^{p−1} (mod q).
Noting that since q | z, we have C^p = y + z ≡ y (mod q), so T^p ≡ p(C^p)^{p−1} (mod q).
Rewriting, we obtain

(q − 1)/2 = p ≡ (T · (C^{p−1})^{−1})^p (mod q)
              = (T · (C^{p−1})^{−1})^{(q−1)/2} (mod q)
              ≡ ±1 (mod q),

so −1/2 ≡ ±1 (mod q). But p ≥ 3 implies q ≥ 7, and this congruence cannot hold when
q ≥ 7, so we have reached a contradiction! Sophie Germain’s Theorem follows.


14 10/30 - Fermat’s Last Theorem for n = 3


The title says it all. Recall we did this for n = 4 last time, and given the special
condition where 2p + 1 is also a prime, we did the “first case” (p ∤ xyz) for odd primes p
(Theorem 13.5). Now we will do the n = 3 case in full generality, without the assumption
given in Sophie Germain’s result.
To work in the n = 3 case, we take some inspiration from the equation a^2 + b^2 = c^2.
(This is the work we did for finding Pythagorean triples, see §12.3.) For this equation,
we factored a^2 + b^2 = (a + bi)(a − bi), so the key was working this equation over Z[i]. In
the case a^3 + b^3 = c^3, we can factor a^3 + b^3 = (a + b)(a + bω)(a + bω^2), where ω = e^{2πi/3}
(so ω^3 = 1, meaning ω is a primitive third root of unity). Let us state Fermat’s Last
Theorem for n = 3 from this lens:

Theorem 14.1 (Fermat’s Last Theorem for n = 3)


Let u ∈ Z[ω] be a unit. Then, the equation x^3 + y^3 = uz^3 has no solutions for
x, y, z ∈ Z[ω] with xyz ≠ 0.

Note this is actually stronger than what we need for Fermat’s Last Theorem, but
hey, we will take stronger statements any day.

14.1 Eisenstein Integers


Note that our problem is actually a problem on Z[ω] in disguise. The ring Z[ω] is often
called the Eisenstein integers. We will prove some facts about this ring, beginning
with something you showed in your first (second?) homework:

Lemma 14.2
Z[ω] is a Euclidean domain.

Before we start working with ω, let’s lay out a few identities we will consistently
use. First, note ω̄ = ω^2 = ω^{−1} and ω^3 − 1 = 0 =⇒ 1 + ω + ω^2 = 0. This means
ω + ω̄ = ω + ω^{−1} = −1.

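Proof. Let λ : Q[ω] → Q send α ↦ α · ᾱ. (This λ is the norm map, written N in later
sections; it is not the element λ = 1 − ω of §14.2.) Indeed, λ(α) ∈ Q: if α = a + bω, then

αᾱ = (a + bω)(a + bω̄)
   = a^2 + b^2 + ab(ω + ω̄)
   = a^2 − ab + b^2 ∈ Q.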

If m, n ∈ Z[ω] with m ≠ 0, then we can write n/m = q + r, where q ∈ Z[ω] and


r = x + yω ∈ Q[ω] such that |x|, |y| ≤ 1/2. Thus, n = mq + mr, with

λ(mr) = λ(r) · λ(m) ≤ (3/4) λ(m) < λ(m),

so we found a valid Euclidean function λ.

Our statement (Theorem 14.1) mentions the units of Z[ω]; let us state what they
are.

Lemma 14.3
Z[ω]× = {±1, ±ω, ±ω^2}.

Proof. If α ∈ Z[ω] is a unit, then 1 = N(αα^{−1}) = N(α)N(α^{−1}), so N(α) = 1 (the norm
is a nonnegative integer on Z[ω]). Likewise, if N(α) = α · ᾱ = 1, then by definition of the
norm function, α has an inverse (namely ᾱ), i.e., it is a unit.
Letting α = a + bω, we have N(α) = a^2 − ab + b^2 = 1, which we can rewrite as
(2a − b)^2 + 3b^2 = 4a^2 − 4ab + 4b^2 = 4. We have two cases here.
If 2a − b = ±1, then b^2 = 1, so b = ±1 and a is either 0 or b; this gives α = ±ω and
α = ±(1 + ω) = ∓ω^2. For the other case, if 2a − b = ±2, then b^2 = 0, so a = ±1. These
give our six claimed units, as desired.
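
The case analysis is easy to double check by machine. Here is a minimal Python sketch (names are mine) that searches a small box for pairs (a, b) with a^2 − ab + b^2 = 1; the box −2..2 is enough because (2a − b)^2 + 3b^2 = 4 forces |b| ≤ 1 and |a| ≤ 1.

```python
def eisenstein_norm(a, b):
    """Norm of a + b*omega, namely a^2 - a*b + b^2."""
    return a * a - a * b + b * b

# pairs (a, b) stand for a + b*omega; (1, 1) is 1 + omega = -omega^2
units = [(a, b) for a in range(-2, 3) for b in range(-2, 3)
         if eisenstein_norm(a, b) == 1]
print(units)
```

This finds exactly six pairs: ±(1, 0), ±(0, 1), ±(1, 1), matching the six units in Lemma 14.3.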

14.2 Properties of λ = 1 − ω
We continue with proving more properties of elements in Z[ω] to build up machinery.

Lemma 14.4
Let λ = 1 − ω. Then,

1. N (λ) = 3,

2. λ is irreducible in Z[ω],

3. (λ^2) = (3),

4. Z[ω]/λZ[ω] ≃ Z/3Z.

Proof. Let’s see how fast we can work through all of these.
The first one is literally just using the equation N(a + bω) = a^2 − ab + b^2. Let’s
move on to (2). If λ = αβ, then 3 = N(λ) = N(α)N(β), which is only possible if one
of N(α) or N(β) is 1, i.e., one of α, β is a unit.
For (3), note λ^2 = (1 − ω)^2 = 1 − 2ω + ω^2 = −3ω. Taking the ideals generated on
each side and noting −ω is a unit, we get the result.
The hardest part is (4), but it is really not that bad. In fact, you did most of this
work in a previous problem set! (Problem Set 2, maybe?) The homework problem


demonstrated that any α ∈ Z[ω] is congruent to either 0 or ±1 mod λ. Thus, we can
construct a map Z[ω]/λZ[ω] → Z/3Z where some α on the left maps to its residue mod
λ, which we just said is either 0, ±1. (These residues are distinct mod λ, since N(λ) = 3
divides neither N(1) = 1 nor N(2) = 4.) This map is surjective ({−1, 0, 1} map to
{−1, 0, 1} identically), and since every class on the left is represented by one of 0, ±1,
we see that |Z[ω]/λZ[ω]| is at most 3. A surjection from a set of size at most 3 onto a
set of size 3 is a bijection, which means the map must be an isomorphism. This
gives our desired result.

Another interesting fact related to λ:

Fact 14.5. If x ∈ Z[ω] and x ≡ 1 (mod λ), then x^3 ≡ 1 (mod λ^4). Additionally,
if λ ∤ x, then x^3 ≡ ±1 (mod λ^4).

Proof. If x = 1 + λt, then

x^3 − 1 = (x − 1)(x − ω)(x − ω^2)
        = λt(1 − ω + λt)(1 − ω^2 + λt)
        = λt(λ + λt)(λ(1 + ω) + λt)
        = λ^3 t(1 + t)(1 + ω + t).

Note that any t must be ≡ 0, ±1 (mod λ). If t ≡ −1, then λ | 1 + t, so λ^4 | x^3 − 1. If
t ≡ 0, then λ^4 | λ^3 t | x^3 − 1. Finally, if t ≡ 1, then 1 + ω + t ≡ 2 + ω = 3 − λ (mod λ),
which is divisible by λ since (3) = (λ^2), so again we get an extra factor of λ and the
first conclusion follows.
Note x ≡ 0, ±1 (mod λ). If x ≡ −1 (i.e., −x ≡ 1), then (−x)^3 ≡ 1 (mod λ^4) =⇒
x^3 ≡ −1 (mod λ^4). This finishes the result.
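
Fact 14.5 is also easy to test numerically. The Python sketch below (my own, with made-up helper names) stores a + bω as the pair (a, b) and multiplies using ω^2 = −1 − ω. It relies on two observations that are consequences of Lemma 14.4 rather than statements from the lecture: a + bω ≡ a + b (mod λ) because ω ≡ 1 (mod λ), and λ^4 = 9ω^2, so divisibility by λ^4 just means both coordinates are divisible by 9.

```python
def mul(x, y):
    """(a + b*omega)(c + d*omega), using omega^2 = -1 - omega."""
    a, b = x
    c, d = y
    return (a * c - b * d, a * d + b * c - b * d)

def cube(x):
    return mul(x, mul(x, x))

def satisfies_fact(x):
    """Check x^3 ≡ ±1 (mod lambda^4) for x not divisible by lambda."""
    a, b = x
    residue = (a + b) % 3            # x mod lambda, as an element of Z/3Z
    if residue == 0:
        return True                  # lambda | x: the fact says nothing here
    sign = 1 if residue == 1 else -1
    c, d = cube(x)
    return (c - sign) % 9 == 0 and d % 9 == 0

lam = (1, -1)                        # lambda = 1 - omega
print(mul(lam, lam))                 # (0, -3), i.e. lambda^2 = -3*omega
print(all(satisfies_fact((a, b)) for a in range(-6, 7) for b in range(-6, 7)))
```

The first line printed confirms λ^2 = −3ω from Lemma 14.4, and the second reports that every x in the box passes.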

14.3 Proving Theorem 14.1


Now we make our first big step towards proving Theorem 14.1. This also shows why
we are caring so much about this λ element.

Lemma 14.6 (Weaker Version of Theorem 14.1)


Theorem 14.1 holds if λ ∤ xyz.

Proof. We know from Lemma 14.4.2 that λ is irreducible, so if λ ∤ xyz, then λ does
not divide any of x, y, z. Look at the equation x^3 + y^3 = uz^3 modulo λ^4. From Fact
14.5 above, the left hand side is of the form ±1 ± 1, which takes on values {0, ±2}.
Meanwhile, the right hand side is congruent to ±u, but we found all the units of Z[ω]
in Lemma 14.3! So the right hand side takes on values {±1, ±ω, ±ω^2}. There is no
overlap mod λ^4 (note (λ^4) = (9), and it is clear none of these values are congruent mod
9), so there are no solutions.

We strive to do even better.


Lemma 14.7
If x^3 + y^3 = uz^3, with λ ∤ xy and λ | z, then λ^2 | z, i.e., ord_λ z ≥ 2.

Proof. Again, we use the incredibly useful Fact 14.5. We reduce our equation to mod
λ^4. Again, like above, the left hand side takes on values {0, ±2}. Let L be the value
of the left hand side. Then, we have L ≡ uz^3 (mod λ^4), but λ | z, so we must
have L ≡ 0 (mod λ). This is only true when L = 0 (as λ ∤ 2), so uz^3 ≡ 0 (mod λ^4). Thus,
ord_λ z^3 = 3 ord_λ z ≥ 4, which means ord_λ z ≥ 2, as desired.

We will again use the method of infinite descent. The following result activates the
descent step by finding a smaller solution given an initial one.

Lemma 14.8
If x^3 + y^3 = uz^3 with λ ∤ xy and ord_λ z ≥ 2, then there exist x_1, y_1, z_1 ∈ Z[ω],
where x_1 y_1 z_1 ≠ 0, and a unit u_1 ∈ Z[ω]× such that x_1^3 + y_1^3 = u_1 z_1^3 with λ ∤ x_1 y_1
and ord_λ z_1 = ord_λ z − 1.

As a consequence, the infinite descent method tells us that there are no solutions when
λ ∤ xy.

Proof. If α, β ∈ Z[ω] are nonzero, then in general we have the inequality ord_λ(α + β) ≥
min(ord_λ α, ord_λ β). Equality holds when ord_λ α ≠ ord_λ β. To see this, WLOG let
s = ord_λ α < ord_λ β. Then, we can write α + β = λ^s(α/λ^s + β/λ^s), where λ divides
β/λ^s but not α/λ^s, so λ does not divide their sum.
We may assume x, y, z have no common factors in Z[ω]. We can factor our equation
as
(x + y)(x + ωy)(x + ω^2 y) = uz^3.
From our assumption ord_λ z ≥ 2, we have ord_λ uz^3 ≥ 6, so one of the factors on the
left hand side must have order ≥ 2. But the terms on the left are all “symmetric” (for
instance, we could replace y with ωy and get the same exact expression), so we can
assume without loss of generality that ord_λ(x + y) ≥ 2. Then,

(x + y) − (x + ωy) = (1 − ω)y = λy,

and as λ ∤ y, this means ord_λ((x + y) − (x + ωy)) = 1. By the work we did in the
beginning of the proof, it follows that ord_λ(x + ωy) = 1. Likewise, ord_λ(x + ω^2 y) = 1,
so ord_λ(x + y) = 3 ord_λ z − 2.
If π is an irreducible not associate to λ and π | x + y, x + ωy, then
π | (x + y) − (x + ωy) = (1 − ω)y = λy, which is only possible if π | y. But then π | x,
which contradicts our assumption that x, y, z share no common factors. Likewise, we
have that all three of the terms on


the left are pairwise coprime, apart from the common factor λ. But their product is a
perfect cube up to some unit, so we can write

x + y = u_1 α^3 λ^t        (3)
x + ωy = u_2 β^3 λ         (4)
x + ω^2 y = u_3 γ^3 λ,     (5)

where t = 3 ord_λ z − 2, the u_i are units, λ ∤ αβγ, and (α, β, γ) = 1. Note that
(x + y) + ω(x + ωy) + ω^2(x + ω^2 y) = (x + y)(1 + ω + ω^2) = 0, so we have

0 = (3) + (4)ω + (5)ω^2 = u_1 α^3 λ^t + u_2 β^3 λω + u_3 γ^3 λω^2
=⇒ 0 = u_1 α^3 λ^{t−1} + u_2 β^3 ω + u_3 γ^3 ω^2
=⇒ (−u_2 ω)^{−1} u_1 α^3 λ^{t−1} = β^3 + (u_2 ω)^{−1} u_3 ω^2 γ^3.

Alright, we are in the thick of it, but we will reach the end of this tunnel soon.
We now want to construct our smaller solution, which completes the descent step.
Let z_1 = αλ^{(t−1)/3} (in particular, (t − 1)/3 = ord_λ z − 1), y_1 = γ, and x_1 = β. Also, let
ε_2 = (−u_2 ω)^{−1} u_1 and ε_1 = (u_2 ω)^{−1} u_3 ω^2; note that both ε_1, ε_2 are units. This
leaves us with the equation ε_2 z_1^3 = x_1^3 + ε_1 y_1^3.
Now we may reduce this equation mod λ^2. Recall z_1 = αλ^{(t−1)/3}, so ord_λ z_1^3 ≥ 3 · 1 > 2,
so the left hand side is congruent to 0 mod λ^2. Following Fact 14.5, the right hand
side reduces to ±1 ± ε_1 (mod (λ^2)), and combining with (λ^2) = (3) (Lemma 14.4.3),
we get ε_1 ≡ ±1 (mod 3). Checking the six units from Lemma 14.3, this forces ε_1 = ±1.
Then, ε_2 z_1^3 = x_1^3 ± y_1^3; replacing y_1 with −y_1 if the sign is
negative, we get ε_2 z_1^3 = x_1^3 + y_1^3, as desired.

As mentioned before the proof, the infinite descent method tells us that there are
no solutions when λ ∤ xy, as we can continue decreasing ordλ z but the order must stay
non-negative.

Corollary 14.9
The equation x^3 + y^3 = uz^3, where u ∈ Z[ω]× and x, y, z ∈ Z[ω] such that xyz ≠ 0
and λ ∤ xy, has no solutions.

We can finally prove Theorem 14.1, in a slightly more general setting.

Proposition 14.10 (implies Fermat’s Last Theorem for n = 3)


The equation x^3 + y^3 = uz^3 has no solutions for x, y, z ∈ Z[ω], where xyz ≠ 0 and
u ∈ Z[ω]×.

Proof. We may assume (x, y, z) = 1, else we can divide through by their common factor.
We proved this above for when λ ∤ xy. Suppose λ | x. Then, we must have λ ∤ yz for


(x, y, z) = 1 to hold. Since λ ∤ y, z, by Fact 14.5, we have y^3, z^3 ≡ ±1 (mod λ^4).
As λ | x gives x^3 ≡ 0 (mod λ^3), the equation reads y^3 ≡ uz^3 (mod λ^3), so
u ≡ ±1 (mod λ^3). Reducing to mod λ^2, we have u ≡ ±1 mod (λ^2) =
±1 mod (3), so 3 | u ± 1 in Z[ω]. But we know all the units of Z[ω] from Lemma 14.3!
Going through all six possibilities, we conclude u = ±1.
Thus, our equation looks like x^3 + y^3 = ±z^3. We can rewrite this as x^3 = −y^3 ± z^3 =
(−y)^3 + (±z)^3. But we have λ ∤ yz, so it now follows from our Corollary above (with x
playing the role of z) that there are no solutions, as desired.

15 11/03 - Pell’s Equations


We continue to study Diophantine equations which look simple but are deceptively
difficult to solve. Welcome to Pell’s Equations.[12]
Let d ∈ N be a square-free integer with d > 1. A Pell equation is of the form
x^2 − dy^2 = 1, and we are interested in finding solutions x, y ∈ N. Note we just need to
care about square-free d, as if D = d · n^2, then we have x^2 − Dy^2 = x^2 − d(ny)^2, so it
suffices to find solutions for the latter Pell equation. The study of finding solutions to
these equations is very beautifully related to the theory of continued fractions, as for
large enough x, y, we have x^2 ∼ dy^2 =⇒ x/y ∼ √d, and the continued fraction expansion
of √d provides the “closest possible” rational approximations to √d.
Note that the expression x^2 − dy^2 is just begging for us to factor it as
(x + y√d)(x − y√d) in Z[√d]. Indeed, this is where we are going.

Theorem 15.1 (Solutions to Pell’s Equations)


The equation x^2 − dy^2 = 1 has infinitely many solutions x, y ∈ N, all of the form
(x_n, y_n) where
x_n + y_n √d = (x_1 + y_1 √d)^n,
where (x_1, y_1) is the smallest solution to the equation. (Smallest here can just mean
x_1 is minimized.)

Remark 15.2. Although Z[√d] does not have any imaginary part (d ∈ N so √d ∈ R),
we still have a notion akin to conjugation. In particular, the map Z[√d] → Z[√d]
sending a + b√d ↦ a − b√d is an automorphism (bijective homomorphism)!
Denoting \overline{a + b√d} = a − b√d, one can check that \overline{α + β} = ᾱ + β̄ and
\overline{α · β} = ᾱ · β̄.

This is useful because it gives us a nice new perspective of Pell’s equation. If we
take the norm map as the product of an element with its conjugate, as we do in the
complex numbers, then we have N(x + y√d) = (x + y√d)(x − y√d) = x^2 − dy^2. So we
are essentially just finding the elements of Z[√d] with norm 1.

[12] Note: not due to Pell! Impostor.


This is nice because it verifies that all of our claimed solutions for Pell’s equation as
described in Theorem 15.1 are indeed solutions! If N(α) = 1 (so writing α = x_1 + y_1 √d,
we have x_1^2 − dy_1^2 = 1), then by multiplicativity of the norm, we have N(α^n) = 1, so
(x_n, y_n) is also a solution to the equation. Also, if we find some α such that N(α) = −1,
then we can recover a solution to Pell’s equation by just looking at α^2, since N(α^2) =
(−1)^2 = 1.

Example 15.3
Consider x^2 − 5y^2 = 1. We have 2^2 − 5 = −1, so we look at (2 + √5)^2 = 9 + 4√5.
Indeed, 9^2 − 5 · 4^2 = 1, so we found a solution!
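
To play with Theorem 15.1, here is a small Python sketch (function names are mine). It finds the smallest solution by brute force on y, which is fine for small d but is not how one would do it in earnest (continued fractions are the real tool), and then multiplies by x_1 + y_1 √d repeatedly. It assumes d is not a perfect square; otherwise the search never terminates.

```python
from math import isqrt

def fundamental_solution(d):
    """Smallest (x, y) in positive integers with x^2 - d*y^2 == 1.
    Assumes d is a positive non-square integer."""
    y = 1
    while True:
        x2 = d * y * y + 1
        x = isqrt(x2)
        if x * x == x2:
            return (x, y)
        y += 1

def pell_solutions(d, count):
    """First `count` solutions (x_n, y_n), from powers of x_1 + y_1*sqrt(d)."""
    x1, y1 = fundamental_solution(d)
    sols, (x, y) = [], (1, 0)
    for _ in range(count):
        # (x + y√d)(x1 + y1√d) = (x*x1 + d*y*y1) + (x*y1 + y*x1)√d
        x, y = x * x1 + d * y * y1, x * y1 + y * x1
        sols.append((x, y))
    return sols

print(pell_solutions(5, 3))
print(pell_solutions(2, 3))
```

For d = 5 the list starts with (9, 4), the solution from Example 15.3, followed by (161, 72).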

15.1 Approximating with Fractions



Let us take a step away from this real quadratic Z[√d] for a second and indulge in the
perspective of continued fractions.

Lemma 15.4
Let ξ ∈ R be irrational. Then, there are infinitely many x/y ∈ Q (where x, y ∈ Z
and (x, y) = 1) such that
|x/y − ξ| < 1/y^2.

This is quite a powerful statement! The error being bounded by 1/y 2 is forcing the
error to be impressively small.
Before we begin the proof, let us lay out some notation. For α ∈ R, denote [α] as
the largest integer less than or equal to α, and denote {α} = α − [α] ∈ [0, 1) as the
fractional part of α.

Proof. Choose some n ∈ N. We divide up the interval [0, 1) = [0, 1/n) ∪ [1/n, 2/n) ∪ · · · ∪
[(n − 1)/n, 1) into n equal subintervals. Consider the list 0, {ξ}, {2ξ}, . . . , {nξ}. This
has n + 1 terms, so by Pigeonhole Principle, two of them are in the same subinterval.
In other words, ∃ 0 ≤ j < k ≤ n such that |{jξ} − {kξ}| < 1/n. Rewriting this via
{ξ} = ξ − [ξ], we have
|jξ − kξ + [kξ] − [jξ]| < 1/n.
Letting x = [kξ] − [jξ] and y = k − j, we have |x − yξ| < 1/n. Note that both
0 ≤ j, k ≤ n, so we have y = k − j ≤ n, meaning
|x/y − ξ| < 1/(ny) ≤ 1/y^2.
Great, so this gives us one fraction x/y satisfying the inequality. Let us generate
infinitely many more! Note that we could run the same argument for any n, but we


must be careful in not producing the same fraction x/y over and over again. We will
be more careful in our choice of n.
Since ξ is irrational, we know |x/y − ξ| ≠ 0; choose m ∈ N such that |x/y − ξ|^{−1} < m.
Now, run the same argument as above but with m instead of n to get some fraction
x_1/y_1 such that
|x_1/y_1 − ξ| < 1/(my_1) ≤ 1/m < |x/y − ξ|,
and we know x_1/y_1 satisfies the inequality by construction. We can continue this process
to construct infinitely many x_n/y_n ∈ Q satisfying the inequality, as desired.
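
The pigeonhole step is completely constructive, so here is a rough Python sketch of it for ξ = √d (my own illustration; the function name is made up). To stay in exact integer arithmetic it uses [k√d] = isqrt(k^2 d), and the index of the subinterval containing {k√d} is computed as isqrt(n^2 k^2 d) − n · isqrt(k^2 d). This assumes d is not a perfect square.

```python
from math import isqrt

def dirichlet_approx(d, n):
    """Pigeonhole construction for xi = sqrt(d), d a non-square.
    Returns (x, y) with 1 <= y <= n and |x - y*sqrt(d)| < 1/n."""
    seen = {}                                   # bucket index -> (k, [k*xi])
    for k in range(n + 1):
        floor_k = isqrt(k * k * d)              # [k*xi]
        # floor(n * {k*xi}) = floor(n*k*xi) - n*[k*xi], a value in 0..n-1
        bucket = isqrt(n * n * k * k * d) - n * floor_k
        if bucket in seen:
            j, floor_j = seen[bucket]
            return (floor_k - floor_j, k - j)   # x = [k*xi] - [j*xi], y = k - j
        seen[bucket] = (k, floor_k)
    # unreachable: n + 1 values of k cannot fit into n buckets without a collision

x, y = dirichlet_approx(2, 10)
print(x, y, abs(x / y - 2 ** 0.5), 1 / y**2)
```

For d = 2 and n = 10 this returns 7/5, and indeed |7/5 − √2| ≈ 0.014 is below 1/25.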

15.2 Proving Solutions to Pell’s Equation


Lemma 15.5
Let d ∈ N. There exists some M > 0 such that |x^2 − dy^2| < M has infinitely many
solutions x, y ∈ Z. (In fact, we can take M = 2√d + 1.)


Proof. By the previous lemma (15.4) with ξ = √d, there are infinitely many x/y ∈ Q,
with x, y ∈ N and (x, y) = 1, such that |x/y − √d| < 1/y^2, or equivalently
|x − y√d| < 1/y. Noting x + y√d = (x − y√d) + 2y√d, so by Triangle Inequality
|x + y√d| ≤ |x − y√d| + |2y√d| < 1/y + 2y√d, we have

|x^2 − dy^2| = |(x − y√d)(x + y√d)| < (1/y)(1/y + 2y√d)
             = 1/y^2 + 2√d ≤ 2√d + 1.

Taking M = 2√d + 1 will do the trick.

Great, this actually gives us enough to tackle Theorem 15.1!

Proof of Theorem 15.1. By Lemma 15.5 above together with the Pigeonhole Principle
(only finitely many integers m satisfy |m| < M), there exists some m ∈ Z such that
x^2 − dy^2 = m has infinitely many solutions x, y ∈ Z. (Note m ≠ 0, as √d is irrational.)
Fix such an m, and let the solutions be (x_n, y_n) for n ∈ N. By Pigeonhole Principle
again, there must be two distinct solutions (x_i, y_i) and (x_j, y_j) such that
x_i ≡ x_j (mod m) and y_i ≡ y_j (mod m).
Let α = x_i − y_i √d and β = x_j − y_j √d. Note N(α) = x_i^2 − dy_i^2 = m, and likewise
N(β) = m. We compute

ᾱ · β = ᾱ(α + β − α)
      = ᾱ · α + ᾱ(β − α).

Let A + B√d = ᾱ · β. Note that ᾱ · α = m and, by our choice of i, j, we have
m | β − α, so m | A and m | B. Thus, we can write A + B√d = m(u + v√d) for some
u, v ∈ Z. We have N(A + B√d) = N(ᾱ)N(β) = m · m = m^2, so

m^2 = N(m(u + v√d)) = N(m)N(u + v√d) = m^2 N(u + v√d),



so N(u + v√d) = 1. Aha, this is promising – we know elements with norm 1 correspond
to a solution of Pell’s equation! Furthermore, the coefficient of √d in ᾱ · β is
mv = x_j y_i − x_i y_j, which is nonzero because the fractions x_i/y_i and x_j/y_j (which
Lemma 15.5 took in lowest terms) are distinct. In particular, v ≠ 0.
Choose a solution (x, y) of x^2 − dy^2 = 1 with x, y ∈ N and x as small as possible.
Let α = x + y√d, and denote β = u + v√d, where we may take u, v ∈ N after changing
signs (the same argument applies to any solution (u, v) in N). (I know we defined α and
β earlier in the proof, but we are repurposing them here. Sorry!) We observed above that
N(u + v√d) = u^2 − dv^2 = 1. Since α is our minimal solution, we have β ≥ α. This
means either β lies strictly between α^n and α^{n+1} for some n ∈ N, or it is equal to α^n
on the dot for some n. The second case is what we want, so we suppose the first case is
true, and we hope to reach a contradiction.
Suppose α^n < β < α^{n+1}. Multiplying by ᾱ^n (which is positive, as ᾱ = 1/α) on both
sides, we have

1 = N(α)^n = α^n · ᾱ^n
  < β · ᾱ^n < α^{n+1} · ᾱ^n = α.

Let γ = β · ᾱ^n. We have N(γ) = N(β) · N(α)^n = 1 and 1 < γ < α, but this contradicts
the minimality of α, so we must have β = α^n for some n, as desired.

16 11/06 - More on Pell’s Equation


Some recaps of what went on in last lecture, and additional remarks. Notably, x^2 − dy^2
is the norm of the element x + y√d ∈ Z[√d]. Such an element is a unit in Z[√d] if and
only if its norm is a unit in Z, i.e. x^2 − dy^2 = ±1. Furthermore, if α is the smallest
solution such that N(α) = −1, then α^2 is the smallest solution to Pell’s equation
x^2 − dy^2 = 1. The crux behind this is that all solutions to a certain equation are just
powers of the smallest solution, so any solution to x^2 − dy^2 = 1 must be an even power
of the smallest solution to x^2 − dy^2 = −1 (when such a solution exists).

16.1 Motivation for $\frac{1+\sqrt{d}}{2}$

Let us take a very brief interlude. Consider when $d \equiv 1 \pmod 4$, and let $\alpha = \frac{1+\sqrt{d}}{2}$.
We can compute the norm to be $N(\alpha) = \alpha \cdot \bar\alpha = \frac{1-d}{4}$; since we assumed $d \equiv 1 \pmod 4$,
we have $N(\alpha) \in \mathbb{Z}$. Therefore, the minimal polynomial of $\alpha$, that is, the monic polynomial of
smallest degree for which $\alpha$ is a root, is
\[ (X - \alpha)(X - \bar\alpha) = X^2 - (\alpha + \bar\alpha)X + \alpha \cdot \bar\alpha = X^2 - X + \frac{1-d}{4} \in \mathbb{Z}[X]. \]

Thus, when $d \equiv 1 \pmod 4$, it may make sense to consider the ring $\mathbb{Z}\left[\frac{1+\sqrt{d}}{2}\right]$ instead
of $\mathbb{Z}[\sqrt{d}]$. This is in fact what happens in number theory! To give a bit more
explanation, what we actually want is to find the "integers" in the field $\mathbb{Q}(\sqrt{d})$. In $\mathbb{Q}$,
we can recover $\mathbb{Z}$ creatively via Proposition 8.3: the integers are the algebraic integers
contained in $\mathbb{Q}$. Likewise, we can define the "integers" of $\mathbb{Q}(\sqrt{d})$ as the algebraic integers
contained in this field. Turns out, this ring of integers is $\mathbb{Z}\left[\frac{1+\sqrt{d}}{2}\right]$ when $d \equiv 1 \pmod 4$.

16.2 Units of Ring of Integers


Let us consider the units of $\mathbb{Z}\left[\frac{1+\sqrt{d}}{2}\right]$. First, note that $\mathbb{Z}[\sqrt{d}] \subseteq \mathbb{Z}\left[\frac{1+\sqrt{d}}{2}\right]$, so we can
take the units on both sides to get $\mathbb{Z}[\sqrt{d}]^\times \subseteq \mathbb{Z}\left[\frac{1+\sqrt{d}}{2}\right]^\times$. But all units of $\mathbb{Z}[\sqrt{d}]$,
by the solutions to Pell's equation, are (up to sign) of the form $(x_1 + y_1\sqrt{d})^n$ where $(x_1, y_1)$ is the
fundamental solution and $n \in \mathbb{Z}$. Note that given any solution $(x_n, y_n)$, we can also have
$(\pm x_n, \pm y_n)$. The solution $(x_n, -y_n)$ can be recovered via $x_n - y_n\sqrt{d} = (x_1 - y_1\sqrt{d})^n =
(x_1 + y_1\sqrt{d})^{-n}$, and the other two sign changes can be recovered by just multiplying
our unit by $-1$. Thus, we have $\mathbb{Z}[\sqrt{d}]^\times \simeq \mathbb{Z} \times \{\pm 1\}$: the $\mathbb{Z}$ comes from the exponent
of the fundamental unit, and $\{\pm 1\}$ is just including all possible signs.
We can write any $\beta \in \mathbb{Z}\left[\frac{1+\sqrt{d}}{2}\right]$ as $\beta = \frac{A}{2} + \frac{B}{2}\sqrt{d}$, where $A, B \in \mathbb{Z}$ and $A \equiv B
\pmod 2$. (Write out elements explicitly if you don't believe this, it just follows from
construction.) Suppose $\beta \in \mathbb{Z}\left[\frac{1+\sqrt{d}}{2}\right]^\times$. This means $N(\beta) = \frac{A^2 - dB^2}{4} = \pm 1$, so we are
really considering solutions to the equation $A^2 - dB^2 = \pm 4$.
Let us take $d = 5$ as our example, as that is indeed the smallest nontrivial $d > 0$ such
that $d \equiv 1 \pmod 4$. (Wow quick maths.) We see that $2 + \sqrt{5}$ satisfies $N(2 + \sqrt{5}) = -1$.
But if the number 2 is too big for you, then what we could do is look for a solution to
$A^2 - 5B^2 = \pm 4$, which is quite easy to find here: $A = B = 1$. Indeed, $N(1 + \sqrt{5}) = -4$,
and so $N\left(\frac{1+\sqrt{5}}{2}\right) = -1$. This is the fundamental unit in $\mathbb{Z}\left[\frac{1+\sqrt{5}}{2}\right]$. By the same logic
as two paragraphs above, we have an isomorphism $\mathbb{Z}\left[\frac{1+\sqrt{d}}{2}\right]^\times \simeq \mathbb{Z} \times \{\pm 1\}$.
But wait, we have a copy of $\mathbb{Z} \times \{\pm 1\} \simeq \mathbb{Z}[\sqrt{d}]^\times$ contained in $\mathbb{Z}\left[\frac{1+\sqrt{d}}{2}\right]^\times \simeq \mathbb{Z} \times \{\pm 1\}$!
Even more, it is a (nontrivial) subgroup, and as any nontrivial subgroup of $\mathbb{Z}$ has finite
index (i.e. it is of the form $n\mathbb{Z}$ for some $n \neq 0$), the subgroup $\mathbb{Z}[\sqrt{d}]^\times \subseteq \mathbb{Z}\left[\frac{1+\sqrt{d}}{2}\right]^\times$ has
finite index. Here, for $d = 5$, we can see that the index is 3, because
\[ \left(\frac{1+\sqrt{5}}{2}\right)^3 = \frac{16 + 8\sqrt{5}}{8} = 2 + \sqrt{5}, \]
so we raise the fundamental unit of $\mathbb{Z}\left[\frac{1+\sqrt{5}}{2}\right]$ to the third power to get the fundamental
unit of $\mathbb{Z}[\sqrt{5}]$.


16.3 Finding Solutions when d ≡ 5 (mod 8)


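It turns out that for $d \equiv 5 \pmod 8$ the index is 3 whenever $A^2 - dB^2 = \pm 4$ has a solution
with $A, B$ odd (and the index is 1 if there is no such solution). We require this congruence
because if $A^2 - dB^2 = \pm 4$ for $A, B$ odd, then since $A^2, B^2 \equiv 1 \pmod 8$, it follows that
$d \equiv 5 \pmod 8$. (Consequently, when $d \equiv 1 \pmod 8$, we do not have such solutions to
this equation, so $\mathbb{Z}[\sqrt{d}]^\times = \mathbb{Z}\left[\frac{1+\sqrt{d}}{2}\right]^\times$.) We can check this as follows: if $A^2 - dB^2 = \pm 4$,
where $A, B$ are odd, then $\left(\frac{A}{2} + \frac{B}{2}\sqrt{d}\right)^2 \notin \mathbb{Z}[\sqrt{d}]$ but $\left(\frac{A}{2} + \frac{B}{2}\sqrt{d}\right)^3 \in \mathbb{Z}[\sqrt{d}]$. I will verify
the latter first: we have
\[ \left(\frac{A}{2} + \frac{B}{2}\sqrt{d}\right)^3 = \frac{1}{8}\left(A^3 + 3AB^2 d + (3A^2 B + B^3 d)\sqrt{d}\right), \]
and when $d \equiv 5 \pmod 8$, we have $A^3 + 3AB^2 d \equiv A^3 - AB^2 \equiv A(A^2 - B^2) \equiv 0 \pmod 8$
and $3A^2 B + B^3 d \equiv B(3A^2 - 3B^2) \equiv 0 \pmod 8$. One can run the computations for
$\left(\frac{A}{2} + \frac{B}{2}\sqrt{d}\right)^2$ and show that it does not in fact live in $\mathbb{Z}[\sqrt{d}]$ (the coefficient of $\sqrt{d}$
is $AB/2$, which is not an integer).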
This neat fact about the index being 3 gives us power in computations! Let us take
the example when d = 29.

Example 16.1 (d = 29)


We have $5^2 - 29 \cdot 1^2 = -4$, so $u = \frac{5+\sqrt{29}}{2}$ is a fundamental unit of $\mathbb{Z}\left[\frac{1+\sqrt{29}}{2}\right]$. Then,
$\left(\frac{5+\sqrt{29}}{2}\right)^3 = 70 + 13\sqrt{29}$ is the smallest solution to the equation $x^2 - 29y^2 = -1$, and so the sixth
power is the smallest solution to Pell's equation $x^2 - 29y^2 = 1$. Finding this smallest
solution straight from the wild, without this $\frac{1+\sqrt{29}}{2}$ business to guide us, would be
very difficult! (It is equal to $9801 + 1820\sqrt{29}$.)
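
The arithmetic in this example is easy to double-check by machine. The following is only a sketch: the convention that a pair (a, b) stands for (a + b√d)/2, and the helper names mul, power and norm, are choices made here for illustration and do not come from the lecture.

```python
# Convention (an assumption of this sketch): the pair (a, b) stands for
# (a + b*sqrt(d)) / 2 with a ≡ b (mod 2), i.e. an element of Z[(1 + sqrt(d))/2]
# for d ≡ 1 (mod 4).

def mul(p, q, d):
    """Multiply two elements given as numerator pairs; return a numerator pair."""
    a, b = p
    c, e = q
    # (a + b√d)(c + e√d) / 4 = ((ac + d*be) + (ae + bc)√d) / 4
    rational = a * c + d * b * e
    irrational = a * e + b * c
    assert rational % 2 == 0 and irrational % 2 == 0, "inputs must lie in the ring"
    return (rational // 2, irrational // 2)

def power(p, n, d):
    """Compute p**n by repeated multiplication, starting from 1 = (2 + 0*sqrt(d)) / 2."""
    result = (2, 0)
    for _ in range(n):
        result = mul(result, p, d)
    return result

def norm(p, d):
    """Norm of (a + b*sqrt(d)) / 2, namely (a^2 - d*b^2) / 4."""
    a, b = p
    return (a * a - d * b * b) // 4

u = (5, 1)                     # u = (5 + sqrt(29)) / 2
cube = power(u, 3, 29)         # numerator pair of u^3
sixth = power(u, 6, 29)        # numerator pair of u^6
print(cube, norm(cube, 29))    # halve both entries to read off x + y*sqrt(29)
print(sixth, norm(sixth, 29))
```

Halving the entries of the two pairs recovers 70 + 13√29 (norm −1) and 9801 + 1820√29 (norm 1); the same functions with d = 5 and u = (1, 1) reproduce the cube 2 + √5 from the previous subsection.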

17 11/10 - Dirichlet’s Theorem, an Introduction


We started this topic at the end of the last lecture, but it is more fitting to start it for
a new section. We begin this topic with an absolute banger of a theorem, credited to
Dirichlet.

Theorem 17.1 (Dirichlet)


Suppose a, m ∈ N with (a, m) = 1. The set of primes p congruent to a (mod m) is
infinite. In other words, the arithmetic progression a, a + m, a + 2m, . . . contains
infinitely many primes.

A baby example of this is when m = 4 and a = 3, which we can prove in a more


grounded way a la Euclid.


Proposition 17.2
There are infinitely many primes congruent to 3 (mod 4).

Proof. Suppose there are finitely many; let $\{p_1, \dots, p_r\}$ be the complete list. Consider
$N = 4p_1 \cdots p_r - 1$. None of the $p_i$'s divides $N$, and $N$ is odd, so all of its prime factors must be
1 mod 4. But a product of numbers that are 1 mod 4 is still 1 mod 4, whereas $N \equiv 3 \pmod 4$
by construction, so we reach a contradiction.
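
To see the Euclid-style argument in action, here is a small sketch using the number N = 4p_1···p_r − 1 (a standard variant of the construction, chosen here so that 3 may appear in the list). The function names are made up for this illustration.

```python
from math import prod

def prime_factors(n):
    """Prime factors of n (with multiplicity) by trial division."""
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors

def new_prime_3_mod_4(primes):
    """Given primes that are 3 mod 4, return a prime that is 3 mod 4 and not in the list."""
    n = 4 * prod(primes) - 1          # n ≡ 3 (mod 4) and no listed prime divides n
    candidates = [q for q in prime_factors(n) if q % 4 == 3]
    return min(candidates)            # nonempty: otherwise n would be 1 mod 4

print(new_prime_3_mod_4([3, 7, 11]))  # 4*231 - 1 = 923 = 13 * 71
```

Feeding the output back into the list produces more and more primes congruent to 3 mod 4, which is exactly the contradiction in the proof.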

17.1 Riemann Zeta Function


We define the Riemann zeta function by

\[ \zeta(s) = \sum_{n=1}^{\infty} n^{-s}. \]

For this class, we will just concern ourselves with s ∈ R>1 , but its real power comes
from taking s ∈ C. (This requires some knowledge of complex analysis, which is a really
fun subject (everything is so nice in complex analysis!) but goes beyond the scope of
this course. Take Math 113 if you’re interested, though!)
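We can check the sum converges for $s > 1$. By the integral test, we can bound
\[ (n+1)^{-s} < \int_n^{n+1} t^{-s}\,dt < n^{-s} \implies \zeta(s) - 1 < \int_1^\infty t^{-s}\,dt < \zeta(s), \]
\[ \int_1^\infty t^{-s}\,dt = \left[\frac{-t^{-s+1}}{s-1}\right]_1^\infty = \frac{1}{s-1} < \infty. \]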

Although ζ(s) diverges for s = 1 (it is then the harmonic series, which we showed
diverges in one of the first lectures on the Prime Number Theorem), we see that
the divergence is “not too bad.” (For people who know complex analysis, the following
says the pole of ζ(s) at s = 1 is simple with residue 1.)

Proposition 17.3
$\lim_{s \to 1^+} (s-1)\zeta(s) = 1$.

Proof. The second line of inequalities above says $\zeta(s) - 1 < \frac{1}{s-1} < \zeta(s)$. Multiplying by $s - 1$
gives us $\zeta(s)(s-1) - (s-1) < 1 < \zeta(s)(s-1)$, so $1 < \zeta(s)(s-1) < s$. As $s \to 1^+$,
the squeeze theorem gives the desired limit.
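
A quick numerical sanity check of the inequality 1 < (s − 1)ζ(s) < s is sketched below. The approximation of ζ(s) used here (a partial sum plus an integral tail and a half-term correction, i.e. the first terms of Euler–Maclaurin summation) is an assumption of this sketch and is not part of the lecture.

```python
from math import pi

def zeta(s, terms=10_000):
    """Approximate zeta(s) for real s > 1.

    Sum the first terms-1 terms exactly, then estimate the tail from n = terms
    onward by the integral of t^(-s) plus half of the first omitted term.
    """
    partial = sum(n ** -s for n in range(1, terms))
    tail = terms ** (1 - s) / (s - 1)
    return partial + tail + 0.5 * terms ** -s

print(zeta(2), pi ** 2 / 6)           # the two values should agree closely
for s in (1.5, 1.1, 1.01, 1.001):
    print(s, (s - 1) * zeta(s))       # squeezed between 1 and s
```

As s approaches 1 from the right, the printed products are pinned between 1 and s, which is the content of the proposition.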


I was trying to avoid discussing what happens when s ∈ C, but Kisin is going full
force with this complex discussion, so let me try to provide some explanation. We can
define the exponential function $e^z$ for $z \in \mathbb{C}$ by $e^{x+iy} = e^x \cdot e^{iy}$. The first part $e^x$ is
just the exponential function for the reals, which we know and love. The second is just
$e^{i\theta} = \cos\theta + i\sin\theta$; note, importantly, that $|e^{i\theta}| = 1$, so the magnitude of $e^z$ really
comes from just the $e^x = e^{\operatorname{Re} z}$ part. Writing $n = e^{\log n}$, we have, for $s = \alpha + i\beta$,
\[ n^{-s} = e^{-s\log n} = e^{-\alpha\log n} \cdot e^{-i\beta\log n} = n^{-\alpha} e^{-i\beta\log n}. \]

By the Triangle Inequality, we have $|\zeta(s)| \leq \sum_{n=1}^{\infty} |n^{-s}| = \sum_{n=1}^{\infty} n^{-\operatorname{Re} s}$, so in fact our
check for convergence when $s > 1$ really checked it for when $\operatorname{Re} s > 1$.
This means we can define the Riemann zeta function without tears for the “half”-
plane where Re s > 1. The pesky pole (i.e., place where ζ(s) goes to infinity) at s = 1
prevents us from doing better. However, we just demonstrated that this pole is “not
bad”: in fact, it is perhaps the most nicely behaved pole possible. There is a method
in complex analysis called analytic continuation which allows us to bypass this pole and
define ζ(s) for the whole complex plane minus s = 1.
The next question is, then, where are the zeroes of this Riemann zeta function? This
is the content of the famous Riemann Hypothesis, one of the Millennium Problems.
First, one can show that there are zeroes at the negative even integers s = −2n for
n ∈ N. These are not hard to show, and so they are called the “trivial zeroes.” Besides
these, it is conjectured that all other zeroes lie on the vertical line Re s = 1/2, which is weird
and seems to come out of nowhere. This has been checked numerically for an enormous number
of zeroes, so people believe it to be true, but no proof has been provided yet.
Here is one formal consequence of Proposition 17.3 above.

Corollary 17.4
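\[ \lim_{s \to 1^+} \frac{\log\zeta(s)}{\log(s-1)^{-1}} = 1. \]

Proof. Let $\rho(s) = (s-1)\zeta(s)$, so $\log\rho(s) = \log(s-1) + \log\zeta(s)$. Dividing by
$\log(s-1)^{-1} = -\log(s-1)$ on both sides, we have
\[ \frac{\log\rho(s)}{\log(s-1)^{-1}} = -1 + \frac{\log\zeta(s)}{\log(s-1)^{-1}}. \]

Taking the limit as $s \to 1^+$, we note from Proposition 17.3 that $\lim_{s \to 1^+} \rho(s) = 1$, so
$\log\rho(s) \to 0$ while the denominator is unbounded; hence
\[ \lim_{s \to 1^+} \frac{\log\rho(s)}{\log(s-1)^{-1}} = 0 = -1 + \lim_{s \to 1^+} \frac{\log\zeta(s)}{\log(s-1)^{-1}}, \]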

and the conclusion follows.


17.2 Euler Factorization


So we’ve mentioned before that the Riemann Hypothesis is related to the primes in some
way. But how? So far, in our definition $\zeta(s) = \sum_{n \geq 1} n^{-s}$, primes appear nowhere. But
some really smart guy named Euler (surprise!) made the following observation:

Proposition 17.5 (Euler Product)


For Re s > 1,
\[ \zeta(s) = \prod_{p \text{ prime}} (1 - p^{-s})^{-1} = (1 - 2^{-s})^{-1}(1 - 3^{-s})^{-1} \cdots. \]

Before we begin the formal proof, I think it is useful to just convince yourself,
perhaps slightly non-rigorously, that this is true. A good place to start would be a
simpler problem such as:
Exercise 17.6. What is the sum
\[ \sum_{n = 3^a 5^b} \frac{1}{n}, \]
where the sum is taken over all n with only 3 and 5 in the prime factorization?

Once you get this, it is not too difficult to see how this generalizes when we sum
over all n ∈ N.

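Proof. Our good old sum of infinite geometric series tells us that
\[ (1 - p^{-s})^{-1} = \frac{1}{1 - p^{-s}} = 1 + p^{-s} + p^{-2s} + \cdots. \]
Take some $N \in \mathbb{N}$. We know, trivially, that any $n \leq N$ must factor into primes also
at most $N$. Thus, by unique factorization,
\[ \prod_{p \leq N} (1 - p^{-s})^{-1} = \prod_{p \leq N} (1 + p^{-s} + p^{-2s} + \cdots) = \sum_{n \leq N} n^{-s} + R_N(s), \]
where $|R_N(s)| \leq \sum_{n > N} |n^{-s}|$, which tends to 0 as $N \to \infty$ because the series for $\zeta(s)$
converges absolutely. Taking the limit on both sides as $N \to \infty$, we conclude
\begin{align*}
\prod_p (1 - p^{-s})^{-1} &= \lim_{N \to \infty} \prod_{p \leq N} (1 - p^{-s})^{-1} \\
&= \lim_{N \to \infty} \Big( \sum_{n \leq N} n^{-s} + R_N(s) \Big) \\
&= \zeta(s) + \lim_{N \to \infty} R_N(s) = \zeta(s),
\end{align*}
as desired.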
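
The Euler product can also be watched converging numerically. The sketch below compares a truncated product at s = 2 with the known value ζ(2) = π²/6; the sieve and the function names are illustrative choices, not part of the lecture.

```python
from math import pi

def primes_up_to(n):
    """Sieve of Eratosthenes: all primes p <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            # cross out the multiples p*p, p*p + p, ... of p
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]

def euler_product(s, bound):
    """Truncated Euler product: product of (1 - p^(-s))^(-1) over primes p <= bound."""
    product = 1.0
    for p in primes_up_to(bound):
        product *= 1.0 / (1.0 - p ** -s)
    return product

for bound in (10, 100, 10_000):
    print(bound, euler_product(2, bound), pi ** 2 / 6)
```

Each truncated product is a little smaller than ζ(2), since it only accounts for integers whose prime factors are all at most the bound.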


This is perhaps the simplest example of an Euler Product. There are many classes
of functions which exhibit an Euler Product similar to this; when a function does, the
niceness of this factorization suggests that the function has some really nice
properties/deeper connections to number theory.
One upshot of expressing the zeta function as a product is that when we take the
logarithm, we can split it up based on the terms in the product (we can’t do anything
with log(x + y), but we know log(xy) = log x + log y). We see this here:

Proposition 17.7
For Re s > 1, $\log\zeta(s) = \sum_p p^{-s} + R(s)$ for some function $R(s)$ bounded near $s = 1$.

Proof. We will use the fact from calculus
\[ -\log(1 - x) = x + \frac{x^2}{2} + \cdots = \sum_{n \geq 1} \frac{x^n}{n}. \]
From the Euler factorization, we can write
\[ \zeta(s) = \prod_{p \leq N} (1 - p^{-s})^{-1} \, \lambda_N(s) \]
for some function $\lambda_N$ with $\lim_{N \to \infty} \lambda_N(s) = 1$. Taking the logarithm on both sides and
using our Taylor series expansion above gives
\begin{align*}
\log\zeta(s) &= -\sum_{p \leq N} \log(1 - p^{-s}) + \log\lambda_N(s) \\
&= \sum_{p \leq N} \sum_{m \geq 1} \frac{p^{-ms}}{m} + \log\lambda_N(s).
\end{align*}
Letting $N \to \infty$ (the left side does not depend on $N$), we get
\begin{align*}
\log\zeta(s) &= \lim_{N \to \infty} \Big( \sum_{p \leq N} \sum_{m \geq 1} \frac{p^{-ms}}{m} + \log\lambda_N(s) \Big) \\
&= \sum_p \sum_{m \geq 1} \frac{p^{-ms}}{m} + \log 1 \\
&= \sum_p p^{-s} + \sum_p \sum_{m \geq 2} \frac{p^{-ms}}{m}.
\end{align*}
The double sum at the very end is our $R(s)$. Looking at the inner sum separately, for real $s > 1$ we
can bound
\[ \sum_{m \geq 2} \frac{p^{-ms}}{m} \leq \sum_{m \geq 2} p^{-ms} = p^{-2s}(1 + p^{-s} + \cdots) = p^{-2s}(1 - p^{-s})^{-1}. \]
Therefore,
\[ 0 \leq R(s) \leq \sum_p p^{-2s}(1 - p^{-s})^{-1} \leq (1 - 2^{-s})^{-1} \sum_p p^{-2s} \leq (1 - 2^{-s})^{-1}\zeta(2) \leq 2\zeta(2), \]
which is just a constant, and hence clearly bounded near $s = 1$.

17.3 Dirichlet Density


Our goal is to prove Dirichlet’s Theorem (17.1). The way we will do this is, roughly
speaking, show that for any a such that (a, m) = 1, the “proportion of primes which are
congruent to a mod m is nonzero.” This implies there are infinitely many such primes,
since there are an infinite number of primes in total.
But what do we mean exactly by “proportion” here? How do we define fractions
when we are counting over an infinite set? We do this by defining something called the
Dirichlet density.

Definition 17.8 (Dirichlet Density). Let P be a set of primes. Then,
\[ d(P) = \lim_{s \to 1^+} \frac{\sum_{p \in P} p^{-s}}{\log(s-1)^{-1}} \]
is called the Dirichlet density of P , if it exists.

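To illustrate what is going on, recall Corollary 17.4, which told us a certain limit is
1. Using Proposition 17.7 (the bounded term $R(s)$ disappears after dividing by the unbounded
denominator), we can rewrite it as
\[ \lim_{s \to 1^+} \frac{\log\zeta(s)}{\log(s-1)^{-1}} = \lim_{s \to 1^+} \frac{\sum_p p^{-s}}{\log(s-1)^{-1}} = 1. \]

This is the simplest example of Dirichlet density: this tells us that the density of all
primes in, well, all primes is 1.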
Dirichlet’s Theorem cares about primes congruent to a mod m, so let us define
P (a, m) as the set of primes p such that p ≡ a (mod m). Then, we can reformulate
Dirichlet’s Theorem as:

Theorem 17.9 (Dirichlet)

If (a, m) = 1, then d(P (a, m)) = 1/ϕ(m).

In fact, this is stronger than our original statement of Dirichlet’s Theorem, as it not
only guarantees infinitely many primes in each residue class coprime to m, but also says
that they are distributed equally among these classes.


Example 17.10
Taking m = 4 and noting that the only possible residue classes are 1 and 3 mod 4
(excluding the prime p = 2), we have d(P (1, 4)) = d(P (3, 4)) = 1/2, i.e., “half” of
the primes are in each residue class.

Remark 17.11. A couple of remarks. If P1 , P2 are disjoint sets of primes whose densities
exist, then the definition tells us d(P1 ∪P2 ) = d(P1 )+d(P2 ), which should agree with our intuition.
If P is finite, then it also makes sense that d(P ) = 0, which we can observe from the
definition of the Dirichlet density. (If the set is finite, the numerator is bounded,
but the denominator goes to infinity.) Thus, since Dirichlet’s Theorem as described
above guarantees d(P (a, m)) = 1/ϕ(m) > 0, it follows that there are infinitely
many primes in P (a, m).
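
For intuition about "half of the primes in each class," one can simply count. Note the caveat: the sketch below counts primes up to a bound (a natural-density style count), which is not the Dirichlet density defined above; it is only meant as an empirical illustration, and the function names are made up here.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes: all primes p <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]

def counts_mod_4(bound):
    """Count the primes p <= bound with p ≡ 1 (mod 4) and with p ≡ 3 (mod 4)."""
    ones = threes = 0
    for p in primes_up_to(bound):
        if p % 4 == 1:
            ones += 1
        elif p % 4 == 3:
            threes += 1
    return ones, threes

ones, threes = counts_mod_4(100_000)
print(ones, threes, ones / (ones + threes))
```

The two counts come out nearly equal, consistent with d(P(1, 4)) = d(P(3, 4)) = 1/2.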

17.4 Dirichlet L-functions


Tackling the theorem directly is daunting, so we will first provide a proof for when
m = 4. We will use something called Dirichlet characters (wow, this Dirichlet guy
did a lot of stuff huh).

Remark 17.12. In general, whenever you see a “character” (especially in number


theory), it is a group homomorphism to some multiplicative group. See below for
an example.

We will define a function χ : Z → {0, ±1} where 2Z ↦ 0 (the evens map to 0)
and any odd a ∈ Z \ 2Z maps to 1 if a ≡ 1 (mod 4) and to −1 if a ≡ 3 (mod 4), i.e., to its
residue mod 4 taken in {±1}. For instance, χ(17) = 1 and
χ(23) = −1, while χ(122) = 0. It is not difficult to see that this χ is multiplicative,
that is, χ(mn) = χ(m)χ(n).
Given this character, we can now generalize the Riemann zeta function to what we
call Dirichlet L-functions. In this m = 4 case, this is given by

\[ L(s, \chi) = \sum_{n=1}^{\infty} \chi(n) n^{-s} = 1 - 3^{-s} + 5^{-s} - 7^{-s} + \cdots. \]

(For any m, we can define a Dirichlet character χ, and then the definition of L(s, χ)
would be the same.)
Since $|\chi(n)| \leq 1$ for all $n$, we have $|\chi(n)n^{-s}| \leq |n^{-s}|$, so the Triangle
Inequality tells us
\[ |L(s, \chi)| = \Big| \sum_{n \geq 1} \chi(n) n^{-s} \Big| \leq \sum_{n \geq 1} |\chi(n) n^{-s}| \leq \sum_{n \geq 1} |n^{-s}|, \]
so $L(s, \chi)$ converges for Re s > 1.

Even more, this L-function follows the story of the Riemann zeta function by admitting
an Euler factorization. This is indeed the case because χ is multiplicative. The
factorization looks like
\[ L(s, \chi) = \prod_p (1 - \chi(p) p^{-s})^{-1}. \]
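
Here is a small sketch of the mod 4 character and its L-function. The names chi4, l_series and l_euler_product are invented for this illustration; the code checks multiplicativity on a range of inputs and compares a partial sum of the series with a truncated Euler product at s = 2.

```python
def chi4(n):
    """The character mod 4: 0 on even n, +1 if n ≡ 1 (mod 4), -1 if n ≡ 3 (mod 4)."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1

def is_prime(n):
    """Primality by trial division (fine for the small bounds used here)."""
    if n < 2:
        return False
    return all(n % q for q in range(2, int(n ** 0.5) + 1))

def l_series(s, terms):
    """Partial sum of L(s, chi) = sum over n of chi(n) * n^(-s)."""
    return sum(chi4(n) * n ** -s for n in range(1, terms + 1))

def l_euler_product(s, bound):
    """Truncated Euler product: product over primes p <= bound of (1 - chi(p) p^(-s))^(-1)."""
    product = 1.0
    for p in range(2, bound + 1):
        if is_prime(p):
            product *= 1.0 / (1.0 - chi4(p) * p ** -s)
    return product

print(chi4(17), chi4(23), chi4(122))
print(all(chi4(a * b) == chi4(a) * chi4(b) for a in range(1, 50) for b in range(1, 50)))
print(l_series(2, 20_000), l_euler_product(2, 10_000))
```

The last line prints two nearly equal numbers, which is the Euler factorization of L(s, χ) seen numerically.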

17.5 Dirichlet’s Theorem for m = 4


Now, we will tackle Dirichlet’s Theorem for m = 4, which – note – does not mention
this Dirichlet character anywhere. But we will see it is the key tool in the proof.

Proposition 17.13
d(P (1, 4)) = d(P (3, 4)) = 1/2.

Proof. Let $\zeta^*(s) = \sum_{2 \nmid n} n^{-s}$. For any $n = 2^k m$ with $m$ odd, we make the simple
observation that $n^{-s} = 2^{-ks} m^{-s}$. Thus, we may factor
\[ \zeta(s) = \zeta^*(s)(1 + 2^{-s} + 2^{-2s} + \cdots) = \zeta^*(s)(1 - 2^{-s})^{-1}. \]

(A more direct way to see this is that, just like how we can construct an Euler
product for $\zeta(s)$, we can do the same for $\zeta^*(s)$, except we omit $p = 2$ since we are only
summing over odd $n$.)
Akin to Proposition 17.7, with just omitting the $p = 2$ term in the sum, we have
\[ \log\zeta^*(s) = \sum_{p \neq 2} p^{-s} + R_2(s), \]
where $R_2(s)$ is bounded near $s = 1$. We can do the same to $L(s, \chi)$, which is similar
to $\zeta^*(s)$ but where the coefficients in the sum expansion alternate between $\pm 1$. Again,
akin to Proposition 17.7, we have
\[ \log L(s, \chi) = \sum_p \chi(p) p^{-s} + R_\chi(s), \]
where $R_\chi(s)$ is bounded near $s = 1$.


Recall the definition of Dirichlet density; for $d(P(1, 4))$, the sum in the numerator
is $\sum_{p \equiv 1 \,(4)} p^{-s}$. How can we get this sum from our two sums above? Well, since
$\chi(p) = 1$ iff $p \equiv 1 \pmod 4$, $\chi(p) = -1$ iff $p \equiv 3 \pmod 4$, and the main sum in $\log\zeta^*(s)$
has coefficients all 1, we have
\begin{align*}
2 \sum_{p \equiv 1 \,(4)} p^{-s} &= \sum_{p \neq 2} p^{-s} + \sum_p \chi(p) p^{-s}, \\
2 \sum_{p \equiv 3 \,(4)} p^{-s} &= \sum_{p \neq 2} p^{-s} - \sum_p \chi(p) p^{-s}, \\
\implies \log\zeta^*(s) + \log L(s, \chi) &= 2 \sum_{p \equiv 1 \,(4)} p^{-s} + R(s), \tag{*} \\
\log\zeta^*(s) - \log L(s, \chi) &= 2 \sum_{p \equiv 3 \,(4)} p^{-s} + R(s),
\end{align*}
where the $R(s)$ error terms are bounded near $s = 1$. (Here, the two $R(s)$’s are different,
I am just abusing notation because they don’t really matter.)
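We can construct crude bounds for $\log L(s, \chi)$. We can group the terms of our
sum in two simple ways:
\begin{align*}
L(s, \chi) &= 1 - 3^{-s} + 5^{-s} - 7^{-s} + \cdots \\
&= (1 - 3^{-s}) + (5^{-s} - 7^{-s}) + \cdots > 2/3, \\
L(s, \chi) &= 1 - (3^{-s} - 5^{-s}) - (7^{-s} - 9^{-s}) - \cdots < 1,
\end{align*}
so $2/3 < L(s, \chi) < 1$ for $s > 1$. Taking the log, we have $\log 2/3 < \log L(s, \chi) < 0$. In
particular, this means $\log L(s, \chi)$ is bounded, so we have
\[ \lim_{s \to 1^+} \frac{\log\zeta^*(s) + \log L(s, \chi)}{\log(s-1)^{-1}} = \lim_{s \to 1^+} \frac{\log\zeta^*(s)}{\log(s-1)^{-1}} = 1, \]
where the last equality follows from Corollary 17.4, since $\log\zeta^*(s) = \log\zeta(s) + \log(1 - 2^{-s})$
and the second term is bounded near $s = 1$. But through Equation (*), we see that this is equal to
\[ 1 = \lim_{s \to 1^+} \frac{2 \sum_{p \equiv 1 \,(4)} p^{-s} + R(s)}{\log(s-1)^{-1}} = 2d(P(1, 4)), \]
so $d(P(1, 4)) = 1/2$ and consequently $d(P(3, 4)) = 1/2$ as well.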

18 11/13 - Dirichlet Characters


Professor Kisin is disappointed that not more people are showing up to lecture. To
encourage attendance, the final exam will have a question where you need to provide
your favorite Kisin joke or anecdote. For instance, he gave a really funny anecdote at
the beginning of this class, but I am not allowed to share it.
Last time, we proved Dirichlet’s theorem for m = 4. For the next three to four
lectures, we will prove Dirichlet’s theorem in full generality. The central characters
(pun fully intended) in this story are the Dirichlet characters, so we begin there.


Dirichlet characters are very important, but developing the theory might feel a bit
like eating vegetables. Bear with us for a bit.

For m = 4, we defined this character $\chi : \mathbb{Z} - 2\mathbb{Z} \to (\mathbb{Z}/4\mathbb{Z})^\times \xrightarrow{\sim} \{\pm 1\}$ where an
element outputs its residue mod 4, and we could extend this to be 0 on the even integers.
This is a baby example of the Dirichlet character for a general m. Consider a map

χ : (Z/mZ)× → C×

which is a group homomorphism (i.e. χ(ab) = χ(a)χ(b), and consequently χ(1) = 1).
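Recall $|(\mathbb{Z}/m\mathbb{Z})^\times| = \phi(m)$, so by multiplicativity, we have
\[ \chi(a)^{\phi(m)} = \chi(a^{\phi(m)}) = \chi(1) = 1, \]
so $\chi(a)$ must be a $\phi(m)$th root of unity. Explicitly, this means
\[ \chi(a) = e^{2\pi i k/\phi(m)} \]
for some $k \in \mathbb{Z}$.
For instance, let $m = p$ be prime. We know $(\mathbb{Z}/p\mathbb{Z})^\times$ is cyclic, so $(\mathbb{Z}/p\mathbb{Z})^\times \simeq
\mathbb{Z}/(p-1)\mathbb{Z}$. Thus, we can consider $\chi : (\mathbb{Z}/p\mathbb{Z})^\times \to \mathbb{C}^\times$ as a function from $\mathbb{Z}/(p-1)\mathbb{Z}$,
and the map would be
\[ \chi : \mathbb{Z}/(p-1)\mathbb{Z} \to \mathbb{C}^\times, \qquad a \mapsto e^{2\pi i a k/(p-1)} \]
for some $k \in \mathbb{Z}$.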
Like in the m = 4 case, we can extend these characters to be a Dirichlet character
mod m.
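Definition 18.1 (Dirichlet character). A Dirichlet character mod m is a map
$\chi : \mathbb{Z} \to \mathbb{C}$ for which there exists a group homomorphism $\bar\chi : (\mathbb{Z}/m\mathbb{Z})^\times \to \mathbb{C}^\times$ such
that for $a \in \mathbb{Z}$,

1. if $(a, m) \neq 1$, then $\chi(a) = 0$;

2. otherwise, $\chi(a) = \bar\chi(a \bmod m)$.

Example 18.2
The simplest character is the trivial one, where $\chi$ extends from the trivial character
$\bar\chi : (\mathbb{Z}/m\mathbb{Z})^\times \to \mathbb{C}^\times$ where $\bar\chi(a) = 1$ for all $a \in (\mathbb{Z}/m\mathbb{Z})^\times$.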


18.1 Dual Group


Great, we have these individual characters. We will slightly generalize first, then con-
sider the set of all Dirichlet characters and see what structures come with this set.
Let $A$ be a finite abelian group. This just means $A$ is finite as a set, and abelian
means that $a \cdot b = b \cdot a$ for any $a, b \in A$. Now, consider the set of characters $\hat{A} = \{\chi :
A \to \mathbb{C}^\times \mid \chi(ab) = \chi(a)\chi(b)\}$. It follows that $\hat{A}$ is an abelian group, where the group
structure is given by $(\chi_1 \cdot \chi_2)(a) := \chi_1(a) \cdot \chi_2(a)$ for $\chi_1, \chi_2 \in \hat{A}$, and $\chi^{-1}(a) := \chi(a)^{-1}$.

Example 18.3
Consider $A \cong \mathbb{Z}/n\mathbb{Z}$, and let $r$ be a generator of $\mathbb{Z}/n\mathbb{Z}$. (This amounts to just
having $(r, n) = 1$, as we’ve seen repeatedly.) Like we worked out before, writing the group
operation of $A$ multiplicatively, $\chi(r) \in \mathbb{C}^\times$
has to satisfy $\chi(r)^n = \chi(r^n) = \chi(1) = 1$, so $\chi(r) = e^{2\pi i k/n}$ for some $k \in \mathbb{Z}$.
Denote $\zeta_n = e^{2\pi i k/n}$. Then, $\chi(r^j) = \zeta_n^j$ by multiplicativity, and this completely
determines $\chi$. But all characters of $\mathbb{Z}/n\mathbb{Z}$ must be of this form, so $\chi$ really depends
only on the choice of $k$ in the exponent. As $k$ ranges across $\mathbb{Z}/n\mathbb{Z}$ ($k$ technically
ranges over all of $\mathbb{Z}$, but $k$ and $k + n$ produce the same character), we conclude
$\hat{A} \simeq \mathbb{Z}/n\mathbb{Z}$, so in fact $\hat{A} \simeq A$. This may not seem so significant at first, but it is
remarkably deep.

It turns out that this isomorphism $A \simeq \hat{A}$, as demonstrated when $A \simeq \mathbb{Z}/n\mathbb{Z}$ above,
is a taste of the more general result.

Lemma 18.4 (Dual is Isomorphism)


There is a non-canonical isomorphism $A \simeq \hat{A}$. In particular, $|A| = |\hat{A}|$.

Proof. We proved this already for cyclic groups, as any finite cyclic group is isomorphic
to some Z/nZ.
In general, the Classification of Finite Abelian Groups tells us that any finite abelian
group is of the form
A ≃ Z/n1 Z × · · · × Z/nr Z,
where the group operation is just addition component-wise.
Since these components are independent of each other, any $\chi \in \hat{A}$ restricts to a character
on each of the components, and given characters on each component, we can patch
them up via multiplication to produce a character on all of $A$. Thus, specifying $\chi \in \hat{A}$ is
equivalent to giving characters $\chi_i \in \widehat{\mathbb{Z}/n_i\mathbb{Z}}$ for $1 \leq i \leq r$, as $\chi|_{\mathbb{Z}/n_i\mathbb{Z}} = \chi_i$ and we can
reconstruct $\chi$ from the $\chi_i$’s via $\chi(a_1, \dots, a_r) = \chi_1(a_1)\chi_2(a_2) \cdots \chi_r(a_r)$. Now, we use
our work done for cyclic groups to get
\[ \hat{A} \simeq \prod_{i=1}^{r} \widehat{\mathbb{Z}/n_i\mathbb{Z}} \simeq \prod_{i=1}^{r} \mathbb{Z}/n_i\mathbb{Z} \simeq A. \]


This might feel familiar if you’ve learned some linear algebra: the dual of a vector
space V is isomorphic to V , but not canonically (it requires the choice of a basis).
Similarly, we call $\hat{A}$ the dual of $A$. But we do know from linear algebra that the
dual of the dual of V is canonically isomorphic to V . We replicate the same result
here.

Corollary 18.5 (Double Dual is Canonical Isomorphism)


There is a canonical isomorphism $A \simeq \hat{\hat{A}}$.

Proof. If $a \in A$, we wish to produce an element of $\hat{\hat{A}}$, which is a map $\hat{A} \to \mathbb{C}^\times$. We
have a really choice-free way of doing so: we define $\psi_a : \chi \mapsto \chi(a)$. This gives us our
isomorphism $A \xrightarrow{\sim} \hat{\hat{A}}$ where $a \mapsto \psi_a$.
We check that this is a bijective homomorphism. We start with the latter: we have,
for $a, b \in A$,
\[ \psi_{ab}(\chi) = \chi(ab) = \chi(a)\chi(b) = \psi_a(\chi)\psi_b(\chi). \]
It now remains to show $a \mapsto \psi_a$ is injective. This suffices, since we know $|A| =
|\hat{A}| = |\hat{\hat{A}}|$, so if we get an injective map $A \to \hat{\hat{A}}$, then it must be surjective as well.
Injectivity amounts to proving that for any $1 \neq a \in A$, the map $\psi_a$ is not the trivial map,
or equivalently there is some $\chi \in \hat{A}$ such that $\chi(a) \neq 1$.
Via the decomposition $A \simeq \mathbb{Z}/n_1\mathbb{Z} \times \cdots \times \mathbb{Z}/n_r\mathbb{Z}$ (where $a \mapsto (a_1, \dots, a_r)$), we can
decompose $\chi$ into characters $(\chi_1, \dots, \chi_r)$. So now we have reduced this problem to the
cyclic group case. If $a \neq 1$, then $a_i$ is not the identity for some $i$. Select $\chi_j$’s such that for $j \neq i$, $\chi_j = 1$
is the trivial character, and $\chi_i(a_i) \neq 1$ (possible by Example 18.3: send a generator of
$\mathbb{Z}/n_i\mathbb{Z}$ to $e^{2\pi i/n_i}$). Then, $\chi(a) = \chi_1(a_1) \cdots \chi_r(a_r) = \chi_i(a_i) \neq 1$,
and we win.

18.2 Orthogonality Relations


We have seen a result before similar in flavor to the one below, when χ was the Legendre
symbol.


Proposition 18.6
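Let $A$ be a finite abelian group, and let $n = |A|$. If $\chi, \psi \in \hat{A}$, then
\[ \sum_{a \in A} \chi(a)\overline{\psi(a)} = n \cdot \delta_{\chi,\psi} = \begin{cases} n & \text{if } \chi = \psi \\ 0 & \text{if } \chi \neq \psi \end{cases}. \]

Likewise, for $a, b \in A$, then
\[ \sum_{\chi \in \hat{A}} \chi(a)\overline{\chi(b)} = n \cdot \delta_{a,b} = \begin{cases} n & \text{if } a = b \\ 0 & \text{if } a \neq b \end{cases}. \]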

Remark 18.7. Note that for $a \in A$ and $\chi \in \hat{A}$, we have $\chi(a)^n = \chi(a^n) = \chi(1) = 1$,
so in particular $|\chi(a)| = 1$. In this case, we have $\overline{\chi(a)} = 1/\chi(a)$. We will use this
repeatedly.
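
Both orthogonality relations of Proposition 18.6 can be checked numerically for a cyclic group. The sketch below takes A = Z/nZ written additively, with the characters χ_k(j) = e^{2πijk/n} from Example 18.3; the function names are invented for this illustration.

```python
import cmath

def character(k, n):
    """The character chi_k of Z/nZ (written additively): j -> exp(2*pi*i*j*k/n)."""
    return lambda j: cmath.exp(2j * cmath.pi * j * k / n)

def sum_over_group(k, l, n):
    """First relation: sum over a in Z/nZ of chi_k(a) * conj(chi_l(a))."""
    chi, psi = character(k, n), character(l, n)
    return sum(chi(a) * psi(a).conjugate() for a in range(n))

def sum_over_characters(a, b, n):
    """Second relation: sum over all characters chi of chi(a) * conj(chi(b))."""
    return sum(character(k, n)(a) * character(k, n)(b).conjugate() for k in range(n))

n = 12
print(abs(sum_over_group(5, 5, n)), abs(sum_over_group(5, 7, n)))            # about n, about 0
print(abs(sum_over_characters(3, 3, n)), abs(sum_over_characters(3, 4, n)))  # about n, about 0
```

Up to floating-point rounding, the sums are n when the two characters (or the two group elements) agree and 0 otherwise.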

Before we start the proof, we will prove the following lemma, which we have proven
before when the character is the Legendre symbol. In this lemma, 1 represents the
trivial character.

Lemma 18.8
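If $\chi \in \hat{A}$, then
\[ \sum_{a \in A} \chi(a) = \begin{cases} n & \text{if } \chi = 1 \\ 0 & \text{if } \chi \neq 1 \end{cases}. \]

Proof. If $\chi = 1$, this is clear from $n = |A|$. If $\chi \neq 1$, then $\chi(b) \neq 1$ for some $b \in A$. We
now take advantage of the group structure of $A$:
\[ \chi(b) \cdot \sum_{a \in A} \chi(a) = \sum_{a \in A} \chi(ab) = \sum_{a \in A} \chi(a), \]
as the multiplication-by-$b$ map $A \to A$ is a bijection. Hence $(\chi(b) - 1)\sum_{a \in A} \chi(a) = 0$, and the
conclusion follows from $\chi(b) \neq 1$.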

Now we prove Proposition 18.6.

Proof. Using $\overline{\psi(a)} = \psi(a)^{-1}$, we have
\[ \sum_{a \in A} \chi(a)\overline{\psi(a)} = \sum_{a \in A} \chi(a)\psi(a)^{-1} = \sum_{a \in A} (\chi \cdot \psi^{-1})(a) = \begin{cases} n & \text{if } \chi \cdot \psi^{-1} = 1 \\ 0 & \text{if } \chi \cdot \psi^{-1} \neq 1 \end{cases}, \]
where the last equality follows from Lemma 18.8 above. The first statement follows.
For the second statement, we can apply the first relation to the group $\hat{A}$, whose characters
are the maps $\psi_a$ by the isomorphism $\hat{\hat{A}} \simeq A$, to get our desired result.

We now bring ourselves back to (Z/mZ)× : we will apply Proposition 18.6 to A =


(Z/mZ)× .

Corollary 18.9
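If $\chi, \psi$ are Dirichlet characters mod $m$, then
\[ \sum_{a=0}^{m-1} \chi(a)\overline{\psi(a)} = \phi(m)\delta_{\chi,\psi}. \]

Likewise, if $a, b \in \mathbb{Z}$, where $(a, m) = (b, m) = 1$, then
\[ \sum_{\chi \text{ Dirichlet}} \chi(a)\overline{\chi(b)} = \phi(m)\delta_{a,b}, \]
where $\delta_{a,b}$ is 1 if $a \equiv b \pmod m$ and 0 otherwise.

Proof. Uh oh, we have a notation conflict here with the overline bar, but we will close
our eyes and push forward. Let $\chi$ and $\psi$ be extensions of characters $\bar\chi, \bar\psi$ on $(\mathbb{Z}/m\mathbb{Z})^\times$
(here the bar on the character is just a name, while the long bar over a value is complex conjugation).
Since $\chi$ and $\psi$ vanish on every $a$ with $(a, m) \neq 1$, we have
\[ \sum_{a=0}^{m-1} \chi(a)\overline{\psi(a)} = \sum_{a \in (\mathbb{Z}/m\mathbb{Z})^\times} \bar\chi(a)\overline{\bar\psi(a)}, \]
and the Proposition tells us that this equals $\phi(m)\delta_{\chi,\psi}$. The second statement follows in the
same way from the second relation of the Proposition.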

18.3 Dirichlet L-functions


Okay, we are done eating our vegetables. Let’s see how this pays off.
Fix $m \in \mathbb{N}$, and let $\chi : \mathbb{Z} \to \mathbb{C}$ be a Dirichlet character mod $m$. Consider the Dirichlet
L-function
\[ L(s, \chi) = \sum_{n=1}^{\infty} \chi(n) n^{-s}, \]
which converges for Re s > 1 by the same argument as in the $m = 4$ case (since $|\chi(n)| \leq 1$).
We have the Euler factorization
\[ L(s, \chi) = \prod_p (1 - \chi(p) p^{-s})^{-1} = \prod_{p \nmid m} (1 - \chi(p) p^{-s})^{-1}. \]

Let us see what happens when $\chi = 1$ is the trivial character. Then, we can recover
the Riemann zeta function minus finitely many primes; specifically,
\[ L(s, 1) = \prod_{p \nmid m} (1 - p^{-s})^{-1} = \zeta(s) \prod_{p \mid m} (1 - p^{-s}), \]
which is exactly how we defined $\zeta^*$ when $m = 4$.

19 11/20 - Dirichlet’s Theorem, Part II


Author’s Note 19.1. There was class on Friday, but since I was not here because
of Harvard-Yale and the intersection of people in today’s class and Friday’s class
is exactly one, Kisin decided to repeat this lecture.

Kisin starts this lecture with a recap of the definition of Dirichlet characters (§18),
the orthogonality conditions (§18.2), and Dirichlet L-functions (§18.3). So in short,
read the previous lecture notes!
We will pick up from the last equation line from the last lecture, namely the
L-function for the trivial character $\chi = 1$. Recall Proposition 17.3, which stated
$\lim_{s \to 1^+} (s-1)\zeta(s) = 1$. Then, we can evaluate
\begin{align*}
\lim_{s \to 1^+} (s-1)L(s, 1) &= \lim_{s \to 1^+} (s-1)\zeta(s) \prod_{p \mid m} (1 - p^{-s}) \\
&= 1 \cdot \prod_{p \mid m} (1 - p^{-1}) \\
&= \phi(m)/m,
\end{align*}
where $\phi(m)$ is, as it always has been, the Euler totient function. (The last equality just
follows from the formula we gave for $\phi$, see Lemma 4.7.)
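
The identity between the product over primes dividing m and ϕ(m)/m is easy to confirm with exact rational arithmetic. In this sketch, phi is computed naively by counting residues coprime to m, and the helper names are illustrative only.

```python
from fractions import Fraction
from math import gcd

def phi(m):
    """Euler's totient, computed naively by counting 1 <= a <= m with gcd(a, m) = 1."""
    return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)

def prime_divisors(m):
    """The distinct primes dividing m, by trial division."""
    primes = []
    p = 2
    while p * p <= m:
        if m % p == 0:
            primes.append(p)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:
        primes.append(m)
    return primes

def product_over_primes(m):
    """The product of (1 - 1/p) over the primes p dividing m, as an exact fraction."""
    result = Fraction(1)
    for p in prime_divisors(m):
        result *= 1 - Fraction(1, p)
    return result

for m in (4, 12, 30, 97):
    print(m, product_over_primes(m), Fraction(phi(m), m))
```

For every m printed, the two fractions coincide, matching the value of the limit computed above.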
We want to consider $\log L(s, \chi)$, in a sensibility akin to Corollary 17.4. This is
relevant because, ultimately, we care about the density of certain primes, and so we
need an expression that reflects the Dirichlet density. Given our factorization of $L(s, 1)$,
we can write
\[ \lim_{s \to 1^+} \frac{\log L(s, 1)}{\log(s-1)^{-1}} = \lim_{s \to 1^+} \frac{\log\zeta(s)}{\log(s-1)^{-1}} + \lim_{s \to 1^+} \frac{\log \prod_{p \mid m} (1 - p^{-s})}{\log(s-1)^{-1}}. \]

The product in the numerator of the latter term is a finite product that goes to some
nonzero value when $s \to 1^+$, so the latter term vanishes as $s \to 1^+$. (The denominator
is unbounded.) Thus, we have
\[ \lim_{s \to 1^+} \frac{\log L(s, 1)}{\log(s-1)^{-1}} = \lim_{s \to 1^+} \frac{\log\zeta(s)}{\log(s-1)^{-1}} = 1, \]
where the last equality again follows from Corollary 17.4.


We must proceed with caution when considering $\log L(s, \chi)$, though, since these
L-functions are complex-valued, but there is no single-valued continuous log function on $\mathbb{C}^\times$. To see
why, consider $\log z$ as $z$ goes around the unit circle. We know we can parameterize
$z = e^{i\theta}$ for $\theta \in \mathbb{R}$. Start with $z = e^{i \cdot 0} = 1$. As we go around the circle, we reach 1 again
when $z = e^{2\pi i}$. But log should be a continuous function, so from $e^{i \cdot 0}$ to $e^{2\pi i}$, log goes from 0
to $2\pi i$. But then we have obtained two distinct values for $\log 1$, which should never
happen for a well-defined function!
The way to rectify this is to take what we call a branch cut in complex analysis.
Basically, we have an obstruction on well-definedness when we make a full circle around
the origin. To avoid this, we basically remove a ray from the plane (cut out a branch,
if you will) so that we can never have a full revolution around the origin. For instance,
if we remove the positive real axis $\mathbb{R}^+$, then now we can play the same game without
any problems. Technically, log is not defined on $1 \in \mathbb{R}^+$ anymore, but if we were to go
from $1 + i\varepsilon$ to $1 - i\varepsilon$, then log would traverse the segment from 0 to $2\pi i$, and the branch
cut allows us to make the discontinuous jump from $2\pi i$ to 0 again.
So now we return to $\log L(s, \chi)$. We can recover the Taylor series of $-\log(1 - x)$: we
know $-\log(1 - x) = \int \frac{dx}{1-x} = \int (1 + x + x^2 + \cdots)\,dx = x + \frac{1}{2}x^2 + \frac{1}{3}x^3 + \cdots$. In this vein, denote
\[ G(s, \chi) = \sum_p \sum_{k=1}^{\infty} \frac{1}{k} \chi(p)^k p^{-ks}. \]

We claim that this is the logarithm of L(s, χ).

Lemma 19.2
G(s, χ) converges absolutely for Re s > 1, and exp(G(s, χ)) = L(s, χ) for Re s > 1.

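Proof. We first address convergence. It is not hard to see $\left|\frac{1}{k}\chi(p)^k p^{-ks}\right| \leq |p^{-ks}|$.
Therefore, using $|p^{-s}| = p^{-\operatorname{Re} s} \leq 1/2$,
\begin{align*}
|G(s, \chi)| &\leq \sum_p \sum_{k=1}^{\infty} \left|\tfrac{1}{k}\chi(p)^k p^{-ks}\right| \\
&\leq \sum_p \sum_{k=1}^{\infty} |p^{-ks}| \\
&= \sum_p |p^{-s}|(1 - |p^{-s}|)^{-1} \leq 2\sum_p |p^{-s}|,
\end{align*}
which we know converges for Re s > 1, completing the first part of the proof.
Now we show that $\exp(G(s, \chi)) = L(s, \chi)$. Using the Taylor series of the logarithm
$-\log(1 - x) = x + \frac{1}{2}x^2 + \frac{1}{3}x^3 + \cdots$ that I put above, we have
\[ \exp\left(\sum_{k=1}^{\infty} \frac{z^k}{k}\right) = (1 - z)^{-1}. \]
Take $z = \chi(p)p^{-s}$. Then, the above gives
\[ \exp\left(\sum_{k=1}^{\infty} \frac{1}{k}\chi(p)^k p^{-ks}\right) = (1 - \chi(p)p^{-s})^{-1}. \]
Both sides converge when $|z| < 1$, so we get
\begin{align*}
\exp(G(s, \chi)) &= \exp\left(\sum_p \sum_{k=1}^{\infty} \frac{1}{k}\chi(p)^k p^{-ks}\right) \\
&= \prod_p \exp\left(\sum_{k=1}^{\infty} \frac{1}{k}\chi(p)^k p^{-ks}\right) \\
&= \prod_p (1 - \chi(p)p^{-s})^{-1} = L(s, \chi),
\end{align*}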

as desired.
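
The identity exp(G(s, χ)) = L(s, χ) can be tested numerically, even at a complex value of s. The sketch below does this for the character mod 4 from Section 17.4; the truncation bounds and function names are arbitrary choices made for this illustration.

```python
import cmath

def chi4(n):
    """The character mod 4: 0 on even n, +1 if n ≡ 1 (mod 4), -1 if n ≡ 3 (mod 4)."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1

def is_prime(n):
    """Primality by trial division (fine for the small bounds used here)."""
    if n < 2:
        return False
    return all(n % q for q in range(2, int(n ** 0.5) + 1))

def G(s, prime_bound=10_000, max_k=60):
    """Truncation of G(s, chi): sum over primes p and k >= 1 of chi(p)^k p^(-ks) / k."""
    total = 0
    for p in range(2, prime_bound + 1):
        if is_prime(p):
            z = chi4(p) * p ** -s
            total += sum(z ** k / k for k in range(1, max_k + 1))
    return total

def L(s, terms=20_000):
    """Partial sum of L(s, chi) = sum over n of chi(n) * n^(-s)."""
    return sum(chi4(n) * n ** -s for n in range(1, terms + 1))

s = 2 + 1j
print(cmath.exp(G(s)), L(s))   # the two complex numbers should nearly agree
```

The two printed values agree to several decimal places; the small discrepancy comes only from truncating the sums.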

Now we reach a key step in our proof of Dirichlet’s Theorem, which describes the
behavior of G(s, χ). Again, we can think of G(s, χ) as the logarithm of L(s, χ) in some
sense.

Proposition 19.3
Define G(s, χ) as above.

1. If χ ≠ 1, then G(s, χ) is bounded near s = 1.


2. If $\chi = 1$, then $\lim_{s \to 1^+} \frac{G(s, 1)}{\log(s-1)^{-1}} = 1$.

We will only prove (2) for now. We will assume (1) to prove Dirichlet’s Theorem,
then we will go back to prove (1) afterwards.

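Proof of (2). We have seen before that we can write
\[ G(s, 1) = \sum_{p \nmid m} p^{-s} + \sum_{p \nmid m} \sum_{k=2}^{\infty} \frac{1}{k} p^{-ks}, \]
where the latter sum is bounded near $s = 1$ by the same estimate as in the proof of
Proposition 17.7. Thus, taking the limit as $s \to 1^+$ (omitting the finitely many primes dividing $m$
does not affect the limit), we get
\[ \lim_{s \to 1^+} \frac{G(s, 1)}{\log(s-1)^{-1}} = \lim_{s \to 1^+} \frac{\sum_{p \nmid m} p^{-s}}{\log(s-1)^{-1}} = 1 \]
as desired.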


19.1 Proof of Dirichlet’s Theorem


Now, we are finally ready to prove Dirichlet’s Theorem.

Theorem 19.4
If (a, m) = 1, then d(P (a, m)) = 1/ϕ(m).

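Proof. Like above, we will write

\[
G(s, \chi) = \sum_{p} \chi(p) p^{-s} + \sum_{p} \sum_{k=2}^{\infty} \frac{1}{k}\, \chi(p)^k p^{-ks};
\]

denote the latter sum as R_χ(s), which is a bounded “error” term. Then, as χ ranges over all Dirichlet characters modulo m, we compute (the bar denotes complex conjugation, and δ_{a,p} is 1 if p ≡ a (mod m) and 0 otherwise)

\[
\begin{aligned}
\sum_{\chi} \overline{\chi(a)}\, G(s, \chi) &= \sum_{\chi} \sum_{p} \overline{\chi(a)}\, \chi(p) p^{-s} + \sum_{\chi} R_\chi(s)\, \overline{\chi(a)} \\
&= \sum_{p} p^{-s} \sum_{\chi} \overline{\chi(a)}\, \chi(p) + \sum_{\chi} R_\chi(s)\, \overline{\chi(a)} \\
&= \sum_{p} p^{-s} \phi(m)\, \delta_{a,p} + R_{\chi,a}(s) \\
&= \phi(m) \sum_{p \equiv a\,(m)} p^{-s} + R_{\chi,a}(s), \qquad (*)
\end{aligned}
\]

where the second to last equality follows from the orthogonality relation given in Corollary 18.9 and R_{χ,a}(s) := Σ_χ R_χ(s) \overline{χ(a)}. Taking the “Dirichlet density” expression on the left and right, we see that the right hand side corresponds to the Dirichlet density of P(a, m). More explicitly, by Proposition 19.3 only the χ = 1 term on the left survives after dividing by log (s − 1)^{−1}, so we have

\[
\begin{aligned}
1 = \lim_{s \to 1^+} \frac{G(s, 1)}{\log (s - 1)^{-1}} &= \lim_{s \to 1^+} \frac{\sum_{\chi} \overline{\chi(a)}\, G(s, \chi)}{\log (s - 1)^{-1}} \\
&= \lim_{s \to 1^+} \frac{\phi(m) \sum_{p \equiv a\,(m)} p^{-s} + R_{\chi,a}(s)}{\log (s - 1)^{-1}} \\
&= \phi(m) \lim_{s \to 1^+} \frac{\sum_{p \equiv a\,(m)} p^{-s}}{\log (s - 1)^{-1}} \\
&= \phi(m) \cdot d(P(a, m)),
\end{aligned}
\]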

and the theorem follows.
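To see the statement of Theorem 19.4 in action, here is a small experiment that counts primes in each residue class coprime to m. It measures the fraction of primes up to a bound (a natural-density style count), which is not literally the Dirichlet density d(P(a, m)) used above, so treat it only as an illustration; the modulus 10 and the bound are arbitrary choices.

```python
from collections import Counter
from math import gcd

def primes_up_to(limit):
    """Sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(limit + 1) if sieve[i]]

def residue_class_fractions(m, limit):
    """Fraction of primes p <= limit (with p not dividing m) in each class a mod m."""
    counts = Counter(p % m for p in primes_up_to(limit) if gcd(p, m) == 1)
    total = sum(counts.values())
    return {a: counts[a] / total for a in range(m) if gcd(a, m) == 1}

if __name__ == "__main__":
    # phi(10) = 4, so each of the classes 1, 3, 7, 9 should get about 1/4.
    for a, frac in residue_class_fractions(10, 100_000).items():
        print(a, round(frac, 4))
```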


20 11/27 - Proving Proposition 19.3


Last time, we proved Dirichlet’s Theorem, modulo the first part of Proposition 19.3,
which states that G(s, χ) is bounded near s = 1 for nontrivial χ ̸= 1. Recall we defined
G(s, χ) such that exp(G(s, χ)) = L(s, χ) for Re s > 1.

20.1 Reducing to Analytic Continuation


Proving this boundedness fact for G(s, χ) is quite intense and is commonly not covered
in undergraduate courses, but Kisin is truly built different so here we are. We will try
to make this journey as easy as possible. We will prove the following:

Theorem 20.1
If χ ≠ 1, then the L-function

\[
L(s, \chi) = \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s}
\]

has analytic continuation to Re s > 0 and L(1, χ) ≠ 0.

We will show that this theorem implies our desired Proposition 19.3. But first, we
should define exactly what it means for a function to be analytic. This is a term from
complex analysis.

Definition 20.2 (Analytic Functions). If Ω ⊆ C is open, then f : Ω → C is analytic if for all z0 ∈ Ω, there exists some open neighborhood D ⊆ Ω containing z0 such that on D,

\[
f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n
\]

and the sum is a convergent power series.

Hidden beneath this definition are many incredible facts, which is kind of the reason
why complex analysis is such a beautiful subject. It is useful to think of analytic as the
complex analysis notion of differentiable. The magic is that, unlike in real analysis, if
a function is differentiable once, it is differentiable infinitely many times. This means
we can write f as an infinite power series, which are basically as good as one can get.
Assuming analytic continuation (Theorem 20.1), which we can now interpret as
meaning L(s, χ) can be extended to all of the right half-plane Re s > 0 as an analytic
function, we will prove Proposition 19.3.

Proof of 19.3. As L(1, χ) ̸= 0, there exists a small disc D ⊆ C centered at 1 such that


L(s, χ)|_D does not take the value 0. Choose a disc D′ around L(1, χ) whose closure does not contain 0, and shrink D if necessary so that L(s, χ) maps D into D′.
Now, choose a branch of the complex-valued logarithm defined on D′, and let G1(s, χ) = log(L(s, χ)) for s ∈ D. Then, on D ∩ {s | Re s > 1}, we have exp(G(s, χ)) = L(s, χ) = exp(G1(s, χ)). But exp is invariant under addition by 2πi, so on D ∩ {s | Re s > 1}, we have
G(s, χ) − G1(s, χ) = 2πin
for some n ∈ Z, and n is constant because the left hand side is continuous on a connected set. But since L(s, χ) takes values in D′, on which log is bounded, G1(s, χ) is bounded on D ∋ 1, which implies G(s, χ) is bounded around s = 1, as desired.

So now we have reduced our task to proving analytic continuation à la Theorem 20.1. This is not much of a reduction in the sense that this is still Really Hard, but we have now gotten to the core of the proof for Dirichlet’s Theorem to fall.

20.2 Analytic Continuation for Riemann Zeta


Although we are concerned when χ ̸= 1, it turns out we have a nice way of analytically
continuing, to some extent, L(s, χ) when χ = 1. Note that this is just the Riemann
zeta function ζ(s).

Proposition 20.3
ζ(s) − 1/(s − 1) can be analytically continued to {s ∈ C | Re s > 0}.

Before we prove this, we will prove this lemma, another result from complex analysis.

Lemma 20.4
Let {a_n}, {b_n} ⊆ C be sequences such that Σ_{n=1}^∞ a_n b_n converges. Let A_n = a_1 + a_2 + · · · + a_n. Suppose A_n b_n → 0 as n → ∞. Then,

\[
\sum_{n=1}^{\infty} a_n b_n = \sum_{n=1}^{\infty} A_n (b_n - b_{n+1})
\]

and the right hand side converges.

If you sit down with this a little bit, this looks to be true: you can cancel a lot of
terms on the right to just reduce to an bn terms, which remain on the left. The real
content is that the sum is convergent.


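Proof. Let S_N = Σ_{n=1}^N a_n b_n. (Also for formality, let A_0 = 0.) Then, we can write

\[
\begin{aligned}
S_N &= \sum_{n=1}^{N} (A_n - A_{n-1}) b_n = \sum_{n=1}^{N} A_n b_n - \sum_{n=1}^{N} A_{n-1} b_n \\
&= \sum_{n=1}^{N} A_n b_n - \sum_{n=1}^{N-1} A_n b_{n+1} = A_N b_N + \sum_{n=1}^{N-1} A_n (b_n - b_{n+1}).
\end{aligned}
\]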

Thus, taking the limit as N → ∞ and using A_N b_N → 0, we get

\[
\sum_{n=1}^{\infty} a_n b_n = \lim_{N \to \infty} S_N = \sum_{n=1}^{\infty} A_n (b_n - b_{n+1}).
\]

(This is the “you can cancel a lot of terms on the right” I was talking about.)
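The finite identity behind Lemma 20.4, S_N = A_N b_N + Σ_{n<N} A_n (b_n − b_{n+1}), can be checked exactly with rational arithmetic. The sketch below does this for arbitrary sample sequences; the function name and the sample sequences are illustrative assumptions.

```python
from fractions import Fraction
from itertools import accumulate

def summation_by_parts_sides(a, b):
    """Given a = [a_1, ..., a_N] and b = [b_1, ..., b_N], return the pair
    (S_N, A_N b_N + sum_{n=1}^{N-1} A_n (b_n - b_{n+1}))."""
    N = len(a)
    A = list(accumulate(a))  # A[n] holds the partial sum a_1 + ... + a_{n+1}
    lhs = sum(x * y for x, y in zip(a, b))
    rhs = A[-1] * b[-1] + sum(A[n] * (b[n] - b[n + 1]) for n in range(N - 1))
    return lhs, rhs

if __name__ == "__main__":
    a = [Fraction((-1) ** n * n) for n in range(1, 21)]
    b = [Fraction(1, n * n) for n in range(1, 21)]
    print(summation_by_parts_sides(a, b))
```

Because the arithmetic is exact, the two returned values are equal as fractions, not merely close.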

Now we prove Proposition 20.3. This is our first time really proving a function can
be analytically continued, so it may be useful to lay out the general principle first. A
priori, ζ(s) is defined for Re s > 1. To extend to Re s > 0, we will take a point z with
Re z > 1, then find a ball around z which goes beyond {Re s > 1}, then show that our
function at hand can be analytically defined on this ball.

Proof of 20.3. Applying the above lemma with a_n = 1 and b_n = n^{−s} (so A_n = n and A_n b_n = n^{1−s} → 0 for Re s > 1), we see that as Σ_{n=1}^∞ a_n b_n is exactly ζ(s), it follows that we can write

\[
\zeta(s) = \sum_{n=1}^{\infty} n \left( n^{-s} - (n+1)^{-s} \right).
\]

Let {x} = x − ⌊x⌋ ∈ [0, 1) denote the fractional part of x. Note that we can write

\[
n^{-s} - (n+1)^{-s} = s \int_n^{n+1} x^{-s-1}\, dx,
\]


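so

\[
\begin{aligned}
\zeta(s) &= \sum_{n=1}^{\infty} n \left( n^{-s} - (n+1)^{-s} \right) \\
&= \sum_{n=1}^{\infty} n \cdot s \int_n^{n+1} x^{-s-1}\, dx \\
&= s \sum_{n=1}^{\infty} \int_n^{n+1} \lfloor x \rfloor\, x^{-s-1}\, dx \\
&= s \int_1^{\infty} \lfloor x \rfloor\, x^{-s-1}\, dx \\
&= s \int_1^{\infty} x \cdot x^{-s-1}\, dx - s \int_1^{\infty} \{x\}\, x^{-s-1}\, dx \\
&= s \cdot \left[ \frac{x^{1-s}}{1-s} \right]_1^{\infty} - s \int_1^{\infty} \{x\}\, x^{-s-1}\, dx \\
&= 1 + \frac{1}{s-1} - s \int_1^{\infty} \{x\}\, x^{-s-1}\, dx.
\end{aligned}
\]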

We see that the first term has a pole at s = 1 (we are dividing by s − 1). But ’tis merely a scratch, since it is a simple pole which goes away if we subtract 1/(s − 1). (In fact, we are just left with 1.)
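The formula ζ(s) = 1 + 1/(s − 1) − s ∫_1^∞ {x} x^{−s−1} dx can be tested numerically at points where ζ(s) is known in closed form. In the sketch below the integral is cut off at an upper limit N (an assumption, introducing an error of at most about N^{−s}), and each piece ∫_n^{n+1} (x − n) x^{−s−1} dx is evaluated by its elementary antiderivative. The function names are made up for this example.

```python
import math

def frac_part_integral(s, N=100_000):
    """Integral of {x} x^(-s-1) over [1, N], summed interval by interval.
    On [n, n+1) we have {x} = x - n, so the integrand is x^(-s) - n x^(-s-1)."""
    total = 0.0
    for n in range(1, N):
        total += ((n + 1) ** (1 - s) - n ** (1 - s)) / (1 - s)
        total += n * ((n + 1) ** (-s) - n ** (-s)) / s
    return total

def zeta_via_formula(s, N=100_000):
    """Evaluate 1 + 1/(s-1) - s * (truncated integral); requires s != 1."""
    return 1 + 1 / (s - 1) - s * frac_part_integral(s, N)

if __name__ == "__main__":
    print(zeta_via_formula(2), math.pi ** 2 / 6)
    print(zeta_via_formula(4), math.pi ** 4 / 90)
```

The comparison values are the classical evaluations ζ(2) = π²/6 and ζ(4) = π⁴/90.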
We should check that the integral for the second term indeed converges and is analytic for Re s > 0 (i.e., all of our problems lie in the first term). But we know {x} ∈ [0, 1), so |{x}| < 1, meaning

\[
\left| \int_1^{\infty} \{x\}\, x^{-s-1}\, dx \right| \le \int_1^{\infty} \left| x^{-s-1} \right| dx = \int_1^{\infty} x^{-1 - \operatorname{Re} s}\, dx,
\]

which converges for Re s > 0 by just integrating like a high schooler would: for s ∈ R+, we have (check this!!)

\[
s \int_1^{\infty} x^{-s-1}\, dx = \left. -x^{-s} \right|_1^{\infty} = 1.
\]

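To show it is analytic on the right half-plane, it suffices to show it is analytic at s = 1, since that is the only pole of both ζ(s) and 1 + 1/(s − 1). We can manipulate

\[
\begin{aligned}
\int_1^{\infty} \{x\}\, x^{-s-1}\, dx &= \int_1^{\infty} \{x\}\, x^{-2} x^{1-s}\, dx = \int_1^{\infty} \{x\}\, x^{-2} e^{(\log x)(1-s)}\, dx \\
&= \int_1^{\infty} \{x\}\, x^{-2} \sum_{n=0}^{\infty} \frac{(\log x)^n}{n!} (1-s)^n\, dx \\
&= \sum_{n=0}^{\infty} \left( \int_1^{\infty} \{x\} \cdot \frac{x^{-2} (\log x)^n}{n!}\, dx \right) (1-s)^n .
\end{aligned}
\]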

Letting a_n be the integral in the sum above, we see that we have just expressed ∫_1^∞ {x} x^{−s−1} dx as a power series Σ_{n=0}^∞ a_n (1 − s)^n around s = 1, as desired.

Some additional remarks on analytic functions. If h1 ≢ 0 is a function analytic at s = 1, then we can write h1(s) = Σ_{n=0}^∞ a_n (s − 1)^n as a power series around s = 1. We denote ord_{s=1} h1 = min{i : a_i ≠ 0}. Let j = ord_{s=1} h1. Then, in the power series, we can factor out a (s − 1)^j and get h1(s) = (s − 1)^j · g1(s) for some analytic g1 such that g1(1) ≠ 0. (By construction of g1, the constant term of g1 as a power series around s = 1 is nonzero.)
If h2 is another function analytic at s = 1 with k = ord_{s=1} h2(s), then we can write h2(s) = (s − 1)^k g2(s) similarly. Then,

\[
\frac{h_2(s)}{h_1(s)} = \frac{(s-1)^k g_2(s)}{(s-1)^j g_1(s)} = (s-1)^{k-j}\, \frac{g_2(s)}{g_1(s)}
\quad \Longrightarrow \quad
\lim_{s \to 1^+} \frac{h_2(s)}{h_1(s)} =
\begin{cases}
0 & k > j, \\
c \ne 0 & k = j, \\
\infty & k < j.
\end{cases}
\]

21 12/01 - Proving Theorem 20.1


Last time, we showed ζ(s) − 1/(s − 1) can be analytically continued to Re s > 0. We now want to prove that L(s, χ) has analytic continuation (Theorem 20.1). We will first prove the following useful lemma.

Lemma 21.1
Let χ ≠ 1 be a Dirichlet character modulo m. Then, for every N ≥ 0,

\[
\left| \sum_{n=0}^{N} \chi(n) \right| \le \phi(m).
\]

Proof. We will denote χ0 = 1 also as the trivial character, as it is less awkward to write χ0(n) than 1(n). (This is the notation Kisin has maintained throughout the course, anyways.) As χ ≠ χ0, by the Orthogonality Relations (Corollary 18.9), we have

\[
0 = \sum_{n=0}^{m-1} \chi(n) \chi_0(n) = \sum_{n=0}^{m-1} \chi(n).
\]


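Write N = m · q + r for 0 ≤ r ≤ m − 1. Then, since χ is periodic modulo m,

\[
\begin{aligned}
\sum_{n=0}^{N-1} \chi(n) &= q \left( \sum_{n=0}^{m-1} \chi(n) \right) + \sum_{n=0}^{r-1} \chi(n) \\
\Longrightarrow \quad \left| \sum_{n=0}^{N-1} \chi(n) \right| &= \left| \sum_{n=0}^{r-1} \chi(n) \right| \le \sum_{n=0}^{m-1} |\chi(n)| = \phi(m),
\end{aligned}
\]

where the second equality follows from Σ_{n=0}^{m−1} χ(n) = 0.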
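For a concrete instance of Lemma 21.1, one can take the Legendre symbol modulo an odd prime p, which is a nontrivial real Dirichlet character modulo p. The sketch below (the choice p = 7 and the helper names are assumptions for illustration) checks that the sum over a full period vanishes and that the partial sums never exceed ϕ(p) = p − 1 in absolute value.

```python
def legendre(n, p):
    """Legendre symbol (n/p) for an odd prime p, via Euler's criterion."""
    r = pow(n % p, (p - 1) // 2, p)  # r is 0, 1, or p - 1
    return -1 if r == p - 1 else r

def partial_sums(p, N):
    """List of S(0), S(1), ..., S(N), where S(k) = sum_{n=0}^{k} (n/p)."""
    s, out = 0, []
    for n in range(N + 1):
        s += legendre(n, p)
        out.append(s)
    return out

if __name__ == "__main__":
    p = 7
    print(sum(legendre(n, p) for n in range(p)))     # sum over one full period
    print(max(abs(x) for x in partial_sums(p, 1000)))  # largest partial sum seen
```

In practice the partial sums stay well below ϕ(m); the lemma only needs some bound that is independent of N.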

21.1 Proof of Analytic Continuation of L(s, χ)


Now, incredibly, we can prove the first part of Theorem 20.1.

Proposition 21.2
If χ ̸= 1, then L(s, χ) has an analytic continuation to Re s > 0.

Proof. Let S(x) = Σ_{0≤n≤x} χ(n). We know L(s, χ) = Σ_{n=1}^∞ χ(n)n^{−s}. We will now invoke Lemma 20.4. Letting a_n = χ(n) and b_n = n^{−s}, we have Σ_{n=1}^∞ a_n b_n = L(s, χ) is convergent and A_n = a_1 + · · · + a_n = S(n). We can check, by the above lemma, that |A_n b_n| = |S(n)/n^s| ≤ ϕ(m)n^{−Re s} → 0 for Re s > 0. Then, by Lemma 20.4, we have

\[
\begin{aligned}
L(s, \chi) = \sum_{n=1}^{\infty} \chi(n) n^{-s} &= \sum_{n=1}^{\infty} S(n) \left( n^{-s} - (n+1)^{-s} \right) \\
&= \sum_{n=1}^{\infty} S(n) \left( s \int_n^{n+1} x^{-s-1}\, dx \right) \\
&= s \int_1^{\infty} S(x)\, x^{-s-1}\, dx \qquad (S(x) = S(\lfloor x \rfloor)).
\end{aligned}
\]

Again, by Lemma 21.1, which tells us |S(x)| ≤ ϕ(m) (in particular, this is bounded), we have that ∫_1^∞ S(x)x^{−s−1} dx converges absolutely for Re s > 0. (This is a generalization of the very last part of the proof for Proposition 20.3, replacing {x} with S(x) (or in general, any bounded function).)

21.2 Evaluating L(1, χ)


Now we work towards the second part of Theorem 20.1, which states L(1, χ) ̸= 0 for
χ ̸= 1.

Proposition 21.3
Let F(s) = Π_{χ mod m} L(s, χ). For s ∈ R such that s > 1, we have F(s) > 1.


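Proof. Recall G(s, χ), which satisfies exp(G(s, χ)) = L(s, χ) for Re s > 1, can be written as

\[
G(s, \chi) = \sum_{p} \sum_{k=1}^{\infty} \frac{1}{k}\, \chi(p^k) p^{-ks}.
\]

Through some hard work, we can obtain

\[
\begin{aligned}
\sum_{\chi \bmod m} G(s, \chi) &= \sum_{\chi \bmod m} \sum_{p} \sum_{k=1}^{\infty} \frac{1}{k}\, \chi(p^k) p^{-ks} \\
&= \sum_{p} \sum_{k=1}^{\infty} \frac{1}{k}\, p^{-ks} \sum_{\chi \bmod m} \chi(p^k) \\
&= \phi(m) \sum_{\substack{p,\, k \\ p^k \equiv 1\,(m)}} \frac{1}{k}\, p^{-ks} > 0,
\end{aligned}
\]

where the last equality follows because Σ_{χ mod m} χ(p^k) = 0 unless p^k ≡ 1 (mod m), in which case the sum is ϕ(m). (This can be seen from any of the orthogonality relations given in §18.2.) This implies

\[
F(s) = \prod_{\chi \bmod m} L(s, \chi) = \exp\left( \sum_{\chi \bmod m} G(s, \chi) \right) > 1
\]

as desired.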

Proposition 21.4
L(1, χ) = 0 for at most one nontrivial Dirichlet character χ ̸= 1.

We know ζ(s) − 1/(s − 1) has analytic continuation at s = 1, so we can write ζ(s) = 1/(s − 1) + g(s) = (s − 1)^{−1}(1 + (s − 1)g(s)) for some g(s) which is analytic at s = 1.
To prove this Proposition, we will recall the very end of last lecture regarding the order of s = 1 as a zero and writing h2(s)/h1(s) in terms of (s − 1) for analytic h1, h2. To recap, if j1 = ord_{s=1} h1(s) and j2 = ord_{s=1} h2(s), then

\[
\lim_{s \to 1^+} \frac{h_2(s)}{h_1(s)} =
\begin{cases}
0 & j_2 > j_1, \\
c \ne 0 & j_1 = j_2, \\
\infty & j_2 < j_1.
\end{cases}
\]

Proof. We can write

\[
F(s) = \prod_{\chi \bmod m} L(s, \chi) = L(s, 1) \prod_{\chi \ne 1} L(s, \chi).
\]


We know L(s, 1) = ζ(s) Π_{p|m} (1 − p^{−s}); the finite product on the right behaves perfectly well at s = 1, so L(s, 1) just has a pole at s = 1 of order 1, like ζ(s). Thus, we write L(s, 1) = 1/h2(s) with ord_{s=1} h2(s) = 1.
If χ ≠ 1 such that L(1, χ) = 0, then by definition ord_{s=1} L(s, χ) ≥ 1. If there were two such χ ≠ 1 such that L(1, χ) = 0, then the order of Π_{χ≠1} L(s, χ) at s = 1 would be ≥ 2. But then this would imply ord_{s=1} F(s) ≥ − ord_{s=1} h2(s) + 2 = 1, meaning lim_{s→1+} F(s) = 0, which we know from the above proposition is false, since F(s) > 1 for real s > 1. The conclusion follows.

Corollary 21.5
If χ ̸= 1 such that χ(Z) ̸⊆ R, then L(1, χ) ̸= 0.

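Proof. Suppose L(1, χ) = 0, so ord_{s=1} L(s, χ) > 0; we can write L(s, χ) = (s − 1)g(s) for some g(s) analytic at s = 1. Note that we can interpret χ(Z) ⊄ R as saying χ ≠ χ̄ (the complex conjugate character), as α = ᾱ ⟺ α ∈ R. So it makes sense to look at χ̄. For s ∈ R, s > 1, we have

\[
L(s, \overline{\chi}) = \sum_{n=1}^{\infty} \overline{\chi(n)}\, n^{-s} = \overline{\sum_{n=1}^{\infty} \chi(n) n^{-s}} = \overline{L(s, \chi)}.
\]

Letting s → 1+ and using continuity of both sides at s = 1, we get L(1, χ̄) = \overline{L(1, χ)} = 0. But then χ and χ̄ are two distinct nontrivial characters whose L-functions vanish at s = 1, which contradicts Proposition 21.4.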

So now we will demonstrate that L(1, χ) is indeed what we expect. To recap, we established that L(s, χ), which a priori was not defined at s = 1, can be analytically continued to s = 1. But we don’t know the value of L(1, χ); we just know it’s defined! We now check that it indeed agrees with the sum with s = 1 plugged in. Recall

\[
L(s, \chi) = \sum_{n=1}^{\infty} S(n) \left( n^{-s} - (n+1)^{-s} \right),
\]

where S(n) = Σ_{a=1}^n χ(a) and |S(n)| ≤ ϕ(m) by Lemma 21.1.
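As an illustration of plugging s = 1 into the summation-by-parts series, the sketch below evaluates Σ S(n)(1/n − 1/(n+1)) for the nontrivial character modulo 4, for which the Leibniz series gives L(1, χ) = 1 − 1/3 + 1/5 − · · · = π/4. The character and the truncation point N are choices made for this example; since |S(n)| ≤ 1 here, the truncation error is at most about 1/N.

```python
import math

def chi4(n):
    """Nontrivial Dirichlet character modulo 4 (chosen here for illustration)."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1

def L1_by_parts(N=100_000):
    """Partial sum of sum_{n>=1} S(n) (1/n - 1/(n+1)) with S(n) = chi(1)+...+chi(n)."""
    total, S = 0.0, 0
    for n in range(1, N + 1):
        S += chi4(n)
        total += S * (1.0 / n - 1.0 / (n + 1))
    return total

if __name__ == "__main__":
    print(L1_by_parts(), math.pi / 4)
```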

22 12/04 - Last Lecture


And just like that, our semester comes to a close. Not before we end with a bang,
though! (By bang, we mean completely finishing the proof of Dirichlet’s Theorem by
showing L(1, χ) ̸= 0 given just χ ̸= 1.)
Last time, we showed that this is the case when χ(Z) ⊄ R. Today, we will assume χ(Z) ⊂ {−1, 0, 1}. We will show L(1, χ) = Σ_{n=1}^∞ χ(n)/n ≠ 0. Let me state this as an actual result.


Proposition 22.1
If χ is a Dirichlet character modulo m and χ(Z) ⊂ {−1, 0, 1}, then L(1, χ) = Σ_{n=1}^∞ χ(n)/n ≠ 0.

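Proof. Let c_n = Σ_{d|n} χ(d). (We will see in a bit why we are considering this sum.) Suppose (n, m) = 1 (for this one computation, m stands for any integer coprime to n, not the modulus). If d | nm, write d = d1 d2 such that d1 | n, d2 | m. Then, since χ(d1 d2) = χ(d1)χ(d2),

\[
c_{nm} = \sum_{d \mid nm} \chi(d) = \left( \sum_{d_1 \mid n} \chi(d_1) \right) \left( \sum_{d_2 \mid m} \chi(d_2) \right) = c_n c_m .
\]

So it suffices to consider when n = p^a is some prime power. We can explicitly compute, depending on the value of χ(p) (with m again the modulus of χ),

\[
c_{p^a} = 1 + \chi(p) + \chi(p^2) + \cdots + \chi(p^a) =
\begin{cases}
1 & p \mid m, \\
a + 1 & \chi(p) = 1, \\
0 \text{ or } 1 & \chi(p) = -1;
\end{cases}
\]

in particular, c_{p^a} ≥ 0 always and ≥ 1 if a is even. By multiplicativity, c_n ≥ 0 for all n and c_n ≥ 1 whenever n is a perfect square. This shows that Σ_{n=1}^∞ c_n is unbounded.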
Now, let

\[
f(t) = \sum_{n=1}^{\infty} \chi(n)\, \frac{t^n}{1 - t^n}
\]

where t ∈ (0, 1). Note that for n ≥ 1, we have 1 − t^n ≥ 1 − t, so

\[
\left| \chi(n)\, \frac{t^n}{1 - t^n} \right| \le \frac{t^n}{1 - t} = \frac{1}{1 - t} \cdot t^n,
\]

and since then summing over all n ≥ 1 gives a geometric series, we see that f(t) converges absolutely for all t ∈ (0, 1).
Furthermore, for each term in the sum, we can expand

\[
\chi(d)\, \frac{t^d}{1 - t^d} = \chi(d)\, t^d \left( 1 + t^d + t^{2d} + \cdots \right) = \chi(d) \left( t^d + t^{2d} + t^{3d} + \cdots \right).
\]

Now, summing over all such terms, we get

\[
f(t) = \sum_{n=1}^{\infty} \left( \sum_{d \mid n} \chi(d) \right) t^n = \sum_{n=1}^{\infty} c_n t^n,
\]

so aha, this is why our c_n sums are useful. But we showed that the sum of c_n is unbounded! Since every c_n ≥ 0, this means lim_{t→1−} f(t) = ∞. We will rely on this to reach a contradiction, so keep this in mind.
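Both facts used so far, the non-negativity of c_n and the identity Σ χ(n) t^n/(1 − t^n) = Σ c_n t^n, are easy to test numerically. The sketch below does so for the nontrivial character modulo 4; that character, the value t = 0.5, and the truncation N are assumptions made for this example.

```python
def chi4(n):
    """Nontrivial Dirichlet character modulo 4 (chosen here for illustration)."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1

def c(n):
    """c_n = sum of chi(d) over the divisors d of n."""
    return sum(chi4(d) for d in range(1, n + 1) if n % d == 0)

def lambert_side(t, N):
    """Truncation of sum_n chi(n) t^n / (1 - t^n)."""
    return sum(chi4(n) * t ** n / (1 - t ** n) for n in range(1, N + 1))

def power_series_side(t, N):
    """Truncation of sum_n c_n t^n."""
    return sum(c(n) * t ** n for n in range(1, N + 1))

if __name__ == "__main__":
    print([c(n) for n in range(1, 17)])
    print(lambert_side(0.5, 200), power_series_side(0.5, 200))
```

The two truncations differ only by terms of size about t^N, so at t = 0.5 and N = 200 they agree to machine precision.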


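Suppose for the sake of contradiction that L(1, χ) = Σ_{n=1}^∞ χ(n)/n = 0. We can cleverly use this as follows:

\[
\begin{aligned}
-f(t) &= \left( \sum_{n=1}^{\infty} \frac{\chi(n)}{n} \right) (1 - t)^{-1} - f(t) \\
&= \sum_{n=1}^{\infty} \chi(n) \left( \frac{1}{n(1 - t)} - \frac{t^n}{1 - t^n} \right) \\
&=: \sum_{n=1}^{\infty} \chi(n)\, b_n
\end{aligned}
\]

where b_n = b_n(t) = 1/(n(1 − t)) − t^n/(1 − t^n). We can compute b_1 = 1/(1 − t) − t/(1 − t) = 1. Furthermore,

\[
\lim_{n \to \infty} b_n(t) = \lim_{n \to \infty} \left( \frac{1}{n(1 - t)} - \frac{t^n}{1 - t^n} \right) = \lim_{n \to \infty} \left( - \frac{t^n}{1 - t^n} \right) = 0.
\]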

We claim that the b_n’s form a non-increasing sequence, that is, b_1 ≥ b_2 ≥ · · · for each t ∈ (0, 1). We will prove this shortly. Assuming this claim, though, we have

\[
-f(t) = \sum_{n=1}^{\infty} \chi(n)\, b_n = \sum_{n=1}^{\infty} S(n) (b_n - b_{n+1})
\]

where like in previous lectures, S(n) = Σ_{j=1}^n χ(j). The last equality follows from Lemma 20.4, which is possible to invoke because |S(n)| ≤ ϕ(m) (by Lemma 21.1) and b_n → 0, so S(n)b_n → 0. Using the simple bound |S(n)(b_n − b_{n+1})| ≤ ϕ(m)|b_n − b_{n+1}| = ϕ(m)(b_n − b_{n+1}), we have

\[
\begin{aligned}
|f(t)| &= \left| \sum_{n=1}^{\infty} S(n) (b_n - b_{n+1}) \right| \\
&\le \sum_{n=1}^{\infty} \left| S(n) (b_n - b_{n+1}) \right| \\
&\le \sum_{n=1}^{\infty} \phi(m)\, |b_n - b_{n+1}| \\
&= \sum_{n=1}^{\infty} \phi(m) (b_n - b_{n+1}) \qquad \text{(using claim)} \\
&= \phi(m)\, b_1 = \phi(m),
\end{aligned}
\]

where the step marked “using claim” holds because we assumed the b_n’s form a non-increasing sequence, and the final sum telescopes because b_n → 0. But this means |f(t)| is bounded above by a constant for all t. This contradicts our conclusion that lim_{t→1−} f(t) = ∞, and the proposition follows.
It remains to prove the claim that b_1 ≥ b_2 ≥ · · · . Bear with me as we proceed with some computations, using (1 − t)/(1 − t^n) = 1/(1 + t + · · · + t^{n−1}):

\[
\begin{aligned}
(1 - t)(b_n - b_{n+1}) &= \frac{1}{n} - \frac{t^n}{1 + t + \cdots + t^{n-1}} - \frac{1}{n+1} + \frac{t^{n+1}}{1 + t + \cdots + t^n} \\
&= \frac{1}{n(n+1)} - \frac{t^n (1 + t + \cdots + t^n) - t^{n+1} (1 + t + \cdots + t^{n-1})}{(1 + t + \cdots + t^n)(1 + t + \cdots + t^{n-1})} \\
&= \frac{1}{n(n+1)} - \frac{t^n}{(1 + t + \cdots + t^n)(1 + t + \cdots + t^{n-1})} .
\end{aligned}
\]

Now, we invoke the AM-GM inequality,¹³ which states that AM ≥ GM. Using this here, we can conclude

\[
\begin{aligned}
& \frac{1 + t + t^2 + \cdots + t^{n-1}}{n} \ge \left( t^{\frac{n(n-1)}{2}} \right)^{1/n} = t^{\frac{n-1}{2}} \\
\Longrightarrow \quad & 1 + t + t^2 + \cdots + t^{n-1} \ge n \cdot t^{\frac{n-1}{2}} \ge n \cdot t^{n/2} \\
\Longrightarrow \quad & 1 + t + \cdots + t^n \ge (n+1)\, t^{n/2} \\
\Longrightarrow \quad & (1 - t)(b_n - b_{n+1}) \ge \frac{1}{n(n+1)} - \frac{t^n}{n(n+1)\, t^n} = 0,
\end{aligned}
\]

so indeed b_n − b_{n+1} ≥ 0. Hooray!
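The monotonicity claim can also be spot-checked in floating point. The sketch below evaluates b_n(t) = 1/(n(1 − t)) − t^n/(1 − t^n) for a few sample values of t and tests that consecutive differences are non-negative up to a small tolerance; the sample values, the range of n, and the tolerance are assumptions, and a numerical check like this is of course no substitute for the proof.

```python
def b(n, t):
    """b_n(t) = 1 / (n (1 - t)) - t^n / (1 - t^n) for 0 < t < 1."""
    return 1.0 / (n * (1.0 - t)) - t ** n / (1.0 - t ** n)

def is_non_increasing(t, N=200, tol=1e-9):
    """Check b_1 >= b_2 >= ... >= b_N up to a floating-point tolerance."""
    return all(b(n, t) - b(n + 1, t) >= -tol for n in range(1, N))

if __name__ == "__main__":
    for t in (0.1, 0.5, 0.9, 0.99):
        print(t, b(1, t), is_non_increasing(t))
```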

At last, all parts of the proof of Dirichlet’s theorem are covered. Free at last, free
at last...

22.1 So... what is an L-function?


Yeah, so what exactly are these things? Clearly they contain a lot of information: the
pole of ζ(s) at s = 1 implies the infinitude of primes, and the convergence of L(s, χ)
and the nonvanishing of L(1, χ) is enough to prove Dirichlet’s Theorem. But here is
a bird’s eye view of L-functions, which dives into the most cutting edge of modern
number theory.
In summary, L-functions are attached to algebro-geometric objects (objects from algebraic geometry). This is best illustrated by example: consider the curve defined by y² = x(x − 1)(x − λ) for some λ ∈ Q, λ ≠ 0, 1. (For the adults out there, this is an example of an elliptic curve.) Here’s what Desmos gives me for λ = −1.
Below is an image of the curve over R. What does it look like over C? This is a bit difficult to imagine, since to visualize any map where our two coordinates are in C, we would need four dimensions, which is too big for a mortal like me. But it turns out, by some really cool theory of elliptic curves over C, that this is a complex torus! It is, in more sophisticated terms, a complex Riemann surface of genus 1 (meaning it has one hole). We can realize the below graph as a vertical cross-section of the complex torus, where the rightmost component of the graph can be realized as a circle passing through the point at infinity.

[Figure: graph of y² = x(x − 1)(x + 1) over R]

¹³ AM stands for arithmetic mean, which is your normal “sum then divide by number of terms” mean, while GM is the geometric mean, where instead of summing you multiply, and instead of dividing by the number of terms, you take the nth root where n is the number of terms. For instance, the geometric mean of 2, 3, and 36 is ∛(2 · 3 · 36) = 6.
But R and C are not special: we can consider this over any field. Q is of utmost importance because it is related to the integers in an obvious way, e.g. if we solve a^n + b^n = 1 over Q, then we have solved a^n + b^n = c^n over Z.
This curve has even more structure than, well, just being a curve. There is a way to add two points on the curve to get another point. This endows the points on the curve with an operation, and so the points, one can show, form a group! In fact, it is an abelian group. Denote E as the elliptic curve, and let E(K) be the points of E over K. For instance, E(C) are the points on E where both coordinates are in C, and this forms a torus. Here is an incredible result:

Theorem 22.2 (Mordell’s Theorem)
E(Q) is a finitely generated group.

Now, how is this related to L-functions? Well, I’m going to construct an L-function for you. Let a_p = p + 1 − |E(F_p)| (yes, this seems a bit out of nowhere), and define L_p(X) = 1 − a_p X + pX² for “good” primes (ignore this for now, it is a technicality). Then, consider the L-function

\[
L(E, s) := \prod_{p} L_p(p^{-s})^{-1} = \prod_{p} \left( 1 - a_p p^{-s} + p^{1-2s} \right)^{-1}.
\]

This converges for Re s > 3/2. But even better, this exhibits very nice properties akin
to what we proved for our L-functions attached to Dirichlet characters:

Theorem 22.3 (Wiles, Taylor-Wiles, ...)


L(E, s) has analytic continuation to all of C, and satisfies a functional equation relating L(E, s) to L(E, 2 − s).
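To make the coefficients a_p = p + 1 − |E(F_p)| concrete, the following sketch counts points on y² = x(x − 1)(x − λ) over F_p by brute force, for the λ = −1 example from above. The function names are invented for this example, only odd primes are used (p = 2 is a bad prime for this curve), and the Hasse bound |a_p| ≤ 2√p mentioned in the comment is a standard theorem that was not covered in these notes.

```python
import math

def count_points(p, lam=-1):
    """Number of points on y^2 = x (x - 1) (x - lam) over F_p,
    counting the single point at infinity."""
    square_roots = {}  # square_roots[v] = number of y in F_p with y^2 = v
    for y in range(p):
        v = y * y % p
        square_roots[v] = square_roots.get(v, 0) + 1
    affine = sum(square_roots.get(x * (x - 1) * (x - lam) % p, 0) for x in range(p))
    return affine + 1

def a_p(p, lam=-1):
    """The coefficient a_p = p + 1 - |E(F_p)|."""
    return p + 1 - count_points(p, lam)

if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13, 17, 19, 23, 29, 31):
        # Hasse's theorem (not proved here) predicts |a_p| <= 2 sqrt(p).
        print(p, a_p(p), abs(a_p(p)) <= 2 * math.sqrt(p))
```

For this particular curve the output should show a_p = 0 at the primes p ≡ 3 (mod 4).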

Let’s relate them more explicitly. Oh wait, we can’t actually, because nothing’s been proven yet. But at least we can state some really important conjectures. Here’s one of the Millennium Problems:


(Birch and Swinnerton-Dyer Conjecture) Let r be the rank of E(Q) as a Z-module (this is well-defined because by Mordell’s Theorem, E(Q) is finitely generated). Then, r = ord_{s=1} L(E, s). In general, for an algebraic variety X/Q,

\[
L(X, s) = \prod_{p} L_{X,p}(p^{-s})^{-1}
\]

for some polynomials L_{X,p} arising from point counts of X(F_p).

Here’s another one:

L(X, s) has analytic continuation and satisfies a functional equation. The special values of this L-function correspond to cycles on X, and there is an equivalent of the Riemann Hypothesis for these L-functions.

What we did in class was a baby example of this: Dirichlet L-functions correspond to 0-dimensional algebraic varieties, in particular the ones defined by the equation z^m = 1, which give birth to the number field Q(ζ_m). So yeah, geometry is tied with these L-functions, which mostly live in the realm of analysis but are completely tied with number theory. All of this is really beautiful, and I would really encourage you all to take a look at some of these things at some point in your academic journey!
