Introduction
The purpose of this document is to prove Dirichlet’s theorem on primes in arithmetic progression, published in 1837 (Dickson 2005, 415).
We know from Euclid that there is an infinitude of primes (Euclid 1956, ix.20, v. ii, pp. 412–3). Dirichlet’s theorem is that, for every modulus k, for every l that is prime to k, there is an infinitude of primes belonging to the sequence
(l, l + k, l + 2k, l + 3k, …),
meaning they are congruent to l modulo k.
The proof use theorems about real and complex numbers. To contemplate this is a reason to compose these notes.
Background for the proof is
-
some elementary number theory, such as the Euler φ function;
-
primitive roots or else the classification of finite abelian groups;
-
real analysis through uniform convergence of series;
-
moduli of complex numbers.
Sources
Our proof of Dirichlet’s theorem will be based on that of Landau, given in the first semester of his six-semester course (Landau 1958, 109–25). I rearrange the steps, attempting to clarify why each of them is taken.
Perhaps Landau was demanding or perverse in proving the theorem so early in his course. Hardy and Wright state the result, but then say, “The proof of this theorem is too difficult for insertion in this book” (Hardy and Wright 1979, 13–14), even though they prove of the Prime Number Theorem.
There is a proof of Dirichlet’s theorem in Cohn’s Advanced Number Theory (Cohn 1980, 174–9), but following “[Dirichlet’s] considerably roundabout method of using the so-called theory of the class number of quadratic forms,” as Landau describes it (Landau 1958, 121).
The Princeton Companion to Mathematics has relevant coverage in
-
Chapter 47, “L-Functions” (Buzzard 2008), of Part III, “Mathematical Concepts”;
-
§ 4, “Primes in Arithmetic Progression,” of Chapter 2, “Analytic Number Theory” (Granville 2008) of Part IV, “Branches of Mathematics.”
In 1949, Selberg published “a new and more elementary proof of [Dirichlet’s] theorem. More elementary in the respect that we do not use the complex characters mod k, and also in that we consider only finite sums” (Selberg 1949). Specifically, for some Ck, for x large enough, modulo k,
∑p ≡ l ∧ p ≤ xlog p/p > Cklog x.
In the paper itself, log p here is written as log p1; later, the term 2log plog q appears to have been changed to log plog q2. The exponents are not mathematical, but references to footnotes. The typographical offense is symbolic of the difficulty of the paper. My sense is corroborated by R. A. Rankin in Mathematical Reviews: “Although the general trend of the argument is clear the paper is not easy to read since the algebraic steps necessary for the estimation of the order of magnitude of sums are mostly omitted.”
Credentials
I am not trained in analytic number theory as such. I had a course in algebraic number theory in graduate school, and my dissertation was on the logic of elliptic curves. Later I taught first and second undergraduate courses in number theory.
Twice in courses at the Nesin Mathematics Village, in August, 2018, and February, 2019, I have worked through the details of a proof of the Prime Number Theorem. First proved in 1896 (Dickson 2005, 439), the Theorem is that
limn → ∞#{p: p ≤ n}log n/n = 1.
I followed the exposition of Zagier (Zagier 1997), which makes heavy use of complex analysis. It is natural then to look also at Dirichlet’s Theorem, which is, for Morris Kline, “The first deep and deliberate use of analysis to tackle what seemed to be a clear problem of algebra” (Kline 1972, 829).
Mathematics as such
If one believes there is a real debate over whether mathematics is discovered or invented, then such results as Dirichlet’s and the Prime Number Theorem would seem to be evidence for the former position. Who could have expected to find a connection between the discrete objects called counting numbers and the continuously varying objects called real and complex numbers?
I don’t actually think there is something to debate. One may observe that mathematics resembles, in some respects, a voyage of discovery; in others, toil in a laboratory to solve a practical problem.
What is important is that mathematics requires deductive proof, whether in establishing that what has been discovered is what it seems to be, or what has been invented does the job it is supposed to.
Special cases
There are ad hoc elementary methods of proving special cases of Dirichlet’s theorem. Such proofs may be seen in a first undergraduate course of number theory. For example, suppose
(q1, q2, …, qn)
is some finite list of primes. Then the list contains no prime factor of the difference
4q1⋯qn − 1.
Since this difference is congruent to 3 modulo 4, not all of its prime factors can be congruent to 1 modulo 4. The only other option is being congruent to 3 modulo 4. Thus there is an infinitude of such primes (Hardy and Wright 1979 Thm 11, p. 13).
There is also an infinitude of primes that are congruent to 1 modulo 4 (Burton 2007 Thm 9.3, pp. 177–8). For let
s = 2q1⋯qn,
and let p be a prime factor of s2 + 1. Then p is odd and is not one of the qi. Moreover, modulo p,
s2 ≡ − 1,
and so, by Fermat’s Theorem, letting
ϖ = p − 1/2
(the notation is used at (Hardy and Wright 1979, 87)), we have
1 ≡ sp − 1 ≡ (s2)ϖ ≡ ( − 1)ϖ.
Consequently ϖ must be even, so
p ≡ 1 (mod 4).
Goal
We proceed to the proof of Dirichlet’s theorem. We fix a modulus k and a number ℓ prime to this, while we let range
-
p over the primes,
-
a over the counting numbers,
-
m over the counting numbers exceeding 1,
-
s over the real numbers exceeding 1.
Other letters will represent counting numbers, real numbers, or complex numbers, as needed.
I say counting numbers, since it should be obvious that these start with 1; the set of them is ℕ. If we needed a name for the numbers that start with 0, these could be called the natural numbers, and their set would be ω.
We shall use the abbreviation
h = φ(k) = #{a: a ≤ k ∧ gcd (a, k) = 1}.
Using our modulus k, we shall show the infinitude of the set
{p: p ≡ ℓ}
by showing
(1) lims → 1+∑p ≡ ℓlog p/ps = ∞.
This is how real analysis will come into play. If there were only finitely many primes congruent to ℓ modulo k, then the sum in (1) would have finitely many terms; it would then be bounded, since each summand is bounded.
The divergence of the sum in (1) will follow from the divergence of another sum, where the indices comprise all numbers congruent to ℓ. The new sum will involve the Mangoldt function, Λ, where Λ(a) is
-
log p, if a is a power of p,
-
0, if a is not a prime power.
We shall use later that
(2) log a = ∑d ∣ aΛ(d).
Sums of the same form will recur in (15) and (19). Meanwhile, recalling our convention that m ≥ 2, we separate out from the sum in (1) an “error term” that will be bounded:
∑p ≡ ℓlog p/ps = ∑a ≡ ℓΛ(a)/as − ∑pm ≡ ℓlog p/pms,
From the rule for geometric series,
∑p, mlog p/pm = ∑plog p∑m1/pm
= ∑plog p/p(p − 1) ≤ ∑mlog m/m(m − 1),
so that
∑pm ≡ ℓlog p/pms ≤ ∑p, mlog p/pm
≤ ∑mlog m/m(m − 1) ≤ ∑m2log m/m2 < ∞.
Thus, for (1), it is enough to show
(3) lims → 1+∑a ≡ ℓΛ(a)/as = ∞.
We want to turn the sum into a sum over all counting numbers. We do this with characters.
Characters
A character is a function, henceforth to be denoted by χ, from ℕ to ℂ satisfying
a ≡ b ⟹ χ(a) = χ(b),
gcd (a, k) = 1 ⇔ χ(a) ≠ 0,
χ(ab) = χ(a)χ(b).
Thus χ is 0 on the complement of
{a: gcd (a, k) = 1},
and the restriction of χ to the latter set factorizes through (ℤ/kℤ)× as a homomorphism into ℂ×. The range of this homomorphism must then be a subgroup of the finite cyclic group
{z ∈ ℂ×: zh = 1}.
For example, the principal character is χ0, where χ0(a) is
-
1 if gcd (a, k) = 1,
-
0, otherwise.
We shall see other examples presently. They exist by the following, unless k ≤ 2.
Theorem 1. The order of the group of characters is itself h, that is,
(4) ∑χ1 = h.
Proof. For any finite abelian group G, the number of homomorphisms into a cyclic group H of the same order is that order, by the classification of finite abelian groups. Indeed, G is a direct product of cyclic subgroups, each having an order d measuring that of G; and then that subgroup is mapped into the d-element cyclic subgroup of H and can be so mapped in d ways. □
Landau gives a purely number-theoretic proof, of which the foregoing is a generalization. We shall review this, since we shall use some of it later. We are still going to use the language of group theory. If there is a homomorphism ψ from the multiplicative group (ℤ/kℤ)× of the ring ℤ/kℤ to the cyclic group ℤn of order n, then there is a character χ given by the rule that, when gcd (a, k) = 1, then
(5) χ(a) = exp (2πi ψ(a)/n).
Given now a number d that is prime to k, but is not 1 modulo k, we show that some χ defined as in (5) is nontrivial at d. We use for this the theory of primitive roots (Landau 1958, 104–8). We shall not need the details, but here they are. For some prime power pb,
pb ∣ k,
d ≢ 1 (mod pb).
There are two cases:
-
p is odd, or p = 2 and b ≤ 2.
-
p = 2 and b > 2, but d ≡ 1 (mod 22).
We take these up in turn as follows.
-
Except when p = 2 and b > 2, every prime power pb has a primitive root r, which means r generates the multiplicative group of the ring ℤ/pbℤ. Thus
(ℤ/pbℤ)× = ⟨r⟩.
There is now an isomorphism x ↦ rx from the cyclic group ℤφ(pb) to ⟨r⟩. Let us denote the inverse of this isomorphism by logr. If pb ∣ k, then logr is well defined on (ℤ/kℤ)×, since this maps onto (ℤ/pbℤ)×.
-
If b > 2, then 2b has no primitive root, but 5 is the next best thing, in the sense that
(ℤ/2bℤ)× ≅ ⟨5⟩ × ⟨2b − 1⟩.
Thus there are homomorphisms
-
x ↦ |x| from (ℤ/2bℤ)× onto ⟨5⟩,
-
log5 from ⟨5⟩ onto ℤ2b − 2.
-
Now we can complete the number-theoretic the proof of (4).
-
If χ is not χ0, then, a ranging over a reduced (or complete) set of positive residues,
(6) ∑aχ(a) = 0,
since for any b prime to k,
∑aχ(a) = ∑aχ(ba) = χ(b)∑aχ(a),
and we may choose b so that χ(b) ≠ 1.
-
If a is prime to k, but not 1 modulo k, then
(7) ∑χχ(a) = 0,
since for any character ρ,
∑χχ(a) = ∑χ(ρχ)(a) = ρ(a)∑χχ(a),
and we may choose ρ so that ρ(a) ≠ 1.
We obtain (4) by computing a sum in two ways, using (6) and (7) respectively:
∑(a, χ)χ(a) = ∑a∑χχ(a) = ∑χχ(1) = ∑χ1,
and
∑(a, χ)χ(a) = ∑χ∑aχ(a) = ∑aχ0(a) = ∑a1 = h.
This completes the number-theoretic proof of Theorem 1.
We immediately apply the Theorem and (7) to eliminate the explicit dependence on congruence class in the sum in (3). Letting j be the inverse of ℓ modulo k, we obtain
(8) ∑a ≡ ℓΛ(a)/as = ∑aΛ(a)/ash ⋅ ∑χχ(ja)
= ∑aΛ(a)/ash ⋅ ∑χχ(a)/χ(ℓ) = 1/h∑χ1/χ(ℓ) ⋅ ∑aχ(a)Λ(a)/as.
L-series
The last sum is an example of an L-series. Recalling the convention that s ranges over (1, ∞), we define
(9) L(s, χ) = ∑aχ(a)/as,
noting that the series is convergent, even absolutely convergent. Convergence is uniform away from 1, that is, on intervals [1 + ε, ∞), where ε > 0, by the Cauchy condition (Apostol 1974 Thm 9.5, p. 223). The same properties are held by the formal derivative, given by
(10) L′(s, χ) = − ∑aχ(a)log a/as.
Therefore L′(s, χ) really is the derivative of L(s, χ) (Apostol 1974 Thm 9.14, p. 230). Moreover,
(11) L(s, χ)∑aχ(a)Λ(a)/as = ∑bχ(b)/bs ⋅ ∑aχ(a)Λ(a)/as
= ∑a, bχ(ab)Λ(a)/(ab)s = ∑cχ(c)/cs ⋅ ∑a ∣ cΛ(a),
and thus, by (2),
(12) ∑aχ(a)Λ(a)/as = − L′(s, χ)/L(s, χ).
Combining this with (8), we have now
∑a ≡ ℓΛ(a)/as = − 1/h∑χL′(s, χ)/χ(ℓ)L(s, χ).
Thus we shall establish (3) by showing that, as s approaches 1 from above, L′(s, χ)/L(s, χ) is
-
unbounded when χ = χ0,
-
bounded when χ ≠ χ0.
For the unboundedness, from a special case of (12) we have
− L′(s, χ0)/L(s, χ0) = ∑gcd (a, k) = 1Λ(a)/as
= ∑aΛ(a)/as − ∑gcd (a, k) > 1Λ(a)/as,
where
∑gcd (a, k) > 1Λ(a)/as = ∑p ∣ klog p∑b1/pbs
= ∑aΛ(a)/as − ∑p ∣ klog p/ps − 1.
The second term is a sum of finitely many addends, each bounded. For the first term, note that
∑aΛ(a)/a ≥ ∑p1/p = ∞.
Thus for each N there is m large enough that
∑1 ≤ a ≤ mΛ(a)/a ≥ N + 1,
and then there is s close enough to 1 that
∑1 ≤ a ≤ mΛ(a)/as ≥ N.
It remains to establish the boundedness of L′(s, χ)/L(s, χ) when χ ≠ χ0.
Bounds
Lemma 1. For all nonprincipal χ, whenever i ≤ n,
± ∑i ≤ j ≤ nχ(i) ≤ h/2.
Proof. By (6), we can increase i by a multiple of k until
i ≤ n < i + k.
Then
∑i ≤ j ≤ nχ(i)
= ∑i ≤ j ≤ i + k − 1χ(i) − ∑n + 1 ≤ j ≤ i + k − 1χ(i)
= − ∑n + 1 ≤ j ≤ i + k − 1χ(i).
The first and last sums have h nonzero terms between them, each of modulus 1. □
We shall use the foregoing in concert with the following.
Lemma 2. For any sequence (ωj: j) of complex numbers, if for some R, whenever i ≤ n,
± ∑i ≤ j ≤ nωj ≤ R,
then for any decreasing sequence (εj: j) of positive real numbers, whenever i ≤ n,
(13) ± ∑i ≤ j ≤ nεjωj ≤ εiR.
Proof. Since
ωj = ∑i ≤ α ≤ jωα − ∑i ≤ α ≤ j − 1ωα,
we compute
∑i ≤ j ≤ nεjωj = ∑i ≤ j ≤ nεj∑i ≤ α ≤ jωα − ∑i ≤ j ≤ nεj∑i ≤ α ≤ j − 1ωα,
where
∑i ≤ j ≤ nεj∑i ≤ α ≤ j − 1ωα = ∑i + 1 ≤ j ≤ nεj∑i ≤ α ≤ j − 1ωα = ∑i ≤ j ≤ n − 1εj + 1∑i ≤ α ≤ jωα,
so that
∑i ≤ j ≤ nεjωj = ∑i ≤ j ≤ n − 1(εj − εj + 1)∑i ≤ α ≤ jωα + εn∑i ≤ α ≤ nωα,
and therefore (13) holds, since
εi = ∑i ≤ j ≤ n − 1(εj − εj + 1) + εn. □
Now we can show that definitions (9) and (10), of L and L′ respectively, are valid even when s = 1:
Theorem 2. If χ is not principal, then, when s ≥ 1, both L(s, χ) and L′(s, χ) are uniformly convergent, hence continuous, and bounded strictly by h.
Proof. By Lemma 1, in Lemma 2 we can let ωj be χ(j), and εj be 1/js or log j/js, so that, independently of n and s,
± ∑i ≤ j ≤ nχ(j)/js ≤ h/2i,
± ∑i ≤ j ≤ nχ(j)log j/js ≤ hlog i/2i.
Strictly, we have shown the latter to hold only when i ≥ 3, since the function x ↦ x − slog x is decreasing on [es − 1, ∞); but then
± L′(s, χ) ≤ log 2/2 + hlog 3/6 < 1/2 + h/2 ≤ h. □
By the Theorem and in particular, when χ is not principal,
-
the continuity of L(s, χ) at 1,
-
the boundedness of L′(s, χ),
we can establish the boundedness of L′(s, χ)/L(s, χ) when χ ≠ χ0 by showing, when χ ≠ χ0,
(14) L(1, χ) ≠ 0.
We do this by cases. There are three kinds of characters, comprising
-
the principal character;
-
nonprincipal characters taking only real values;
-
characters taking properly complex values.
We shall consider the last two kinds separately. First we establish a product formula.
Theorem 3. For all χ, if s > 1, then
1/L(s, χ) = ∏p(1−χ(p)/ps).
Proof. Using the Möbius function, μ, where μ(a) is
-
( − 1)n, if a is the product of n distinct primes,
-
0, otherwise,
we compute
∏p ≤ n(1−χ(p)/ps) = ∑p ∣ j ⇒ p ≤ nχ(j)μ(j)/js
= ∑1 ≤ j ≤ nχ(j)μ(j)/js + ∑n < j ∧ (p ∣ j ⇒ p ≤ n)χ(j)μ(j)/js.
The latter sum converges to 0 as n grows indefinitely. Moreover, if a has n distinct prime factors, then
(15) ∑d ∣ aμ(a) = ∑0 ≤ j ≤ nC (n, j)( − 1)j = (1 − 1)n,
which is
-
1, if n = 0,
-
0, if n > 0,
so that, as in (11), with μ replacing Λ,
L(s, χ)∑aχ(a)μ(a)/as = ∑cχ(c)/cs ⋅ ∑a ∣ cμ(a) = χ(1) = 1. □
As a corollary, we can derive
(16) L(s, χ0)3 ⋅ |L(s,χ)|4 ⋅ |L(s,χ2)|2 ≥ 1
from
(1−χ0(p)/ps)3 ⋅ |1−χ(p)/ps|4 ⋅ |1−χ2(p)/p2|2 ≤ 1,
which, letting
η = χ0(p)/ps,
eνi = χ(p),
we derive from the following.
Lemma 3. If 0 < η < 1, then
(1 − η)3 ⋅ |1−ηeνi |4 ⋅ |1−ηe2νi |2 < 1.
Proof. We use the identity
|1−ηeνi |2 = (1 − ηcos ν)2 + (ηsin ν)2 = 1 − 2ηcos ν + η2
and the inequalities
(αβγ)1/3 ≤ (α + β + γ)/3
and
2cos ν + cos (2ν) = 2cos ν + 2cos2ν − 1
= 2(cosν+1/2)2 − 3/2 ≥ − 3/2,
so that, if 0 < η < 1, then
|1−ηeνi |4 ⋅ |1−ηe2νi |2 = (1 − 2ηcos ν + η2)2(1 − 2ηcos (2ν) + η2)
≤ (1−2η(2cosν+cos(2ν))/3+η2)3
≤ (1 + η + η2)3 = (1−η3/1−η)3 < (1/1−η)3. □
Now we have (16), used in the following.
Theorem 4. (14) holds for all characters χ of the third kind.
Proof. From (16),
(17) |L(s,χ)| ≥ 1/L(s, χ0)3/4 ⋅ L(s, χ2)1/2.
We estimate
L(s, χ0) ≤ ∑a1/as < 1 + ∫1∞d t/ts = 1 + 1/s − 1 = s/s − 1.
Since χ is of the third kind, χ2 is not principal, so by Theorem 2,
|L(s,χ2)| < h.
Plugging into (17), we obtain
|L(s,χ)| ≥ (s − 1)3/4/s3/4h1/2.
For an upper bound, we have, by the Fundamental Theorem of Calculus,
L(s, χ) − L(1, χ) ≤ ∫1sL′(t, χ)d t,
and again by Theorem 2,
± ∫1sL′(t, χ)d t < h(s − 1).
Therefore
|L(1,χ)| > (s − 1)3/4/s3/4h1/2 − h(s − 1)
= (s − 1)3/4/h1/2 ⋅ (1/s3/4−h3/2(s−1)1/4).
The lower bound is positive when s = 1 + 1/16h6 < 2. □
The final argument is the most complicated and is where Dirichlet is reported to have used class numbers. I follow Landau rather slavishly here, but think he has an error or misprint.
Theorem 5. (14) holds for all characters χ of the second kind; even
(18) L(1, χ) > 0.
Proof. Fixing a character χ of the second kind, we define
(19) f(a) = ∑d ∣ aχ(d).
-
Since χ is multiplicative, in the weak sense whereby
gcd (a, b) = 0 ⇒ χ(ab) = χ(a) ⋅ χ(b),
the same is true of f.
-
Since also χ takes values only from { − 1, 0, 1},
f(p2n − 1) = ∑0 ≤ j ≤ n − 1χ(p)2j(1 + χ(p)) ≥ 0,
f(p2n) = 1 + ∑1 ≤ j ≤ nχ(p)2j − 1(1 + χ(p)) ≥ 1.
Consequently
(20) f(a) ≥ 0 ∧ f(a2) ≥ 1.
We now make some abbreviations, first
(21) n = (4h)6,
which is both a square and a cube, and then, with this,
z = ∑1 ≤ j ≤ n2(n − j)f(j) = ∑ab ≤ n2(n − ab)χ(b).
We shall bound z below and above. From below, by (20) we have
z ≥ ∑1 ≤ j ≤ n1/22(n − j2) ≥ ∑1 ≤ j ≤ n1/2/22(n − j2) ≥ ∑1 ≤ j ≤ n1/2/22(n−n/4)
(22) = 3n3/2/4 = 3(4h)9/4.
For a bound from above, we note
z ≤ z1 + z2,
where
z1 = ∑1 ≤ a ≤ n1/3∑n2/3 + 1 ≤ b ≤ [n/a]2(n − ab)χ(b),
z2 = ∑1 ≤ b ≤ n2/3∑1 ≤ a[n/b]2(n − ab)χ(b).
We bound these. First, by Lemmas 1 and 2,
(23) z1 ≤ ∑1 ≤ a ≤ n1/3|∑n2/3 + 1 ≤ b ≤ [n/a]2(n−ab)χ(b)|
≤ ∑1 ≤ a ≤ n1/32nh = n4/3h.
Next,
(24) z2 = ∑1 ≤ b ≤ n2/3χ(b)∑1 ≤ a ≤ [n/b]2(n − ab),
and
∑1 ≤ a ≤ [n/b]2(n − ab) = 2n∑1 ≤ a ≤ [n/b]1 − 2b∑1 ≤ a ≤ [n/b]a
= 2n[n/b] − b[n/b]([n/b]+1).
Using here the notation
ϑ = n/b − [n/b],
we obtain
∑1 ≤ a ≤ [n/b]2(n − ab) = 2n(n/b−ϑ) − b((n/b−ϑ)2+n/b−ϑ)
= n2/b − n + b(ϑ − ϑ2).
Plugging into (24), using Lemmas 1 and 2 again, as well as that
|ϑ−ϑ2| ≤ 1,
we get
z2 = n2∑1 ≤ b ≤ n2/3χ(b)/b − n∑1 ≤ b ≤ n2/3χ(b) + ∑1 ≤ b ≤ n2/3χ(b)b(ϑ − ϑ2)
≤ n2(L(1,χ)−∑n2/3 ≤ b ≤ ∞χ(b)/b) + nh/2 + n2/3h/2
≤ n2L(1, χ) + n4/3h/2 + nh/2 + n2/3h/2
< n2L(1, χ) + 2n4/3h.
Combining this with (23), recalling (21), we have
z < n2L(1, χ) + 3n4/3h = n2L(1, χ) + 3(4h)8h
= n2L(1, χ) + 3(4h)9/4.
Comparison with (22) yields (18). □
Apostol, Tom M. 1974. Mathematical Analysis. Second. Addison-Wesley Publishing Co., Reading MA–London–Don Mills ON.
Burton, David M. 2007. Elementary Number Theory. Sixth. Boston: McGraw-Hill.
Buzzard, Kevin. 2008. “L-Functions.” In The Princeton Companion to Mathematics, edited by Timothy Gowers, 228–9. Princeton, NJ: Princeton University Press.
Cohn, Harvey. 1980. Advanced Number Theory. New York: Dover.
Dickson, Leonard Eugene. 2005. History of the Theory of Numbers. Volume I: Divisibility and Primality. Mineola, New York: Dover.
Euclid. 1956. The Thirteen Books of Euclid’s Elements. New York: Dover Publications.
Granville, Andrew. 2008. “Analytic Number Theory.” In The Princeton Companion to Mathematics, edited by Timothy Gowers, 332–48. Princeton, NJ: Princeton University Press.
Hardy, G. H., and E. M. Wright. 1979. An Introduction to the Theory of Numbers. Fifth. Oxford: Clarendon Press.
Kline, Morris. 1972. Mathematical Thought from Ancient to Modern Times. New York: Oxford University Press.
Landau, Edmund. 1958. Elementary Number Theory. New York: Chelsea Publishing.
Selberg, Atle. 1949. “An Elementary Proof of Dirichlet’s Theorem About Primes in an Arithmetic Progression.” Ann. Of Math. (2) 50: 297–304. https://doi.org/10.2307/1969454.
Zagier, D. 1997. “Newman’s Short Proof of the Prime Number Theorem.” Amer. Math. Monthly 104 (8): 705–8. https://doi.org/10.2307/2975232.