
Section 1 Equidistribution in Number Theory

Subsection 1.1 Introduction

I wasn't going to do this at all, but then Akshay won the Fields medal.

―Ali Altuğ

Some topics we will try and cover:

  1. Linnik's and Duke's theorems (1970–1990)
  2. lattice points and hyperbolic geometry
  3. modular forms
  4. spectral theory
  5. modular forms of half-integral weight
  6. harmonic analysis
  7. the Kuznetsov formula
  8. other topics!

Subsection 1.2 Linnik and Duke

Akshay's first breakthrough, according to Sarnak, was subconvexity. He became interested in ergodic theory because ergodic theorists could prove hard theorems; equidistribution is a powerful tool in number theory.

Our main goal will be to talk about some problems stated (and solved) by Linnik in his book Ergodic properties of algebraic fields (1968). He considered lattice points on a sphere of radius \(\sqrt n\text{,}\) i.e. points (say in \(\RR^3\)) with integer coordinates at a fixed distance from the origin, analogous to the circle problem. If \(n\) is fixed there is nothing to distribute, but if we vary \(n\) and project down to the unit sphere we can ask: do the points accumulate, do they miss any patches, and in general how do they distribute? Of course this can be generalised.

Setup

Sphere \(S^2\text{.}\)

Lattice points \(\alpha = (x_1, x_2, x_3) \in \ZZ^3\text{.}\)

\begin{equation*} |\alpha|^2 = x_1^2 + x_2^2 + x_3^2 \end{equation*}

this is where the number theory comes in: we are looking at representability of integers \(n\) by this ternary quadratic form.

Various methods exist for studying representability by sums of \(k\) squares for varying \(k\text{:}\) quadratic reciprocity for \(k=2\text{,}\) the circle method/Vinogradov for \(k=4\text{.}\) \(k=3\) is the cut-off; here half-integral weight modular forms are relevant.

Set

\begin{equation*} \Omega_N = \left\{x= \frac{\alpha}{|\alpha|} : \alpha \in \ZZ^3 ,\,|\alpha|^2 = N \right\}\subseteq S^2\text{.} \end{equation*}
Question 1.1

Are \(\Omega_N\) “equidistributed” as \(N \to \infty\text{?}\)

There is an immediate obstruction: \(N\) must be a sum of \(3\) squares, e.g. \(N = 7\) is not, so \(\Omega_7 = \emptyset\text{.}\)

Recall: (Gauss/Legendre)

\begin{equation*} N = x_1^2 + x_2^2+ x_3^2,\,x_i \in \ZZ \iff N \ne 4^a(8b+7)\text{,} \end{equation*}

so we avoid these sets and ask the same question.
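As a sanity check, the Gauss/Legendre criterion can be verified by brute force; a quick sketch (the helper names `excluded` and `is_sum_of_three_squares` are ours, not standard):

```python
# Check the three-square theorem: N is a sum of three integer squares
# iff N is not of the form 4^a (8b + 7).
from math import isqrt

def excluded(N):
    """True iff N has the form 4^a (8b + 7)."""
    while N % 4 == 0:
        N //= 4
    return N % 8 == 7

def is_sum_of_three_squares(N):
    # Brute force over x, y; check the remainder is a perfect square.
    return any(
        isqrt(N - x * x - y * y) ** 2 == N - x * x - y * y
        for x in range(isqrt(N) + 1)
        for y in range(isqrt(N - x * x) + 1)
    )

for N in range(1, 500):
    assert is_sum_of_three_squares(N) != excluded(N)
```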

This is saying that the points are equidistributed with respect to the Lebesgue measure.

The last condition is a defect of the method, known as a Linnik condition.

Remark 1.3

Linnik's proof is ergodic theoretic.

After this came Duke in 1988; in the meantime the Weil conjectures had been proved and Iwaniec had given bounds for Kloosterman sums. Duke was a graduate student of Sarnak at Courant. He gave a more direct proof of 1.2 which does not have the \(\legendre{N}{p} = 1\) condition. His proof is based on the theory of (half-integral weight) modular forms, and a good bound of Iwaniec on certain exponential sums.

Why do exponential sums enter the picture? We are trying to prove that a sum converges to an integral. Generally we work with a basis of functions first; we could try Fourier analysis, using harmonics as our basis, and this is when exponential sums appear. That requires working out harmonics on the sphere (spherical harmonics), which leads to representation theory; as the sphere is compact, this involves the Peter–Weyl theorem, etc.

Duke also proved “the same” theorem over modular surfaces. Instead of looking at expanding spheres we study expanding hyperboloids.

Setup

Space

\begin{equation*} \pm \Gamma\backslash \HH \end{equation*}
\begin{equation*} \HH = \left\{ x+iy \in \CC : y \gt 0\right\} \end{equation*}
\begin{equation*} \Gamma = \SL_2(\ZZ) \end{equation*}
\begin{equation*} \mathcal F = \left\{ z\in \HH : -\frac12 \le \Re(z) \le 0, |z| \ge 1 \text{ or }\frac12 \gt \Re(z) \gt 0, |z| \gt 1 \right\} \end{equation*}
Figure 1.4 \(\mathcal F\)

\(\Omega_N\)'s are now replaced with CM points.

Digression (CM points)

\(Q\) a binary quadratic form

\begin{equation*} Q(x,y) = ax^2+ bxy + cy^2 \end{equation*}

of discriminant \(d = b^2- 4ac \lt 0\text{,}\) \(a,b,c\in \ZZ\text{.}\)

The number of such forms, counted up to the action of \(\Gamma\) and with the weights below, is given by the Hurwitz class number.

Given \(Q\) we associate a CM point

\begin{equation*} z_Q = \frac{-b + \sqrt{d}}{2a} \in \HH\text{.} \end{equation*}

The action of \(\Gamma\) on \(Q\) is the same as the action of \(\Gamma\) on \(z_Q\text{.}\)

\begin{equation*} z_{\gamma Q } =\gamma z_Q\text{.} \end{equation*}

Let

\begin{equation*} \Lambda_d = \{z_Q \in \mathcal F : \disc(Q) = d\}\text{.} \end{equation*}

We need to sum properly to take automorphisms into account.

Definition 1.6
\begin{equation*} \sum{\vphantom{\sum}}^*_{z_Q \in \Lambda_d} \end{equation*}

is the sum weighted by \(\frac 12\) if \(Q = a(x^2+ y^2)\) (\(d= -4\)), \(\frac 13\) if \(Q = a(x^2+ xy+y^2)\) (\(d= -3\)).

If we want to be fancy we can say the word stack here.

Remark 1.7
\begin{equation*} \sum_{z_Q \in \Lambda_d}{\vphantom{\sum}}^*1 = H(d) \end{equation*}

where \(H(d)\) is the Hurwitz class number.
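Remark 1.7 can be checked by enumerating reduced forms (\(-a \lt b \le a \le c\text{,}\) with \(b \ge 0\) if \(a = c\)) with the weights of Definition 1.6; a sketch, with our own function name `hurwitz`:

```python
# Count reduced binary quadratic forms of discriminant d < 0, weighting
# multiples of x^2+y^2 by 1/2 and of x^2+xy+y^2 by 1/3 (Definition 1.6).
from fractions import Fraction

def hurwitz(d):
    """Hurwitz class number H(d) for a discriminant d < 0."""
    assert d < 0 and d % 4 in (0, 1)
    total = Fraction(0)
    a = 1
    while 3 * a * a <= -d:          # reduced forms satisfy 3a^2 <= |d|
        for b in range(-a + 1, a + 1):
            if (b * b - d) % (4 * a) == 0:
                c = (b * b - d) // (4 * a)
                if c >= a and not (a == c and b < 0):
                    if (b, c) == (0, a):        # multiple of x^2 + y^2
                        total += Fraction(1, 2)
                    elif (b, c) == (a, a):      # multiple of x^2 + xy + y^2
                        total += Fraction(1, 3)
                    else:
                        total += 1
        a += 1
    return total

assert hurwitz(-3) == Fraction(1, 3)
assert hurwitz(-4) == Fraction(1, 2)
assert hurwitz(-23) == 3            # h(-23) = 3, a fundamental discriminant
```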

Remark 1.8

If \(d\) is fundamental, i.e. the discriminant of some \(\QQ(\sqrt{d})\text{,}\) then \(H(d) = h(d)\text{,}\) the usual class number.

For the measure on \(\mathcal F\) we take

\begin{equation*} \diff\mu = \frac 3\pi \frac{\diff x \diff y}{y^2}\text{.} \end{equation*}

Equidistribution implies density, but it is much stronger: for example, it rules out points which are dense but happen to cluster towards some line.

Subsection 1.3 Basics

Question: Let \(\alpha \in \RR\) and consider \(\{ \alpha n \}\) where \(\{ x \} = x \pmod 1\) so \(\left\{\frac32\right\} = \frac 12\text{.}\) How are these distributed?

Example 1.10

If \(\alpha = \frac 27\) then we have \(\{ \{\alpha n \} : n \in \NN\} = \{\frac i7 : i \in \{0,\ldots, 6\}\}\) and in fact it hits each evenly.

Example 1.11

If \(\alpha = \sqrt 2\) then \(\{\alpha\} \approx 0.41421\ldots\text{,}\) \(\{2\alpha\} \approx 0.82842\ldots\text{,}\) \(\{3\alpha\} \approx 0.24264\ldots\text{,}\) \(\{4\alpha\} \approx 0.65685\ldots\text{.}\) These spread out densely, but there is a difference between density and equidistribution. In this example, equidistribution says that the proportion of time the sequence spends in each interval \((a,b)\) is \(b -a\text{.}\)

So questions are: is \(\{n\alpha\}\) dense?

Is \(\{n \alpha\}\) uniformly distributed (equidistributed with respect to the standard measure)?

The answer to both questions is yes.
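Both answers can be seen empirically; a quick numerical sketch for \(\alpha = \sqrt 2\) (the interval and sample size are chosen arbitrarily):

```python
# Equidistribution of {n*sqrt(2)}: the proportion of n <= N with
# fractional part in (0.2, 0.5) should approach 0.5 - 0.2 = 0.3.
from math import sqrt

alpha = sqrt(2)
N = 100_000
hits = sum(1 for n in range(1, N + 1) if 0.2 < (n * alpha) % 1.0 < 0.5)
assert abs(hits / N - 0.3) < 0.01
```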

Digression (Diophantine approximation)

This is a very tough area of number theory, not so many definitive results here.

(Pigeonhole) divide \(\lb 0, 1)\) into \(N\) subintervals of width \(\frac 1N\text{,}\) consider

\begin{equation*} \alpha_0 = 0,\alpha_1 = \left\{ \alpha 1\right\}, \alpha_2 = \left\{ \alpha 2\right\}, \ldots\in \lb 0,1) \end{equation*}

as soon as we get to \(\alpha_N\) we must have two in one subinterval, say \(|\alpha_{n_1} - \alpha_{n_2}| \lt \frac 1N\text{.}\) So there exist integers \(k_{n_1}, k_{n_2}\) such that

\begin{equation*} | n_1 \alpha - k_{n_1} - (n_2\alpha - k_{n_2}) | \lt \frac1N \end{equation*}
\begin{equation*} | (n_1 - n_2)\alpha -( k_{n_1} - k_{n_2}) | \lt \frac1N\text{.} \end{equation*}

Exercise.
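The approximation the pigeonhole argument guarantees can be found numerically; this is a brute-force search, not the pigeonhole proof itself, and the function name `dirichlet_approx` is our own:

```python
# Dirichlet: some q <= N has ||q*alpha|| < 1/N, and then
# |alpha - p/q| < 1/(q N) <= 1/q^2.
from math import sqrt

def dirichlet_approx(alpha, N):
    """Return (p, q) minimising ||q*alpha|| over 1 <= q <= N."""
    q = min(range(1, N + 1), key=lambda q: abs(q * alpha - round(q * alpha)))
    return round(q * alpha), q

p, q = dirichlet_approx(sqrt(2), 1000)
assert abs(q * sqrt(2) - p) < 1 / 1000       # the pigeonhole bound
assert abs(sqrt(2) - p / q) < 1 / q**2       # the resulting approximation
```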

This is very strong, Roth's theorem tells us that even \(q^{2+\epsilon}\) here is enough to force finiteness.

Note 1.15

\(\alpha \not \in \QQ\) is necessary! Otherwise

\begin{equation*} \left | \frac{p_0}{q_0} - \frac pq \right| = \left | \frac{p_0q - pq_0}{qq_0} \right| \ge \frac{\left | p_0q - pq_0 \right|}{\max \left\{q^2 ,q_0^2 \right\}} \end{equation*}

so choosing \(q \gt q_0\) forces \(\frac {p_0}{q_0} = \frac pq\text{.}\)

One can do better:

Note 1.17

\(\sqrt 5\) is the best possible without further restriction on \(\alpha\text{.}\)

Example 1.18

If \(\alpha = \frac{1- \sqrt 5}{2}\) then

\begin{equation*} \left| \alpha - \frac pq \right| \lt \frac {1}{Aq^2} \end{equation*}

has only finitely many solutions for \(A \gt \sqrt 5\text{.}\)

What if we allow further restriction?

Let \(f \in \ZZ\lb x \rb\) be the minimal polynomial of \(\alpha\text{.}\) Gauss's lemma implies that \(f\) is irreducible over \(\QQ\) so

\begin{equation*} q^n f\left(\frac pq\right) \in \ZZ \smallsetminus\{0\}\,\forall \frac pq\in \QQ\text{.} \end{equation*}

The mean value theorem says that there exists \(x_0\) between \(\alpha\) and \(p/q\) s.t.

\begin{equation*} \frac{f(p/q) - \cancelto{0}{f(\alpha)}}{p/q - \alpha} = f'(x_0) \end{equation*}

so

\begin{equation*} |p/q - \alpha| = \frac{|f(p/q)|}{|f'(x_0)|} = \frac{|q^nf(p/q)| }{q^n |f'(x_0)|} \ge \frac{1}{q^n |f'(x_0)|}\text{,} \end{equation*}

notice how \(n\) appears here.

This theme of using some calculus is repeated across diophantine analysis.

Remark 1.20
  • Thue: can replace \(n\) with \((\deg(\alpha) + 2)/2\) (This already has implications to integral solutions of degree \(\ge 3\) polynomials \(f \in \ZZ\lb x \rb\text{,}\) e.g. elliptic curves with bounded integral discriminant)
  • Roth (\(\sim\) 1955): for all \(\epsilon \gt 0 \text{,}\) there are only finitely many \(p/q\) satisfying
    \begin{equation*} | \alpha - \frac pq| \lt \frac{1}{q^{2+\epsilon}}\text{.} \end{equation*}
Back to equidistribution

\(a_n = \left\{n \alpha\right\} , \left\{a_n\right\}_{n=1}^\infty\text{.}\)

Will show for any \(x \in \lb 0,1)\) there exists

\begin{equation*} \left\{ a_{n_j}\right\}_{j=1}^\infty\text{ s.t. } a_{n_j} \to x\text{.} \end{equation*}

Notation: \(\| x\|\) means the distance from \(x\) to the nearest integer. Dirichlet implies that infinitely many \(p/q\) have \(|\alpha - p/q| \lt 1/q^2\text{.}\) Given \(\epsilon \gt 0\) let \(q\) be such that

\begin{equation*} | \alpha - \frac pq | \lt \frac{1}{q^2} \implies | q\alpha - p| \lt \frac 1q \end{equation*}
\begin{equation*} \frac 1q \lt \epsilon \end{equation*}

Choose \(j\) such that \(j (\alpha q- p)\) is within \(1/q\) of \(x\) (why?). So

\begin{equation*} \| j(\alpha q - p ) -x\| \lt \frac 1q \lt \epsilon\text{.} \end{equation*}

(Fill in the gaps here).

Now let's define what it means to be uniformly distributed.

Definition 1.22 Uniformly distributed sequence

\(a_1, a_2, \ldots\) is called uniformly distributed if for all \((b,c) \subseteq \lb 0,1)\)

\begin{equation*} \#\left\{ n \le N : \left\{ a_n\right\} \in (b,c) \right\} \sim N(c-b) \end{equation*}

or

\begin{equation*} \lim_{N \to \infty} \frac{\#\left\{ n \le N : \left\{ a_n\right\} \in (b,c) \right\}}{N} = c-b\text{.} \end{equation*}

Uniformly distributed or not?

When \(\alpha \in \RR \smallsetminus \QQ\text{:}\) \(\{ n\alpha \}\text{,}\) \(\{ n^2\alpha \}\text{,}\) \(\{ (3n^2 + 2n + 1)\alpha \}\text{,}\) \(\{ (\sqrt 5 n^3 + 2n - (\zeta_{10} + \bar \zeta_{10}))\alpha \}\) all are.

\(\{ n! e \}\text{,}\) \(\{\log(n)\}\) and \(\{ \log (p_n)\}\) are not.

Example 1.23 Several
  1. \(\alpha\in \RR\smallsetminus\QQ,\,a_n = \{n\alpha\}\text{.}\) Yes
  2. \(\alpha\in \RR\smallsetminus\QQ,\,a_n = \{n^2\alpha\}\text{.}\) Yes
  3. \(a_n = \{\log(n)\}\text{.}\) No
  4. \(a_n = \{n! e\}\text{.}\) No (not dense).
  5. \(\{ \log (p_n)\}\text{.}\) No.
  6. \(a_n = \{\sqrt{n}\}\text{.}\) Yes.
  7. \(a_n = \{e^n\}\text{.}\) ?
  8. \(a_n = \{\log n!\}\text{.}\) Yes ?
  9. \(a_n = \{\log \log n!\}\text{.}\) No

To begin let's show \(a_n = \{n!e\}\) is not equidistributed as it has only one limit point \(0\text{.}\)

\begin{equation*} e = \sum_{n=1}^\infty \frac{1}{n!} \end{equation*}

so \(n!e \in \ZZ + \frac{1}{n+1} + \frac{1}{(n+1)(n+2)} + \cdots\) and the tail satisfies \(\frac{1}{n+1} + \frac{1}{(n+1)(n+2)} + \cdots \le \frac{e}{n+1} \to 0\) as \(n\to \infty\text{.}\) So the only limit point is \(0\text{:}\) the sequence is not dense, hence certainly not equidistributed.
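This can be illustrated numerically; rather than multiplying the float \(e\) by \(n!\) (which destroys the fractional part), compute the tail series directly. The helper name is ours:

```python
# {n! e} = n!/(n+1)! + n!/(n+2)! + ... = 1/(n+1) + 1/((n+1)(n+2)) + ...,
# positive and at most e/(n+1), so {n! e} -> 0.
from math import e

def frac_part_n_fact_e(n, terms=60):
    s, t = 0.0, 1.0
    for k in range(n + 1, n + 1 + terms):
        t /= k          # t = n!/k!
        s += t
    return s

for n in (1, 5, 10, 50):
    assert 0 < frac_part_n_fact_e(n) <= e / (n + 1)
```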

Let \(\epsilon\gt 0\text{,}\) choose \(M \gt 0\) such that \(\frac1M\lt \epsilon\text{.}\) Dirichlet implies there exists \(m \in \ZZ\) with \(\delta = \| m\alpha\| \lt \frac1M\text{,}\) so

\begin{equation*} 0\lt \delta \lt \frac1M \lt\epsilon \end{equation*}

recall \(\| x\|\) is the distance to the nearest integer. Consider the set \(S_i = \left\{ \{n \alpha\} : n \lt N,\,n \equiv i \pmod m\right\}\) then

\begin{equation*} S = \{ \{n \alpha\} : n\lt N \} = \bigsqcup_{i=1}^m S_i \end{equation*}

moreover,

\begin{equation*} S_i = \{\{ km\alpha + i\alpha\} : 0 \le k \le N_i\} \end{equation*}

where

\begin{equation*} N_i = \frac Nm + O(1) \end{equation*}

(why is the \(O(1)\) here?). We can rewrite these \(S_i\) as follows:

\begin{equation*} S_i = \{ \delta k + \gamma_i \pmod 1 : 0 \le k \le N_i\} \end{equation*}

where

\begin{equation*} \gamma_i = \begin{cases} i\alpha \pmod 1 \amp \text{ if } \delta = \{ m\alpha \} \\ i\alpha - \delta (N_i + 1)\pmod 1 \amp \text{ if } \delta = 1 - \{ m\alpha \} \\ \end{cases} \end{equation*}

(in the second case, replace \(k\) by \(N_i + 1-k\)).

Now \(0 \le \gamma_i \lt 1\) and let

\begin{equation*} K_i = \lfloor \delta N_i + \gamma_i \rfloor \end{equation*}
\begin{equation*} \# \{ k \le N_i : \{ \delta k + \gamma_i \} \in \lb b,c\rb \} = \sum_{j=0}^{K_i} \# \{k \le N_i : \delta k + \gamma_i \in [j+b,j+c] \} \end{equation*}
\begin{equation*} = (K_i + O(1)) \left(\frac{c-b}{\delta} + O(1) \right) \end{equation*}
\begin{equation*} = N_i(c-b) + O\left( \frac {c-b}{\delta} + \delta N_i + 1 \right) \end{equation*}

so

\begin{equation*} S = \bigsqcup_{i=1}^m S_i \end{equation*}
\begin{equation*} \# S = N(c-b) + O\left( \frac M\delta + \delta N \right) \end{equation*}

(check!). Finally choose \(N \gt \frac{M}{\delta^2}\text{.}\) So that

\begin{equation*} \frac{M}{\delta} + \delta N \lt \epsilon N\text{.} \end{equation*}

Which implies

\begin{equation*} \# \{ n \lt N : \{n\alpha \} \in [b,c] \} =\#S =N(c-b) + O(\epsilon N)\text{.} \end{equation*}

We followed our nose essentially, but we needed to put ourselves in a favourable position so that we can get a good handle on the error.

Example 1.25

Consider the sequence

\begin{equation*} b_{m,i} = \frac i m \end{equation*}

and define

\begin{equation*} a_{\binom{m}{2} + i } = b_{m,i} \end{equation*}

so that

\begin{equation*} b_{m,i}\colon\quad 0,1;\quad 0,\tfrac 12 , 1;\quad 0,\tfrac 13 ,\tfrac 23, 1;\quad 0,\tfrac 14 ,\tfrac 24,\tfrac 34, 1;\quad \ldots \end{equation*}
\begin{equation*} a_m = 0,1, 0,\tfrac 12 , 1, 0,\tfrac 13 ,\tfrac 23, 1, 0,\tfrac 14 ,\tfrac 24,\tfrac 34, 1, \ldots \end{equation*}

this is uniformly distributed!

Prove this (in a similar way to before).

  1. If not dense we are done already. Otherwise let \(I_1 = \lb 0, \frac 12)\text{,}\) \(I_2 = \lb\frac 12, 1\rb\text{.}\) Let \(X = \{a_n \in I_1\}\text{,}\) \(Y = \{a_n \in I_2\}\text{,}\) so both sets are infinite. Let \(b_n = x_1, \ldots, x_{10^{10}}, y_1, x_{10^{10} + 1}, \ldots, x_{2\cdot 10^{10}}, y_2, \ldots\) so \(\# \{n \le N:b_n \in \lb4/5,1\rb \} \lesssim \frac{N}{10^{10}}\text{.}\)
  2. If \(a_n\) is dense then, let
    \begin{equation*} b_{n,i} = a_m \text{ s.t. } \frac{i-1}{n} \lt a_m \lt \frac in \end{equation*}
    by the example above this is equidistributed.

What can we say about the space of uniformly distributed sequences? Is it closed under addition? No (take \(b_n = -a_n\text{,}\) then \(a_n + b_n = 0\)). Nor under scaling \(\{k a_n\}\) (take \(a_n = n\alpha \text{,}\) then \(\frac1\alpha a_n = n\)). Products \(a_nb_n\) don't work either, but if \(b_n \to 1\) then \(a_nb_n\) is uniformly distributed.

Subsection 1.4 Weyl's criterion (1916)

\(1.\implies 2.\). 

\(1.\) means \(2.\) is true for the characteristic function of \(\lb b,c\rb\text{,}\) hence true for step functions (because everything is linear).

Recall that any \(f \in C(\lb 0,1\rb)\) is the uniform limit of such step functions.

Exercise: Choose one that is \(\epsilon\) close to \(f\) and finish the proof.

\(2.\implies 1.\). 

Find \(f_1,f_2 \in C(\lb 0,1 \rb)\) such that \(f_1(x) \le \chi_{(b,c)}(x) \le f_2(x)\) s.t.

\begin{equation*} \int_0^1 (f_2 - f_1) \le \epsilon\text{.} \end{equation*}

Exercise, finish this proof.

\(2.\implies 3.\). 

Obvious.

\(3.\implies 2.\). 

Recall Fejér's theorem: every \(f\in C(\lb 0,1 \rb)\) with \(f(0) = f(1)\) is the uniform limit of the Cesàro means \(f_n\) of its Fourier series, which are trigonometric polynomials. Pick an \(n \in \NN\) s.t.

\begin{equation*} f_n(x) - \epsilon \le f(x) \le f_n(x) + \epsilon\text{.} \end{equation*}

Exercise, finish this proof.

Using this we can tackle more examples, e.g.:

  1. \(\{\alpha p_n\}\text{,}\) \(p_n\) the \(n\)-th prime, \(\alpha\in \RR\smallsetminus \QQ\text{.}\)
  2. \(\{\sqrt{n}\}\)
  3. \(\{\log(n)\}\)
  4. \(\{\log(p_n)\}\)
  5. \(\{\beta^n\}\text{,}\) \(\beta \in \RR\text{.}\)
  6. \(\{\alpha a_n\}\text{,}\) \(a_n \in \ZZ\)

Subsection 1.5 Some applications of Weyl's criterion

Application 1

\(\{n \alpha \}\) is uniformly distributed iff \(\alpha \in \RR\smallsetminus \QQ\text{.}\) Let

\begin{equation*} S_N(h,\alpha) = \sum_{n=0}^N e(h\alpha n) \end{equation*}

then

\begin{equation*} S_N(h,\alpha) = \frac{e((N+1) h\alpha) - 1}{e(h\alpha) - 1}\text{.} \end{equation*}

Now \(e(h\alpha ) - 1 \ne 0\) for all \(h \in \ZZ\smallsetminus \{0\} \) iff \(\alpha \not \in \QQ\text{,}\) so \(S_N(h,\alpha ) = O_{h,\alpha}(1) \) iff \(\alpha \not \in \QQ\text{.}\) So \(\lim_{N \to \infty} \frac1N S_N(h,\alpha) = 0\) for all \(h \ne 0\) iff \(\alpha\not \in \QQ\text{.}\) And \(S_N(0,\alpha) = N+1\text{,}\) which implies

\begin{equation*} \lim_{N\to \infty} \frac1N S_N(0,\alpha) = 1 \end{equation*}

so by Weyl's criterion \(\left\{ n\alpha \right\}\) is uniformly distributed if \(\alpha \not \in \QQ\text{.}\)
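The dichotomy is easy to see numerically; a sketch (the parameters are chosen arbitrarily):

```python
# Weyl sums S_N(h, alpha) = sum_{n <= N} e(h n alpha): bounded (a geometric
# sum) when alpha is irrational, of full size N when h*alpha is an integer.
import cmath
from math import sqrt, pi

def weyl_sum(h, alpha, N):
    return sum(cmath.exp(2j * pi * h * n * alpha) for n in range(1, N + 1))

N = 10_000
assert abs(weyl_sum(3, sqrt(2), N)) / N < 0.01   # irrational: (1/N) S_N -> 0
assert abs(weyl_sum(3, 1 / 3, N)) / N > 0.9      # h*alpha integral: S_N = N
```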

Digression

Vinogradov actually proved the weak Goldbach conjecture: every sufficiently large odd integer is a sum of 3 primes. Vinogradov's proof gave no explicit bound; Borozdin gave a (very large) bound, and Helfgott brought it down to reality.

Application 2
\begin{equation*} \{\beta^n \} \end{equation*}

Koksma, a Dutch student of van der Corput, showed that \(\{\beta^n\}\) is uniformly distributed for almost every \(\beta \gt 1\text{.}\)

Application 3
\begin{equation*} \{\log(n) \} \end{equation*}

is not equidistributed.

Prove this. Hint: assume it was; then for an interval \(I\) split the count

\begin{equation*} \# \{ n \lt 1001 N : \{\log n\} \in I \} =\# \{ n \lt 1000N : \{\log n\} \in I \} + \# \{1000N \le n \lt 1001 N : \{\log n\} \in I \} \end{equation*}

and note that for \(n = 1000N + n_1\) with \(0 \le n_1 \lt N\text{,}\)

\begin{equation*} \log(n) = \log(1000N + n_1) = \log(1000N) + \log (1+ n_1/(1000N))\text{,} \end{equation*}

so all these \(n\) have nearly the same fractional part of \(\log n\text{.}\)
Application 4
\begin{equation*} \{\sqrt{n} \} \end{equation*}

Before this, lets introduce

Definition 1.33 Discrepancy

Let \(a_n \in \lb0,1)\) be a sequence and \(N\in \ZZ_{\ge 1}\text{;}\) we define

\begin{equation*} D_N = \sup_{0\le b\le c \le 1}\left\{ \left| \frac{\# \{n \lt N : a_n \in [b,c ]\}}{N} - (c-b)\right| \right\}\text{.} \end{equation*}
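The closely related star discrepancy \(D_N^*\) (only intervals \(\lb 0,c\rb\text{;}\) it satisfies \(D_N^* \le D_N \le 2D_N^*\)) can be computed exactly from the sorted sample; a sketch:

```python
# Star discrepancy from the sorted points x_(1) <= ... <= x_(N):
# D_N^* = max_i max(i/N - x_(i), x_(i) - (i-1)/N).
from math import sqrt

def star_discrepancy(xs):
    xs = sorted(xs)
    N = len(xs)
    return max(max((i + 1) / N - x, x - i / N) for i, x in enumerate(xs))

# The golden-ratio rotation is a classic low-discrepancy sequence,
# with D_N^* = O(log N / N).
phi = (sqrt(5) - 1) / 2
pts = [(n * phi) % 1.0 for n in range(1, 1001)]
assert star_discrepancy(pts) < 0.01
```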

Some useful lemmas:

Exercise (hint squeeze).

Omitted/exercise; see 2.1.1 of Kuipers and Niederreiter.

For \(\sqrt{n}\) now.

\begin{equation*} S_N(\alpha) = \# \{ n \le N : 0 \le \{ \sqrt n \} \le \alpha\}\text{.} \end{equation*}
Note 1.36

If \(d = \lfloor \sqrt{n} \rfloor\) then

\begin{equation*} \sqrt n = d + \{ \sqrt n \} \end{equation*}

and \(d \le \sqrt n \le d+ 1\) and \(\#\{ n : d \le \sqrt n \le d+ 1\} = 2d+ 2\text{.}\)

Similarly

\begin{equation*} \# \{ n : d \le \sqrt n \le d + \alpha \} = 2\alpha d + \alpha^2 + O(1) \end{equation*}

then

\begin{equation*} S_N(\alpha) = \sum_{j=0}^{\lfloor \sqrt N \rfloor} \# \{ n : j \le \sqrt n\le j + \alpha\} \end{equation*}
\begin{equation*} = \sum_{j=0}^{\lfloor \sqrt N \rfloor} 2\alpha j + \alpha^2 + O(1) \end{equation*}
\begin{equation*} = 2\alpha \frac{ \lfloor \sqrt N \rfloor (\lfloor \sqrt N \rfloor + 1)}{2} + O(\sqrt N) = \alpha N + O(\sqrt N) \end{equation*}

so

\begin{equation*} D_N^* = \sup_{0\le \alpha \le 1} \left\{ \left| \frac{S_N(\alpha)}{N} -\alpha \right| \right\} = \sup_{0\le \alpha \le 1 } O_\alpha (1/\sqrt{N}) = O(1/\sqrt N)\text{.} \end{equation*}

Make the dependence on \(\alpha \) explicit and make the proof rigorous.

Application 5

Assume otherwise, let \(k \in \ZZ_{\gt0}\text{.}\)

\begin{equation*} I_k = \min\{ n : p_n \gt e^k \} \end{equation*}
\begin{equation*} I_{k-1/2} = \min\{ n : p_n \gt e^{k-1/2} \} \end{equation*}
\begin{equation*} \chi_{\lb 0,1/2)} = \text{char. fn. of }\lb 0, 1/2)\text{.} \end{equation*}

Consider

\begin{equation*} S_k = \sum_{n \lt I_k} \chi_{\lb 0,1/2)} ( \{\log (p_n) \} ) \end{equation*}
\begin{equation*} S_{k-1/2} = \sum_{n \lt I_{k-1/2}} \chi_{\lb 0,1/2)} ( \{\log (p_n) \} ) \end{equation*}

Note that \(S_k = S_{k-1/2}\text{.}\)

If uniformly distributed

\begin{equation*} \lim_{k\to \infty} \frac{S_k}{I_k} = \lim_{k\to \infty} \frac{S_{k-1/2}}{I_{k-1/2}} = L \end{equation*}
\begin{equation*} L\gt 0 \implies I_{k}/I_{k-1/2} \to 1\text{.} \end{equation*}

But the prime number theorem says

\begin{equation*} \pi(N) =\# \{ p \lt N \} \sim \frac{N}{\log N} \end{equation*}

so

\begin{equation*} I_{k}/I_{k-1/2} \sim \sqrt e \ne 1 \end{equation*}

so \(L=0\) but this cannot happen either.

\begin{equation*} S_{k-1/2} \ge \pi(e^{k-1/2}) - \pi (e^{k-1}) \end{equation*}
\begin{equation*} \ge e^{k-1} \frac{\sqrt e - 1}k \gt 0\text{.} \end{equation*}

Subsection 1.6 Weyl differencing and the Van der Corput inequality

Idea: Use the transformation

\begin{equation*} n \mapsto n+ m \end{equation*}
\begin{equation*} m \mapsto m \end{equation*}
\begin{equation*} \begin{pmatrix} n\\ m\end{pmatrix} \mapsto \begin{pmatrix} 1\amp 1 \\ 0 \amp 1\end{pmatrix} \begin{pmatrix} n\\ m\end{pmatrix} \end{equation*}

switch orders

Figure 1.41 Regions
\begin{equation*} \sum_m \sum_n y_n \overline y_m = \sum_m \sum_n y_{n+m} \overline y_m \end{equation*}

Note: if \(n \ge 0\) then \(m = 1 ,\ldots, N-n\text{,}\) \(n = 0, 1,\ldots, N-1\text{.}\) If \(n \lt 0\) then

\begin{equation*} m = 1-n, \ldots, N \end{equation*}
\begin{equation*} n = 1-N, \ldots, -1 \end{equation*}
\begin{equation*} \sum_{m=1}^N |y_m|^2 + \sum_{n=1}^{N-1} \sum_{m=1}^{N-n} y_{n+m}\overline y_m + \sum_{n = -N+1}^{-1} \sum_{m= 1-n}^N y_{m+n} \overline y_m \end{equation*}
\begin{equation*} = \sum_{m=1}^N |y_m| ^2 + 2 \Re \left( \sum_{n=1}^{N-1} \sum _{m=1}^{N-n} y_{n+m} \overline y_m \right) \end{equation*}

Let \(\alpha \in \lb 0,1)\text{,}\) let \(h \in \ZZ \smallsetminus \{ 0 \}\)

\begin{equation*} S_h(N,\alpha) = \frac 1N \sum_{n=1}^N e(hx_n\alpha) \end{equation*}

the trick is now to bound \(|S_h(N,\alpha)|^2\)

\begin{equation*} N^2|S_h(N,\alpha)|^2 = \sum_{n=1}^N 1 +2\Re \left( \sum_{n=1}^{N-1} \sum_{m=1}^{N-n} e(h(x_{n+m} - x_m) \alpha) \right) \end{equation*}

so

\begin{equation*} \int_0^1 |S_h(N,\alpha)|^2 \diff \alpha = \frac 1N\text{.} \end{equation*}

Recall Fatou's lemma: let \(f_n \ge 0\) be a sequence of nonnegative measurable functions, then if

\begin{equation*} f(x) = \liminf f_n(x) \end{equation*}

we have

\begin{equation*} \int f \le \liminf \int f_n\text{.} \end{equation*}

Take

\begin{equation*} f_n(\alpha) = \sum_{N=1}^{n} | S_h (N^2,\alpha)|^2 \end{equation*}

Fatou implies

\begin{equation*} \int_0^1 \sum_{N=1}^\infty |S_h(N^2,\alpha)|^2 \diff \alpha \le \sum_{N=1}^\infty \int_0^1 |S_h(N^2,\alpha)|^2\diff \alpha \lt \infty \end{equation*}

so

\begin{equation*} \sum_{N=1}^\infty |S_h(N^2 ,\alpha)|^2 \end{equation*}

is finite for almost all \(\alpha\) and for any \(h \ne 0\text{.}\) Then by Weyl's criterion we are done.

Consider

\begin{equation*} (H+ 1)^2 \left| \sum_{n=1}^N y_n\right|^2 \end{equation*}

then

\begin{equation*} (H+ 1)^2 \left| \sum_{n=1}^N y_n\right|^2 = \left| \sum_{h= 0 }^H \sum _{n=1}^N y_n\right |^2 \end{equation*}
\begin{equation*} = \left| \sum_{h= 0}^H \sum _{n\in \ZZ} y_{n+h}\right|^2 \end{equation*}
\begin{equation} = \left| \sum _{n\in \ZZ}\sum_{h= 0}^H y_{n+h}\right|^2\label{eqn-vandercorput}\tag{1.1} \end{equation}

note the \(n\)-sum is nonzero for at most \(N+H\) terms, \(1-H \le n \le N\text{;}\) then Cauchy-Schwarz implies

\begin{equation*} \le (N+ H) \sum_n \left| \sum_{h=0}^H y_{n+h} \right|^2 \end{equation*}
\begin{equation} = (N+H) \sum_n \sum_{h_1 = 0}^H \sum_{h_2 = 0}^H y_{n+h_1} \overline{y_{n + h_2}}\label{eqn-vandercorput2}\tag{1.2} \end{equation}

let \(m = n + h_2\text{,}\) \(l = h_1 - h_2\text{.}\) Then

\begin{equation*} = (N+H) \sum_{-H \le l \le H} \sum_m \overline y_m y_{m+l} \underbrace{\left( \sum_{\substack{0\le h_1, h_2 \le H \\ h_1 - h_2 = l}} 1 \right)}_{= I(H,l)} \end{equation*}

note \(I(H,l) = H+1 - |l|\text{,}\) so

\begin{equation*} = (N+H) \sum_{-H \le l \le H} (H + 1 - |l |) \sum_m \overline y_m y_{m+l} \end{equation*}

then if \(l = 0\) we get the first term of the statement, \(l \ne 0\) the other.
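The resulting van der Corput inequality, \((H+1)^2 \left|\sum_{n=1}^N y_n\right|^2 \le (N+H)\sum_{|l|\le H}(H+1-|l|)\sum_m \overline y_m y_{m+l}\text{,}\) can likewise be checked numerically, extending \(y\) by zero:

```python
# Numerical check of the van der Corput inequality for random unimodular y_n.
import cmath
import random

random.seed(1)
N, H = 40, 7
y = [cmath.exp(2j * cmath.pi * random.random()) for _ in range(N)]

def yv(i):
    """y extended by zero outside the support 0..N-1 (zero-based)."""
    return y[i] if 0 <= i < N else 0

lhs = (H + 1) ** 2 * abs(sum(y)) ** 2
rhs = (N + H) * sum(
    (H + 1 - abs(l)) * sum(yv(m).conjugate() * yv(m + l) for m in range(-H, N + H))
    for l in range(-H, H + 1)
)
assert abs(rhs.imag) < 1e-9      # the +l and -l terms are conjugate
assert lhs <= rhs.real + 1e-9
```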

What is the difference between this and Weyl differencing? When \(H\) is large, not so much, but we can take \(H\) small now, shifting the weighting around. We change the balance to make one part shorter and the other longer.

We'll use Weyl's criterion and the van der Corput lemma. Fix \(N \in \ZZ_{\ge1}\) and \(H\le N\text{,}\) and write \(b_h(n) = a_{n+h} - a_n\text{.}\) Then for any \(k \in \ZZ\smallsetminus\{0\}\)

\begin{equation*} \left| \sum_{n=1}^N e( ka_n) \right|^2 \le \frac{ N+H}{H+ 1} N + 2 \frac{ N+H}{H+ 1} \sum_{h=1}^H \left(1 - \frac h{H+1} \right) \left| \sum_{n=1}^{N-h} e(b_h(n)k) \right| \end{equation*}
\begin{equation*} \implies \limsup_{N \to \infty} \frac{1}{N^2} \left| \sum_{n=1}^N e(ka_n)\right|^2 \le \frac1{H+1} \end{equation*}

since \(H\) is arbitrary

\begin{equation*} \lim_{N \to \infty} \frac{1}{N^2} \left| \sum_{n=1}^N e(ka_n)\right|^2 = 0\text{.} \end{equation*}

Let \(\deg P = d\) then, for \(d = 1\) we are done by Weyl's criterion, for \(d \le D\) use van der Corput differencing. Note that for fixed \(h\text{,}\) \(P(n+h) -P(n)\) is of lower degree.
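For instance, for \(P(n) = n^2\) and \(\alpha = \sqrt 2\) the normalized Weyl sums visibly decay; a quick check (sample size and threshold chosen loosely):

```python
# Weyl's criterion for {n^2 sqrt(2)}: (1/N) sum_{n<=N} e(h n^2 alpha) -> 0.
import cmath
from math import sqrt, pi

alpha, N = sqrt(2), 50_000
for h in (1, 2):
    s = sum(cmath.exp(2j * pi * h * n * n * alpha) for n in range(1, N + 1))
    # Far below the trivial bound 1; the truth is roughly N^(-1/2) here.
    assert abs(s) / N < 0.05
```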

Subsection 1.7 A different perspective (Ergodic)

Furstenberg (1981 book) gives a different proof that \(\{n ^2 \alpha\}\) is uniformly distributed.

Ergodic theory 101
Definition 1.45 Ergodic measures

Let \(X\) be a locally compact space and \(H\) a non-compact group, \(H \acts X\text{.}\) \(\mu\) a probability measure on \(X\text{,}\) \(H\)-invariant. We say that \(\mu\) is an ergodic measure if any of the following equivalent conditions hold

  1. \(A \subseteq X\) and \(A \) is \(H\)-invariant (\(hA = A\) for all \(h \in H\)). Then
    \begin{equation*} \mu(A) = 1 \text{ or } \mu (A) = 0\text{.} \end{equation*}
  2. If \(f\) is measurable and \(\mu\)-almost everywhere \(H\)-invariant, i.e.
    \begin{equation*} f(hx) = f(x) \end{equation*}
    for almost all \(x\text{,}\) then \(f\) is constant \(\mu\) almost everywhere.
  3. \(\mu\) is an extreme point on the convex set of \(H\)-invariant probability measures.
Definition 1.46 Uniquely ergodic actions

An action of a group \(H\) on a locally compact space \(X\) is uniquely ergodic if there is only one invariant probability measure on \(X\text{.}\)

Example 1.47

\(x\mapsto 5x\text{,}\) or \(x\mapsto \sqrt2 + x\) on \(T^1\) are ergodic, let's show that \(x\mapsto \sqrt 2 + x\) is uniquely ergodic.

Let \(\mu\) be an invariant probability measure on \(\lb 0,1\rb\text{;}\) the \(n\)-th Fourier coefficient is

\begin{equation*} \hat \mu(n) = \int e(nx) \diff \mu (x) = e(n\alpha) \int e(nx) \diff \mu (x) \end{equation*}

so \(\hat \mu(n) = \delta(n)\text{,}\) i.e. \(\hat\mu(n) = 0\) for all \(n \ne 0\text{,}\) and \(\mu\) must be the Lebesgue measure.

Example 1.49

For \(x\mapsto 5x\) \(f = \sum a_ne(nx) = \sum a_n e(5nx)\) so \(a_n = a_{5n}\)

\begin{equation*} \infty \gt \| f\|^2 = \sum_n |a_n|^2 \implies a_n = 0\ \forall n \ne 0 \end{equation*}
\begin{equation*} \implies f \equiv C \text{ a.e.} \end{equation*}
Definition 1.50 Equidistribution

A sequence of probability measures \(\mu_n\) on a locally compact space \(X\) is called \(\mu\)-equidistributed if they converge to \(\mu\) in the weak \(\ast\) topology. I.e.

\begin{equation*} \forall f \in C_c(X),\, \int f\diff \mu_n \to \int f \diff \mu\text{.} \end{equation*}
Remark 1.51

If we have a sequence \(a_n\) these define a sequence of measures

\begin{equation*} \mu_N = \frac 1N \sum_{n \le N} \delta( x - a_n) \end{equation*}

if these are equidistributed then

\begin{equation*} \lim_{N \to \infty} \frac 1 N \sum_{n=1}^N f(a_n) = \int_X f(x) \diff \mu (x)\text{.} \end{equation*}
Remark 1.53

This does not help us!

But the following does:

For any \(x\) construct

\begin{equation*} \mu_L = \frac 1 L \int_0^L \delta_{h_t x} \diff t \end{equation*}
\begin{equation*} \mu_L \to \mu, \, L\to \infty \end{equation*}

in the weak \(\ast\) sense.

Remark 1.55 Funny side remark (Benford's law)

The first digit of a set of observations is usually 1 (roughly 30% of the time).

Why? Assume the process follows a power law, say values \(b^n\text{;}\) the first digit is governed by

\begin{equation*} \log_{10} b^n = n \log_{10} b \end{equation*}

and this first digit is \(k\) iff \(\{n \log_{10} b\} \in \lb \log_{10} k , \log_{10} (k + 1) )\text{.}\) If these are equidistributed then the probability of \(k\) is

\begin{equation*} \int_{\log_{10} k}^{\log_{10} (k+1)} 1 \diff x = \log_{10}(1 + 1/k)\text{.} \end{equation*}
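Powers of \(2\) give a concrete instance, since \(\{n \log_{10} 2\}\) is uniformly distributed; a quick check:

```python
# Benford's law for 2^n: leading digit k occurs with frequency log10(1 + 1/k).
from math import log10

N = 2000
counts = [0] * 10
for n in range(1, N + 1):
    counts[int(str(2 ** n)[0])] += 1   # leading decimal digit of 2^n

for k in range(1, 10):
    assert abs(counts[k] / N - log10(1 + 1 / k)) < 0.01
```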

Furstenberg

  1. Construct a suitable dynamical system on \(T^2\) (2-torus), for which a specific orbit gives the sequence \(\{n^2 \alpha \}\text{.}\)
  2. Will show that this is uniquely ergodic.
\begin{equation*} T \colon T^2 \to T^2 \end{equation*}
\begin{equation*} (x,y) \mapsto( x+\alpha, y + 2x + \alpha) \end{equation*}
\begin{equation*} T^n_\alpha(x,y) = (x+n\alpha, y+ 2nx + n^2 \alpha) \end{equation*}

in particular the orbit of \((0,0)\) is

\begin{equation*} T^n(0,0) = (n\alpha, n^2 \alpha) \end{equation*}

so if we show that \(T^n(0,0)\) is equidistributed we are done.
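The iteration formula is easy to verify numerically (the tuple assignment below evaluates the right-hand side with the old \(x\)):

```python
# Iterating T(x, y) = (x + alpha, y + 2x + alpha) mod 1 from (0, 0)
# reproduces the orbit ({n alpha}, {n^2 alpha}).
from math import sqrt

alpha = sqrt(2)
x, y = 0.0, 0.0
for n in range(1, 200):
    x, y = (x + alpha) % 1.0, (y + 2 * x + alpha) % 1.0  # old x on the right
    dx = abs(x - (n * alpha) % 1.0)
    dy = abs(y - (n * n * alpha) % 1.0)
    # Compare as points on the circle, to be safe near the wraparound at 0.
    assert min(dx, 1 - dx) < 1e-8 and min(dy, 1 - dy) < 1e-8
```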

So we show that \(X,T\) is uniquely ergodic.

  1. Lebesgue is ergodic.
  2. Only ergodic measure
  1. \begin{equation*} f\in L^2 (T^2, \diff m) \end{equation*}
    \(T\)-invariant
    \begin{equation*} f(x,y) = \sum_{m,n} a_{m,n} e(mx+ ny) \end{equation*}
    \(T\)-invariance implies \(a_{m,n} = e((m+n)\alpha) a_{m+2n,n}\) in particular
    \begin{equation*} |a_{m,n} | = |a_{m+2n,n}| \end{equation*}
    so
    \begin{equation*} a_{m,n} = 0 \end{equation*}
    if \(n \ne 0\text{,}\) by the Riemann-Lebesgue lemma (the equal-magnitude coefficients \(a_{m+2kn,n}\) must tend to \(0\) as \(k \to \infty\)). So for \(n= 0\)
    \begin{equation*} \sum a_{m,0} e(mx) \end{equation*}
    \(T\)-invariance implies \(a_{m,0} = e(m\alpha) a_{m,0}\) so \(a_{m,0} = 0\) for all \(m \ne 0\text{.}\) So \(f\) is constant almost everywhere.
  2. We use the following:

    Claim:

    \(g \colon T^1 \to T^1\) measurable, let \(T_g \colon T^2 \to T^2\)

\begin{equation*} (x,y) \mapsto (x+\alpha, y+ g(x)) \end{equation*}

    Then if the Lebesgue measure \(m\) is \(T_g\)-ergodic, \(T_g\) is in fact uniquely ergodic.

    Proof: exercise.