
Section 2.9 Belyi's theorem, effective Mordell and ABC (Angus)

We begin with one of the most famous results in arithmetic geometry.

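This is the Mordell conjecture: a curve of genus \(g \ge 2\) over a number field \(K\) has only finitely many \(K\)-rational points. There are many proofs of this, Faltings's being the original and most famous.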

Remark 2.9.2.

Faltings's proof is not effective. That is, it gives no bound on the heights of the rational points, and hence no procedure for finding them all.

Today we'll show how this theorem follows from a (much harder) conjecture, and how this nonetheless gives new insight into the question of effectiveness. Specifically, we'll show that ABC implies Mordell.

“Mordell is as easy as ABC” - Zagier

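The ABC conjecture, in the form used below, asserts that for every \(\epsilon \gt 0\) there is a constant \(c_\epsilon \gt 0\) such that

\begin{equation*} N(A,B,C) \gt c_\epsilon H(A,B,C)^{1-\epsilon} \end{equation*}

for all nonzero coprime integers \(A,B,C\) with \(A+B+C=0\text{,}\) where \(H(A,B,C) = \max(|A|,|B|,|C|)\) and \(N(A,B,C) = \prod_{p \mid ABC} p\text{.}\) This is a remarkably deep statement about the integers: it says something surprising about how their additive and multiplicative structures compare.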

For our purposes (connecting it to curves and to Mordell) we'd like to remove the dependence on integrality and coprimality by making it scaling invariant.

We now define

\begin{equation*} H(A,B,C) = \prod_{v}\max(|A|_v,|B|_v,|C|_v) \end{equation*}
\begin{equation*} N(A,B,C) = \prod_{p\in I} p \end{equation*}

for

\begin{equation*} I = \{p \text{ prime} : \max(|A|_p,|B|_p,|C|_p) \gt \min(|A|_p,|B|_p,|C|_p)\}\text{.} \end{equation*}

Both functions are scaling invariant, that is,

\begin{equation*} H(\lambda A,\lambda B,\lambda C) = H(A,B,C) \end{equation*}
\begin{equation*} N(\lambda A,\lambda B,\lambda C) = N(A,B,C) \end{equation*}

for \(\lambda, A,B,C \in \QQ\units\text{.}\) Moreover, if \(A,B,C \in \ZZ\) and \(\gcd(A,B,C) = 1\) then we recover the original definition.
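
As a minimal sketch of these definitions over \(\QQ\), the following Python computes \(H\) and \(N\) by first rescaling to a coprime integer triple, which scaling invariance permits. The function names (`primitive_triple`, `height`, `conductor`) are hypothetical helpers introduced here, not notation from these notes, and trial division is only suitable for small inputs.

```python
from fractions import Fraction
from math import gcd, lcm, prod


def primitive_triple(A, B, C):
    """Rescale nonzero rationals (A, B, C) to integers with gcd 1."""
    fracs = [Fraction(x) for x in (A, B, C)]
    denom = lcm(*(f.denominator for f in fracs))
    ints = [int(f * denom) for f in fracs]
    g = gcd(*ints)
    return tuple(i // g for i in ints)


def prime_divisors(n):
    """Set of primes dividing the nonzero integer n (trial division)."""
    n = abs(n)
    primes = set()
    p = 2
    while p * p <= n:
        if n % p == 0:
            primes.add(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.add(n)
    return primes


def height(A, B, C):
    # For a coprime integer triple every non-archimedean factor of H is 1,
    # so H is the largest absolute value.
    return max(abs(x) for x in primitive_triple(A, B, C))


def conductor(A, B, C):
    # For a coprime integer triple, the set I consists of the primes
    # dividing the product abc.
    a, b, c = primitive_triple(A, B, C)
    return prod(prime_divisors(a * b * c))


lam = Fraction(5, 7)
print(height(1, 8, -9), conductor(1, 8, -9))                          # 9 6
print(height(lam, 8 * lam, -9 * lam), conductor(lam, 8 * lam, -9 * lam))  # 9 6
```

The two printed lines agree, illustrating the invariance under \(\lambda = 5/7\) for the triple \(1 + 8 - 9 = 0\).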

Since we have \(A+ B+C = 0\) and our functions are scaling invariant, they only depend on \(r= - A/B\text{.}\) We'll also reformulate it over an arbitrary number field \(K\text{.}\)

Note that to satisfy the hypotheses of the conjecture we require

\begin{equation*} r \in \PP^1_K \smallsetminus \{0,1,\infty\}\text{.} \end{equation*}

We now define

\begin{equation*} H(r) = \prod_{v}\max(1,|r|_v) \end{equation*}
\begin{equation*} N(r) = \prod_{p\in I} p \end{equation*}

for

\begin{equation*} I = \{p \text{ prime} : \max(v_p(r), v_p(1/r), v_p(r-1)) \gt 0 \}\text{.} \end{equation*}
Remark 2.9.5.

In fact this new height is off from the old one by a constant factor, but since ABC allows for a constant factor this won't trouble us.
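
To illustrate the reformulation in the simplest case \(K = \QQ\), here is a small Python sketch computing \(H(r)\) and \(N(r)\) for a rational \(r \ne 0, 1\). The names `height_r` and `conductor_r` are hypothetical, and the formula \(H(r) = \max(|a|,|b|)\) for \(r = a/b\) in lowest terms is the standard evaluation of the product over places.

```python
from fractions import Fraction
from math import prod


def prime_divisors(n):
    """Set of primes dividing the nonzero integer n (trial division)."""
    n = abs(n)
    primes = set()
    p = 2
    while p * p <= n:
        if n % p == 0:
            primes.add(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.add(n)
    return primes


def height_r(r):
    """H(r) = prod_v max(1, |r|_v) for a nonzero rational r."""
    r = Fraction(r)
    return max(abs(r.numerator), abs(r.denominator))


def conductor_r(r):
    """Product of the primes p with v_p(r), v_p(1/r) or v_p(r - 1) positive."""
    r = Fraction(r)
    if r in (0, 1):
        raise ValueError("r must avoid 0, 1 and infinity")
    primes = (
        prime_divisors(r.numerator)            # v_p(r) > 0
        | prime_divisors(r.denominator)        # v_p(1/r) > 0
        | prime_divisors((r - 1).numerator)    # v_p(r - 1) > 0
    )
    return prod(primes)


# The triple (A, B, C) = (1, 8, -9) corresponds to r = -A/B = -1/8.
r = Fraction(-1, 8)
print(height_r(r), conductor_r(r))  # 8 6
```

For this triple \(H(A,B,C) = 9\) while \(H(r) = 8\), and \(N\) is unchanged. Over \(\QQ\) the ratio \(H(A,B,C)/H(r)\) always lies between 1 and 2, since \(|A+B| \le 2\max(|A|,|B|)\).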

Motivation: ABC implies Fermat bound.

One can see this simply by assuming a solution in coprime positive integers

\begin{equation*} x^n + y^n =z^n ,\, n \ge3 \end{equation*}

and setting

\begin{equation*} (A,B,C)= (x^n,y^n, -z^n) \end{equation*}

then

\begin{equation*} N(A,B,C) = \prod_{p|ABC} p \le |xyz| \lt \max(|x|^3,|y|^3,|z|^3) = H(A,B,C)^{3/n}\text{.} \end{equation*}

So choosing any

\begin{equation*} 0 \lt \epsilon \lt 1 - 3/n \end{equation*}

(which is possible once \(n \ge 4\)), any solution \((A,B,C)\) such that \(H(A,B,C)\) is sufficiently large contradicts ABC. Thus ABC gives us a bound on the possible solutions to the Fermat equation, reducing the remainder of the conjecture to a finite computation.
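
Spelled out as a sketch, assume ABC is used in the form \(N(A,B,C) \gt c_\epsilon H(A,B,C)^{1-\epsilon}\) (the constant \(c_\epsilon\) is notation assumed here). Combining this with the upper bound for \(N(A,B,C)\) above gives

\begin{equation*} c_\epsilon H(A,B,C)^{1-\epsilon} \lt N(A,B,C) \lt H(A,B,C)^{3/n}\text{,} \end{equation*}

and hence

\begin{equation*} H(A,B,C)^{1 - \epsilon - 3/n} \lt c_\epsilon^{-1}\text{.} \end{equation*}

Whenever \(\epsilon \lt 1 - 3/n\) the exponent on the left is positive, so \(H(A,B,C) = \max(|x|^n,|y|^n,|z|^n)\) is bounded in terms of \(\epsilon\) and \(n\) alone.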

Let us phrase this in the following alternate way: Let

\begin{equation*} F_n \colon x^n + y^n + z^n = 0 \end{equation*}

be the Fermat curve and consider the function

\begin{equation*} f\colon F_n \to \PP^1 \end{equation*}
\begin{equation*} (x:y:z) \mapsto -\left(\frac{x}{y}\right)^n \end{equation*}

which is ramified only over \(0,1, \infty\text{.}\)

Note 2.9.6.

\(\deg(f) = n^2\)

Each of \(0,1,\infty\) has \(n \) preimages in \(F_n(\overline \QQ)\text{.}\)

The idea now is that \(N(A,B, C)\) is measuring ramification, while \(H(A, B,C)\) is a height function. The note above tells us that each of \(0, 1, \infty\) contributes a factor of \(O(H(A,B,C)^{n/n^2})\) to \(N(A,B,C)\text{.}\) So in this formulation, what we used was the existence of a rational function \(f\) such that

\begin{equation*} \#\{P\in C(\overline \QQ): f(P) \in \{0,1,\infty\}\} \lt \deg (f)\text{.} \end{equation*}

If \(C\) has genus 0 or 1, no such \(f\) can exist (hint: Riemann-Hurwitz).
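
One way to unpack the hint, as a sketch: write \(d = \deg(f)\text{,}\) let \(m\) be the number of points of \(C(\overline \QQ)\) lying over \(\{0,1,\infty\}\text{,}\) and let \(e_P\) be the ramification index of \(f\) at \(P\) (this notation is assumed here). The three fibres over \(0, 1, \infty\) contribute \(\sum (e_P - 1) = 3d - m\) to the Riemann-Hurwitz formula, and any further ramification only adds to the sum, so

\begin{equation*} 2g - 2 = -2d + \sum_{P} (e_P - 1) \ge -2d + (3d - m) = d - m\text{.} \end{equation*}

Hence \(m \ge d + 2 - 2g\text{,}\) which gives \(m \ge d + 2\) for \(g = 0\) and \(m \ge d\) for \(g = 1\text{.}\) In both cases \(m \lt d\) is impossible.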

ABC implies a bound on Mordell.

We begin with a technical proposition:

The genus 0 case follows from the fact that \(f\) is a rational function (and in fact the error term is \(O(1)\)) (exercise). For the general case we need the theory of log heights on curves. From this we require the following:

  • For \(D\) a divisor on \(C\) we have a height function
    \begin{equation*} h_D(\cdot) \end{equation*}
    which is well defined up to \(O(1)\text{.}\)
  • If
    \begin{equation*} D= \sum m_k D_k \end{equation*}
    is a decomposition into irreducible divisors, then
    \begin{equation*} h_D(P) = \sum m_k h_{D_k}(P)\text{.} \end{equation*}
  • For \(\Delta\) a degree 0 divisor
    \begin{equation*} h_{\Delta} (P) = O(\sqrt{\log H(f(P))} + 1)\text{.} \end{equation*}

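Let \(D = \divisor_0(f) = \sum m_k D_k\) be the divisor of zeros of \(f\text{,}\) so that \(\deg D = d\text{,}\) and let \(D' = \sum D_k\) be the sum of the distinct zeros, so that \(\deg D' = \# f\inv(0)\) and \(b_f(0) = \deg D - \deg D'\text{.}\) Then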

\begin{equation*} \log H(f(P)) = h_D(P) + O(1) = \sum m_k h_{D_k}(P) + O(1) \end{equation*}

since \(\log H(f(P))\) is also a height function relative to \(D\text{.}\) We now turn to \(N_0(f(P))\text{.}\) Any prime dividing \(N_0(f(P))\) must also contribute to \(h_{D_k}(P)\) for some \(k\text{,}\) except for primes in the finite set \(\{p : p|f \text{ or } p \text{ of bad reduction for } C\}\text{.}\) Then

\begin{equation*} \log N_0(f(P)) \lt \sum h_{D_k}(P) + O(1) = h_{D'}(P) + O(1)\text{.} \end{equation*}

Letting

\begin{equation*} \Delta = (\deg D) D' - (\deg D') D \end{equation*}

we have

\begin{equation*} h_{\Delta} (P) = O(\sqrt{\log H(f(P))} + 1) \end{equation*}

thus

\begin{equation*} \log N_0(f(P)) \lt h_{D'} (P) + O(1) \end{equation*}
\begin{equation*} = \frac{1}{\deg D} (\deg D) h_{D'} (P) + O(1) \end{equation*}
\begin{equation*} = \frac{1}{\deg D} (\deg D') h_{D} (P) + O(\sqrt{\log H(f(P))} + 1) \end{equation*}
\begin{equation*} = \left(1- \frac{b_f(0)}{d}\right) \log H(f(P)) + O(\sqrt{\log H(f(P))} + 1)\text{.} \end{equation*}
Remark 2.9.9.

One can show the above for \(N_1\) and \(N_\infty\) instead, replacing \(f\) by \(f - 1\) and \(1/f\) respectively.

Adding the three terms together we get

\begin{equation*} \log N_0(f(P))N_1(f(P)) N_\infty(f(P)) \end{equation*}
\begin{equation*} \lt \left(\left(1- \frac{b_f(0)}{d}\right) +\left(1- \frac{b_f(1)}{d}\right)+\left(1- \frac{b_f(\infty)}{d}\right)\right) \log H(f(P)) + O(\cdots) \end{equation*}
\begin{equation*} \log N(f(P)) \lt \frac 1d \left(\# f\inv(0)+\# f\inv(1)+\# f\inv(\infty) \right) \log H(f(P)) + O(\cdots) \end{equation*}
\begin{equation*} = \frac md \log H(f(P))+ O(\cdots) \end{equation*}

where

\begin{equation*} m = \#\{P \in C(\overline \QQ) : f(P) \in \{0,1,\infty\}\} \end{equation*}

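Exponentiating, we get

\begin{equation*} N(f(P)) \lt H(f(P))^{m/d} \exp\left(O(\sqrt{\log H(f(P))} + 1)\right)\text{,} \end{equation*}

where the last factor grows more slowly than any positive power of \(H(f(P))\text{,}\) so it can be absorbed into the \(\epsilon\) of ABC.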

Let \(C\) be a given curve of genus \(g \ge 2\text{.}\) Belyi's theorem gives a function

\begin{equation*} f\colon C \to \PP^1 \end{equation*}

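ramified only over \(\{0,1,\infty\}\text{.}\) By Riemann-Hurwitz, \(m = d + 2 - 2g\text{,}\) where \(d = \deg (f)\) and \(m\) is as above. Thus \(m \lt d\text{,}\) so we can pick \(0 \lt \epsilon \lt 1 - \frac md\text{.}\) Then any rational point \(P\) with \(H(f(P))\) sufficiently large would give a counterexample to ABC. Hence \(H(f(P))\) is bounded on the rational points of \(C\text{,}\) and since only finitely many points have height below a given bound, \(C\) has only finitely many rational points.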
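
As a quick numerical check of this last step, the sketch below computes \(m\) and the exponent \(m/d\) from \(d\) and \(g\text{,}\) and tests it on the Fermat curve. The function names are hypothetical, and the genus formula \((n-1)(n-2)/2\) for a smooth plane curve of degree \(n\) is a standard fact assumed here rather than taken from these notes.

```python
from fractions import Fraction


def belyi_exponent(d, g):
    """Return (m, m/d) for a degree-d map branched only over 0, 1, infinity
    on a curve of genus g, using Riemann-Hurwitz in the form m = d + 2 - 2g."""
    m = d + 2 - 2 * g
    return m, Fraction(m, d)


def fermat_data(n):
    """Degree of f = -(x/y)^n and genus of the Fermat curve F_n (assumed formula)."""
    d = n * n
    g = (n - 1) * (n - 2) // 2
    return d, g


d, g = fermat_data(5)
m, exponent = belyi_exponent(d, g)
print(d, g, m, exponent, 1 - exponent)  # 25 6 15 3/5 2/5
```

For \(n = 5\) this recovers \(m = 3n = 15\) preimages, in agreement with Note 2.9.6, and the admissible range \(0 \lt \epsilon \lt 2/5 = 1 - 3/n\) from the Fermat discussion.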

Remark 2.9.11. Closing remarks.

Belyi's theorem gives an algorithm for determining \(f \colon C \to \PP^1\text{,}\) i.e. it is effective.

One can also show ABC implies Siegel's theorem.

In fact it can be shown that a particular effective form of Mordell (applied to \(y^2 + y = x^5\)) for all number fields implies ABC. This is related to Szpiro's conjecture.

References:

  1. Elkies - ABC implies Mordell
  2. Serre - Lectures on Mordell-Weil