These slides are online at:
I'll report on two projects
CPP 2023
and
(forthcoming)
both projects in algebraic number theory / arithmetic geometry
Goal: Formalize more modern (20th century) work in number theory and arithmetic geometry, such as solving:
What are the integer solutions x,y∈Z of y2=x3−13?
theorem Mordell_minus13 (x y : ℤ) (h : y^2 = x^3 - 13) : (x, y) = (17, 70) ∨ (x, y) = (17, -70) := ...
These types of problems are known as Diophantine equations.
Congruences (Ancient, modern form due to Euler/Gauss 1750-1850) 2y2=4x3−13 has no integer solutions, as the LHS is even and the RHS is odd.
Descent (Ancient, used heavily by Fermat, 1600s) y2=2x2 has no (nonzero) integer solutions, as given a solution (x,y) we can show that (x/2,y/2) is also an integer solution, so there cannot be a smallest one (except (0,0)).
Quadratic Reciprocity (Gauss, ~1800) 2y2=x4−41z4 has 1 solution with x=y=z=0, otherwise: if q∣y then 41 is a square modulo q, this implies that y is a square modulo 41 so that 2 is a 4th power mod 41 (QR), but this is false!
Laurent Thery, 2004 in Coq: x2+y2=p has a solution for all primes p that are 1 mod 4.
Delahaye, Mayero, 2005, in Coq: x4+y4=z4 have no solutions in the non-zero integers. Also solve x2+y2=z2 and xy=2t2.
Oosterhuis, 2007, in Isabelle: x2+3y2=p has a solution for all primes p that are 1 mod 3 (also 1 mod 4 result), also non-existence of solutions to x3+y3=z3 and x4+y4=z4.
This type of equation is known as a Mordell equation, studied by mordell in early 1900s,
If x,y satisfy y2=x3−13 then (y+−13)(y−−13)=x3, we can check that
y+−13 and y−−13 are coprime.
If this implies that y+−13 and y−−13 are both cubes then we can conclude that y+−13=(a+b−13)3 =(a3−39ab2)+(3a2b−13b3)−13 =a(a2−39b2)+b(3a2−13b2)−13 ⟹b=±1 ⟹1=±(3a2−13) ⟹a=±2,b=−1⟹y=±70,x=17
In the integers it is almost true that if x,y are coprime and xy=zn then there are some x′,y′ such that x=x′n,y=y′n the only issue is ±1.
(−4)(−9)=36=62 but neither −4, −9, is literally a square.
In number rings we also have issues: Observe that (6+−13)(2−3−13)=51−16−13=(8−−13)2 but 6+−13 and 2−3−13 are coprime but both are not squares in Z[−13]!
Problem: When does r⋅s=t3⟹r=t13 and s=t23? in a number ring
The theory of class groups of number fields is a key concept in number theory, it controls this sort of factorization. For a number field K the definition of the ideal class group is Cl(K):={I nonzero ideal of OK}/I∼J⟺xI=yJ for some x,y∈OK
this forms a group under multiplication of ideals, elements are called ideal classes.
Here every principal ideal is equivalent to the ideal (1) (and this is an iff).
The advantage of using ideals instead of elements is that we have an analogue of unique factorization into primes for ideals of a Dedekind domain.
In the previous example: (6+−13)(2−3−13)=(8−−13)2 this factors on the level of ideals as (7,6+−13)2(11,2−3−13)2=(8−−13)2=(7,6+−13)2(11,2−3−13)2
But the ideals (7,6+−13),(11,2−3−13) are non-principal, non-trivial in the class group.
The upshot of this is that we have a finite abelian group Cl(K) such that
gcd(n,#Cl(K))=1⟹∃u,v∈OK×,r⋅s=tn⟹r=ut1n and s=vt2n
If we can compute the class group and the unit group (modulo nth powers) explicitly we can push the previous proof through.
The definition of, and finite abelian group structure of the class group was formalized in Lean by Baanen, Dahmen, Ashvni (Narayanan), Nuccio 2021.
Note that the definition is inherently noncomputable, without a given set of a generators there is no guarentee that all elements have been found. Checking that elements are non-trivial is also hard (and an interesting problem to try and certify), but at worst only makes your life harder later.
The unit group in the case of Q(−d) is fairly easy to calculate, using the norm, in all cases it has order a power of 2, so doesn't impact whether things are cubes or not.
Often proofs require explicit calculation one can also state general results, but number theorists are interested in classifying single cases, does it ever happen that the difference between the area of an integer sided square and the volume of an integer sided cube equals 13?
General / conceptual | Specific / calculational |
---|---|
All equations y2=x3+d have finitely many solutions | The solutions of y2=x3−13 have x=17 |
There is a bound on the norm of a generator of an element of a class group | Every element of the class group of Q(−13) contains the integer 12 |
All class groups of number fields are finite | The class group of Q(−13) is Z/2 |
Need to calculate the size of the class group, to check that it is not divisible by 3
Do this in two steps, bounding/listing the possible generators, and checking their order/relations.
Given a list of possible generators we can calculate the full group by checking whether various combinations of the generators are trivial or not.
Verifying that a particular ideal is trivial in the class group is easy, just exhibit a single generator.
To show that an ideal is not principal we make use of the norm, if an ideal I=(x) then Nm(I)=Nm(x), so if there are no solutions x to this equation the ideal is not principal. In our case this comes down to solving things like:
x2+13y2=2 which can be resolved via elementary methods.
Theorem (Minkowski): Every class in the ideal class group of K contains an integral ideal of norm not exceeding Minkowski's bound
MK=∣D∣(π4)r2nnn!
the key ingredient of this theorem is Minkowski's (fundamental) theorem, on lattice vectors inside convex symmetric bodies:
This ingredient has been formalized in Isabelle + other HOL style proof assistants, and also in Lean!
But the proof still involves a bit of work, and a lower tech result gives something more useful in our case.
We actually compute these groups for a handful of quadratic fields fully, which is of independent interest, we use:
theorem exists_mk0_eq_mk0' [is_dedekind_domain S] (I : (ideal S)⁰) (M : finset R) (hM : ∀ m ∈ M, algebra_map R S m ≠ 0)
(hex : ∀ (a : S) {b : S}, b ≠ 0 → (∃ (q : S) (r : R) (H : r ∈ M),
abv (algebra.norm R (r • a - q * b)) < abv (algebra.norm R b))) :
∃ (J : (ideal S)⁰), class_group.mk0 I = class_group.mk0 J ∧ algebra_map _ _ (∏ m in M, m) ∈ (J : ideal S) := ...
If you can find a set of denominators {di} such that every element of the number field has difference from some other element with denominator in {di} of norm less than 1 then every element of the class group contains ∏idi.
This reduces the problem of finding generators of the class group to that of covering the number field by lattice centered "balls"
We need to do lots of calculations in rings like Z[−13] (but also aim at more general examples like Z[32])
Want to explicitly evaluate expressions like (1+−13)(2−2−13) at many points in the proof and calculation of class (and unit) groups.
(0,1)2=(−13,0) As the multiplication is linear we can define it via a matrix on the standard basis (the times table of the ring). We introduce a tactic that uses the times table to do explicit calculations quickly
example (a b c d : ℤ) : (a + b * √2 + c * √3 + d * √2 * √3 : sqrt_2_sqrt_3) *
(a - b * √2 + c * √3 - d * √2 * √3 : sqrt_2_sqrt_3) *
(a + b * √2 - c * √3 - d * √2 * √3 : sqrt_2_sqrt_3) *
(a - b * √2 - c * √3 + d * √2 * √3 : sqrt_2_sqrt_3) =
a^4 - 4*a^2*b^2 + 4*b^4 - 6*a^2*c^2 - 12*b^2*c^2 + 9*c^4 + 48*a*b*c*d - 12*a^2*d^2 - 24*b^2*d^2 - 36*c^2*d^2 + 36*d^4 :=
begin
apply sqrt_2_sqrt_3.times_table.ext (λ k, _),
fin_cases k; times_table,
end
Have been considered (often) before
Harrison-Théry 1997, HOL + Maple
Lewis-Wu 2021, Lean + Mathematica
We propose a prototypical link from Lean to SageMath (free OSS)
This is designed to be robust, connecting to Sage on the user's computer or via the web (c.f. polyrith)
Rather than ad hoc we want a system that can be extended in future with less effort (meta-meta-programming).
We can use this to implement tactics to prove ideal equalities and to factor ideals and polynomials, that call Sage to do some calcuations whose results are then verified.
meta def replace_certified_sage_equality
(matcher : expr → tactic bool)
(reify_type : Type∗ )
(reify : expr → tactic reify_type)
(sage_input : reify_type → tactic string)
(output_type : Type∗ )
[has_from_sage output_type]
(convert_output : output_type → tactic expr)
(validator : expr → expr → tactic expr)
(name : string)
(wit : parse (tk "with" ∗ > pexpr)?) :
tactic unit
With Sacha Huriot-Tattegrain and Sander Dahmen
Tate's algorithm concerns, elliptic curves: y2+a1xy+a3y=x3+a2x2+a4x+a6 with coefficients in a discrete valuation ring (v(ab)=v(a)+v(b),v(a+b)≥max(v(a),v(b)))
It essentially computes invariants of a nice scheme theoretic model of the curve.
Much of the work involves making changes of variables (x,y)↦(ux+r,uy+sx+t) to ensure that the coefficients ai (or various polynomials in them b2,b4,b6,b8 and c4,c6) have sufficiently high valuation.
Tate's algorithm:
Bridging the gap between ITPs and computer algebra
Papadopoulos I. Sur la classification de Néron des courbes elliptiques en caractéristique résiduelle 2 et 3
The tables of results go on for several pages, and the proofs many more (34 pages in the published version)
The contents of these tables is used in current research and proofs surrounding the generalized Fermat equation.
This is the sort of mathematics that formalization may actually make clearer, detailed analysis of the behaviour of a complicated procedure under different assumptions on the input.
The gold standard is to implement an algorithm in a proof assistant and prove it computes the same function as a natural "mathematical" definition in the system (we do not do this)
Another possibility is simply to implement the algorithm as described in a paper, and verify that its properties agree with what is claimed in theory. Here we use the algorithm itself as the definition of the function that it computes.
By implementing the algorithm in a descriptive, rather than low level way we still gain some confidence in correctness. We can prove substeps and transformation rules are correct.
By verifying enough properties the algorithm can sometimes be seen, on paper, to actually implement the desired function, even before the actual definition can be formalized.
We have formalized two versions of the algorithm:
Both are fast enough to run in practice (even without lots of optimization)
The project began with Sacha's internship at the VU ~1.5 years ago, so we were early users of Lean 4. Several bugs in Lean4/Mathlib4 tactics and interfaces found and many fixed, and even a very subtle one in the compiler core were discovered (hit in 0.0000001% of examples).
Current goals:
This may be the first ever implementation of the algorithm that runs on such rings
The algorithm itself is long and complicated in the geneneral case, proof of termination is via an a priori bounded quantity that increases every time the subprocedure runs twice.
Many goals combining known inequalities on valuations of our invaraints to obtain new ones
R: Type u
inst✝: CommRing R
inst: IsDomain R
p: R
evr: EnatValRing p
e: Model R
h1: v evr.valtn e.a1 ≥ 1
h2: v evr.valtn e.a2 ≥ 2
⊢ v evr.valtn (e.a1 * e.a1 + 4 * e.a2) ≥ 2
In order to formalize this cleanly we need specific api around, as well as decision procedures for the ordered semiring N∪∞
We simply do cases on each variable, simplifying inequalities that contain infinity to simpler statements and recursing, eventually reching a state where existing automation applies.
This still requires a non-trivial implementation to get this right.
Alternative approach using metavariables and Aesop was also promising.
lemma v_discr_of_v_ai {p : R} {q : ℕ} (valp : SurjVal p) (e : ValidModel R) (hq : q > 1)
(h1 : valp e.a1 ≥ 1) (h2 : valp e.a2 = 1) (h3 : valp e.a3 ≥ q)
(h4 : valp e.a4 ≥ q + 1) (h6 : valp e.a6 ≥ 2 * q) :
valp e.discr ≥ 2 * q + 3 := by
have h2' : valp e.b2 ≥ 1 := v_b2_of_v_a1_a2 valp e h1 h2
have h4' : valp e.b4 ≥ q + 1 := v_b4_of_v_a1_a3_a4 valp e h1 h3 h4
have h6' : valp e.b6 ≥ 2 * q := v_b6_of_v_a3_a6 valp e h3 h6
have h8' : valp e.b8 ≥ 2 * q + 1 := v_b8_of_v_ai valp e h1 h2 h3 h4 h6
apply_rules [val_add_ge_of_ge, val_sub_ge_of_ge] <;>
. simp
elinarith
if done by hand we end up with a 50 line argument here!
More generally there is in fact: Effective Quantifier Elimination for Presburger Arithmetic with Infinity, Aless Lasaruk and Thomas Sturm
Similarly we would benefit from a tactic for working with ring expressions modulo p or modulo given relations.
We still did a lot of experimentation and copy-pasting with computer algebra systems to write down the necessary transformations in general.
Thesis: more meta-meta-programming needed to make it easier for the formalizer to implement such tactics.
http://lmfdb.xyz/EllipticCurve/Q/14/a/1
For the invitation, and you for listening!
from sage.plot.plot3d.shapes import *
S = 0
N = 6
for x in range(N):
for y in range(N):
for z in range(N):
if y == 0 and z <= N - 2 and x <= N - 2:
continue
S += Box([.49,.49,.49], color='lightgray', alpha=1).translate(x,y,z)
S.scale(.1,.1,.1).show()
sage: S=points([])
....: for a in srange(-5,5):
....: for b in srange(-5,5):
....: S += implicit_plot(equify((x-a)^2 +13 * (y-b)^2 < 1), (-5,5), (-5,5))
....:
sage:
....: for a in srange(-10,10):
....: for b in srange(-10,10):
....: S += implicit_plot(equify((2*x-a)^2 +13 * (2*y-b)^2 < 1), (-5,5), (-5,5),color="red")
....:
sage:
....: for a in srange(-15,15):
....: for b in srange(-15,15):
....: S += implicit_plot(equify((3*x-a)^2 +13 * (3*y-b)^2 < 1), (-5,5), (-5,5),color="green")
S
sage: S
Launched png viewer for Graphics object consisting of 1400 graphics primitives
sage: S=points([])
....: for a in srange(-5,5):
....: for b in srange(-5,5):
....: S += implicit_plot(equify((x-a)^2 +13 * (y-b)^2 < 1), (-5,5), (-5,5))
....:
sage: S
Launched png viewer for Graphics object consisting of 100 graphics primitives
sage:
....: for a in srange(-10,10):
....: for b in srange(-10,10):
....: S += implicit_plot(equify((2*x-a)^2 +13 * (2*y-b)^2 < 1), (-5,5), (-5,5),color="red")
....:
S
sage: S
sage: from sage.plot.plot3d.shapes import *
....: def t(s,N=17, M=70):
....: S = 0
....: for x in range(N):
....: for y in range(N):
....: for z in range(N):
....: boxpos = (1-s)*vector([x,y,z]) + s*vector([(x + N*y + N^2*z) % M,(x+ N*y + N^2*z)//M,0])
....: S += Box([.45,.45,.45], color='lightgray',axis=False).translate(boxpos)
....: return S
....: animate([t(i) for i in srange(0,1.05, 0.1)])
Launched gif viewer for Animation with 11 frames