My Brain is Open
Alex J. Best's blog, posts on maths, programming and anything else I feel like.
https://alexjbest.github.io/blog/
Thu, 07 Dec 2017 21:08:43 +0000
Jekyll v3.6.2
Quirky Facts: Group objects in the category of groups
<blockquote>
<p>Recently I was thinking some more about (abelian) groups and I was led to the question of what the group objects in the category of groups are.
This is a funny question in and of itself and the answer also has a sense of humor, hence a new series “Quirky facts”.
I don’t just work on amusing consequences of axioms of algebra, honestly…
But it is fun and provides excellent blog fodder.</p>
</blockquote>
<p>So a group object in some category (that has a terminal object and products) is just a few morphisms that give an object the structure of a group.
By way of some examples: groups are group objects in the category of sets, Lie groups are the group objects in smooth manifolds, algebraic groups are the group objects in the category of algebraic varieties, the list goes on.</p>
<p>Thinking about this too late at night one is led to the question: what is a group object in the category of groups?
Is it all the groups? Some subset perhaps?
Are there more somehow (like how a set can often be given two or more group structures)?</p>
<p>In any event I encourage the reader to try and work it out for a while before reading on, it’ll be worth it, I promise!</p>
<p>With that said, let’s begin. Let $G$ be a group object in the category of groups, so we have $\times\colon G \times G \to G$ and $e\colon \{1\} \to G$ and $i\colon G \to G\text{,}$ all group homomorphisms.
$G$ is itself a group, so we’ll denote its own product by $\cdot$. The first thing to note is that, as the group object identity map $e$ must be a group homomorphism, the identity element for $\times$ must be the same as that of the underlying group operation $\cdot$, as $\{1\}\ni 1 \mapsto 1 \in G$ via $e$.</p>
<p>Now we have a set $G$ with essentially two group operations on it, $\cdot$ and $\times\text{.}$ The fact that $\times$ has to be a group morphism, and that the group structure on the product group $G \times G$ is given by $(a_1,b_1)\cdot(a_2,b_2) = (a_1a_2,b_1b_2)$, means that</p>
<script type="math/tex; mode=display">\begin{equation*}
(a_1\times b_1)\cdot (a_2\times b_2) = (a_1\cdot a_2) \times(b_1\cdot b_2)
\end{equation*}</script>
<p>for any $a_1,a_2,b_1,b_2 \in G\text{.}$ As this is symmetric in $\cdot , \times$ this also says that the group $(G, \times)$ has a group object structure given by $\cdot\text{!}$ At this point one might start to wonder, is $\cdot = \times\text{?}$ So let’s throw in some elements, what about</p>
<script type="math/tex; mode=display">\begin{equation*}
a \cdot b = (a\times 1) \cdot ( 1 \times b) = (a\cdot 1) \times (1\cdot b) = a\times b
\end{equation*}</script>
<p>Ah hah! So the group operations were really the same!</p>
<p>So the answer is just totally boring then? Every group is a group object with the expected operation alone? Well not quite, so far we’ve just been talking about monoids really, i.e. we haven’t mentioned the inverse at all. In order to be a group object the “new” inversion map must be a group homomorphism for the underlying multiplication structure, which is really the group object one too, so by uniqueness of inverses must be the same map. So the cases where this all goes through are the groups for which inversion is a group homomorphism. This is precisely the abelian groups, no more no less!</p>
<p>Okay so in fact one can see the commutativity from the above discussion of the compatibility of multiplications as</p>
<script type="math/tex; mode=display">\begin{equation*}
a \cdot b = (a\times 1) \cdot ( 1 \times b) = (1\times a) \cdot (b\times 1) = (1\cdot b) \times (a\cdot 1) = b \times a = b\cdot a\text{,}
\end{equation*}</script>
<p>so one obtains the corresponding result even just for monoids, but thinking about abelianness and inverses being homomorphisms is what sent me down this little diversion so it seemed rude to cut it out.</p>
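<p>For the sceptical, here is a small brute-force sanity check of the argument above, written as a sketch in Python. It enumerates every pair of unital binary operations on a 2 or 3 element set, keeps the pairs satisfying the compatibility law $(a\times b)\cdot (c\times d) = (a\cdot c) \times(b\cdot d)$, and asserts that the two operations then agree and commute. The function names are my own, and of course checking small sets proves nothing in general.</p>

```python
from itertools import product

def unital_operations(n):
    """Yield the table of every binary operation on range(n)
    that has a two-sided identity element."""
    elems = range(n)
    for flat in product(elems, repeat=n * n):
        table = [flat[i * n:(i + 1) * n] for i in elems]
        has_unit = any(
            all(table[e][x] == x and table[x][e] == x for x in elems)
            for e in elems
        )
        if has_unit:
            yield table

def interchange_law(dot, times, n):
    """Check (a x b) . (c x d) == (a . c) x (b . d) for all a, b, c, d."""
    return all(
        dot[times[a][b]][times[c][d]] == times[dot[a][c]][dot[b][d]]
        for a, b, c, d in product(range(n), repeat=4)
    )

def compatible_pairs(n):
    """All pairs (dot, times) of unital operations on range(n)
    that satisfy the interchange law."""
    ops = list(unital_operations(n))
    return [(d, t) for d in ops for t in ops if interchange_law(d, t, n)]

def is_commutative(table, n):
    return all(table[a][b] == table[b][a]
               for a, b in product(range(n), repeat=2))

for n in (2, 3):
    for dot, times in compatible_pairs(n):
        assert dot == times            # the two operations coincide
        assert is_commutative(dot, n)  # and the common operation commutes
```

<p>Note that nothing here asks for inverses or even associativity, which matches the remark that the result already holds for monoids.</p>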
<p>It turns out this is known as the <a href="https://en.wikipedia.org/wiki/Eckmann%E2%80%93Hilton_argument">Eckmann-Hilton</a> argument/principle/theorem/show.
It goes back a long way (1961) and one can read much more elsewhere, there is even a <a href="https://www.youtube.com/watch?v=Rjdo-RWQVIY">Catsters video</a> or two.</p>
<p>To give a couple of more useful (though admittedly slightly less amusing) applications:
This in fact shows that higher homotopy groups $\pi_n(X),\,n \ge 2$ are abelian.
If you (like me) think you know a different proof, <a href="https://mathoverflow.net/questions/81090/applications-of-eckmann-hilton-argument-to-topology#comment207944_81090">you don’t</a>! (or maybe you do, who knows what you know).</p>
<p>Finally this also shows that $\pi_1(G,e)$ is abelian for a topological group $G$!
<a href="https://amathew.wordpress.com/2011/07/23/the-etale-fundamental-group-of-an-algebraic-group-is-not-necessarily-abelian/">No such luck for etale fundamental groups of algebraic groups though</a>.</p>
<p>P.S. anyone who wishes to have someone to blame for the outpouring of uselessness seen here (for example; my advisor) need look no further than <a href="https://strangenewuniverse.wordpress.com/">Sachi</a> for inspiring me to write something here again.</p>
Mon, 04 Dec 2017 00:00:00 +0000
https://alexjbest.github.io/blog/maths/2017/12/04/quirky-facts.html
Maths
All rings are commutative (additively)
<p>Normally a ring is defined to be an abelian group (written additively) with an extra multiplication operation, distributing over the addition, under which it is a monoid (so our rings have a $1$).
Why do we restrict to an abelian group additively though? What happens if we try and make the same definition with an arbitrary group?</p>
<p>Well in fact we would get exactly the same objects! Indeed the rest of the ring axioms are enough to force $a + b = b+a$ always, hence removing that axiom is harmless except that it requires you to prove this to get the theory off the ground.</p>
<p>We can prove this very explicitly by observing that</p>
<script type="math/tex; mode=display">a + a + b + b = (1 +1 )(a+b) = (a+b) + (a+b) = a+b+a+b,\,\forall a,b\in R</script>
<p>hence, cancelling $a$ on the left and $b$ on the right,</p>
<script type="math/tex; mode=display">a+b = b+a \in R.</script>
<p>While this calculation shows that the base group must be abelian in order to have a ring structure on top, I think a more conceptual explanation for why this is the case would be nice also.
The best I’ve got so far in that vein is the following:</p>
<p>The ring structure on a group $G$ defines an injective function $L_{\bullet}\colon G \to \operatorname{End}(G)$ via the left multiplication map $g\mapsto (h\mapsto g\times h)$; the ring axioms guarantee that each such map is a group endomorphism.
Additionally, distributivity implies that $L_{g+h} = L_g + L_h$.
The existence of a unit for multiplication implies that the identity map $\operatorname{id}_G\colon g\mapsto g$ is in the image of $G$ inside $\operatorname{End}(G)$, hence the doubling/squaring map $L_{1+1} = \operatorname{id}_G + \operatorname{id}_G$ is in the image of $G$, and so is an endomorphism of $G$.
The fact that squaring is a group morphism then guarantees that the group is abelian. (For a different example of this statement consider a group where squaring sends everything to the identity, i.e. in which every non-identity element has order 2; this group must then also be abelian.)</p>
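<p>The last step, that squaring is a homomorphism exactly when the group is abelian, is easy to poke at by machine. The following is a minimal Python sketch comparing the non-abelian group $S_3$ (permutations under composition) with the abelian group $\mathbf{Z}/6$; the helper names are mine and the two groups are just convenient examples.</p>

```python
from itertools import permutations, product

def compose(p, q):
    """Composition of permutations given as tuples: apply q first, then p."""
    return tuple(p[i] for i in q)

def add_mod_6(a, b):
    return (a + b) % 6

def squaring_is_homomorphism(elements, op):
    """Check (ab)(ab) == (aa)(bb) for all a, b."""
    return all(
        op(op(a, b), op(a, b)) == op(op(a, a), op(b, b))
        for a, b in product(elements, repeat=2)
    )

def is_abelian(elements, op):
    return all(op(a, b) == op(b, a) for a, b in product(elements, repeat=2))

s3 = list(permutations(range(3)))  # non-abelian, written multiplicatively
z6 = list(range(6))                # abelian, written additively

for elements, op in ((s3, compose), (z6, add_mod_6)):
    # squaring (or doubling) is a homomorphism exactly when the group is abelian
    assert squaring_is_homomorphism(elements, op) == is_abelian(elements, op)
```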
<p>This still feels a little convoluted to me and the calculation is of course a lot more revealing, but trying to unpack what’s going on in the calculation and why it turns out that way is quite interesting, and who knows maybe it will lend itself to some kind of generalisation?</p>
<p>This little fact is literally the first exercise in Lam’s <em>A First Course in Noncommutative Rings</em>, but it had certainly escaped my attention until now.</p>
Sun, 02 Jul 2017 00:00:00 +0100
https://alexjbest.github.io/blog/maths/2017/07/02/all-rings-are-commutative.html
Maths
A trig integral
<blockquote>
<p>It’s summertime so I’ve been trying out a few Project Euler problems again.
In the process of doing one of them I realised something about trig integrals that I forgot, or maybe even never knew.
I thought this was cute and wanted to share it.</p>
</blockquote>
<p>Consider the integral</p>
<script type="math/tex; mode=display">\int_{-1}^a \sqrt{1-x^2} \mathrm d x,</script>
<p>this is one where you have to substitute a trig function to work it out. I always found these a little bit magic, so let’s look at it in a more elementary way.</p>
<p>Let’s assume to start that $ -1 \le a \le 0$. The function $\sqrt{1-x^2}$ gives us the height of a radius 1 circle in the $xy$ plane, and so computing the integral above is the same as finding the area enclosed by the $x$-axis, the circle and the line $x = a$.
However we know what the area of a wedge of a circle is: the full circle (of radius 1) has area $\pi$ and so a wedge of angle $\theta$ has area $\pi\cdot \theta/(2\pi) = \theta/2$.
Now the area we are looking for is the area of a wedge minus the area of the extra triangle.
What’s the angle of the wedge? We’re stopping at $x=a$, so the wedge, measured from the negative $x$-axis, has angle $\cos^{-1}(-a)$.
Hence the area of the wedge is $\cos^{-1} (-a) /2 $.</p>
<p>The extra triangle will have side lengths $\mid a\mid$ (i.e. $-a$) and $\sqrt{1-a^2}$, so its area is $-a\sqrt{1-a^2}/2$ and we get the final formula</p>
<script type="math/tex; mode=display">\int_{-1}^{a} \sqrt{1-x^2} \mathrm d x = \frac{\cos^{-1} (-a)}{2} +\frac{a\sqrt{1-a^2}}{2}.</script>
<p>Using the fact that $\cos^{-1}(-x) = \pi/2 + \sin^{-1}(x)$ we get a slightly cleaner formula for the indefinite integral</p>
<script type="math/tex; mode=display">\int \sqrt{1-x^2} \mathrm d x = \frac{\sin^{-1} x}{2} +\frac{x\sqrt{1-x^2}}{2} + C.</script>
<p>Which before now I would have said was hard to remember, but after seeing it like this, probably ok.</p>
<p>Another interesting note is that the above formula is also correct if $a \ge 0$; here adding the appropriate triangle to the wedge gives us the integral.</p>
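<p>If you don’t trust the picture, a quick numerical comparison is reassuring. This is only a sketch: it compares the wedge-and-triangle formula with a plain midpoint Riemann sum for a few values of $a$ on both sides of $0$. The function names and the number of steps are arbitrary choices of mine.</p>

```python
import math

def closed_form(a):
    """Wedge area plus signed triangle area: acos(-a)/2 + a*sqrt(1 - a^2)/2."""
    return math.acos(-a) / 2 + a * math.sqrt(1 - a * a) / 2

def midpoint_integral(a, steps=100_000):
    """Midpoint Riemann sum for the integral of sqrt(1 - x^2) from -1 to a."""
    h = (a + 1) / steps
    total = 0.0
    for i in range(steps):
        x = -1 + (i + 0.5) * h
        total += math.sqrt(max(0.0, 1 - x * x))  # guard against rounding below 0
    return h * total

for a in (-1.0, -0.5, 0.0, 0.5, 1.0):
    assert abs(closed_form(a) - midpoint_integral(a)) < 1e-6
```

<p>At $a = 1$ the formula gives half the area of the unit circle, as it should.</p>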
Thu, 01 Jun 2017 00:00:00 +0100
https://alexjbest.github.io/blog/maths/2017/06/01/a-trig-integral.html
Maths
Every finite group is a Galois group
<blockquote>
<p>The fact that every finite group is a Galois group is pretty well known (and in fact this post is basically just a transcription of the proof in Lang’s Algebra) but I’ve been thinking about it recently and it’s a really cool result so I figured I’d share it.
Who knows, maybe I’ll post about the extension to profinite groups next time?</p>
</blockquote>
<p>The starting point here is the following theorem of Artin, telling us that we can cut out Galois extensions with any group of field automorphisms we like.</p>
<h5>
<span class="type">Theorem</span> <span class="title">(Artin)</span>
</h5>
<p id="p-140">Let \(K\) be a field and \(G\) a finite group of field automorphisms of \(K\text{,}\) then \(K\) is a Galois extension of the fixed field \(K^G\) with Galois group \(G\text{,}\) moreover \([K:K^G] = \#G\text{.}\)</p>
<h5><span class="type">Proof</span></h5>
<p id="p-142">Pick any \(\alpha \in K\) and consider a maximal subset \(\{\sigma_1, \ldots, \sigma_n\}\subseteq G\) for which all \(\sigma_i \alpha\) are distinct. Now any \(\tau \in G\) must permute the \(\sigma_i \alpha\) as it is an automorphism and if some \(\tau\sigma_i \alpha \ne \sigma_j\alpha\) for all \(j\) then we could extend our set of \(\sigma\)s by adding this \(\tau\sigma_i\text{.}\)</p>
<p id="p-143">So \(\alpha\) is a root of
\begin{equation*}
f_\alpha(X) = \prod_{i=1}^n (X- \sigma_i\alpha)\text{,}
\end{equation*}
note that \(f_\alpha\) is fixed by \(\tau\) by the above. So all the coefficients of \(f_\alpha\) are in \(K^G\text{.}\) By construction \(f_\alpha\) is a separable polynomial as the \(\sigma_i\alpha\) were chosen distinct, note that \(f_\alpha\) also splits into linear factors in \(K\text{.}\)</p>
<p id="p-144">The above was for arbitrary \(\alpha \in K\) so we have just shown directly that \(K\) is a separable and normal extension of \(K^G\text{,}\) which is the definition of Galois. As every element of \(K\) is a root of a separable polynomial over \(K^G\) of degree at most \(\#G\text{,}\) and every finite separable extension is generated by a single element, we cannot have the extension degree \([K:K^G] \gt \#G\text{.}\) But we also have a group of \(\#G\) automorphisms of \(K\) that fix \(K^G\) so \([K : K^G] \ge \#G\) and hence \([K : K^G] = \#G\text{.}\)</p>
<p>So now with this in hand we just have to realise our group as a group of field automorphisms of some field.</p>
<h5>
<span class="type">Corollary</span>
</h5>
<p id="p-145">Every finite group is a Galois group.</p>
<h5>Proof</h5>
<p id="p-146">Let \(k\) be an arbitrary field, \(G\) any finite group. Now take \(K = k(\overline g:g\in G)\) (i.e. adjoin all elements of \(G\) to \(k\) as indeterminates, denoted by \(\overline g\)). Now we have a natural action of \(G\) on \(K\) defined via \(h\cdot \overline g= \overline {hg}\) on the indeterminates and extended to \(k\)-automorphisms of \(K\text{.}\) This action is faithful, as distinct \(h\) send \(\overline 1\) to distinct indeterminates, so \(K\) and \(G\) satisfy the hypotheses of Artin's theorem and hence \(K/K^G\) is a Galois extension with Galois group \(G\text{.}\)</p>
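<p>The only thing the proof really needs from the action \(h\cdot \overline g= \overline {hg}\) is that it is a faithful action by permutations of the indeterminates. As a sketch, here is that check in Python for \(G = S_3\text{,}\) with the indeterminates numbered \(0,\ldots,5\text{.}\) The choice of group and all the names below are mine, purely for illustration.</p>

```python
from itertools import permutations

def compose(p, q):
    """Composition of permutations given as tuples: apply q first, then p."""
    return tuple(p[i] for i in q)

# G = S_3; the indeterminate attached to the element G[i] is numbered i.
G = list(permutations(range(3)))
index = {g: i for i, g in enumerate(G)}

def action(h):
    """Permutation of the indeterminates induced by h: i -> index of h * G[i]."""
    return tuple(index[compose(h, g)] for g in G)

images = [action(h) for h in G]

# each element of G really permutes the indeterminates
assert all(sorted(img) == list(range(len(G))) for img in images)
# distinct elements give distinct permutations, so the action is faithful
assert len(set(images)) == len(G)
# and the assignment respects the group law
assert all(
    action(compose(h1, h2)) == compose(action(h1), action(h2))
    for h1 in G for h2 in G
)
```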
<p>It is interesting to note that we could have started with any field we liked and built a Galois extension in which both fields are extensions of the base we picked.
They won’t necessarily share a huge amount with it, but the characteristic will have to be the same, and so we can do this in whatever our favourite characteristic is.</p>
Tue, 02 May 2017 00:00:00 +0100
https://alexjbest.github.io/blog/maths/2017/05/02/every-group-is-a-galois-group.html
Maths
Ribet's Converse to Herbrand: Part II - Cuspstruction
<blockquote>
<p>So around a month back I posted the first post in this 2 (or more who knows?) part series on Ribet’s converse to Herbrand’s theorem. This is the sequel, Cuspstruction, it is basically just my personal notes from my STAGE talk with the same name.
We were following <a href="https://math.berkeley.edu/~ribet/Articles/invent_34.pdf">Ribet’s paper</a> and this is all about section 3.
The goal is to construct a cusp form with some very specific properties; we can then take the corresponding Galois representation and use it to obtain the converse to Herbrand.
In this post though we’ll be focussing on constructing the cusp form, hence, Cuspstruction.</p>
</blockquote>
<h1 class="heading hide-type" alt="Subsubsection 35.1.2.1 Cuspstruction"><span class="title">Cuspstruction</span>
</h1>
<p id="p-1098">We will make use of the following building blocks, some specific <a knowl="./knowl/def-modular-form.html" knowl-id="xref-def-modular-form" alt="Definition 25.2.3 Modular functions" title="Definition 25.2.3 Modular functions">modular forms</a> of weight 2 and type \(\epsilon\)
\begin{align*}
G_{2,\epsilon} &= L(-1,\epsilon)/2 + \sum_{n=1}^\infty \sum_{d|n} d \epsilon(d) q^n\\
s_{2,\epsilon} &= \sum_{n=1}^\infty \sum_{d|n} d \epsilon(n/d) q^n
\end{align*}
The latter is not a cusp form (it is not cuspidal at the other cusp of \(\Gamma_1(p)\)); we call such forms semi-cusp forms and denote the space of them by \(S^\infty\) (not standard notation). We will also use
\begin{equation*}
G_{1,\epsilon} = L(0,\epsilon)/2 + \sum_{n=1}^\infty \sum_{d|n} \epsilon(d) q^n
\end{equation*}
These Eisenstein series are all Hecke eigenfunctions for the \(T_n\) with \(n\) coprime to \(p\text{.}\)</p>
<p id="p-1099">Fix a <a knowl="./knowl/def-prime-ideal.html" knowl-id="xref-def-prime-ideal" alt="Definition 6.0.6 Prime ideals" title="Definition 6.0.6 Prime ideals">prime ideal</a> \(\mathfrak p|p\) of \(\mathbf{Q}(\mu_{p-1})\text{,}\) so that we can think of \(\mu_{p-1}\subseteq \mathbf{Q}_p^*\text{,}\) and take \(\omega\colon (\mathbf{Z}/p\mathbf{Z})^* \xrightarrow\sim \mu_{p-1}\) the unique <a knowl="./knowl/def-character.html" knowl-id="xref-def-character" alt="Definition 19.0.2 Characters" title="Definition 19.0.2 Characters">character</a> with \(\omega(d)\equiv d \pmod{\mathfrak p}\) for all \(d\in \mathbf{Z}\) coprime to \(p\text{.}\)</p>
<p id="p-1100">We start with a key lemma, which we will use repeatedly.</p>
<article class="theorem-like" id="lem-ribet-31"><h5 class="heading">
<span class="type">Lemma</span> <span class="title">3.1</span>
</h5>
<p id="p-1101">Let \(k\) be even \(2\le k \le p-3\text{,}\) then \(G_{2,\omega^{k-2}},G_{1,\omega^{k-1}}\) have \(\mathfrak p\)-integral \(q\)-expansions in \(\mathbf{Q}(\mu_{p-1})\) which are congruent mod \(\mathfrak p\) to
\begin{equation*}
-\frac{B_k}{2k} + \sum_{n=1}^\infty \sum_{d|n} d^{k-1} q^n
\end{equation*}
(this is the \(q\)-expansion of \(G_k\)).</p></article>
<div class="posterior">
<div class="hidden-knowl-wrapper"><a knowl="" class="id-ref" refid="hk-proof-104" id="proof-104"><article class="hiddenproof"><h5 class="heading"><span class="type">Proof</span></h5></article></a></div>
<div id="hk-proof-104" style="" class=""><article class="hiddenproof"><p id="p-1102">By our choice of \(\omega\) we get the desired result for the non-constant terms of the \(q\)-expansion.</p>
<p id="p-1103">So it remains to prove that
\begin{equation*}
L(-1,\omega^{k-2})\equiv -\frac{B_k}{k}\pmod{\mathfrak p}
\end{equation*}
\begin{equation*}
L(0,\omega^{k-1})\equiv -\frac{B_k}{k}\pmod{\mathfrak p}
\end{equation*}
we make use of the following expressions (see probably Washington)
\begin{equation*}
L(0,\epsilon) = -\frac 1p \sum_{n=1}^p \epsilon(n) (n- \frac p2)
\end{equation*}
\begin{equation*}
L(-1,\epsilon) = -\frac{1}{2p} \sum_{n=1}^p \epsilon(n) (n^2 - pn + \frac{p^2}{6})
\end{equation*}
\(\omega(n)\equiv n^p \pmod{\mathfrak p^2}\) so
\begin{equation*}
pL(0,\omega^{k-1}) = -\sum_{n=1}^p \omega^{k-1}(n)(n-p/2)
\end{equation*}
\begin{equation*}
\equiv -\sum_{n=1}^p n^{p(k-1) + 1} \pmod{\mathfrak p^2}
\end{equation*}
and
\begin{equation*}
pL(-1,\omega^{k-2}) = -\frac 12\sum_{n=1}^p \omega^{k-2}(n)(n^2 - pn +\frac{p^2}{6})
\end{equation*}
\begin{equation*}
\equiv -\frac 12\sum_{n=1}^p n^{p(k-2) + 2} \pmod{\mathfrak p^2}
\end{equation*}
but for all \(t\gt 0\) even we have the congruence
\begin{equation*}
\sum_{n=1}^{p-1} n^t \equiv pB_t \pmod{p^2}
\end{equation*}
and so
\begin{equation*}
pL(0,\omega^{k-1}) \equiv -pB_{p(k-1)+1} \pmod{p^2}
\end{equation*}
cancelling
\begin{equation*}
L(0,\omega^{k-1}) \equiv -B_{p(k-1)+1} \pmod{p}
\end{equation*}
\begin{equation*}
\equiv -\frac{B_{p(k-1)+1}}{p(k-1)+1} \pmod{p}
\end{equation*}
\begin{equation*}
\equiv -\frac{B_k}{k} \pmod{p}
\end{equation*}
as \(p(k-1) + 1 \equiv k \pmod{p-1}\) using <a knowl="./knowl/thm-kummer-congruence.html" knowl-id="xref-thm-kummer-congruence" alt="Theorem 35.1.5 Kummer's congruence" title="Theorem 35.1.5 Kummer's congruence">Kummer's congruence</a>. Similarly
\begin{equation*}
L(-1,\omega^{k-2})\equiv -\frac 12 \frac 2k B_k \pmod{p}\text{.}
\end{equation*}
</p></article></div>
</div>
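<p>None of the congruences in this proof are hard to test by machine, which is a comforting thing to do after a talk. The sketch below uses exact rational arithmetic to check the power sum congruence \(\sum_{n=1}^{p-1} n^t \equiv pB_t \pmod{p^2}\) for a few primes \(p \ge 5\text{,}\) and then checks that \(-\frac 1p\sum_{n=1}^{p-1} n^{p(k-1)+1}\) reduces to \(-B_k/k\) mod \(p\text{.}\) It only tests the right hand sides appearing in the proof, it does not compute any \(L\)-values independently, and all of the function names are my own.</p>

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(m_max):
    """B_0, ..., B_{m_max} with the convention B_1 = -1/2, via the usual recurrence."""
    B = [Fraction(1)]
    for m in range(1, m_max + 1):
        B.append(-sum(comb(m + 1, j) * B[j] for j in range(m)) / (m + 1))
    return B

def reduce_mod_p(x, p):
    """Reduce a Fraction whose denominator is prime to p modulo the prime p."""
    return x.numerator * pow(x.denominator, p - 2, p) % p

def power_sum_congruence_holds(p, t, B):
    """Check sum_{n=1}^{p-1} n^t == p * B_t modulo p^2 (as p-integral rationals)."""
    diff = Fraction(sum(n ** t for n in range(1, p))) - p * B[t]
    return diff.denominator % p != 0 and diff.numerator % (p * p) == 0

def l_value_via_power_sum(p, k):
    """-(1/p) * sum_{n=1}^{p-1} n^(p(k-1)+1), reduced modulo p."""
    m = p * (k - 1) + 1
    s = sum(pow(n, m, p * p) for n in range(1, p)) % (p * p)
    assert s % p == 0  # the power sum is divisible by p
    return (-(s // p)) % p

B = bernoulli_numbers(40)

for p in (5, 7, 11, 13):
    for t in range(2, 13, 2):
        assert power_sum_congruence_holds(p, t, B)

for p in (5, 7, 11, 13, 37):
    for k in range(2, p - 2, 2):  # even k with 2 <= k <= p - 3
        assert l_value_via_power_sum(p, k) == reduce_mod_p(-B[k] / k, p)
```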
<article class="theorem-like" id="cor-ribet-32"><h5 class="heading">
<span class="type">Corollary</span> <span class="title">3.2</span>
</h5>
<p id="p-1104">Let \(k\) be as before and \(2\le n,m\le p-3\) both even with \(n+m \equiv k \pmod{p-1}\text{.}\) Then the product
\begin{equation*}
G_{1,\omega^{n-1}}G_{1,\omega^{m-1}} \in M_2(\Gamma_1(p), \omega^{k-2})
\end{equation*}
with \(q\)-expansion coefficients still \(\mathfrak p\)-integers in \(\mathbf{Q}(\mu_{p-1})\text{.}\)</p></article>
<p id="p-1105">Note: the constant term is a \(\mathfrak p\)-unit unless \(B_n\) or \(B_m\) is divisible by \(p\text{.}\) We need to remove this condition.</p>
<article class="theorem-like" id="thm-ribet-33"><h5 class="heading">
<span class="type">Theorem</span> <span class="title">3.3</span>
</h5>
<p id="p-1106">Let \(k\) be as before, then there exists \(g\in M_2(\Gamma_1(p), \omega^{k-2})\) whose \(q\)-expansion coefficients are \(\mathfrak p\)-integers and whose constant coefficient is 1.</p></article>
<div class="posterior">
<span class="hidden-knowl-wrapper"><a knowl="" class="id-ref" refid="hk-proof-105" id="proof-105"><article class="hiddenproof"><h5 class="heading"><span class="type">Proof</span></h5></article></a></span><span id="hk-proof-105" style="" class=""><article class="hiddenproof"><p id="p-1107">It suffices to find a form with constant coefficient a \(\mathfrak p\)-unit. If \(p\nmid B_k\) then we can use \(G_{2,\omega^{k-2}}\) by <a knowl="./knowl/lem-ribet-31.html" knowl-id="xref-lem-ribet-31" alt="Lemma lemma 3.1 3.1" title="Lemma lemma 3.1 3.1">lemma 3.1</a>.</p>
<p id="p-1108">If \(p|B_k\) try the possible products
\begin{equation*}
G_{1,\omega^{m-1}}G_{1,\omega^{n-1}}
\end{equation*}
with \(2\le m,n\le p-3\) even as above with \(m+n \equiv k \pmod{p-1}\text{.}\) We want to claim that at least one of these must work (i.e. we have a pair \(m,n\) with \(p\nmid B_m, p\nmid B_n\)). If this isn't the case then, letting
\begin{equation*}
t = \#\{n \text{ even} : 2\le n \le p-3,\ p|B_n\}\text{,}
\end{equation*}
we must have \(t \ge (p-1)/4\text{.}\) We will derive a contradiction from this.</p>
<p id="p-1109">Greenberg showed that
\begin{equation*}
\frac{h_p}{h_{\mathbf{Q}(\mu_p)^+}} = h^*_p = 2^? p \prod_{\substack{k=2\\ \text{even}}}^{p-2} L(0,\omega^{k-1})
\end{equation*}
(this is obtained by taking a quotient of the analytic <a knowl="./knowl/def-ideal-class-gp.html" knowl-id="xref-def-ideal-class-gp" alt="Definition 17.0.9 Ideal class groups and class numbers" title="Definition 17.0.9 Ideal class groups and class numbers">class number</a> formulas for \(\mathbf{Q}(\mu_p),\mathbf{Q}(\mu_p)^+\)) but by <a knowl="./knowl/lem-ribet-31.html" knowl-id="xref-lem-ribet-31" alt="Lemma lemma 3.1 3.1" title="Lemma lemma 3.1 3.1">lemma 3.1</a> we know that \(\mathfrak p^t\) will divide the <a knowl="./knowl/def-product.html" knowl-id="xref-def-product" alt="Definition 14.0.12 Products and coproducts" title="Definition 14.0.12 Products and coproducts">product</a> of \(L\)-values. And so \(p^t|h_p^*\text{,}\) we will get a contradiction if we show
\begin{equation*}
h_p^*\lt p^{(p-1)/4}\text{.}
\end{equation*}
</p>
<p id="p-1110">Work of Carlitz-Olson ('55) on Maillet's determinant shows that
\begin{equation*}
h_p^* = \pm\frac{D}{p^{(p-3)/2}}
\end{equation*}
where \(D\) is the determinant of a \((p-1)/2 \times (p-1)/2\) matrix with entries in \([1,p-1]\text{.}\) So recalling Hadamard's inequality
\begin{equation*}
|\det(v_1\cdots v_n)| \le \prod_{i=1}^n ||v_i||\text{,}
\end{equation*}
or the simpler corollary
\begin{equation*}
|A_{ij}| \le B \implies |\det(A)| \le n^{n/2}B^{n}
\end{equation*}
and applying with \(B = p, n=(p-1)/2\) gives
\begin{equation*}
|D| \le \left(\frac{p-1}{2}\right)^{(p-1)/4} p^{(p-1)/2} \lt 2^{-(p-1)/4} p^{(3p-3)/4}
\end{equation*}
so
\begin{equation*}
h_p^* \lt p^{(p+3)/4} 2^{-(p-1)/4}\text{.}
\end{equation*}
And we are done as \(h_p^* = 1\) for \(p\le 19\) and as \(p\le 2^{(p-1)/4}\) for \(p\gt 19\text{.}\)</p></article></span>
</div>
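<p>The determinant formula is concrete enough to play with. Here is a rough Python sketch that builds Maillet's matrix and recovers \(h_p^*\) from its determinant. I am assuming the usual convention that the \((r,s)\) entry is the least positive residue of \(rs^{-1}\) mod \(p\) for \(1\le r,s\le (p-1)/2\text{;}\) the function names are made up, and the determinant is computed by naive exact Gaussian elimination, so this is only sensible for small \(p\text{.}\)</p>

```python
from fractions import Fraction

def maillet_matrix(p):
    """Matrix with (r, s) entry the least positive residue of r * s^(-1) mod p,
    for 1 <= r, s <= (p - 1) / 2 (assumed convention for Maillet's determinant)."""
    n = (p - 1) // 2
    return [[(r * pow(s, p - 2, p)) % p for s in range(1, n + 1)]
            for r in range(1, n + 1)]

def determinant(rows):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

def relative_class_number(p):
    """|D| / p^((p-3)/2) for an odd prime p, following the formula above."""
    h = abs(determinant(maillet_matrix(p))) / p ** ((p - 3) // 2)
    assert h.denominator == 1  # the quotient should be an integer
    return int(h)

def bound_holds(p):
    """Check the inequality h_p^* < p^((p+3)/4) * 2^(-(p-1)/4) numerically."""
    return relative_class_number(p) < p ** ((p + 3) / 4) * 2 ** (-(p - 1) / 4)
```

<p>For example <code>relative_class_number(23)</code> can be compared against the known class numbers of small cyclotomic fields, and <code>bound_holds</code> against the inequality just derived.</p>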
<p id="p-1111">Now we fix \(2\le k\le p-3\) even with \(p|B_k\) and let \(\epsilon = \omega^{k-2}\text{.}\) Here \(k\) must really be at least 4 (or even 10) so \(\epsilon\) is a non-trivial even <a knowl="./knowl/def-character.html" knowl-id="xref-def-character" alt="Definition 19.0.2 Characters" title="Definition 19.0.2 Characters">character</a>. We will work in weight 2, type \(\epsilon\) from now on.</p>
<article class="theorem-like" id="prop-ribet-34"><h5 class="heading">
<span class="type">Proposition</span> <span class="title">3.4</span>
</h5>
<p id="p-1112">There exists
\begin{equation*}
f= \sum_{n=1}^\infty a_nq^n\in S_2^\infty(\Gamma_1(p), \epsilon)
\end{equation*}
with \(\mathfrak p\)-integral coefficients \(a_n\in \mathbf{Q}(\mu_{p-1})\) and with
\begin{equation*}
f\equiv G_k \equiv G_{2,\epsilon}\pmod{\mathfrak p}\text{.}
\end{equation*}
</p></article>
<div class="posterior">
<div class="hidden-knowl-wrapper"><a knowl="" class="id-ref" refid="hk-proof-106" id="proof-106"><article class="hiddenproof"><h5 class="heading"><span class="type">Proof</span></h5></article></a></div>
<div id="hk-proof-106" style="" class=""><article class="hiddenproof"><p id="p-1113">Let
\begin{equation*}
f= G_{2,\epsilon} -cg
\end{equation*}
with \(c = L(-1,\epsilon)/2\text{.}\) This is a semi-cusp form by construction. We get \(f\equiv G_{2,\epsilon}\pmod{\mathfrak p}\) because
\begin{equation*}
c \equiv -B_k/2k \equiv 0 \pmod{\mathfrak p}
\end{equation*}
by <a knowl="./knowl/lem-ribet-31.html" knowl-id="xref-lem-ribet-31" alt="Lemma lemma 3.1 3.1" title="Lemma lemma 3.1 3.1">lemma 3.1</a> (again!) as we assume \(p|B_k\) now. Additionally by the same lemma \(G_k \equiv G_{2,\epsilon}\pmod{\mathfrak p}\text{.}\)</p></article></div>
</div>
<p id="p-1114">Let's take stock: we have a semi-cuspidal form which mod \(\mathfrak p\) looks like \(G_k\) and is hence an eigenform mod \(\mathfrak p\text{,}\) but we want an actual eigenform. Bro, do you even lift?</p>
<article class="theorem-like" id="prop-ribet-35"><h5 class="heading">
<span class="type">Proposition</span> <span class="title">3.5</span>
</h5>
<p id="p-1115">There exists
\begin{equation*}
0\ne f'\in S_2(\Gamma_1(p), \epsilon)
\end{equation*}
which is an eigenform for all \(T_n\) with \((n,p) =1\text{,}\) with the eigenvalue \(\lambda(\ell)\) for \(T_\ell\) (\(\ell \ne p\) prime) satisfying
\begin{equation*}
\lambda(\ell) \equiv 1+\epsilon(\ell)\ell\equiv 1 +\ell^{k-1}\pmod{\mathfrak P}
\end{equation*}
for some prime \(\mathfrak P| \mathfrak p\) in \(\mathbf{Q}(\mu_{p-1},\lambda(n):(n,p)=1)\text{.}\)</p></article>
<div class="posterior">
<div class="hidden-knowl-wrapper"><a knowl="" class="id-ref" refid="hk-proof-107" id="proof-107"><article class="hiddenproof"><h5 class="heading"><span class="type">Proof</span></h5></article></a></div>
<div id="hk-proof-107" style="" class=""><article class="hiddenproof"><p id="p-1116">We start with \(f\) from the proposition above. It is a mod \(\mathfrak p\) eigenform and so we can use the <a knowl="./knowl/lem-deligne-serre-lifting.html" knowl-id="xref-lem-deligne-serre-lifting" alt="Lemma 35.1.1 6.11 Deligne-Serre lifting lemma" title="Lemma 35.1.1 6.11 Deligne-Serre lifting lemma">Deligne-Serre lifting lemma</a> (6.11 in Formes modulaires de poids 1) to obtain a semi-cusp form \(f'\) that is an eigenform for the Hecke operators stated.</p>
<p id="p-1117">To promote the semi-cusp form to a full blown cusp form we observe that the space \(S_2^\infty(\Gamma_1(p),\epsilon)\) is generated by the cusp forms and \(s_{2,\epsilon}\text{,}\) which is also an eigenform, so we only have to check that \(f'\) isn't \(s_{2,\epsilon}\) (or a scalar multiple of it). So we check the eigenvalues mod \(\mathfrak p\text{.}\)
\begin{equation*}
\epsilon(\ell) + \ell \equiv 1 + \ell\epsilon(\ell)\pmod{\mathfrak p}
\end{equation*}
implies \(\epsilon(\ell) = 1\) for every \(\ell\not\equiv 1 \pmod p\text{,}\) but \(\epsilon\) is non-trivial!</p></article></div>
</div>
<p id="p-1118">The final challenge is to ensure that \(f'\) is also an eigenform for \(T_{p^i}\text{.}\)</p>
<article class="theorem-like" id="prop-ribet-36"><h5 class="heading">
<span class="type">Proposition</span> <span class="title">3.6</span>
</h5>
<p id="p-1119">\(f'\) is an eigenform for all Hecke operators, so we can normalise as
\begin{equation*}
f' = \sum_{n=1}^\infty \lambda(n)q^n\text{.}
\end{equation*}
</p></article>
<div class="posterior">
<div class="hidden-knowl-wrapper"><a knowl="" class="id-ref" refid="hk-proof-108" id="proof-108"><article class="hiddenproof"><h5 class="heading"><span class="type">Proof</span></h5></article></a></div>
<div id="hk-proof-108" style="" class=""><article class="hiddenproof"><p id="p-1120">Use the theory of newforms. There are no oldforms for \(\Gamma_1(p)\) as
\begin{equation*}
M_2(\operatorname{SL}_2(\mathbf{Z})) = 0\text{.}
\end{equation*}
A newform that is an eigenform for all Hecke operators \(T_n\) with \(n\) coprime to the level \(p\) is also an eigenform for the remaining Hecke operators.</p></article></div>
</div>
<p id="p-1121">So in conclusion:</p>
<article class="theorem-like" id="thm-ribet-37"><h5 class="heading">
<span class="type">Theorem</span> <span class="title">3.7</span>
</h5>
<p id="p-1122">Assume \(p|B_k\) then there exists
\begin{equation*}
f =\sum_{n=1}^\infty a_nq^n\in S_2(\Gamma_1(p),\epsilon)
\end{equation*}
which is an eigenform for all \(T_n\text{,}\) and a prime ideal \(\mathfrak p|p\) of \(\mathbf{Q}(a_n : n\ge 1)\) such that
\begin{equation*}
a_\ell \equiv 1+\ell^{k-1}\equiv 1+\epsilon(\ell)\ell \pmod{\mathfrak p}
\end{equation*}
for all \(\ell \ne p\text{.}\)</p></article>
<article class="remark-like" id="remark-20"><h5 class="heading">
<span class="type">Remark</span>
</h5>
<p id="p-1123">Word on the internet is that Mazur, Mazur-Wiles' proof of the Main conjecture of Iwasawa theory is modelled on this.</p></article>
<p id="p-1124">That's all for now. In the remainder of Ribet's paper he constructs a Galois representation from this and uses it to prove the theorem.</p>
Sun, 02 Apr 2017 00:00:00 +0100
https://alexjbest.github.io/blog/maths/2017/04/02/ribets-converse-to-herbrand-ii.html
Maths
Ribet's Converse to Herbrand: Part I
<blockquote>
<p>Tomorrow I’m giving the STAGE talk on Ribet’s converse to Herbrand’s theorem. Afterwards I’ll try and post more notes, but for now here’s a little intro to get us thinking about the problem.</p>
</blockquote>
<h1><span class="title">Ribet's converse to Herbrand</span>
</h1>
<section class="introduction" id="introduction-6"><p id="p-942">We are interested in the class <a knowl="./knowl/def-group.html" knowl-id="xref-def-group" alt="Definition 1.0.1 Groups" title="Definition 1.0.1 Groups">groups</a> of <a knowl="./knowl/def-nf.html" knowl-id="xref-def-nf" alt="Definition 16.0.1 Number fields" title="Definition 16.0.1 Number fields">cyclotomic fields</a>
\begin{equation*}
h_p = h_{\mathbf{Q}(\mu_p)}\text{.}
\end{equation*}
Let's list the first few of these.</p>
<figure class="figure-like" id="table-1"><table>
<tr>
<td class="l m b0 r0 l0 t0 lines">\(p\)</td>
<td class="l m b0 r0 l0 t0 lines">2</td>
<td class="l m b0 r0 l0 t0 lines">3</td>
<td class="l m b0 r0 l0 t0 lines">5</td>
<td class="l m b0 r0 l0 t0 lines">7</td>
<td class="l m b0 r0 l0 t0 lines">11</td>
<td class="l m b0 r0 l0 t0 lines">13</td>
<td class="l m b0 r0 l0 t0 lines">17</td>
<td class="l m b0 r0 l0 t0 lines">19</td>
<td class="l m b0 r0 l0 t0 lines">23</td>
<td class="l m b0 r0 l0 t0 lines">29</td>
<td class="l m b0 r0 l0 t0 lines">31</td>
<td class="l m b0 r0 l0 t0 lines">37</td>
<td class="l m b0 r0 l0 t0 lines">41</td>
<td class="l m b0 r0 l0 t0 lines">43</td>
<td class="l m b0 r0 l0 t0 lines">47</td>
<td class="l m b0 r0 l0 t0 lines">53</td>
<td class="l m b0 r0 l0 t0 lines">59</td>
<td class="l m b0 r0 l0 t0 lines">61</td>
<td class="l m b0 r0 l0 t0 lines">67</td>
</tr>
<tr>
<td class="l m b0 r0 l0 t0 lines">\(h_p\)</td>
<td class="l m b0 r0 l0 t0 lines">1</td>
<td class="l m b0 r0 l0 t0 lines">1</td>
<td class="l m b0 r0 l0 t0 lines">1</td>
<td class="l m b0 r0 l0 t0 lines">1</td>
<td class="l m b0 r0 l0 t0 lines">1</td>
<td class="l m b0 r0 l0 t0 lines">1</td>
<td class="l m b0 r0 l0 t0 lines">1</td>
<td class="l m b0 r0 l0 t0 lines">1</td>
<td class="l m b0 r0 l0 t0 lines">3</td>
<td class="l m b0 r0 l0 t0 lines">8</td>
<td class="l m b0 r0 l0 t0 lines">9</td>
<td class="l m b0 r0 l0 t0 lines">37</td>
<td class="l m b0 r0 l0 t0 lines">121</td>
<td class="l m b0 r0 l0 t0 lines">211</td>
<td class="l m b0 r0 l0 t0 lines">695</td>
<td class="l m b0 r0 l0 t0 lines">4889</td>
<td class="l m b0 r0 l0 t0 lines">41241</td>
<td class="l m b0 r0 l0 t0 lines">76301</td>
<td class="l m b0 r0 l0 t0 lines">853513</td>
</tr>
<tr>
<td class="l m b0 r0 l0 t0 lines">\(p|h_p\)</td>
<td class="l m b0 r0 l0 t0 lines">no</td>
<td class="l m b0 r0 l0 t0 lines">no</td>
<td class="l m b0 r0 l0 t0 lines">no</td>
<td class="l m b0 r0 l0 t0 lines">no</td>
<td class="l m b0 r0 l0 t0 lines">no</td>
<td class="l m b0 r0 l0 t0 lines">no</td>
<td class="l m b0 r0 l0 t0 lines">no</td>
<td class="l m b0 r0 l0 t0 lines">no</td>
<td class="l m b0 r0 l0 t0 lines">no</td>
<td class="l m b0 r0 l0 t0 lines">no</td>
<td class="l m b0 r0 l0 t0 lines">no</td>
<td class="l m b0 r0 l0 t0 lines">yes</td>
<td class="l m b0 r0 l0 t0 lines">no</td>
<td class="l m b0 r0 l0 t0 lines">no</td>
<td class="l m b0 r0 l0 t0 lines">no</td>
<td class="l m b0 r0 l0 t0 lines">no</td>
<td class="l m b0 r0 l0 t0 lines">yes</td>
<td class="l m b0 r0 l0 t0 lines">no</td>
<td class="l m b0 r0 l0 t0 lines">yes</td>
</tr>
</table></figure><article class="definition-like" id="def-reg-prime"><h5 class="heading">
<span class="type">Definition</span> <span class="title">Regular primes</span>
</h5>
<p id="p-943">We'll call primes for which \(p\nmid h_p\) <em class="terminology">regular primes</em>, and all other primes <em class="terminology">irregular primes</em>.</p></article><p id="p-944">Why is this important from a number theory perspective?</p>
<article class="theorem-like" id="theorem-89"><h5 class="heading">
<span class="type">Theorem</span> <span class="title">Kummer 1850</span>
</h5>
<p id="p-945">Fermat's last theorem is true for <a knowl="./knowl/def-reg-prime.html" knowl-id="xref-def-reg-prime" alt="Definition 34.1.2 Regular primes" title="Definition 34.1.2 Regular primes">regular prime</a> exponents.</p></article><p id="p-946">It's hard to tell whether a prime is a <a knowl="./knowl/def-reg-prime.html" knowl-id="xref-def-reg-prime" alt="Definition 34.1.2 Regular primes" title="Definition 34.1.2 Regular primes">regular prime</a> directly from the definition, as you'd have to compute the class <a knowl="./knowl/def-group.html" knowl-id="xref-def-group" alt="Definition 1.0.1 Groups" title="Definition 1.0.1 Groups">group</a>.</p>
<article class="definition-like" id="def-bernoulli-numbers"><h5 class="heading">
<span class="type">Definition</span> <span class="title">Bernoulli numbers</span>
</h5>
<p id="p-947">The <em class="terminology">Bernoulli numbers</em> are the sequence of rational numbers given by the exponential generating function
\begin{equation*}
\frac{x}{e^x - 1} + \frac x2 - 1 = \sum_{k= 2}^\infty B_k\frac{x^k}{k!}\text{.}
\end{equation*}
</p></article><p id="p-948">These have a number of cool properties, such as:</p>
<article class="theorem-like" id="thm-kummer-congruence"><h5 class="heading">
<span class="type">Theorem</span> <span class="title">Kummer's congruence</span>
</h5>
<p id="p-949">If \(h\) and \(k\) are positive even integers, not divisible by \(p-1\text{,}\) with \(h\equiv k \pmod {p-1}\) then
\begin{equation*}
\frac{B_k}{k}\equiv \frac{B_h}{h} \pmod{p}\text{.}
\end{equation*}
</p></article><p id="p-950">But most important for us is the relation to <a knowl="./knowl/def-ideal-class-gp.html" knowl-id="xref-def-ideal-class-gp" alt="Definition 16.0.9 Ideal class groups and class numbers" title="Definition 16.0.9 Ideal class groups and class numbers">class numbers</a>:</p>
<article class="theorem-like" id="thm-kummers-criterion"><h5 class="heading">
<span class="type">Theorem</span> <span class="title">Kummer's Criterion</span>
</h5>
<p id="p-951">\(p\) is an <a knowl="./knowl/def-reg-prime.html" knowl-id="xref-def-reg-prime" alt="Definition 34.1.2 Regular primes" title="Definition 34.1.2 Regular primes">irregular prime</a> if and only if there exists some even \(k\) with \(2\le k \le p-3\) such that \(p\) divides the numerator of \(B_k\text{.}\)</p></article><p id="p-952">This is a great theorem relating <a knowl="./knowl/def-ideal-class-gp.html" knowl-id="xref-def-ideal-class-gp" alt="Definition 16.0.9 Ideal class groups and class numbers" title="Definition 16.0.9 Ideal class groups and class numbers">class numbers</a> to the <a knowl="./knowl/def-bernoulli-numbers.html" knowl-id="xref-def-bernoulli-numbers" alt="Definition 34.1.4 Bernoulli numbers" title="Definition 34.1.4 Bernoulli numbers">Bernoulli numbers</a>, but can we do better? If I know a specific \(k\) such that \(p|B_k\text{,}\) can I say anything more specific about the class group? Yes; there is a strengthening of this theorem due in this form to Herbrand (in one direction) and Ribet (later, in the other direction).</p>
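<p>As a sanity check, Kummer's criterion is easy to try out by machine. The following is a minimal Python sketch (the helper names <code>bernoulli_numbers</code> and <code>is_irregular</code> are made up for illustration). It computes the Bernoulli numbers exactly from the standard recurrence \(\sum_{j=0}^{m}\binom{m+1}{j}B_j = 0\) for \(m \ge 1\text{,}\) which agrees with the generating function above, and then tests each prime up to 67 against the criterion.</p>

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(n_max):
    """Return [B_0, ..., B_{n_max}] as exact fractions (convention B_1 = -1/2)."""
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        # Recurrence: sum_{j=0}^{m} C(m+1, j) * B_j = 0, solved for B_m.
        partial = sum(comb(m + 1, j) * B[j] for j in range(m))
        B.append(-partial / (m + 1))
    return B


def is_irregular(p, B):
    """Kummer's criterion: p divides the numerator of B_k for some even 2 <= k <= p - 3."""
    return any(B[k].numerator % p == 0 for k in range(2, p - 2, 2))


primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67]
B = bernoulli_numbers(max(primes) - 3)
irregular = [p for p in primes if is_irregular(p, B)]
print(irregular)  # [37, 59, 67], the "yes" columns of the class number table
```

<p>Exact rational arithmetic is used on purpose here: the criterion is about divisibility of numerators, so floating point would be of no use.</p>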
<p id="p-953">First we need to recall the mod \(p\) cyclotomic <a knowl="./knowl/def-character.html" knowl-id="xref-def-character" alt="Definition 18.0.2 Characters" title="Definition 18.0.2 Characters">character</a> \(\chi\colon \operatorname{Gal}(\overline{\mathbf{Q}}/\mathbf{Q}) \to \mathbf{F}_p^*\) defined by
\begin{equation*}
\zeta_p^{\chi(\sigma)} = \sigma (\zeta_p)\text{.}
\end{equation*}
</p>
<article class="theorem-like" id="thm-herbrand-ribet"><h5 class="heading">
<span class="type">Theorem</span> <span class="title">Herbrand-Ribet</span>
</h5>
<p id="p-954">Write \(C = \operatorname{cl}(\mathbf{Q}(\mu_p))/\operatorname{cl}(\mathbf{Q}(\mu_p))^p\text{;}\) this is an \(\mathbf{F}_p\)-linear Galois <a knowl="./knowl/def-lie-gp-rep.html" knowl-id="xref-def-lie-gp-rep" alt="Definition 27.5.17 Lie group representations" title="Definition 27.5.17 Lie group representations">representation</a> which decomposes as a sum of eigenspaces
\begin{equation*}
C = \bigoplus_{i=0}^{p-2} C(\chi^i)\text{.}
\end{equation*}
Then for \(2\le k\le p-3\) even we have
\begin{equation*}
p| B_k \iff C(\chi^{1-k}) \ne 0\text{.}
\end{equation*}
</p></article><p id="p-955">The \(\Leftarrow\) direction was proved by Herbrand in 1932, and the \(\Rightarrow \) direction by Ribet in 1976.</p>
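<p>To make this concrete, here is a small worked instance, only a sketch using values that appear in the tables of this post. For \(p = 37\) the table of numerators shows \(37 \mid B_{32}\text{,}\) and since \(\chi\) has order \(p - 1 = 36\) the exponent \(1 - 32 = -31\) can be reduced modulo 36:</p>

```latex
37 \mid B_{32} \iff C(\chi^{1-32}) = C(\chi^{-31}) = C(\chi^{5}) \ne 0 .
```

<p>So the factor of 37 in \(h_{37} = 37\) shows up in the \(\chi^{5}\)-eigenspace of the class group.</p>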
<p id="p-956">Now for completeness here is a table of factorisations of <a knowl="./knowl/def-bernoulli-numbers.html" knowl-id="xref-def-bernoulli-numbers" alt="Definition 34.1.4 Bernoulli numbers" title="Definition 34.1.4 Bernoulli numbers">Bernoulli number</a> numerators.</p>
<figure class="figure-like" id="table-2"><table>
<tr>
<td class="l m b0 r0 l0 t0 lines">\(k\text{:}\)</td>
<td class="l m b0 r0 l0 t0 lines">\(2\)</td>
<td class="l m b0 r0 l0 t0 lines">\(4\)</td>
<td class="l m b0 r0 l0 t0 lines">\(6\)</td>
<td class="l m b0 r0 l0 t0 lines">\(8\)</td>
<td class="l m b0 r0 l0 t0 lines">\(10\)</td>
<td class="l m b0 r0 l0 t0 lines">\(12\)</td>
<td class="l m b0 r0 l0 t0 lines">\(14\)</td>
<td class="l m b0 r0 l0 t0 lines">\(16\)</td>
<td class="l m b0 r0 l0 t0 lines">\(18\)</td>
<td class="l m b0 r0 l0 t0 lines">\(20\)</td>
<td class="l m b0 r0 l0 t0 lines">\(22\)</td>
<td class="l m b0 r0 l0 t0 lines">\(24\)</td>
<td class="l m b0 r0 l0 t0 lines">\(26\)</td>
<td class="l m b0 r0 l0 t0 lines">\(28\)</td>
<td class="l m b0 r0 l0 t0 lines">\(30\)</td>
<td class="l m b0 r0 l0 t0 lines">\(32\)</td>
<td class="l m b0 r0 l0 t0 lines">\(34\)</td>
<td class="l m b0 r0 l0 t0 lines">\(36\)</td>
<td class="l m b0 r0 l0 t0 lines">\(38\)</td>
<td class="l m b0 r0 l0 t0 lines">\(40\)</td>
<td class="l m b0 r0 l0 t0 lines">\(42\)</td>
<td class="l m b0 r0 l0 t0 lines">\(44\)</td>
<td class="l m b0 r0 l0 t0 lines">\(46\)</td>
<td class="l m b0 r0 l0 t0 lines">\(48\)</td>
<td class="l m b0 r0 l0 t0 lines">\(50\)</td>
<td class="l m b0 r0 l0 t0 lines">\(52\)</td>
<td class="l m b0 r0 l0 t0 lines">\(54\)</td>
<td class="l m b0 r0 l0 t0 lines">\(56\)</td>
<td class="l m b0 r0 l0 t0 lines">\(58\)</td>
</tr>
<tr>
<td class="l m b0 r0 l0 t0 lines">Numerator of \(B_{k}\text{:}\)</td>
<td class="l m b0 r0 l0 t0 lines">\(1\)</td>
<td class="l m b0 r0 l0 t0 lines">\(-1\)</td>
<td class="l m b0 r0 l0 t0 lines">\(1\)</td>
<td class="l m b0 r0 l0 t0 lines">\(-1\)</td>
<td class="l m b0 r0 l0 t0 lines">\(5\)</td>
<td class="l m b0 r0 l0 t0 lines">\(-1 \cdot 691\)</td>
<td class="l m b0 r0 l0 t0 lines">\(7\)</td>
<td class="l m b0 r0 l0 t0 lines">\(-1 \cdot 3617\)</td>
<td class="l m b0 r0 l0 t0 lines">\(43867\)</td>
<td class="l m b0 r0 l0 t0 lines">\(-1 \cdot 283 \cdot 617\)</td>
<td class="l m b0 r0 l0 t0 lines">\(11 \cdot 131 \cdot 593\)</td>
<td class="l m b0 r0 l0 t0 lines">\(-1 \cdot 103 \cdot 2294797\)</td>
<td class="l m b0 r0 l0 t0 lines">\(13 \cdot 657931\)</td>
<td class="l m b0 r0 l0 t0 lines">\(-1 \cdot 7 \cdot 9349 \cdot 362903\)</td>
<td class="l m b0 r0 l0 t0 lines">\(5 \cdot 1721 \cdot 1001259881\)</td>
<td class="l m b0 r0 l0 t0 lines">\(-1 \cdot 37 \cdot 683 \cdot 305065927\)</td>
<td class="l m b0 r0 l0 t0 lines">\(17 \cdot 151628697551\)</td>
<td class="l m b0 r0 l0 t0 lines">\(-1 \cdot 26315271553053477373\)</td>
<td class="l m b0 r0 l0 t0 lines">\(19 \cdot 154210205991661\)</td>
<td class="l m b0 r0 l0 t0 lines">\(-1 \cdot 137616929 \cdot 1897170067619\)</td>
<td class="l m b0 r0 l0 t0 lines">\(1520097643918070802691\)</td>
<td class="l m b0 r0 l0 t0 lines">\(-1 \cdot 11 \cdot 59 \cdot 8089 \cdot 2947939 \cdot 1798482437\)</td>
<td class="l m b0 r0 l0 t0 lines">\(23 \cdot 383799511 \cdot 67568238839737\)</td>
<td class="l m b0 r0 l0 t0 lines">\(-1 \cdot 653 \cdot 56039 \cdot 153289748932447906241\)</td>
<td class="l m b0 r0 l0 t0 lines">\(5^{2} \cdot 417202699 \cdot 47464429777438199\)</td>
<td class="l m b0 r0 l0 t0 lines">\(-1 \cdot 13 \cdot 577 \cdot 58741 \cdot 401029177 \cdot 4534045619429\)</td>
<td class="l m b0 r0 l0 t0 lines">\(39409 \cdot 660183281 \cdot 1120412849144121779\)</td>
<td class="l m b0 r0 l0 t0 lines">\(-1 \cdot 7 \cdot 113161 \cdot 163979 \cdot 19088082706840550550313\)</td>
<td class="l m b0 r0 l0 t0 lines">\(29 \cdot 67 \cdot 186707 \cdot 6235242049 \cdot 37349583369104129\)</td>
</tr>
</table></figure>
</section>
Thu, 02 Mar 2017 00:00:00 +0000
https://alexjbest.github.io/blog/maths/2017/03/02/ribets-converse-to-herbrand.html
Maths
PSA: Quadratic reciprocity
<p>The quadratic reciprocity law also holds for Jacobi symbols, that is for any coprime positive odd entries, not just primes
<script type="math/tex">\left(\frac{a}{b}\right) = \left(\frac{b}{a}\right) (-1)^{\frac{a-1}{2}\frac{b-1}{2}}.</script></p>
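<p>For anyone who wants to see this in action, here is a minimal Python sketch (the function names are illustrative, not from any library). To avoid circularity it computes the Jacobi symbol straight from its definition as a product of Legendre symbols, each evaluated with Euler's criterion, and then checks the displayed law for every coprime pair of odd numbers below 200.</p>

```python
from math import gcd


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def odd_prime_factors(n):
    """Prime factors of an odd positive integer n, listed with multiplicity."""
    factors, d = [], 3
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 2
    if n > 1:
        factors.append(n)
    return factors


def jacobi(a, b):
    """Jacobi symbol (a/b): product of Legendre symbols over the primes dividing b."""
    result = 1
    for p in odd_prime_factors(b):
        result *= legendre(a, p)
    return result


def reciprocity_holds(a, b):
    """Check (a/b) == (b/a) * (-1)^((a-1)/2 * (b-1)/2) for one pair."""
    sign = (-1) ** (((a - 1) // 2) * ((b - 1) // 2))
    return jacobi(a, b) == jacobi(b, a) * sign


pairs = [(a, b) for a in range(1, 200, 2) for b in range(1, 200, 2) if gcd(a, b) == 1]
print(all(reciprocity_holds(a, b) for a, b in pairs))  # True
```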
Tue, 31 Jan 2017 00:00:00 +0000
https://alexjbest.github.io/blog/maths/2017/01/31/psa-quadratic-reciprocity.html
Maths
Making T-Shirts!
<p>This is a placeholder for a post that will appear tomorrow to stop Beeminder taking my money.
It’ll be worth the wait!</p>
Sat, 31 Dec 2016 00:00:00 +0000
https://alexjbest.github.io/blog/2016/12/31/making-t-shirts.html
What's that category?
<blockquote>
<p>Some vague thoughts about a weird category my housemate and I got to thinking about recently. Unfortunately I’m a little too sleepy to write anything more coherent right now, but Beeminder demands tribute.</p>
</blockquote>
<p>When doing non-abelian group cohomology (and many other things) you end up dealing with the category of pointed sets, that is, sets with a point specified, and where morphisms must map specified points to each other.
This category is actually fairly nice (or at least nicer than plain $\mathrm{Set}$ is anyway) insomuch as it has a zero object (i.e. an object that is both initial and terminal), namely the one element set. This allows us to make sense of kernels etc., which is a nice thing to be able to talk about.
So why do we get a 0-object here? Well, the one element set is already terminal in $\mathrm{Set}$ so we get terminalness for free; as for <em>why</em> it is initial, it’s revealing to describe the category of pointed sets in a different way.
We can equivalently describe it as the <a href="https://en.wikipedia.org/wiki/Comma_category#Coslice_category">coslice category</a> $\{* \} \downarrow \mathrm{Set}$, where the objects are morphisms in $\mathrm{Set}$ from the one element set (giving you the specified point) and the new morphisms are commuting triangles in $\mathrm{Set}$.
When we do this we of course get the object we cosliced at (or more specifically its identity morphism) as an initial object, and as we already had a unique map from any object to this object we get that its identity map to itself is terminal also.</p>
<p>This got me thinking about $\mathrm{CRing}$, which infamously doesn’t have kernels, so what if we follow the recipe above, or at least its dual, and consider the slice category over the initial object $\mathbf{Z}$?
By the dual of the above this should now have a 0-object (the identity map on $\mathbf{Z}$) and so we could form kernels, indeed the kernel of a morphism $A \to B$ (where both $A$ and $B$ have an associated map to $\mathbf{Z}$) should be the preimage of $\mathbf{Z}$ I suppose, and this is probably a subring?</p>
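<p>Here is a sketch of why that guess works out, with notation that is mine rather than standard: write $\varepsilon_A\colon A \to \mathbf{Z}$ and $\varepsilon_B\colon B \to \mathbf{Z}$ for the structure maps and $\iota_B\colon \mathbf{Z} \to B$ for the unique ring map. The zero morphism $A \to B$ is $\iota_B \circ \varepsilon_A$, and a morphism $f\colon A \to B$ satisfies $\varepsilon_B \circ f = \varepsilon_A$, so if $f(a) = \iota_B(n)$ then applying $\varepsilon_B$ forces $n = \varepsilon_A(a)$. Assuming the kernel is computed as the equaliser of $f$ with the zero morphism, this gives:</p>

```latex
\ker f = \{ a \in A : f(a) = \iota_B(\varepsilon_A(a)) \} = f^{-1}(\iota_B(\mathbf{Z})).
```

<p>Being an equaliser of two ring homomorphisms, this is a subring of $A$, and it still maps to $\mathbf{Z}$ by restricting $\varepsilon_A$.</p>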
<p>Before we get to that though we should ask: what does this category $\mathrm{CRing}\downarrow \mathbf{Z}$ even look like?! It’s not absolutely trivial as far as I can tell, as we get maps from polynomial rings into $\mathbf{Z}$ from evaluating at integer vectors.
We can also add in nilpotents and other stuff that just ends up mapping to zero but is still fine.
However on the other hand we immediately rule out positive characteristic rings, and anything in which an integer other than $\pm 1$ becomes invertible.</p>
<p>I really have no conclusions about what this category of commutative rings with a map to $\mathbf{Z}$ actually is, but it is quite fun to play with.</p>
Mon, 31 Oct 2016 00:00:00 +0000
https://alexjbest.github.io/blog/maths/2016/10/31/whats-that-category.html
Maths
Some funny representations
<blockquote>
<p>As part of a discussion in our Galois representations course John Bergdall challenged us to come up with a representation that is irreducible but not absolutely semi-simple.
I found this a pretty fun thing to think about so I thought I’d write up my progress and what the next steps are.</p>
</blockquote>
<p>First things first, a reminder of the definitions: an irreducible representation is one with no proper nonzero invariant subspace, and a semisimple representation is one which can be written as a direct sum of irreducible representations.
Adding the word absolutely just means that we require the property to hold after extending scalars to an algebraic closure of the base field.
By allowing more general coefficients things that are irreducible to start with can easily become reducible.</p>
<p>To start our search let’s work with the simplest type of potentially interesting representations I can think of; these look something like</p>
<script type="math/tex; mode=display">\rho\colon \mathbf{Z} \to \operatorname{GL}_2(K)</script>
<p>which are entirely determined by the image of 1, so let’s try and find an appropriate matrix to make the example we want work.</p>
<p>In my head non-semi-simple things look something like this</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{pmatrix}
*&*\\0&*
\end{pmatrix} %]]></script>
<p>So in order to find an irreducible, non absolutely semisimple representation we want to find a matrix over a field $K$ which has no eigenvalues in $K$, but which has a repeated eigenvalue over the algebraic closure.
This is not possible over any subfield of $\mathbf{C}$ for example (or any field of characteristic other than 2), as the trace, which lies in $K$, would have to be twice the eigenvalue.
Dividing by 2 would then give that the eigenvalue was itself in the ground field $K$ in the first place.</p>
<p>This sort of weird multiple roots of irreducible polynomials stuff happens only for non-perfect fields (by definition), of which the most quoted example is $\mathbf{F}_p((t))$.
Here the polynomial $x^p - t$ is irreducible but has repeated roots over the algebraic closure as it factors as $(x-\sqrt[p]{t})^p$.
As we are dealing with 2-dimensional representations here we should look for a matrix over $\mathbf{F}_2((t))$ with characteristic polynomial $x^2 - t$, one simple example of this is the matrix</p>
<script type="math/tex; mode=display">% <![CDATA[
M =
\begin{pmatrix}
0&t\\1&0
\end{pmatrix}. %]]></script>
<p>So we have a matrix that has no eigenvalues over the base field but repeated eigenvalues over the algebraic closure.
We now need to check that it only has a one dimensional eigenspace, to be sure that we have a non absolutely semisimple representation.
This eigenspace is given by the kernel of</p>
<script type="math/tex; mode=display">% <![CDATA[
M - \sqrt{t}I =
\begin{pmatrix}
-\sqrt{t}&t\\1&-\sqrt{t}
\end{pmatrix} %]]></script>
<p>which is indeed one dimensional and so we are done.</p>
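<p>The two facts used here can be checked mechanically. Below is a small Python sketch (the encoding and the helper names <code>poly_mul</code> and <code>mat_mul</code> are illustrative choices, not from any library). It works in $\mathbf{F}_2[s]$ with $s = \sqrt{t}$, storing a polynomial as an integer whose bits are its coefficients, and checks that $M^2 = tI$ and that $N = M - \sqrt{t}I$ is nonzero with $N^2 = 0$. A nonzero $2\times 2$ matrix squaring to zero has rank one, so its kernel is a line.</p>

```python
def poly_mul(a, b):
    """Multiply two polynomials over F_2, each encoded as an int bitmask."""
    result = 0
    while b:
        if b & 1:
            result ^= a  # addition in F_2[s] is XOR of coefficient bits
        a <<= 1
        b >>= 1
    return result


def mat_mul(A, B):
    """Product of two 2x2 matrices with entries in F_2[s]."""
    return [
        [poly_mul(A[i][0], B[0][j]) ^ poly_mul(A[i][1], B[1][j]) for j in range(2)]
        for i in range(2)
    ]


S = 0b10            # the polynomial s, standing for sqrt(t)
T = poly_mul(S, S)  # t = s^2

M = [[0, T], [1, 0]]
N = [[S, T], [1, S]]  # M - sqrt(t) I; signs do not matter in characteristic 2

M2 = mat_mul(M, M)
N2 = mat_mul(N, N)
print(M2 == [[T, 0], [0, T]])  # True: M^2 = t I
print(N2 == [[0, 0], [0, 0]])  # True: N is nilpotent, hence of rank one
```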
<p>One property of this example is that if we consider the restriction of the representation to the subgroup $2\mathbf{Z}$ we get something semisimple as the square of our matrix $M$ is diagonal.
The new improved challenge is therefore to find a representation for which this doesn’t happen either; more explicitly to find an irreducible not absolutely semisimple representation which remains that way on all finite index subgroups.
I don’t think the example I have right now can be extended to this case and it might be necessary to look at representations of more exotic groups than simply $\mathbf{Z}$ for this.</p>
<p>Thoughts welcome!</p>
Sat, 01 Oct 2016 00:00:00 +0100
https://alexjbest.github.io/blog/maths/2016/10/01/some-funny-representations.html
Maths
Brief intermission - please stand by
<blockquote>
<p>Just a quick post to explain my recent absence (and get beeminder off my back).</p>
</blockquote>
<p>Recently I haven’t posted that much, the reason being that I just moved to the US to start a PhD in maths at Boston University and things have been kinda busy with the move.
I’m really excited about starting doing maths full time again, and I’m sure there will be a lot more to post about in the very near future.
I just moved into an apartment today after a brief stay in a hotel while I attended a week long international orientation, not so much maths but I did get the chance to give a 10 minute talk on the 4-colour theorem.</p>
<p>See you soon.</p>
Thu, 01 Sep 2016 00:00:00 +0100
https://alexjbest.github.io/blog/maths/2016/09/01/brief-intermission.html
Maths
7ECM - Some personal thoughts
<blockquote>
<p>So I’m currently on my way back from the 7th European Congress of Mathematics, which took place in Berlin over the course of the last week.
While everything is still fresh(ish) in my memory I wanted to record some of the bits that I (personally) found most interesting so that I’ll be able to look back when I inevitably do forget what I did for a whole week.
If you like similar things to me then maybe you’ll find something interesting here too (if you don’t like similar things to me then I question your choice to read this blog).
Some of these topics I’d like to revisit in more detail (possibly in a future post?!) but for now these short snippets will have to do.
So in approximately no particular order and undoubtedly with some gaps.</p>
</blockquote>
<h2 id="peter-scholze">Peter Scholze</h2>
<p>This is the most obvious one for me: Scholze talked about perfectoid spaces, which since he introduced them in his thesis have become a hot topic in number theory.
Scholze himself has been awarded various prizes and honours for developing this theory (including another at the ECM).</p>
<p>The motivation is to transfer problems involving the $p$-adics (which can be problematic as they are mixed characteristic) over to the similar looking ring $\mathbf{F}_p((t))$.
Scholze said he wants to fight for the freedom of $p$.
The way he does this is via a technique called tilting, which was originally used by Fontaine and Wintenberger to prove a result relating the Galois theories of these fields.
Scholze (and independently Kedlaya-Liu) takes this result and distils out the definition of a field where this works and calls it a perfectoid field.
He then does some natural looking things (which I’m sure are actually very complicated to do properly) and constructs perfectoid algebras and perfectoid spaces.
He shows that various things work out nicely and that one can tilt the spaces too in a way that matches what you’d hope for simple spaces like projective space.</p>
<p>In the last part of the talk Scholze got increasingly high level so rather than embarrass myself trying to replicate it I’ll just say that it definitely looks like he has more impressive things forthcoming.</p>
<p>One thing that was especially interesting for me in this talk was mention of work by Jared Weinstein who works at Boston University (where I will also be from September).
I was aware that Weinstein was doing work in this area so it was really interesting to have a bit of it put into context.</p>
<h2 id="karin-vogtmann">Karin Vogtmann</h2>
<p>Vogtmann first gave a nice introduction to outer automorphisms of the free group and why one might care, and got me interested with the observation that abelianising gives automorphisms of $\mathbf{Z}^d$.
She also described various related “outer spaces”, which are spaces on which the outer automorphisms of free groups act.
The construction is really cool and involves deforming certain metrized graphs using the interpretation of $\operatorname{Out}(F_n)$ as the automorphisms of the rose graph.
(One thing she mentioned in passing here that I’d like to learn more about is Gromov’s topology on the space of all metric spaces, so meta.)
The gist was that the various slightly different outer spaces were all very related to the main one.
The pictures were very good and extremely helpful and for me this talk definitely wins the Oscar for best visual effects.</p>
<h2 id="don-zagier">Don Zagier</h2>
<p>I found Don’s style of presenting (rapidly moving through hand written slides while barely breathing) engaging and super impressive, multiple times he switched to a slide and corrected something before I’d even had time to read the first sentence.
The content itself was also really fascinating: he started with a fairly random looking recursively defined sequence that Apéry used to prove that $\zeta(3)$ is irrational and gave multiple interpretations of where such a sequence comes from (though in fact there is still something very special going on).
He ended up giving some motivation for the concept of motives (pun not intended) and presented us with a few results concerning various special values that were conjectured by a coauthor of his based on pure thought on the level of motives and proved by Zagier and others on the concrete level.</p>
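<p>For concreteness, and assuming the sequence in question is the standard one from Apéry's proof (the talk's slides are not reproduced here), the Apéry numbers $1, 5, 73, 1445, \ldots$ satisfy $(n+1)^3 a_{n+1} = (34n^3 + 51n^2 + 27n + 5)a_n - n^3 a_{n-1}$. The special thing is that every term is an integer even though the recurrence divides by $(n+1)^3$ at each step. A minimal Python sketch, with made-up function names, comparing the recurrence to the known closed form $\sum_k \binom{n}{k}^2\binom{n+k}{k}^2$:</p>

```python
from math import comb


def apery_by_sum(n):
    """Closed form: sum over k of C(n, k)^2 * C(n + k, k)^2."""
    return sum(comb(n, k) ** 2 * comb(n + k, k) ** 2 for k in range(n + 1))


def apery_by_recurrence(count):
    """First `count` Apery numbers from the three-term recurrence."""
    a = [1, 5]
    for n in range(1, count - 1):
        # (n+1)^3 a_{n+1} = (34n^3 + 51n^2 + 27n + 5) a_n - n^3 a_{n-1}
        numerator = (34 * n**3 + 51 * n**2 + 27 * n + 5) * a[n] - n**3 * a[n - 1]
        # The division below is exact, which is the surprising part.
        assert numerator % (n + 1) ** 3 == 0
        a.append(numerator // (n + 1) ** 3)
    return a[:count]


print(apery_by_recurrence(4))  # [1, 5, 73, 1445]
print(apery_by_recurrence(10) == [apery_by_sum(n) for n in range(10)])  # True
```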
<p>Unfortunately I can’t remember quite as much of what went on as I’d like due to the above-mentioned speed and Don’s recommendation at the start that we “don’t even bother trying to take notes”.
I do hope a video of the lecture will appear at some point as out of all the talks I saw I think this is the one I’d most benefit from rewatching in its entirety.</p>
<p>Another fun coincidence for me personally was that he mentioned some work of Irene Bouw, who I met only days beforehand in Ulm.</p>
<h2 id="tommy-hofman">Tommy Hofmann</h2>
<p>Tommy talked about an algorithm he and Claus Fieker developed to compute Hermite normal forms over Dedekind domains.
This was another exceptional example of the law of large numbers (or whichever law it is that explains funny coincidences) as previously I worked on implementing algorithms for computing HNFs and doing this work was in some way the original cause of me coming to Kaiserslautern, which is where Tommy works!
The main takeaway from this talk (other than a nice new algorithm) is that quotients of rings of integers by powers of prime ideals are always Euclidean rings.</p>
<h2 id="other-nice-talks">Other nice talks</h2>
<ul>
<li>
<p>Fedor Pakovich found a relationship between Davenport-Zannier pairs (generalisations of pairs of polynomials $(f,g)$ such that $f^3-g^2$ is of minimal possible degree, excluding trivial things) and Dessins d’Enfants.
He used this to obtain a classification of these pairs (roughly 10 families and 11 sporadic pairs) by counting the corresponding graphs (which ended up being interpretable as certain weighted trees or something like this).
One of the reasons I found this so interesting was that it’s the first time I think I’ve seen Dessins used in anger; I started reading Girondo and González-Diez’s book on compact Riemann surfaces and Dessins about a year ago but got distracted before I got to the juicy bit.</p>
</li>
<li>
<p>Thomas Willwacher talked about cohomology of graph complexes which it turns out is really hard to compute but relates many other areas of maths (including automorphisms of free groups!).</p>
</li>
<li>
<p>Maryna Viazovska gave a nice talk about her (and others’) recent solution of the sphere packing problem in dimensions 8 and 24, which got a lot of popular attention at the time so was nice to hear about.</p>
</li>
<li>
<p>Meinolf Geck spoke about his work computing with groups, and remarked that finite groups of Lie type $G(\mathbf{F}_q)$ make up a large portion of the classification of finite simple groups. He wants to construct generic character tables as $q$ varies for different algebraic groups and possibly different primes - and this actually looks to work! That this is even possible totally blew my mind.</p>
</li>
<li>
<p>Uli Wagner spoke about the topological Tverberg conjecture; Günter Ziegler gave a talk that touched on this in Warwick in 2014 which I really enjoyed, so it was cool to hear about recent progress regarding it.</p>
</li>
<li>
<p>Jeremy Gray spoke about Poincaré and Weyl and it was a lot more philosophical than I was expecting, contrasting their views on philosophy of science over time, but it was fascinating stuff.</p>
</li>
</ul>
<p>There were certainly many many other interesting talks that I saw (and indeed many that I didn’t) but I’m tired and my memory is failing now and I have to make a post or Beeminder will take my money.</p>
<h2 id="other-thoughts">Other thoughts</h2>
<p>There were a few political statements made during the conference, for example by Pavel Exner condemning the situation in Turkey (where academics were recalled/banned from travelling) and from Timothy Gowers regarding Brexit (short version: 48% of us voted to stay).
This is something that I think is appropriate for large I/ECM like conferences, which represent the whole mathematical community.</p>
<h2 id="non-maths">Non-maths</h2>
<p>Berlin is a really awesome city so even outside of the conference I did a lot of nice things, and I’m really glad I got a good excuse to visit before I finish my time in Germany.
I took quite a lot of photographs (especially the Sunday before the conference when it was really nicely overcast), maybe I’ll upload some to Flickr and put a link here when I’m back in the UK and have some free time.</p>
Sat, 23 Jul 2016 00:00:00 +0100
https://alexjbest.github.io/blog/maths/2016/07/23/7ecm-a-personal-retrospective.html
Maths
Weber modular functions
<blockquote>
<p>A quick introduction to modular functions, and the Weber modular functions in particular.</p>
</blockquote>
<h2 id="background">Background</h2>
<p>A <strong>modular function</strong> for a subgroup $\Gamma$ of $\operatorname{SL}_2(\mathbf{Z})$ is a meromorphic function $f$ on the upper half plane $\mathbf{H} =\{x+iy\in \mathbf{C} : y > 0\}$, that is invariant under all matrices in $\Gamma$ when they act via</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{pmatrix}
a&b\\c&d
\end{pmatrix}
\cdot f(\tau) = f\left(\frac{a\tau+b}{c\tau+d}\right), %]]></script>
<p>(we also need $f$ to be “meromorphic at the cusps” too but let’s not worry about this right now).</p>
<p>So for example letting $\Gamma$ be the whole of $\operatorname{SL}_2(\mathbf{Z})$ we see that modular functions for $\operatorname{SL}_2(\mathbf{Z})$ must satisfy $f(\tau) = f(\tau+ 1)$ and $f(\tau) = f(-1/\tau) $ for all $\tau\in\mathbf{H}$, as</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{pmatrix}
1&1\\0&1
\end{pmatrix}
\text{and}
\begin{pmatrix}
0&-1\\1&0
\end{pmatrix} %]]></script>
<p>are both in the modular group.
In fact, as the above matrices generate the whole modular group, being invariant under the above two transformations is sufficient for a function on the upper half plane to be modular for $\operatorname{SL}_2(\mathbf{Z})$.</p>
<p>The most well known modular function is undoubtedly the $j$-invariant:</p>
<script type="math/tex; mode=display">j(\tau) = \frac{1}{q} + 744 + 196884 q + 21493760 q^2 + 864299970 q^3 + \cdots,</script>
<p>where $q = e^{2\pi i \tau}$.
If you haven’t seen the $j$-invariant before this description as a $q$-expansion is not very helpful, but rest assured this function appears very naturally when considering lattices in $\mathbf{C}$.
In any case the properties of this function are the interesting and important part: first and foremost it is as injective as a modular function possibly can be.
What I mean by this is that not only do we have</p>
<script type="math/tex; mode=display">% <![CDATA[
j\left(\frac{a\tau+b}{c\tau+d}\right) = j(\tau) %]]></script>
<p>for any $\tau$ and any matrix in $\operatorname{SL}_2(\mathbf{Z})$, but if $j(\tau) = j(\sigma)$ then there is some matrix</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{pmatrix}
a&b\\c&d
\end{pmatrix}
\in \operatorname{SL}_2(\mathbf{Z}) \text{ such that } \sigma = \frac{a\tau+b}{c\tau+d}. %]]></script>
<p>This is useful as it allows you to find out if two points in the upper half plane can be transformed into each other by the above matrix action simply by calculating the value of $j$ at these points.
This is especially handy when dealing with elliptic curves, which correspond to points in the upper half plane and are isomorphic only when their representative points are equivalent under the above action.
This is the reason that $j$ is called the $j$-invariant.</p>
<p>In addition to this another nice fact about $j$ is that it generates the set of all modular functions for the full modular group when we take rational functions of it, that is</p>
<script type="math/tex; mode=display">\{ \text{modular functions for } \operatorname{SL}_2(\mathbf{Z})\} = \mathbf{C}(j).</script>
<p>It is no surprise then that $j$ is a sort of poster child for modular functions, and indeed before looking into this topic I don’t think that I knew an interesting example of a modular function other than $j$, well now I do!</p>
<h2 id="weber-modular-functions">Weber modular functions</h2>
<p>In order to define some other modular functions we will use an intermediary function, $\eta$, which has $q$-expansion:</p>
<script type="math/tex; mode=display">\eta(\tau) = q^{1/24}\prod_{n=1}^\infty(1-q^n).</script>
<p>Now $\eta$ is not a modular function, rather a modular form of weight $\frac{1}{2}$, which means that instead of the condition we had for modular functions we have that, for some 24th root of unity $\varepsilon$ depending on the matrix,</p>
<script type="math/tex; mode=display">% <![CDATA[
\eta\left( \frac{az+b}{cz+d} \right) = \varepsilon\,(cz+d)^{1/2}\, \eta(z) \text{ for all }\begin{pmatrix} a&b\\c&d \end{pmatrix}\in\operatorname{SL}_2(\mathbf{Z}). %]]></script>
<p>One consequence of this is that if we divide $\eta$ by another modular form of weight $\frac12$ we get a modular function.
But we have only seen one such function so far, $\eta$, and $\eta/\eta$ is not a very interesting function.
We can however apply transformations to $\eta$ similar to those above with elements of the general linear group $\operatorname{GL}_2(\mathbf{Q})$ instead (specifically ones that $\eta$ is not invariant under).
This will allow us to create new functions of the same type.
For instance we let:</p>
<script type="math/tex; mode=display">\mathfrak{f}(\tau) = e^{\frac{-\pi i}{24}}\frac{\eta\left(\frac{\tau + 1}{2}\right)}{\eta(\tau)} = q^{-\frac{1}{48}}\prod_{n=1}^\infty\left(1+q^{n-\frac{1}{2}}\right),</script>
<script type="math/tex; mode=display">\mathfrak{f}_1(\tau) = \frac{\eta\left(\frac{\tau}{2}\right)}{\eta(\tau)} = q^{-\frac{1}{48}}\prod_{n=1}^\infty\left(1-q^{n-\frac{1}{2}}\right),</script>
<script type="math/tex; mode=display">\mathfrak{f}_2(\tau) = \sqrt{2}\frac{\eta(2\tau)}{\eta(\tau)}= \sqrt{2}q^{\frac{1}{24}}\prod_{n=1}^\infty\left(1+q^{n}\right).</script>
<p>These are <strong>Weber modular functions</strong>, named after <a href="https://en.wikipedia.org/wiki/Heinrich_Martin_Weber">Heinrich Weber</a> who studied them in his Lehrbuch der Algebra.
Indeed we could have defined these via their $q$-expansions without mention of $\eta$ at all, but that makes them seem rather arbitrary, which they certainly aren’t!</p>
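<p>As a quick sanity check, here is a minimal Python sketch that compares the $\eta$-quotient definitions of $\mathfrak{f}$ and $\mathfrak{f}_1$ with their product expansions at a sample point of the upper half plane. The function names, the truncation at 60 factors and the sample point are my own choices for illustration, and $q^{x}$ is taken to mean $e^{2\pi i \tau x}$ throughout.</p>

```python
import cmath


def q_power(tau, exponent):
    """Return q**exponent, where q = exp(2*pi*i*tau)."""
    return cmath.exp(2j * cmath.pi * tau * exponent)


def eta(tau, terms=60):
    """Truncated q-expansion q^(1/24) * prod_{n=1}^{terms} (1 - q^n)."""
    value = q_power(tau, 1 / 24)
    for n in range(1, terms + 1):
        value *= 1 - q_power(tau, n)
    return value


def weber_f(tau):
    """f defined as an eta quotient."""
    return cmath.exp(-1j * cmath.pi / 24) * eta((tau + 1) / 2) / eta(tau)


def weber_f1(tau):
    """f_1 defined as an eta quotient."""
    return eta(tau / 2) / eta(tau)


def weber_f_product(tau, terms=60):
    """Truncated product q^(-1/48) * prod (1 + q^(n - 1/2))."""
    value = q_power(tau, -1 / 48)
    for n in range(1, terms + 1):
        value *= 1 + q_power(tau, n - 0.5)
    return value


def weber_f1_product(tau, terms=60):
    """Truncated product q^(-1/48) * prod (1 - q^(n - 1/2))."""
    value = q_power(tau, -1 / 48)
    for n in range(1, terms + 1):
        value *= 1 - q_power(tau, n - 0.5)
    return value


tau = 0.3 + 1.1j  # arbitrary sample point with positive imaginary part
print(abs(weber_f(tau) - weber_f_product(tau)))
print(abs(weber_f1(tau) - weber_f1_product(tau)))
```

<p>Both printed differences should be at the level of floating point rounding, since the products converge very quickly once the imaginary part of $\tau$ is not tiny.</p>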
<p>Notice that we have applied the elements
<script type="math/tex">% <![CDATA[
\begin{pmatrix}
1&1\\0&2
\end{pmatrix},
\begin{pmatrix}
1&0\\0&2
\end{pmatrix} \text{ and }
\begin{pmatrix}
2&0\\0&1
\end{pmatrix}, %]]></script></p>
<p>all of which have determinant $2$, to $\eta$ here.</p>
<p>Until now the only modular function we have looked at is $j$, which was a modular function for the whole of $\operatorname{SL}_2(\mathbf{Z})$ so you will no doubt be pleased to hear that these functions are in fact modular for</p>
<script type="math/tex; mode=display">% <![CDATA[
\Gamma_{48} = \left\{\begin{pmatrix}a&b\\c&d\end{pmatrix}\in\operatorname{SL}_2(\mathbf{Z}) : a\equiv d \equiv 1,\, b\equiv c\equiv 0 \pmod{48}\right\}, %]]></script>
<p>a proper subgroup of the full modular group.</p>
<p>By virtue of the normalising coefficients we used here, these functions satisfy several nice identities, for example</p>
<script type="math/tex; mode=display">\mathfrak{f}_1(\tau)^8 + \mathfrak{f}_2(\tau)^8 = \mathfrak{f}(\tau)^8.</script>
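<p>This identity is easy to test numerically. The following is a rough Python sketch, assuming a truncated product for $\eta$ (60 factors, an arbitrary choice of mine) and using the $\eta$-quotient definitions of the three functions; it is a check at a few sample points, not a proof.</p>

```python
import cmath


def eta(tau, terms=60):
    """Truncated q-expansion q^(1/24) * prod_{n=1}^{terms} (1 - q^n)."""
    value = cmath.exp(2j * cmath.pi * tau / 24)
    for n in range(1, terms + 1):
        value *= 1 - cmath.exp(2j * cmath.pi * tau * n)
    return value


def weber_functions(tau):
    """Return (f, f_1, f_2) at tau, each computed as an eta quotient."""
    f = cmath.exp(-1j * cmath.pi / 24) * eta((tau + 1) / 2) / eta(tau)
    f1 = eta(tau / 2) / eta(tau)
    f2 = cmath.sqrt(2) * eta(2 * tau) / eta(tau)
    return f, f1, f2


def identity_defect(tau):
    """Return |f_1^8 + f_2^8 - f^8|, which should be close to zero."""
    f, f1, f2 = weber_functions(tau)
    return abs(f1**8 + f2**8 - f**8)


# Arbitrary sample points in the upper half plane.
for tau in (0.3 + 1.1j, -0.45 + 0.8j, 1j):
    print(tau, identity_defect(tau))
```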
<p>Whilst these are some very lovely looking expressions, such things aren’t quite as exciting in total isolation.
So how can we use these functions?
One interesting thing we can do with these functions is generate Hilbert class fields. For this sort of work it will help to look at Shimura reciprocity, which will be the topic of a future post!</p>
Thu, 23 Jun 2016 00:00:00 +0100
https://alexjbest.github.io/blog/maths/2016/06/23/weber-modular-functions.html
https://alexjbest.github.io/blog/maths/2016/06/23/weber-modular-functions.htmlMathsNew Blog!<p>I enjoy reading a lot of different blogs, so I have been thinking it would be cool to (re-)start my own mathsy/programmingy blog on a somewhat regular basis.
So I have new-years-resolved to start blogging again/more regularly!</p>
<p>The eagle-eyed among you may notice that it is in fact April (or at least that that is when this post was published) so I have already pretty much failed…
However I shall not fail again!
No, this time I have set up a <a href="http://beeminder.com/alexjbest/blog">beeminder goal</a> pledging to blog regularly.
Meaning that I have to pay real money if I don’t keep this habit up, so you’re guaranteed to read <em>something</em> here at least.
Whether it was worth reading or not will be for future generations to say (i.e. not you! Ok, ok, constructive criticism is welcomed with open (but slightly bear-like and intimidating) arms).
This was probably a <strong>very-bad-idea</strong>, we shall see.</p>
<p>I have moved my old blog (which mostly concerned my work on <a href="http://flintlib.org">FLINT</a>) from Wordpress to this <a href="https://pages.github.com/">github.io</a> site powered by <a href="http://jekyllrb.com">Jekyll</a>.
So there is at least something here for me to build on, and if you want to follow this blog for updates <a href="https://alexjbest.github.io/blog/feed.xml">you can do so with rss/atom/smoke signals</a>.</p>
<p>This will be about almost any of the questions and explorations I go on and find interesting, probably mostly maths with a bit of computer science/other random things thrown in.
Until next time, watch this space (or watch me go broke)!</p>
Sat, 02 Apr 2016 00:00:00 +0100
https://alexjbest.github.io/blog/2016/04/02/new-blog.html
https://alexjbest.github.io/blog/2016/04/02/new-blog.htmlGSoC with Flint - Finishing up<p>(Update: after some more work on matrix kernels I managed to improve upon what is given below. I don't think it needs another post, but see the <a href="https://groups.google.com/d/msg/flint-devel/XpUaHNgk_Dc/Cou0KGuS7BIJ">long running thread</a> on the flint-devel mailing list if you are interested in the current performance of the code.)</p>
<p>First of all apologies for not keeping everyone up to date recently. I've just been cleaning up code and deciding what is in a good enough state to include in my final evaluation for GSoC so there wasn't that much new of interest to report. This is all now done and my <a href="https://github.com/alexjbest/flint2">GitHub repository</a> now contains only code that appears to be working well.</p>
<p>The main project goals of implementing quick methods to compute Hermite and Smith normal forms have been completed. Below is a table and graph comparing the timings to those obtained with Sage (I spent about half a week being very sad about my work until I realised I needed to clear the cache to get the right timings for Sage!). The matrices used were random square matrices, with the dimension given by the row labels and the number of entry bits by the column labels of the tables.</p>
<p>Flint timings (s):</p>
<table>
<tr>
<td></td>
<th>4</th>
<th>8</th>
<th>16</th>
<th>32</th>
<th>64</th>
<th>128</th>
</tr>
<tr>
<th>10</th>
<td>0.0001498</td>
<td>0.000181</td>
<td>0.0003962</td>
<td>0.0003608</td>
<td>0.0011182</td>
<td>0.0011266</td>
</tr>
<tr>
<th>20</th>
<td>0.0007708</td>
<td>0.0010738</td>
<td>0.0014056</td>
<td>0.0023238</td>
<td>0.0040728</td>
<td>0.009321</td>
</tr>
<tr>
<th>30</th>
<td>0.0021142</td>
<td>0.003127</td>
<td>0.0042294</td>
<td>0.007065</td>
<td>0.015792</td>
<td>0.0427272</td>
</tr>
<tr>
<th>40</th>
<td>0.0050182</td>
<td>0.0066494</td>
<td>0.0101572</td>
<td>0.0169392</td>
<td>0.038086</td>
<td>0.0985452</td>
</tr>
<tr>
<th>50</th>
<td>0.0105184</td>
<td>0.0138692</td>
<td>0.0183774</td>
<td>0.0319924</td>
<td>0.0812636</td>
<td>0.2245758</td>
</tr>
<tr>
<th>60</th>
<td>0.0181876</td>
<td>0.0243814</td>
<td>0.0342512</td>
<td>0.0625658</td>
<td>0.1578844</td>
<td>0.4360994</td>
</tr>
<tr>
<th>70</th>
<td>0.028565</td>
<td>0.0373348</td>
<td>0.0546572</td>
<td>0.1110402</td>
<td>0.290543</td>
<td>0.8147328</td>
</tr>
<tr>
<th>80</th>
<td>0.0417594</td>
<td>0.0595228</td>
<td>0.0882594</td>
<td>0.1842448</td>
<td>0.4881932</td>
<td>1.3889464</td>
</tr>
<tr>
<th>90</th>
<td>0.0694218</td>
<td>0.08668</td>
<td>0.1405782</td>
<td>0.2854802</td>
<td>0.7817248</td>
<td>2.2501918</td>
</tr>
<tr>
<th>100</th>
<td>0.0880376</td>
<td>0.1192832</td>
<td>0.2052142</td>
<td>0.4448414</td>
<td>1.245277</td>
<td>3.5487596</td>
</tr>
</table>
<p>Flint to Sage timing ratios (&lt; 1 is best for us):</p>
<table>
<tr>
<td></td>
<th>4</th>
<th>8</th>
<th>16</th>
<th>32</th>
<th>64</th>
<th>128</th>
</tr>
<tr>
<th>10</th>
<td>0.6965</td>
<td>1.0258</td>
<td>1.9950</td>
<td>1.5602</td>
<td>3.8941</td>
<td>2.8422</td>
</tr>
<tr>
<th>20</th>
<td>0.7261</td>
<td>0.8606</td>
<td>0.9396</td>
<td>1.1124</td>
<td>1.0772</td>
<td>0.9704</td>
</tr>
<tr>
<th>30</th>
<td>0.6531</td>
<td>0.7794</td>
<td>0.8381</td>
<td>0.8015</td>
<td>0.7669</td>
<td>0.7449</td>
</tr>
<tr>
<th>40</th>
<td>0.6492</td>
<td>0.7048</td>
<td>0.7380</td>
<td>0.5891</td>
<td>0.4896</td>
<td>0.4245</td>
</tr>
<tr>
<th>50</th>
<td>0.6595</td>
<td>0.6637</td>
<td>0.5511</td>
<td>0.3997</td>
<td>0.3666</td>
<td>0.3543</td>
</tr>
<tr>
<th>60</th>
<td>0.6354</td>
<td>0.6325</td>
<td>0.4950</td>
<td>0.3586</td>
<td>0.3128</td>
<td>0.2958</td>
</tr>
<tr>
<th>70</th>
<td>0.6016</td>
<td>0.5616</td>
<td>0.3836</td>
<td>0.3126</td>
<td>0.2764</td>
<td>0.2768</td>
</tr>
<tr>
<th>80</th>
<td>0.1957</td>
<td>0.2762</td>
<td>0.3801</td>
<td>0.6697</td>
<td>1.2174</td>
<td>1.6715</td>
</tr>
<tr>
<th>90</th>
<td>0.2509</td>
<td>0.3145</td>
<td>0.4730</td>
<td>0.8187</td>
<td>1.5402</td>
<td>2.1104</td>
</tr>
<tr>
<th>100</th>
<td>0.2527</td>
<td>0.3257</td>
<td>0.5340</td>
<td>0.9906</td>
<td>1.8450</td>
<td>2.5461</td>
</tr>
</table>
<p>For simplicity's sake I simply compared fmpz_mat_hnf_pernet_stein to Sage's hermite_form, so it's worth noting that Flint is faster still for small matrices if another method is used (which the fmpz_mat_hnf function should choose for you in practice). We can see here that although Flint does well for many matrices in this range, it gets worse as the matrix dimension and bit size get larger. Indeed this trend continues and my functions are far worse for very big matrices (Denis Kryzkov's benchmark <a href="https://github.com/krisk0/razin/blob/master/cout/benchmark_AlexBest_HNF.cout">here</a> gives a good indication of the scale of the problem). The run time of the asymptotically most efficient HNF method I have implemented (Pernet-Stein) depends heavily on the computation of nullspaces and so this is definitely an area that can be improved. Both approaches I've tried to speed up nullspace computation (multimodular and p-adic lifting) haven't turned out to be any better (asymptotically) than the pre-existing code based on the row echelon form. The remaining barrier here seems to be modular rref, which I've looked at improving over the past week. This is certainly possible and I plan to carry on working on it (I have a complete but buggy implementation of a method described in Storjohann's thesis and a working implementation of classical Gaussian elimination which is fast for small matrices at the moment). Finishing this will, I hope, bring the timings for Pernet-Stein down to something more like the ones in Sage. Sage uses IML for these computations, but even without using BLAS I think we could get a fair amount closer to IML's performance for large matrices.</p>
<p>As for Smith forms, the randomised algorithm I worked on remains problematic, so I have left that out for now and stuck with the two slightly older approaches: a general one due to Kannan and Bachem and a modular one due to Iliopoulos for full rank square matrices. Below are some timings, again with comparisons to Sage. The class of matrices used is the same as that above, which means that it is highly likely that Iliopoulos' method would be appropriate (though it may not have been chosen by fmpz_mat_snf). Sage uses Pari for this, but it was easier for me to change my existing timing code than to set up a direct comparison to Pari.</p>
<p>Flint timings (s):</p>
<table>
<tr>
<td></td>
<th>4</th>
<th>8</th>
<th>16</th>
<th>32</th>
<th>64</th>
<th>128</th>
</tr>
<tr>
<th>10</th>
<td>0.0001008</td>
<td>0.0001776</td>
<td>0.0002186</td>
<td>0.0005208</td>
<td>0.0016076</td>
<td>0.0042602</td>
</tr>
<tr>
<th>20</th>
<td>0.001181</td>
<td>0.0017224</td>
<td>0.0025602</td>
<td>0.0050306</td>
<td>0.0131318</td>
<td>0.038855</td>
</tr>
<tr>
<th>30</th>
<td>0.0039768</td>
<td>0.006887</td>
<td>0.013429</td>
<td>0.031941</td>
<td>0.0904368</td>
<td>0.2860226</td>
</tr>
<tr>
<th>40</th>
<td>0.0138918</td>
<td>0.0201976</td>
<td>0.0408486</td>
<td>0.1258706</td>
<td>0.386331</td>
<td>1.1849852</td>
</tr>
<tr>
<th>50</th>
<td>0.0312358</td>
<td>0.0544678</td>
<td>0.1120322</td>
<td>0.370241</td>
<td>1.155253</td>
<td>3.6941254</td>
</tr>
<tr>
<th>60</th>
<td>0.061067</td>
<td>0.116257</td>
<td>0.2779696</td>
<td>0.8377208</td>
<td>2.5126946</td>
<td>7.9345656</td>
</tr>
</table>
<p>Flint to Sage timing ratios (&lt; 1 is best for us):</p>
<table>
<tr>
<td></td>
<th>2</th>
<th>4</th>
<th>8</th>
<th>16</th>
<th>32</th>
<th>64</th>
<th>128</th>
</tr>
<tr>
<th>10</th>
<td>0.01630</td>
<td>0.02870</td>
<td>0.04939</td>
<td>0.05714</td>
<td>0.10546</td>
<td>0.22552</td>
<td>0.36044</td>
</tr>
<tr>
<th>20</th>
<td>0.05063</td>
<td>0.06777</td>
<td>0.08559</td>
<td>0.08868</td>
<td>0.10665</td>
<td>0.14573</td>
<td>0.19519</td>
</tr>
<tr>
<th>30</th>
<td>0.08086</td>
<td>0.07276</td>
<td>0.09437</td>
<td>0.10418</td>
<td>0.13330</td>
<td>0.17716</td>
<td>0.23537</td>
</tr>
<tr>
<th>40</th>
<td>0.08905</td>
<td>0.10716</td>
<td>0.09976</td>
<td>0.10617</td>
<td>0.16009</td>
<td>0.21715</td>
<td>0.27530</td>
</tr>
<tr>
<th>50</th>
<td>0.10550</td>
<td>0.11037</td>
<td>0.11270</td>
<td>0.11762</td>
<td>0.18405</td>
<td>0.24098</td>
<td>0.30266</td>
</tr>
<tr>
<th>60</th>
<td>0.12292</td>
<td>0.10537</td>
<td>0.11135</td>
<td>0.12858</td>
<td>0.17996</td>
<td>0.23113</td>
<td>0.29524</td>
</tr>
</table>
<p>(I will update this table on my blog when more timings finish; they are taking a while and I wanted to get this post out.)</p>
<p>These timings look good for Flint but I'm not completely sure yet what the large scale behaviour is.</p>
<p>I still have a number of more experimental bits of code around that I will continue to work on getting into a more stable and usable state, along with some other little bits that I never managed to get around to during the official GSoC period and hope to return to at some point.</p>
<p>Finally I want to say a huge thanks to everyone who commented on what I've been doing, and especially to Fredrik for his excellent advice over the course of the project. All the comments were very much appreciated.</p>
Mon, 18 Aug 2014 15:14:46 +0100
https://alexjbest.github.io/blog/gsoc/2014/08/18/gsoc-with-flint-finishing-up.html
https://alexjbest.github.io/blog/gsoc/2014/08/18/gsoc-with-flint-finishing-up.htmlflintgsocGSoCGSoC with Flint - Week 11<p>This week I've worked on a variety of different things, unfortunately getting hung up on most of them!</p>
<p>First I found some bugs in my Saunders-Wan Smith normal form code, most of them ended up being fixable, however one still eludes me. It seems the bug is fairly rare (occurs for roughly 1 in 10000 test cases) and it is certainly related to the random nature of the algorithm but my current thinking is that the current behaviour should not be happening even when we get unlucky with the randomness. I've left this issue alone for now after spending a couple of days making no progress on it.</p>
<p>I then moved on to writing some general HNF/SNF methods that pick between available implementations based on matrix size, norm and anything else that I could work out as being relevant. While doing this I found that the Kannan-Bachem Hermite form method worked better than the modular method some of the time and so I decided to try and combine them into a modular method that works by taking minors into HNF rather than rows. I had some problems making this work correctly when the input was not of full rank and all the fixes I tried ended up having some corner case that made them fail. It just occurred to me however that when the modular method is used as part of the Pernet-Stein algorithm the matrix given is always square and so this was not perhaps a completely wasted effort.</p>
<p>I then moved back to working on Pernet-Stein, specifically I added capacity for non-full row rank matrices, and fully general matrices should be done very soon.</p>
<p>This week I'm going to be doing some profiling and comparisons between my implementations and existing ones and try and work out the reasons why certain results are as they are and how the HNF/SNF methods I now have can be made faster in the future. It should be helpful to have a note of the barriers to having faster HNF/SNF computation in Flint and how they could be overcome.<br />
Of course I'll also be tidying up and documenting several bits as I go along to fill in any gaps in functionality I have left along the way.</p>
Tue, 05 Aug 2014 20:03:18 +0100
https://alexjbest.github.io/blog/gsoc/2014/08/05/gsoc-with-flint-week-11.html
https://alexjbest.github.io/blog/gsoc/2014/08/05/gsoc-with-flint-week-11.htmlflintgsocGSoCGSoC with Flint - Week 10<p>This week I worked more on the Smith normal form algorithm I talked about last week. I implemented Iliopoulos' algorithm for computation of the Smith form modulo an arbitrary integer, this procedure is used in a couple of places as part of Saunders and Wan's "engineered" algorithm. Firstly we use a prime power modulus to find the Smith normal form locally for small primes (i.e. those less than 100), the modular approach is also used for the rough part (concerning all the other primes) when the largest non-zero invariant factor is small compared to the size of the matrix. This algorithm is now working correctly, though the question of how to test it properly given its Monte Carlo nature is one that will require some thought. Currently whenever I have encountered a matrix for which the output of Saunders and Wan doesn't match the deterministic algorithms' outputs it has turned out to be a code bug rather than a side effect of the algorithm's randomness. I suppose allowing for a reasonable number of failures beyond the expected number in test code would be one approach, but of course there will still be occasions when the number of failures exceeds this and allowing a large number of extra failures could allow real bugs to creep in.</p>
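<p>To make the "allow a few extra failures" idea concrete, here is a small Python sketch. It assumes each test case fails independently with probability at most <code>p</code> (an assumed bound on the algorithm's error probability, not a figure from Saunders and Wan) and picks the smallest failure count whose binomial tail probability is below a chosen false alarm rate. The function names and the numbers in the example are hypothetical.</p>

```python
from math import comb


def binomial_tail(trials, p, k):
    """Return P(X > k) for X ~ Binomial(trials, p)."""
    head = sum(comb(trials, i) * p**i * (1 - p) ** (trials - i)
               for i in range(k + 1))
    return 1 - head


def allowed_failures(trials, p, false_alarm=1e-6):
    """Smallest k such that a correct implementation exceeds k failures
    with probability at most false_alarm."""
    k = 0
    while binomial_tail(trials, p, k) > false_alarm:
        k += 1
    return k


# Hypothetical numbers: 10000 random test matrices, error probability 1e-4.
threshold = allowed_failures(10000, 1e-4)
print(threshold)
```

<p>A test would then fail only if more than <code>threshold</code> mismatches with the deterministic algorithms are seen, which keeps spurious failures rare but, as noted above, still leaves room for a bug that fails very rarely to slip through.</p>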
<p>For the next couple of days I'm going to work a little more on Smith forms, hopefully implementing Storjohann's banded method for finding Smith forms of upper triangular matrices, this could be coupled with a HNF algorithm to give another (deterministic!) algorithm for the Smith form of a general matrix. I also need to clean up the Saunders and Wan code a bit as there are still a number of inefficiencies. I have not got a valence method included in this algorithm as this would require implementing a few other things (such as minimal polynomials), but the option is certainly there and it would easily slot in to the current code.</p>
Mon, 28 Jul 2014 14:59:25 +0100
https://alexjbest.github.io/blog/gsoc/2014/07/28/gsoc-with-flint-week-10.html
https://alexjbest.github.io/blog/gsoc/2014/07/28/gsoc-with-flint-week-10.htmlalgorithmsflintgsocmathsGSoCGSoC with Flint - Week 9<p>So this week I've been working on implementing the methods to find the Smith normal form of an integer matrix, rather than the Hermite normal form which I've worked on up until now. I'm mostly working off of the excellent <a href="https://www.cs.drexel.edu/~wan/publications/ACMTOMS05.pdf" target="_blank">paper</a> of Saunders and Wan which describes an "engineered" algorithm to compute the Smith form which is highly practical. By engineered they mean that the algorithm switches between different methods based on practical observations regarding which method is likely to be the most efficient for different types of matrix, however the algorithm is structured in such a way that the asymptotic complexity remains as small as possible.</p>
<p>The algorithm is actually randomised of Monte Carlo type, meaning there is a controllable possibility of error present in the results. It becomes harder to be sure that the exponents of primes occurring in the Smith form are correct as the primes get smaller in the algorithm of Eberly, Giesbrecht and Villard (on which Saunders and Wan's algorithm is based). So the computation is split into two parts, the smooth part concerning only primes less than a given limit and a rough part concerning all others. Currently I have an Eberly, Giesbrecht and Villard based routine working and finding rough parts fairly well and so I am currently working on a modular SNF algorithm to fill in the gap.</p>
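<p>The smooth/rough split itself is simple to illustrate on a single integer, such as one invariant factor. The Python sketch below is only an illustration of the definition with a hypothetical limit of 100; it is not how the algorithm computes the two parts, which it does without knowing the invariant factors in advance.</p>

```python
def primes_below(limit):
    """Return all primes p < limit using a simple sieve."""
    if limit < 2:
        return []
    sieve = [True] * limit
    sieve[0] = sieve[1] = False
    for i in range(2, int(limit**0.5) + 1):
        if sieve[i]:
            for j in range(i * i, limit, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def split_smooth_rough(n, limit=100):
    """Write |n| = smooth * rough, where smooth only involves primes
    below limit and rough only involves primes at least limit."""
    if n == 0:
        raise ValueError("n must be nonzero")
    smooth, rough = 1, abs(n)
    for p in primes_below(limit):
        while rough % p == 0:
            rough //= p
            smooth *= p
    return smooth, rough


# Example invariant factors (made up for illustration).
for d in (2, 6, 12 * 101):
    print(d, split_smooth_rough(d))
```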
<p>Saunders and Wan don't stop with just this however, they also incorporate ideas described by Dumas, Saunders and Villard which use a so called valence based method which I would also like to add at some point. It seems this does require efficient computation of the minimal polynomial however, so I'm not sure how far I will get with this.</p>
Mon, 21 Jul 2014 16:31:44 +0100
https://alexjbest.github.io/blog/gsoc/2014/07/21/gsoc-with-flint-week-9.html
https://alexjbest.github.io/blog/gsoc/2014/07/21/gsoc-with-flint-week-9.htmlalgorithmsflintgsocGSoCGSoC with Flint - Week 8<p>This week I've been working on improving the computation of the nullspace of an integer matrix by implementing a p-adic lifting based method. The algorithm is also described in Stein's book on computing with modular forms and is closely related to Dixon's p-adic method for linear system solving. This is pretty much finished now (modulo some weird memory leaks and determining the best values for parameters such as p-adic precision and range of primes used) and does appear to be asymptotically faster than the multimodular algorithm I worked on before, though experiments suggest that currently the matrix needs to be fairly large before the p-adic algorithm becomes better.</p>
<p>I was originally thinking of implementing Pauderis and Storjohann's algorithm if I had time this week, but have spent much of the past week looking at efficient nullspace computation (and being plagued a little by unexpected power cuts!) so haven't got much further than a skeleton implementation. Maybe I can flesh this out at some point but for the next week I'm going to move on to algorithms for the Smith normal form, another important normal form for matrices over the integers, where both column and row operations are allowed.</p>
<p>Hopefully building on what I've done so far I should be able to get some fast Smith normal form algorithms implemented shortly, indeed one simple approach is to repeatedly transpose and compute Hermite normal forms until the resulting matrix is diagonal, this won't necessarily be the Smith form but it can then be computed with relative ease. Something more sophisticated than this will undoubtedly be best, but many SNF algorithms involve computing the HNF at some point and working from there so the current HNF algorithms provide a good basis for those computations.</p>
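<p>The simple approach just described can be sketched in a few lines of Python. This is a naive illustration for small square nonsingular matrices, using a textbook row-operation HNF rather than any of the fast methods discussed in these posts; the function names are mine and do not correspond to anything in Flint. Once a diagonal matrix is reached, the diagonal entries are fixed up using the fact that diag(a, b) is equivalent to diag(gcd(a, b), lcm(a, b)).</p>

```python
from math import gcd


def hermite_normal_form(rows):
    """Row-style Hermite normal form computed with integer row operations."""
    a = [list(r) for r in rows]
    m, n = len(a), len(a[0])
    pivot_row = 0
    for col in range(n):
        if pivot_row == m:
            break
        # Euclid's algorithm on the column entries below the pivot row.
        for r in range(pivot_row + 1, m):
            while a[r][col] != 0:
                q = a[pivot_row][col] // a[r][col]
                a[pivot_row] = [x - q * y for x, y in zip(a[pivot_row], a[r])]
                a[pivot_row], a[r] = a[r], a[pivot_row]
        if a[pivot_row][col] == 0:
            continue
        if a[pivot_row][col] < 0:
            a[pivot_row] = [-x for x in a[pivot_row]]
        # Reduce the entries above the pivot modulo the pivot.
        for r in range(pivot_row):
            q = a[r][col] // a[pivot_row][col]
            a[r] = [x - q * y for x, y in zip(a[r], a[pivot_row])]
        pivot_row += 1
    return a


def is_diagonal(a):
    return all(a[i][j] == 0
               for i in range(len(a)) for j in range(len(a[0])) if i != j)


def smith_invariants(rows, max_rounds=1000):
    """Invariant factors via repeated HNF and transposition."""
    a = [list(r) for r in rows]
    for _ in range(max_rounds):
        a = hermite_normal_form(a)
        if is_diagonal(a):
            break
        a = [list(col) for col in zip(*a)]  # transpose
    else:
        raise RuntimeError("did not reach a diagonal matrix")
    diag = [abs(a[i][i]) for i in range(min(len(a), len(a[0])))]
    # Enforce the divisibility chain d_1 | d_2 | ... pairwise.
    for i in range(len(diag)):
        for j in range(i + 1, len(diag)):
            g = gcd(diag[i], diag[j])
            if g:
                diag[i], diag[j] = g, diag[i] * diag[j] // g
    return diag


print(smith_invariants([[2, 1], [0, 2]]))
print(smith_invariants([[4, 0], [0, 6]]))
```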
Tue, 15 Jul 2014 02:47:04 +0100
https://alexjbest.github.io/blog/gsoc/2014/07/15/gsoc-with-flint-week-8.html
https://alexjbest.github.io/blog/gsoc/2014/07/15/gsoc-with-flint-week-8.htmlalgorithmsflintgsocmathsGSoCGSoC with Flint - Week 7<p>This week I've been working on improving the algorithm of Pernet-Stein as much as possible. After introducing as many of the ideas given in the original paper as I could I found the main bottleneck to actually be the computation of the nullspace of a certain submatrix of the matrix given, this is needed in order to efficiently solve a linear system which (likely) has a nasty final row. If we know a basis for a nullspace of the first n-1 rows of the system we can replace the final row with a random nice (having small entries) row and then find a solution to the original system by adding on a suitable multiple of the nullspace basis vector (the nullspace should be 1 dimensional for random input).<br />
Flint uses the reduced row echelon form of a matrix to compute nullspaces (the nullspace of a matrix in this form can be easily read off and the transformation does not alter it) and so a natural place to improve nullspace computation is to improve the row echelon form computation. We can use a multimodular approach for this problem (this is described in Stein's <a href="http://wstein.org/books/modform/stein-modform.pdf" target="_blank">book</a> on computing with modular forms) and I've achieved some nice speed-ups with this method in the past couple of days. For example the multimodular method is around 200x faster for 125x125 matrices with 32 bit entries. While this has made Hermite form computations a lot faster (nullspace computation is over 90% of the work for large matrices) I still want to try and see if this can be improved upon further, after all, in this case we don't need the whole row echelon form just a vector in the kernel and the knowledge that the nullspace is 1-dimensional. So I plan to work on this further in the next couple of days, and depending on how I feel about this approach I will either spend the rest of this week making improvements to Pernet-Stein or possibly work on implementing the algorithm of Pauderis and Storjohann.</p>
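<p>The trick of swapping out the nasty final row can be sketched as follows. This Python sketch uses exact rational arithmetic from the standard library in place of the p-adic and modular solvers used in practice, so it shows only the algebra and none of the speed benefit; the function names and the example matrix are made up for illustration, and the matrix is assumed to be square and nonsingular with a "nice" replacement row that keeps it nonsingular.</p>

```python
from fractions import Fraction


def rref(matrix):
    """Return (reduced row echelon form, pivot columns) over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    pivots, r = [], 0
    for c in range(len(a[0])):
        if r == len(a):
            break
        pivot = next((i for i in range(r, len(a)) if a[i][c] != 0), None)
        if pivot is None:
            continue
        a[r], a[pivot] = a[pivot], a[r]
        lead = a[r][c]
        a[r] = [x / lead for x in a[r]]
        for i in range(len(a)):
            if i != r and a[i][c] != 0:
                factor = a[i][c]
                a[i] = [x - factor * y for x, y in zip(a[i], a[r])]
        pivots.append(c)
        r += 1
    return a, pivots


def solve(matrix, rhs):
    """Solve a square nonsingular system exactly."""
    augmented = [list(row) + [value] for row, value in zip(matrix, rhs)]
    reduced, pivots = rref(augmented)
    if pivots != list(range(len(matrix))):
        raise ValueError("matrix is singular")
    return [row[-1] for row in reduced]


def kernel_vector(matrix):
    """Return a spanning vector of a 1-dimensional nullspace."""
    reduced, pivots = rref(matrix)
    n = len(matrix[0])
    free = [c for c in range(n) if c not in pivots]
    if len(free) != 1:
        raise ValueError("nullspace is not 1-dimensional")
    k = [Fraction(0)] * n
    k[free[0]] = Fraction(1)
    for row, p in zip(reduced, pivots):
        k[p] = -row[free[0]]
    return k


def solve_with_nice_last_row(matrix, rhs, nice_row):
    """Solve matrix * x = rhs by solving a system with a nicer last row and
    correcting with a multiple of the nullspace vector of the other rows."""
    k = kernel_vector(matrix[:-1])
    y = solve(matrix[:-1] + [nice_row], rhs[:-1] + [0])
    last = matrix[-1]
    residual = rhs[-1] - sum(a * v for a, v in zip(last, y))
    t = residual / sum(a * v for a, v in zip(last, k))
    return [yi + t * ki for yi, ki in zip(y, k)]


A = [[2, 1, 0], [1, 3, 1], [123456789, -987654321, 555555555]]
b = [1, 2, 3]
x = solve_with_nice_last_row(A, b, nice_row=[1, 0, 1])
print(x)
```

<p>Adding a multiple of the nullspace vector does not disturb the first n-1 equations, so only the single scalar <code>t</code> has to be chosen to satisfy the original final row.</p>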
Mon, 07 Jul 2014 01:20:57 +0100
https://alexjbest.github.io/blog/gsoc/2014/07/07/gsoc-with-flint-week-7.html
https://alexjbest.github.io/blog/gsoc/2014/07/07/gsoc-with-flint-week-7.htmlalgorithmsflintgsocmathsGSoC