Modular Forms of Level 1

In this chapter we study in detail the structure of level 1 modular forms, i.e., modular forms on \SL_2(\Z)=\Gamma_0(1)=\Gamma_1(1). We assume some complex analysis (e.g., the residue theorem), linear algebra, and that the reader has read Chapter Modular Forms.

Examples of Modular Forms of Level 1

In this section we will finally see some examples of modular forms of level 1! We first introduce the Eisenstein series and then define \Delta, which is a cusp form of weight 12. In Section Structure Theorem for Level 1 Modular Forms we prove the structure theorem, which says that all modular forms of level 1 are polynomials in Eisenstein series.

For an even integer k\geq 4, the nonnormalized weight k Eisenstein series is the function on the extended upper half plane \h^*=\h\cup\P^1(\Q) given by

(1)G_k(z) = \sum_{m,n\in\Z}^{*} \frac{1}{(mz+n)^k}.

The star on top of the sum symbol means that for each z the sum is over all m,n\in\Z such that mz+n\neq 0.

Proposition 2.1

The function G_k(z) is a modular form of weight k, i.e., G_k \in M_k(\SL_2(\Z)).

Proof

See [Ser73, Section VII.2.3] for a proof that G_k(z) defines a holomorphic function on \h^*. To see that G_k is modular, observe that

G_k(z+1) = \sum^* \frac{1}{(m(z+1)+n)^k} = \sum^* \frac{1}{(mz+(n+m))^k} =
\sum^* \frac{1}{(mz+n)^k},

where for the last equality we use that the map (m,n+m) \mapsto (m, n) on \Z\times \Z is invertible. Also,

G_k(-1/z) &= \sum^* \frac{1}{(-m/z+n)^k} \\
&= \sum^* \frac{z^k}{(-m+nz)^k}\\
&= z^k \sum^* \frac{1}{(mz+n)^k} = z^k G_k(z),

where we use that (n,-m) \mapsto (m,n) is invertible.

Proposition 2.2

G_k(\infty) = 2\zeta(k), where \zeta is the Riemann zeta function.

Proof

As z\to \infty (along the imaginary axis) in (1), the terms that involve z with m\neq 0 go to 0. Thus

G_k(\infty) = \sum^*_{n\in\Z} \frac{1}{n^k}.

This sum is twice \zeta(k) = \sum_{n\geq 1} \frac{1}{n^k}, as claimed.

The Cusp Form \Delta

Suppose E=\C/\Lambda is an elliptic curve over \C, viewed as a quotient of \C by a lattice \Lambda=\Z\omega_1+\Z\omega_2, with \omega_1/\omega_2\in\h (see [DS05, Section 1.4]). The Weierstrass \wp-function of the lattice \Lambda is

\wp = \wp_{\Lambda}(u) = \frac{1}{u^2} + \sum_{k=4,6,8,\ldots} (k-1)
G_k(\omega_1/\omega_2) u^{k-2},

where the sum is over even integers k\geq 4. It satisfies the differential equation

(\wp')^2 = 4\wp^3 - 60 G_4(\omega_1/\omega_2) \wp - 140 G_6(\omega_1/\omega_2).

If we set x=\wp and y=\wp', the above is an (affine) equation of the form y^2 = ax^3 + bx + c for an elliptic curve that is complex analytically isomorphic to \C/\Lambda (see [Ahl78, pg. 277] for why the cubic has distinct roots).

The discriminant of the cubic

4x^3 - 60G_4(\omega_1/\omega_2) x - 140 G_6(\omega_1/\omega_2)

is 16D(\omega_1/\omega_2), where

D(z) = (60 G_4(z))^3 - 27(140 G_6(z))^2.

Since D(z) is the difference of two modular forms of weight 12 it has weight 12. Morever,

D(\infty) &= \left(60 G_4(\infty)\right)^3 - 27\left(140 G_6(\infty)\right)^2\\
          &= \left(\frac{60}{3^2\cdot 5} \pi^4\right)^3 - 27\left(\frac{140\cdot 2}{3^3\cdot 5\cdot 7} \pi^6\right)^2\\
          &= 0,

so D is a cusp form of weight 12.

Let

\Delta = \frac{D}{(2\pi)^{12}}.

Lemma 2.3

If z\in\h, then \Delta(z)\neq 0.

Proof

Let \omega_1=z and \omega_2=1. Since E=\C/(\Z\omega_1 + \Z\omega_2) is an elliptic curve, it has nonzero discriminant \Delta(z) = \Delta(\omega_1/\omega_2)\neq 0.

Proposition 2.4

We have \Delta = q\cdot \prod_{n=1}^{\infty}(1-q^n)^{24}.

Proof

See [Ser73, Thm. 6, pg. 95].

Remark 2.5

Sage computes the q-expansion of \Delta efficiently to high precision using the command delta_qexp:

sage: delta_qexp(6)
q - 24*q^2 + 252*q^3 - 1472*q^4 + 4830*q^5 + O(q^6)

Fourier Expansions of Eisenstein Series

Recall from (?) that elements f of M_k(\SL_2(\Z)) can be expressed as formal power series in terms of q(z)=e^{2\pi i z} and that this expansion is called the Fourier expansion of f. The following proposition gives the Fourier expansion of the Eisenstein series G_k(z).

Definition 2.6

For any integer t\geq 0 and any positive integer n, the sigma function

G_k(z) = 2\zeta(k) + 2\cdot \frac{(2\pi i)^{k}}{(k-1)!}\cdot
\sum_{n=1}^{\infty} \sigma_{k-1}(n) q^n.

is the sum of the t^{th} powers of the positive divisors of n. Also, let d(n) = \sigma_0(n), which is the number of divisors of n, and let \sigma(n) = \sigma_1(n). For example, if p is prime, then \sigma_t(p) = 1 + p^t.

Proposition 2.7

For every even integer k\geq 4, we have

G_k(z) = 2\zeta(k) + 2\cdot \frac{(2\pi i)^{k}}{(k-1)!}\cdot
\sum_{n=1}^{\infty} \sigma_{k-1}(n) q^n.

Proof

See [Ser73, Section VII.4], which uses clever manipulations of series, starting with the identity

\pi \cot(\pi z) = \frac{1}{z} + \sum_{m=1}^{\infty}
\left( \frac{1}{z+m} + \frac{1}{z-m}\right).

From a computational point of view, the q-expansion of Proposition 2.7 is unsatisfactory because it involves transcendental numbers. To understand these numbers, we introduce the Bernoulli numbers B_n for n\geq 0 defined by the following equality of formal power series:

(2)\frac{x}{e^x - 1} = \sum_{n=0}^{\infty} B_n \frac{x^n}{n!}.

Expanding the power series, we have

\frac{x}{e^x - 1} =
1 - \frac{x}{2} + \frac{x^2}{12} - \frac{x^4}{720} + \frac{x^6}{30240}
- \frac{x^8}{1209600} + \cdots.

As this expansion suggests, the Bernoulli numbers B_n with n>1 odd are 0 (see Exercise 2.2). Expanding the series further, we obtain the following table:

\ds B_{0}=1,\quad B_{1}=-\frac{1}{2},\quad B_{2}=\frac{1}{6},\quad
B_{4}=-\frac{1}{30},\quad B_{6}=\frac{1}{42},\quad
B_{8}=-\frac{1}{30},\quad

\ds
B_{10}=\frac{5}{66},\quad
B_{12}=-\frac{691}{2730},\quad
B_{14}=\frac{7}{6},\quad
B_{16}=-\frac{3617}{510},\quad
B_{18}=\frac{43867}{798},\quad

\ds{}\!\!B_{20}=-\frac{174611}{330},\quad
B_{22}=\frac{854513}{138},\quad
B_{24}=-\frac{236364091}{2730},\quad
B_{26}=\frac{8553103}{6}.

See Section Fast Computation of Bernoulli Numbers for a discussion of fast (analytic) methods for computing Bernoulli numbers.

We compute some Bernoulli numbers in Sage:

sage: bernoulli(12)
-691/2730
sage: bernoulli(50)
495057205241079648212477525/66
sage: len(str(bernoulli(10000)))
27706

A key fact is that Bernoulli numbers are rational numbers and they are connected to values of \zeta at positive even integers.

Proposition 2.8

If k\geq 2 is an even integer, then

\zeta(k) = -\frac{(2\pi i)^{k}}{2\cdot k!}\cdot B_k.

Proof

This is proved by manipulating a series expansion of z \cot(z) (see [Ser73, Section VII.4]).

Definition 2.9

The normalized Eisenstein series of even weight k\geq 4 is

E_k = \frac{(k-1)!}{2\cdot (2\pi i)^k}\cdot G_k.

Combining Proposition 2.7 and Proposition 2.8, we see that

(3)E_k = -\frac{B_k}{2k} + q + \sum_{n=2}^{\infty} \sigma_{k-1}(n)
q^n.

Warning

Our series E_k is normalized so that the coefficient of q is 1, but often in the literature E_k is normalized so that the constant coefficient is 1. We use the normalization with the coefficient of q equal to 1, because then the eigenvalue of the n^{th} Hecke operator (see Section Hecke Operators) is the coefficient of q^n. Our normalization is also convenient when considering congruences between cusp forms and Eisenstein series.

Structure Theorem for Level 1 Modular Forms

In this section we describe a structure theorem for modular forms of level 1. If f is a nonzero meromorphic function on \h and w\in\h, let \ord_w(f) be the largest integer n such that f(z)/(w-z)^n is holomorphic at w. If f = \sum_{n=m}^{\infty} a_n q^n with a_m\neq 0, we set \ord_{\infty}(f)=m. We will use the following theorem to give a presentation for the vector space of modular forms of weight k; this presentation yields an algorithm to compute this space.

Let M_k=M_k(\SL_2(\Z)) denote the complex vector space of modular forms of weight k for \SL_2(\Z). The standard fundamental domain \cF for \SL_2(\Z) is the set of z\in \h with |z|\geq 1 and |\Re(z)|\leq 1/2. Let \rho = e^{2\pi i /3}.

Theorem 2.11

Let k be any integer and suppose f \in M_k(\SL_2(\Z)) is nonzero. Then

\ord_{\infty}(f) + \frac{1}{2}\ord_i(f) + \frac{1}{3} \ord_{\rho}(f)
+ \sum_{w\in \cF}^* \ord_w(f) = \frac{k}{12},

where \ds \sum^*_{w \in \cF} is the sum over elements of \cF other than i and \rho!.

Proof

The proof in [Ser73, Section VII.3] uses the residue theorem.

Let S_k=S_k(\SL_2(\Z))index{S_k} denote the subspace of weight k cusp forms for \SL_2(\Z). We have an exact sequence

0 \to S_k \to M_k \xrightarrow{\iota_\infty} \C

that sends f\in M_k to f(\infty). When k\geq 4 is even, the space M_k contains the Eisenstein series G_k, and G_k(\infty)=2\zeta(k)\neq 0, so the map M_k\to \C is surjective. This proves the following lemma.

Lemma 2.12

If k\geq 4 is even, then M_k = S_k \oplus \C G_k and the following sequence is exact:

0 \to S_k \to M_k \xrightarrow{\iota_\infty} \C  \to 0.

Proposition 2.13

For k<0 and k=2, we have M_k=0.

Proof

Suppose f\in M_k is nonzero yet k=2 or k<0. By Theorem 2.11,

\ord_{\infty}(f) + \frac{1}{2}\ord_i(f) + \frac{1}{3} \ord_{\rho}(f)
 + \sum_{w\in D}^* \ord_w(f) = \frac{k}{12}\leq \frac{1}{6}.

This is not possible because each quantity on the left is nonnegative so whatever the sum is, it is too big (or 0, in which case k=0).

Theorem 2.14

Multiplication by \Delta defines an isomorphism M_{k-12}\to S_k.

Proof

By Lemma 2.3, \Delta is not identically 0, so because \Delta is holomorphic, multiplication by \Delta defines an injective map M_{k-12}\hra S_k. To see that this map is surjective, we show that if f\in S_k, then f/\Delta \in M_{k-12}. Since \Delta has weight 12 and \ord_\infty(\Delta)\geq 1, Theorem 2.11 implies that \Delta has a simple zero at \infty and does not vanish on \h. Thus if f\in S_k and if we let g=f/\Delta, then g is holomorphic and satisfies the appropriate transformation formula, so g \in M_{k-12}.

Corollary 2.15

For k=0,4,6,8,10,14, the space M_k has dimension 1, with basis 1, G_4, G_6, G_8, G_{10}, and G_{14}, respectively, and S_k=0.

Proof

Combining Proposition 2.13 with Theorem 2.14, we see that the spaces M_k for k\leq 10 cannot have dimension greater than 1, since otherwise M_{k'}\neq 0 for some k'<0. Also M_{14} has dimension at most 1, since M _2 has dimension 0. Each of the indicated spaces of weight \geq 4 contains the indicated Eisenstein series and so has dimension 1, as claimed.

Corollary 2.16

\dim M_k = \begin{cases}
   0 & \text{if }k\text{ is odd or negative}, \\
   \lfloor{} k/12 \rfloor{} & \text{if } k\con 2\pmod{12},\\
   \lfloor{} k/12 \rfloor{}+1 & \text{if } k\not\con 2\pmod{12}.
\end{cases}

Here \lfloor{}x \rfloor{} is the biggest integer \leq x.

Proof

As we have already seen above, the formula is true when k\leq 12. By Theorem 2.14, the dimension increases by 1 when k is replaced by k+12.

Theorem 2.17

The space M_k has as basis the modular forms G_4^a G_6^b, where a,b run over all pairs of nonnegative integers such that 4a+6b=k.

Proof

Fix an even integer k. We first prove by induction that the modular forms G_4^a G_6^b generate M_k; the cases k\leq 10 and k=14 follow from the above arguments (e.g., when k=0, we have a=b=0 and basis 1). Choose some pair of nonnegative integers a,b such that 4a+6b=k. The form g=G_4^a G_6^b is not a cusp form, since it is nonzero at \infty. Now suppose f\in M_k is arbitrary. Since g(\infty)\neq 0, there exists \alpha\in\C such that f - \alpha g \in S_k. Then by Theorem 2.14, there is h \in M_{k-12} such that f
- \alpha g = \Delta \cdot h. By induction, h is a polynomial in G_4 and G_6 of the required type, and so is \Delta, so f is as well. Thus

\{G_4^aG_6^b\ |\ a\geq 0,\ b\geq 0,\ 4a+6b=k\}

spans M_k.

Suppose there is a nontrivial linear relation between the G_4^a
G_6^b for a given k. By multiplying the linear relation by a suitable power of G_4 and G_6, we may assume that we have such a nontrivial relation with k\con 0\pmod{12}. Now divide the linear relation by the weight k form G_6^{k/6} to see that G_4^3/G_6^2 satisfies a polynomial with coefficients in \C (see Exercise 2.4). Hence G_4^3/G_6^2 is a root of a polynomial, hence a constant, which is a contradiction since the q-expansion of G_4^3/G_6^2 is not constant.

Algorithm 2.18

Given integers n and k, this algorithm computes a basis of q-expansions for the complex vector space M_k mod q^n. The q-expansions output by this algorithm have coefficients in \Q.

  1. [Simple Case] If k=0, output the basis with just 1 in it and terminate; otherwise if k<4 or k is odd, output the empty basis and terminate.
  2. [Power Series] Compute E_4 and E_6 mod q^n using the formula from (3) and Section Fast Computation of Bernoulli Numbers.
  3. [Initialize] Set b\set 0.
  4. [Enumerate Basis] For each integer b between 0 and \lfloor k/6\rfloor, compute a=(k-6b)/4. If a is an integer, compute and output the basis element E_4^a E_6^b \mod{q^n}. When computing E_4^a, find E_4^m\pmod{q^n} for each m\leq a, and save these intermediate powers, so they can be reused later, and likewise for powers of E_6.

Proof

This is simply a translation of Theorem 2.17 into an algorithm, since E_k is a nonzero scalar multiple of G_k. That the q-expansions have coefficients in \Q follows from (3).

Example 2.19

We compute a basis for M_{24}, which is the space with smallest weight whose dimension is greater than 1. It has as basis E_4^6, E_4^3 E_6^2, and E_6^4, whose explicit expansions are

E_4^6 &= \frac{1}{191102976000000} +
    \frac{1}{132710400000}q + \frac{203}{44236800000}q^2 + \cdots,\\
E_4^3 E_6^2 &=
     \frac{1}{3511517184000} - \frac{1}{12192768000}q - \frac{377}{4064256000}q^2
     + \cdots, \\
E_6^4 &= \frac{1}{64524128256} - \frac{1}{32006016}q +
         \frac{241}{10668672}q^2 +\cdots.

We compute this basis in Sage as follows:

sage: E4 = eisenstein_series_qexp(4, 3)
sage: E6 = eisenstein_series_qexp(6, 3)
sage: E4^6
1/191102976000000 + 1/132710400000*q
                  + 203/44236800000*q^2 + O(q^3)
sage: E4^3*E6^2
1/3511517184000 - 1/12192768000*q
                - 377/4064256000*q^2 + O(q^3)
sage: E6^4
1/64524128256 - 1/32006016*q + 241/10668672*q^2 + O(q^3)

In Section The Miller Basis, we will discuss the reduced echelon form basis for M_k.

The Miller Basis

Lemma 2.20

The space S_k has a basis f_1,\ldots, f_{d} such that if a_i(f_j) is the i^{th} coefficient of f_j, then a_i(f_j) = \delta_{i,j} for i=1,\ldots, {d}. Moreover the f_j all lie in \Z[[q]]. We call this basis the Miller basis for S_k.

This is a straightforward construction involving E_4, E_6 and \Delta. The following proof very closely follows [Lan95, Ch. X,Thm. 4.4], which in turn follows the first lemma of V. Miller’s thesis.

Proof

Let d=\dim S_k. Since B_4 = -1/30 and B_6=1/42, we note that

F_4 = -\frac{8}{B_4} \cdot E_4 = 1 + 240q + 2160q^2 + 6720q^3 + 17520q^4 + \cdots

and

F_6 = -\frac{12}{B_6}\cdot E_6 = 1 - 504q - 16632q^2 - 122976q^3 - 532728q^4
+\cdots

have q-expansions in \Z[[q]] with leading coefficient 1. Choose integers a,b\geq 0 such that

4a+6b\leq 14 \qquad\text{and}\qquad 4a + 6b \con k \pmod{12},

with a=b=0 when k\con 0\pmod{12}, and let

g_j = \Delta^j F_6^{2(d-j)+b} F_4^a
   = \left(\frac{\Delta}{F_6^2}\right)^j F_6^{2d + b} F_4^a
,\qquad\qquad \text{for } j=1,\ldots,d.

Then it is elementary to check that g_j has weight k

a_j(g_j) = 1 \qquad\text{and}\qquad a_i(g_j)=0\qquad\text{when}\qquad i<j.

Hence the g_j are linearly independent over \C, so form a basis for S_k. Since F_4, F_6, and \Delta are all in \Z[[q]], so are the g_j. The f_i may then be constructed from the g_j by Gauss elimination. The coefficients of the resulting power series lie in \Z because each time we clear a column we use the power series g_j whose leading coefficient is 1 (so no denominators are introduced).

Remark 2.21

The basis coming from Miller’s lemma is “canonical”“, since it is just the reduced row echelon form of any basis. Also the set of all integral linear combinations of the elements of the Miller basis are precisely the modular forms of level 1 with integral q-expansion.

We extend the Miller basis to all M_k by taking a multiple of G_k with constant term 1 and subtracting off the f_i from the Miller basis so that the coefficients of q, q^2, \ldots q^d of the resulting expansion are 0. We call the extra basis element f_0.

Example 2.22

If k=24, then d=2. Choose a=b=0, since k\con 0\pmod{12}. Then

g_1 = \Delta F_6^2 =
q - 1032q^2 + 245196q^3 + 10965568q^4 + 60177390q^5 - \cdots

and

g_2 = \Delta^2 = q^2 - 48q^3 + 1080q^4 - 15040q^5 + \cdots.

We let f_2=g_2 and

f_1 = g_1 + 1032 g_2
= q + 195660q^3 + 12080128q^4 + 44656110q^5 - \cdots.

Example 2.23

When k=36, the Miller basis including f_0 is

f_0 &= 1 + \quad\,\,\, 6218175600q^4 + 15281788354560q^5 + \cdots,\\
f_1 &= \,\,\,\,\,\,  q + \quad\,\, 57093088q^4 + \,\,\,\,\,\,\,\,\,\,37927345230q^5 +\cdots,\\

f_2 &=  \qquad q^2 +  \,\,\,\,\,\,194184q^4 + \quad\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, 7442432q^5 + \cdots,\\

f_3 &=  \qquad\quad\,\, q^3 -\,\,\,\,\,\,\,\,\,\, 72q^4 +\qquad\quad\,\,\,\,\,\,\,\,\,\,\,\,  2484q^5 + \cdots.

Example 2.24

The Sage command victor_miller_basis computes the Miller basis to any desired precision given k.

sage: victor_miller_basis(28,5)
[
1 + 15590400*q^3 + 36957286800*q^4 + O(q^5),
q + 151740*q^3 + 61032448*q^4 + O(q^5),
q^2 + 192*q^3 - 8280*q^4 + O(q^5)
]

Remark 2.25

To write f \in M_k as a polynomial in E_4 and E_6, it is wasteful to compute the Miller basis. Instead, use the upper triangular (but not echelon!) basis \Delta^j
F_6^{2(d-j)+a} F_4^b, and match coefficients from q^0 to q^d.

Hecke Operators

In this section we define Hecke operators on level 1 modular forms and derive their basic properties. We will not give proofs of the analogous properties for Hecke operators on higher level modular forms, since the proofs are clearest in the level 1 case, and the general case is similar (see, e.g., [Lan95]).

For any positive integer n, let

X_n = \left\{\mtwo{a}{b}{0}{d} \in \Mat_2(\Z) \,\,:\,\,
a\geq 1,\,\, ad=n, \text{ and } 0\leq b <d\right\}.

Note that the set X_n is in bijection with the set of subgroups of \Z^2 of index n, where \abcd{a}{b}{c}{d} corresponds to L=\Z\cdot (a,b) + \Z\cdot (0,d), as one can see using Hermite normal form, which is the analogue over \Z of echelon form (see Exercise 7.5).

Recall from (?) that if \gamma=\abcd{a}{b}{c}{d}\in\GL_2(\Q), then

f^{[\gamma]_k} = \det(\gamma)^{k-1} (cz+d)^{-k} f(\gamma(z)).

Definition 2.26

The n^{th} Hecke operator T_{n,k} of weight k is the operator on the set of functions on \h defined by

T_{n,k}(f) = \sum_{\gamma \in X_n}
f^{[\gamma]_k}.

Remark 2.27

It would make more sense to write T_{n,k} on the right, e.g., f|T_{n,k}, since T_{n,k} is defined using a right group action. However, if n,m are integers, then the action of T_{n,k} and T_{m,k} on weakly modular functions commutes (by Proposition 2.29 below), so it makes no difference whether we view the Hecke operators of given weight k as acting on the right or left.

Proposition 2.28

If f is a weakly modular function of weight k, then so is T_{n,k}(f); if f is a modular function, then so is T_{n,k}(f).

Proof

Suppose \gamma\in\SL_2(\Z). Since \gamma induces an automorphism of \Z^2, X_n \cdot \gamma = \{ \delta \gamma : \delta\in X_n\} is also in bijection with the subgroups of \Z^2 of index n. For each element \delta\gamma \in X_n\cdot \gamma, there is \sigma\in\SL_2(\Z) such that \sigma\delta\gamma\in X_n (the element \sigma transforms \delta\gamma to Hermite normal form), and the set of elements \sigma\delta\gamma is thus equal to X_n. Thus

T_{n,k}(f) = \sum_{\sigma\delta\gamma\in X_n} f^{[\sigma\delta\gamma]_k}
           = \sum_{\delta\in X_n} f^{[\delta\gamma]_k}
           = T_{n,k}(f)^{[\gamma]_k}.

A finite sum of meromorphic function is meromorphic, so T_{n,k}(f) is weakly modular. If f is holomorphic on \h, then each f^{[\delta]_k} is holomorphic on \h for \delta \in X_n. A finite sum of holomorphic functions is holomorphic, so T_{n,k}(f) is holomorphic.

We will frequently drop k from the notation in T_{n,k}, since the weight k is implicit in the modular function to which we apply the Hecke operator. Henceforth we make the convention that if we write T_{n}(f) and if f is modular, then we mean T_{n,k}(f), where k is the weight of f.

Proposition 2.29

On weight k modular functions we have

(4)T_{mn} = T_{m} T_{n} \qquad\qquad\qquad\qquad\quad
 \text{if }(m,n)=1,

and

(5)T_{p^n} = T_{p^{n-1}} T_p\, -\, p^{k-1} T_{p^{n-2}} \qquad\!\!
\text{if $p$ is prime}.

Proof

Let L be a subgroup of index mn. The quotient \Z^2/L is an abelian group of order mn, and (m,n)=1, so \Z^2/L decomposes uniquely as a direct sum of a subgroup of order m with a subgroup of order n. Thus there exists a unique subgroup L' such that L\subset L'\subset \Z^2, and L' has index m in \Z^2. The subgroup L' corresponds to an element of X_m, and the index n subgroup L\subset L' corresponds to multiplying that element on the right by some uniquely determined element of X_n. We thus have

\SL_2(\Z) \cdot X_{m} \cdot X_n = \SL_2(\Z) \cdot X_{mn}

i.e., the set products of elements in X_m with elements of X_n equal the elements of X_{mn}, up to \SL_2(\Z)-equivalence. Thus for any f, we have T_{mn}(f) = T_{n}(T_m(f)). Applying this formula with m and n swapped yields the equality T_{mn} = T_{m} T_{n}.

We will show that T_{p^n} + p^{k-1} T_{p^{n-2}} = T_p T_{p^{n-1}}. Suppose f is a weight k weakly modular function. Using that f^{[\abcd{p}{0}{0}{p}]_k} = (p^2)^{k-1}p^{-k} f = p^{k-2} f, we have

\sum_{x\in X_{p^n}} f^{[x]_k}\,\, +\,\, p^{k-1}\!\!\! \sum_{x\in
    X_{p^{n-2}}} f^{[x]_k}
 =  \sum_{x\in X_{p^n}} f^{[x]_k} \,\,\,+\, p\!\!\!\! \sum_{x\in pX_{p^{n-2}}}
f^{[x]_k}.

Also

T_p T_{p^{n-1}}(f) = \sum_{y\in X_p} \sum_{x\in X_{p^{n-1}}}
(f^{[x]_k})^{[y]_k} = \sum_{x \in X_{p^{n-1}}\cdot X_p}
f^{[x]_k}.

Thus it suffices to show that X_{p^n} disjoint union p copies of p X_{p^{n-2}} is equal to X_{p^{n-1}}\cdot X_p, where we consider elements with multiplicities and up to left \SL_2(\Z)-equivalence (i.e., the left action of \SL_2(\Z)).

Suppose L is a subgroup of \Z^2 of index p^n, so L corresponds to an element of X_{p^n}. First suppose L is not contained in p\Z^2. Then the image of L in \Z^2/p\Z^2 =
(\Z/p\Z)^2 is of order p, so if L'=p\Z^2 + L, then [\Z^2:L']=p and [L:L']=p^{n-1}, and L' is the only subgroup with this property. Second, suppose that L\subset p\Z^2 if of index p^n and that x\in X_{p^n} corresponds to L. Then every one of the p+1 subgroups L'\subset \Z^2 of index p contains L. Thus there are p+1 chains L\subset L'\subset \Z^2 with [\Z^2:L']=p.

The chains L\subset L'\subset\Z^2 with [\Z^2:L']=p and [\Z^2:L]=p^{n-1} are in bijection with the elements of X_{p^{n-1}}\cdot X_p. On the other hand the union of X_{p^n} with p copies of p X_{p^{n-2}} corresponds to the subgroups L of index p^{n}, but with those that contain p\Z^2 counted p+1 times. The structure of the set of chains L\subset L'\subset\Z^2 that we derived in the previous paragraph gives the result.

Corollary 2.30

The Hecke operator T_{p^n}, for prime p, is a polynomial in T_p with integer coefficients, i.e., T_{p^n}\in\Z[T_p]. If n,m are any integers, then T_n T_m = T_m T_n

Proof

The first statement follows from (5) of Proposition 2.29. It then follows that T_n T_m = T_m T_n when m and n are both powers of a single prime p. Combining this with (4) gives the second statement in general.

Proposition 2.31

Let f = \sum_{n\in\Z} a_n q^n be a modular function of weight k. Then

T_{n}(f) = \sum_{m\in\Z}
\left(\sum_{1\leq d\,\mid\, \gcd(n,m)} d^{k-1} a_{mn/d^2}\right) q^m.

In particular, if n=p is prime, then

T_p(f) = \sum_{m\in \Z} \left(a_{mp} + p^{k-1} a_{m/p}\right) q^m,

where a_{m/p}=0 if m/p\not\in\Z.

Proof

This is proved in [Ser73, Section VII.5.3] by writing out T_n(f) explicitly and using that \sum_{0\leq b<d} e^{2\pi i bm/d} is d if d\mid m and 0 otherwise.

Corollary 2.32

The Hecke operators preserve M_k and S_k.

Remark 2.33

Alternatively, for M_k the above corollary is Proposition 2.28, and for S_k we see from the definitions that if f(\infty)=0, then T_n f also vanishes at \infty.

Example 2.34

Recall from (3) that

E_4 = \frac{1}{240} + q + 9q^2 + 28q^3 + 73q^4 + 126q^5 + 252q^6 + 344q^7
      +\cdots.

Using the formula of Proposition 2.31, we see that

T_2(E_4) = (1/240 + 2^3\cdot (1/240)) + 9q + (73 + 2^3\cdot 1) q^2 + \cdots.

Since M_4 has dimension 1 and since we have proved that T_2 preserves M_4, we know that T_2 acts as a scalar. Thus we know just from the constant coefficient of T_2(E_4) that

T_2(E_4) = 9 E_4.

More generally, for p prime we see by inspection of the constant coefficient of T_p(E_4) that

T_p(E_4) = (1+p^3)E_4.

In fact

T_n(E_k) = \sigma_{k-1}(n) E_k,

for any integer n\geq 1 and even weight k\geq 4.

Example 2.35

By Corollary Corollary 2.32, the Hecke operators T_n also preserve the subspace S_k of M_k. Since S_{12} has dimension 1 (spanned by \Delta), we see that \Delta is an eigenvector for every T_n. Since the coefficient of q in the q-expansion of \Delta is 1, the eigenvalue of T_n on \Delta is the n^{th} coefficient of \Delta. Since T_{nm}=T_n T_m for \gcd(n,m)=1, we have proved the nonobvious fact that the Ramanujan function \tau(n) that gives the n^{th} coefficient of \Delta is a multiplicative function, i.e., if \gcd(n,m)=1, then \tau(nm)=\tau(n)\tau(m).

Remark 2.36

The Hecke operators respect the decomposition M_k = S_k \oplus \C E_k, i.e., for all k the series E_k are eigenvectors for all T_n.

Computing Hecke Operators

This section is about how to compute matrices of Hecke operators on M_k.

Algorithm 2.37

This algorithm computes the matrix of the Hecke operator T_n on the Miller basis for M_k.

  1. [Dimension] Compute d= \dim(M_k)-1 using Corollary 2.16.

  2. [Basis] Using Lemma 2.20, compute the echelon basis f_0,\ldots, f_d for M_k \pmod{q^{dn+1}}.

  3. [Hecke operator] Using Proposition 2.31, compute for each i the image T_n(f_i) \pmod{q^{d+1}} .

  4. [Write in terms of basis] The elements T_n(f_i)
\pmod{q^{d+1}} determine linear combinations of

    f_0,f_1,\ldots, f_d \pmod{q^d}.

    These linear combinations are easy to find once we compute T_n(f_i) \pmod{q^{d+1}}, since our basis of f_i is in echelon form. The linear combinations are just the coefficients of the power series T_n(f_i) up to and including q^d.

  5. [Write down matrix] The matrix of T_n acting from the right relative to the basis f_0, \ldots, f_d is the matrix whose rows are the linear combinations found in the previous step, i.e., whose rows are the coefficients of T_n(f_i).

Proof

By Proposition 2.31, the d^{th} coefficient of T_n(f) involves only a_{dn} and smaller-indexed coefficients of f. We need only compute a modular form f modulo q^{dn+1} in order to compute T_n(f) modulo q^{d+1}. Uniqueness in step (4) follows from Lemma 2.20 above.

Example 2.38

We compute the Hecke operator T_2 on M_{12} using the above algorithm.

  1. [Compute dimension] We have d=2 - 1 = 1.

  2. [Compute basis] Compute up to (but not including) the coefficient of q^{dn+1} = q^{1\cdot 2 + 1} = q^3. As given in the proof of Lemma 2.20, we have

    F_4 = 1 + 240q + 2160q^2 + \cdots \quad\text{and}\quad
F_6 = 1 - 504q - 16632q^2 + \cdots.

    Thus $M_{12}$ has basis

    f_0 = 1 + 196560q^{2} + \cdots
\quad\text{and}\quad
f_1 = q - 24q^{2} + \cdots.

    Sage does the arithmetic in the above calculation as follows:

    sage: R.<q> = QQ[['q']]
    sage: F4 =  240 * eisenstein_series_qexp(4,3)
    sage: F6 = -504 * eisenstein_series_qexp(6,3)
    sage: F4^3
    1 + 720*q + 179280*q^2 + O(q^3)
    sage: Delta = (F4^3 - F6^2)/1728; Delta
    q - 24*q^2 + O(q^3)
    sage: F4^3 - 720*Delta
    1 + 196560*q^2 + O(q^3)
    
  3. [Compute Hecke operator] In each case letting a_n denote the n^{th} coefficient of f_0 or f_1, respectively, we have

    T_2(f_0) &= T_2 (1 + 196560q^{2} + \cdots) \\
         &= (a_0 + 2^{11} a_0) q^0 + (a_2 + 2^{11} a_{1/2})q^1 + \cdots\\
         &= 2049  + 196560q + \cdots,

    and

    T_2(f_1) &= T_2 (q - 24q^{2} + \cdots) \\
         &= (a_0 + 2^{11} a_0) q^0 + (a_2 + 2^{11} a_{1/2})q^1 + \cdots\\
         &= 0 -24 q + \cdots.

    (Note that a_{1/2} = 0.)

  4. [Write in terms of basis] We read off at once that

    T_2(f_0) = 2049 f_0 + 196560 f_1
\quad\text{and}\quad
T_2(f_1) = 0 f_0 + (-24)f_1.

  5. [Write down matrix] Thus the matrix of T_2, acting from the right on the basis f_0, f_1, is

    T_2 = \mtwo{2049}{196560}{0}{-24}.

As a check note that the characteristic polynomial of T_2 is (x-2049)(x+24) and that 2049 = 1 + 2^{11} is the sum of the 11^{th} powers of the divisors of 2.

Example 2.39

The Hecke operator T_2 on M_{36} with respect to the echelon basis is

\left(
\begin{matrix}
\hfill 34359738369&\hfill0&\hfill6218175600&\hfill9026867482214400\\
0&\hfill0&\hfill34416831456&\hfill5681332472832\\
0&\hfill1&\hfill194184&\hfill-197264484\\
0&\hfill0&\hfill-72&\hfill-54528\\
\end{matrix}\right).

It has characteristic polynomial

(x - 34359738369) \cdot
(x^3 - 139656x^2 - 59208339456x - 1467625047588864),

where the cubic factor is irreducible.

The echelon_form command creates the space of modular forms but with basis in echelon form (which is not the default).

sage: M = ModularForms(1,36, prec=6).echelon_form()
sage: M.basis()
[
1 + 6218175600*q^4 + 15281788354560*q^5 + O(q^6),
q + 57093088*q^4 + 37927345230*q^5 + O(q^6),
q^2 + 194184*q^4 + 7442432*q^5 + O(q^6),
q^3 - 72*q^4 + 2484*q^5 + O(q^6)
]

Next we compute the matrix of the Hecke operator T_2.

sage: T2 = M.hecke_matrix(2); T2
[34359738369   0       6218175600 9026867482214400]
[          0   0      34416831456    5681332472832]
[          0   1           194184       -197264484]
[          0   0              -72           -54528]

Finally we compute and factor its characteristic polynomial.

sage: T2.charpoly().factor()
(x - 34359738369) * (x^3 - 139656*x^2 - 59208339456*x - 1467625047588864)

The following is a famous open problem about Hecke operators on modular forms of level 1. It generalizes our above observation that the characteristic polynomial of T_2 on M_k, for k=12, 36, factors as a product of a linear factor and an irreducible factor.

Conjuecture 2.40

The characteristic polynomial of T_2 on S_k is irreducible for any k.

Kevin Buzzard observed that in several specific cases the Galois group of the characteristic polynomial of T_2 is the full symmetric group (see [Buz96]). See also [FJ02] for more evidence for the following conjecture:

Conjuecture 2.41

For all primes p and all even k\geq 2 the characteristic polynomial of T_{p,k} acting on S_k is irreducible.

Fast Computation of Fourier Coefficients

How difficult is it to compute prime-indexed coefficients of

\Delta = \sum_{n=1}^{\infty} \tau(n) q^n
       = q - 24q^2 + 252q^3 - 1472q^4 + 4830q^5 - 6048q^6 + \cdots?

Theorem 2.42

Let p be a prime. There is a probabilistic algorithm to compute \tau(p), for prime p, that has expected running time polynomial in \log(p)

Proof

See [ECdJ+06].

More generally, if f=\sum a_n q^n is an eigenform in some space M_k(\Gamma_1(N)), where k\geq 2, then one expects that there is an algorithm to compute a_p in time polynomial in \log(p). Bas Edixhoven, Jean-Marc Couveignes and Robin de Jong have proved that \tau(p) can be computed in polynomial time; their approach involves sophisticated techniques from arithmetic geometry (e.g., ‘etale cohomology, motives, Arakelov theory). The ideas they use are inspired by the ones introduced by Schoof, Elkies and Atkin for quickly counting points on elliptic curves over finite fields (see [Sch95]).

Edixhoven describes (in an email to the author) the strategy as follows:

  1. We compute the mod \ell Galois representation \rho associated to \Delta. In particular, we produce a polynomial f such that \Q[x]/(f) is the fixed field of \ker(\rho). This is then used to obtain \tau(p) \pmod{\ell} and to do a Schoof-like algorithm for computing \tau(p).
  2. We compute the field of definition of suitable points of order \ell on the modular Jacobian J_1(\ell) to do part (1) (see [DS05, Ch. 6] for the definition of J_1(\ell)).
  3. The method is to approximate the polynomial f in some sense (e.g., over the complex numbers or modulo many small primes r) and to use an estimate from Arakelov theory to determine a precision that will suffice.

Fast Computation of Bernoulli Numbers

This section, which was written jointly with Kevin McGown, is about computing the Bernoulli numbers B_n, for n\geq 0, defined in Section Fourier Expansions of Eisenstein Series by

(6)\frac{x}{e^x - 1} = \sum_{n=0}^{\infty} B_n \frac{x^n}{n!}.

One way to compute B_n is to multiply both sides of (?) by e^x-1 and equate coefficients of x^{n+1} to obtain the recurrence

B_0=1\text{,}
\qquad
B_n=-\frac{1}{n+1}\cdot \sum_{k=0}^{n-1}\binom{n+1}{k}B_{k}.

This recurrence provides a straightforward and easy-to-implement method for calculating B_n if one is interested in computing B_n for all n up to some bound. For example,

B_1 = - \frac{1}{2} \cdot \left( \binom{2}{0} B_0\right) =
-\frac{1}{2}

and

B_2 = -\frac{1}{3} \cdot \left( \binom{3}{0} B_0 + \binom{3}{1} B_1  \right)
    = -\frac{1}{3} \cdot \left( 1 - \frac{3}{2}\right) = \frac{1}{6}.

However, computing B_n via the recurrence is slow; it requires summing over many large terms, it requires storing the numbers B_0,\dots,B_{n-1}, and it takes only limited advantage of asymptotically fast arithmetic algorithms. There is also an inductive procedure to compute Bernoulli numbers that resembles Pascal’s triangle called the Akiyama-Tanigawa algorithm (see [Kan00]).

Another approach to computing B_n is to use Newton iteration and asymptotically fast polynomial arithmetic to approximate 1/(e^x -1). This method yields a very fast algorithm to compute B_0, B_2,\ldots, B_{p-3} modulo p. See [BCS92] for an application of this method modulo a prime p to the verification of Fermat’s last theorem for irregular primes up to one million.

Example 2.43

David Harvey implemented the algorithm of [BCS92] in Sage as the command bernoulli_mod_p:

sage: bernoulli_mod_p(23)
[1, 4, 13, 17, 13, 6, 10, 5, 10, 9, 15]

A third way to compute B_n uses an algorithm based on Proposition 2.8, which we explain below (Algorithm 2.45). This algorithm appears to have been independently invented by several people: by Bernd C. Kellner (see [Kel06]); by Bill Dayl; and by H. Cohen and K. Belabas.

We compute B_n as an exact rational number by approximating \zeta(n) to very high precision using Proposition 2.8, the Euler product

\zeta(s)=\sum_{m=1}^\infty m^{-s} =\prod_{p\text{ prime}} (1-p^{-s})^{-1},

and the following theorem:

Theorem 2.44

For even n\geq 2,

\denom(B_n)=\prod_{p-1 \,\mid\, n} p.

Proof

See [Lan95, Ch. X, Thm. 2.1].

The Number of Digits of B_n

The following is a new quick way to compute the number of digits of the numerator of B_n. For example, using it we can compute the number of digits of B_{10^{50}} in less than a second.

By Theorem 2.44 we have d_n = \denom(B_n) = \prod_{p-1\mid n} p. The number of digits of the numerator is thus

\lceil \log_{10}(d_n \cdot |B_n|) \rceil.

But

\log(|B_n|) &= \log\left(\frac{2\cdot n!}{(2\pi)^n}\,\zeta(n)\right)\\
            &= \log(2) + \log(n!) - n\log(2) - n\log(\pi) +
            \log(\zeta(n)),

and \zeta(n)\sim 1 so \log(\zeta(n))\sim 0. Finally, Stirling’s formula (see [Ahl78, pg. 198–206]) gives a fast way to compute \log(n!) = \log(\Gamma(n+1)):

(7)\log(\Gamma(z)) ``=\text{''}
\frac{\log(2\pi)}{2} + \left(z - \frac{1}{2}\right)\log(z)
          - z + \sum_{m=1}^{\infty}
          \frac{B_{2m}}{2m(2m-1)z^{2m-1}}.

We put quotes around the equality sign because \log(\Gamma(z)) does not converge to its Laurent series. Indeed, note that for any fixed value of z the summands on the right side go to \infty as m\to\infty! Nonetheless, we can use this formula to very efficiently compute \log(\Gamma(z)), since if we truncate the sum, then the error is smaller than the next term in the infinite sum.

Computing B_n Exactly

We return to the problem of computing B_n. Let

K=\frac{2\cdot n!}{(2\pi)^n}

so |B_n|=K\zeta(n). Write

B_n=\frac{a}{d},

with a,d\in\Z, d\geq 1, and \gcd(a,d)=1. It is elementary to show that a=(-1)^{n/2+1}\;|a| for even n\geq 2. Suppose that using the Euler product we approximate \zeta(n) from below by a number z such that

0\leq \zeta(m)-z<\frac{1}{Kd}.

Then 0\leq |B_n| - zK < d^{-1}; hence 0\leq |a| - zKd < 1. It follows that |a|=\lceil zKd\rceil and hence a=(-1)^{n/2+1}\;\lceil zKd\rceil.

It remains to compute z. Consider the following problem: given s>1 and \varepsilon>0, find M\in\Z_+ so that

z=\prod_{p\leq M} (1-p^{-s})^{-1}

satisfies 0\leq\zeta(s)-z<\varepsilon. We always have 0\leq\zeta(s)-z. Also,

\sum_{n\leq M} n^{-s}\leq\prod_{p\leq M} (1-p^{-s})^{-1},

so

\zeta(s)-z \leq
  \sum_{n=M+1}^\infty n^{-s}
\leq
  \int_M^\infty x^{-s}\;dx
=
  \frac{1}{(s-1)M^{s-1}}.

Thus if M>\varepsilon^{-1/(s-1)}, then

\frac{1}{(s-1)M^{s-1}}
\leq
\frac{1}{M^{s-1}}
<
\varepsilon
\,,

so \zeta(s)-z<\varepsilon, as required. For our purposes, we have s=n and \varepsilon=(Kd)^{-1}, so it suffices to take M>(Kd)^{1/(n-1)}.

Algorithm 2.45

Given an integer n\geq 0, this algorithm computes the Bernoulli number B_n as an exact rational number.

  1. [Special cases] If n=0, return 1; if n=1, return -1/2; if n\geq 3 is odd, return 0.
  2. [Factorial factor] Compute \ds K=\frac{2 \cdot n!}{(2\pi)^n} to sufficiently many digits of precision so the ceiling in step (6) is uniquely determined (this precision can be determined using Section The Number of Digits of ).
  3. [Denominator] Compute \ds d=\prod_{p-1 | n} p.
  4. [Bound] Compute \ds M = \left\lceil (K d)^{1/(n-1)} \right\rceil.
  5. [Approximate \zeta(n)] Compute \ds z = \prod_{p\leq M} (1-p^{-n})^{-1}.
  6. [Numerator] Compute \ds a = (-1)^{n/2+1}\,\lceil d K z \rceil.
  7. [Output B_n] Return \ds \frac{a}{d}.

In step (5) use a sieve to compute all primes p\leq M efficiently (which is fast, since M is so small). In step (4) we may replace M by any integer greater than the one specified by the formula, so we do not have to compute (Kd)^{1/(n-1)} to very high precision.

In Section Computing Generalized Bernoulli Numbers Analytically below we will generalize the above algorithm.

Example 2.46

We illustrate Algorithm 2.45 by computing B_{50}. Using 135 binary digits of precision, we compute

K=7500866746076957704747736.71552473164563479.

The divisors of n are 1, 2, 5, 10, 25, 50, so

d = 2\cdot 3\cdot 11=66.

We find M=4 and compute

z = 1.00000000000000088817842109308159029835012.

Finally we compute

dKz = 495057205241079648212477524.999999994425778,

so

B_{50}=\frac{495057205241079648212477525}{66}.

Exercises

Exercise 2.1

Using Proposition 2.8 and the table found here, compute \sum_{n=1}^{\infty}\frac{1}{n^{26}} explicitly.

Exercise 2.2

Prove that if n>1 is odd, then the Bernoulli number B_n is 0.

Exercise 2.3

Use (3) to write down the coefficients of 1, q, q^2, and q^3 of the Eisenstein series E_8.

Exercise 2.4

Suppose k is a positive integer with k\equiv 0 \pmod{12}. Suppose a,b\geq 0 are integers with 4a + 6b = k.

  1. Prove 3 \mid a.
  2. Show that \ds G_4^a \cdot G_6^b \,/\, G_6^{\frac{k}{6}} = \left(G_4^3 / G_6^2 \right)^{\frac{a}{3}}.

Exercise 2.5

Compute the Miller basis for M_{28}(\SL_2(\Z)) with precision O(q^8). Your answer will look like Example 2.23.

Exercise 2.6

Consider the cusp form f = q^2 + 192 q^3 - 8280 q^4 + \cdots in S_{28}(\SL_2(\Z)). Write f as a polynomial in E_4 and E_6 (see Remark 2.25).

Exercise 2.7

Let G_k be the weight k Eisenstein series from equation (1). Let c be the complex number so that the constant coefficient of the q-expansion of g = c \cdot G_k is 1. Is it always the case that the q-expansion of g lies in \Z[[q]]?

Exercise 2.8

Compute the matrix of the Hecke operator T_2 on the Miller basis for M_{32}(\SL_2(\Z)). Then compute its characteristic polynomial and verify it factors as a product of two irreducible polynomials.

What Next? Much of the rest of this book is about methods for computing subspaces of M_k(\Gamma_1(N)) for general N and k. These general methods are more complicated than the methods presented in this chapter, since there are many more modular forms of small weight and it can be difficult to obtain them. Forms of level N>1 have subtle connections with elliptic curves, abelian varieties, and motives. Read on for more!