The Group of Units

Definition 8.1.1 (Unit Group)   The group of units $ U_K$ associated to a number field $ K$ is the group of elements of $ \O_K$ that have an inverse in $ \O_K$.

Theorem 8.1.2 (Dirichlet)   The group $ U_K$ is the product of a finite cyclic group of roots of unity with a free abelian group of rank $ r+s-1$, where $ r$ is the number of real embeddings of $ K$ and $ s$ is the number of complex conjugate pairs of embeddings.

(Note that we will prove a generalization of Theorem 8.1.2 in Section 12.1 below.)
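To make the rank formula in Theorem 8.1.2 concrete, here is a minimal Python sketch that evaluates $ r+s-1$ for a few familiar fields. The function name `unit_rank` and the table of signatures are illustrative choices, not part of the text; the signatures $ (r,s)$ are supplied by hand rather than computed from a defining polynomial.

```python
def unit_rank(r: int, s: int) -> int:
    """Rank of the free part of U_K predicted by Theorem 8.1.2: r + s - 1."""
    if r < 0 or s < 0 or r + 2 * s < 1:
        raise ValueError("need r, s >= 0 and degree r + 2s >= 1")
    return r + s - 1

# Signatures (r, s) entered by hand for a few well-known fields.
examples = {
    "Q": (1, 0),
    "Q(i)": (0, 1),
    "Q(sqrt(2))": (2, 0),
    "Q(2^(1/3))": (1, 1),
}

for name, (r, s) in examples.items():
    print(name, "degree", r + 2 * s, "unit rank", unit_rank(r, s))
```

In particular the rank is $ 0$, so $ U_K$ is finite, exactly when $ K=\mathbf{Q}$ or $ K$ is imaginary quadratic.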

We prove the theorem by defining a map $ \varphi :U_K\to \mathbf{R}^{r+s}$, and showing that the kernel of $ \varphi $ is finite and the image of $ \varphi $ is a lattice in a hyperplane in $ \mathbf{R}^{r+s}$. The trickiest part of the proof is showing that the image of $ \varphi $ spans a hyperplane, and we do this by a clever application of Blichfeldt's Lemma 7.1.5.

[Image: Dirichlet]

Remark 8.1.3   Theorem 8.1.2 is due to Dirichlet who lived 1805-1859. Thomas Hirst described Dirichlet thus:
He is a rather tall, lanky-looking man, with moustache and beard about to turn grey with a somewhat harsh voice and rather deaf. He was unwashed, with his cup of coffee and cigar. One of his failings is forgetting time, he pulls his watch out, finds it past three, and runs out without even finishing the sentence.
Koch wrote that:
... important parts of mathematics were influenced by Dirichlet. His proofs characteristically started with surprisingly simple observations, followed by extremely sharp analysis of the remaining problem.
I think Koch's observation nicely describes the proof we will give of Theorem 8.1.2.

Units have a simple characterization in terms of their norm.

Proposition 8.1.4   An element $ a\in\O_K$ is a unit if and only if $ \Norm _{K/\mathbf{Q}}(a)=\pm 1$.

Proof. Write $ \Norm =\Norm _{K/\mathbf{Q}}$. If $ a$ is a unit, then $ a^{-1}$ is also a unit, and $ 1=\Norm (a)\Norm (a^{-1})$. Since both $ \Norm (a)$ and $ \Norm (a^{-1})$ are integers, it follows that $ \Norm (a)=\pm 1$. Conversely, if $ a\in\O_K$ and $ \Norm (a)=\pm 1$, then the equation $ aa^{-1}=1=\pm \Norm (a)$ implies that $ a^{-1} = \pm \Norm (a)/a$. But $ \Norm (a)$ is the product of the images of $ a$ in $ \mathbf{C}$ by all embeddings of $ K$ into  $ \mathbf{C}$, so $ \Norm (a)/a$ is also a product of images of $ a$ in  $ \mathbf{C}$, hence a product of algebraic integers, hence an algebraic integer. Thus $ a^{-1}\in K\cap \overline{\mathbf{Z}}= \O_K$, which proves that $ a$ is a unit. $ \qedsymbol$
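As an illustration of Proposition 8.1.4, the following sketch tests the norm criterion for elements $ a+b\sqrt{d}$ of $ \mathbf{Z}[\sqrt{d}]$. It assumes $ d$ is a squarefree integer with $ d\equiv 2,3 \pmod 4$, so that $ \mathbf{Z}[\sqrt{d}]$ is the full ring of integers; the helper names `norm`, `inverse`, and `is_unit` are hypothetical.

```python
from fractions import Fraction

def norm(a, b, d):
    """Norm of a + b*sqrt(d) down to Q: (a + b*sqrt(d)) * (a - b*sqrt(d))."""
    return a * a - d * b * b

def inverse(a, b, d):
    """Inverse of a + b*sqrt(d) in Q(sqrt(d)), as coefficients of 1 and sqrt(d)."""
    n = norm(a, b, d)
    if n == 0:
        raise ZeroDivisionError("the element is zero")
    # 1/(a + b*sqrt(d)) = (a - b*sqrt(d)) / Norm(a + b*sqrt(d))
    return Fraction(a, n), Fraction(-b, n)

def is_unit(a, b, d):
    """Norm criterion of Proposition 8.1.4 for a + b*sqrt(d) in Z[sqrt(d)]."""
    return norm(a, b, d) in (1, -1)

# 1 + sqrt(2) has norm -1, so it is a unit with integral inverse -1 + sqrt(2).
print(is_unit(1, 1, 2), inverse(1, 1, 2))
# 3 + sqrt(2) has norm 7, so its inverse 3/7 - (1/7)*sqrt(2) is not integral.
print(is_unit(3, 1, 2), inverse(3, 1, 2))
```

The inverse is computed as the conjugate divided by the norm, which is the quadratic case of the identity $ a^{-1}=\pm\Norm (a)/a$ used in the proof.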

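Let $ r$ be the number of real embeddings and $ s$ the number of complex conjugate pairs of embeddings of $ K$ into $ \mathbf{C}$, so $ n=[K:\mathbf{Q}]=r+2s$. Write $ \sigma_1,\ldots,\sigma_r$ for the real embeddings and $ \sigma_{r+1},\ldots,\sigma_{r+s}$ for one representative of each complex conjugate pair. Define the log embedding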

$\displaystyle \varphi :U_K \to \mathbf{R}^{r+s}$

by

$\displaystyle \varphi (a) = (\log\vert\sigma_1(a)\vert,\ldots, \log\vert\sigma_{r+s}(a)\vert).$

(Here $ \vert z\vert$ is the usual absolute value of $ z=x+iy\in\mathbf{C}$, so $ \vert z\vert=\sqrt{x^2+y^2}$.)

Lemma 8.1.5   The image of $ \varphi $ lies in the hyperplane

$\displaystyle H = \{(x_1,\ldots, x_{r+s})\in\mathbf{R}^{r+s} : x_1+ \cdots + x_r + 2x_{r+1} + \cdots + 2x_{r+s} = 0\}.$ (8.1)

Proof. If $ a\in U_K$, then by Proposition 8.1.4,

$\displaystyle \left(\prod_{i=1}^{r} \vert\sigma_i(a)\vert\right) \cdot \left(\prod_{i=r+1}^{r+s} \vert\sigma_i(a)\vert^2 \right) = \vert\Norm _{K/\mathbf{Q}}(a)\vert = 1.$

Taking logs of both sides proves the lemma. $ \qedsymbol$
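Lemma 8.1.5 can be checked numerically. The sketch below evaluates the left-hand side of (8.1) on the log embedding of two explicit units: $ 1+\sqrt{2}$ in $ \mathbf{Q}(\sqrt{2})$ (where $ r=2$, $ s=0$) and $ \alpha-1$ with $ \alpha=2^{1/3}$ in $ \mathbf{Q}(2^{1/3})$ (where $ r=1$, $ s=1$). The images under the embeddings are entered by hand as floating point numbers, and the function names are illustrative.

```python
import cmath
import math

def log_embedding(images):
    """phi(a) = (log|sigma_1(a)|, ..., log|sigma_{r+s}(a)|), given the images sigma_i(a)."""
    return [math.log(abs(z)) for z in images]

def hyperplane_sum(x, r):
    """x_1 + ... + x_r + 2*x_{r+1} + ... + 2*x_{r+s}, the left-hand side of (8.1)."""
    return sum(x[:r]) + 2 * sum(x[r:])

# Q(sqrt(2)): two real embeddings, sqrt(2) -> +sqrt(2) and sqrt(2) -> -sqrt(2).
quad = log_embedding([1 + math.sqrt(2), 1 - math.sqrt(2)])

# Q(2^(1/3)): one real embedding and one complex pair, represented by alpha -> alpha*omega.
alpha = 2 ** (1 / 3)
omega = cmath.exp(2j * cmath.pi / 3)  # primitive cube root of unity
cubic = log_embedding([alpha - 1, alpha * omega - 1])

print(hyperplane_sum(quad, r=2))   # zero up to rounding error
print(hyperplane_sum(cubic, r=1))  # zero up to rounding error
```

The factor $ 2$ on the complex coordinates is visible in the cubic example: $ (\alpha-1)\cdot\vert\alpha\omega-1\vert^2 = (\alpha-1)(\alpha^2+\alpha+1)=\alpha^3-1=1$.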

Lemma 8.1.6   The kernel of $ \varphi $ is finite.

Proof. We have

$\displaystyle \Ker (\varphi ) \subset \{a\in\O_K : \vert\sigma_i(a)\vert = 1 \text{ for } i=1,\ldots,r+s\}, \quad\text{so}\quad \sigma(\Ker (\varphi )) \subset \sigma(\O_K) \cap X,$

where $ \sigma:K\hookrightarrow \mathbf{R}^n$ is the embedding (8.2) defined below and $ X$ is the bounded subset of $ \mathbf{R}^{n}$ of elements all of whose coordinates have absolute value at most $ 1$. Since $ \sigma(\O_K)$ is a lattice (see Proposition 2.4.5), the intersection $ \sigma(\O_K)\cap X$ is finite. Since $ \sigma$ is injective, $ \Ker (\varphi )$ is finite. $ \qedsymbol$

Lemma 8.1.7   The kernel of $ \varphi $ is a finite cyclic group.

Proof. Lemma 8.1.6 implies that $ \Ker (\varphi )$ is a finite group. It is a general fact that any finite subgroup $ G$ of the multiplicative group $ K^*$ of a field is cyclic. (Proof: If $ n$ is the exponent of $ G$, then every element of $ G$ is a root of the polynomial $ x^n-1$. A polynomial of degree $ n$ over a field has at most $ n$ roots, so $ G$ has order at most $ n$. On the other hand, a finite abelian group contains an element whose order equals the exponent, so $ G$ contains an element of order $ n$; hence $ G$ is cyclic of order $ n$.) $ \qedsymbol$
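The exponent argument can be seen in a small example. For $ K=\mathbf{Q}(i)$ we have $ r+s-1=0$, and the units of $ \mathbf{Z}[i]$ are the four elements of norm $ a^2+b^2=1$, namely $ \pm 1, \pm i$. The sketch below, which represents $ a+bi$ as the pair `(a, b)` and uses hypothetical helper names, computes the order of each element and checks that the exponent equals the group order.

```python
from functools import reduce
from math import gcd

def gmul(x, y):
    """Multiply Gaussian integers x = (a, b) ~ a + b*i and y = (c, d) ~ c + d*i."""
    (a, b), (c, d) = x, y
    return (a * c - b * d, a * d + b * c)

def order(x, bound=100):
    """Multiplicative order of x; gives up after `bound` steps (illustrative cutoff)."""
    y, k = x, 1
    while y != (1, 0):
        y, k = gmul(y, x), k + 1
        if k > bound:
            raise ValueError("element does not appear to have finite order")
    return k

def lcm(a, b):
    return a * b // gcd(a, b)

# The units of Z[i]: 1, i, -1, -i.
G = [(1, 0), (0, 1), (-1, 0), (0, -1)]
orders = {g: order(g) for g in G}
exponent = reduce(lcm, orders.values())

print(orders)
print("exponent:", exponent, "group order:", len(G))
```

Here the exponent and the group order agree, and the elements of order $ 4$ (namely $ \pm i$) are generators.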

Given Lemma 8.1.7, to prove Theorem 8.1.2 it suffices to prove that $ \mathrm{Im}(\varphi )$ is a lattice in the hyperplane $ H$ of (8.1), which we view as a vector space of dimension $ r+s-1$.

Define an embedding

$\displaystyle \sigma : K\hookrightarrow \mathbf{R}^n$ (8.2)

given by $ \sigma(x) = (\sigma_1(x),\ldots,\sigma_{r+s}(x))$, where we view $ \mathbf{C}\cong \mathbf{R}\times \mathbf{R}$ via $ a+b i\mapsto (a,b)$. Thus this is the embedding

$\displaystyle x\mapsto \big(\sigma_1(x), \sigma_2(x),\ldots, \sigma_r(x), \mathrm{Re}(\sigma_{r+1}(x)), \mathrm{Im}(\sigma_{r+1}(x)), \ldots, \mathrm{Re}(\sigma_{r+s}(x)), \mathrm{Im}(\sigma_{r+s}(x))\big).$

Lemma 8.1.8   The image of $ \varphi :U_K\to \mathbf{R}^{r+s}$ is discrete.

Proof. We will show that for any bounded subset $ X$ of $ \mathbf{R}^{r+s}$, the intersection $ \varphi (U_K)\cap X$ is finite. If $ X$ is bounded, then for any $ u\in Y=\varphi ^{-1}(X)\subset U_K$ the coordinates of $ \sigma(u)$ are bounded, since $ \vert\sigma_i(u)\vert=\exp(\log\vert\sigma_i(u)\vert)$ and $ \exp$ is bounded on bounded subsets of $ \mathbf{R}$. Thus $ \sigma(Y)$ is a bounded subset of $ \mathbf{R}^n$. Since $ \sigma(Y)\subset \sigma(\O_K)$, and $ \sigma(\O_K)$ is a lattice in $ \mathbf{R}^n$, it follows that $ \sigma(Y)$ is finite; moreover, $ \sigma$ is injective, so $ Y$ is finite. Thus $ \varphi (U_K)\cap X = \varphi (Y)$ is finite. $ \qedsymbol$

We will use the following lemma in our proof of Theorem 8.1.2.

Lemma 8.1.9   Let $ n\geq 2$ be an integer, suppose $ w_1,\ldots, w_n\in\mathbf{R}$ are not all equal, and suppose $ A, B\in\mathbf{R}$ are positive. Then there exist $ d_1,\ldots, d_{n} \in \mathbf{R}_{>0}$ such that

$\displaystyle \vert w_1\log(d_1)+\cdots +w_{n}\log(d_{n})\vert > B$

and $ d_1\cdots d_n = A$.

Proof. By hypothesis the $ w_i$ are not all equal, so after re-ordering we may assume that $ w_1\neq w_2$. Set $ d_3=\cdots=d_{n}=1$. Suppose $ d_1, d_2$ are any positive real numbers with $ d_1 d_2 = A$. Since $ \log(1)=0$,

$\displaystyle \begin{aligned} \left\vert\sum_{i=1}^{n} w_i \log(d_i)\right\vert &= \vert w_1\log(d_1) + w_2\log(d_2)\vert \\ &= \vert w_1 \log(d_1) + w_2\log(A/d_1)\vert \\ &= \vert(w_1-w_2)\log(d_1) + w_2\log(A)\vert. \end{aligned}$

Since $ w_1\neq w_2$, we have $ \vert(w_1-w_2)\log(d_1) + w_2\log(A)\vert\to\infty$ as $ d_1\to \infty$. It is thus possible to choose the $ d_i$ as in the lemma. $ \qedsymbol$
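The proof of Lemma 8.1.9 is constructive, and the choice of the $ d_i$ can be sketched in a few lines of Python. The function name `choose_d` and the specific choice of $ \log(d_1)$ are assumptions made for illustration (the proof only needs $ d_1$ sufficiently large); floating point arithmetic is used, so very large $ B$ would overflow `math.exp`.

```python
import math

def choose_d(w, A, B):
    """Return positive d_1..d_n with prod(d) = A and |sum w_i*log(d_i)| > B.

    Assumes the w_i are not all equal and A, B > 0, as in Lemma 8.1.9.
    """
    if A <= 0 or B <= 0:
        raise ValueError("A and B must be positive")
    n = len(w)
    # Find an index j with w[j] != w[0]; it plays the role of w_2 in the proof.
    j = next((k for k in range(1, n) if w[k] != w[0]), None)
    if j is None:
        raise ValueError("the w_i must not all be equal")
    diff = w[0] - w[j]
    c = w[j] * math.log(A)
    # Choose L = log(d_1) with diff * L = B + 1 + |c|, so diff * L + c >= B + 1 > B.
    L = (B + 1 + abs(c)) / diff
    d = [1.0] * n          # all other d_i equal 1, contributing log(1) = 0
    d[0] = math.exp(L)
    d[j] = A / d[0]        # forces the product of the d_i to be A
    return d

w = [1.0, 1.0, 0.5]
A, B = 6.6, 10.0
d = choose_d(w, A, B)
total = sum(wi * math.log(di) for wi, di in zip(w, d))
print(d, math.prod(d), abs(total))
```

Only two of the $ d_i$ differ from $ 1$, matching the reduction to $ d_1 d_2 = A$ in the proof.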

Proof of Theorem 8.1.2. By Lemma 8.1.8, the image $ \varphi (U_K)$ is discrete, so it remains to show that $ \varphi (U_K)$ spans $ H$. Let $ W$ be the $ \mathbf{R}$-span of the image $ \varphi (U_K)$, and note that $ W$ is a subspace of $ H$, by Lemma 8.1.5. We will show that $ W=H$ indirectly by showing that if $ v\not \in H^{\perp}$, where $ \perp$ is the orthogonal complement with respect to the dot product on $ \mathbf{R}^{r+s}$, then $ v\not \in W^{\perp}$. This will show that $ W^{\perp}\subset H^{\perp}$, hence that $ H\subset W$, as required.

Thus suppose $ z=(z_1,\ldots,z_{r+s})\not\in H^{\perp}$. Define a function $ f:K^*\to \mathbf{R}$ by

$\displaystyle f(x) = z_1\log\vert\sigma_1(x)\vert + \cdots + z_{r+s}\log\vert\sigma_{r+s}(x)\vert.$ (8.3)

Note that $ f(U_K)=\{0\}$ if and only if $ z\in W^{\perp}$, so to show that $ z\not\in W^{\perp}$ we show that there exists some $ u\in
U_K$ with $ f(u)\neq 0$.


Let

$\displaystyle A=\sqrt{\vert d_K\vert} \cdot \left( \frac{2}{\pi}\right)^s \in \mathbf{R}_{>0}.$

Choose any positive real numbers $ c_1,\ldots, c_{r+s} \in \mathbf{R}_{>0}$ such that

$\displaystyle c_1\cdots c_r\cdot (c_{r+1}\cdots c_{r+s})^2 = A.$


Let

$\displaystyle S = \{(x_1,\ldots,x_n) \in \mathbf{R}^n : \vert x_i\vert\leq c_i \text{ for } 1\leq i \leq r, \quad x_{2i-r-1}^2 + x_{2i-r}^2 \leq c_i^2 \text{ for } r<i\leq r+s\} \subset \mathbf{R}^n.$

Here, for $ r<i\leq r+s$, the coordinates $ x_{2i-r-1}$ and $ x_{2i-r}$ are the ones that hold the real and imaginary parts of $ \sigma_i$ under the embedding (8.2).

Then $ S$ is closed, bounded, convex, symmetric with respect to the origin, and of dimension $ r+2s$, since $ S$ is a product of $ r$ intervals and $ s$ discs, each of which has these properties. Viewing $ S$ as a product of intervals and discs, we see that the volume of $ S$ is

$\displaystyle \Vol (S) = \prod_{i=1}^r (2c_i) \cdot \prod_{i=r+1}^{r+s} (\pi c_i^2) = 2^r\cdot \pi^s \cdot A.$
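As a numerical sanity check of this volume formula, the sketch below takes $ K=\mathbf{Q}(2^{1/3})$, for which $ r=s=1$ and $ \vert d_K\vert=108$, picks one admissible pair $ c_1, c_2$ with $ c_1 c_2^2=A$, and compares the volume of $ S$ with $ 2^r\pi^s A$ and with $ 2^{r+s}\sqrt{\vert d_K\vert}$. The function names and the particular value $ c_2=1.5$ are arbitrary illustrative choices.

```python
import math

def constant_A(abs_disc, s):
    """A = sqrt(|d_K|) * (2/pi)^s."""
    return math.sqrt(abs_disc) * (2 / math.pi) ** s

def vol_S(c, r):
    """Volume of S: r intervals [-c_i, c_i] times s discs of radius c_i."""
    intervals = math.prod(2 * ci for ci in c[:r])
    discs = math.prod(math.pi * ci ** 2 for ci in c[r:])
    return intervals * discs

# Example data for K = Q(2^(1/3)): one real embedding, one complex pair, |d_K| = 108.
r, s, abs_disc = 1, 1, 108
A = constant_A(abs_disc, s)

c2 = 1.5                   # arbitrary positive choice
c = [A / c2 ** 2, c2]      # then c_1 * c_2^2 = A

print(vol_S(c, r))
print(2 ** r * math.pi ** s * A)
print(2 ** (r + s) * math.sqrt(abs_disc))
```

All three printed values agree up to rounding, whatever positive value is used for `c2`.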

Recall Blichfeldt's Lemma 7.1.5, which asserts that if $ L$ is a lattice and $ S$ is closed, bounded, etc., and has volume at least $ 2^n\cdot \Vol (V/L)$, then $ S\cap L$ contains a nonzero element. To apply this lemma, we take $ L=\sigma(\O_K)\subset \mathbf{R}^n$, where $ \sigma$ is as in (8.2). By Lemma 7.1.7, we have $ \Vol (\mathbf{R}^n/L) = 2^{-s}\sqrt{\vert d_K\vert}$. To check the hypothesis of Blichfeldt's lemma, note that

$\displaystyle \Vol (S) = 2^r \pi^s \cdot \sqrt{\vert d_K\vert}\left(\frac{2}{\pi}\right)^s = 2^{r+s} \sqrt{\vert d_K\vert} = 2^n 2^{-s} \sqrt{\vert d_K\vert} = 2^n \Vol (\mathbf{R}^n/L).$

Thus there exists a nonzero element $ x$ in $ S\cap \sigma(\O_K)$. Let $ a\in\O_K$ with $ \sigma(a)=x$. Then $ \sigma(a)\in S$, so $ \vert\sigma_i(a)\vert\leq c_i$ for $ 1\leq i\leq r+s$. We then have

$\displaystyle \begin{aligned} \vert\Norm _{K/\mathbf{Q}}(a)\vert &= \left\vert\prod_{i=1}^{r+2s} \sigma_i(a)\right\vert \\ &= \prod_{i=1}^r \vert\sigma_i(a)\vert\cdot \prod_{i=r+1}^{r+s}\vert\sigma_i(a)\vert^2 \\ &\leq c_1\cdots c_r\cdot (c_{r+1}\cdots c_{r+s})^2 = A. \end{aligned}$

Since $ a\in\O_K$ is nonzero, we also have

$\displaystyle \vert\Norm _{K/\mathbf{Q}}(a)\vert\geq 1.$

Moreover, if for any $ i\leq r$, we have $ \vert\sigma_i(a)\vert< \frac{c_i}{A}$, then

$\displaystyle 1\leq \vert\Norm _{K/\mathbf{Q}}(a)\vert < c_1\cdots \frac{c_i}{A}\cdots c_r \cdot (c_{r+1}\cdots c_{r+s})^2 = \frac{A}{A} = 1,$

a contradiction, so $ \vert\sigma_i(a)\vert\geq \frac{c_i}{A}$ for $ i=1,\ldots,r$. Likewise, $ \vert\sigma_i(a)\vert^2 \geq \frac{c_i^2}{A}$, for $ i=r+1,\ldots, r+s$. Rewriting this we have

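$\displaystyle \frac{c_i}{\vert\sigma_i(a)\vert}\leq A \text{ for } i\leq r \quad\text{and}\quad \left(\frac{c_i}{\vert\sigma_i(a)\vert}\right)^2\leq A \text{ for } i=r+1,\ldots, r+s.$ (8.4)

Since also $ \vert\sigma_i(a)\vert\leq c_i$ for all $ i$, each of these quotients is at least $ 1$, so its logarithm lies between $ 0$ and $ \log(A)$.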

Recall that our overall strategy is to use an appropriately chosen $ a$ to construct a unit $ u\in U_K$ such that $ f(u)\neq 0$. First, let $ b_1,\ldots, b_m$ be representative generators for the finitely many nonzero principal ideals of $ \O_K$ of norm at most $ A$. Since $ \vert\Norm _{K/\mathbf{Q}}(a)\vert\leq A$, we have $ (a)=(b_j)$, for some $ j$, so there is a unit $ u\in \O_K$ such that $ a=u b_j$.


Let

$\displaystyle t = t_{c_1,\ldots, c_{r+s}} = z_1\log(c_1)+\cdots +z_{r+s}\log(c_{r+s}),$

and recall the function $ f:K^*\to \mathbf{R}$ defined in (8.3) above. We first show that

$\displaystyle \vert f(u) - t\vert \leq B_j = \vert f(b_j)\vert + \log(A)\cdot\left(\sum_{i=1}^{r}\vert z_i\vert + \frac{1}{2}\cdot \sum_{i=r+1}^{r+s}\vert z_i\vert\right).$

We have

$\displaystyle \begin{aligned} \vert f(u) - t\vert &= \vert f(a) - f(b_j) - t\vert \\ &\leq \vert f(b_j)\vert + \vert t - f(a)\vert \\ &= \vert f(b_j)\vert + \vert z_1(\log(c_1) - \log(\vert\sigma_1(a)\vert)) + \cdots + z_{r+s}(\log(c_{r+s}) - \log(\vert\sigma_{r+s}(a)\vert))\vert \\ &= \vert f(b_j)\vert + \left\vert z_1\cdot \log(c_1/\vert\sigma_1(a)\vert) + \cdots + \frac{z_{r+s}}{2}\cdot \log((c_{r+s}/\vert\sigma_{r+s}(a)\vert)^2)\right\vert \\ &\leq \vert f(b_j)\vert + \log(A)\cdot\left(\sum_{i=1}^{r}\vert z_i\vert + \frac{1}{2}\cdot \sum_{i=r+1}^{r+s}\vert z_i\vert\right). \end{aligned}$

In the first step we use that $ a=ub_j$ and that $ f$ turns products into sums; in the last step we use (8.4).

Let $ B=\max_{j} B_j$, and note that $ B$ does not depend on the choice of the $ c_i$; in fact, it depends only on the field $ K$, the vector $ z$, and the chosen representatives $ b_j$. Moreover, for any choice of the $ c_i$ as above, we have

$\displaystyle \vert f(u) - t\vert \leq B.$

If we can choose positive real numbers $ c_i$ such that

$\displaystyle c_1\cdots c_r\cdot (c_{r+1}\cdots c_{r+s})^2 = A \quad\text{and}\quad \vert t_{c_1,\ldots, c_{r+s}}\vert >B,$

then the bound $ \vert f(u)-t\vert\leq B$ implies that $ \vert f(u)\vert>0$, which is exactly what we aimed to prove.

If $ r+s=1$, then we are trying to prove that $ \varphi (U_K)$ is a lattice in $ \mathbf{R}^0=\mathbf{R}^{r+s-1}$, which is automatically true, so assume $ r+s>1$. To finish the proof, we explain how to use Lemma 8.1.9 to choose $ c_i$ such that $ \vert t\vert>B$. We have

$\displaystyle \begin{aligned} z_1\log(c_1) + \cdots + z_{r+s}\log(c_{r+s}) &= z_1\log(c_1)+\cdots + z_r\log(c_r)+ \frac{1}{2}\cdot z_{r+1}\log(c_{r+1}^2) + \cdots + \frac{1}{2}\cdot z_{r+s}\log(c_{r+s}^2) \\ &= w_1\log(d_1)+\cdots + w_r\log(d_r)+ w_{r+1}\log(d_{r+1}) + \cdots + w_{r+s}\log(d_{r+s}), \end{aligned}$

where $ w_i=z_i$ and $ d_i=c_i$ for $ i\leq r$, and $ w_i=\frac{1}{2}z_i$ and $ d_i=c_i^2$ for $ r<i\leq r+s$. The condition that $ z\not\in H^{\perp}$ is that the $ w_i$ are not all the same, and in our new coordinates the problem is to find $ d_i$ with $ \vert\sum_{i=1}^{r+s} w_i \log(d_i)\vert>B$, subject to the condition that $ \prod_{i=1}^{r+s} d_i = A$. But this is exactly what Lemma 8.1.9 provides (with $ n=r+s$). It is thus possible to find a unit $ u$ such that $ \vert f(u)\vert>0$. Thus $ z\not\in W^{\perp}$, so $ W^{\perp}\subset H^{\perp}$, whence $ H\subset W$, which finishes the proof of Theorem 8.1.2. $ \qedsymbol$

William Stein 2012-09-24