Groups, geometry and Gromov

the Plus team

This is an experimental venture by the LMS, in collaboration with Plus magazine, hoping to provide a wide range of readers with some idea of exciting recent developments in mathematics, at a level that is accessible to all members as well as undergraduates of mathematics. These articles are not yet publicly accessible and we would be grateful for your opinion on the level of the articles, whether some material is too basic or too complex for a general mathematical audience. The plan is to eventually publish these articles on the LMS website. Please give your feedback via the Council circulation email address.

The Gromov-Hausdorff distance introduced in the previous article is an extremely useful tool. In this article and the next we shall describe two striking applications in quite different areas of mathematics. The first is to prove a conjecture in abstract group theory. Gromov's proof, which cleverly forces geometry on groups where no geometry seemed present, is famous not only for the result itself, but even more because of its role in starting the whole new field of geometric group theory. A curious feature of Gromov's argument is its use of the solution to a problem first posed by David Hilbert in 1900, which was solved by Deane Montgomery and Leo Zippin in the 1950s. This was regarded as a great triumph at the time, but seems to have found no application before Gromov's paper.

To understand Gromov's result and its proof, we first explore its geometric motivation.

Groups and geometry

Groups are connected with spaces because a space has a so-called \emph{fundamental group}. Let's start by considering the simplest group, the integers $\mathbb{Z}$, and one of the nicest geometric shapes, the circle $S^1$. The two are intimately related. Starting from a base-point $x \in S^1$, you can wind around the circle clockwise a number of times before returning to $x$, and you can also wind around it anticlockwise a number of times. Counting anticlockwise windings positively and clockwise ones negatively, you can associate an integer \emph{winding number} to every closed path beginning and ending at $x$ (that is, to each \emph{loop} based at $x$), however complicatedly it wanders back and forth on its journey. Furthermore, one such loop can be deformed continuously to another if and only if they have the same winding number. If you now consider each such journey as a group element and combine two elements by doing one after the other, then it's easy to see that the group formed by the loops (up to deformation) is isomorphic to $\mathbb{Z}$. This group $\mathbb{Z}$ of classes of loops based at $x$ is called the \emph{fundamental group} of $S^1$ at $x$. It can be defined in just the same way for any topological space $X$ with a chosen base-point $x$, and is denoted by $\pi_1(X,x)$.

To see that two loops in the circle with the same winding number can be deformed to each other it is best to "unroll" the circle to get the real line $\mathbb R$. Suppose the circle has its usual metric, scaled so that the circumference has length 1. Then map $0 \in \mathbb R$ to the base-point $x$, and map a positive $t \in \mathbb R$ to the point got to by travelling a distance $t$ anticlockwise around the circle from $x$. Thus the interval $[0,1) \subset \mathbb R$ is mapped bijectively to the circle, as is each successive interval $[1,2)$, $[2,3)$, etc. Similarly, when $t \in \mathbb R$ is negative we map it to the point of the circle got by travelling the distance $-t$ \emph{clockwise} from $x$. The crucial feature of the \emph{covering map} $p: \mathbb R \to S^1$ so defined is that \emph{locally} it is an isometry, so that any movement of a point $y$ of the circle can be replicated upstairs by a movement of any point $t \in \mathbb R$ such that $p(t) = y$. In particular, any loop in the circle based at $x$ can be lifted to a path in $\mathbb R$ beginning at $0$ and ending somewhere in $p^{-1}(x)$, i.e. at some integer, which is the winding number of the loop. But if we have two paths in $\mathbb R$ from $0$ to $n$ then it is obvious we can deform one linearly to the other (think of their graphs!), and pushing the deformation back to the circle by the map $p$ proves that loops with the same winding number can be deformed to each other.
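The lifting argument can be tried out numerically. The following is only a sketch under our own assumptions: a loop is given as a finite list of angles in radians, sampled finely enough that consecutive samples are less than half a turn apart, and the names `winding_number` and `sample_loop` are invented for this illustration. The function lifts the loop step by step to the real line, measured in turns so that the circumference has length 1, and reads off the integer at which the lifted path ends.

```python
import math

def winding_number(angles):
    """Winding number of a closed loop on the circle.

    `angles` lists positions on the circle as angles in radians. The loop
    is closed by returning to the first sample. Consecutive samples are
    assumed to be less than half a turn apart.
    """
    lifted = 0.0                      # position of the lifted path in R, in turns
    closed = list(angles) + [angles[0]]
    for a, b in zip(closed, closed[1:]):
        step = (b - a) / (2 * math.pi)   # the step in turns, on some branch
        step -= round(step)              # pick the unique lift with |step| <= 1/2
        lifted += step
    return round(lifted)              # the lifted path ends at an integer

def sample_loop(turns, wobble=0.0, n=400):
    """A loop going `turns` times anticlockwise, wandering back and forth."""
    return [2 * math.pi * turns * k / n
            + wobble * math.sin(2 * math.pi * 3 * k / n)
            for k in range(n)]

two_turns = winding_number(sample_loop(2))
three_back = winding_number(sample_loop(-3, wobble=1.5))
no_turns = winding_number(sample_loop(0, wobble=2.0))
```

However much the sampled loop wobbles, the answer depends only on where the lifted path ends, which is the point of the argument.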

Figure 1: The torus is defined by two circles (pink and red). Cutting it open along those circles will give a rectangle, which is topologically a square. Image from Wikipedia.

You can play a similar game with the torus $\mathbb{T} = S^1 \times S^1$. The torus can be formed by gluing together opposite edges of a square, and it can be "unrolled" to the plane $\mathbb R^2$, tiled by squares, just as the circle was unrolled to $\mathbb R$ tiled by the unit intervals. This way we see that the fundamental group of the torus is $\mathbb{Z}^2$, sitting inside the plane as the integer lattice. The real line and the plane are what are called \emph{universal covers} of the circle and the torus respectively. Each is a simply connected object (we have cut open and unrolled things precisely so that there are no non-contractible closed loops remaining in the cover) that maps surjectively onto the circle or torus by a continuous \emph{covering map} $p$ with the following property: every point on the circle/torus has an open neighbourhood $U$ whose pre-image under $p$ is a disjoint union of identical copies of $U$ sitting in the universal cover. The reason why we didn't define the universal cover to consist of just one copy of the cut-open and flattened circle or torus (one interval or one square) is because we wanted to ensure that all points on the circle and torus have these nice open neighbourhoods which are replicated in the cover, enabling us to "lift" any movement downstairs up to the covering-space.

Making manifolds

All this works in a more general set-up too. Both the circle and the torus are examples of \emph{manifolds}: topological spaces that look like Euclidean space when viewed from close up. They can be covered by overlapping neighbourhoods that are homeomorphic to Euclidean space. These local homeomorphisms are called \emph{charts}. The picture to have in mind is that the manifold is the surface of the Earth or some other curved object, covered by overlapping regions — open subsets — which are depicted by the charts as open subsets on the pages of an atlas, each page being a copy of standard Euclidean space. When two of the regions overlap the overlap will be depicted as an open subset on each of two different pages of the atlas, and so one will have a 1-1 correspondence between an open set on one page and an open set on the other page. These 1-1 correspondences are called the \emph{transition functions} between the charts. If they are differentiable then we say the charts give the manifold a \emph{differentiable} or \emph{smooth} structure. Another way of thinking of the charts is as systems of "local coordinates" on the manifold: on a differentiable manifold they enable us to say when a real-valued function defined on the manifold is differentiable. In these articles we shall mostly be interested in differentiable manifolds. It turns out that every connected manifold $M$, no matter how complicated its topology, has a universal cover $\tilde{M}$: a simply connected "unrolled" version of itself, also a manifold, with a corresponding covering map $p$ which has the properties listed above. Among other things, a metric on the manifold $M$ induces one on the covering space $\tilde M$. The fundamental group of $M$ always sits inside $\tilde M$ as an evenly-spaced lattice, the inverse-image of the base-point $x \in M$, just as it did for the circle and the torus. 
The fundamental group encodes a great deal of information about the shape of $M$: it is a crucial topological invariant. (Note that the fundamental group of a connected manifold is independent of the base-point you choose, up to isomorphism.) A particularly pretty picture emerges when $M$ is a closed surface. A closed surface, assuming it is orientable, is completely determined topologically by its \emph{genus} $g$, or number of "holes": it must be a sphere, or an ordinary torus, or else a generalised torus with more than one hole. Except for the sphere, which is simply connected and so is its own universal cover, the universal cover is always homeomorphic to the plane. Nevertheless, as we shall see, there is a dramatic difference between the case $g=1$ of the usual torus and the cases of higher genus.

Figure 2: The double torus can be cut open and turned into a hyperbolic octagon.

Let us consider the surface of genus two pictured in Figure 2. We cut it open, as in the figure, to get an octagon, from which the original surface can be reconstructed by sewing together its edges in pairs. But suppose we start sewing copies of the octagon together to make the universal covering, as we did with squares in the case of the usual torus to get the plane. We can do this perfectly well in principle, but we find that the number of octagons we use explodes: the number we need to attach grows exponentially with the number of steps out from the initial octagon. There is no way we can flatten out the resulting surface if we make it out of conventional cloth: there is much too much of it. What we have discovered here is the \emph{hyperbolic plane}. As a topological space this is just the usual plane, but it has a very different metric. One convenient model of it, the Poincaré model, represents it as the open unit disc in the usual Euclidean plane, but with a metric which attributes length $2d/(1-r^2)$ to a short interval of Euclidean length $d$ which is at a distance $r$ from the centre of the disc: in other words, points near the boundary of the disc are very much further apart than they look, and the boundary itself is infinitely far away. Returning to our octagons, we get the tessellation of the hyperbolic plane depicted in Figure 3.

To our Euclidean eyes, the tiles that make up this tessellation appear to become smaller as we move towards the edge of the Poincaré disc, but in the hyperbolic metric they all have the same size and shape, being identical regular octagons, with each edge a segment of a geodesic in the Poincaré metric, and each angle being 45 degrees. This is because the Poincaré disc is a flattened model of the negatively curved hyperbolic plane. An unflattened version would be akin to a kale leaf, getting more and more crinkly towards the edge. It would not be quite a real kale leaf, though, in which the crinkles get smaller as we go out to the boundary; each crinkle should be as big as any other.

Figure 3: The universal cover of the double torus is the Poincaré disc. It is tiled by hyperbolic octagons. The arrows and letters in this image are instructions for how to glue the edges to give the double torus.

A consequence of this crinkliness is that the area of a disc of radius $r$ grows exponentially with $r$, that is, it grows like $\exp{(kr)}$ for some $k>0$. (The exact formula is $2\pi(\cosh r - 1)$.) This is in marked contrast to Euclidean geometry, where the area of a disc is $\pi r^2$, and the growth is only polynomial. Since the fundamental group of the underlying surface is embedded as an evenly-spaced lattice in the Poincaré disc, this exponential growth is inherited by the group. The number of lattice points within a disc of radius $r$ grows exponentially with $r$. By contrast, the number of lattice points of $\mathbb{Z}^2$ within a Euclidean disc of radius $r$ grows like $r^2$.
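To see the exponential growth directly from the exact formula, one can expand the hyperbolic cosine. This is just a routine calculation, included as a check; the symbol $A(r)$ for the area of a hyperbolic disc of radius $r$ is our own notation.

$$A(r) = 2\pi(\cosh r - 1) = \pi\left(e^{r} + e^{-r} - 2\right), \qquad \frac{A(r)}{\pi e^{r}} = 1 - 2e^{-r} + e^{-2r} \longrightarrow 1 \quad \text{as } r \to \infty.$$

So the area grows like $\pi e^{r}$, which is the case $k = 1$. For small $r$, on the other hand, $\cosh r - 1 = r^2/2 + r^4/24 + \cdots$ gives $A(r) = \pi r^2 + \pi r^4/12 + \cdots$, so that very small hyperbolic discs have almost exactly the Euclidean area.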

Something very similar happens in three dimensions. A typical three-dimensional manifold has hyperbolic three-space as its universal cover. Again the fundamental group is represented as a lattice in this cover, and we have exponential growth, a fundamental feature of hyperbolic space of any dimension. In the next article we shall speak about Grigory Perelman's proof of Thurston's geometrisation conjecture (which implies the Poincaré conjecture), which describes how a general compact three-manifold is made up of pieces which are generically hyperbolic.

Growing groups with Gromov

An abstract group, given by a set of generators and relations between them, doesn't necessarily sit within a manifold as fundamental groups do, but it does come with another, albeit sparser, geometrical object: its \emph{Cayley graph}. The vertices of this graph are the elements of the group and two vertices $g$ and $h$ are joined by an edge if $h = sg$ where $s$ is either one of the generators of the group or else the inverse of a generator. The Cayley graph is essentially a map of the group telling you how to get from one element to another through multiplication by the generators. A simple example is the infinite cyclic group $\mathbb{Z}$, generated by the single element 1, whose Cayley graph is an infinite chain, represented by the real line with the integer points marked out as vertices. The Cayley graph comes with a natural metric, called the \emph{word metric}: the distance between any two group elements $g$ and $h$ is the length of the shortest path from $g$ to $h$ in the Cayley graph, or, equivalently, the length of the shortest word $w$ in the generators so that $h = wg$.
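Since the word metric is a shortest-path distance in a graph, it can be computed by a breadth-first search. Here is a minimal sketch, assuming that group elements can be stored as hashable Python values and that the list of generators already includes the inverses; the names `word_distances`, `Z2_GENERATORS` and `add` are ours. As a test case we take $\mathbb{Z}^2$ with its standard generators, where the word distance from the identity to $(a,b)$ should come out as $|a| + |b|$.

```python
from collections import deque

def word_distances(identity, generators, multiply, radius):
    """Word distance from the identity to every element within `radius`.

    `generators` must contain the generators together with their inverses,
    and `multiply(s, g)` must return the product s*g.
    """
    dist = {identity: 0}
    queue = deque([identity])
    while queue:
        g = queue.popleft()
        if dist[g] == radius:
            continue                  # do not step outside the ball
        for s in generators:
            h = multiply(s, g)        # an edge of the Cayley graph joins g to sg
            if h not in dist:
                dist[h] = dist[g] + 1
                queue.append(h)
    return dist

# The integer lattice Z^2 with generators (1,0), (0,1) and their inverses.
Z2_GENERATORS = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def add(s, g):
    return (s[0] + g[0], s[1] + g[1])

dist = word_distances((0, 0), Z2_GENERATORS, add, radius=5)
```

For a group given only by generators and relations one would first need some way of deciding when two words represent the same element, which is exactly the word problem discussed next.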

Cayley graphs were first introduced by Arthur Cayley in 1878 and revived a few decades later by Max Dehn. Dehn used them to study one of the trickiest problems in group theory: to decide whether two words in the generators of a group represent the same element. He gave an algorithm to solve this word problem for certain types of fundamental groups, but later on it was proved that there are groups for which the problem is undecidable.

If you are dealing with the fundamental group of a manifold, then you know that the group sits within a manifold as a point lattice. This raises an obvious question: how does the word metric on the group compare with the metric it acquires from the manifold? The answer is that the two metrics are equivalent, at least if the manifold is compact: this was proved independently by A.S. Švarc and John Milnor in the 1950s and 1960s.

These results spawned interest in the growth of groups more generally. You can transfer the notion of growth to a generic finitely generated group by measuring the distance $R$ using the word metric defined on the Cayley graph. A group is said to have polynomial growth if the number $N(R)$ of vertices on the Cayley graph that lie within distance $R$ of the identity (measured in the word metric) grows like a polynomial in $R$. It has exponential growth if $N(R)$ grows exponentially with $R$. Notice that it is not important which set of generators we choose: if we get $N(R)$ using one generating set and $\tilde N (R)$ using a second set, then we have $N(R) \leq \tilde N (kR)$ and $\tilde N (R) \leq N(\tilde k R)$, where $k$ is the maximum length of a generator of the first set when written in terms of the second, and $\tilde k$ the maximum length of a generator of the second set when written in terms of the first.
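The reason behind these inequalities is short enough to spell out. Write $B(R)$ and $\tilde B(R)$ for the balls of radius $R$ about the identity in the two word metrics (this notation, like the constants $C$ and $d$ below, is ours). A word of length at most $R$ in the first generating set becomes a word of length at most $kR$ in the second once each letter is replaced by its expression of length at most $k$, so

$$B(R) \subseteq \tilde B(kR), \qquad \text{and hence} \qquad N(R) \leq \tilde N(kR).$$

In particular, if $\tilde N(R) \leq C R^{d}$ for all $R \geq 1$, then

$$N(R) \leq \tilde N(kR) \leq C k^{d} R^{d},$$

so polynomial growth of degree at most $d$ for one generating set gives the same for the other, with only the constant changing.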

Figure 4: The Cayley graph of the free group on two generators. The length of the edges has been scaled as you move further out in order to fit it all on the page.

Just as most two- and three-dimensional manifolds are hyperbolic, so most groups come with exponential growth. As an example, take the free group on two generators. Here each vertex has four edges (corresponding to the two generators and their inverses). At the start you have four choices as to which vertex to go to from the identity and subsequently you have three choices whenever you arrive at a new vertex. This shows that there are $2\cdot 3^R - 1$ vertices within distance $R$ of the identity, so the number $N(R)$ grows exponentially with $R$. Figure 4 shows the Cayley graph of this group: it is the universal covering space of the figure-eight graph with one vertex and two edges, which is the simplest space whose fundamental group is the free group on two generators. The horizontal edges in Figure 4 cover one of the edges of the figure-eight, and the vertical edges cover the other. The picture exhibits some pretty geometry, but is misleading in the same way as pictures of the Poincaré disc are, for each edge of the graph really has length 1. If, however, there are sufficiently many relations between the generators, this growth slows down. In abelian groups such as $\mathbb{Z}$ and $\mathbb{Z}^2$ the commutativity relation ensures that $N(R) \leq (2R+1)^S$, where $S$ is the size of the generating set, so the growth is polynomial, like the volume of a ball in Euclidean space.
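The count for the free group is easy to confirm by brute force. In the sketch below, which uses our own conventions, the generators are written `a` and `b`, their inverses `A` and `B`, and an element of the free group is represented by its reduced word, that is, a word in which no letter stands next to its own inverse. Distinct reduced words give distinct elements, so counting them counts the vertices.

```python
# Each letter paired with its inverse: a, b are the generators.
INVERSE = {"a": "A", "A": "a", "b": "B", "B": "b"}

def ball_size(radius):
    """Number of reduced words of length at most `radius`."""
    words = [""]      # reduced words of the current length; "" is the identity
    total = 1
    for _ in range(radius):
        # Extend each word by any letter that does not cancel its last letter:
        # four choices at the identity, three choices everywhere else.
        words = [w + s for w in words for s in INVERSE
                 if not w or INVERSE[w[-1]] != s]
        total += len(words)
    return total

sizes = [ball_size(R) for R in range(7)]
```

The words here are extended on the right, whereas the Cayley graph above was defined using multiplication on the left; the number of elements at each distance from the identity is the same either way.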

We can weaken the commutativity condition a bit without losing polynomial growth. It is possible to prove that nilpotent groups have polynomial growth too. Intuitively you can think of nilpotent groups as "very close" to being abelian (see here for a definition); one way of expressing it is to say that in nilpotent groups elements commute modulo "elements of lower order". We can go a little further still and show that any virtually nilpotent group — that is a group which contains a nilpotent subgroup of finite index — has polynomial growth.
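A standard example of a nilpotent group that is not abelian is the integer Heisenberg group, consisting of the $3 \times 3$ upper-triangular integer matrices with 1s on the diagonal. The following sketch counts the elements within word distance $R$ of the identity. The choices made here are ours: an element is stored as the triple $(a, b, c)$ of its entries above the diagonal, the generators are the two elements $(1,0,0)$ and $(0,1,0)$ with their inverses, and the function names are invented for the illustration.

```python
def multiply(g, h):
    """Product in the integer Heisenberg group.

    The triple (a, b, c) stands for the matrix
    [[1, a, c], [0, 1, b], [0, 0, 1]].
    """
    a, b, c = g
    a2, b2, c2 = h
    return (a + a2, b + b2, c + c2 + a * b2)

# Two generators and their inverses.
GENERATORS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0)]

def ball_sizes(max_radius):
    """List of N(0), N(1), ..., N(max_radius) for the word metric."""
    seen = {(0, 0, 0)}
    frontier = {(0, 0, 0)}        # elements at exactly the current distance
    sizes = [1]
    for _ in range(max_radius):
        frontier = {multiply(s, g) for g in frontier for s in GENERATORS} - seen
        seen |= frontier
        sizes.append(len(seen))
    return sizes

sizes = ball_sizes(8)
```

A polynomial bound can be seen by hand. Each step changes $a$ or $b$ by 1 and changes $c$ by at most $R$ in absolute value, so an element within distance $R$ has $|a|, |b| \leq R$ and $|c| \leq R^2$, giving $N(R) \leq (2R+1)^2(2R^2+1)$, a polynomial of degree 4. The counts agree with those of the free group up to $R = 2$, but fall behind once the relations begin to bite.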

It's not surprising that a tight algebraic constraint such as being virtually nilpotent should control the growth of a group. What is surprising, however, is that the converse is also true: the seemingly mild constraint of having polynomial growth implies that the group is virtually nilpotent. This result, which was first conjectured by Milnor, was proved by Gromov in 1980, and it is here that he made clever use of the Gromov-Hausdorff metric we introduced in the last article.

Gromov's proof

Gromov's idea, phrased informally, was this: while an abstract group might not sit in a manifold in the same way that fundamental groups do, you can imagine that it does by looking at its Cayley graph from far away and squinting, so that its vertices merge together to form a continuum. The polynomial growth of the group then ensures that the continuum you have squinted into existence is finite dimensional. The fact that the group you are dealing with is virtually nilpotent can then be deduced from what is known about the groups of isometries of such continua.

Technically the squinting corresponds to re-scaling the word metric $d_G$ on the Cayley graph of a group $G$ with polynomial growth by a sequence of numbers $r_n$ tending to 0. This gives a sequence of metric spaces $G_n$ whose distance function $d_n$ is simply $r_nd_G$. Rescaling the metric in this way has the effect of pulling the vertices of the graph "closer together" so you might hope that in the Gromov-Hausdorff limit you get something that looks like a continuum — rather like a manifold. And indeed you do. If, for example, $G$ is the integer lattice $\mathbb{Z}^2$, then the metric spaces coming from re-scaling converge to the Euclidean plane equipped with the taxi cab metric we met in the first article. Strictly speaking — bearing in mind that we have defined convergence only for \emph{compact} metric spaces — this means that for every $r > 0$ the ball of radius $r$ around the identity element in the rescaled group converges to the ball of radius $r$ in Euclidean space.
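For $\mathbb{Z}^2$ the convergence can be checked by hand. The following is a sketch under our own choices: the standard generators $(\pm 1, 0)$, $(0, \pm 1)$ and the scaling factors $r_n = 1/n$. The word metric is then

$$d_G\big((a,b),(a',b')\big) = |a - a'| + |b - b'|,$$

so $G_n$ is isometric to the lattice $\tfrac{1}{n}\mathbb{Z}^2$ inside the plane with its taxi cab metric. The ball of radius $r$ in $G_n$ becomes

$$B_n = \left\{ \left(\tfrac{a}{n}, \tfrac{b}{n}\right) : a, b \in \mathbb{Z},\ |a| + |b| \leq nr \right\} \subseteq B = \left\{ (x,y) \in \mathbb{R}^2 : |x| + |y| \leq r \right\}.$$

Conversely, rounding each coordinate of a point of $B$ towards zero to a multiple of $1/n$ gives a point of $B_n$ at taxi cab distance less than $2/n$. The Hausdorff distance between $B_n$ and $B$ inside the plane is therefore at most $2/n$, and since the Gromov-Hausdorff distance is no larger than the Hausdorff distance in any common ambient space, it tends to 0 as $n \to \infty$.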

A very interesting example is the simplest non-abelian case: the Heisenberg group.

Gromov's argument is that for a general group $G$ with polynomial growth the sequence of metric spaces obtained by rescaling its word-metric converges to a finite-dimensional metric space $Y$, in the same sense that the rescaled lattices $\mathbb{Z}^2$ converged to the plane. It is the polynomial growth property which shows that $Y$ is finite-dimensional. The way $Y$ is constructed from copies of $G$ shows that left-multiplication by any element $g$ of $G$ defines an \emph{isometry} (a distance preserving map) from $Y$ to itself. Luckily, the Montgomery-Zippin solution of the Hilbert problem, which we shall come back to in a moment, tells us a lot about the group of isometries of a finite-dimensional space with reasonably nice local behaviour: it is a so-called \emph{Lie group}. Lie groups are well understood, and a result of Jacques Tits gives us exactly what we need: it says that a finitely generated subgroup of a Lie group is virtually nilpotent if it has polynomial growth. By showing that the space $Y$ has just the properties that Montgomery and Zippin call for, Gromov concludes that $G$ is a subgroup of a Lie group, and so because of its polynomial growth it is virtually nilpotent.

Hilbert's fifth problem

As we already mentioned, to make the leap from the limit space $Y$ to a Lie group, Gromov used a result that is interesting in its own right. In 1900 the German mathematician David Hilbert addressed the International Congress of Mathematicians in Paris, and in his address issued a list of 23 problems which, he thought, would dominate the mathematics of the twentieth century. Hilbert's fifth problem concerned Lie groups. By definition, a Lie group is a group $G$ which is also a differentiable manifold, and is such that the multiplication map $G \times G \to G$ is differentiable. This means, first, that $G$ has a topology in which it is a \emph{manifold}, i.e. locally it is homeomorphic to Euclidean space. Secondly, it means that $G$ can be covered by "charts" between which the transition functions are differentiable. Thirdly, it means that, when expressed in terms of these charts, the multiplication is differentiable. Lie groups arise everywhere in mathematics. The natural examples are groups of real or complex matrices, such as $SO(n)$, the group of rotations of $n$-dimensional Euclidean space, or $PSL(2,\mathbb R)$, which is isomorphic to the group of all orientation-preserving isometries of the hyperbolic plane.

Hilbert's problem asked whether the differentiability assumptions in the definition of a Lie group are redundant, i.e. whether any group with a topology is automatically a Lie group if the multiplication map is continuous and it is topologically a manifold. This seemed an important question, because Lie and others had shown how completely Lie groups can be understood by the techniques of differential calculus. It was also very difficult, and the positive solution found by Montgomery and Zippin was a breakthrough. (You can find out more in this beautiful downloadable book by Terence Tao.)

Strangely, however, it turned out that when one comes upon a group with a topology then almost always there is enough extra information at hand to make the differentiability properties obvious. Gromov seems to have been the first to make important use of Montgomery and Zippin's work. But perhaps even this is a little misleading. In fact they had proved a much stronger result than Hilbert had asked for: they proved that any locally compact group with no "small subgroups" is a Lie group, and from that they could deduce that the group of isometries of a finite-dimensional space with certain topological properties is a Lie group. Gromov used the full force of this stronger result to show that the isometries of his limiting space $Y$ form a Lie group: he did not show that $Y$ is a topological manifold, simply that it is a finite-dimensional locally compact space which is connected and locally connected.
