Instance optimal algorithm for aggregation in multimedia databases

You need to know: Algorithm, running time of the algorithm, worst-case running time.

Background: Assume that each object in a database has m grades, one for each of its m attributes. For each attribute, there is a sorted list, which lists each object and its grade under that attribute, sorted by grade (highest grade first). Each object is assigned an overall grade, which is a fixed monotone function of the attribute grades, such as min or average. The problem is to determine the top k objects with the highest overall grades.

The Theorem: On 6th September 2001, Ronald Fagin, Amnon Lotem, and Moni Naor submitted to the Journal of Computer and System Sciences a paper in which they developed an algorithm to solve the problem above and prove that it is instance optimal, that is, optimal (within a constant factor) for any database.

Short context: Traditionally, the efficiency of algorithms are estimated based on worst-case running time. However, given algorithm A for some problem, it is usually very difficult to prove that its worst-case running time is optimal. And even if it is, there still may exists a different algorithm B, with worse worst-case running time, but performing much faster than A on most instances. In contrast, the Theorem proves that some given algorithm is optimal (within a constant factor) for all instances! This is the strongest possible notion of algorithm optimality one can imagine.

Links: Free arxiv version of the original paper is here, journal version is here.

Go to the list of all theorems

Linear groups of exponential growth have uniform exponential growth

You need to know: Complex numbers, matrix with complex entries, matrix multiplication, invertible matrix, group, subgroup, finitely generated group, generating set.

Background: Let G be a finitely generated subgroup of GL_n({\mathbb C}) (the group of n \times n invertible matrices with complex entries). For any finite generating set S of G, let \gamma_S(k) be the number of elements of G which are the product of at most k matrices in S \cup S^{-1}. Let e_S(G)=\lim\limits_{k \to \infty}\sqrt[k]{\gamma_S(k)}.

The Theorem: On 23rd August 2001, Alex Eskin, Shahar Mozes, and Hee Oh submitted to arxiv a paper in which they proved that if G is a finitely generated subgroup of GL_n({\mathbb C}) and e_S(G)>1 for some S then \inf_S e_S(G) > 1, where the infimum is taken over all finite generating sets S of G.

Short context: Subgroups of GL_n({\mathbb C}) are examples of linear groups. For any finitely generated group (not necessary linear), it is known that if e_S(G)>1 for some S then in fact e_S(G)>1 for all S, and, in this case, we say that G has exponential growth. If \inf_S e_S(G) > 1, we say that G has uniform exponential growth. In 1981, Gromov asked whether every group of exponential growth has uniform exponential growth. The Theorem answers this question affirmatively for subgroups of GL_n({\mathbb C}). In a later work, Wilson proved that in general the answer to Gromov’s question is negative.

Links: Free arxiv version of the original paper is here, journal version is here.

Go to the list of all theorems

The best constant in the centered Hardy–Littlewood maximal inequality

You need to know: Integration, \sup notation, Lebesgue measure.

Background: Let {\cal L}^1({\mathbb R}) be the set of all functions f:{\mathbb R}\to {\mathbb R} such that the integral \int_{-\infty}^{+\infty} |f(x)| dx exists and finite. We will denote \|f\|_1 the value of this integral. The centered Hardy-Littlewood maximal operator is the map which assigns to any function f \in {\cal L}^1({\mathbb R}) a function M_f (t) = \sup\limits_{\epsilon>0}\frac{1}{2\epsilon}\int_{t-\epsilon}^{t+\epsilon} |f(x)| dx. For any  \delta>0, denote |\{M_f > \delta\}| the Lebesgue measure of set of real numbers t such that M_f(t) > \delta.

The Theorem: On 16th August 2001, Antonios Melas submitted to the Annals of Mathematics a paper in which he proved that for every f \in {\cal L}^1({\mathbb R}) and for every \delta>0 we have |\{M_f > \delta\}| \leq \frac{11+\sqrt{61}}{12}\frac{\|f\|_1}{\delta}, and the constant \frac{11+\sqrt{61}}{12} \approx 1.5675208 in this inequality is the best possible.

Short context: The statement that inequality |\{M_f > \delta\}| \leq C\frac{\|f\|_1}{\delta} holds with some constant C is known as the centered Hardy–Littlewood maximal inequality, and has various applications in the theories of differentiation and integration (in particular, it is related to Lebesgue differentiation theorem, fractional integration theorem, and more). Let us denote C^* the best possible (smallest) constant C such that the inequality holds. Carbery and Soria suggested that C^*=1.5, but this was disproved by Aldaz in 1998. Few years later, Melas proved that \frac{11+\sqrt{61}}{12} \leq C^* \leq \frac{5}{3} and conjectured that the lower bound is the actual value of C^*. The Theorem confirms this conjecture.

Links: Free arxiv version of the original paper is here, journal version is here. See also Section 3.3 of this book for an accessible description of the Theorem.

Go to the list of all theorems

A polynomial is matrix-positive if and only if it is the sum of squares

You need to know: Matrix, transpose A^T of matrix A, matrix addition and multiplication, positive semidefinite matrix.

Background: Consider a set of 2n variables, which we denote x_1, x_2, \dots, x_n, x_1^T, x_2^T, \dots, x_n^T. A noncommutative monomial in these variables is any expression of the form ay_1y_2\dots y_m, where a is a real coefficient and each y_i is equal to either x_j or x_j^T for some j. A noncommutative polynomial Q(.) is the sum of any number of such monomials. If we substitute into Q(.) any real matrices A_1, \dots, A_n of any dimension r \times r instead of variables x_1, x_2, \dots, x_n, and their transposes A^T_1, \dots, A^T_n instead of variables x_1^T, \dots, x_n^T, then Y=Q(A_1, \dots, A_n,A^T_1, \dots, A^T_n) is an r \times r matrix. Let S(Q) be the set of all matrices Y which can be obtained in this way. If all matrices in S(Q) are symmetric, we call Q(.) symmetric. If all matrices in S(Q) are positive semidefinite, we call Q(.) matrix-positive. Finally, we say that Q(.) is the sum of squares if Q(.)=\sum_{i=1}^k h_i(.)^Th_i(.), where h_i(.), \, i=1,2,\dots,k are noncommutative polynomials in the same variables.

The Theorem: On 31st July 2001, William Helton submitted to the Annals of Mathematics a paper in which he proved that a noncommutative symmetric polynomial is matrix positive if and only if it is the sum of squares.

Short context: For multivariate polynomials in real variables, sum of square representation is one of the main methods to prove that a polynomial is non-negative. This method, however, does not always work because there exist non-negative polynomials which have no sum of square representations. In contrast to this, the Theorem proves that in noncommutative (matrix) setting every matrix-positive polynomial has sum of squares representation.

Links: The original paper is available here. See also Section 2.10 of this book for an accessible description of the Theorem.

Go to the list of all theorems

The simple random walk covers all points of Z_n^2 in 4n^2(log n)^2/pi steps

You need to know: Addition and subtraction modulo n, limits, basic probability theory, notation P[.] for probability of something.

Background: Let {\mathbb Z}_n^2 be the set of points x=(a,b) such that 0\leq a < n and 0\leq b < n are integers. Let x_k, k=0,1,2,\dots be a (random) sequence of points in {\mathbb Z}_n^2, such that, if x_k=(a_k,b_k), then x_{k+1} can be (a_k+1,b_k), or (a_k-1,b_k), or (a_k,b_k+1), or (a_k,b_k-1) with equal chances, where the addition and subtraction are modulo n. Let T_n be the smallest integer m such that for every x\in{\mathbb Z}_n^2 we have x=x_i for some 0\leq i \leq m.

The Theorem: On 26th July 2001, Yuval Peres, Jay Rosen, and Ofer Zeitouni submitted to the arxiv and Annals of Mathematics a paper in which they proved, among other results, that for any \epsilon>0, \lim\limits_{n\to \infty}P\left[\left|\frac{T_n}{(n \log n)^2} - \frac{4}{\pi}\right|>\epsilon\right] = 0.

Short context: Random walk is a simple model for various applications, it can be used to approximate stock prices, model the movement of a molecule in a liquid or gas, etc. The time it takes for a random walk to cover all points in a certain set has been intensively studied and has numerous applications. The problem of estimating the cover time T_n for {\mathbb Z}_n^2 was asked by Wilf in 1989 who called it “the white screen problem”. In the same year, Aldous conjectured that T_n/(n \log n)^2 converges to 4/\pi. The Theorem confirms this conjecture.

Links: Free arxiv version of the original paper is here, journal version is here. See also Section 4.8 of this book for an accessible description of the Theorem.

Go to the list of all theorems

Rational polygon whose set of non-ergodic directions has dimension 1/2

You need to know: Polygon, angles measured in radians, angle of incidence, angle of reflection, limits, Hausdorff dimension of a set.

Background: Let Q be a rational polygon, that is, polygon whose angles measured in radians are rational multiples of \pi. A “billiard” in Q is a point moving inside Q with constant speed, with the usual rule that the angle of incidence equals the angle of reflection. Its trajectory is called equidistributed if, for every subset M \subset Q of area S(M), \lim\limits_{T\to\infty}\frac{f_M(T)}{T} = \frac{S(M)}{S(Q)} where S(Q) is the area of Q, and f_M(T) is the time the point spent inside M during the time interval [0,T]. A direction, parametrized by angle \alpha\in[0,2\pi), leading to equidistributed trajectory is called ergodic, and all other directions are called non-egrodic. Let \text{NE}(Q)\subset [0,2\pi) denotes the set of non-egrodic directions in Q, and let \text{dim}(\text{NE}(Q)) be its Hausdorff dimension.

The Theorem: On 23rd July 2001, Yitwah Cheung and Michael Boshernitzan submitted to the Annals of Mathematics a paper in which they proved the existence of a rational polygon Q with \text{dim}(\text{NE}(Q))=1/2.

Short context: A fundamental theorem proved by Kerckhoff, Masur, and Smillie in 1986 states that, in any rational polygon Q, set \text{NE}(Q) has Lebesgue measure 0. In 1992, Masur proved a much stronger result that in fact \text{dim}(\text{NE}(Q)) \leq 1/2. The Theorem shows that constant 1/2 in this inequality is the best possible.

Links: Free arxiv version of the original paper is here, journal version is here. See also Section 3.7 of this book for an accessible description of the Theorem.

Go to the list of all theorems

Any two components of the USF in Z^d are adjacent a.s. if d is at most 8

You need to know: Finite and infinite graph, tree, forest, path, connected component, subgraph, induced subgraph. Basic probability theory, select an object uniformly at random, an event happening almost surely (a.s.). Notation \left \lfloor{x}\right \rfloor for the largest integer not exceeding real number x.

Background: Let {\mathbb Z}^d be (infinite) graph with vertices x=(x_1, \dots, x_d) with integer coordinates, where x and y are connected by an edge if \sum_{i=1}^d|x_i-y_i|=1. Let {\mathbb Z}_n^d be the (finite, induced) subgraph of {\mathbb Z}^d consisting on vertices x=(x_1, \dots, x_d) with |x_i|\leq n for all i. A spanning tree on {\mathbb Z}_n^d is a (non-induced) subgraph of {\mathbb Z}_n^d which (i) is a tree, and (ii) contains at least 1 edge incident to each vertex of {\mathbb Z}_n^d. Let H_n^d be a spanning tree on {\mathbb Z}_n^d selected uniformly at random. For any set C of edges in {\mathbb Z}^d, let P_n(C) be the probability that all these edges are included in H_n^d, and let P(C)=\lim\limits_{n\to\infty} P_n(C). A (random) subgraph H^d of {\mathbb Z}^d such that any set C of edges of {\mathbb Z}^d is included into H^d with probability P(C) is called the uniform spanning forest (USF) on {\mathbb Z}^d. For x,y \in {\mathbb Z}^d, let N(x, y) be the minimum number of edges outside the USF in a path joining x and y in {\mathbb Z}^d.

The Theorem: On 19th July 2001, Itai Benjamini, Harry Kesten, Yuval Peres, and Oded Schramm submitted to arxiv a paper in which they proved that, almost surely, \max\limits_{x,y\in {\mathbb Z}^d}N(x,y) = \left \lfloor{\frac{d-1}{4}}\right \rfloor.

Short context: The USF was introduced by Pemantle in 1991 as a way to choose a spanning forest in {\mathbb Z}^d uniformly at random. Pemantle proved that, for d\leq 4 (but not for d\geq 5), the USF is connected (and thus is a spanning tree) almost surely. In other words, N(x,y)=0 for all x,y \in {\mathbb Z}^d if and only if d\leq 4. The Theorem computes \max\limits_{x,y\in {\mathbb Z}^d}N(x,y) in all dimensions. In particular, N(x,y) \leq 1 a.s. for all x,y \in {\mathbb Z}^d if and only if d\leq 8.

Links: Free arxiv version of the original paper is here, journal version is here. See also Section 4.9 of this book for an accessible description of the Theorem.

Go to the list of all theorems

Regular n-gon minimizes the logarithmic capacity among all n-gons with a fixed area

You need to know: Basic geometry, polygon, regular polygon, notation |AB| for the length of line interval with endpoints A and B.  n-gon is a polygon with n vertices. We assume that polygon contains its boundary (vertices and sides).

Background: Let E be a polygon, or, more generally, a closed bounded subset of the plane. For any k points A_1, \dots A_k \in E, denote p(E, A_1, A_2, \dots, A_k) = \left(\prod\limits_{i=1}^{k-1}\prod\limits_{j=i+1}^k |A_iA_j|\right)^{\frac{2}{k(k-1)}} . Then let p_k(E) := \max\limits_{A_1, A_2, \dots, A_k \in E} p(E, A_1, A_2, \dots, A_k), and p(E) = \lim\limits_{k\to\infty} p_k(E). Number p(E) is called transfinite diameter, or logarithmic capacity of set E.

The Theorem: On 25th May 2001, Alexander Solynin and Victor Zalgaller submitted to the Annals of Mathematics a paper in which they proved that, for any n-gon E_n, p(E_n) \geq p(E^*_n), where E^*_n is the regular n-gon with the same area as E_n.

Short context: Number p(E) has many names (logarithmic capacity, transfinite diameter, conformal radius, Chebyshev constant), indicative of its numerous applications. In 1951, Pólya and Szegö conjectured that regular n-gon minimizes the logarithmic capacity among all n-gons with a fixed area and proved this conjecture for n=3 and n=4. The Theorem confirms this conjecture for all n.

Links: Free arxiv version of the original paper is here, journal version is here. See also Section 4.2 of this book for an accessible description of the Theorem.

Go to the list of all theorems

Any sequence of Lipschitz maps has a common point of differentiability

You need to know: Banach space, norm \|.\|_X in Banach space X, dimension of a Banach space, bounded linear functions between Banach spaces, small o(.) notation.

Background: Let X,Y be Banach spaces. A function f:X \to Y is called Lipschitz continuous if there is a constant K \geq 0 such that \|f(x)-f(y)\|_Y \leq K \|x-y\|_X for all x,y \in X. A function f:X\to Y is called Fréchet  differentiable at x_0 \in X if there is a bounded linear function T:X \to Y such that f(x_0 + u) = f(x_0) + T(u) + o(\|u\|_X) \, \text{as} \, u \to 0.

The Theorem: On 22nd May 2001, Joram Lindenstrauss and David Preiss submitted to the Annals of Mathematics a paper in which they proved the existence of infinite dimensional Banach space X such that every sequence f_1, f_2, \dots, of Lipschitz continuous functions on X has a point x_0 \in X such that all f_i are Fréchet  differentiable at x_0.

Short context: A well-known open question is to investigate which Banach spaces X have the property that every countable collection of Lipschitz functions on X has a common point of Fréchet differentiability. Before 2001, this property was known to only hold for finite dimensional spaces. The Theorem proves that it also holds for some Banach spaces of infinite dimension.

Links: Free arxiv version of the original paper is here, journal version is here. See also Section 3.1 of this book for an accessible description of the Theorem.

Go to the list of all theorems

There exist non-isotrivial elliptic curves over F_p(t) with arbitrarily large rank

You need to know: Field, elliptic curve over field, rank of elliptic curve, finite field {\mathbb F}_p (field of integers modulo prime p)

Background: The j-invariant of an ellipltic curve E given in the form y^2=x^3+ax+b is j(E)=1728\frac{4a^3}{4a^3+27b^2}. Let {\mathbb F}_p(t) be the field of rational functions of the form f(t)=\frac{a_mt^m+a_{m-1}t^{m-1}+a_1t+a_0}{b_nt^n+b_{n-1}t^{n-1}+b_1t+b_0}, where a_m,\dots,a_1,a_0,b_n,\dots,b_1,b_0 \in {\mathbb F}_p. Elliptic curve E over field {\mathbb F}_p(t) is called isotrivial if j(E) \in {\mathbb F}_p, and non-isotrivial otherwise.

The Theorem: On 19th May 2001, Douglas Ulmer submitted to the Annals of Mathematics arxiv a paper in which he proved that elliptic curve E defined over the field {\mathbb F}_p(t) by the equation y^2 + xy = x^3 - t^d, where d = p^n + 1 for some positive integer n, is non-isotrivial and has the rank at least (p^n - 1)/2n.

Short context: For elliptic curves over rationals, the question whether rank can be arbitrary large is important and open. By 2001, the largest known rank was 24 (later, in 2006, Elkies discovered an elliptic curve over {\mathbb Q} with a rank at least 28). In contrast, Shafarevitch and Tate produced in 1967 elliptic curves over {\mathbb F}_p(t) of arbitrarily large rank. These curves are isotrvial. The Theorem produces first non-isotrivial examples of elliptic curves over {\mathbb F}_p(t) of arbitrarily large rank.

Links: Free arxiv version of the original paper is here, journal version is here. See also Section 2.4 of this book for an accessible description of the Theorem.

Go to the list of all theorems