L1-regularisation recovers sparse vectors in R^m from n<<m inaccurate measurements

You need to know: Euclidean space {\mathbb R}^m, norms ||x||_2 = \sqrt{\sum\limits_{i=1}^m x_i^2} and ||x||_1 = \sum\limits_{i=1}^m |x_i| in {\mathbb R}^m, support of vector x \in {\mathbb R}^m (subset T\subset \{1,\dots,m\} such that x_i=0 for all i \not\in T), n \times m matrix, matrix multiplication, big O notation.

Background: For x_0 \in {\mathbb R}^m let y=Ax_0+e, where A is n \times m matrix (n<m) and e\in {\mathbb R}^n. Let x^* be a solution to the problem \min\limits_{x \in {\mathbb R}^m} ||x||_1 subject to ||Ax-y||_2 \leq \epsilon for some \epsilon \geq 0 (this is called l1-regularisation problem).

For T\subset \{1,\dots,m\}, let A_T be n \times |T| submatrix of A formed from the columns of A corresponding to the indices in T. For S>0, let \delta_S be the smallest number such that inequalities (1-\delta_S)||c||_2^2 \leq ||A_Tc||_2^2\leq (1+\delta_S)||c||_2^2 hold for all T with |T|\leq S and all c\in{\mathbb R}^{|T|}.

The Theorem: On 3rd March 2005, Emmanuel Candes, Justin Romberg, and Terence Tao submitted to arxiv and Communications on Pure and Applied Mathematics a paper in which they proved the following result. Let A be n \times m matrix (n<m), and let S be such that \delta_{3S}+3\delta_{4S} < 2. Then for any x_0 \in {\mathbb R}^m supported on T_0 with |T_0|\leq S and any e\in {\mathbb R}^n such that ||e||_2 \leq \epsilon, we have ||x^*-x_0||_2 \leq C_S \cdot \epsilon, where the constant C_S depends only on \delta_{4S}. For example, C_S \approx 8.82 for \delta_{4S}=\frac{1}{5} and C_S \approx 10.47 for \delta_{4S}=\frac{1}{4}.

Short context: Vectors x_0 \in {\mathbb R}^m with few non-zero entries are called sparse and may represent, for example, image or signal in various applications.  Vector y=Ax_0+e is the result of n measurements (given by rows of A) with error e. Given y, we then try to recover x_0 by solving the l1-regularisation problem. The Theorem states that its solution x^* is close to x_0. In particular, if the measurements are exact (\epsilon=0) then x^* = x_0. The authors show that the condition \delta_{3S}+3\delta_{4S} < 2 in the Theorem holds for almost all matrices A with unit-normed columns if n=O(S\log m), hence, with high probability, we can recover x_0 after just O(|T_0|\log m) random measurements. In an earlier work, Donoho proved the existence of a recovery procedure with similar properties, while the Theorem states that a concrete efficient algorithm (l1-regularisation) does the job, even for inaccurate measurements. The error C_S \cdot \epsilon in the Theorem is essentially optimal.

Links: Free arxiv version of the original paper is here, journal version is here.

Go to the list of all theorems

Pólya-Vinogradov bound is not tight for characters of odd, bounded order

You need to know: Set {\mathbb Z} of integers, greatest common divisor (gcd) of 2 integers, set {\mathbb C} of complex numbers, o(1) notation.

Background: Function \chi_q:{\mathbb Z}\to{\mathbb C} is called a Dirichlet character modulo integer q>0 if (i) \chi_q(n)=\chi_q(n+q) for all n, (ii) if \text{gcd}(n,q) > 1 then \chi_q(n)=0; if \text{gcd}(n,q) = 1 then \chi_q(n) \neq 0, and (iii) \chi_q(mn)=\chi_q(m)\chi_q(n) for all integers m and n. An example is character \chi_q^* such that \chi_q^*(n)=1 whenever \text{gcd}(n,q) = 1 and \chi_q^*(n)=0 otherwise. This character is called principal and all other characters – nonprincipal. The order of a Dirichlet character \chi_q is the least positive integer m such that \chi_q(n)^m=\chi_q^*(n) for all n. The character \chi_q is called primitive if there is no integer 0<d<q such that \chi_q(a)=\chi_q(b) whenever \text{gcd}(a,q) = \text{gcd}(b,q) = 1 and a-b is a multiple of d. For any nonprincipal character \chi_q let M(\chi_q) = \max\limits_{1\leq m\leq q}\left|\sum\limits_{n=1}^m \chi_q(n)\right|.

The Theorem: On 2nd March 2005, Andrew Granville and Kannan Soundararajan submitted to the Jornal of the AMS a paper in which they proved that that if \chi_q is a primitive character modulo q of odd order g, then M(\chi_q) \leq C_g \sqrt{q} (\log q)^{1-\frac{\delta_g}{2} + o(1)} where \delta_g = 1 - \frac{g}{\pi}\sin\frac{\pi}{g}, and C_g is a constant depending only on g.

Short context: In 1918, Pólya and Vinogradov independently proved that M(\chi_q)=O(\sqrt{q}\log q) for any nonprincipal Dirichlet character \chi_q. This is known as the Pólya–Vinogradov inequality, has numerous applications, but it is an open question whether a better bound for M(\chi_q) in terms of q is possible. The Theorem provides an improved bound for characters of odd, bounded order.

Links: Free arxiv version of the original paper is here, journal version is here.

Go to the list of all theorems

There are asymptotically C n^(-7/2) g^n n! labelled planar graphs on n vertices

You need to know: Graph, planar graph, labelled graph, factorial n!=1\cdot 2 \cdot \dots \cdot n.

Background: For positive integer n, let g(n) be the number of labelled planar graphs on n vertices. We write g(n) \sim f(n) if \lim\limits_{n\to\infty}\frac{g(n)}{f(n)}=1.

The Theorem: On 18th January 2005, Omer Gimenez and Marc Noy submitted to arxiv a paper in which they proved that g(n) \sim C n^{-7/2}\gamma^n n!, where \gamma = 27.22688\dots and C = 0.0000042609\dots are explicit constants.

Short context: How many planar graphs are there? It is not difficult to show the existence of limit \gamma = \lim\limits_{n\to\infty} \left(\frac{g(n)}{n!}\right)^{1/n}, but much more difficult to actually compute this limit. Many researchers provided better and better lower and upper bounds for \gamma, with the best (before 2005) bounds \gamma > 26.18 by Bender, Gao and Wormald and \gamma < 30.06 by Bonichon et.al. The Theorem provides an exact analytical expression for \gamma, which allows numerical computation up to arbitrary precision. Moreover, it provides the exact asymptotic formula for g(n).

Links: Free arxiv version of the original paper is here, journal version is here.

Go to the list of all theorems

Undirected connectivity is in log-space

You need to know: Undirected graph, path connecting 2 vertices in a graph, algorithm, memory used by an algorithm, big O notation.

Background: The undirected connectivity problem has the following formulation: given as input an undirected graph G with n vertices and two vertices s and t in G, find out whether or not these two vertices are connected by a path in G.

The Theorem: In January 2008, Omer Reingold submitted to the Journal of the ACM a paper in which he presented an algorithm which can solve this problem using memory O(\log n).  

Short context: Efficient memory use is one of the fundamental requirement for algorithms, almost as important and well-studied as running time. If the decision problem has an algorithm which solves it using memory O(\log n), where n is the size of (read-only) input, we say that this problem belongs to the class log-space. The Theorem proves that the undirected connectivity problem, which naturally arises in many applications, belongs to this class.

Links: The original paper is available here.

Go to the list of all theorems

 

The Grothendieck constant of the complete n-vertex graph is at least c log n

You need to know: Euclidean space {\mathbb R}^n, inner product (x,y)=\sum\limits_{i=1}^n x_i y_i in {\mathbb R}^n, unit vector in {\mathbb R}^n, matrix, big O notation.

Background: For a positive integer n, let K_n be the minimal constant such that for every n \times n matrix with entries a_{ij} and every unit vectors x_1,\dots, x_n in {\mathbb R}^n there are signs e_1,\dots,e_n \in \{-1, +1\} for which \sum\limits_{i \neq j} a_{ij}(x_i, x_j) \leq K_n \sum\limits_{i \neq j} a_{ij}e_ie_j. Constant K_n is called the Grothendieck constant of the complete n-vertex graph, by analogy with the Grothendieck constant K_G, see here.

The Theorem: On 8th November 2004, Noga Alon, Konstantin Makarychev, Yury Makarychev, and Assaf Naor submitted to Inventiones mathematicae a paper in which they proved, among other results, the existence of a constant c>0 such that K_n \geq c \log n for all n.

Short context: In 1999, Nemirovski, Roos, and Terlaky proved that K_n \leq C\log n for some constant C. Later, Megretski asked if this can be improved to K_n \leq C. In 2004, Charikar and Wirth derived from inequality K_n \leq C\log n various algorithms for combinatorial problems, and noticed that better upper bound on K_n could lead to algorithms with better accuracy. The Theorem, however, proves that inequality K_n \leq C\log n is optimal up to the constant factor.

Links: The original paper is available here.

Go to the list of all theorems

Random Bernoulli n by n matrix is singular with probability at most (3/4+o(1))^n

You need to know: Matrix, determinant \text{det}(M) of square matrix M, singular square matrix (matrix with determinant 0), basic probability theory,  independent and identically distributed (i.i.d.) random variables, o(1) notation.

Background: Let M_n be a random n \times n matrix whose entries are i.i.d. Bernoulli random variables (each entry is equal to +1 or -1 with probabilities 1/2). Let P_n = P(\text{det}(M_n) = 0) be the probability that M_n is singular.

The Theorem: On 5th November 2004, Terence Tao and Van Vu submitted to the Journal of the AMS a paper in which they proved that P_n \leq \left(\frac{3}{4}+o(1)\right)^n as n\to\infty.

Short context: The question of estimating P_n has attracted considerable attention in literature. In 1967, Komlós proved that \lim\limits_{n\to\infty} P_n = 0. In 1995, Kahn, Komlós, and Szemerédi proved that P_n=O(0.999^n). This was later improved by Tao and Vu to P_n=O(0.958^n). The Theorem provides a much better upper bound.

Links: Free arxiv version of the original paper is here, journal version is here.

Go to the list of all theorems

Uncountable set of 2-generated groups with exactly 2 conjugacy classes

You need to know: Group, identity element e in a group, subgroup, isomorphic groups, conjugate elements in a group, conjugacy classes, countable and uncountable sets, notation |S| for the number of elements in finite set S.

Background: A generating set of group G is a subset S \subset G such that every g\in G can be written as a finite product of elements of S and their inverses. Group G is called finitely generated if it has finite generating set. In particular, it is called 2-generated if it has generating set S with |S|=2. We say that group H can be embedded into group G, if G has a subgroup isomorphic to H. The order of non-identity element a\in G is the lowest positive integer n such that a^n=e, with the convention that the order is infinite if no such n exists. For a group G, let \pi(G) be the set of all (finite) positive integers n such that there exists an element a \in G of order n.

The Theorem: On 2nd November 2004, Denis Osin submitted to arxiv a paper in which he proved, among other results, that any countable group H can be embedded into a 2-generated group G such that any two elements of the same order are conjugate in G and \pi(H) = \pi(G).

Short context: Applying the Theorem to any torsion-free group H (a group is called torsion-free if all non-identity elements in it have infinite order), we obtain that any countable torsion-free group can be embedded into a torsion-free 2-generated group with exactly 2 conjugacy classes. In turn, this implies that there exist uncountably many pairwise non-isomorphic torsion-free 2-generated groups with exactly 2 conjugacy classes. Before 2004, it was an open question whether there exists any finitely generated group G with |G|>2 which has exactly 2 conjugacy classes. The Theorem also implies that for any integer n \geq 2 there are uncountably many pairwise nonisomorphic finitely generated groups with exactly n conjugacy classes.

Links: Free arxiv version of the original paper is here, journal version is here. See also Section 10.10 of this book for an accessible description of the Theorem.

Go to the list of all theorems

Linear growth of the number of quintic fields with bounded discriminant

You need to know: Prime numbers, infinite product, field, isomorphic and non-isomorphic fields. Also, see this previous theorem description for the concepts of number field, degree of a number field, and discriminant of a number field.

Background: Number fields of degree n=5 are called quintic. For X>0, let N_n(X) denotes the number of non-isomorphic number fields of degree n with absolute value of the discriminant at most X. Also, let {\cal P} be the set of prime numbers.

The Theorem: On 29th September 2004, Manjul Bhargava submitted to the Annals of Mathematics a paper in which he proved that the limit \lim\limits_{X\to\infty}\frac{N_5(X)}{X} exists and is equal to c_5 = \frac{13}{120}\prod\limits_{p\in {\cal P}}\left(1+p^{-2}-p^{-4}-p^{-5}\right) = 0.149....

Short context: Counting number fields up to isomorphism is a basic and important open problem in the area. There is and old folklore conjecture that \lim\limits_{n\to\infty}\frac{N_n(X)}{X} = c_n>0 for every fixed n, but, before 2004, this was known only for n\leq 3 (see here for the best upper bounds for N_n(X) available for general n). In June 2004, Bhargava submitted a paper which proves this conjecture for n=4. The Theorem proves it for n=5 (quintic fields).

Links: Free arxiv version of the original paper is here, journal version is here. See also Section 10.11 of this book for an accessible description of the Theorem.

Go to the list of all theorems

Sparse vectors in R^m can be approximated using O(N log m) non-adaptive measurements

You need to know: Euclidean space {\mathbb R}^m, orthonormal basis in {\mathbb R}^m, inner product (x,y)=\sum\limits_{i=1}^m x_i y_i in {\mathbb R}^m, norm ||x||_2 = \sqrt{(x,x)}, infimum, supremum, big O notation.

Background: Let \psi_i, \, i=1,2,\dots,m be orthonormal basis in {\mathbb R}^m, and let X=\{x \in {\mathbb R}^m\,|\,\left(\sum_i|(x,\psi_i)|^p\right)^{1/p}\leq R\} (where 0<p \leq 1 and R>0 are constants) be the set of vectors whose coefficients (x,\psi_i) in this basis are not too large. Let I_n:X \to {\mathbb R}^n has the form I_n(x)=((x,\xi_1), \dots, (x,\xi_n)) for some vectors \xi_i \in {\mathbb R}^m, A_n: {\mathbb R}^n \to {\mathbb R}^m be arbitrary operator, and E(n,m,p,R) = \inf\limits_{A_n,I_n}\sup\limits_{x\in X}||x-A_n(I_n(x))||_2.

The Theorem: On 18th September 2004, David Donoho submitted to IEEE Transactions on Information Theory a paper in which he proved that for any 0<p \leq 1, A>0 and \gamma>1 there is a constant C_p=C_p(A,\gamma), such that E(n,m_n,p,R) \leq C_p R \left(\frac{n}{\log(m_n)}\right)^{\frac{1}{2}-\frac{1}{p}}, whenever m_n \sim A n^\gamma, \, n\to\infty.

Short context: Let us call vectors x\in X l_p-sparse. Such vectors arise in numerous applications, for example, x may represent an image (e.g. medical image) with m pixels. Vectors \xi_i \in {\mathbb R}^m represent measurements, I_n(x) the result of n measurements, A_n: {\mathbb R}^n \to {\mathbb R}^m is the procedure of reconstruction x from measurements, \sup\limits_{x\in X}||x-A_n(I_n(x))||_2 is the worst-case error of this procedure, and E(n,m,p,R) is the minimal worst-case error we can achieve using non-adaptive procedure, that is, the same for all x. The Theorem states that we can achieve O(N^{\frac{1}{2}-\frac{1}{p}}) error, where N=\frac{n}{\log(m_n)}, using just n=N\log m_n non-adaptive measurements. This methodology is called compressed sensing and has dramatic implications in, for example, fast reconstruction of medical images. The paper is one of the most cited (if not the most cited) mathematical paper of the 21st century.

Links: The original paper is available here.

Go to the list of all theorems

Minkowski’s conjecture is true in dimension n=6

You need to know: Euclidean space {\mathbb R}^n, basis for {\mathbb R}^n, inner product (x,y)=\sum\limits_{i=1}^n x_i y_i in {\mathbb R}^n, matrix, determinant of a matrix.

Background: A lattice L in {\mathbb R}^n is a set of the form L =\left\{\left.\sum\limits_{i=1}^n a_i v_i\,\right\vert\, a_i \in {\mathbb Z}\right\}, where v_1, \dots, v_n is a basis for {\mathbb R}^n. A lattice L is called unimodular if the determinant of the n \times n matrix with entries (v_i,v_j) is 1 or -1. This property does not depend on the choice of basis for L.

For x=(x_1, \dots, x_n) \in {\mathbb R}^n, denote N(x)=|x_1\cdot x_2 \cdot \dots \cdot x_n|.

The Theorem: On 27th August 2004, Curtis McMullen submitted to the journal of the AMS a paper in which he proved that for any unimodular lattice L \subset {\mathbb R}^6, and any x\in {\mathbb R}^6, there is a y\in L such that N(x - y) \leq 2^{-6}=\frac{1}{64}.

Short context: Minkowski’s conjecture states that for any unimodular lattice L \subset {\mathbb R}^n, we have \sup\limits_{x\in {\mathbb R}^n} \inf\limits_{y\in L} N(x - y) \leq 2^{-n}. The conjecture is interesting in its own, and also has number-theoretic consequences. Before 2004, it was known for n \leq 5. The Theorem proves it for n=6.

Links: The original paper is available here.

Go to the list of all theorems