L1-regularisation recovers sparse vectors in R^m from n<<m inaccurate measurements

You need to know: Euclidean space ${\mathbb R}^m$ , norms $||x||_2 = \sqrt{\sum\limits_{i=1}^m x_i^2}$ and $||x||_1 = \sum\limits_{i=1}^m |x_i|$ in ${\mathbb R}^m$ , support of vector $x \in {\mathbb R}^m$ (subset $T\subset \{1,\dots,m\}$ such that $x_i=0$ for all $i \not\in T$ ), $n \times m$ matrix, matrix multiplication, big O notation.

Background: For $x_0 \in {\mathbb R}^m$ let $y=Ax_0+e$ , where A is $n \times m$ matrix ( $n<m$ ) and $e\in {\mathbb R}^n$ . Let $x^*$ be a solution to the problem $\min\limits_{x \in {\mathbb R}^m} ||x||_1$ subject to $||Ax-y||_2 \leq \epsilon$ for some $\epsilon \geq 0$ (this is called $l1$ -regularisation problem).

For $T\subset \{1,\dots,m\}$ , let $A_T$ be $n \times |T|$ submatrix of A formed from the columns of A corresponding to the indices in T. For $S>0$ , let $\delta_S$ be the smallest number such that inequalities $(1-\delta_S)||c||_2^2 \leq ||A_Tc||_2^2\leq (1+\delta_S)||c||_2^2$ hold for all T with $|T|\leq S$ and all $c\in{\mathbb R}^{|T|}$ .

The Theorem: On 3rd March 2005, Emmanuel Candes, Justin Romberg, and Terence Tao submitted to arxiv and Communications on Pure and Applied Mathematics a paper in which they proved the following result. Let A be $n \times m$ matrix ( $n<m$ ), and let S be such that $\delta_{3S}+3\delta_{4S} < 2$ . Then for any $x_0 \in {\mathbb R}^m$ supported on $T_0$ with $|T_0|\leq S$ and any $e\in {\mathbb R}^n$ such that $||e||_2 \leq \epsilon$ , we have $||x^*-x_0||_2 \leq C_S \cdot \epsilon$ , where the constant $C_S$ depends only on $\delta_{4S}$ . For example, $C_S \approx 8.82$ for $\delta_{4S}=\frac{1}{5}$ and $C_S \approx 10.47$ for $\delta_{4S}=\frac{1}{4}$ .

Short context: Vectors $x_0 \in {\mathbb R}^m$ with few non-zero entries are called sparse and may represent, for example, image or signal in various applications. Vector $y=Ax_0+e$ is the result of n measurements (given by rows of A) with error e. Given y, we then try to recover $x_0$ by solving the $l1$ -regularisation problem. The Theorem states that its solution $x^*$ is close to $x_0$ . In particular, if the measurements are exact ( $\epsilon=0$ ) then $x^* = x_0$ . The authors show that the condition $\delta_{3S}+3\delta_{4S} < 2$ in the Theorem holds for almost all matrices A with unit-normed columns if $n=O(S\log m)$ , hence, with high probability, we can recover $x_0$ after just $O(|T_0|\log m)$ random measurements. In an earlier work, Donoho proved the existence of a recovery procedure with similar properties, while the Theorem states that a concrete efficient algorithm ( $l1$ -regularisation) does the job, even for inaccurate measurements. The error $C_S \cdot \epsilon$ in the Theorem is essentially optimal.

Links: Free arxiv version of the original paper is here, journal version is here.

Go to the list of all theorems

Pólya-Vinogradov bound is not tight for characters of odd, bounded order

You need to know: Set ${\mathbb Z}$ of integers, greatest common divisor (gcd) of 2 integers, set ${\mathbb C}$ of complex numbers, $o(1)$ notation.

Background: Function $\chi_q:{\mathbb Z}\to{\mathbb C}$ is called a Dirichlet character modulo integer $q>0$ if (i) $\chi_q(n)=\chi_q(n+q)$ for all n, (ii) if $\text{gcd}(n,q) > 1$ then $\chi_q(n)=0$ ; if $\text{gcd}(n,q) = 1$ then $\chi_q(n) \neq 0$ , and (iii) $\chi_q(mn)=\chi_q(m)\chi_q(n)$ for all integers m and n. An example is character $\chi_q^*$ such that $\chi_q^*(n)=1$ whenever $\text{gcd}(n,q) = 1$ and $\chi_q^*(n)=0$ otherwise. This character is called principal and all other characters – nonprincipal. The order of a Dirichlet character $\chi_q$ is the least positive integer m such that $\chi_q(n)^m=\chi_q^*(n)$ for all n. The character $\chi_q$ is called primitive if there is no integer $0<d<q$ such that $\chi_q(a)=\chi_q(b)$ whenever $\text{gcd}(a,q) = \text{gcd}(b,q) = 1$ and $a-b$ is a multiple of d. For any nonprincipal character $\chi_q$ let $M(\chi_q) = \max\limits_{1\leq m\leq q}\left|\sum\limits_{n=1}^m \chi_q(n)\right|$ .

The Theorem: On 2nd March 2005, Andrew Granville and Kannan Soundararajan submitted to the Jornal of the AMS a paper in which they proved that that if $\chi_q$ is a primitive character modulo q of odd order g, then $M(\chi_q) \leq C_g \sqrt{q} (\log q)^{1-\frac{\delta_g}{2} + o(1)}$ where $\delta_g = 1 - \frac{g}{\pi}\sin\frac{\pi}{g}$ , and $C_g$ is a constant depending only on g.

Short context: In 1918, Pólya and Vinogradov independently proved that $M(\chi_q)=O(\sqrt{q}\log q)$ for any nonprincipal Dirichlet character $\chi_q$ . This is known as the Pólya–Vinogradov inequality, has numerous applications, but it is an open question whether a better bound for $M(\chi_q)$ in terms of q is possible. The Theorem provides an improved bound for characters of odd, bounded order.

Links: Free arxiv version of the original paper is here, journal version is here.

Go to the list of all theorems

There are asymptotically C n^(-7/2) g^n n! labelled planar graphs on n vertices

You need to know: Graph, planar graph, labelled graph, factorial $n!=1\cdot 2 \cdot \dots \cdot n$ .

Background: For positive integer n, let $g(n)$ be the number of labelled planar graphs on n vertices. We write $g(n) \sim f(n)$ if $\lim\limits_{n\to\infty}\frac{g(n)}{f(n)}=1$ .

The Theorem: On 18th January 2005, Omer Gimenez and Marc Noy submitted to arxiv a paper in which they proved that $g(n) \sim C n^{-7/2}\gamma^n n!$ , where $\gamma = 27.22688\dots$ and $C = 0.0000042609\dots$ are explicit constants.

Short context: How many planar graphs are there? It is not difficult to show the existence of limit $\gamma = \lim\limits_{n\to\infty} \left(\frac{g(n)}{n!}\right)^{1/n}$ , but much more difficult to actually compute this limit. Many researchers provided better and better lower and upper bounds for $\gamma$ , with the best (before 2005) bounds $\gamma > 26.18$ by Bender, Gao and Wormald and $\gamma < 30.06$ by Bonichon et.al. The Theorem provides an exact analytical expression for $\gamma$ , which allows numerical computation up to arbitrary precision. Moreover, it provides the exact asymptotic formula for $g(n)$ .

Links: Free arxiv version of the original paper is here, journal version is here.

Go to the list of all theorems

Undirected connectivity is in log-space

You need to know: Undirected graph, path connecting 2 vertices in a graph, algorithm, memory used by an algorithm, big O notation.

Background: The undirected connectivity problem has the following formulation: given as input an undirected graph G with n vertices and two vertices s and t in G, find out whether or not these two vertices are connected by a path in G.

The Theorem: In January 2008, Omer Reingold submitted to the Journal of the ACM a paper in which he presented an algorithm which can solve this problem using memory $O(\log n)$ .

Short context: Efficient memory use is one of the fundamental requirement for algorithms, almost as important and well-studied as running time. If the decision problem has an algorithm which solves it using memory $O(\log n)$ , where n is the size of (read-only) input, we say that this problem belongs to the class log-space. The Theorem proves that the undirected connectivity problem, which naturally arises in many applications, belongs to this class.

Links: The original paper is available here.

Go to the list of all theorems

The Grothendieck constant of the complete n-vertex graph is at least c log n

You need to know: Euclidean space ${\mathbb R}^n$ , inner product $(x,y)=\sum\limits_{i=1}^n x_i y_i$ in ${\mathbb R}^n$ , unit vector in ${\mathbb R}^n$ , matrix, big O notation.

Background: For a positive integer n, let $K_n$ be the minimal constant such that for every $n \times n$ matrix with entries $a_{ij}$ and every unit vectors $x_1,\dots, x_n$ in ${\mathbb R}^n$ there are signs $e_1,\dots,e_n \in \{-1, +1\}$ for which $\sum\limits_{i \neq j} a_{ij}(x_i, x_j) \leq K_n \sum\limits_{i \neq j} a_{ij}e_ie_j$ . Constant $K_n$ is called the Grothendieck constant of the complete n-vertex graph, by analogy with the Grothendieck constant $K_G$ , see here.

The Theorem: On 8th November 2004, Noga Alon, Konstantin Makarychev, Yury Makarychev, and Assaf Naor submitted to Inventiones mathematicae a paper in which they proved, among other results, the existence of a constant $c>0$ such that $K_n \geq c \log n$ for all n.

Short context: In 1999, Nemirovski, Roos, and Terlaky proved that $K_n \leq C\log n$ for some constant C. Later, Megretski asked if this can be improved to $K_n \leq C$ . In 2004, Charikar and Wirth derived from inequality $K_n \leq C\log n$ various algorithms for combinatorial problems, and noticed that better upper bound on $K_n$ could lead to algorithms with better accuracy. The Theorem, however, proves that inequality $K_n \leq C\log n$ is optimal up to the constant factor.

Links: The original paper is available here.

Go to the list of all theorems

Random Bernoulli n by n matrix is singular with probability at most (3/4+o(1))^n

You need to know: Matrix, determinant $\text{det}(M)$ of square matrix M, singular square matrix (matrix with determinant $0$ ), basic probability theory, independent and identically distributed (i.i.d.) random variables, $o(1)$ notation.

Background: Let $M_n$ be a random $n \times n$ matrix whose entries are i.i.d. Bernoulli random variables (each entry is equal to $+1$ or $-1$ with probabilities $1/2$ ). Let $P_n = P(\text{det}(M_n) = 0)$ be the probability that $M_n$ is singular.

The Theorem: On 5th November 2004, Terence Tao and Van Vu submitted to the Journal of the AMS a paper in which they proved that $P_n \leq \left(\frac{3}{4}+o(1)\right)^n$ as $n\to\infty$ .

Short context: The question of estimating $P_n$ has attracted considerable attention in literature. In 1967, Komlós proved that $\lim\limits_{n\to\infty} P_n = 0$ . In 1995, Kahn, Komlós, and Szemerédi proved that $P_n=O(0.999^n)$ . This was later improved by Tao and Vu to $P_n=O(0.958^n)$ . The Theorem provides a much better upper bound.

Links: Free arxiv version of the original paper is here, journal version is here.

Go to the list of all theorems

Uncountable set of 2-generated groups with exactly 2 conjugacy classes

You need to know: Group, identity element $e$ in a group, subgroup, isomorphic groups, conjugate elements in a group, conjugacy classes, countable and uncountable sets, notation $|S|$ for the number of elements in finite set S.

Background: A generating set of group G is a subset $S \subset G$ such that every $g\in G$ can be written as a finite product of elements of S and their inverses. Group G is called finitely generated if it has finite generating set. In particular, it is called 2-generated if it has generating set S with $|S|=2$ . We say that group H can be embedded into group G, if G has a subgroup isomorphic to H. The order of non-identity element $a\in G$ is the lowest positive integer n such that $a^n=e$ , with the convention that the order is infinite if no such n exists. For a group G, let $\pi(G)$ be the set of all (finite) positive integers n such that there exists an element $a \in G$ of order n.

The Theorem: On 2nd November 2004, Denis Osin submitted to arxiv a paper in which he proved, among other results, that any countable group H can be embedded into a 2-generated group G such that any two elements of the same order are conjugate in G and $\pi(H) = \pi(G)$ .

Short context: Applying the Theorem to any torsion-free group H (a group is called torsion-free if all non-identity elements in it have infinite order), we obtain that any countable torsion-free group can be embedded into a torsion-free 2-generated group with exactly 2 conjugacy classes. In turn, this implies that there exist uncountably many pairwise non-isomorphic torsion-free 2-generated groups with exactly 2 conjugacy classes. Before 2004, it was an open question whether there exists any finitely generated group G with $|G|>2$ which has exactly 2 conjugacy classes. The Theorem also implies that for any integer $n \geq 2$ there are uncountably many pairwise nonisomorphic finitely generated groups with exactly n conjugacy classes.

Links: Free arxiv version of the original paper is here, journal version is here. See also Section 10.10 of this book for an accessible description of the Theorem.

Go to the list of all theorems

Linear growth of the number of quintic fields with bounded discriminant

You need to know: Prime numbers, infinite product, field, isomorphic and non-isomorphic fields. Also, see this previous theorem description for the concepts of number field, degree of a number field, and discriminant of a number field.

Background: Number fields of degree $n=5$ are called quintic. For $X>0$ , let $N_n(X)$ denotes the number of non-isomorphic number fields of degree n with absolute value of the discriminant at most X. Also, let ${\cal P}$ be the set of prime numbers.

The Theorem: On 29th September 2004, Manjul Bhargava submitted to the Annals of Mathematics a paper in which he proved that the limit $\lim\limits_{X\to\infty}\frac{N_5(X)}{X}$ exists and is equal to $c_5 = \frac{13}{120}\prod\limits_{p\in {\cal P}}\left(1+p^{-2}-p^{-4}-p^{-5}\right) = 0.149...$ .

Short context: Counting number fields up to isomorphism is a basic and important open problem in the area. There is and old folklore conjecture that $\lim\limits_{n\to\infty}\frac{N_n(X)}{X} = c_n>0$ for every fixed n, but, before 2004, this was known only for $n\leq 3$ (see here for the best upper bounds for $N_n(X)$ available for general n). In June 2004, Bhargava submitted a paper which proves this conjecture for $n=4$ . The Theorem proves it for $n=5$ (quintic fields).

Links: Free arxiv version of the original paper is here, journal version is here. See also Section 10.11 of this book for an accessible description of the Theorem.

Go to the list of all theorems

Sparse vectors in R^m can be approximated using O(N log m) non-adaptive measurements

You need to know: Euclidean space ${\mathbb R}^m$ , orthonormal basis in ${\mathbb R}^m$ , inner product $(x,y)=\sum\limits_{i=1}^m x_i y_i$ in ${\mathbb R}^m$ , norm $||x||_2 = \sqrt{(x,x)}$ , infimum, supremum, big O notation.

Background: Let $\psi_i, \, i=1,2,\dots,m$ be orthonormal basis in ${\mathbb R}^m$ , and let $X=\{x \in {\mathbb R}^m\,|\,\left(\sum_i|(x,\psi_i)|^p\right)^{1/p}\leq R\}$ (where $0<p \leq 1$ and $R>0$ are constants) be the set of vectors whose coefficients $(x,\psi_i)$ in this basis are not too large. Let $I_n:X \to {\mathbb R}^n$ has the form $I_n(x)=((x,\xi_1), \dots, (x,\xi_n))$ for some vectors $\xi_i \in {\mathbb R}^m$ , $A_n: {\mathbb R}^n \to {\mathbb R}^m$ be arbitrary operator, and $E(n,m,p,R) = \inf\limits_{A_n,I_n}\sup\limits_{x\in X}||x-A_n(I_n(x))||_2.$

The Theorem: On 18th September 2004, David Donoho submitted to IEEE Transactions on Information Theory a paper in which he proved that for any $0<p \leq 1$ , $A>0$ and $\gamma>1$ there is a constant $C_p=C_p(A,\gamma)$ , such that $E(n,m_n,p,R) \leq C_p R \left(\frac{n}{\log(m_n)}\right)^{\frac{1}{2}-\frac{1}{p}},$ whenever $m_n \sim A n^\gamma, \, n\to\infty$ .

Short context: Let us call vectors $x\in X$ $l_p$ -sparse. Such vectors arise in numerous applications, for example, x may represent an image (e.g. medical image) with m pixels. Vectors $\xi_i \in {\mathbb R}^m$ represent measurements, $I_n(x)$ the result of n measurements, $A_n: {\mathbb R}^n \to {\mathbb R}^m$ is the procedure of reconstruction x from measurements, $\sup\limits_{x\in X}||x-A_n(I_n(x))||_2$ is the worst-case error of this procedure, and $E(n,m,p,R)$ is the minimal worst-case error we can achieve using non-adaptive procedure, that is, the same for all x. The Theorem states that we can achieve $O(N^{\frac{1}{2}-\frac{1}{p}})$ error, where $N=\frac{n}{\log(m_n)}$ , using just $n=N\log m_n$ non-adaptive measurements. This methodology is called compressed sensing and has dramatic implications in, for example, fast reconstruction of medical images. The paper is one of the most cited (if not the most cited) mathematical paper of the 21st century.

Links: The original paper is available here.

Go to the list of all theorems

Minkowski’s conjecture is true in dimension n=6

You need to know: Euclidean space ${\mathbb R}^n$ , basis for ${\mathbb R}^n$ , inner product $(x,y)=\sum\limits_{i=1}^n x_i y_i$ in ${\mathbb R}^n$ , matrix, determinant of a matrix.

Background: A lattice $L$ in ${\mathbb R}^n$ is a set of the form $L =\left\{\left.\sum\limits_{i=1}^n a_i v_i\,\right\vert\, a_i \in {\mathbb Z}\right\}$ , where $v_1, \dots, v_n$ is a basis for ${\mathbb R}^n$ . A lattice $L$ is called unimodular if the determinant of the $n \times n$ matrix with entries $(v_i,v_j)$ is $1$ or $-1$ . This property does not depend on the choice of basis for L.

For $x=(x_1, \dots, x_n) \in {\mathbb R}^n$ , denote $N(x)=|x_1\cdot x_2 \cdot \dots \cdot x_n|$ .

The Theorem: On 27th August 2004, Curtis McMullen submitted to the journal of the AMS a paper in which he proved that for any unimodular lattice $L \subset {\mathbb R}^6$ , and any $x\in {\mathbb R}^6$ , there is a $y\in L$ such that $N(x - y) \leq 2^{-6}=\frac{1}{64}$ .

Short context: Minkowski’s conjecture states that for any unimodular lattice $L \subset {\mathbb R}^n$ , we have $\sup\limits_{x\in {\mathbb R}^n} \inf\limits_{y\in L} N(x - y) \leq 2^{-n}$ . The conjecture is interesting in its own, and also has number-theoretic consequences. Before 2004, it was known for $n \leq 5$ . The Theorem proves it for $n=6$ .

Links: The original paper is available here.

Go to the list of all theorems