A set of n < d exp(cd) i.i.d. Gaussian points in R^d is linearly separable

You need to know: Basic probability theory, random vector in {\mathbb R}^d, independent and identically distributed (i.i.d.) random vectors, Gaussian distribution in {\mathbb R}^d, covariance matrix, non-singular matrix, overwhelming probability, hyperplane in {\mathbb R}^d.

Background: Let X be a random point cloud consisting of n points x_i, i=1,\dots,n, sampled independently and identically from a Gaussian distribution in {\mathbb R}^d with a non-singular covariance matrix. The set X is called linearly separable (or 1-convex) if every point in it can be separated from the other points by a hyperplane.
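The definition can be checked mechanically: a point can be cut off from the rest by a hyperplane exactly when it does not lie in the convex hull of the remaining points, and membership in a convex hull is a linear-programming feasibility question. A minimal sketch in Python using NumPy and SciPy (the function names are illustrative, not from the paper):

```python
import numpy as np
from scipy.optimize import linprog

def is_separable(X, i):
    """Check whether point X[i] can be separated from the other points
    by a hyperplane, i.e. whether X[i] is NOT in the convex hull of the
    remaining points (tested as an LP feasibility problem)."""
    others = np.delete(X, i, axis=0)  # shape (n-1, d)
    m = others.shape[0]
    # Seek lambda >= 0 with sum(lambda) = 1 and others^T lambda = X[i].
    A_eq = np.vstack([others.T, np.ones(m)])  # (d+1, m)
    b_eq = np.append(X[i], 1.0)
    res = linprog(c=np.zeros(m), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * m, method="highs")
    return not res.success  # infeasible LP => X[i] lies outside the hull

def is_linearly_separable(X):
    """True iff every point of X is separable from the others."""
    return all(is_separable(X, i) for i in range(len(X)))
```

For example, the vertices of a triangle in the plane form a linearly separable set, but adding a point inside the triangle destroys separability.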

The Theorem: On 5 July 2006, David Donoho and Jared Tanner submitted to the Journal of the AMS a paper in which they proved, among other results, that for any \epsilon>0, with overwhelming probability for large d (with d<n), every subset of X of size at most \frac{d}{2e \log(n/d)}(1-\epsilon) can be separated from the other points of X by a hyperplane.

Short context: If you select n random points in the plane, then, unless n is very small, you are likely to find a point that lies in the convex hull of the other points and therefore cannot be separated from them by a line. The theorem states that (at least for the Gaussian distribution) the situation in high dimension is dramatically different: with very high probability, every group of \frac{d}{2e \log(n/d)}(1-\epsilon) points can be separated from the others by a hyperplane. In particular, a set of n points is linearly separable if \frac{d}{2e \log(n/d)}(1-\epsilon)>1, that is, if n < d \exp(cd) for some c>0. The theorem has many applications, for example to efficient error-correction mechanisms and sparse signal recovery.
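The threshold in the formula above is easy to evaluate numerically. A small sketch (the helper name is my own; the expression is exactly the bound from the theorem, with \epsilon fixed at 0.1 for illustration):

```python
import math

def separable_subset_size(n, d, eps=0.1):
    """Size bound from the theorem: with overwhelming probability, every
    subset of at most this many of the n Gaussian points in R^d can be
    cut off from the rest by a hyperplane (assumes n > d)."""
    return d / (2 * math.e * math.log(n / d)) * (1 - eps)
```

For n = 2d the bound is roughly 0.24d, so for moderate d it far exceeds 1 and the whole cloud is linearly separable; once n grows like d\exp(cd), the bound drops below 1 and the guarantee is lost.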

Links: A free arXiv version of the original paper is here; the journal version is here.
