Cover's theorem

Cover's Theorem is a statement in computational learning theory and is one of the primary theoretical motivations for the use of non-linear kernel methods in machine learning applications. The theorem states that given a set of training data that is not linearly separable, one can with high probability transform it into a training set that is linearly separable by projecting it into a higher-dimensional space via some non-linear transformation. The theorem is named after the information theorist Thomas M. Cover who stated it in 1965.

The proof is easy. A deterministic mapping may be used. Indeed, suppose there are samples. Lift them onto the vertices of the simplex in the dimensional real space. Every partition of the samples into two sets is separable by a linear separator. QED.

A complex pattern-classification problem, cast in a high-dimensional space nonlinearly, is more likely to be linearly separable than in a low-dimensional space, provided that the space is not densely populated.
Cover, T.M., Geometrical and Statistical properties of systems of linear inequalities with applications in pattern recognition, 1965

References


    This article is issued from Wikipedia - version of the 11/27/2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.