
Learning Low-dimensional Manifolds

(October 2009)

Abstract

Many real-world datasets can be characterized as follows: the "extrinsic dimension" of the data is high, but the "intrinsic dimension" is low. Consider for example the data generated by a motion capture device. Such a device typically tracks a few hundred dots located on a special suit worn by the tracked person. Each time point corresponds to a vector consisting of the (x,y,z) location of each dot. The extrinsic dimension of these vectors is thus around one thousand. However, the vectors are highly constrained because the dots are placed on a human body that has only a limited number of degrees of freedom. We say that the "intrinsic dimension" is the number of degrees of freedom of the data. We are interested in learning algorithms whose performance scales with the intrinsic dimension of the data. We present the random projection trees algorithm, which has this type of performance. Moreover, the algorithm is very efficient computationally and can be performed in a streaming fashion where each data point is seen only once. This is joint work with Sanjoy Dasgupta.
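The core split rule behind random projection trees is simple: project the points at a node onto a randomly chosen direction and split near the median of the projections. The sketch below (Python with NumPy; all function and parameter names are ours for illustration) shows that basic rule; the published algorithm adds refinements such as perturbed split points and alternative split criteria, which are omitted here.

```python
import numpy as np

def build_rptree(points, min_leaf_size=10, rng=None):
    """Recursively partition `points` (an n x d array) with random-projection splits.

    Each internal node projects the data onto a random unit direction and
    splits at the median of the projected values. Returns a nested dict.
    (Illustrative sketch only, not the authors' exact algorithm.)
    """
    rng = np.random.default_rng() if rng is None else rng
    n, d = points.shape
    if n <= min_leaf_size:
        return {"leaf": True, "points": points}

    # Draw a random direction uniformly from the unit sphere in R^d.
    direction = rng.normal(size=d)
    direction /= np.linalg.norm(direction)

    # Project every point onto that direction and split at the median.
    projections = points @ direction
    threshold = np.median(projections)
    left_mask = projections <= threshold

    # Guard against degenerate splits (all projections equal).
    if left_mask.all() or not left_mask.any():
        return {"leaf": True, "points": points}

    return {
        "leaf": False,
        "direction": direction,
        "threshold": threshold,
        "left": build_rptree(points[left_mask], min_leaf_size, rng),
        "right": build_rptree(points[~left_mask], min_leaf_size, rng),
    }

# Example: 1000 points lying on a one-dimensional curve embedded in R^50,
# i.e. high extrinsic dimension but intrinsic dimension one.
t = np.linspace(0, 1, 1000)
data = np.stack([np.sin(2 * np.pi * k * t) for k in range(50)], axis=1)
tree = build_rptree(data)
```

Because each split only requires a single pass over the points at a node (a dot product per point plus a median), the same rule lends itself to streaming variants in which each data point is processed once.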
