Subspace approximation with outliers

A. Deshpande and R. Pratap. (2020). arXiv:2006.16573.

Abstract

The subspace approximation problem with outliers, for given $n$ points in $d$ dimensions $x_1, \ldots, x_n \in R^d$, an integer $1 \leq k \leq d$, and an outlier parameter $0 \leq \alpha \leq 1$, is to find a $k$-dimensional linear subspace of $R^d$ that minimizes the sum of squared distances to its nearest $(1-\alpha)n$ points. More generally, the $\ell_p$ subspace approximation problem with outliers minimizes the sum of $p$-th powers of distances instead of the sum of squared distances. Even the case of robust PCA is non-trivial, and previous work requires additional assumptions on the input. Any multiplicative approximation algorithm for the subspace approximation problem with outliers must solve the robust subspace recovery problem, a special case in which the $(1-\alpha)n$ inliers in the optimal solution are promised to lie exactly on a $k$-dimensional linear subspace. However, robust subspace recovery is Small Set Expansion (SSE)-hard. We show how to extend dimension reduction techniques and bi-criteria approximations based on sampling to the problem of subspace approximation with outliers. To get around the SSE-hardness of robust subspace recovery, we assume that the squared distance error of the optimal $k$-dimensional subspace summed over the optimal $(1-\alpha)n$ inliers is at least $\delta$ times its squared-error summed over all $n$ points, for some $0 < \delta \leq 1 - \alpha$. With this assumption, we give an efficient algorithm to find a subset of $poly(k/\epsilon) \log(1/\delta) \log\log(1/\delta)$ points whose span contains a $k$-dimensional subspace that gives a multiplicative $(1+\epsilon)$-approximation to the optimal solution. The running time of our algorithm is linear in $n$ and $d$. Interestingly, our results hold even when the fraction of outliers $\alpha$ is large, as long as the obvious condition $0 < \delta \leq 1 - \alpha$ is satisfied.
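To make the objective in the abstract concrete, the following is a minimal sketch of how the cost of one candidate subspace could be evaluated: compute each point's distance to the subspace, keep the nearest $(1-\alpha)n$ points, and sum the $p$-th powers of their distances. It is not the authors' algorithm, which searches for a good subspace; it only scores a given one. The function names, the orthonormal-basis representation of the subspace, and the rounding of $(1-\alpha)n$ to an integer are assumptions made for illustration.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance_to_subspace(x: Vector, basis: List[Vector]) -> float:
    """Squared Euclidean distance from x to span(basis).

    Assumption: `basis` is an orthonormal basis of the subspace, so the
    squared distance is ||x||^2 minus the squared length of the projection.
    """
    norm_sq = sum(v * v for v in x)
    proj_sq = sum(sum(xi * bi for xi, bi in zip(x, b)) ** 2 for b in basis)
    # Clamp at zero to guard against tiny negative values from rounding.
    return max(norm_sq - proj_sq, 0.0)


def outlier_cost(points: List[Vector], basis: List[Vector],
                 alpha: float, p: float = 2.0) -> float:
    """Sum of p-th powers of distances over the nearest (1 - alpha) * n points.

    Assumption: (1 - alpha) * n is rounded to the nearest integer; the
    abstract treats it as an integer count of inliers.
    """
    n = len(points)
    num_inliers = int(round((1.0 - alpha) * n))
    # distance^p equals (squared distance)^(p / 2).
    costs = sorted(
        squared_distance_to_subspace(x, basis) ** (p / 2.0) for x in points
    )
    return sum(costs[:num_inliers])


# Three points on the x-axis and one outlier far from it, scored against
# the 1-dimensional subspace spanned by the x-axis.
points = [(1.0, 0.0), (2.0, 0.0), (0.0, 3.0), (-1.0, 0.0)]
x_axis = [(1.0, 0.0)]
print(outlier_cost(points, x_axis, alpha=0.25))  # outlier excluded
print(outlier_cost(points, x_axis, alpha=0.0))   # outlier included
```

In this toy input the three inliers lie exactly on the subspace, which is the robust subspace recovery special case described above: the cost over the inliers is zero, so the ratio of inlier cost to total cost is zero and no $\delta > 0$ satisfies the assumption for this subspace.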

Description

[2006.16573] Subspace approximation with outliers

Links and resources

BibTeX key: deshpande2020subspace
entry type: misc
year: 2020
url: http://arxiv.org/abs/2006.16573
note: cite arxiv:2006.16573


Cite this publication

%0 Generic
%1 deshpande2020subspace
%A Deshpande, Amit
%A Pratap, Rameshwar
%D 2020
%K 2020 approximation linear-algebra
%T Subspace approximation with outliers
%U http://arxiv.org/abs/2006.16573
%X The subspace approximation problem with outliers, for given $n$ points in $d$ dimensions $x_1, \ldots, x_n \in R^d$, an integer $1 \leq k \leq d$, and an outlier parameter $0 \leq \alpha \leq 1$, is to find a $k$-dimensional linear subspace of $R^d$ that minimizes the sum of squared distances to its nearest $(1-\alpha)n$ points. More generally, the $\ell_p$ subspace approximation problem with outliers minimizes the sum of $p$-th powers of distances instead of the sum of squared distances. Even the case of robust PCA is non-trivial, and previous work requires additional assumptions on the input. Any multiplicative approximation algorithm for the subspace approximation problem with outliers must solve the robust subspace recovery problem, a special case in which the $(1-\alpha)n$ inliers in the optimal solution are promised to lie exactly on a $k$-dimensional linear subspace. However, robust subspace recovery is Small Set Expansion (SSE)-hard. We show how to extend dimension reduction techniques and bi-criteria approximations based on sampling to the problem of subspace approximation with outliers. To get around the SSE-hardness of robust subspace recovery, we assume that the squared distance error of the optimal $k$-dimensional subspace summed over the optimal $(1-\alpha)n$ inliers is at least $\delta$ times its squared-error summed over all $n$ points, for some $0 < \delta \leq 1 - \alpha$. With this assumption, we give an efficient algorithm to find a subset of $poly(k/\epsilon) \log(1/\delta) \log\log(1/\delta)$ points whose span contains a $k$-dimensional subspace that gives a multiplicative $(1+\epsilon)$-approximation to the optimal solution. The running time of our algorithm is linear in $n$ and $d$. Interestingly, our results hold even when the fraction of outliers $\alpha$ is large, as long as the obvious condition $0 < \delta \leq 1 - \alpha$ is satisfied.

@misc{deshpande2020subspace, abstract = {The subspace approximation problem with outliers, for given $n$ points in $d$ dimensions $x_{1},\ldots, x_{n} \in R^{d}$, an integer $1 \leq k \leq d$, and an outlier parameter $0 \leq \alpha \leq 1$, is to find a $k$-dimensional linear subspace of $R^{d}$ that minimizes the sum of squared distances to its nearest $(1-\alpha)n$ points. More generally, the $\ell_{p}$ subspace approximation problem with outliers minimizes the sum of $p$-th powers of distances instead of the sum of squared distances. Even the case of robust PCA is non-trivial, and previous work requires additional assumptions on the input. Any multiplicative approximation algorithm for the subspace approximation problem with outliers must solve the robust subspace recovery problem, a special case in which the $(1-\alpha)n$ inliers in the optimal solution are promised to lie exactly on a $k$-dimensional linear subspace. However, robust subspace recovery is Small Set Expansion (SSE)-hard. We show how to extend dimension reduction techniques and bi-criteria approximations based on sampling to the problem of subspace approximation with outliers. To get around the SSE-hardness of robust subspace recovery, we assume that the squared distance error of the optimal $k$-dimensional subspace summed over the optimal $(1-\alpha)n$ inliers is at least $\delta$ times its squared-error summed over all $n$ points, for some $0 < \delta \leq 1 - \alpha$. With this assumption, we give an efficient algorithm to find a subset of $poly(k/\epsilon) \log(1/\delta) \log\log(1/\delta)$ points whose span contains a $k$-dimensional subspace that gives a multiplicative $(1+\epsilon)$-approximation to the optimal solution. The running time of our algorithm is linear in $n$ and $d$. 
Interestingly, our results hold even when the fraction of outliers $\alpha$ is large, as long as the obvious condition $0 < \delta \leq 1 - \alpha$ is satisfied.}, added-at = {2020-07-06T11:13:56.000+0200}, author = {Deshpande, Amit and Pratap, Rameshwar}, biburl = {https://www.bibsonomy.org/bibtex/29aae7e15d501af138f63a9447fd51149/analyst}, description = {[2006.16573] Subspace approximation with outliers}, interhash = {dba21b7f543d6226f5d6513a3f79009c}, intrahash = {9aae7e15d501af138f63a9447fd51149}, keywords = {2020 approximation linear-algebra}, note = {cite arxiv:2006.16573}, timestamp = {2020-07-06T11:13:56.000+0200}, title = {Subspace approximation with outliers}, url = {http://arxiv.org/abs/2006.16573}, year = 2020 }
