Subspace clustering methods which embrace a self-expressive model that represents each data point as a linear combination of other data points in the dataset provide powerful unsupervised learning techniques. However, when dealing with large datasets, representation of each data point by referring to all data points via a dictionary suffers from high computational complexity. To alleviate this issue, we introduce a parallelizable multi-subset based self-expressive model (PMS) which represents each data point by combining multiple subsets, with each consisting of only a small proportion of the samples. The adoption of PMS in subspace clustering (PMSSC) leads to computational advantages because the optimization problems decomposed over each subset are small, and can be solved efficiently in parallel. Furthermore, PMSSC is able to combine multiple self-expressive coefficient vectors obtained from subsets, which contributes to an improvement in self-expressiveness. Extensive experiments on synthetic and real-world datasets show the efficiency and effectiveness of our approach in comparison to other methods.
- Article type
- Year
- Co-author
In this paper we address the problem ofgeometric multi-model fitting using a few weakly annotated data points, which has been little studied so far. In weak annotating (WA), most manual annotations are supposed to be correct yet inevitably mixed with incorrect ones. Such WA data can naturally arise through interaction in various tasks. For example, in the case of homography estimation, one can easily annotate points on the same plane or object with a single label by observing the image. Motivated by this, we propose a novel method to make full use of WA data to boost multi-model fitting performance. Specifically, a graph for model proposal sampling is first constructed using the WA data, given the prior that WA data annotated with the same weak label has a high probability of belonging to the same model. By incorporating this prior knowledge into the calculation of edge probabilities, vertices (i.e., data points) lying on or near the latent model are likely to be associated and further form a subset or cluster for effective proposal generation. Having generated proposals,