Open Access

Efficient Leave-One-Out Strategy for Supervised Feature Selection

National Laboratory for Information Science and Technology, Tsinghua University, Beijing 100084, China

Abstract

Feature selection is a key task in statistical pattern recognition. Most feature selection algorithms are based on specific objective functions that are usually intuitively reasonable but can sometimes be far removed from the more basic objectives of feature selection. This paper describes how to select features such that the basic objectives, e.g., classification or clustering accuracy, can be optimized in a more direct way. The analysis requires that the contribution of each feature to the evaluation metric can be quantitatively described by some score function. Motivated by the conditional independence structure in probabilistic distributions, the analysis uses a leave-one-out feature selection algorithm that provides an approximate solution. The leave-one-out algorithm improves the conventional greedy backward elimination algorithm by preserving more interactions among features during the selection process, so that the various feature selection objectives can be optimized in a unified way. Experiments on six real-world datasets with different feature evaluation metrics show that this algorithm outperforms popular feature selection algorithms in most situations.
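
For readers who want a concrete picture of the leave-one-out idea summarized above, the following Python sketch ranks features by how much a chosen evaluation metric degrades when each feature is left out of the full feature set, judging every feature in the context of all the others. This is only an illustration under simple assumptions (a hypothetical score_fn based on the cross-validated accuracy of a logistic-regression classifier), not the exact procedure published in this paper.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score


def loo_feature_ranking(X, y, n_keep, score_fn=None):
    """Keep the n_keep features whose removal hurts the evaluation metric most.

    Generic single-pass illustration of leave-one-out feature scoring; the
    score_fn argument is a hypothetical stand-in for any metric that scores
    a feature subset, not the score function defined in the paper.
    """
    if score_fn is None:
        # Default assumption: 5-fold cross-validated accuracy of a linear classifier.
        def score_fn(cols):
            return cross_val_score(
                LogisticRegression(max_iter=1000), X[:, cols], y, cv=5
            ).mean()

    all_feats = list(range(X.shape[1]))
    base = score_fn(all_feats)

    # Contribution of feature j = metric with j minus metric without j.
    contrib = {}
    for j in all_feats:
        rest = [k for k in all_feats if k != j]
        contrib[j] = base - score_fn(rest)

    # Rank by contribution and keep the strongest n_keep features.
    ranked = sorted(all_feats, key=lambda j: contrib[j], reverse=True)
    return sorted(ranked[:n_keep])


if __name__ == "__main__":
    # Toy usage: 200 samples, 20 features, only the first two are informative.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 20))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)
    print(loo_feature_ranking(X, y, n_keep=5))

Because each feature's score is computed while all other candidate features remain in the set, this kind of leave-one-out evaluation preserves more feature interactions than discarding features one at a time, which is the contrast the abstract draws with greedy backward elimination.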

Tsinghua Science and Technology
Pages 629-635
Cite this article:
Feng D, Chen F, Xu W. Efficient Leave-One-Out Strategy for Supervised Feature Selection. Tsinghua Science and Technology, 2013, 18(6): 629-635. https://doi.org/10.1109/TST.2013.6678908

Received: 15 October 2012
Revised: 05 June 2013
Accepted: 07 June 2013
Published: 06 December 2013
© The author(s) 2013