[1]
I. Guyon and A. Elisseeff, An introduction to variable and feature selection, J. Mach. Learn. Res., vol. 3, nos. 7-8, pp. 1157-1182, Oct. 2003.
[2]
H. Peng, F. Long, and C. Ding, Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy, IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, no. 8, pp. 1226-1238, Aug. 2005.
[3]
C. M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006.
[4]
J. Li and M. Sun, Non-independent term selection for Chinese text categorization, Tsinghua Science and Technology, vol.14, no.1, pp. 113-120, Feb. 2009.
[5]
F. Nie, H. Huang, X. Cai, and C. Ding, Efficient and robust feature selection via joint ℓ2,1-norms minimization, in Advances in Neural Information Processing Systems 23, Vancouver, BC, Canada, 2010, pp. 1813-1821.
[6]
Z. Zhao, L. Wang, H. Liu, and J. Ye, On similarity preserving feature selection, IEEE Trans. Knowl. Data Eng., vol. 25, no. 3, pp. 619-632, Mar. 2013.
[7]
C. Hou, F. Nie, D. Yi, and Y. Wu, Feature selection via joint embedding learning and sparse regression, in Proc. 22nd Int. Joint Conf. on Artificial Intelligence, Barcelona, Spain, 2011, pp. 1324-1329.
[8]
S. Xiang, F. Nie, G. Meng, C. Pan, and C. Zhang, Discriminative least squares regression for multiclass classification and feature selection, IEEE Trans. Neural Netw. Learn. Syst., vol. 23, no. 11, pp. 1738-1754, Nov. 2012.
[9]
M. Belkin and P. Niyogi, Laplacian eigenmaps for dimensionality reduction and data representation, Neural Comput., vol. 15, no. 6, pp. 1373-1396, Jun. 2003.
[10]
S. Roweis and L. Saul, Nonlinear dimensionality reduction by locally linear embedding, Science, vol. 290, no. 5500, pp. 2323-2326, Dec. 2000.
[11]
R. Duda, P. Hart, and D. Stork, Pattern Classification, Wiley-Interscience, 2001.
[12]
X. He, D. Cai, and P. Niyogi, Laplacian score for feature selection, in Advances in Neural Information Processing Systems 18, Vancouver, BC, Canada, 2006, pp. 507-514.
[13]
I. Kononenko, Estimating attributes: Analysis and extensions of Relief, in Machine Learning: ECML-94, Springer, 1994, pp. 171-182.
[14]
Z. Zhao and H. Liu, Spectral feature selection for supervised and unsupervised learning, in Proc. 24th Int. Conf. on Machine Learning, Corvallis, USA, 2007, pp. 1151-1157.
[15]
F. Nie, S. Xiang, Y. Jia, C. Zhang, and S. Yan, Trace ratio criterion for feature selection, in Proc. 23rd AAAI Conf. on Artificial Intelligence, Chicago, USA, 2008, pp. 671-676.
[16]
D. Cai, C. Zhang, and X. He, Unsupervised feature selection for multi-cluster data, in Proc. 16th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, Washington DC, USA, 2010, pp. 333-342.
[17]
J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1988.
[18]
Z. Zhao, L. Wang, and H. Liu, Efficient spectral feature selection with minimum redundancy, in Proc. 24th AAAI Conf. on Artificial Intelligence, Atlanta, USA, 2010, pp. 673-678.