Regular Paper

Random Subspace Sampling for Classification with Missing Data

State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210093, China

Abstract

Many real-world datasets suffer from missing values, so classification with missing data must be handled carefully: inadequate treatment of missing values can cause large errors. In this paper, we propose a random subspace sampling method, RSS, which samples missing items from the corresponding feature histogram distributions in random subspaces and is effective and efficient at different levels of missing data. Unlike most established approaches, RSS does not train on fixed imputed datasets. Instead, we design a dynamic training strategy in which the filled values change dynamically by resampling during training. Moreover, thanks to the sampling strategy, we design an ensemble testing strategy that combines the results of multiple runs of a single model, which is more efficient and resource-saving than previous ensemble methods. Finally, we combine these two strategies with the random subspace method, which makes our estimations more robust and accurate. The effectiveness of the proposed RSS method is validated by experimental studies.
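The abstract describes two of RSS's core ingredients: filling each missing entry by sampling from the corresponding feature's histogram of observed values, and averaging a single trained model's predictions over several independent fills at test time. The following is a minimal sketch of those two ideas only, under assumptions not stated in the abstract (a fixed bin count, uniform sampling within a chosen bin, and a generic `model` callable); the random subspace component and the paper's exact formulation are omitted and can be found in the full text.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_histograms(X, bins=10):
    """Per-feature histogram (probabilities, bin edges) of the observed values."""
    hists = []
    for j in range(X.shape[1]):
        col = X[:, j]
        obs = col[~np.isnan(col)]
        counts, edges = np.histogram(obs, bins=bins)
        hists.append((counts / counts.sum(), edges))
    return hists

def sample_fill(X, hists):
    """Fill each missing entry by sampling from its feature's histogram."""
    Xf = X.copy()
    for j, (probs, edges) in enumerate(hists):
        miss = np.isnan(Xf[:, j])
        if miss.any():
            # pick bins in proportion to their observed frequency,
            # then draw uniformly inside each chosen bin
            b = rng.choice(len(probs), size=miss.sum(), p=probs)
            Xf[miss, j] = rng.uniform(edges[b], edges[b + 1])
    return Xf

def ensemble_predict(model, X, hists, runs=10):
    """Average one model's outputs over several independent random fills."""
    return np.mean([model(sample_fill(X, hists)) for _ in range(runs)], axis=0)
```

During training, `sample_fill` would be called anew at each epoch so the filled values change dynamically rather than staying fixed, which is what distinguishes this scheme from imputing once and training on a static dataset.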

Electronic Supplementary Material

JCST-2105-11611-Highlights.pdf (340.6 KB)

Journal of Computer Science and Technology
Pages 472-486
Cite this article:
Cao Y-H, Wu J-X. Random Subspace Sampling for Classification with Missing Data. Journal of Computer Science and Technology, 2024, 39(2): 472-486. https://doi.org/10.1007/s11390-023-1611-9


Received: 26 May 2021
Accepted: 04 February 2023
Published: 30 March 2024
© Institute of Computing Technology, Chinese Academy of Sciences 2024