Regular Paper

Random Subspace Sampling for Classification with Missing Data

State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210093, China

Abstract

Many real-world datasets contain missing values, so classification with missing data must be handled carefully: inadequate treatment of missing values can cause large errors. In this paper, we propose a random subspace sampling method, RSS, which fills in missing items by sampling from the corresponding feature histogram distributions in random subspaces and is effective and efficient at different levels of missing data. Unlike most established approaches, RSS does not train on fixed imputed datasets. Instead, we design a dynamic training strategy in which the filled values change through resampling during training. Moreover, thanks to the sampling strategy, we design an ensemble testing strategy that combines the results of multiple runs of a single model, which is more efficient and resource-saving than previous ensemble methods. Finally, we combine these two strategies with the random subspace method, which makes our estimations more robust and accurate. Experimental studies validate the effectiveness of the proposed RSS method.
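
The ideas named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`fit_histograms`, `sample_fill`, `random_subspace`, `resampled_epochs`, `ensemble_predict`), the bin count, the uniform draw within a bin, and the way the subspace is chosen are all assumptions made for illustration. Missing entries are assumed to be encoded as NaN, and every feature is assumed to have at least one observed value.

```python
import numpy as np


def fit_histograms(X, bins=10):
    """Build one normalized histogram per feature from its observed values.

    Assumes missing entries are NaN and each column has >= 1 observed value.
    """
    hists = []
    for j in range(X.shape[1]):
        observed = X[~np.isnan(X[:, j]), j]
        counts, edges = np.histogram(observed, bins=bins)
        hists.append((counts / counts.sum(), edges))
    return hists


def sample_fill(X, hists, rng):
    """Return a copy of X in which every NaN is replaced by a random draw
    from the histogram of its feature (uniform within the chosen bin)."""
    filled = X.copy()
    for j, (probs, edges) in enumerate(hists):
        missing = np.isnan(X[:, j])
        n_missing = int(missing.sum())
        if n_missing == 0:
            continue
        bin_idx = rng.choice(len(probs), size=n_missing, p=probs)
        filled[missing, j] = rng.uniform(edges[bin_idx], edges[bin_idx + 1])
    return filled


def random_subspace(n_features, k, rng):
    """Pick k distinct feature indices at random (hypothetical helper)."""
    return np.sort(rng.choice(n_features, size=k, replace=False))


def resampled_epochs(X, hists, n_epochs, rng):
    """Dynamic training idea: yield a freshly resampled fill for each epoch
    instead of training on one fixed imputed dataset."""
    for _ in range(n_epochs):
        yield sample_fill(X, hists, rng)


def ensemble_predict(predict_proba, X, hists, rng, n_runs=10):
    """Ensemble testing idea: run ONE model on several resampled fills of the
    test data and average the class probabilities before taking the argmax."""
    probs = [predict_proba(sample_fill(X, hists, rng)) for _ in range(n_runs)]
    return np.mean(probs, axis=0).argmax(axis=1)
```

In a training loop, each array yielded by `resampled_epochs` would be restricted to the columns returned by `random_subspace` and passed to whatever classifier is being trained; `predict_proba` stands in for that trained model and is expected to return an array of shape `(n_samples, n_classes)`.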

Electronic Supplementary Material

JCST-2105-11611-Highlights.pdf (340.6 KB)
Journal of Computer Science and Technology
Pages 472-486
Cite this article:
Cao Y-H, Wu J-X. Random Subspace Sampling for Classification with Missing Data. Journal of Computer Science and Technology, 2024, 39(2): 472-486. https://doi.org/10.1007/s11390-023-1611-9

Received: 26 May 2021
Accepted: 04 February 2023
Published: 30 March 2024
© Institute of Computing Technology, Chinese Academy of Sciences 2024