Identification of data mining research frontier based on conference papers

Yue Huang¹, Hu Liu², Jing Pan³

School of Information Science, Beijing Language and Culture University, Beijing, China

International Business School, Beijing Foreign Studies University, Beijing, China

School of Economics and Management, University of Science and Technology Beijing, Beijing, China

Abstract

Purpose

Identifying the frontiers of a research field is one of the most basic tasks in bibliometrics. Research published in leading conferences is crucial to the data mining community, yet few studies have focused on it. The purpose of this study is to detect the intellectual structure of data mining based on conference papers.

Design/methodology/approach

This study takes as its sample the papers of the nine authoritative data mining conferences ranked by Google Scholar Metrics. Based on paper counts, it first examines the annual number of published documents and their distribution across conferences. Then, from the perspective of keywords, CiteSpace is used to analyze the conference papers and identify the frontiers of data mining, focusing on keyword term frequency, keyword betweenness centrality, keyword clustering and burst keywords.

Findings

The results showed that research activity in data mining followed a linear upward trend between 2007 and 2016. Frontier identification based on the conference papers revealed five research hotspots in data mining: clustering, classification, recommendation, social network analysis and community detection. The research content of the conference papers was also very rich.

Originality/value

This study detected the research frontier from leading data mining conference papers. Based on the keyword co-occurrence network, and along four dimensions (keyword term frequency, betweenness centrality, clustering analysis and burst analysis), this paper identified and analyzed the research frontiers of the data mining discipline from 2007 to 2016.
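To make two of these dimensions concrete, the following is a minimal sketch of keyword term frequency and betweenness centrality on a keyword co-occurrence network. It is not CiteSpace's implementation and does not use the study's data: the `papers` list is hypothetical toy data, the function names are invented for illustration, and the centrality is the unnormalized value computed with Brandes' algorithm on an unweighted graph.

```python
from collections import Counter, deque
from itertools import combinations

# Hypothetical toy data: each inner list is the keyword set of one paper.
papers = [
    ["clustering", "classification"],
    ["clustering", "recommendation"],
    ["recommendation", "social network analysis"],
    ["social network analysis", "community detection"],
]

def term_frequency(papers):
    """Count how many papers list each keyword."""
    return Counter(kw for kws in papers for kw in set(kws))

def cooccurrence_graph(papers):
    """Build an undirected, unweighted graph linking keywords that share a paper."""
    graph = {}
    for kws in papers:
        for kw in set(kws):
            graph.setdefault(kw, set())
        for a, b in combinations(sorted(set(kws)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

def betweenness(graph):
    """Unnormalized betweenness centrality (Brandes' algorithm, unweighted)."""
    score = dict.fromkeys(graph, 0.0)
    for source in graph:
        # Breadth-first search recording shortest-path counts and predecessors.
        order = []
        preds = {v: [] for v in graph}
        sigma = dict.fromkeys(graph, 0)
        dist = dict.fromkeys(graph, -1)
        sigma[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back to the source.
        delta = dict.fromkeys(graph, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each undirected shortest path was counted once from each endpoint.
    return {v: s / 2 for v, s in score.items()}

freq = term_frequency(papers)
centrality = betweenness(cooccurrence_graph(papers))
```

In this toy network the keywords form a chain, so "recommendation" sits on the most shortest paths and receives the highest centrality, even though several keywords share the same term frequency. That contrast is the reason frequency and centrality are treated as separate dimensions.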

Keywords

Data mining, Bibliometrics, CiteSpace, Conference papers, Research frontier

References

Bhattacharya, S. and Basu, P.K. (1998), “Mapping a research area at the micro level using co-word analysis”, Scientometrics, Vol. 43 No. 3, pp. 359-372.

Freeman, L.C. (1978), “Centrality in social networks: conceptual clarification”, Social Networks, Vol. 1 No. 3, pp. 215-239.

Google Scholar Metrics (2017), “Google scholar metrics”, available at: https://scholar.google.com/scholar/metrics.html (accessed 20 March 2017).

Kontostathis, A., Galitsky, L.M., Pottenger, W.M., Roy, S. and Phelps, D.J. (2004), “A survey of emerging trend detection in textual data mining”, in Berry, M.W. (Ed.), Survey of Text Mining, Springer, New York, NY, pp. 185-224. https://doi.org/10.1007/978-1-4757-4305-0_9

Kessler, M.M. (1963), “Bibliographic coupling between scientific papers”, American Documentation, Vol. 14 No. 1, pp. 10-25.

Kleinberg, J. (2003), “Bursty and hierarchical structure in streams”, Data Mining and Knowledge Discovery, Vol. 7 No. 4, pp. 373-397.

Matsumura, N., Matsuo, Y., Ohsawa, Y. and Ishizuka, M. (2002), “Discovering emerging topics from WWW”, Journal of Contingencies and Crisis Management, Vol. 10 No. 2, pp. 73-81.

Morris, S.A. and Yen, G.G. (2004), “Crossmaps: visualization of overlapping relationships in collections of journal papers”, Proceedings of the National Academy of Sciences, Vol. 101 No. Supplement 1, pp. 5291-5296.

Price, D.J.D.S. (1965), “Networks of scientific papers”, Science, Vol. 149 No. 3638, pp. 510-515.

Small, H. (1973), “Co-citation in the scientific literature: a new measure of the relationship between two documents”, Journal of the American Society for Information Science, Vol. 24 No. 4, pp. 265-269.

White, H.D. and Griffith, B.C. (1981), “Author cocitation: a literature measure of intellectual structure”, Journal of the American Society for Information Science, Vol. 32 No. 3, pp. 163-171.

Weinberg, B.H. (1974), “Bibliographic coupling: a review”, Information Storage and Retrieval, Vol. 10 Nos 5/6, pp. 189-196.

International Journal of Crowd Science

Volume 5, Issue 2, August 2021

Pages 143-153

DOI: 10.1108/IJCS-01-2021-0001

Cite this article:

Huang Y, Liu H, Pan J. Identification of data mining research frontier based on conference papers. International Journal of Crowd Science, 2021, 5(2): 143-153. https://doi.org/10.1108/IJCS-01-2021-0001