Research paper | Open Access

Identification of data mining research frontier based on conference papers

Yue Huang (1), Hu Liu (2), Jing Pan (3)
(1) School of Information Science, Beijing Language and Culture University, Beijing, China
(2) International Business School, Beijing Foreign Studies University, Beijing, China
(3) School of Economics and Management, University of Science and Technology Beijing, Beijing, China

Abstract

Purpose

Identifying the frontiers of a research field is one of the most basic tasks in bibliometrics. Research published in leading conferences is crucial to the data mining community, yet few studies have focused on it. The purpose of this study is to detect the intellectual structure of data mining based on conference papers.

Design/methodology/approach

This study takes as its sample the papers of the nine top-ranked data mining conferences listed by Google Scholar Metrics. Based on paper counts, it first examines the annual publication output and the distribution of papers across conferences. Then, from the perspective of keywords, CiteSpace is used to analyze the conference papers and identify the frontiers of data mining, focusing on keyword term frequency, keyword betweenness centrality, keyword clustering and burst keywords.
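As a rough illustration of two of the keyword measures named above, the following Python sketch counts keyword frequency and computes betweenness centrality on a keyword co-occurrence network. It is not CiteSpace's implementation: the function names and the toy `papers` list are hypothetical, the centrality is left unnormalized, and the shortest-path counting follows Brandes' algorithm for unweighted, undirected graphs.

```python
from collections import Counter, deque
from itertools import combinations


def keyword_frequency(papers):
    """Count how many papers list each keyword."""
    return Counter(kw for kws in papers for kw in set(kws))


def cooccurrence_graph(papers):
    """Build an undirected graph: keywords are nodes, and an edge joins
    two keywords that appear together in at least one paper."""
    graph = {}
    for kws in papers:
        unique = sorted(set(kws))
        for kw in unique:
            graph.setdefault(kw, set())
        for a, b in combinations(unique, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def betweenness_centrality(graph):
    """Unnormalized betweenness centrality (Brandes' algorithm)."""
    score = {v: 0.0 for v in graph}
    for source in graph:
        # Breadth-first search: count shortest paths from the source.
        order = []
        preds = {v: [] for v in graph}
        n_paths = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        n_paths[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        # Back-propagate path dependencies, farthest nodes first.
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += n_paths[v] / n_paths[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each undirected path was counted once from each endpoint.
    return {v: s / 2 for v, s in score.items()}


# Hypothetical toy input, not data from the study.
papers = [
    ["clustering", "classification"],
    ["classification", "recommendation"],
    ["recommendation", "social network analysis"],
]
freq = keyword_frequency(papers)
centrality = betweenness_centrality(cooccurrence_graph(papers))
```

In this toy chain of keywords, the two middle keywords lie on every shortest path between the ends and so receive the highest centrality, which is the sense in which a high-centrality keyword bridges otherwise separate topics.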

Findings

The results show that research interest in data mining followed a linear upward trend between 2007 and 2016. Frontier identification based on the conference papers revealed five research hotspots in data mining: clustering, classification, recommendation, social network analysis and community detection. The conference papers also covered a wide range of research topics.

Originality/value

This study detected the research frontier from leading data mining conference papers. Based on the keyword co-occurrence network, it identified and analyzed the research frontiers of the data mining discipline from 2007 to 2016 along four dimensions: keyword term frequency, betweenness centrality, clustering analysis and burst analysis.

References

 

Bhattacharya, S. and Basu, P.K. (1998), “Mapping a research area at the micro level using co-word analysis”, Scientometrics, Vol. 43 No. 3, pp. 359-372.

Freeman, L.C. (1978), “Centrality in social networks: conceptual clarification”, Social Networks, Vol. 1 No. 3, pp. 215-239.

Google Scholar Metrics (2017), “Google scholar metrics”, available at: https://scholar.google.com/scholar/metrics.html (accessed 20 March 2017).

Kontostathis, A., Galitsky, L.M., Pottenger, W.M., Roy, S. and Phelps, D.J. (2004), “A survey of emerging trend detection in textual data mining”, in Berry, M.W. (Ed.), Survey of Text Mining, Springer, New York, NY, pp. 185-224, available at: https://doi.org/10.1007/978-1-4757-4305-0_9

Kessler, M.M. (1963), “Bibliographic coupling between scientific papers”, American Documentation, Vol. 14 No. 1, pp. 10-25.

Kleinberg, J. (2003), “Bursty and hierarchical structure in streams”, Data Mining and Knowledge Discovery, Vol. 7 No. 4, pp. 373-397.

Matsumura, N., Matsuo, Y., Ohsawa, Y. and Ishizuka, M. (2002), “Discovering emerging topics from WWW”, Journal of Contingencies and Crisis Management, Vol. 10 No. 2, pp. 73-81.

Morris, S.A. and Yen, G.G. (2004), “Crossmaps: visualization of overlapping relationships in collections of journal papers”, Proceedings of the National Academy of Sciences, Vol. 101 Suppl. 1, pp. 5291-5296.

Price, D.J.D.S. (1965), “Networks of scientific papers”, Science, Vol. 149 No. 3638, pp. 510-515.

Small, H. (1973), “Co-citation in the scientific literature: a new measure of the relationship between two documents”, Journal of the American Society for Information Science, Vol. 24 No. 4, pp. 265-269.

White, H.D. and Griffith, B.C. (1981), “Author cocitation: a literature measure of intellectual structure”, Journal of the American Society for Information Science, Vol. 32 No. 3, pp. 163-171.

Weinberg, B.H. (1974), “Bibliographic coupling: a review”, Information Storage and Retrieval, Vol. 10 Nos 5/6, pp. 189-196.

International Journal of Crowd Science
Pages 143-153
Cite this article:
Huang Y, Liu H, Pan J. Identification of data mining research frontier based on conference papers. International Journal of Crowd Science, 2021, 5(2): 143-153. https://doi.org/10.1108/IJCS-01-2021-0001