Research paper | Open Access

A study of similar question retrieval method in online health communities

Bufei Xing 1, Haonan Yin 2, Zhijun Yan 3, Jiachen Wang 3
1 Baidu Online Network Technology Beijing Co., Ltd, Beijing, China
2 Paul Merage School of Business, University of California, Irvine, California, USA
3 School of Management and Economics, Beijing Institute of Technology, Beijing, China

Abstract

Purpose

The purpose of this paper is to propose a new approach to retrieve similar questions in online health communities to improve the efficiency of health information retrieval and sharing.

Design/methodology/approach

This paper proposes a hybrid approach that combines domain knowledge similarity and topic similarity to retrieve similar questions in online health communities. The domain knowledge similarity evaluates the domain distance between different questions, while the topic similarity measures the relationship between questions based on the extracted latent topics.
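
The abstract does not specify how the two similarities are computed or combined, so the following is only a minimal sketch under stated assumptions: domain knowledge similarity is approximated by the Jaccard overlap of named entities, topic similarity by the cosine similarity of per-question topic distributions, and the two are merged with a hypothetical weight `alpha`. The function names, the field names `entities` and `topics`, and the linear combination are all illustrative choices, not the authors' formulation.

```python
import math


def jaccard(entities_a, entities_b):
    """Assumed domain similarity: overlap of the named-entity sets."""
    a, b = set(entities_a), set(entities_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cosine(topics_a, topics_b):
    """Assumed topic similarity: cosine of two topic-weight vectors."""
    dot = sum(x * y for x, y in zip(topics_a, topics_b))
    norm_a = math.sqrt(sum(x * x for x in topics_a))
    norm_b = math.sqrt(sum(y * y for y in topics_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def hybrid_similarity(q1, q2, alpha=0.5):
    """Linear mix of the two scores; alpha is a hypothetical weight."""
    domain = jaccard(q1["entities"], q2["entities"])
    topic = cosine(q1["topics"], q2["topics"])
    return alpha * domain + (1.0 - alpha) * topic


def rank_candidates(query, candidates, alpha=0.5):
    """Return candidates sorted from most to least similar to the query."""
    return sorted(
        candidates,
        key=lambda c: hybrid_similarity(query, c, alpha),
        reverse=True,
    )


# Toy data: entity lists and topic vectors are made up for illustration.
query = {"entities": ["diabetes", "insulin"], "topics": [1.0, 0.0]}
close = {"entities": ["diabetes", "metformin"], "topics": [1.0, 0.0]}
far = {"entities": ["fracture"], "topics": [0.0, 1.0]}
ranked = rank_candidates(query, [far, close])
```

In this toy setup the candidate that shares an entity and a topic with the query is ranked first. In practice the entity sets would come from a medical named entity recognizer and the topic vectors from a topic model, neither of which is shown here.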

Findings

The experiment results show that the proposed method outperforms the baseline methods.

Originality/value

This method overcomes the problem of word mismatch and takes into account the named entities contained in questions, which most existing studies do not.


International Journal of Crowd Science
Pages 154-165
Cite this article:
Xing B, Yin H, Yan Z, et al. A study of similar question retrieval method in online health communities. International Journal of Crowd Science, 2021, 5(2): 154-165. https://doi.org/10.1108/IJCS-03-2021-0006


Received: 02 March 2021
Revised: 19 April 2021
Accepted: 22 April 2021
Published: 21 June 2021
© The author(s)

Bufei Xing, Haonan Yin, Zhijun Yan and Jiachen Wang. Published in International Journal of Crowd Science. Published by Emerald Publishing Limited. This article is published under the Creative Commons Attribution (CC BY 4.0) licence. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and non-commercial purposes), subject to full attribution to the original publication and authors. The full terms of this licence may be seen at http://creativecommons.org/licences/by/4.0/legalcode
