Multi-Feature Fusion Based Structural Deep Neural Network for Predicting Answer Time on Stack Overflow

Shi-Kai Guo; Si-Wen Wang; Hui Li; Yu-Long Fan; Ya-Qing Liu; Bin Zhang

doi:10.1007/s11390-023-1438-4

Regular Paper

Shi-Kai Guo¹, Si-Wen Wang¹,², Hui Li¹, Yu-Long Fan¹, Ya-Qing Liu¹, Bin Zhang³

¹ Information Science and Technology College, Dalian Maritime University, Dalian 116026, China

² Navigation College, Dalian Maritime University, Dalian 116026, China

³ Computer Science and Technology College, Shandong Technology and Business University, Yantai 264005, China

Abstract

Stack Overflow provides a platform where developers seek solutions by asking questions and receiving answers on a wide range of topics. However, many questions are not answered quickly. Because questioners are eager to know the time interval within which a question can be answered, reporting the expected answer time of a question is an important task for Stack Overflow. To address this task, we propose a model for predicting the answer time of questions, named Predicting Answer Time (PAT), which consists of two parts: a feature acquisition and fusion model, and a deep neural network model. The framework uses a variety of features mined from Stack Overflow questions, including the question description, the question title, the question tags, the creation time of the question, and other temporal features. These features are fused and fed into the deep neural network to predict the answer time of the question. As a case study, post data from Stack Overflow are used to assess the model. We use traditional regression algorithms as baselines: Linear Regression, K-Nearest Neighbors Regression, Support Vector Regression, Multilayer Perceptron Regression, and Random Forest Regression. Experimental results show that the PAT model predicts the answer time of questions more accurately than the traditional regression algorithms, reducing the error of the predicted answer time by nearly 10 hours.
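The two-part pipeline described in the abstract (feature acquisition and fusion, followed by a deep neural network) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the hashed bag-of-words stand-in for text features (`hash_text`), the cyclical encoding of the creation time, fusion by simple concatenation, the layer sizes, and the use of mean absolute error in hours as the error measure. The network is untrained, so its output is not a meaningful prediction.

```python
import math
import random
from datetime import datetime


def hash_text(text, dim=8):
    """Hypothetical stand-in for a learned text representation: hashed bag of words."""
    vec = [0.0] * dim
    for token in text.lower().split():
        # Deterministic bucket index (the built-in hash() is salted per process).
        vec[sum(ord(c) for c in token) % dim] += 1.0
    return vec


def temporal_features(created):
    """Assumed cyclical encoding of the hour of day and the day of week."""
    hour_angle = 2 * math.pi * created.hour / 24
    day_angle = 2 * math.pi * created.weekday() / 7
    return [math.sin(hour_angle), math.cos(hour_angle),
            math.sin(day_angle), math.cos(day_angle)]


def fuse(title, description, tags, created, dim=8):
    """Fuse the per-source features by concatenation (an assumed fusion rule)."""
    return (hash_text(title, dim) + hash_text(description, dim)
            + hash_text(" ".join(tags), dim) + temporal_features(created))


def init_mlp(sizes, seed=0):
    """Randomly initialise a fully connected network as (weights, biases) per layer."""
    rng = random.Random(seed)
    return [([[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)],
             [0.0] * n_out)
            for n_in, n_out in zip(sizes, sizes[1:])]


def forward(layers, x):
    """Forward pass: ReLU on hidden layers, linear output read as hours."""
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x[0]


def mean_absolute_error(predicted, actual):
    """Average absolute gap between predicted and observed answer times."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Made-up question used only to exercise the sketch.
features = fuse("How to parse JSON in Python?",
                "I have a JSON string and want a dict",
                ["python", "json"],
                datetime(2021, 3, 15, 14, 30))
network = init_mlp([len(features), 16, 1])
print(len(features), forward(network, features))
```

Concatenation keeps each feature source in its own slice of the input vector, which makes it easy to add or drop a source when comparing feature sets. A baseline comparison of the kind the abstract reports would train this network and the traditional regressors on the same fused vectors and compare their errors on held-out questions.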

Keywords

feature fusion; Stack Overflow; answer time; structural deep neural network; feature acquisition

Electronic Supplementary Material

JCST-2103-11438-Highlights.pdf (150.9 KB)

Journal of Computer Science and Technology

Volume 38, Issue 3, May 2023

Pages 582-599

DOI: 10.1007/s11390-023-1438-4

Cite this article:

Guo S-K, Wang S-W, Li H, et al. Multi-Feature Fusion Based Structural Deep Neural Network for Predicting Answer Time on Stack Overflow. Journal of Computer Science and Technology, 2023, 38(3): 582-599. https://doi.org/10.1007/s11390-023-1438-4

Received: 15 March 2021

Accepted: 24 February 2023

Published: 30 May 2023