Regular Paper

Multi-Feature Fusion Based Structural Deep Neural Network for Predicting Answer Time on Stack Overflow

Information Science and Technology College, Dalian Maritime University, Dalian 116026, China
Navigation College, Dalian Maritime University, Dalian 116026, China
Computer Science and Technology College, Shandong Technology and Business University, Yantai 264005, China

Abstract

Stack Overflow provides a platform where developers seek solutions by asking questions and receiving answers on a wide range of topics. However, many questions are not answered quickly. Because questioners are eager to know how long they will wait for an answer, giving them an estimate of the answer time is an important task for Stack Overflow. To address this issue, we propose a model for predicting the answer time of questions, named Predicting Answer Time (PAT), which consists of two parts: a feature acquisition and fusion model, and a deep neural network model. The framework uses a variety of features mined from Stack Overflow questions, including the question description, question title, question tags, the creation time of the question, and other temporal features. These features are fused and fed into the deep neural network to predict the answer time of the question. As a case study, post data from Stack Overflow are used to assess the model. We use traditional regression algorithms as baselines, such as Linear Regression, K-Nearest Neighbors Regression, Support Vector Regression, Multilayer Perceptron Regression, and Random Forest Regression. Experimental results show that the PAT model predicts the answer time of questions more accurately than the traditional regression algorithms and reduces the error of the predicted answer time by nearly 10 hours.
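As a rough illustration of what fusing question features into a single input vector might look like, the following Python sketch concatenates text, tag, and temporal features and computes a mean absolute error in hours. It is not the PAT model: the hashed bag-of-words encoder, the tag vocabulary, the cyclic hour encoding, and all function names are assumptions made for this example, and the abstract does not specify how the paper encodes or fuses its features.

```python
import math
from datetime import datetime


def hashed_bag_of_words(text, dim=8):
    """Map text to a fixed-length count vector (a stand-in for a learned text embedding)."""
    vec = [0.0] * dim
    for token in text.lower().split():
        # Deterministic bucket choice so the example is reproducible across runs.
        vec[sum(ord(c) for c in token) % dim] += 1.0
    return vec


def tag_vector(tags, vocabulary):
    """Multi-hot encoding of question tags over a fixed, hypothetical vocabulary."""
    return [1.0 if tag in tags else 0.0 for tag in vocabulary]


def temporal_features(created):
    """Cyclic encoding of the creation hour plus a weekend flag (assumed features)."""
    angle = 2 * math.pi * created.hour / 24
    is_weekend = 1.0 if created.weekday() >= 5 else 0.0
    return [math.sin(angle), math.cos(angle), is_weekend]


def fuse(title, body, tags, created, vocabulary):
    """Fuse all feature groups by simple concatenation into one input vector."""
    return (
        hashed_bag_of_words(title)
        + hashed_bag_of_words(body)
        + tag_vector(tags, vocabulary)
        + temporal_features(created)
    )


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted answer times (hours)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical question used only to exercise the functions above.
TAG_VOCAB = ["python", "java", "javascript"]
x = fuse(
    "How to sort a list",
    "I have a list of numbers and want them in ascending order",
    {"python"},
    datetime(2021, 3, 15, 6, 0),
    TAG_VOCAB,
)
```

A vector such as `x` would then be the input to a regressor, whether a deep neural network or one of the baselines named in the abstract, and `mean_absolute_error` is one plausible way to express the prediction error in hours.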

Electronic Supplementary Material

Download File(s)
JCST-2103-11438-Highlights.pdf (150.9 KB)

Journal of Computer Science and Technology
Pages 582-599
Cite this article:
Guo S-K, Wang S-W, Li H, et al. Multi-Feature Fusion Based Structural Deep Neural Network for Predicting Answer Time on Stack Overflow. Journal of Computer Science and Technology, 2023, 38(3): 582-599. https://doi.org/10.1007/s11390-023-1438-4


Received: 15 March 2021
Accepted: 24 February 2023
Published: 30 May 2023
© Institute of Computing Technology, Chinese Academy of Sciences 2023