Regular Paper

Discovering API Directives from API Specifications with Text Classification

College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
Key Laboratory of Safety-Critical Software (Nanjing University of Aeronautics and Astronautics), Ministry of Industry and Information Technology, Nanjing 210016, China
Key Laboratory of Complex Systems Modeling and Simulation (Hangzhou Dianzi University), Ministry of Education, Hangzhou 310018, China
School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China

Abstract

Application programming interface (API) libraries are extensively used by developers. To program correctly with APIs and avoid bugs, developers should pay attention to API directives, which describe the constraints of APIs. Unfortunately, API directives take diverse forms, which makes discovering all the relevant directives time-consuming and error-prone. In this paper, we propose an approach that leverages text classification to discover API directives from API specifications. Specifically, given a set of training sentences from API specifications, our approach first characterizes each sentence by three groups of features. Then, to deal with the imbalanced distribution of API directives and non-directives, our approach employs an under-sampling strategy that splits the imbalanced training set into several subsets and trains one classifier on each. Given a new sentence in an API specification, our approach combines the trained classifiers to predict whether the sentence is an API directive. We have evaluated our approach on a publicly available annotated API directive corpus. The experimental results show that our approach achieves an F-measure of up to 82.08%. In addition, our approach statistically outperforms the state-of-the-art approach by up to 29.67% in terms of F-measure.
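The under-sampling and combination steps described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not specify the three feature groups, the base classifier, or the combination rule, so the sketch assumes plain bag-of-words features, a tiny naive Bayes classifier, and majority voting. All names (`TinyNaiveBayes`, `undersample_subsets`, `train_ensemble`, `predict`, `f_measure`) and the toy sentences are hypothetical.

```python
import math
import random
from collections import Counter

# Toy data (hypothetical): label 1 = API directive, 0 = non-directive.
TOY_SENTENCES = [
    "callers must not pass null",
    "this method must be called before close",
    "returns the name of the file",
    "creates a new empty list",
    "the default value is zero",
    "gets the size of the buffer",
    "returns the current thread",
    "sets the title of the window",
]
TOY_LABELS = [1, 1, 0, 0, 0, 0, 0, 0]


def tokenize(sentence):
    return sentence.lower().split()


class TinyNaiveBayes:
    """Bag-of-words naive Bayes with Laplace smoothing (assumed base learner)."""

    def fit(self, sentences, labels):
        self.word_counts = {0: Counter(), 1: Counter()}
        self.class_counts = Counter(labels)
        for sentence, label in zip(sentences, labels):
            self.word_counts[label].update(tokenize(sentence))
        self.vocab_size = len(set(self.word_counts[0]) | set(self.word_counts[1]))
        return self

    def log_odds(self, sentence):
        # Positive value: the sentence looks more like a directive.
        totals = {c: sum(self.word_counts[c].values()) + self.vocab_size for c in (0, 1)}
        score = math.log((self.class_counts[1] + 1) / (self.class_counts[0] + 1))
        for token in tokenize(sentence):
            score += math.log((self.word_counts[1][token] + 1) / totals[1])
            score -= math.log((self.word_counts[0][token] + 1) / totals[0])
        return score


def undersample_subsets(sentences, labels, seed=0):
    """Split the majority class into disjoint chunks; pair each with all directives."""
    pos = [s for s, y in zip(sentences, labels) if y == 1]
    neg = [s for s, y in zip(sentences, labels) if y == 0]
    random.Random(seed).shuffle(neg)
    k = max(1, len(neg) // max(1, len(pos)))  # number of roughly balanced subsets
    chunks = [neg[i::k] for i in range(k)]
    return [(pos + chunk, [1] * len(pos) + [0] * len(chunk)) for chunk in chunks]


def train_ensemble(sentences, labels, seed=0):
    """Train one classifier per balanced subset."""
    return [TinyNaiveBayes().fit(xs, ys)
            for xs, ys in undersample_subsets(sentences, labels, seed)]


def predict(ensemble, sentence):
    """Combine the classifiers by majority vote (an assumed combination rule)."""
    votes = sum(1 for clf in ensemble if clf.log_odds(sentence) > 0)
    return 1 if 2 * votes > len(ensemble) else 0


def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall, from confusion-matrix counts."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    ensemble = train_ensemble(TOY_SENTENCES, TOY_LABELS)
    for text in ["must not be null", "returns the size of the list"]:
        print(text, "->", predict(ensemble, text))
```

Each classifier sees every directive but only a slice of the non-directives, so no single model is dominated by the majority class, while the ensemble as a whole still uses all of the training data.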

Electronic Supplementary Material

Download File(s)
jcst-36-4-922-Highlights.pdf (335.1 KB)

Journal of Computer Science and Technology
Pages 922-943
Cite this article:
Zhang J-X, Tao C-Q, Huang Z-Q, et al. Discovering API Directives from API Specifications with Text Classification. Journal of Computer Science and Technology, 2021, 36(4): 922-943. https://doi.org/10.1007/s11390-021-0235-1


Received: 18 December 2019
Accepted: 09 June 2021
Published: 05 July 2021
©Institute of Computing Technology, Chinese Academy of Sciences 2021