Automated test generation tools enable test automation and further alleviate the low efficiency caused by writing hand-crafted test cases. However, existing automated tools are not mature enough to be widely used by software testing groups. This paper conducts an empirical study on the state-of-the-art automated tools for Java, i.e., EvoSuite, Randoop, JDoop, JTeXpert, T3, and Tardis. We design a test workflow to facilitate the process, which can automatically run tools for test generation, collect data, and evaluate various metrics. Furthermore, we conduct empirical analysis on these six tools and their related techniques from different aspects, i.e., code coverage, mutation score, test suite size, readability, and real fault detection ability. We discuss about the benefits and drawbacks of hybrid techniques based on experimental results. Besides, we introduce our experience in setting up and executing these tools, and summarize their usability and user-friendliness. Finally, we give some insights into automated tools in terms of test suite readability improvement, meaningful assertion generation, test suite reduction for random testing tools, and symbolic execution integration.
- Article type
- Year
- Co-author
The adoption of deep neural network (DNN) model as the integral part of real-world software systems necessitates explicit consideration of their quality-of-service (QoS). It is well-known that DNN models are prone to adversarial attacks, and thus it is vitally important to be aware of how robust a model’s prediction is for a given input instance. A fragile prediction, even with high confidence, is not trustworthy in light of the possibility of adversarial attacks. We propose that DNN models should produce a robustness value as an additional QoS indicator, along with the confidence value, for each prediction they make. Existing approaches for robustness computation are based on adversarial searching, which are usually too expensive to be excised in real time. In this paper, we propose to predict, rather than to compute, the robustness measure for each input instance. Specifically, our approach inspects the output of the neurons of the target model and trains another DNN model to predict the robustness. We focus on convolutional neural network (CNN) models in the current research. Experiments show that our approach is accurate, with only 10%–34% additional errors compared with the offline heavy-weight robustness analysis. It also significantly outperforms some alternative methods. We further validate the effectiveness of the approach when it is applied to detect adversarial attacks and out-of-distribution input. Our approach demonstrates a better performance than, or at least is comparable to, the state-of-the-art techniques.