Risk-Based Testing: A Case Study

2010 Seventh International Conference on Information Technology

Ellen Souza
Information Systems Bachelor
Federal Rural University of Pernambuco
Fazenda Saco, S/N C.P. 063, 56900-000 Serra Talhada PE, Brazil

Cristine Gusmão, Júlio Venâncio
Department of Systems and Computing
University of Pernambuco
Rua Benfica, 455, Madalena, 50750-410 Recife PE, Brazil

Abstract

This paper describes the application of risk-based testing to the evaluation of a software product in a real case study. Risk-based testing consists of a set of activities for identifying risk factors related to software requirements. Once identified, the risks are prioritized according to their likelihood and impact, and test cases are designed based on the strategies for treating the identified risk factors. Test efforts are then continuously adjusted according to risk monitoring. The paper also briefly reviews available risk-based approaches, describes the benefits that are likely to accrue from the growing body of work in this area, and presents a set of problems, challenges and future work.

Keywords: Case Study, Risk-Based Testing, Risk Management, Software Testing, Testing Process.

1. Introduction

Software testing aims to improve the quality of software products by checking their compliance with their specification. However, software testing requires significant effort: testing may cost up to forty percent of the initial software development value [1]. Moreover, the later a defect is found, the more expensive it is to correct, because the correction cost grows exponentially [2]. In the delivery of software products, testing commonly does not receive appropriate attention because of time, resource and cost restrictions. On the other hand, organizations do not want to lose clients due to poor product quality.
In this context, it is fundamental to find a way to prioritize efforts and allocate resources to the software components that need to be tested most carefully. Risk-Based Testing (RBT) aims to minimize some of these problems by identifying risk factors related to software requirements [3]. Once identified, requirements are prioritized through risk analysis and test cases are designed based on the strategies for treating the identified risk factors. Test efforts are then continually adjusted according to risk control and monitoring.

This article presents a practical use of an RBT approach (RBTProcess) [4] during a software development project. The main objectives of this case study are to: (i) check whether RBT can find defects faster than a non-RBT approach; and (ii) check whether the discovered defects are the ones with high severity.

The remainder of this paper is organized as follows: Section 2 presents the related work, showing some practical uses of RBT, their results and restrictions; Section 3 gives an overview of the RBT process applied in this case study, RBTProcess; Section 4 describes the case study execution flow and the observed results; and finally, Section 5 draws some conclusions and points the way to further studies.

2. Related Work

Traditional testers deal with risks and software testing, but commonly in an ad hoc fashion based on personal judgment [5]. The Risk-Based Testing concept addresses the explicit use of risk management activities inside the testing process. RBT justifies the testing effort by focusing the testing activities on the parts of the software where the probability of failure and the potential loss are higher.

There are several RBT approaches in the literature. Some authors not only proposed approaches but also validated them through empirical studies.

Amland [5] proposes a set of metrics for risk analysis in order to support RBT. The metrics assume that functionalities that are new, complex or of poor quality are more likely to fail.
The author (i) proposes a functionality prioritization technique based on the risk exposure value obtained through the defined metrics; (ii) proposes metrics to control and monitor the test activity; and (iii) reports a case study in a financial application with satisfactory results: the time spent on testing and the number of resources used were considerably reduced compared with a non-RBT approach.

978-0-7695-3984-3/10 $26.00 2010 IEEE DOI 10.1109/ITNG.2010.203

Chen [6] proposes a risk-based method for selecting test cases and scenarios for regression testing. The prioritization strategy for test cases and scenarios is based on Amland [5]. The author performed a case study on an industrial software product, and the method was also considered very effective.

Rosenberg et al. [7] provide an approach for the object-oriented paradigm in which the classes that are more prone to fail are identified. The authors assume that the more complex the code, the higher the incidence of failures in the software. They defined a set of object-oriented software metrics that help to find the classes with a higher probability of failure. These metrics were applied for three years in NASA (National Aeronautics and Space Administration) projects and provided a good basis for test planning, allowing developers to find important defects as soon as possible.

Stallbaum et al. [8] propose an automated technique for risk-based test case prioritization using activity diagrams as test models. The authors implemented a prototype of the proposed approach and applied it to a software product developed by the German Federal Ministry of Finance. According to the authors, their approach enables the early detection of critical faults during the development process.

The next section presents RBTProcess, a complete risk-based software testing process with well-defined phases, activities, artifacts, roles and metrics.

3. RBT Approach

This section briefly presents RBTProcess, a risk-based testing process model.
Figure 1 shows the process structure, where the activities in dark gray are the risk management activities added to the test process (the activities in white). Further information about the RBTProcess organization can be found in [3, 4].

Figure 1. RBTProcess activities, roles and phases

Risk Identification: its main objective is to identify only technical risks, which are commonly related to software functionalities or requirements. It includes a review of risk sources and categories in order to adapt the Taxonomy Based Questionnaire (TBQ) [9] and/or a risk checklist. The TBQ is answered by the project members, followed by a brainstorming meeting to validate the identified risks.

Risk Analysis: the software functionalities are prioritized by a heuristic risk analysis in which the software engineers, together with the risk analyst, assign values to metrics such as complexity, cost, size and quality in order to obtain the Risk Exposure (RE) value for each functionality.

Test Planning: the test manager defines the test approach, the strategy and the number of test cycles based on the RE value. With RBT, the manager has more information about the software quality thanks to risk identification and analysis, and can make better use of time and resources.

Test Design: the test cases are designed to mitigate the identified risks. At least one test case is designed for each risk. When software engineers answer the TBQ, they state the kind of risk and how the functionality can fail; the test designer uses this information to design the risk-based test cases.

Test Execution: the designed RBT test cases are executed in RE order. The idea is to test first the functionalities that are more likely to fail.

Test Evaluation and Risk Control: these activities monitor, respectively, the progress of the test cases and the identified risks. A risk is mitigated when all test cases designed to assess it have been executed and passed. The test efforts are continuously adjusted according to risk monitoring. A set of metrics for risk monitoring was proposed by the authors in [3].
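The Test Execution and Risk Control rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the RBTProcess implementation: the `RbtTestCase` and `Risk` classes, their fields, the `execution_order` helper and the sample risks are hypothetical. Only the two rules come from the text (execute in descending RE order; a risk is mitigated once every test case designed for it has been executed and has passed), and the two RE values are the ones reported for those functionalities in Table 4.

```python
from dataclasses import dataclass, field


@dataclass
class RbtTestCase:
    """One risk-based test case and its execution status (hypothetical model)."""
    name: str
    executed: bool = False
    passed: bool = False


@dataclass
class Risk:
    """An identified risk tied to one functionality (hypothetical model)."""
    description: str
    functionality: str
    test_cases: list = field(default_factory=list)

    def is_mitigated(self) -> bool:
        # Rule from the text: every test case designed for the risk must
        # have been executed and must have passed.
        # Assumption: a risk without any test case is not mitigated.
        return bool(self.test_cases) and all(
            tc.executed and tc.passed for tc in self.test_cases
        )


def execution_order(risks, risk_exposure):
    """Order test cases by descending RE of the functionality they target."""
    ranked = sorted(
        risks, key=lambda r: risk_exposure[r.functionality], reverse=True
    )
    return [tc for risk in ranked for tc in risk.test_cases]


# RE values as reported in Table 4; risks and test case names are made up.
risk_exposure = {"FUNC10-Group Tasks": 1.74, "FUNC04-Blocks": 1.10}
risks = [
    Risk("Block limits undefined", "FUNC04-Blocks", [RbtTestCase("TC-B1")]),
    Risk("Custom Group Task unclear", "FUNC10-Group Tasks",
         [RbtTestCase("TC-G1"), RbtTestCase("TC-G2")]),
]

order = execution_order(risks, risk_exposure)
print([tc.name for tc in order])  # ['TC-G1', 'TC-G2', 'TC-B1']

# Mark both Group Tasks test cases as executed and passed.
for tc in risks[1].test_cases:
    tc.executed = tc.passed = True
print(risks[1].is_mitigated(), risks[0].is_mitigated())  # True False
```

Keeping the mitigation rule on the risk rather than on the test case mirrors the text, where risk control monitors risks and test evaluation monitors test cases.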

4. Case Study

This section presents the case study: its purpose, the tool used to evaluate the approach, called MPhyScaS (Section 4.1), the steps performed to execute the case study (Section 4.2) and, finally, the observed results (Section 4.3). The MPhyScaS team, together with the authors, performed the RBTProcess activities explained in the previous section, and a group of undergraduate computer engineering students with basic software testing knowledge executed the designed RBT and non-RBT test cases.

4.1 MPhyScaS: Evaluated Tool

RBTProcess was applied to the development of the MPhyScaS (Multi-Physics and Multi-Scales Solver Environment) [10] tool. MPhyScaS is an Eclipse RCP [11] environment dedicated to the automatic development of simulators based on the Finite Element Method (FEM). MPhyScaS is divided into four modules: (1) Components Repository, with scientific software; (2) Executable Simulator, an interface for data input and the executable simulator; (3) Interface for modeling and simulators builder; and (4) Builder Module, the simulator framework.

The Interface for modeling and simulators builder module was considered in this case study. It reads the simulator.xml file to present the simulation structure. The user can then model and build their own simulation by entering values into the simulator tree structure. The tool validates the entered values at the beginning of the simulation. The tested functionalities are presented in Tables 2 and 4.

4.2 Case Study Execution Flow

Figure 2 shows the steps performed in the case study, which are detailed as follows.

Figure 2. Case study execution flow
Steps shown: (1) RBTProcess and basic software test training; (2) MPhyScaS training; (3) Risk Identification; (4) Risk Analysis; (5) Test Planning; (6) Test Design (non-risk-based and risk-based test cases); (7) Test Execution in Cycle 1 and Cycle 2 (non-risk-based, risk-based and hybrid); (8) Test Evaluation.

(1) The MPhyScaS team members and the volunteers received four hours of training on RBTProcess and basic software testing. After that, the authors, together with the MPhyScaS team leader, instantiated RBTProcess, creating the TBQ (explained in step 3) and defining the process roles as follows:

Risk Analyst: one of the authors
Test Manager: one of the authors
Test Design: one of the authors (risk-based) and an MPhyScaS team member (non-risk-based)
Tester: five computer engineering students
Other Software Process Roles: MPhyScaS team members

(2) The volunteers received two hours of training on the MPhyScaS functionalities.

(3) After the requirement elicitation phase, the MPhyScaS members performed the RBTProcess Risk Identification activity. They answered the TBQ for each of the software functionalities. Figure 3 shows an example of a question included in the TBQ, together with the answers of the MPhyScaS members for the FUNC10-Group Tasks functionality.

(Q1) Is this functionality completely defined? If the answer is NO, which part is not defined?
Answer 1: No. The types of state are missing for the Assembler Group Task.
Answer 2: No. It is not clear how the Custom Group Task works.

Figure 3. Example of TBQ questions

At the end, the Risk Analyst summarizes the answers and removes the inconsistencies. This questionnaire is the primary input for test design, as it contains the risks to be tested for each functionality.

(4) The MPhyScaS members assigned values to metrics related to requirement dependency, complexity, cost, size, quality and others in order to calculate the Risk Exposure (RE) for each functionality. The RE is used to classify and prioritize the functionalities and to define the sequence in which they are tested.

(5) The Test Manager defined the number of tests and cycles necessary to test the tool based on the number of identified risks. In RBTProcess, functionalities are classified as high, medium or low priority. Only functionalities classified as high and medium were tested in this case study.

(6) Two types of test cases were designed: non-risk-based (functional) and risk-based.

Risk-based: one of the authors designed risk-based test cases for the most important functionalities, creating one test case for each identified risk. As shown in Figure 3, the Group Tasks functionality is not completely defined, so risk-based test cases could be designed to evaluate only the Assembler and Custom Group Tasks, which are the undefined parts reported by the MPhyScaS members, whereas functional test cases would check all Group Tasks. To match the number of non-risk-based test cases, eleven test cases were designed to mitigate the eleven most important risks.

Non-risk-based: an MPhyScaS member with test design training designed functional test cases for each of the software functionalities, resulting in a suite of eleven test cases.

(7) For test execution, the volunteers were divided into three groups. In each cycle, each group executed a different type of test case, as shown in Figure 2. The three test execution approaches are explained below:

Risk-based approach: test cases based on risks, designed according to the RBTProcess Test Design activity and executed in the order specified by the RE.
Non-risk-based approach: functional test cases designed by an MPhyScaS team member and executed in an order that represents the way users use the tool.
Hybrid approach: non-risk-based test cases executed in RE order.

(8) After test execution, the Test Manager and the MPhyScaS team leader analyzed all test case results, removing inconsistencies and false defects, and summarized the results.

4.3 Observed Results

Table 1 and Table 2 present the metrics related to risk identification using the Taxonomy Based Questionnaire (TBQ) technique.

Table 1. Time spent to identify risks, in minutes

Metric  Description                                      Value
M1      Time spent to identify risks per functionality    2.41
M2      Time spent to identify risks per person          26.51
M3      Total time to identify risks                       106

M1 is the average time a participant spent identifying the risks of one functionality, M2 is the average time per person to identify the risks of all functionalities (M2 = M1 * number of functionalities = 2.41 * 11), and M3 is the total time spent identifying the risks associated with all functionalities (M3 = M2 * number of participants = 26.51 * 4, reported rounded to 106).

Table 2. Quantity and types of the identified risks

Metric  Description                                      Value
M4      Quantity of identified risks                        49
M5      Quantity of different risks                         30
M6      Quantity of different risks per functionality
          FUNC01-Simulator Properties                        3
          FUNC02-Kernel Configuration                        3
          FUNC03-Global States                               0
          FUNC04-Blocks                                      1
          FUNC05-Create Group                                1
          FUNC06-Local States                                0
          FUNC07-Create Phenomenon                           7
          FUNC08-Phenomenon-State Relation                   1
          FUNC09-Quantity Tasks                              1
          FUNC10-Group Tasks                                 5
          FUNC11-Search                                      8
M7      Quantity of different risks per person             7.5
M8      Quantity of different risks per type
          Stability                                          2
          Completeness                                      11
          Validity                                           8
          Feasibility                                        8
          Clarity                                            1

M4 is the quantity of risks identified by all participants, M5 is the same quantity after removing the repeated risks (M5 = M4 - number of repeated risks = 49 - 19), and M6 shows the quantity of different risks identified per functionality. FUNC11, FUNC07 and FUNC10 are the functionalities with the most identified risks (8, 7 and 5, respectively).
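The arithmetic behind the derived metrics in Tables 1 and 2 can be checked with a short script. This is a sketch for the reader, not part of the original study: the constant names and the `derived_metrics` helper are assumptions introduced here, while every number is taken from the tables and formulas in the text (M7 is listed in Table 2 as the number of different risks divided by the four participants).

```python
# Inputs reported in the case study (Tables 1 and 2 and the surrounding text).
M1 = 2.41             # minutes to identify risks per functionality
FUNCTIONALITIES = 11  # FUNC01 .. FUNC11
PARTICIPANTS = 4      # people who answered the TBQ
M4 = 49               # risks identified by all participants
REPEATED = 19         # repeated risks removed from M4

# M6: different risks per functionality, FUNC01 .. FUNC11 (Table 2).
M6 = [3, 3, 0, 1, 1, 0, 7, 1, 1, 5, 8]
# M8: different risks per type (Table 2).
M8 = {"Stability": 2, "Completeness": 11, "Validity": 8,
      "Feasibility": 8, "Clarity": 1}


def derived_metrics():
    """Recompute M2, M3, M5 and M7 from the reported inputs."""
    m2 = M1 * FUNCTIONALITIES   # time per person for all functionalities
    m3 = m2 * PARTICIPANTS      # total time (the paper reports it as 106)
    m5 = M4 - REPEATED          # different risks
    m7 = m5 / PARTICIPANTS      # different risks per person
    return {"M2": round(m2, 2), "M3": round(m3, 2), "M5": m5, "M7": m7}


metrics = derived_metrics()
print(metrics)

# Both breakdowns in Table 2 add up to the number of different risks (M5).
print(sum(M6) == metrics["M5"], sum(M8.values()) == metrics["M5"])  # True True
```

The per-functionality and per-type breakdowns both summing to M5 is a useful consistency check when the questionnaire answers are summarized by hand.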
M7 is the average number of different risks identified per person (M7 = M5 / number of participants = 30 / 4) and M8 shows the number of risks identified per type. Most identified risks are related to completeness, which indicates that the functionalities are not completely defined; this was confirmed by all MPhyScaS members.

Table 3 presents the metrics related to risk analysis using the RBTProcess metrics. M9 is the average time to analyze the risks of one functionality, M10 is the average time per person to analyze the risks of all functionalities (M10 = M9 * number of functionalities = 1.05 * 11), and M11 is the total time spent analyzing all functionalities (M11 = M10 * number of participants = 11.55 * 4).

Table 3. Time spent to analyze risks, in minutes

Metric  Description                                      Value
M9      Time spent to analyze risks per functionality     1.05
M10     Time spent to analyze risks per person           11.55
M11     Total time to analyze risks                       46.2

The time spent on risk identification and analysis was collected to check that the risk management activities do not overload the testing process.

The Risk Exposure (RE) value for each functionality is shown in Table 4.

Table 4. Risk exposure value

M11 Risk Exposure Value
Priority  Functionality                     RE
High      FUNC10-Group Tasks                1.74
High      FUNC02-Kernel Configuration       1.64
High      FUNC11-Search                     1.44
High      FUNC07-Create Phenomenon          1.21
Medium    FUNC04-Blocks                     1.10
Medium    FUNC05-Create Group               1.10
Medium    FUNC03-Global States              1.04
Low       FUNC09-Quantity Tasks             0.84
Low       FUNC08-Phenomenon-State Relation  0.81
Low       FUNC06-Local States               0.75
Low       FUNC01-Simulator Properties       0.65

The functionalities were classified as high, medium or low priority. Those classified as high and medium were tested in this case study. For those classified as low priority, test cases were created only if the functionality had an associated risk, and they were executed only if there was enough time. The risk-based and hybrid test cases were run in RE order.

Table 5 presents the general test execution results for the three approaches explained in Section 4.2, where Non RBT means non-risk-based and RBT means risk-based.

Table 5. General test execution results

Tester  Cycle  Approach  Executed Test Cases  Reported Defects  Severity (H M L)
One     One    Non RBT   7                    2                 2
Three   Two    Non RBT   11                   4                 4
Four    One    Non RBT   11                   1                 1
Total                    29                   7                 1 6

Two     One    RBT       1                    4                 2 1 1
Three   One    RBT       2                    4                 1 2 1
Five    Two    RBT       4                    10                4 4 2
Total                    7                    18                7 7 4

Five    One    Hybrid    4                    8                 2 5 1
One     Two    Hybrid    10                   1                 1
Four    Two    Hybrid    10                   4                 2 2
Total                    24                   13                4 7 2

Five volunteers executed the different test approaches in two test cycles of approximately two hours each. For example, tester One executed non-risk-based test cases in cycle one and hybrid test cases in cycle two. Regarding severity, H means High, M means Medium and L means Low.

The table shows that the risk-based approach found more defects than the others (18, against 13 for the hybrid approach and 7 for the non-RBT approach). Interestingly, the hybrid approach, that is, the non-RBT test cases ordered by RE value, achieved a result close to that of RBT. Regarding severity, RBT again had the better result, which indicates that it focuses on the functionalities that are more likely to fail.

Table 6 shows the defect concentration per test case.

Table 6. Defects found by test case

Tester  Cycle  Approach  Test Case Number: 1 3 4 5 6 7 8 9 10 11
Three   Two    Non RBT   1 1 1 1
One     One    Non RBT   1 1
Four    One    Non RBT   1
Total: 7                 1 1 1 1 1 2

Five    Two    RBT       8 2
Two     One    RBT       4
Three   One    RBT       4
Total: 18                16 2

Four    Two    Hybrid    2 1 1
Five    One    Hybrid    2 3 3
One     Two    Hybrid    1
Total: 13                4 3 1 3 1 1

While RBT found more defects in the first test cases, the non-RBT approach found more defects in the last ones. RBT was also more effective than the hybrid approach, as it found more than eighty percent of its defects (16 of 18) in the first test case executions. It is worth noting that RBT test case one explores the same functionality as non-RBT test case eleven.

Table 7 complements the results from Table 6. It shows that, because of the execution order, the RBT and hybrid approaches had found most of their defects after 30 minutes of execution (15 of 18 and 10 of 13, respectively).

Table 7. Time to find a defect, in minutes

Tester  Cycle  Approach  Time to find a defect in minutes: 10 20 30 40 50
Three   Two    Non RBT   1 1 2
One     One    Non RBT   1 1
Four    One    Non RBT   1
Total: 7                 3 2 2

Two     One    RBT       2 1 1
Three   One    RBT       1 2 1
Five    Two    RBT       3 3 2 2
Total: 18                6 6 3 1 2

Five    One    Hybrid    2 3 3
One     Two    Hybrid    1
Four    Two    Hybrid    1 1 1 1
Total: 13                2 4 4 1 2

Table 8 shows the number of defects found per software functionality.

Table 8. Defects found per functionality

Tester  Cycle  Approach  Functionalities: 6 7 8 9 10 11
One     One    Non RBT   1 1
Three   Two    Non RBT   1 1 1 1
Four    One    Non RBT   1
Total: 7                 1 1 1 1 1 2

Five    Two    RBT       8 2
Two     One    RBT       4
Three   One    RBT       4
Total: 18                0 0 0 0 16 2

Five    One    Hybrid    3 2 3
One     Two    Hybrid    1
Four    Two    Hybrid    1 1 2
Total: 13                1 4 0 1 4 3

The main finding here is that the functionality with the most defects in this table (FUNC10-Group Tasks) is the one with the highest risk exposure value in the risk analysis shown in Table 4.

5. Conclusion and Ongoing Work

In this case study, through RBTProcess, we could show that RBT really focuses on the parts of the software that are more likely to fail. This helps test managers make better use of their limited time and resources. We could also show that RBT finds the most important defects earlier than the functional approach, so they can be fixed earlier; consequently, software quality improves faster, which demonstrates a cost benefit.

As shown in Table 1 and Table 3, identifying and analyzing risks, respectively, is not time-consuming, which supports the viability of adoption for most organizations that develop software. Also, the MPhyScaS members did not find it difficult to perform these activities and stated that the approach really targeted the correct functionalities, saving project time and resources.

However, we cannot generalize these results; other case studies and experiments need to be performed to confirm these benefits. In addition, some of the RBTProcess activities were performed by the authors because of the unavailability of human resources. The MPhyScaS members also suggested a review of the risk analysis guide, as they found it confusing to assign values to some of the metrics. Another problem is that the results were summarized manually by the authors, which took a long time.
To minimize this problem, a tool called RBTTool [12] is under development to support RBT activities, especially those related to risk management, which test engineers find more difficult to perform.

In conclusion, despite some limitations, RBT has great potential applicability and can also save costs.

6. References

[1] Pressman, R. Engenharia de Software. 1st ed. Makron Books, São Paulo, Brazil, 1995.
[2] Graham, D., van Veenendaal, E., Evans, I., Black, R. Foundations of Software Testing: ISTQB Certification. Thomson Learning, 2007.
[3] Souza, E., Gusmão, C., Alves, K., Venâncio, J., Melo, R. Measurement and Control for Risk-based Test Cases and Activities. 10th IEEE Latin American Test Workshop - LATW, 2009.
[4] Souza, E. RBTProcess: Modelo de Processo de Teste de Software baseado em Riscos. Master Thesis, University of Pernambuco, Recife/Brazil, 2008.
[5] Amland, S. Risk Based Testing and Metrics: Risk analysis fundamentals and metrics for software testing. 5th International Conference EuroSTAR 99, 1999.
[6] Chen, Y. Specification-based Regression Testing Measurement with Risk Analysis. Master Thesis, University of Ottawa/Canada, 2002.
[7] Rosenberg, L. H., Stapko, R., Gallo, A. Risk-based object oriented testing. 24th Annual Software Engineering Workshop, NASA SEW24, 1999.
[8] Stallbaum, H., Metzger, A., Pohl, K. An Automated Technique for Risk-based Test Case Generation and Prioritization. 3rd Workshop on Automation of Software Test, 2008.
[9] Carr, M. J., Konda, S. L., Monarch, I., Ulrich, F. C., Walker, C. Taxonomy Based Risk Identification. Technical Report CMU/SEI-93-TR-6. Software Engineering Institute, 1993.
[10] Oliveira, C., Rocha, F., Medeiros, R. W., Lima, R., Soares, S., Santos, F., Santos, I. H. S. Dynamic Interface for Multi-physics Simulators. International Journal of Modeling and Simulation for the Petroleum Industry, v. 2, p. 35-42, 2008.
[11] Burnette, E. Rich Client Platform (RCP) Tutorial One. Available at http://www.eclipse.org/articles/article-rcp-1/tutorial1.html, 2006.
[12] Venâncio, J., Gusmão, C., Mendes, E., Souza, E. RBTTool - Uma Ferramenta de Apoio à Abordagem de Teste de Software baseado em Riscos. 23rd Brazilian Symposium on Software Engineering, 2009.