BASICS OF SOFTWARE ENGINEERING EXPERIMENTATION


Basics of Software Engineering Experimentation by Natalia Juristo and Ana M. Moreno, Universidad Politecnica de Madrid, Spain. SPRINGER SCIENCE+BUSINESS MEDIA, LLC

A C.I.P. Catalogue record for this book is available from the Library of Congress.

ISBN 978-1-4419-5011-6
ISBN 978-1-4757-3304-4 (ebook)
DOI 10.1007/978-1-4757-3304-4

Printed on acid-free paper

All Rights Reserved. 2001 Springer Science+Business Media New York
Originally published by Kluwer Academic Publishers in 2001
Softcover reprint of the hardcover 1st edition 2001

No part of the material protected by this copyright notice may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording or by any information storage and retrieval system, without written permission from the copyright owner.

CONTENTS

LIST OF FIGURES xi
LIST OF TABLES xiii
FOREWORD xix
ACKNOWLEDGEMENTS xxi

PART I: INTRODUCTION TO EXPERIMENTATION

1. INTRODUCTION
1.1. PRE-SCIENTIFIC STATUS OF SOFTWARE ENGINEERING 3
1.2. WHY DON'T WE EXPERIMENT IN SE? 6
1.3. KINDS OF EMPIRICAL STUDIES 10
1.4. AMPLITUDE OF EXPERIMENTAL STUDIES 12
1.5. GOALS OF THIS BOOK 17
1.6. WHO DOES THIS BOOK TARGET? 18
1.7. OBJECTIVES TO BE ACHIEVED BY THE READER OF THIS BOOK 19
1.8. ORGANISATION OF THE BOOK 20

2. WHY EXPERIMENT? THE ROLE OF EXPERIMENTATION IN SCIENTIFIC AND TECHNOLOGICAL RESEARCH
2.1. INTRODUCTION 23
2.2. RESEARCH AND EXPERIMENTATION 23
2.3. THE SOCIAL ASPECT IN SOFTWARE ENGINEERING 26
2.4. THE EXPERIMENTATION/LEARNING CYCLE 27
2.5. SCIENTIFIC METHOD 33
2.6. WHY DO EXPERIMENTS NEED TO BE REPLICATED? 35
2.7. EMPIRICAL KNOWLEDGE VERSUS THEORETICAL KNOWLEDGE 40

3. HOW TO EXPERIMENT?
3.1. INTRODUCTION 45
3.2. SEARCHING FOR RELATIONSHIPS AMONG VARIABLES 45
3.3. STRATEGY OF STEPWISE REFINEMENT 47
3.4. PHASES OF EXPERIMENTATION 49
3.5. ROLE OF STATISTICS IN EXPERIMENTATION 51

PART II: DESIGNING EXPERIMENTS

4. BASIC NOTIONS OF EXPERIMENTAL DESIGN
4.1. INTRODUCTION 57
4.2. EXPERIMENTAL DESIGN TERMINOLOGY 57
4.3. THE SOFTWARE PROJECT AS AN EXPERIMENT 65
4.4. RESPONSE VARIABLES IN SE EXPERIMENTATION 70
4.5. SUGGESTED EXERCISES 80

5. EXPERIMENTAL DESIGN
5.1. INTRODUCTION 83
5.2. EXPERIMENTAL DESIGN 83
5.3. ONE-FACTOR DESIGNS 85
5.4. HOW TO AVOID VARIATIONS OF NO INTEREST TO THE EXPERIMENT: BLOCK DESIGNS 90
5.5. EXPERIMENTS WITH MULTIPLE SOURCES OF DESIRED VARIATION: FACTORIAL DESIGNS 97
5.6. WHAT TO DO WHEN FACTORIAL ALTERNATIVES ARE NOT COMPARABLE: NESTED DESIGNS 102
5.7. HOW TO REDUCE THE AMOUNT OF EXPERIMENTS: FRACTIONAL DESIGNS 103
5.8. EXPERIMENTS WITH SEVERAL DESIRED AND UNDESIRED VARIATIONS: FACTORIAL BLOCK DESIGNS 104
5.9. IMPORTANCE OF EXPERIMENTAL DESIGN AND STEPS 113
5.10. SPECIFIC CONSIDERATIONS FOR EXPERIMENTAL DESIGNS IN SOFTWARE ENGINEERING 116
5.11. SUGGESTED EXERCISES 119

PART III: ANALYSING THE EXPERIMENTAL DATA

6. BASIC NOTIONS OF DATA ANALYSIS
6.1. INTRODUCTION 125
6.2. EXPERIMENTAL RESULTS AS A SAMPLE OF A POPULATION 126
6.3. STATISTICAL HYPOTHESES AND DECISION MAKING 128
6.4. DATA ANALYSIS FOR LARGE SAMPLES 132
6.5. DATA ANALYSIS FOR SMALL SAMPLES 137
6.6. READERS' GUIDE TO PART III 147
6.7. SUGGESTED EXERCISES 151

7. WHICH IS THE BETTER OF TWO ALTERNATIVES? ANALYSIS OF ONE-FACTOR DESIGNS WITH TWO ALTERNATIVES
7.1. INTRODUCTION 153
7.2. STATISTICAL SIGNIFICANCE OF THE DIFFERENCE BETWEEN TWO ALTERNATIVES USING HISTORICAL DATA 153
7.3. SIGNIFICANCE OF THE DIFFERENCE BETWEEN TWO ALTERNATIVES WHEN NO HISTORICAL DATA ARE AVAILABLE 160
7.4. ANALYSIS FOR PAIRED COMPARISON DESIGNS 163
7.5. ONE-FACTOR ANALYSIS WITH TWO ALTERNATIVES IN REAL SE EXPERIMENTS 165
7.6. SUGGESTED EXERCISES 173

8. WHICH OF K ALTERNATIVES IS THE BEST? ANALYSIS FOR ONE-FACTOR DESIGNS AND K ALTERNATIVES
8.1. INTRODUCTION 175
8.2. IDENTIFICATION OF THE MATHEMATICAL MODEL 176
8.3. VALIDATION OF THE BASIC MODEL THAT RELATES THE EXPERIMENTAL VARIABLES 179
8.4. CALCULATING THE FACTOR- AND ERROR-INDUCED VARIATION IN THE RESPONSE VARIABLE 186
8.5. CALCULATING THE STATISTICAL SIGNIFICANCE OF THE FACTOR-INDUCED VARIATION 189
8.6. RECOMMENDATIONS OR CONCLUSIONS OF THE ANALYSIS 195
8.7. ANALYSIS OF ONE FACTOR WITH K ALTERNATIVES IN REAL SE EXPERIMENTS 199
8.8. SUGGESTED EXERCISES 201

9. EXPERIMENTS WITH UNDESIRED VARIATIONS: ANALYSIS FOR BLOCK DESIGNS
9.1. INTRODUCTION 203
9.2. ANALYSIS FOR DESIGNS WITH A SINGLE BLOCKING VARIABLE 203
9.3. ANALYSIS FOR DESIGNS WITH TWO BLOCKING VARIABLES 216

9.4. ANALYSIS FOR TWO BLOCKING VARIABLE DESIGNS AND REPLICATION 219
9.5. ANALYSIS FOR DESIGNS WITH MORE THAN TWO BLOCKING VARIABLES 220
9.6. ANALYSIS WHEN THERE ARE MISSING DATA IN BLOCK DESIGNS 227
9.7. ANALYSIS FOR INCOMPLETE BLOCK DESIGNS 229
9.8. SUGGESTED EXERCISES 232

10. BEST ALTERNATIVES FOR MORE THAN ONE VARIABLE: ANALYSIS FOR FACTORIAL DESIGNS
10.1. INTRODUCTION 235
10.2. ANALYSIS OF GENERAL FACTORIAL DESIGNS 236
10.3. ANALYSIS FOR FACTORIAL DESIGNS WITH TWO ALTERNATIVES PER FACTOR 246
10.4. ANALYSIS FOR FACTORIAL DESIGNS WITHOUT REPLICATION 269
10.5. HANDLING UNBALANCED DATA 280
10.6. ANALYSIS OF FACTORIAL DESIGNS IN REAL SE EXPERIMENTS 286
10.7. SUGGESTED EXERCISES 289

11. EXPERIMENTS WITH INCOMPARABLE FACTOR ALTERNATIVES: ANALYSIS FOR NESTED DESIGNS
11.1. INTRODUCTION 293
11.2. IDENTIFICATION OF THE MATHEMATICAL MODEL 294
11.3. VALIDATION OF THE MODEL 294
11.4. CALCULATION OF THE VARIATION IN THE RESPONSE VARIABLE DUE TO FACTORS AND ERROR 295
11.5. STATISTICAL SIGNIFICANCE OF THE VARIATION IN THE RESPONSE VARIABLE 296
11.6. SUGGESTED EXERCISES 297

12. FEWER EXPERIMENTS: ANALYSIS FOR FRACTIONAL FACTORIAL DESIGNS
12.1. INTRODUCTION 299
12.2. CHOOSING THE EXPERIMENTS IN A 2^(k-p) FRACTIONAL FACTORIAL DESIGN 300
12.3. ANALYSIS FOR 2^(k-p) DESIGNS 305
12.4. SUGGESTED EXERCISES 310

13. SEVERAL DESIRED AND UNDESIRED VARIATIONS: ANALYSIS FOR FACTORIAL BLOCK DESIGNS
13.1. INTRODUCTION 313
13.2. IDENTIFICATION OF THE MATHEMATICAL MODEL 314
13.3. CALCULATION OF RESPONSE VARIABLE VARIABILITY 316
13.4. STATISTICAL SIGNIFICANCE OF THE VARIATION IN THE RESPONSE VARIABLE 317
13.5. ANALYSIS OF FACTORIAL BLOCK DESIGNS IN REAL SE EXPERIMENTS 320
13.6. SUGGESTED EXERCISES 321

14. NON-PARAMETRIC ANALYSIS METHODS
14.1. INTRODUCTION 323
14.2. NON-PARAMETRIC METHODS APPLICABLE TO INDEPENDENT SAMPLES 324
14.3. NON-PARAMETRIC METHODS APPLICABLE TO RELATED SAMPLES 328
14.4. NON-PARAMETRIC ANALYSIS IN REAL SE EXPERIMENTS 330
14.5. SUGGESTED EXERCISES 334

15. HOW MANY TIMES SHOULD AN EXPERIMENT BE REPLICATED?
15.1. INTRODUCTION 337
15.2. IMPORTANCE OF THE NUMBER OF REPLICATIONS IN EXPERIMENTATION 338
15.3. THE VALUE OF THE MEANS OF THE ALTERNATIVES TO BE USED TO REJECT H0 IS KNOWN 338
15.4. THE VALUE OF THE DIFFERENCE BETWEEN TWO MEANS OF THE ALTERNATIVES TO BE USED TO REJECT H0 IS KNOWN 341
15.5. THE PERCENTAGE VALUE TO BE EXCEEDED BY THE STANDARD DEVIATION TO BE USED TO REJECT H0 IS KNOWN 342
15.6. THE DIFFERENCE BETWEEN THE MEANS OF THE ALTERNATIVES TO BE USED TO REJECT H0 IS KNOWN FOR MORE THAN ONE FACTOR 343
15.7. SUGGESTED EXERCISES 345

PART IV: CONCLUSIONS

16. SOME RECOMMENDATIONS ON EXPERIMENTING
16.1. INTRODUCTION 349
16.2. PRECAUTIONS TO BE TAKEN INTO ACCOUNT IN SE EXPERIMENTS 349
16.3. A GUIDE TO DOCUMENTING EXPERIMENTATION 354

REFERENCES 359

ANNEXES
ANNEX I: SOME SOFTWARE PROJECT VARIABLES 367
ANNEX II: SOME USEFUL LATIN SQUARES AND HOW THEY ARE USED TO BUILD GRECO-LATIN AND HYPER-GRECO-LATIN SQUARES 379
ANNEX III: STATISTICAL TABLES 385

LIST OF FIGURES

Figure 1.1. The SE community structured similarly to other engineering communities
Figure 2.1. Iterative learning process
Figure 2.2. Experimentation/learning cycle
Figure 3.1. Process of experimentation in SE
Figure 3.2. Graph of the population of Oldenburg at the end of each year as a function of the number of storks observed in the same year (1930-36)
Figure 4.1. Relationship among Parameters, Factors and Response Variable in an Experimentation
Figure 4.2. External parameters
Figure 4.3. Internal parameters
Figure 5.1. Design of the first part of the study
Figure 5.2. Design of the second part of the study
Figure 5.3. Three-factor factorial design and two alternatives per factor
Figure 6.1. Distribution of the Z statistic
Figure 6.2. Student's t distribution for several values of v
Figure 6.3. Fisher's F distribution
Figure 6.4. Chi-square distribution for several values of v
Figure 6.5. Methods of analysis applicable according to the characteristics of the response variables
Figure 8.1. Point graph for all residuals
Figure 8.2. Residuals plotted as a function of estimated response variable values
Figure 8.3. Residuals graph with pattern
Figure 8.4. Funnel-shaped graph of residuals versus estimated values
Figure 8.5. Residuals graph for each language
Figure 8.6. Graph of residuals as a function of time
Figure 8.7. Sample means in relation to the reference t distribution
Figure 8.8. Reference distribution
Figure 9.1. Distribution of residuals against estimated values for our example

Figure 9.2. Graph of normal probability of residuals for our example
Figure 9.3. Graph of residuals by alternative and block for our example
Figure 9.4. Significant language in a t distribution with the scaling factor 0.47
Figure 10.1. Domain and estimation technique effects
Figure 10.2. Graph of errors and estimated values of the response variable in the unreplicated 2^4 example
Figure 10.3. Residual normal probability graph
Figure 10.4. Graph of residuals plotted against estimated response
Figure 10.5. Graphs of effect and interaction for our example
Figure 10.6. Graph without interaction between factors A and B, each with two alternatives
Figure 10.7. Effects of A, B and AC
Figure 10.8. Effect of the factors and interactions on normal probability paper
Figure 10.9. Normal probability residuals graph
Figure 10.10. Graph of residuals against estimated response
Figure 10.11. Graphs of principal effects and interactions
Figure 12.1. Normal probability graph of the effects of a 2^(5-1) design
Figure 12.2. Graph of normal probability of the 2^(5-1) experiment residuals
Figure 12.3. Graph of residuals plotted against predicted values for the 2^(5-1) design described
Figure 12.4. Graph of effects A, B, C and AB
Figure 13.1. Graph of interaction AB
Figure 14.1. Number of system releases
Figure II.1. Greco-Latin squares

LIST OF TABLES

Table 1.1. Percentage of faults in the car industry
Table 1.2. Summary of fallacies and rebuttals about computer science experimentation
Table 4.1. Examples of factors and parameters in real experiments
Table 4.2. Examples of response variables in SE experiments
Table 4.3. Examples of software attributes and metrics
Table 4.4. Measurement type scales
Table 4.5. Examples of GQM application to identify response variables in an experiment
Table 4.6. Examples of response variables in real SE experiments
Table 5.1. Different experimental designs
Table 5.2. Replications of each combination of factors
Table 5.3. Temporal distribution of the observations
Table 5.4. Possible factorial design
Table 5.5. Nested design
Table 5.6. Three hypothetical results of the experiment with A and B to study rv
Table 5.7. Suggested block design for the 2^k factorial design
Table 5.8. Sign table for two factors
Table 5.9. Sign table for the 2^3 design with two blocks of size 4
Table 5.10. 2 x 2 factorial experiment with repeated measures in blocks of size 2
Table 5.11. Another representation of the design in Table 5.10
Table 6.1. Examples of null and alternative hypotheses
Table 6.2. Critical levels of the normal distribution for unilateral and bilateral tests
Table 6.3. Expected frequencies according to H0 (there is no difference between tool use or otherwise)
Table 6.4. Observed frequencies
Table 6.5. Structure of the remainder of Part III
Table 7.1. Data on 20 projects (using process A and B)

Tables 7.2 - 7.17, 8.1 - 8.11 and 9.1 - 9.3:
The 210 observations taken from the historical data collected about the standard process A
Means of 10 consecutive components
Difference between means of consecutive groups
Results of a random experiment for comparing alternatives A and B to calculations
Accuracy of the estimate for 10 similar projects
Ratio of detected faults p
Number of seconds subjects looked at algorithm when answering each question part
Percentage of correct answers to all question parts
Mean confidence level for each question part
Number of seconds subjects took to answer questions
Number of times the algorithm was viewed when answering each question
Analysis of claim (a)
Analysis of claim (b)
Analysis of claim (c)
Analysis of claim (d)
Number of errors in 24 similar projects
Effects of the different programming language alternatives
Estimated values of Yij
Residuals associated with each observation
Analysis of variance table for one-factor experiments
Results of the analysis of variance
Results for good versus bad OO
Results for bad structured versus bad OO
Results for good structured versus good OO
Lines of code used with three programming languages
Coded productivity of 5 development tools
Data taken for the example of a design with one blocking variable
Effects of blocks and alternatives for our example
Experiment residuals for our example

Table 9.4. Analysis of variance by one factor and one block variable
Table 9.5. Results of the analysis of variance for our example
Table 9.6. Incorrect analysis by means of one factor randomised design
Table 9.7. Coded data for 5 x 5 Latin square of our example
Table 9.8. Results of the experiment with Latin squares in our example
Table 9.9. Analysis of variance of a replicated Latin square, with replication type (1)
Table 9.10. Analysis of variance of a replicated Latin square, with replication type (2)
Table 9.11. Analysis of variance of a replicated Latin square, with replication type (3)
Table 9.12. Greco-Latin square
Table 9.13. Greco-Latin square design for programming languages
Table 9.14. Analysis of variance for a Greco-Latin design
Table 9.15. Results of the analysis of variance for the Greco-Latin square
Table 9.16. Incomplete randomised block design for the programming language experiment
Table 9.17. Results of the approximate analysis of variance with a missing datum
Table 9.18. Balanced incomplete block design for the tools experiment
Table 9.19. Analysis of variance for the balanced incomplete block design
Table 9.20. Analysis of variance for the example in Table 9.18
Table 10.1. Data collected in a 3 x 4 experimental design
Table 10.2. Principal effects of the techniques and domain
Table 10.3. Effects of interaction αβ for our example
Table 10.4. Analysis of variance table for two factors
Table 10.5. Result of the analysis of variance for our example
Table 10.6. Experimental response variables
Table 10.7. Alternatives of the factors for our example
Table 10.8. Sign table for the 2^2 design of our example
Table 10.9. Residual calculation for our example
Table 10.10. Analysis of variance table for 2^2 design
Table 10.11. Results of the analysis of variance for our example
Table 10.12. Alternatives for three factors in our example
Table 10.13. Sign table for a 2^3 design

Table 10.14. Residual calculation
Table 10.15. Analysis of variance table for 2^k model of fixed effects
Table 10.16. Values of the analysis of variance for our example
Table 10.17. Results of the specimen 2^4 experimental design
Table 10.18. Sign table for a 2^4 design
Table 10.19. Effects of the factors and interactions of our 2^4 design
Table 10.20. Residuals related to the unreplicated 2^4 design in question
Table 10.21. Residual calculation for our example
Table 10.22. Table of analysis of variance for our example
Table 10.23. Analysis of variance for the replicated data of Table 10.21
Table 10.24. Experiment on how long it takes to make a change with proportional data
Table 10.25. Analysis of variance for the maintainability data in Table 10.23
Table 10.26. Values of nij for an unbalanced design
Table 10.27. Values of µij for an unbalanced design
Table 10.28. Analysis of variance summary for (Wood, 1997)
Table 10.29. Analysis of Variance of Inspection Technique and Specification
Table 10.30. Analysis of variance testing for sequence and interaction effects
Table 10.31. Improvement in productivity with five methodologies
Table 10.32. Percentage of reuse in a given application
Table 10.33. Effort employed
Table 11.1. Data gathered in a nested design
Table 11.2. Examples of residuals
Table 11.3. Table of analysis of variance for the two-stage nested design
Table 11.4. Analysis of variance for the data of example 12.1
Table 11.5. Reliability of disks from different suppliers
Table 12.1. Sign table for a 2^3 Experimental Design
Table 12.2. Sign table of a 2^(4-1) design (option 1)
Table 12.3. Sign table of a 2^(4-1) design (option 2)
Table 12.4. Sign table of a 2^(4-1) design (option 3)
Table 12.5. Sign table of a 2^(4-1) design (option 4)
Table 12.6. 2^(5-1) design
Table 12.7. Result of the analysis of variance for the example 2^(5-1) design
Table 12.8. Number of errors detected in 16 programs
Table 13.1. Factor alternatives to be considered

Table 13.2. Combination of alternatives related to 2^3 design with two blocks of size 4
Table 13.3. Calculation of the effects in a 2^3 design
Table 13.4. Table of analysis of variance for k factors with two alternatives, one block with two alternatives and r replications
Table 13.5. Analysis of variance for our example
Table 13.6. Design of the experiment described in (Basili, 1996)
Table 13.7. Results of the analysis of variance for the generic domain problems
Table 13.8. Results of the analysis of variance for the NASA problem domain
Table 14.1. Data on the percentage of errors detected by the two tools
Table 14.2. Data and ranks of the CASE tools testing experiment
Table 14.3. Errors detected per time unit across nine programs
Table 14.4. Kruskal-Wallis test result for development response variables
Table 14.5. Grades attained by two groups of students
Table 14.6. Time taken to specify a requirement
Table 14.7. Lines of code with two different languages
Table 15.2. Number of replications generated according to curves of constant power for one-factor experiments
Table 15.3. Parameters of the operating characteristic curve for the graphs in Annex III: two-factor fixed-effects model
Table 15.4. Number of replications for two-factor experiments generated using operating curves
Table 16.1. Questions to be addressed by experimental documentation
Table I.1. Possible values for problem parameters
Table I.2. Possible values for user variables
Table I.3. Possible values for information sources variables
Table I.4. Possible values for company variables
Table I.5. Possible values for software system variables
Table I.6. Possible values for user documentation parameters
Table I.7. Possible values for process variables
Table I.8. Possible values for the variables methods and tools
Table I.9. Possible values for personnel variables
Table I.10. Possible values for intermediate product variables

Table I.11. External parameters for the software application domain
Table I.12. Internal parameters for the software application domain
Table III.1. Normal Distribution
Table III.2. Normal Probability Paper
Table III.3. Student's t Distribution
Table III.4. Ordinate Values of the t Distribution
Table III.5. 90-Percentiles of the F(v1, v2) Distribution
Table III.6. 95-Percentiles of the F(v1, v2) Distribution
Table III.7. 99-Percentiles of the F(v1, v2) Distribution
Table III.8. Chi-square Distribution
Table III.9. Operating Characteristic Curves for Test on Main Effects
Table III.10. Wilcoxon Test

FOREWORD

Although the term "software engineering" was coined in 1968 at a NATO conference, the discipline of software engineering is still in an unfortunately prolonged adolescence. Practitioners and researchers list as major problems the same difficulties that were listed ten years ago, and ten years before that. There is very little consensus on which technologies are the most effective. Educators cannot agree on what prerequisites should be required for a computer science major, which languages should be taught, and which skills are the most valuable for good research and practice. And the major computing societies continue to bicker over what expertise is necessary for someone to be called a licensed software engineer. We need only look at other engineering disciplines to see that software development is more art than craft, and that we have a long way to go before we can rightly call ourselves "engineers."

However, the picture is not completely bleak. As Natalia Juristo and Ana Moreno point out in this solid introduction to experimentation, we can learn from other disciplines whose problems are similar to ours. We can recognize that there are ways to identify possible causative factors and to organize some of our research so that we can explore and discover the effectiveness of technologies in a quantitative, reproducible way. In other words, we can and should add organized, intellectual investigation to our gut-feel decision-making about what produces the best software.

That is not to say that we will find a one-size-fits-all approach to building good software. Indeed, we are likely to find that certain approaches work best in certain situations. Juristo and Moreno explain how a good experimental design will include capture of these situational factors, so that we view technological effectiveness in its organizational and human contexts. They clearly present many of the biases that are likely to affect the outcome of a study, and ways to avoid or moderate their effects, so that we see as much of the technology's effect as possible. Such valuable advice helps us to evaluate a technique or tool in our own backyards, reproducing a study to see how the technique or tool fares with our very own practices and projects. They also point out that our studies of effectiveness must be objective, where the creator of a new technique is not the only one evaluating it. Their underlying subtext is that good software engineering experimentation takes into account the ethics as well as the activities of an investigation.

If you are a researcher, you should master the approaches to empirical software engineering described by Juristo and Moreno. Just as any chemist or physicist knows how to collect and analyze data to confirm or refute underlying theories, you too should use these quantitative techniques to guide your investigations. Moreover, when other researchers follow the recommended approaches, you will have an easier time replicating existing studies and devising follow-on studies that expand what we know about software development and maintenance.

If you are a practitioner, the advice in this book will enable you to read and assess the studies you find in your journals and at your conferences. What are the most promising technologies for the kind of software you develop? For the constraints under which you develop it? What are the trade-offs involved in adopting new technology? And how can you create or replicate studies to verify that what you read about really happens on your projects?

Finally, if you are an educator, this book will help you to guide your students in understanding that software engineering is far more than simply having a good technology idea and trying it out on a project. They will see that software engineering can truly be engineering, built on a foundation of knowledge based on careful, repeatable studies. They will learn that software engineers do more than build products; they design and build products with confidence and quality.

Shari Lawrence Pfleeger
August 2000

ACKNOWLEDGEMENTS

So many people have provided help and support, in one way or another, in the writing of this book that listing them all would risk forgetting somebody; we would like to thank them all. We are particularly indebted to the many colleagues who have discussed these ideas with us directly: sharing their visions and considering their helpful comments have greatly improved the ideas we present in this book.