Test-Driven Development: An Empirical Evaluation of Agile Practice
Lech Madeyski
1st edition, 2009
E-book, English, 245 pages
ISBN: 978-3-642-04288-1
Publisher: Springer
Format: PDF
Copy protection: PDF watermark
Agile methods are gaining more and more interest both in industry and in research. Many industries are transforming their way of working from traditional waterfall projects of long duration to more incremental, iterative and agile practices. At the same time, the need to evaluate and to obtain evidence for different processes, methods and tools has been emphasized. Lech Madeyski offers the first in-depth evaluation of agile methods. He presents in detail the results of three different experiments, including concrete examples of how to conduct statistical analysis with meta-analysis or the SPSS package, using as evaluation indicators the number of acceptance tests passed (overall and per hour) and design complexity metrics. The book is appropriate for graduate students, researchers and advanced professionals in software engineering. It demonstrates the real benefits of agile software development, provides readers with in-depth insights into experimental methods in the context of agile development, and discusses various validity threats in empirical studies.
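The evaluation indicators mentioned above, the percentage of acceptance tests passed and the number of acceptance tests passed per development hour, together with the inverse-variance fixed-effects combination used when pooling effect sizes across experiments, can be sketched as follows. The function names and the sample figures are illustrative assumptions, not values taken from the book.

```python
def patp(tests_passed: int, tests_total: int) -> float:
    """Percentage of acceptance tests passed (PATP)."""
    return 100.0 * tests_passed / tests_total


def natpph(tests_passed: int, hours: float) -> float:
    """Number of acceptance tests passed per development hour (NATPPH)."""
    return tests_passed / hours


def fixed_effects_mean(effects, variances):
    """Inverse-variance weighted mean effect size (fixed-effects model).

    Each study's effect is weighted by the reciprocal of its variance,
    so more precise studies contribute more to the pooled estimate.
    """
    weights = [1.0 / v for v in variances]
    return sum(w * e for w, e in zip(weights, effects)) / sum(weights)


if __name__ == "__main__":
    print(patp(18, 24))       # 18 of 24 acceptance tests passed -> 75.0
    print(natpph(18, 6.0))    # 18 tests in 6 hours -> 3.0
    # Pooling two hypothetical per-experiment effect sizes with equal variance:
    print(fixed_effects_mean([0.30, 0.10], [0.04, 0.04]))  # -> 0.2
```

With equal variances the pooled estimate reduces to the plain mean; unequal variances shift it toward the more precise experiment, which is the rationale for the fixed-effects model discussed in the meta-analysis chapter.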
Lech Madeyski is Assistant Professor in the Software Engineering Department, Institute of Informatics, Wroclaw University of Technology, Poland. His current research interests include experimentation in software engineering, software metrics and models, software quality and testing, software product and process improvement, and agile software development methodologies (e.g., eXtreme Programming). He has published research papers in refereed software engineering journals (e.g., IET Software, Journal of Software Process: Improvement and Practice) and conferences (e.g., PROFES, XP, EuroSPI, CEE-SET). He has been a member of the program, steering, or organization committee of several software engineering conferences, such as PROFES (International Conference on Product Focused Software Process Improvement), ENASE (International Working Conference on Evaluation of Novel Approaches to Software Engineering), CEE-SET (Central and East-European Conference on Software Engineering Techniques), and BPSC (International Working Conference on Business Process and Services Computing). His paper at PROFES 2007 received the Best Paper Award.
Authors/Editors
Further information & material
1;Foreword;5
2;Preface;6
3;Acknowledgements;10
4;Contents;12
5;Acronyms;16
6;1 Introduction;18
6.1;1.1 Test-First Programming;18
6.1.1;1.1.1 Mechanisms Behind Test-First Programming that Motivate Research;19
6.2;1.2 Research Methodology;21
6.2.1;1.2.1 Empirical Software Engineering;21
6.2.2;1.2.2 Empirical Methods;22
6.2.2.1;1.2.2.1 Qualitative and Quantitative Research Paradigms;22
6.2.2.2;1.2.2.2 Fixed and Flexible Research Designs;22
6.2.2.3;1.2.2.3 Empirical Strategies;23
6.2.2.4;1.2.2.4 Between-Groups and Repeated Measures Experimental Designs;24
6.3;1.3 Software Measurement;25
6.3.1;1.3.1 Measurement Levels;25
6.3.2;1.3.2 Software Product Quality;26
6.3.2.1;1.3.2.1 ISO/IEC 9126;26
6.3.2.2;1.3.2.2 Test Code Metrics;28
6.3.2.3;1.3.2.3 Validity of Software Quality Standards;28
6.3.3;1.3.3 Software Development Productivity;29
6.4;1.4 Research Questions;30
6.5;1.5 Book Organization;30
6.6;1.6 Claimed Contributions;31
7;2 Related Work in Industrial and Academic Environments;32
7.1;2.1 Test-First Programming;32
7.2;2.2 Pair Programming;37
7.3;2.3 Summary;40
8;3 Research Goals, Conceptual Model and Variables Selection;42
8.1;3.1 Goals Definition;42
8.2;3.2 Conceptual Model;43
8.3;3.3 Variables Selection;45
8.3.1;3.3.1 Independent Variable (IV);45
8.3.1.1;3.3.1.1 Test-First and Test-Last Programming;45
8.3.1.2;3.3.1.2 Pair Programming and Solo Programming;47
8.3.2;3.3.2 Dependent Variables (DVs) --- From Goals to Dependent Variables;49
8.3.2.1;3.3.2.1 External Code Quality;49
8.3.2.2;3.3.2.2 Internal Code Quality;50
8.3.2.3;3.3.2.3 Development Speed;51
8.3.2.4;3.3.2.4 Thoroughness and Fault Detection Effectiveness of Unit Tests;51
8.3.3;3.3.3 Confounding Variables;53
9;4 Experiments Planning, Execution and Analysis Procedure;55
9.1;4.1 Context Information;55
9.2;4.2 Hypotheses;57
9.3;4.3 Measurement Tools;58
9.3.1;4.3.1 Aopmetrics;58
9.3.2;4.3.2 ActivitySensor and SmartSensor Plugins;59
9.3.3;4.3.3 Judy;59
9.4;4.4 Experiment Accounting;60
9.4.1;4.4.1 Goals;60
9.4.2;4.4.2 Subjects;60
9.4.3;4.4.3 Experimental Materials;60
9.4.4;4.4.4 Experimental Task;61
9.4.5;4.4.5 Hypotheses and Variables;61
9.4.6;4.4.6 Design of the Experiment;61
9.4.7;4.4.7 Experiment Operation;62
9.4.7.1;4.4.7.1 Preparation Phase;62
9.4.7.2;4.4.7.2 Execution Phase;62
9.5;4.5 Experiment Submission;62
9.5.1;4.5.1 Goals;63
9.5.2;4.5.2 Subjects;63
9.5.3;4.5.3 Experimental Materials;63
9.5.4;4.5.4 Experimental Task;64
9.5.5;4.5.5 Hypotheses and Variables;64
9.5.6;4.5.6 Design of the Experiment;64
9.5.7;4.5.7 Experiment Operation;64
9.5.7.1;4.5.7.1 Pre-study;65
9.5.7.2;4.5.7.2 Preparation Phase;65
9.5.7.3;4.5.7.3 Execution Phase;65
9.6;4.6 Experiment Smells&Library;66
9.6.1;4.6.1 Goals;66
9.6.2;4.6.2 Subjects;66
9.6.3;4.6.3 Experimental Materials;67
9.6.4;4.6.4 Experimental Tasks;67
9.6.5;4.6.5 Hypotheses and Variables;67
9.6.6;4.6.6 Design of the Experiment;68
9.6.7;4.6.7 Experiment Operation;68
9.6.7.1;4.6.7.1 Preparation Phase;68
9.6.7.2;4.6.7.2 Execution Phase;68
9.7;4.7 Analysis Procedure;69
9.7.1;4.7.1 Descriptive Statistics;69
9.7.2;4.7.2 Assumptions of Parametric Tests;69
9.7.3;4.7.3 Carry-Over Effect;70
9.7.4;4.7.4 Hypotheses Testing;71
9.7.5;4.7.5 Effect Sizes;71
9.7.6;4.7.6 Analysis of Covariance;72
9.7.6.1;4.7.6.1 Non-Parametric Analysis of Covariance;73
9.7.7;4.7.7 Process Conformance and Selective Analysis;73
9.7.8;4.7.8 Combining Empirical Evidence;76
10;5 Effect on the Percentage of Acceptance Tests Passed;77
10.1;5.1 Analysis of Experiment Accounting;77
10.1.1;5.1.1 Preliminary Analysis;77
10.1.1.1;5.1.1.1 Descriptive Statistics;78
10.1.1.2;5.1.1.2 Assumption Testing;80
10.1.1.3;5.1.1.3 Non-Parametric Analysis;81
10.1.1.4;5.1.1.4 Parametric Analysis;89
10.1.2;5.1.2 Selective Analysis;101
10.1.2.1;5.1.2.1 Descriptive Statistics;101
10.1.2.2;5.1.2.2 Assumption Testing;103
10.1.2.3;5.1.2.3 Analysis using Kruskal--Wallis and Mann--Whitney Tests;103
10.1.2.4;5.1.2.4 Rank-Transformed Analysis of Covariance;108
10.2;5.2 Analysis of Experiment Submission;117
10.2.1;5.2.1 Preliminary Analysis;117
10.2.1.1;5.2.1.1 Descriptive Statistics;117
10.2.1.2;5.2.1.2 Assumption Testing;118
10.2.1.3;5.2.1.3 Independent t-Test;119
10.2.1.4;5.2.1.4 Analysis of Variance;121
10.2.1.5;5.2.1.5 Analysis of Covariance;122
10.2.2;5.2.2 Selective Analysis;126
10.2.2.1;5.2.2.1 Descriptive Statistics;126
10.2.2.2;5.2.2.2 Assumption Testing;127
10.2.2.3;5.2.2.3 Analysis of Variance;128
10.2.2.4;5.2.2.4 Analysis of Covariance;129
10.3;5.3 Analysis of Experiment Smells&Library;132
10.3.1;5.3.1 Preliminary Analysis;133
10.3.1.1;5.3.1.1 Descriptive Statistics;133
10.3.1.2;5.3.1.2 Assumption Testing;135
10.3.1.3;5.3.1.3 Wilcoxon Signed-Rank Test;135
10.3.2;5.3.2 Selective Analysis;137
10.3.2.1;5.3.2.1 Descriptive Statistics;137
10.3.2.2;5.3.2.2 Assumption Testing;137
10.3.2.3;5.3.2.3 Wilcoxon Signed-Rank Test;139
10.4;5.4 Instead of Summary;141
11;6 Effect on the Number of Acceptance Tests Passed per Hour;142
11.1;6.1 Analysis of Experiment Accounting;142
11.1.1;6.1.1 Descriptive Statistics;143
11.1.2;6.1.2 Non-Parametric Analysis;143
11.2;6.2 Analysis of Experiment Submission;144
11.2.1;6.2.1 Descriptive Statistics;144
11.2.2;6.2.2 Assumption Testing;146
11.2.3;6.2.3 Non-Parametric Analysis;146
11.2.3.1;6.2.3.1 Mann--Whitney Test;146
11.2.3.2;6.2.3.2 Rank-Transformed Analysis of Covariance;147
11.3;6.3 Analysis of Experiment Smells&Library;151
11.3.1;6.3.1 Descriptive Statistics;151
11.3.2;6.3.2 Assumption Testing;153
11.3.3;6.3.3 Non-Parametric Analysis;154
11.3.3.1;6.3.3.1 Wilcoxon Signed-Rank Test;154
11.4;6.4 Instead of Summary;155
12;7 Effect on Internal Quality Indicators;156
12.1;7.1 Confounding Effect of Class Size on the Validity of Object-Oriented Metrics;156
12.2;7.2 Analysis of Experiment Accounting;157
12.2.1;7.2.1 Descriptive Statistics;157
12.2.2;7.2.2 Assumption Testing;160
12.2.3;7.2.3 Mann--Whitney Tests;160
12.2.3.1;7.2.3.1 Calculating Effect Size;161
12.2.3.2;7.2.3.2 Summary;162
12.3;7.3 Analysis of Experiment Submission;162
12.3.1;7.3.1 Descriptive Statistics;162
12.3.2;7.3.2 Assumption Testing;165
12.3.3;7.3.3 Independent t-Test;165
12.3.3.1;7.3.3.1 Calculating Effect Size;166
12.3.3.2;7.3.3.2 Summary;167
12.4;7.4 Analysis of Experiment Smells&Library;167
12.4.1;7.4.1 Descriptive Statistics;167
12.4.2;7.4.2 Assumption Testing;168
12.4.3;7.4.3 Dependent t-Test;170
12.4.3.1;7.4.3.1 Calculating Effect Size;171
12.4.3.2;7.4.3.2 Summary;172
12.5;7.5 Instead of Summary;173
13;8 Effects on Unit Tests -- Preliminary Analysis;174
13.1;8.1 Analysis of Experiment Submission;175
13.1.1;8.1.1 Descriptive Statistics;175
13.1.2;8.1.2 Assumption Testing;177
13.1.3;8.1.3 Mann--Whitney Test;178
13.1.3.1;8.1.3.1 Calculating Effect Size;178
13.1.3.2;8.1.3.2 Summary;179
14;9 Meta-Analysis;180
14.1;9.1 Introduction to Meta-Analysis;181
14.1.1;9.1.1 Combining p-Values Across Experiments;181
14.1.2;9.1.2 Combining Effect Sizes Across Experiments;182
14.1.2.1;9.1.2.1 Fixed Effects Model;183
14.1.2.2;9.1.2.2 Homogeneity Analysis;184
14.1.2.3;9.1.2.3 Random Effects Model;185
14.2;9.2 Preliminary Meta-Analysis;186
14.2.1;9.2.1 Combining Effects on the Percentage of Acceptance Tests Passed (PATP);186
14.2.1.1;9.2.1.1 Combining p-Values Across Experiments;186
14.2.1.2;9.2.1.2 Combining Effect Sizes Across Experiments -- Fixed Effects Model;187
14.2.1.3;9.2.1.3 Combining Effect Sizes Across Experiments -- Random Effects Model;189
14.2.1.4;9.2.1.4 Summary;189
14.2.2;9.2.2 Combining Effects on the Number of Acceptance Tests Passed Per Development Hour (NATPPH);190
14.2.2.1;9.2.2.1 Combining p-Values Across Experiments;190
14.2.2.2;9.2.2.2 Combining Effect Sizes Across Experiments -- Fixed Effects Model;190
14.2.2.3;9.2.2.3 Combining Effect Sizes Across Experiments -- Random Effects Model;191
14.2.2.4;9.2.2.4 Summary;192
14.2.3;9.2.3 Combining Effects on Design Complexity;192
14.2.3.1;9.2.3.1 Combining p-Values Across Experiments;192
14.2.3.2;9.2.3.2 Combining Effect Sizes Across Experiments -- Fixed Effects Model;194
14.2.3.3;9.2.3.3 Combining Effect Sizes Across Experiments -- Random Effects Model;196
14.2.3.4;9.2.3.4 Summary;198
14.3;9.3 Selective Meta-Analysis;199
14.3.1;9.3.1 Combining Effects on the Percentage of Acceptance Tests Passed (PATP);200
14.3.1.1;9.3.1.1 Combining p-Values Across Experiments;200
14.3.1.2;9.3.1.2 Combining Effect Sizes Across Experiments -- Fixed Effects Model;200
14.3.1.3;9.3.1.3 Combining Effect Sizes Across Experiments -- Random Effects Model;201
14.3.1.4;9.3.1.4 Summary;201
14.3.2;9.3.2 Combining Effects on the Number of Acceptance Tests Passed Per Hour (NATPPH);202
14.3.2.1;9.3.2.1 Combining p-Values Across Experiments;202
14.3.2.2;9.3.2.2 Combining Effect Sizes Across Experiments -- Fixed Effects Model;202
14.3.2.3;9.3.2.3 Combining Effect Sizes Across Experiments -- Random Effects Model;203
14.3.2.4;9.3.2.4 Summary;203
14.3.3;9.3.3 Combining Effects on Design Complexity;203
14.3.3.1;9.3.3.1 Combining p-Values Across Experiments;204
14.3.3.2;9.3.3.2 Combining Effect Sizes Across Experiments -- Fixed Effects Model;205
14.3.3.3;9.3.3.3 Combining Effect Sizes Across Experiments -- Random Effects Model;208
14.3.3.4;9.3.3.4 Summary;210
15;10 Discussion, Conclusions and Future Work;211
15.1;10.1 Overview of Results;211
15.2;10.2 Rules of Thumb for Industry Practitioners;214
15.3;10.3 Explaining Plausible Mechanisms Behind the Results;216
15.4;10.4 Contributions;219
15.5;10.5 Threats to Validity;220
15.5.1;10.5.1 Statistical Conclusion Validity;220
15.5.1.1;10.5.1.1 Low Statistical Power;221
15.5.1.2;10.5.1.2 Violated Assumptions of Statistical Tests;221
15.5.1.3;10.5.1.3 Fishing and the Error Rate Problem;221
15.5.1.4;10.5.1.4 Reliability of Measures;222
15.5.1.5;10.5.1.5 Restriction of Range;222
15.5.1.6;10.5.1.6 Reliability of Treatment Implementation;222
15.5.1.7;10.5.1.7 Random Irrelevancies in Experimental Setting;222
15.5.1.8;10.5.1.8 Random Heterogeneity of Subjects;222
15.5.1.9;10.5.1.9 Inaccurate Effect Size Estimation;223
15.5.2;10.5.2 Internal Validity;223
15.5.2.1;10.5.2.1 Ambiguous Temporal Precedence;223
15.5.2.2;10.5.2.2 Selection;223
15.5.2.3;10.5.2.3 History;223
15.5.2.4;10.5.2.4 Maturity;223
15.5.2.5;10.5.2.5 Regression Artefacts;224
15.5.2.6;10.5.2.6 Attrition;224
15.5.2.7;10.5.2.7 Testing;224
15.5.2.8;10.5.2.8 Instrumentation;224
15.5.2.9;10.5.2.9 Additive and Interactive Effects of Threats;224
15.5.3;10.5.3 Construct Validity;225
15.5.3.1;10.5.3.1 Mono-Operation Bias;225
15.5.3.2;10.5.3.2 Mono-method Bias;225
15.5.3.3;10.5.3.3 Construct Confounding;225
15.5.3.4;10.5.3.4 Confounding Constructs with Levels of Constructs;226
15.5.3.5;10.5.3.5 Reactivity to the Experimental Situation and Hypothesis Guessing;226
15.5.3.6;10.5.3.6 Experimenter Expectancies;226
15.5.3.7;10.5.3.7 Compensatory Equalization;226
15.5.3.8;10.5.3.8 Compensatory Rivalry and Resentful Demoralization;226
15.5.3.9;10.5.3.9 Treatment Diffusion;227
15.5.4;10.5.4 External Validity;227
15.5.4.1;10.5.4.1 Generalization to Industrial Setting;227
15.5.4.2;10.5.4.2 Relevance to Industry;229
15.5.5;10.5.5 Threats to Validity of Meta-Analysis;230
15.5.5.1;10.5.5.1 Inadequate Conceptualization of the Problem;230
15.5.5.2;10.5.5.2 Inadequate Assessment of Study Quality;231
15.5.5.3;10.5.5.3 Publication Bias;231
15.5.5.4;10.5.5.4 Dissemination Bias;231
15.6;10.6 Conclusions and Future Work;231
16;Appendix;233
17;Glossary;237
18;References;241
19;Index;256