E-book, English, 354 pages
Thas Comparing Distributions
1st edition 2010
ISBN: 978-0-387-92710-7
Publisher: Springer US
Format: PDF
Copy protection: 1 - PDF Watermark
Series: Springer Series in Statistics
Provides a self-contained, comprehensive treatment of both one-sample and K-sample goodness-of-fit methods by linking them to a common theoretical backbone. Contains many data examples, including R code and a specific R package for comparing distributions. Emphasises informative statistical analysis rather than plain statistical hypothesis testing.
Further Information & Material
1;Preface;6
2;Contents;11
3;Part I One-Sample Problems;17
3.1;1 Introduction;18
3.1.1;1.1 The History of the One-Sample GOF Problem;18
3.1.2;1.2 Example Datasets;19
3.1.2.1;1.2.1 Pseudo-Random Generator Data;19
3.1.2.2;1.2.2 PCB Concentration Data;20
3.1.2.3;1.2.3 Pulse Rate Data;20
3.1.2.4;1.2.4 Cultivars Data;21
3.1.3;1.3 The Pearson Chi-Squared Test;23
3.1.3.1;1.3.1 Pearson Chi-Squared Test for the Multinomial Distribution;23
3.1.3.1.1;1.3.1.1 The Simple Null Hypothesis Case;23
3.1.3.1.2;1.3.1.2 The Composite Null Hypothesis Case;25
3.1.3.2;1.3.2 Generalisations of the Pearson χ² Test;28
3.1.3.3;1.3.3 A Note on the Nuisance Parameter Estimation;29
3.1.4;1.4 Pearson X² Tests for Continuous Distributions;30
3.2;2 Preliminaries (Building Blocks);33
3.2.1;2.1 The Empirical Distribution Function;33
3.2.1.1;2.1.1 Definition and Construction;33
3.2.1.2;2.1.2 Rationale for Using the EDF;35
3.2.2;2.2 Empirical Processes;36
3.2.2.1;2.2.1 Definition;36
3.2.2.2;2.2.2 Weak Convergence;37
3.2.2.3;2.2.3 Kac–Siegert Decomposition of Gaussian Processes;38
3.2.3;2.3 The Quantile Function and the Quantile Process;41
3.2.3.1;2.3.1 The Quantile Function and Its Estimator;41
3.2.3.2;2.3.2 The Quantile Process;42
3.2.4;2.4 Comparison Distribution;43
3.2.5;2.5 Hilbert Spaces;44
3.2.6;2.6 Orthonormal Functions;47
3.2.6.1;2.6.1 The Fourier Basis;47
3.2.6.2;2.6.2 Orthonormal Polynomials;47
3.2.7;2.7 Parameter Estimation;48
3.2.7.1;2.7.1 Locally Asymptotically Linear Estimators;48
3.2.7.2;2.7.2 Method of Moments Estimators;49
3.2.7.3;2.7.3 Efficiency and Semiparametric Inference;50
3.2.8;2.8 Nonparametric Density Estimation;51
3.2.8.1;2.8.1 Introduction;51
3.2.8.2;2.8.2 Orthogonal Series Estimators;53
3.2.8.3;2.8.3 Kernel Density Estimation;56
3.2.8.4;2.8.4 Regression-Based Density Estimation;56
3.2.9;2.9 Hypothesis Testing;56
3.2.9.1;2.9.1 General Construction of a Hypothesis Test;57
3.2.9.2;2.9.2 Optimality Criteria;58
3.2.9.2.1;2.9.2.1 Finite Sample Criteria;58
3.2.9.2.2;2.9.2.2 Asymptotic Criteria;59
3.2.9.3;2.9.3 The Neyman–Pearson Lemma;61
3.3;3 Graphical Tools;62
3.3.1;3.1 Histograms and Box Plots;62
3.3.1.1;3.1.1 The Histogram;62
3.3.1.1.1;3.1.1.1 The Construction;62
3.3.1.1.2;3.1.1.2 Some Properties;63
3.3.1.1.3;3.1.1.3 Regression-Based Density Estimation;65
3.3.1.2;3.1.2 The Box Plot;65
3.3.2;3.2 Probability Plots and Comparison Distribution;69
3.3.2.1;3.2.1 Population Probability Plots;69
3.3.2.2;3.2.2 PP and QQ Plots;70
3.3.3;3.3 Comparison Distribution;75
3.3.3.1;3.3.1 Population Comparison Distributions;75
3.3.3.1.1;3.3.1.1 Definition and Interpretation;75
3.3.3.1.2;3.3.1.2 Decomposition of the Comparison Density;76
3.3.3.2;3.3.2 Empirical Comparison Distributions;81
3.3.3.2.1;3.3.2.1 Estimators of the Comparison Density;81
3.3.3.2.2;3.3.2.2 Confidence Intervals of the Comparison Density;82
3.3.3.3;3.3.3 Comparison Distribution for Discrete Data;86
3.4;4 Smooth Tests;89
3.4.1;4.1 Smooth Models;89
3.4.1.1;4.1.1 Construction of the Smooth Model;89
3.4.2;4.2 Smooth Tests;94
3.4.2.1;4.2.1 Simple Null Hypotheses;94
3.4.2.1.1;4.2.1.1 Test Statistics and Null Distributions;94
3.4.2.1.2;4.2.1.2 Interpretation of Components;95
3.4.2.1.3;4.2.1.3 Interpretation of Components when Orthonormal Polynomials Are Used;96
3.4.2.2;4.2.2 Composite Null Hypotheses;100
3.4.2.2.1;4.2.2.1 Maximum Likelihood and Method of Moments Estimators;100
3.4.2.2.2;4.2.2.2 The Efficient Score Test;102
3.4.2.2.3;4.2.2.3 The Generalised Score Test;104
3.4.3;4.3 Adaptive Smooth Tests;107
3.4.3.1;4.3.1 Consistency, Dilution Effects and Order Selection;107
3.4.3.2;4.3.2 Order Selection Within a Finite Horizon;110
3.4.3.3;4.3.3 Order Selection Within an Infinite Horizon;114
3.4.3.4;4.3.4 Subset Selection Within a Finite Horizon;115
3.4.3.5;4.3.5 Improved Density Estimates;119
3.4.4;4.4 Smooth Tests for Discrete Distributions;120
3.4.4.1;4.4.1 Introduction;120
3.4.4.2;4.4.2 The Simple Null Hypothesis Case;120
3.4.4.3;4.4.3 The Composite Null Hypothesis Case;121
3.4.5;4.5 A Semiparametric Framework;123
3.4.5.1;4.5.1 The Semiparametric Hypotheses;123
3.4.5.2;4.5.2 Semiparametric Tests;124
3.4.5.3;4.5.3 A Distance Function;126
3.4.5.4;4.5.4 Interpretation and Estimation of the Nuisance Parameter;126
3.4.5.5;4.5.5 The Quadratic Inference Function;127
3.4.5.6;4.5.6 Relation with the Empirically Rescaled Smooth Tests;128
3.4.6;4.6 Example;129
3.4.7;4.7 Some Practical Guidelines for Smooth Tests;133
3.5;5 Methods Based on the Empirical Distribution Function;135
3.5.1;5.1 The Kolmogorov–Smirnov Test;135
3.5.1.1;5.1.1 Definition;135
3.5.1.2;5.1.2 Null Distribution;137
3.5.1.3;5.1.3 Presence of Nuisance Parameters;139
3.5.2;5.2 Tests as Integrals of Empirical Processes;141
3.5.2.1;5.2.1 The Anderson–Darling Statistics;141
3.5.2.2;5.2.2 Principal Components Decomposition of the Test Statistic;142
3.5.2.2.1;5.2.2.1 Principal Components Decomposition of the Cramér–von Mises Statistic (Simple Null);143
3.5.2.2.2;5.2.2.2 Principal Components Decomposition of the Anderson–Darling Statistic (Simple Null);145
3.5.2.2.3;5.2.2.3 Principal Components Decompositions for Composite Null Hypotheses;146
3.5.2.3;5.2.3 Null Distribution;149
3.5.2.4;5.2.4 The Watson Test;154
3.5.2.4.1;5.2.4.1 The Test Statistic;154
3.5.2.4.2;5.2.4.2 Principal Components Decomposition of the Watson Statistic (Simple Null);155
3.5.2.4.3;5.2.4.3 Null Distribution (Simple Null);156
3.5.3;5.3 Generalisations of EDF Tests;156
3.5.3.1;5.3.1 Tests Based on the Empirical Quantile Function (EQF);157
3.5.3.1.1;5.3.1.1 The Empirical Quantile Function;157
3.5.3.1.2;5.3.1.2 EQF Tests for the Simple Null Hypothesis;158
3.5.3.1.3;5.3.1.3 EQF Tests for Location-Scale Distributions;160
3.5.3.2;5.3.2 Tests Based on the Empirical Characteristic Function (ECF);163
3.5.3.3;5.3.3 Miscellaneous Tests Based on Empirical Functionals of F;165
3.5.4;5.4 The Sample Space Partition Tests;167
3.5.4.1;5.4.1 Another Look at the Anderson–Darling Statistic;167
3.5.4.2;5.4.2 The Sample Space Partition Test;167
3.5.5;5.5 Some Further Bibliographic Notes;170
3.5.6;5.6 Some Practical Guidelines for EDF Tests;171
4;Part II Two-Sample and K-Sample Problems;173
4.1;6 Introduction;174
4.1.1;6.1 The Problem Defined;175
4.1.1.1;6.1.1 The Null Hypothesis of the General Two-Sample Problem;175
4.1.1.2;6.1.2 The Null Hypothesis of the General K-Sample Problem;176
4.1.2;6.2 Example Datasets;177
4.1.2.1;6.2.1 Gene Expression in Colorectal Cancer Patients;177
4.1.2.2;6.2.2 Travel Times;178
4.2;7 Preliminaries (Building Blocks);181
4.2.1;7.1 Permutation Tests;181
4.2.1.1;7.1.1 Introduction by Example;181
4.2.1.2;7.1.2 Some Permutation and Randomisation Test Theory;185
4.2.1.2.1;7.1.2.1 Definitions;185
4.2.1.2.2;7.1.2.2 Construction of the Permutation Test;186
4.2.1.2.3;7.1.2.3 Monte Carlo Approximation to the Exact Permutation Null Distribution;187
4.2.2;7.2 Linear Rank Tests;189
4.2.2.1;7.2.1 Simple Linear Rank Statistics;189
4.2.2.1.1;7.2.1.1 Ranks and Order Statistics;189
4.2.2.1.2;7.2.1.2 Simple Linear Rank Statistics;192
4.2.2.1.3;7.2.1.3 Score Generating Functions;194
4.2.2.1.4;7.2.1.4 The Rank Score Process;195
4.2.2.2;7.2.2 Locally Most Powerful Linear Rank Tests;197
4.2.2.2.1;7.2.2.1 Locally Most Powerful Linear Rank Tests for General Alternatives;197
4.2.2.3;7.2.3 Adaptive Linear Rank Tests;200
4.2.3;7.3 The Pooled Empirical Distribution Function;200
4.2.4;7.4 The Comparison Distribution;201
4.2.5;7.5 The Quantile Process;202
4.2.5.1;7.5.1 Contrast Processes;202
4.2.5.2;7.5.2 Comparison Distribution Processes;204
4.2.5.2.1;7.5.2.1 Construction;204
4.2.5.2.2;7.5.2.2 Weak Convergence;205
4.2.6;7.6 Stochastic Ordering and Related Properties;206
4.3;8 Graphical Tools;210
4.3.1;8.1 PP and QQ Plots;210
4.3.1.1;8.1.1 Population Plots;210
4.3.1.1.1;8.1.1.1 Population QQ Plot;210
4.3.1.1.2;8.1.1.2 Population PP Plot;212
4.3.1.2;8.1.2 Empirical PP and QQ Plots;214
4.3.1.2.1;8.1.2.1 Construction;214
4.3.1.2.2;8.1.2.2 Sample Size Issues;215
4.3.1.2.3;8.1.2.3 When to Use Which Plot;218
4.3.2;8.2 Comparison Distributions;222
4.3.2.1;8.2.1 The Population Comparison Distribution;222
4.3.2.2;8.2.2 The Empirical Comparison Distribution;222
4.4;9 Some Important Two-Sample Tests;229
4.4.1;9.1 The Relation Between Statistical Tests and Hypotheses;230
4.4.1.1;9.1.1 Introduction;230
4.4.2;9.2 The Wilcoxon Rank Sum and the Mann–Whitney Tests;233
4.4.2.1;9.2.1 Introduction;233
4.4.2.2;9.2.2 The Hypotheses;234
4.4.2.3;9.2.3 The Test Statistics;235
4.4.2.4;9.2.4 The Null Distribution;236
4.4.2.5;9.2.5 The WMW Test as a LMPRT;238
4.4.2.6;9.2.6 The MW Statistic as an Estimator of ;240
4.4.2.7;9.2.7 The Hodges--Lehmann Estimator;242
4.4.2.8;9.2.8 Examples;242
4.4.3;9.3 The Diagnostic Property of Two-Sample Tests;251
4.4.3.1;9.3.1 The Semiparametric Framework;252
4.4.3.2;9.3.2 Natural and Implied Null Hypotheses;254
4.4.3.3;9.3.3 The WMW Test in the Semiparametric Framework;254
4.4.3.3.1;9.3.3.1 Implied Null Hypothesis;255
4.4.3.3.2;9.3.3.2 Null Distributions;255
4.4.3.4;9.3.4 Empirical Variance Estimators of Simple Linear Rank Statistics;258
4.4.3.4.1;9.3.4.1 The Asymptotic Variance of a Simple Linear Rank Statistic;258
4.4.3.4.2;9.3.4.2 The Jackknife Estimator of the Asymptotic Variance;260
4.4.4;9.4 Optimal Linear Rank Tests for Normal Location-Shift Models;261
4.4.5;9.5 Rank Tests for Scale Differences;262
4.4.5.1;9.5.1 The Scale-Difference Model;263
4.4.5.2;9.5.2 The Capon and Klotz Tests;264
4.4.5.3;9.5.3 Some Other Important Tests;265
4.4.5.3.1;9.5.3.1 Measures for Differences in Scale;265
4.4.5.3.2;9.5.3.2 The Ansari–Bradley Test;267
4.4.5.3.3;9.5.3.3 The Sukhatme Test;269
4.4.5.3.4;9.5.3.4 The Mood Test;270
4.4.5.3.5;9.5.3.5 The Lehmann Test;272
4.4.5.3.6;9.5.3.6 The Fligner–Killeen Test;272
4.4.5.4;9.5.4 Conclusion;273
4.4.6;9.6 The Kruskal–Wallis Test and the ANOVA F-Test;273
4.4.6.1;9.6.1 The Hypotheses and the Test Statistic;274
4.4.6.2;9.6.2 The Null Distribution;275
4.4.6.3;9.6.3 The Diagnostic Property;275
4.4.6.4;9.6.4 The F-Test in ANOVA;276
4.4.7;9.7 Some Final Remarks;277
4.4.7.1;9.7.1 Adaptive Tests;277
4.4.7.2;9.7.2 The Lepage Test;278
4.5;10 Smooth Tests;279
4.5.1;10.1 Smooth Tests for the 2-Sample Problem;279
4.5.1.1;10.1.1 Smooth Models and the Smooth Test;279
4.5.1.1.1;10.1.1.1 Smooth Models;279
4.5.1.1.2;10.1.1.2 Smooth Test Statistic and the Null Distribution;282
4.5.1.2;10.1.2 Components;283
4.5.1.2.1;10.1.2.1 The First Component: WMW Statistic;284
4.5.1.2.2;10.1.2.2 The Second Component: Mood Statistic;284
4.5.1.2.3;10.1.2.3 The Third Component: the SKEW Statistic;285
4.5.1.2.4;10.1.2.4 The Fourth Component: the KURT Statistic;286
4.5.2;10.2 The Diagnostic Property;286
4.5.2.1;10.2.1 Examples;287
4.5.3;10.3 Smooth Tests for the K-Sample Problem;290
4.5.3.1;10.3.1 Smooth Models and the Smooth Test;290
4.5.3.2;10.3.2 Components;294
4.5.4;10.4 Adaptive Smooth Tests;296
4.5.4.1;10.4.1 Order Selection and Subset Selection with a Finite Horizon;296
4.5.4.2;10.4.2 Order Selection with an Infinite Horizon;297
4.5.5;10.5 Examples;298
4.5.6;10.6 Smooth Tests That Are Not Based on Ranks;302
4.5.7;10.7 Some Practical Guidelines for Smooth Tests;303
4.6;11 Methods Based on the Empirical Distribution Function;305
4.6.1;11.1 The Two-Sample and K-Sample Kolmogorov–Smirnov Test;305
4.6.1.1;11.1.1 The Kolmogorov–Smirnov Test for the Two-Sample Problem;305
4.6.1.1.1;11.1.1.1 The Test Statistic;305
4.6.1.1.2;11.1.1.2 The Null Distribution;306
4.6.1.2;11.1.2 The Kolmogorov–Smirnov Test for the K-Sample Problem;307
4.6.2;11.2 Tests of the Anderson–Darling Type;307
4.6.2.1;11.2.1 The Test Statistic;307
4.6.2.2;11.2.2 The Components;309
4.6.2.3;11.2.3 The Null Distribution;311
4.6.2.4;11.2.4 Examples;312
4.6.3;11.3 Adaptive Tests of Neuhaus;314
4.6.3.1;11.3.1 The General Idea;314
4.6.3.2;11.3.2 Smooth Tests;316
4.6.3.3;11.3.3 EDF Tests;316
4.6.4;11.4 Some Practical Guidelines for EDF Tests;317
4.7;12 Two Final Methods and Some Final Thoughts;319
4.7.1;12.1 A Contingency Table Approach;319
4.7.2;12.2 The Sample Space Partition Tests;321
4.7.3;12.3 Some Final Thoughts and Conclusions;323
5;A Proofs;328
5.1;A.1 Proof of Theorem 1.1;328
5.2;A.2 Proof of Theorem 1.2;329
5.3;A.3 Proof of Theorem 4.1;330
5.4;A.4 Proof of Lemma 4.1;331
5.5;A.5 Proof of Lemma 4.2;332
5.6;A.6 Proof of Lemma 4.3;332
5.7;A.7 Proof of Theorem 4.10;333
5.8;A.8 Proof of Theorem 4.2;333
5.9;A.9 Heuristic Proof of Theorem 5.2;338
5.10;A.10 Proof of Theorem 9.1;339
6;B The Bootstrap and Other Simulation Techniques;341
6.1;B.1 Simulation of EDF Statistics Under the Simple Null Hypothesis;341
6.2;B.2 The Parametric Bootstrap for Composite Null Hypotheses;342
6.3;B.3 A Modified Nonparametric Bootstrap for Testing Semiparametric Null Hypotheses;342
7;References;344
8;Index;355




