1. Exploratory Data Analysis - Detailed Table of Contents [1.]
This chapter presents the assumptions, principles, and techniques
necessary to gain insight into data via exploratory data analysis (EDA).
- EDA Introduction [1.1.]
- What is EDA? [1.1.1.]
- How Does Exploratory Data Analysis Differ from Classical Data Analysis? [1.1.2.]
- Model [1.1.2.1.]
- Focus [1.1.2.2.]
- Techniques [1.1.2.3.]
- Rigor [1.1.2.4.]
- Data Treatment [1.1.2.5.]
- Assumptions [1.1.2.6.]
- How Does Exploratory Data Analysis Differ from Summary Analysis? [1.1.3.]
- What are the EDA Goals? [1.1.4.]
- The Role of Graphics [1.1.5.]
- An EDA/Graphics Example [1.1.6.]
- General Problem Categories [1.1.7.]
- EDA Assumptions [1.2.]
- Underlying Assumptions [1.2.1.]
- Importance [1.2.2.]
- Techniques for Testing Assumptions [1.2.3.]
- Interpretation of 4-Plot [1.2.4.]
- Consequences [1.2.5.]
- Consequences of Non-Randomness [1.2.5.1.]
- Consequences of Non-Fixed Location Parameter [1.2.5.2.]
- Consequences of Non-Fixed Variation Parameter [1.2.5.3.]
- Consequences Related to Distributional Assumptions [1.2.5.4.]
- EDA Techniques [1.3.]
- Introduction [1.3.1.]
- Analysis Questions [1.3.2.]
- Graphical Techniques: Alphabetic [1.3.3.]
- Autocorrelation Plot [1.3.3.1.]
- Autocorrelation Plot: Random Data [1.3.3.1.1.]
- Autocorrelation Plot: Moderate Autocorrelation [1.3.3.1.2.]
- Autocorrelation Plot: Strong Autocorrelation and Autoregressive Model [1.3.3.1.3.]
- Autocorrelation Plot: Sinusoidal Model [1.3.3.1.4.]
- Bihistogram [1.3.3.2.]
- Block Plot [1.3.3.3.]
- Bootstrap Plot [1.3.3.4.]
- Box-Cox Linearity Plot [1.3.3.5.]
- Box-Cox Normality Plot [1.3.3.6.]
- Box Plot [1.3.3.7.]
- Complex Demodulation Amplitude Plot [1.3.3.8.]
- Complex Demodulation Phase Plot [1.3.3.9.]
- Contour Plot [1.3.3.10.]
- DOE Contour Plot [1.3.3.10.1.]
- DOE Scatter Plot [1.3.3.11.]
- DOE Mean Plot [1.3.3.12.]
- DOE Standard Deviation Plot [1.3.3.13.]
- Histogram [1.3.3.14.]
- Histogram Interpretation: Normal [1.3.3.14.1.]
- Histogram Interpretation: Symmetric, Non-Normal, Short-Tailed [1.3.3.14.2.]
- Histogram Interpretation: Symmetric, Non-Normal, Long-Tailed [1.3.3.14.3.]
- Histogram Interpretation: Symmetric and Bimodal [1.3.3.14.4.]
- Histogram Interpretation: Bimodal Mixture of 2 Normals [1.3.3.14.5.]
- Histogram Interpretation: Skewed (Non-Normal) Right [1.3.3.14.6.]
- Histogram Interpretation: Skewed (Non-Symmetric) Left [1.3.3.14.7.]
- Histogram Interpretation: Symmetric with Outlier [1.3.3.14.8.]
- Lag Plot [1.3.3.15.]
- Lag Plot: Random Data [1.3.3.15.1.]
- Lag Plot: Moderate Autocorrelation [1.3.3.15.2.]
- Lag Plot: Strong Autocorrelation and Autoregressive Model [1.3.3.15.3.]
- Lag Plot: Sinusoidal Models and Outliers [1.3.3.15.4.]
- Linear Correlation Plot [1.3.3.16.]
- Linear Intercept Plot [1.3.3.17.]
- Linear Slope Plot [1.3.3.18.]
- Linear Residual Standard Deviation Plot [1.3.3.19.]
- Mean Plot [1.3.3.20.]
- Normal Probability Plot [1.3.3.21.]
- Normal Probability Plot: Normally Distributed Data [1.3.3.21.1.]
- Normal Probability Plot: Data Have Short Tails [1.3.3.21.2.]
- Normal Probability Plot: Data Have Long Tails [1.3.3.21.3.]
- Normal Probability Plot: Data are Skewed Right [1.3.3.21.4.]
- Probability Plot [1.3.3.22.]
- Probability Plot Correlation Coefficient Plot [1.3.3.23.]
- Quantile-Quantile Plot [1.3.3.24.]
- Run-Sequence Plot [1.3.3.25.]
- Scatter Plot [1.3.3.26.]
- Scatter Plot: No Relationship [1.3.3.26.1.]
- Scatter Plot: Strong Linear (positive correlation) Relationship [1.3.3.26.2.]
- Scatter Plot: Strong Linear (negative correlation) Relationship [1.3.3.26.3.]
- Scatter Plot: Exact Linear (positive correlation) Relationship [1.3.3.26.4.]
- Scatter Plot: Quadratic Relationship [1.3.3.26.5.]
- Scatter Plot: Exponential Relationship [1.3.3.26.6.]
- Scatter Plot: Sinusoidal Relationship (damped) [1.3.3.26.7.]
- Scatter Plot: Variation of Y Does Not Depend on X (homoscedastic) [1.3.3.26.8.]
- Scatter Plot: Variation of Y Does Depend on X (heteroscedastic) [1.3.3.26.9.]
- Scatter Plot: Outlier [1.3.3.26.10.]
- Scatterplot Matrix [1.3.3.26.11.]
- Conditioning Plot [1.3.3.26.12.]
- Spectral Plot [1.3.3.27.]
- Spectral Plot: Random Data [1.3.3.27.1.]
- Spectral Plot: Strong Autocorrelation and Autoregressive Model [1.3.3.27.2.]
- Spectral Plot: Sinusoidal Model [1.3.3.27.3.]
- Standard Deviation Plot [1.3.3.28.]
- Star Plot [1.3.3.29.]
- Weibull Plot [1.3.3.30.]
- Youden Plot [1.3.3.31.]
- DOE Youden Plot [1.3.3.31.1.]
- 4-Plot [1.3.3.32.]
- 6-Plot [1.3.3.33.]
- Graphical Techniques: By Problem Category [1.3.4.]
- Quantitative Techniques [1.3.5.]
- Measures of Location [1.3.5.1.]
- Confidence Limits for the Mean [1.3.5.2.]
- Two-Sample t-Test for Equal Means [1.3.5.3.]
- Data Used for Two-Sample t-Test [1.3.5.3.1.]
- One-Factor ANOVA [1.3.5.4.]
- Multi-factor Analysis of Variance [1.3.5.5.]
- Measures of Scale [1.3.5.6.]
- Bartlett's Test [1.3.5.7.]
- Chi-Square Test for the Standard Deviation [1.3.5.8.]
- Data Used for Chi-Square Test for the Standard Deviation [1.3.5.8.1.]
- F-Test for Equality of Two Standard Deviations [1.3.5.9.]
- Levene Test for Equality of Variances [1.3.5.10.]
- Measures of Skewness and Kurtosis [1.3.5.11.]
- Autocorrelation [1.3.5.12.]
- Runs Test for Detecting Non-randomness [1.3.5.13.]
- Anderson-Darling Test [1.3.5.14.]
- Chi-Square Goodness-of-Fit Test [1.3.5.15.]
- Kolmogorov-Smirnov Goodness-of-Fit Test [1.3.5.16.]
- Grubbs' Test for Outliers [1.3.5.17.]
- Yates Analysis [1.3.5.18.]
- Defining Models and Prediction Equations [1.3.5.18.1.]
- Important Factors [1.3.5.18.2.]
- Probability Distributions [1.3.6.]
- What is a Probability Distribution? [1.3.6.1.]
- Related Distributions [1.3.6.2.]
- Families of Distributions [1.3.6.3.]
- Location and Scale Parameters [1.3.6.4.]
- Estimating the Parameters of a Distribution [1.3.6.5.]
- Method of Moments [1.3.6.5.1.]
- Maximum Likelihood [1.3.6.5.2.]
- Least Squares [1.3.6.5.3.]
- PPCC and Probability Plots [1.3.6.5.4.]
- Gallery of Distributions [1.3.6.6.]
- Normal Distribution [1.3.6.6.1.]
- Uniform Distribution [1.3.6.6.2.]
- Cauchy Distribution [1.3.6.6.3.]
- t Distribution [1.3.6.6.4.]
- F Distribution [1.3.6.6.5.]
- Chi-Square Distribution [1.3.6.6.6.]
- Exponential Distribution [1.3.6.6.7.]
- Weibull Distribution [1.3.6.6.8.]
- Lognormal Distribution [1.3.6.6.9.]
- Fatigue Life Distribution [1.3.6.6.10.]
- Gamma Distribution [1.3.6.6.11.]
- Double Exponential Distribution [1.3.6.6.12.]
- Power Normal Distribution [1.3.6.6.13.]
- Power Lognormal Distribution [1.3.6.6.14.]
- Tukey-Lambda Distribution [1.3.6.6.15.]
- Extreme Value Type I Distribution [1.3.6.6.16.]
- Beta Distribution [1.3.6.6.17.]
- Binomial Distribution [1.3.6.6.18.]
- Poisson Distribution [1.3.6.6.19.]
- Tables for Probability Distributions [1.3.6.7.]
- Cumulative Distribution Function of the Standard Normal Distribution [1.3.6.7.1.]
- Upper Critical Values of the Student's-t Distribution [1.3.6.7.2.]
- Upper Critical Values of the F Distribution [1.3.6.7.3.]
- Critical Values of the Chi-Square Distribution [1.3.6.7.4.]
- Critical Values of the t* Distribution [1.3.6.7.5.]
- Critical Values of the Normal PPCC Distribution [1.3.6.7.6.]
- EDA Case Studies [1.4.]
- Case Studies Introduction [1.4.1.]
- Case Studies [1.4.2.]
- Normal Random Numbers [1.4.2.1.]
- Background and Data [1.4.2.1.1.]
- Graphical Output and Interpretation [1.4.2.1.2.]
- Quantitative Output and Interpretation [1.4.2.1.3.]
- Work This Example Yourself [1.4.2.1.4.]
- Uniform Random Numbers [1.4.2.2.]
- Background and Data [1.4.2.2.1.]
- Graphical Output and Interpretation [1.4.2.2.2.]
- Quantitative Output and Interpretation [1.4.2.2.3.]
- Work This Example Yourself [1.4.2.2.4.]
- Random Walk [1.4.2.3.]
- Background and Data [1.4.2.3.1.]
- Test Underlying Assumptions [1.4.2.3.2.]
- Develop a Better Model [1.4.2.3.3.]
- Validate New Model [1.4.2.3.4.]
- Work This Example Yourself [1.4.2.3.5.]
- Josephson Junction Cryothermometry [1.4.2.4.]
- Background and Data [1.4.2.4.1.]
- Graphical Output and Interpretation [1.4.2.4.2.]
- Quantitative Output and Interpretation [1.4.2.4.3.]
- Work This Example Yourself [1.4.2.4.4.]
- Beam Deflections [1.4.2.5.]
- Background and Data [1.4.2.5.1.]
- Test Underlying Assumptions [1.4.2.5.2.]
- Develop a Better Model [1.4.2.5.3.]
- Validate New Model [1.4.2.5.4.]
- Work This Example Yourself [1.4.2.5.5.]
- Filter Transmittance [1.4.2.6.]
- Background and Data [1.4.2.6.1.]
- Graphical Output and Interpretation [1.4.2.6.2.]
- Quantitative Output and Interpretation [1.4.2.6.3.]
- Work This Example Yourself [1.4.2.6.4.]
- Standard Resistor [1.4.2.7.]
- Background and Data [1.4.2.7.1.]
- Graphical Output and Interpretation [1.4.2.7.2.]
- Quantitative Output and Interpretation [1.4.2.7.3.]
- Work This Example Yourself [1.4.2.7.4.]
- Heat Flow Meter 1 [1.4.2.8.]
- Background and Data [1.4.2.8.1.]
- Graphical Output and Interpretation [1.4.2.8.2.]
- Quantitative Output and Interpretation [1.4.2.8.3.]
- Work This Example Yourself [1.4.2.8.4.]
- Fatigue Life of Aluminum Alloy Specimens [1.4.2.9.]
- Background and Data [1.4.2.9.1.]
- Graphical Output and Interpretation [1.4.2.9.2.]
- Ceramic Strength [1.4.2.10.]
- Background and Data [1.4.2.10.1.]
- Analysis of the Response Variable [1.4.2.10.2.]
- Analysis of the Batch Effect [1.4.2.10.3.]
- Analysis of the Lab Effect [1.4.2.10.4.]
- Analysis of Primary Factors [1.4.2.10.5.]
- Work This Example Yourself [1.4.2.10.6.]
- References for Chapter 1: Exploratory Data Analysis [1.4.3.]