Purpose: Check for Relationship
A scatter plot (Chambers 1983) reveals relationships or associations between two variables. Such relationships manifest themselves as any non-random structure in the plot. Various common types of patterns are demonstrated in the examples.

Sample Plot: Linear Relationship Between Variables Y and X
This sample plot of the Alaska pipeline data reveals a linear relationship between the two variables, indicating that a linear regression model might be appropriate.

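When a scatter plot suggests a straight-line pattern, a least-squares fit is one way to follow up. The sketch below uses a small set of made-up readings (not the Alaska pipeline data) and two illustrative helper functions, `fit_line` and `pearson_r`, which are assumptions introduced here rather than part of any particular package.

```python
import math

# Hypothetical (x, y) readings for illustration only; NOT the Alaska pipeline data.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def pearson_r(x, y):
    """Return the sample correlation coefficient between x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


intercept, slope = fit_line(x, y)
r = pearson_r(x, y)
print(f"fitted line: y = {intercept:.3f} + {slope:.3f} * x, r = {r:.4f}")
```

A correlation near 1 or -1 is consistent with a linear pattern, but the scatter plot itself remains the better check, since very different point patterns can share the same correlation.
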
Definition: Y Versus X
A scatter plot is a plot of the values of Y versus the corresponding values of X, with Y on the vertical axis and X on the horizontal axis.

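As a minimal sketch of this definition, the code below pairs each Y value with its corresponding X value and, if matplotlib is installed, draws the pairs with X on the horizontal axis and Y on the vertical axis. The function names `scatter_points` and `draw_scatter` and the output file name are illustrative assumptions.

```python
def scatter_points(x, y):
    """Pair each Y value with its corresponding X value."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return list(zip(x, y))


def draw_scatter(x, y, path="scatter.png"):
    """Draw Y (vertical axis) versus X (horizontal axis) and save the figure.

    matplotlib is imported here so that scatter_points() stays usable
    even when matplotlib is not installed.
    """
    import matplotlib.pyplot as plt

    scatter_points(x, y)  # reuse the length check before plotting
    fig, ax = plt.subplots()
    ax.scatter(x, y, marker="x")  # one plot character per data point
    ax.set_xlabel("X")
    ax.set_ylabel("Y")
    fig.savefig(path)
    plt.close(fig)
    return path
```

Calling `draw_scatter([1, 2, 3], [2, 4, 5])` would write the plot to `scatter.png` in the current directory.
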
Questions
Scatter plots can provide answers to the following questions:

Examples

Combining Scatter Plots
Scatter plots can also be combined in multiple plots per page to help understand higher-level structure in data sets with more than two variables.
The scatterplot matrix generates all pairwise scatter plots on a single page. The conditioning plot, also called a co-plot or subset plot, generates scatter plots of Y versus X conditional on the value of a third variable.

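The bookkeeping behind these two displays can be sketched without any plotting library. The example below lists the panels a scatterplot matrix would contain and splits the (X, Y) points into subsets by a third variable, as a conditioning plot would. The data values and the helper names `matrix_panels` and `conditioning_subsets` are hypothetical, and the conditioning variable is assumed to take a few discrete values; a continuous one would first need to be binned.

```python
from itertools import permutations

# Hypothetical data set with three variables (illustrative values only).
data = {
    "X": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "Y": [1.2, 1.9, 3.2, 3.8, 5.1, 6.3],
    "Z": [10, 10, 20, 20, 30, 30],
}


def matrix_panels(variables):
    """List the (vertical, horizontal) variable pairs for the off-diagonal
    panels of a scatterplot matrix. Each pair appears twice, once with the
    axes swapped."""
    return list(permutations(variables, 2))


def conditioning_subsets(data, x, y, given):
    """Group the (x, y) points by the value of the conditioning variable.
    Each group would be drawn as its own scatter plot."""
    subsets = {}
    for xi, yi, gi in zip(data[x], data[y], data[given]):
        subsets.setdefault(gi, []).append((xi, yi))
    return subsets


panels = matrix_panels(list(data))
subsets = conditioning_subsets(data, "X", "Y", "Z")
print(len(panels), "off-diagonal panels")
for level, points in subsets.items():
    print(f"Z = {level}: {points}")
```

For k variables the matrix has k(k - 1) off-diagonal panels, so the page fills up quickly as variables are added.
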
Causality Is Not Proved By Association
The scatter plot uncovers relationships in data. "Relationships" means that there is some structured association (linear, quadratic, etc.) between X and Y. Note, however, that even though causality implies association, association does NOT imply causality.

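One way to see this is a small simulation, sketched below under assumed conditions: a hidden variable Z drives both X and Y, so X and Y are strongly associated even though neither has any effect on the other. The variable names, noise levels, and sample size are illustrative choices, not values from any data set.

```python
import math
import random


def pearson_r(x, y):
    """Return the sample correlation coefficient between x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 500

# Z is a hidden common cause; X and Y each depend on Z but not on each other.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 0.3) for zi in z]
y = [zi + rng.gauss(0, 0.3) for zi in z]

# X and Y are strongly associated ...
r_xy = pearson_r(x, y)

# ... but once Z is subtracted out, only independent noise remains.
r_resid = pearson_r(
    [xi - zi for xi, zi in zip(x, z)],
    [yi - zi for yi, zi in zip(y, z)],
)

print(f"correlation of X and Y:          {r_xy:.3f}")
print(f"correlation after removing Z:    {r_resid:.3f}")
```

A scatter plot of these X and Y values would show a clear linear pattern, yet changing X would do nothing to Y.
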
Appearance
The most popular rendition of a scatter plot is a plot character at each data point with no line connecting the points. Other scatter plot format variants include connecting the data points with a line.
In both cases, the resulting plot is referred to as a scatter plot, although the former (discrete and disconnected) is the author's personal preference since nothing makes it onto the screen except the data; there are no interpolative artifacts to bias the interpretation.

Related Techniques
Run Sequence Plot
Box Plot
Block Plot

Case Study
The scatter plot is demonstrated in the load cell calibration data case study.

Software
Scatter plots are a fundamental technique that should be available in any general purpose statistical software program. They are also available in most graphics and spreadsheet programs.