1.
Exploratory Data Analysis
1.3. EDA Techniques 1.3.3. Graphical Techniques: Alphabetic 1.3.3.26. Scatter Plot
|
|||
Purpose: Check pairwise relationship between two variables conditional on a third variable |
A conditional plot, also known as a coplot or subset plot,
is a plot of two variables contional on the value of a third
variable (called the conditioning variable). The conditioning
variable may be either a variable that takes on only a few
discrete values or a continuous variable that is divided into
a limited number of subsets.
One limitation of the scatter plot matrix is that it cannot show interaction effects with another variable. This is the strength of the conditioning plot. It is also useful for displaying scatter plots for groups in the data. Although these groups can also be plotted on a single plot with different plot symbols, it can often be visually easier to distinguish the groups using the conditional plot. Although the basic concept of the conditioning plot matrix is simple, there are numerous alternatives in the details of the plots.
|
||
Sample Plot |
In this case, temperature has six distinct values. We plot torque versus time for each of these temperatures. This example is discussed in more detail in the process modeling chapter. |
||
Definition |
Given the variables Y, X, and
Z, the condition plot is formed by dividing the values
of Z into k groups. There are several ways that these
groups may be formed. There may be a natural grouping of the data,
the data may be divided into several equal sized groups, the grouping
may be determined by clusters in the data, and so on. The page
will be divided into n rows and c columns where n*c ≥ k.
Each row and column defines a single scatter plot.
The individual plot for row i and column j is defined as
|
||
Questions |
The conditioning plot can provide answers to the following
questions:
|
||
Related Techniques |
Scatter plot Scatter plot matrix Locally weighted least squares |
||
Software | Scatter plot matrices are becoming increasingly common in general purpose statistical software programs, including Dataplot. If a software program does not generate conditioning plots, but it does provide multiple plots per page and scatter plots, it should be possible to write a macro to generate a condition plot. |