1. Exploratory Data Analysis
1.1. EDA Introduction
Primary and Secondary Goals
The primary goal of EDA is to maximize the analyst's insight into
a data set and into the underlying structure of a data set,
while providing all of the specific items that an
analyst would want to extract from a data set, such as:
Insight into the Data
Insight implies detecting and uncovering underlying structure in
the data. Such underlying structure may not be encapsulated in
the list of items above; such items serve as the specific targets
of an analysis, but the real insight and "feel" for a data set
comes as the analyst judiciously probes and explores the various
subtleties of the data. The "feel" for the data comes almost
exclusively from the application of various graphical techniques,
the collection of which serves as the window into the essence of
the data. Graphics are irreplaceable: there are no quantitative
analogues that will give the same insight as well-chosen graphics.
To get a "feel" for the data, it is not enough for the analyst to know what is in the data; the analyst must also know what is not in the data. The only way to do that is to draw on human pattern-recognition and comparative abilities in the context of a series of judicious graphical techniques applied to the data.
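The claim that graphics reveal what summary numbers cannot can be illustrated with a small sketch. The data below are an assumption brought in for illustration: they are two of the four data sets published by Anscombe (1973), not data from this section. The helper names (`pearson_r`, `text_scatter`) are hypothetical, and the character-grid plot is only a crude stand-in for a real scatter plot.

```python
# Illustrative sketch only: the data are two of Anscombe's (1973) data sets,
# chosen as an assumption for this example; they do not come from this section.
from math import sqrt

x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between xs and ys."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / sqrt(sxx * syy)


def text_scatter(xs, ys, rows=10, cols=33):
    """Return a crude character-grid scatter plot (hypothetical helper)."""
    x_lo, x_hi = min(xs), max(xs)
    y_lo, y_hi = min(ys), max(ys)
    grid = [[" "] * cols for _ in range(rows)]
    for a, b in zip(xs, ys):
        col = round((a - x_lo) / (x_hi - x_lo) * (cols - 1))
        row = round((y_hi - b) / (y_hi - y_lo) * (rows - 1))  # largest y on top
        grid[row][col] = "*"
    return "\n".join("|" + "".join(line).rstrip() for line in grid)


# Print the quantitative summaries first, then the picture for each set.
for label, ys in (("set 1", y1), ("set 2", y2)):
    print(f"{label}: mean(y) = {mean(ys):.2f}, r = {pearson_r(x, ys):.3f}")
    print(text_scatter(x, ys))
    print()
```

The two sets report essentially the same mean and correlation, yet the plots differ: the first scatters about a straight line, while the second traces a smooth curve. The quantitative summaries alone would not show what is and is not in each data set.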