Data Analysis
Data analysis cannot be done automatically.
Only you have both the domain expertise and the uniquely human capabilities of
organization, decomposition, synthesis, generalization, induction, proposition,
inference, deduction, thought, and rationalization which can be applied to the
data to acquire knowledge. However, tools can be used to facilitate this analysis.
Data analysis can be approached in a number of ways, but I see them all as falling into
one of three general methodologies: reductive, mathematical, and visual.
You can, of course, mix and match these methodologies as you wish, and that
is a wise choice for many projects. However, I describe each individually
here to clarify where VisiCube falls in this spectrum.
Reductive Analysis
Reductive data analysis is a methodology in which individual facts, or aggregations
of those facts, are used as the basis for analysis. Included in this type of
methodology are summary and simple statistical methods.
It can be argued that this methodology is not analysis at all.
When used alone, it results in a simple reduction of the data set to one or
more statistics that often do not adequately represent the underlying phenomena.
Suffice it to say that anyone who has lived in both Chicago and Seattle
can testify to the misleading nature of the annual mean temperature.
| City    | Mean Temp |
| Chicago | 51        |
| Seattle | 51        |
Tools which facilitate this type of analysis include query and report tools.
These are the simplest of analysis tools in that they provide the data without
providing any assistance in the way of modeling.
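The point about the mean can be sketched in a few lines of Python. The two series below are made-up monthly values, not actual climate records for any city; they are chosen only so that both average 51. The names `wide_swing`, `mild`, and `summarize` are hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical monthly temperatures (illustrative only, not real climate data).
# Both series are constructed so that their means are identical.
wide_swing = [24, 28, 38, 49, 60, 70, 75, 73, 65, 53, 41, 36]
mild = [40, 42, 45, 49, 55, 60, 64, 64, 59, 52, 44, 38]

def summarize(values):
    """Return the mean along with simple measures of spread."""
    return {
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
        "std_dev": round(pstdev(values), 1),
    }

for name, series in (("wide_swing", wide_swing), ("mild", mild)):
    print(name, summarize(series))
```

Reported alone, the mean makes the two series look the same. The minimum, maximum, and standard deviation show that one swings far more widely than the other, which is exactly the information a single statistic discards.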
Mathematical Analysis
Mathematical, sometimes referred to as classical, data analysis is a methodology
in which mathematical models are applied to the data and used as the basis for analysis.
The general approach is to apply a model and then test the accuracy and applicability
of the model through analysis. If it is found wanting, a new model is tried.
Included in this type of methodology are complex statistical and Bayesian methods.
Mathematical modeling is an important technique in the study of data because
it lets you reduce unmanageable masses of data to models which can be used to
make predictions about the underlying phenomena and understand such attributes
of the data as normality and linearity. This is especially powerful for those
phenomena which are truly mathematical in nature.
Tools which facilitate this type of analysis include those that support various
mathematical modeling techniques such as parabolic or least-squares curve fitting
and regression analysis. Such tools, while providing powerful computational
facilities, can be difficult to use. The mathematics that is involved can be
highly complex and requires great care in its application.
Non-experts are often left to apply such methods by rote, without the ability
to understand their applicability to specific data.
Such tools have often failed to gain broad acceptance in the analysis world
because many researchers lack the mathematical background needed to use them.
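The apply-then-test approach described above can be sketched with a least-squares line fit. This is a minimal illustration in plain Python, not a substitute for a statistics package; the function names and the data (which deliberately follows y = x**2) are assumptions made for the example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def residuals(xs, ys, slope, intercept):
    """Observed value minus the value the line predicts, for each point."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]

def r_squared(ys, res):
    """Fraction of the variance in ys that the fitted line accounts for."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum(r ** 2 for r in res)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical data that actually follows a parabola, not a line.
xs = [0, 1, 2, 3, 4]
ys = [x ** 2 for x in xs]

slope, intercept = fit_line(xs, ys)
res = residuals(xs, ys, slope, intercept)
print("slope:", slope, "intercept:", intercept)
print("R^2:", round(r_squared(ys, res), 3))
print("residuals:", res)
```

The line scores a high R^2 on this data, yet the residuals are positive at both ends and negative in the middle. A systematic pattern like that suggests the straight-line model is found wanting and that another model, such as a parabolic fit, should be tried. Reading the fit statistic without checking the residuals is the kind of rote use the text warns about.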
Visual Analysis
Visual data analysis is a methodology in which the data, as a whole,
is used as the basis for analysis. The data is presented visually and any modeling
that occurs is done as a result of the analysis of those visuals.
And, unlike mathematical analysis, the model that may be used need not be mathematical at all.
Included in this type of methodology are various graphical methods.
Visual analysis is especially powerful because it matches our natural abilities to interpret
data holistically and exposes attributes of the data (such as patterns, trends,
structure, and exceptions) which can be hidden in models.
In his book, Visualizing Data (Hobart Press, 1993), William S. Cleveland, a leading
researcher in the visualization of data, states the importance of visualization
methodologies (even for data that can be modeled mathematically):
"Visualization is critical to data analysis. It provides a front line of
attack, revealing intricate structure in data that cannot be absorbed in any other way."
Tools which facilitate this type of analysis include those that present the data
visually so that you might be able to infer or deduce a model,
especially a non-mathematical model, to explain the data.
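As a rough stand-in for what presenting the data visually can mean, the sketch below draws a text bar chart. It is not how VisiCube renders data; the function name `bar_chart`, its parameters, and the readings are all hypothetical, and the values are assumed to be non-negative.

```python
def bar_chart(values, width=40, mark="#"):
    """Return one text bar per value, scaled so the largest spans `width`."""
    largest = max(values)
    lines = []
    for index, value in enumerate(values):
        # Guard against an all-zero series to avoid dividing by zero.
        length = round(value / largest * width) if largest else 0
        lines.append(f"{index:>3} | {mark * length} {value}")
    return lines

# Hypothetical readings containing one exception.
readings = [3, 4, 3, 12, 4, 3, 5, 4]
print("\n".join(bar_chart(readings, width=24)))
```

Even in this crude form, the exceptional reading stands out at a glance, with no model applied to the data first.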
VisiCube
VisiCube is entirely a visual analysis tool.
Unlike most other tools that support visual analysis, VisiCube has no
mathematical modeling mechanisms. Though some will consider this a limitation,
I have done this deliberately to provide a purely visual tool which has none of
the complexity or distractions of a mathematical tool. It is, therefore,
far easier to use, especially for those less than comfortable with
advanced mathematics, and quicker to become productive with
(because the learning curve is much shorter).