It is often the case that some effort must be made to focus your attention
on pertinent aspects of your data before true analysis can begin.
This is almost universally true for large data sets, especially those whose
data was not gathered in a controlled or focused manner.
But it is often also true of even small data sets gathered with very
rigid and specific techniques. Even very small data sets have myriad subsets,
each of which might be especially pertinent to a given study.
Narrowing your focus so that you can thoroughly analyze data is problematic
because you may lose important perspectives in doing so.
But it remains an important task. The question is how to go about it.
Broadly, two methodologies are available, and they can be described
as automatic and manual. More commonly, these are called
data mining and data exploration, respectively.
Though these terms are not especially well defined in the general literature,
they are commonly used, and I attempt to clarify them here to help you
understand VisiCube's usefulness as well as my own philosophy.
Data mining, as well as its cousin data prospecting, is a term that is
abused in everyday usage, sometimes being used synonymously with
"data analysis" simply because it sounds more interesting.
Technically speaking, however, they are not the same.
Data mining is a methodology typically brought to bear on large data sets,
even entire databases, to discover interesting aspects of that data set
which should be further analyzed. Though usually guided by human-specified
parameters, the mechanisms are automated algorithms that may include aspects
of artificial intelligence and machine learning.
Such automation is necessary to make the task of mining feasible
when the size of the data set is very large.
But the methodology can be used on data sets of any size.
Running with the mining analogy a bit, the site of the deposits is vast, and
some sort of automated mechanism is needed to locate the desired deposit within that site.
Once that deposit has been located, the more manual process of extracting the ore can begin.
Data prospecting adds an additional step. In some literature, the additional
step is that of locating the site within a vast land. In other literature, the
additional step is that of determining the type of deposit that is located in the site.
But, in either case, this is an activity that precedes the actual extraction.
Stated simply, data mining is a methodology using automated techniques to bring
focus to the pertinent parts of a data set. Having no such automated mechanisms,
VisiCube is not a data mining (or data prospecting) tool.
Data exploration, on the other hand, is a methodology in which manual
techniques are utilized to find one's way through a data set and bring
important aspects of that data into focus for further analysis.
Though such a methodology can be applied to data sets of any size or type,
its manual nature makes it more reasonable for smaller data sets,
especially those in which the data has been carefully gathered and constructed.
Of course, a major advantage of a manual approach is that the mechanisms
utilized do not, by design, prevent you from exploring particular aspects of your data.
The automated methods of data mining are forever limited by their particular design.
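
The contrast between the two methodologies can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy data, the function names, and the choice of variance as the "interesting" criterion are all hypothetical, and none of it describes how VisiCube or any particular data mining tool works.

```python
from statistics import pvariance

# Toy data set: column name -> values (made up for illustration).
data = {
    "height": [170, 171, 169, 170],
    "income": [20, 85, 40, 130],
    "age": [30, 45, 31, 52],
}


def automated_focus(columns, top_n=1):
    """Automatic approach: rank columns by one fixed criterion (variance)."""
    ranked = sorted(columns, key=lambda name: pvariance(columns[name]), reverse=True)
    return ranked[:top_n]


def manual_focus(columns, chosen):
    """Manual approach: keep whichever existing columns the analyst names."""
    return [name for name in chosen if name in columns]


mined = automated_focus(data)                      # the algorithm decides
explored = manual_focus(data, ["height", "age"])   # the analyst decides
```

The automated function can only ever surface what its built-in criterion rewards, while the manual one places no such limit on where you look, at the cost of doing the looking yourself.
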
As with data mining, there are no specifications as to how this methodology
is to be implemented. But the analogy to actual exploration is very enlightening.
An exploration is an activity in which any of a great number of paths and techniques
might be utilized, and it may take place over a very long period of time.
The key to managing such an exploration is to be organized.
Keeping records about the exploration, recording your thoughts and ideas along the way,
and organizing your findings are all important.
This is a complex undertaking, though possibly very rewarding.
VisiCube is a data exploration tool.
This is in addition to its capabilities in data analysis.
And it is something that really makes VisiCube stand out from its competitors.
VisiCube is designed to enable you to record any point in your exploration
with a single click of the mouse.
This paradigm allows you to operate in either of two modes:
- You may generate a record of your exploration by capturing
the state of your analysis at any point along that path,
recording the steps that lead to your discoveries.
- You may record markers, cairns if you will, along that path,
annotated as desired, to facilitate easy return to the exact
spot (and state) from which you can then pursue another path.
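
To make the two modes concrete, here is a minimal Python sketch of an exploration log. It is not VisiCube's implementation: the `Marker` and `ExplorationLog` classes, their methods, and the dictionary used to stand in for the analysis state are all assumptions made purely for illustration.

```python
from copy import deepcopy
from dataclasses import dataclass, field


@dataclass
class Marker:
    """One recorded point in an exploration (hypothetical structure)."""
    state: dict      # snapshot of the analysis settings at this point
    note: str = ""   # optional annotation


@dataclass
class ExplorationLog:
    """Ordered record of markers; all names here are illustrative."""
    markers: list = field(default_factory=list)

    def record(self, state, note=""):
        # Copy the state so later changes do not alter the record.
        self.markers.append(Marker(deepcopy(state), note))
        return len(self.markers) - 1  # index of the new marker

    def restore(self, index):
        # Return a fresh copy so the stored marker stays intact.
        return deepcopy(self.markers[index].state)

    def path(self):
        # The notes, in order, describe the route taken so far.
        return [m.note for m in self.markers]


state = {"filter": None, "group_by": "region"}
log = ExplorationLog()
start = log.record(state, "starting view")

state["filter"] = "year == 2020"
log.record(state, "narrowed to 2020")

# Return to the first marker and set off on another path.
state = log.restore(start)
state["group_by"] = "product"
log.record(state, "alternative: by product")
```

Reading `log.path()` gives the record of steps taken (the first mode), while `log.restore(start)` plays the role of returning to a cairn and striking out in a new direction (the second mode). Snapshots are deep-copied in both directions so that a stored marker is never changed by later work.
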
There are many data analysis tools, but few include support for true data exploration.
And none that I know of do it as completely or as naturally as VisiCube.