History of XmdvTool
XmdvTool originated in 1993, when Matt Ward decided to merge several
small programs he had written to visualize multivariate data in
order to evaluate the effectiveness of each. The original techniques
included were parallel coordinates, scatterplots, and star glyphs.
He later added dimensional stacking, a technique he and two of
his students (Jeff LeBlanc and Rajeev Tipnis) had created and
implemented in a tool called N-Land. The techniques were linked
via a highlighting mechanism controlled by N sliders, one for
each dimension. Version 1.0 was developed using Athena Widgets and the
Widget Creation Library. It had about 1000 lines of C code and
about 600 lines of a resource file to define the interface. It
was released to the public domain during the summer of 1994, and
described in a paper at the Visualization '94 conference.
One of the
reviewers of this paper commented that the program really needed
direct manipulation to support the selection/highlighting process.
Thus Ward and Allen Martin, a WPI graduate student, added functionality
to support interactive specification and modification of the selection
region, called the "brush", on the data displays themselves,
rather than via separate widgets. This work formalized the notion
of an N-dimensional brush, as opposed to the screen-space techniques
that had been used before. By working in data-space rather than
screen-space, users could specify a hyperbox of interest, which
could then be highlighted, deleted, or masked. Also, the concepts
of a ramped brush, which allowed points to belong to a brush in
varying degrees, and a composite brush, built from logical combinations
of existing brushes, were developed. This work was reported at
Visualization '95, and version 2.0 of the system was released
that year. The code had grown to approximately 10,000 lines of
C code and 1000 lines of resource file.
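The brush concepts described above can be illustrated with a minimal Python sketch. It is not XmdvTool's C implementation: the class name `RampedBrush`, the linear fall-off, and the use of min/max to combine brushes are all assumptions made for illustration. The brush is a hyperbox in data space, points outside the box but within the ramp width belong to it to a partial degree, and composite brushes are built by combining membership degrees.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class RampedBrush:
    """Hypothetical N-dimensional brush: a hyperbox in data space with a ramp."""
    lo: Sequence[float]    # lower bound of the fully selected box, per dimension
    hi: Sequence[float]    # upper bound of the fully selected box, per dimension
    ramp: Sequence[float]  # fall-off width outside the box (0 means a hard edge)

    def membership(self, point: Sequence[float]) -> float:
        """Return the degree in [0, 1] to which `point` belongs to the brush."""
        degree = 1.0
        for x, lo, hi, ramp in zip(point, self.lo, self.hi, self.ramp):
            if lo <= x <= hi:
                d = 1.0
            else:
                dist = lo - x if x < lo else x - hi
                # Assumed linear ramp: membership drops to 0 at distance `ramp`.
                d = max(0.0, 1.0 - dist / ramp) if ramp > 0 else 0.0
            # A point is only as "inside" as its worst dimension.
            degree = min(degree, d)
        return degree


# Composite brushes as logical combinations of membership degrees.
# Using min/max/complement is an assumption, not the documented XmdvTool rule.
def brush_and(a: float, b: float) -> float:
    return min(a, b)


def brush_or(a: float, b: float) -> float:
    return max(a, b)


def brush_not(a: float) -> float:
    return 1.0 - a


brush = RampedBrush(lo=[0.0, 0.0], hi=[1.0, 1.0], ramp=[0.0, 0.5])
inside = brush.membership([0.5, 0.5])    # fully inside the hyperbox
partial = brush.membership([0.5, 1.25])  # halfway down the ramp in dimension 2
outside = brush.membership([1.5, 0.5])   # outside a hard edge in dimension 1
```

Because the test is made against data values rather than pixels, the same brush applies unchanged to every linked display, whichever projection that display uses.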
Because the
source code was distributed free-of-charge, several researchers
made their own variations on the code. Torsten Timm, a student
at the University of Erlangen, released the unofficial version
3.0 in 1996. His extensions allowed data-driven glyph positioning,
labeling of data points, and other enhancements.
Version 3.1,
developed at WPI, included many features beyond those in Version
2.0, including interactive enabling and disabling of dimensions,
arbitrary zooming, extended Tukey box plots for visualizing statistics
of selected data, a reconfigured user interface, and an improved
help facility. It was released in 1997.
In 1998, Matt
Ward and Elke Rundensteiner received a 3-year NSF grant to investigate
ways of visually exploring large multivariate datasets. The award,
NSF:IIS 9732897, is being used to fund 2 graduate students: one
in visualization and the other in databases. The key concept of
the research is to use multiresolution techniques and hierarchical
clustering/partitioning methods to make large dataset analysis
more manageable.
Version 4.0,
developed by Ying-Huey Fua at WPI, represented a major change
from numerous perspectives, not the least of which was that it
no longer depended on the X Window System. Using OpenGL and
Tcl/Tk, the package would now run on both Unix and Windows platforms.
A new interface, featuring Windows-like menus and icons to represent
the different display modes, gave it a very different look and feel
from previous versions. This version also introduced a new method
for exploring large multivariate data sets, based on hierarchically
clustering the data and presenting aggregation information rather
than just the raw data. Hierarchical parallel coordinates, described
in a paper at Visualization '99, show cluster summarizations via
shaded bands rather than single polylines, conveying both cluster
extents and size. Bands are colored according to their location in the
hierarchy, such that child and parent clusters have similar colors.
A new brushing paradigm, called structure-based brushing, was
also introduced. This allows users to navigate the hierarchy,
selecting subtrees for which they would like to display more or
less detail. This new technique for interaction was reported in
a paper at Information Visualization '99, and also in an extended
paper in IEEE Transactions on Visualization and Computer Graphics.
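As a rough illustration of the aggregation idea, the Python sketch below computes the information a band would need (cluster size, mean, and per-dimension extents) and selects the clusters at a chosen depth of the tree. The `Cluster` class and `clusters_at_depth` function are hypothetical names, and cutting the tree at a fixed depth is only a simple stand-in for level-of-detail selection, not the structure-based brush as implemented in XmdvTool.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence


@dataclass
class Cluster:
    """Hypothetical cluster-tree node; only leaf nodes hold raw data points."""
    points: List[Sequence[float]] = field(default_factory=list)
    children: List["Cluster"] = field(default_factory=list)

    def all_points(self) -> List[Sequence[float]]:
        """Collect the raw points in this subtree."""
        if not self.children:
            return list(self.points)
        return [p for child in self.children for p in child.all_points()]

    def summary(self) -> Dict[str, object]:
        """Aggregate a subtree into the values a band could display."""
        pts = self.all_points()
        dims = list(zip(*pts))  # transpose: one tuple of values per dimension
        return {
            "size": len(pts),
            "mean": [sum(d) / len(d) for d in dims],
            "min": [min(d) for d in dims],
            "max": [max(d) for d in dims],
        }


def clusters_at_depth(node: Cluster, depth: int) -> List[Cluster]:
    """Return the clusters reached by descending `depth` levels from `node`."""
    if depth == 0 or not node.children:
        return [node]
    return [c for child in node.children
            for c in clusters_at_depth(child, depth - 1)]


left = Cluster(points=[[0.0, 2.0], [2.0, 4.0]])
right = Cluster(points=[[10.0, 0.0]])
root = Cluster(children=[left, right])

coarse = [c.summary() for c in clusters_at_depth(root, 0)]  # one band for all data
finer = [c.summary() for c in clusters_at_depth(root, 1)]   # one band per child
```

Drawing one band per summary instead of one polyline per point keeps the display readable as the dataset grows, and a deeper cut into a subtree trades that compactness for more detail.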
The most recent
version, 4.1 (to be released soon), extends the support for hierarchically
structured data to scatterplot matrices, star glyphs, and dimensional
stacking. Using the same coloring and navigation strategies pioneered
in Version 4.0, Jing Yang has created a consistent and complete
strategy for large-scale multivariate data exploration. She has also
eradicated several bugs found in version 4.0, such as bands extending
beyond the drawing area boundaries, and made many improvements
in the efficiency of the code.
Version 4.2,
to be released during the Fall of 2000, will finally tie together
the visualization front-end with a database backend. Using a novel
indexing structure for hierarchies, Daniel Stroe has developed
efficient query mechanisms for all navigation operations on the
structure-based brush. He has also experimented with caching and
prefetching techniques to further improve performance. All of
these additions have been integrated into an Oracle backend which
communicates seamlessly with the existing visualization front-end.
Beyond that,
who knows? We have lots of ideas for future directions, including
ways to deal with large numbers of dimensions, user-guided reclustering,
non-numeric data, and a host of other suggestions made by current
users. We hope to have a major release every 18 months or so,
with minor releases as warranted.
Acknowledgement
The XmdvTool
Project is supported by NSF under grant IIS-9732897, the NSF CISE
Instrumentation grant IRIS 97-29878 and the NSF grant IIS-0119276.