Dimensional Stacking
Figure 4
Several recent
techniques project high-dimensional
data by embedding dimensions within other dimensions. In
the 1-D case [MIH:91], one starts by discretizing the ranges of
each dimension and assigning an ordering to the dimensions (each
dimension is said to have a unique "speed"). A background color
is also associated with each speed. The next step is to divide
the screen into C0 vertical strips, where C0
is the cardinality of the dimension with the slowest speed.
The strips are colored according to that speed. Each of
these strips is then divided into C1 strips and colored
accordingly. This is repeated until all dimensions have
been embedded and the data value associated with each cell can
be plotted on the vertical axis.
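The nesting of strips described above amounts to a mixed-radix index: the slowest dimension selects the widest strip and each faster dimension subdivides it. The following is a minimal sketch of that arithmetic, not the implementation from [MIH:91]; the function name `strip_index` and its argument layout are assumptions made for illustration.

```python
from typing import Sequence


def strip_index(buckets: Sequence[int], cards: Sequence[int]) -> int:
    """Return the horizontal cell index for one combination of buckets.

    buckets[0] and cards[0] belong to the slowest dimension,
    buckets[-1] and cards[-1] to the fastest (hypothetical layout).
    """
    if len(buckets) != len(cards):
        raise ValueError("buckets and cards must have the same length")
    index = 0
    for bucket, card in zip(buckets, cards):
        if not 0 <= bucket < card:
            raise ValueError("bucket index out of range")
        # Entering a faster dimension splits every existing strip into
        # `card` narrower strips, then selects one of them.
        index = index * card + bucket
    return index


# Two dimensions with cardinalities C0 = 3 and C1 = 4 give 12 cells.
print(strip_index([1, 2], [3, 4]))
```

The data value for each cell would then be plotted on the vertical axis at that horizontal position.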
In 2-D [LEB:90],
an analogous technique called Dimensional Stacking involves
recursively embedding images defined by a pair of dimensions within
pixels of a higher-level image. Unlike the previous system, however,
data is not restricted to functions, thus making this technique
amenable to a wider range of data types. In Worlds within
Worlds [BESHERS:93], each location in a 3-D space may in turn
contain a 3-D space which the user may investigate in a hierarchical
fashion. The most detailed level may contain surfaces, solids,
or point data.
XmdvTool requires
three types of information to project data using dimensional stacking.
The first is the cardinality (number of buckets) for each dimension.
The range of values for each dimension is then decomposed into
that many equal sized subranges. The second type of information
needed is the ordering for the dimensions, from outer-most (slowest)
to inner-most (fastest). Dimensions are assumed to alternate in
orientation. The last piece of information used is the minimum
size for the plotted data item (the system will increase this
value if the entire image can fit within the view area).
Each data point then maps into a unique bucket, which in turn
maps to a unique location in the resulting image. If the
image generated exceeds the size of the view area, scroll bars
are automatically generated to allow panning. A key is provided
in a separate window to help users understand the order of embedding,
and grid lines of varying intensity provide assistance in interpreting
transitions between buckets at different levels in the hierarchy.
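The mapping from a data point to a screen cell can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not XmdvTool's actual code: the names `bucket_of` and `stack_position` are hypothetical, the outermost dimension is assumed to be horizontal, and values equal to the upper end of a range are assumed to fall into the last bucket.

```python
from typing import Sequence, Tuple


def bucket_of(value: float, lo: float, hi: float, card: int) -> int:
    """Index of the equal-sized subrange of [lo, hi] that contains value."""
    if hi <= lo or card < 1:
        raise ValueError("need hi > lo and card >= 1")
    frac = (value - lo) / (hi - lo)
    # Clamp so that value == hi (or out-of-range input) stays in a valid bucket.
    return min(card - 1, max(0, int(frac * card)))


def stack_position(
    point: Sequence[float],
    ranges: Sequence[Tuple[float, float]],
    cards: Sequence[int],
    first_horizontal: bool = True,  # assumption: outermost dimension is horizontal
) -> Tuple[int, int]:
    """Return the (x, y) cell for a point; dimensions run outermost to innermost."""
    x = y = 0
    horizontal = first_horizontal
    for value, (lo, hi), card in zip(point, ranges, cards):
        b = bucket_of(value, lo, hi, card)
        if horizontal:
            x = x * card + b
        else:
            y = y * card + b
        horizontal = not horizontal  # orientations alternate
    return x, y


unit = [(0.0, 1.0)] * 4
print(stack_position([0.95, 0.15, 0.55, 0.3], unit, [10, 10, 10, 5]))
```

Multiplying the returned cell coordinates by the plotted item size gives a pixel location; whether the vertical index grows upward or downward is a display choice the text does not specify.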
The sparseness
of the data set of Figure 1 makes uncovering relationships difficult
using Dimensional Stacking. Figure 4 shows a denser set consisting
of 3-D drill hole data with a fourth dimension representing the
ore grade found at the location (more than 8000 data points).
Longitude and latitude are mapped to the outer dimensions, each
with cardinality 10. Depth and ore grade map to the inner dimensions
(ore grade is the vertical orientation), with cardinality 10 and
5, respectively. There is a clear region in which the ore grade
improves with depth, and other places where digging had stopped
prior to the ore grade falling significantly. By adjusting
the cardinalities and ranges for the various dimensions, a more
detailed view of the data may be obtained [WAR:CGI].
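For the cardinalities quoted for Figure 4, the image size in cells follows from multiplying the cardinalities that share an orientation. The sketch below assumes the ordering longitude, latitude, depth, ore grade with alternating orientation; `image_cells` is a hypothetical helper, not part of XmdvTool.

```python
from math import prod
from typing import Sequence, Tuple


def image_cells(cards: Sequence[int], first_horizontal: bool = True) -> Tuple[int, int]:
    """Return (width, height) in cells for alternating orientations."""
    even, odd = cards[0::2], cards[1::2]
    horizontal, vertical = (even, odd) if first_horizontal else (odd, even)
    return prod(horizontal), prod(vertical)


# Longitude 10, latitude 10, depth 10, ore grade 5.
width, height = image_cells([10, 10, 10, 5])
print(width, height, width * height)
```

Under these assumptions the image is 100 by 50 cells, or 5000 cells in total, so with more than 8000 data points some cells necessarily hold several points.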
The hierarchical
techniques are best suited for fairly dense data sets and do rather
poorly with sparse data. This is because each
possible data point is allocated a specific screen location (with
overlaps avoidable by careful discretization of dimensions), and
as the dimension of the data increases, the screen space needed
expands rapidly. In contrast, the techniques described earlier
generally do well with sparse data over high numbers of dimensions,
though scatterplots are constrained somewhat in the maximum manageable
dimension. The major problem with hierarchical methods is
the difficulty in determining spatial relationships between points
in non-adjacent dimensions. Two points which in fact are
quite close in N-space may project to screen locations which are
quite far apart. This is somewhat alleviated by providing
users with the ability to rapidly change the nesting characteristics
and discretization of the dimensions.
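The separation problem can be made concrete with a small calculation. In this sketch (hypothetical names, with dimensions 0, 2, 4, ... assumed to be horizontal), two points that differ by a single bucket in the outermost dimension land far apart on screen, because every nested horizontal dimension multiplies the distance.

```python
from typing import Sequence


def screen_x(buckets: Sequence[int], cards: Sequence[int]) -> int:
    """Horizontal cell index, assuming dimensions 0, 2, 4, ... are horizontal."""
    x = 0
    for bucket, card in zip(buckets[0::2], cards[0::2]):
        x = x * card + bucket
    return x


cards = [10, 10, 10, 10, 10, 10]  # six dimensions, ten buckets each
a = [4, 0, 5, 0, 5, 0]
b = [5, 0, 5, 0, 5, 0]            # neighbor of `a` in the outermost dimension

gap = abs(screen_x(b, cards) - screen_x(a, cards))
print(gap)
```

Here the two neighboring buckets are 100 cells apart horizontally. Reordering the dimensions so that the dimension of interest is innermost would reduce that gap to one cell, which is why rapid changes to the nesting help.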
We have extended
flat dimensional stacking to hierarchical dimensional stacking.
In the hierarchical dimensional stacking display, a block within
the extent of a cluster is assigned the color of that cluster.
Its density is determined by its distance from the mean
of the cluster: the greater the distance, the lighter the block. Movie
4 is a multiresolution cluster display using hierarchical dimensional
stacking.
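One way to realize the distance-based shading is sketched below. The text states only that blocks farther from the cluster mean are drawn lighter; the linear blend toward white, the normalization by a maximum distance, and the name `shade` are all assumptions made for illustration.

```python
from typing import Tuple

RGB = Tuple[int, int, int]


def shade(cluster_rgb: RGB, distance: float, max_distance: float) -> RGB:
    """Blend the cluster color toward white as distance from the mean grows.

    Assumption: a linear blend, with max_distance mapping to pure white.
    """
    if max_distance <= 0:
        t = 0.0
    else:
        t = min(1.0, max(0.0, distance / max_distance))
    return tuple(round(c + (255 - c) * t) for c in cluster_rgb)


print(shade((200, 0, 0), 0.0, 10.0))   # block at the cluster mean
print(shade((200, 0, 0), 10.0, 10.0))  # block at the edge of the extent
```

A real display would probably stop short of pure white so that blocks at the edge of a cluster's extent remain distinguishable from empty cells.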
References
[BESHERS:93]:
Beshers, C., Feiner, S. AutoVisual: rule-based design of
interactive multivariate visualizations. IEEE Computer
Graphics and Applications, Vol. 13, No. 4, pp. 41-49, 1993.
[LEB:90]:
LeBlanc, J., Ward, M.O., Wittels, N. Exploring N-dimensional
databases. Proceedings of Visualization '90, pp.
230-237, 1990.
[MIH:91]:
Mihalisin, T., Gawlinski, E., Timlin, J., Schwegler, J.
Visualizing multivariate functions, data, and distributions. IEEE
Computer Graphics and Applications, Vol. 11, pp. 28-37,
1991.
[WAR:CGI]:
Ward, M.O., LeBlanc, J.T., Tipnis, R. N-Land: a graphical
tool for exploring N-dimensional data. Proceedings of CG International
'94, 1994.
[WARD:94]:
Ward, M.O. XmdvTool: integrating multiple methods for visualizing
multivariate data. Proceedings of Visualization '94, pp.
326-333, 1994.