next up previous
Next: EXPLORING DATA SETS OF Up: N-Land: a Graphical Tool Previous: ABSTRACT

INTRODUCTION

Visualization has long been a standard phase of data analysis. It gives the user an intuitive feel for the data, allows rapid qualitative assessment (detecting trends, anomalies, extrema, and patterns), and helps in designing algorithms to analyze the data. Visual methods of data presentation map data values to some graphical entity or attribute. Typical attributes are location, size, shape, opacity, and intensity/color. The time domain (animation) can also be used to map one of the data dimensions.

Data sets of dimension greater than four are becoming increasingly common, occurring in such areas as astrophysics, simulation, control problems, and the analysis of statistical data. This calls for new visualization tools designed specifically for high-dimensional information. One solution is to display subsets of the data by fixing one or more of the dimensions. This permits the use of existing display techniques, but fails to show all possible relationships and values in the data. Another method, described by du Toit et al. (1), is to project the data onto lines or planes (scatterplots) through the data set. This method displays all of the data, but loses much of the spatial distribution information. Other methods include parallel axis techniques (described by Inselberg and Dimsdale (2)) and glyphs (for example, as described by Beddow (3)).
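The trade-off between the first two approaches can be seen on a toy example. The following is a minimal Python sketch, assuming the data are given as a list of equal-length tuples; the function names `fix_dimensions` and `project` are illustrative and do not come from any of the cited systems.

```python
def fix_dimensions(points, fixed):
    """Keep the points matching `fixed` ({dimension: value}) and drop those dimensions."""
    if not points:
        return []
    free = [d for d in range(len(points[0])) if d not in fixed]
    return [
        tuple(p[d] for d in free)
        for p in points
        if all(p[d] == value for d, value in fixed.items())
    ]


def project(points, dim_x, dim_y):
    """Project every point onto the plane spanned by two chosen dimensions."""
    return [(p[dim_x], p[dim_y]) for p in points]


# Three hypothetical points in a five-dimensional space.
points = [(0, 1, 2, 3, 4), (0, 1, 5, 6, 4), (1, 1, 2, 3, 4)]

# Fixing dimensions 0 and 4 leaves a 3-D subset; the third point is not shown.
print(fix_dimensions(points, {0: 0, 4: 4}))  # [(1, 2, 3), (1, 5, 6)]

# Projecting keeps every point, but two distinct points now coincide.
print(project(points, 0, 1))  # [(0, 1), (0, 1), (1, 1)]
```

Fixing dimensions hides the points outside the chosen slice, while projection keeps every point but can place distinct points at the same position.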

Dimensional stacking is a method developed by LeBlanc et al. (4, 5) for mapping data from a discrete N-dimensional space to a two-dimensional image in a way that minimizes the occlusion of data while preserving much of the spatial information. Briefly, the mapping is performed as follows: begin with data of dimension 2N+1 (for an even number of dimensions there would be an additional implicit dimension of cardinality one). Select a finite cardinality/discretization for each dimension. Choose one of the dimensions to be the dependent variable; the rest will be considered independent.

Create ordered pairs of the independent dimensions (N pairs) and assign each pair a unique value (speed) from 1 to N. The pair corresponding to speed 1 creates a virtual image whose width and height equal the cardinalities of its two dimensions (the first dimension in the pair is oriented horizontally, the second vertically). At each position of this virtual image, create another virtual image for the dimensions of speed 2, again sized by the cardinalities of the dimensions involved. Repeat this process until all dimensions have been embedded. In this manner, every location in the discrete high-dimensional space has a unique location in the two-dimensional image resulting from the mapping. The speed of a dimension can best be likened to the digits on an odometer, where digits cycle through their values at different rates.
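The embedding above amounts to reading the bin indices as the digits of two mixed-radix numbers, one for the horizontal position and one for the vertical. The following is a minimal Python sketch of that reading, assuming 0-based bin indices and treating the speed 1 pair as the outermost grid, as the description states. The names `stack_position` and `image_size` and the small cardinalities are illustrative assumptions, not part of N-Land.

```python
def stack_position(index, cards, pairs):
    """Map a discrete N-d index to (column, row) in the stacked 2-D image.

    index -- 0-based bin index for each independent dimension
    cards -- cardinality of each independent dimension
    pairs -- (horizontal_dim, vertical_dim) tuples, ordered from speed 1 to speed N
    """
    col = row = 0
    for h, v in pairs:
        # Each further pair subdivides every cell of the grid built so far.
        col = col * cards[h] + index[h]
        row = row * cards[v] + index[v]
    return col, row


def image_size(cards, pairs):
    """Width and height of the stacked image, in smallest grid cells."""
    width = height = 1
    for h, v in pairs:
        width *= cards[h]
        height *= cards[v]
    return width, height


# Hypothetical example: four independent dimensions with these cardinalities.
cards = [2, 3, 4, 2]
pairs = [(0, 1), (2, 3)]  # speed 1: dims 0 and 1; speed 2: dims 2 and 3

print(image_size(cards, pairs))                    # (8, 6)
print(stack_position((1, 2, 3, 1), cards, pairs))  # (7, 5)
```

Because each digit stays below its cardinality, distinct indices always land on distinct pixels, which is the uniqueness property stated above.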

The value of the dependent variable at the location in the high dimensional space is then mapped to a color/intensity value at that location in the two-dimensional image. This embedding process is illustrated in Figure 1 with a six-dimensional data set, where dimensions d1, ..., d6 have cardinalities 4, 5, 2, 3, 3, and 6, respectively. For clarity we have not displayed the values associated with a dependent variable, which would dictate the colors in the smallest grid locations.
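As a rough illustration of this final step, the sketch below fills a stacked image for the cardinalities quoted above by decoding each pixel back to its six-dimensional index. The pairing (d1, d2), (d3, d4), (d5, d6), the linear gray-level scaling, and the stand-in dependent variable `toy_value` are all assumptions made for the example; they are not read off Figure 1 or taken from N-Land.

```python
def unstack_position(col, row, cards, pairs):
    """Recover the 0-based N-d index of the pixel at (col, row)."""
    index = [0] * len(cards)
    # Peel off the fastest-cycling (innermost) pair first, like odometer digits.
    for h, v in reversed(pairs):
        col, index[h] = divmod(col, cards[h])
        row, index[v] = divmod(row, cards[v])
    return tuple(index)


def render(value_at, cards, pairs, levels=256):
    """Build a row-major image of gray levels from a dependent-variable function."""
    width = height = 1
    for h, v in pairs:
        width *= cards[h]
        height *= cards[v]
    raw = [
        [value_at(unstack_position(c, r, cards, pairs)) for c in range(width)]
        for r in range(height)
    ]
    lo = min(min(line) for line in raw)
    hi = max(max(line) for line in raw)
    span = (hi - lo) or 1  # avoid dividing by zero for constant data
    return [[int((val - lo) * (levels - 1) / span) for val in line] for line in raw]


def toy_value(index):
    """Hypothetical dependent variable: the sum of the bin indices."""
    return sum(index)


cards = [4, 5, 2, 3, 3, 6]        # cardinalities of d1, ..., d6
pairs = [(0, 1), (2, 3), (4, 5)]  # assumed pairing, speed 1 first

image = render(toy_value, cards, pairs)
print(len(image[0]), len(image))  # 24 90
```

Under this assumed pairing the image is 24 cells wide and 90 cells tall, giving one cell for each of the 2160 index combinations.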


  
Figure 1: Conceptualization of dimensional stacking - collapsing six dimensions into two dimensions.
[Figure 1 graphic not reproduced here; a surviving label reads "elements (2,2,1,1,*,*)".]

Dimensional stacking is essentially a 2-D extension of a technique developed by Mihalisin et al. (6, 7) for graphing scalar fields in multiple dimensions. Their technique embeds graphs recursively, using color and baseline displacement to indicate steps in the slower dimensions. The major differences between the two techniques are the use of intensity/color instead of location for the data/graphic mapping (which permits a significant increase in the amount of information presented in exchange for a reduction in quantitative perception) and the display of data sets instead of functions. A 3-D version of embedded dimensions has also been explored by Feiner and Beshers (8) in a technique referred to as Worlds within Worlds.

This paper concentrates on strategies for searching data spaces displayed using dimensional stacking. These include arbitrary and sequenced view selection, data-driven view selection, tuning of individual views (rotations in N-space), controlling the number and size of discrete bins for each dimension, colormap manipulation, and N-dimensional brushing. The paper also considers the benefits of adapting image processing techniques to assist in the examination of the data.


Matthew Ward
1999-02-23