For more detailed information, please read our on-line documents:
InterRing: An Interactive Tool for Visually Navigating and Manipulating Hierarchical Structures
Visual Hierarchical Dimension Reduction for Exploration of High Dimensional Datasets
Visual Hierarchical Dimension Reduction (VHDR) aims to help users handle high dimensional data by generating meaningful lower dimensional spaces with user interactions. VHDR is composed of the following steps:
Step 1: Dimension Hierarchy Generation
First, all the original dimensions of a multidimensional data set are automatically organized into a hierarchical dimension cluster tree (the tree you see in the InterRing display) according to the similarities among the dimensions. Each original dimension is mapped to a leaf node in this tree. Similar dimensions are placed together to form a cluster, and similar clusters in turn compose higher-level clusters.
This step is already complete by the time the Dimension Reduction dialog pops up.
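The bottom-up grouping described in this step can be sketched roughly as follows. This is an illustration only, not the tool's actual algorithm: the dissimilarity measure (mean absolute difference between normalized columns), the threshold schedule, and every name in the code are assumptions. The sketch records, for each cluster, the threshold of the iteration in which it was formed as its radius, which matches how the GDOD definition later in this help file describes the cluster radius.

```python
from itertools import combinations


def mean_column(columns, names):
    """Row-by-row average of the named columns."""
    rows = zip(*(columns[n] for n in names))
    return [sum(r) / len(names) for r in rows]


def dissimilarity(a, b):
    """Assumed measure: mean absolute difference of two normalized columns."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def build_hierarchy(columns, thresholds):
    """columns: dict name -> values in [0, 1]; thresholds: increasing list.

    Returns the root of a cluster tree whose leaves are the original dimensions.
    """
    # Each original dimension starts as a leaf node.
    nodes = [{"leaves": [name], "children": [], "radius": 0.0}
             for name in columns]
    for t in thresholds:
        merged = True
        while merged:
            merged = False
            for a, b in combinations(nodes, 2):
                da = mean_column(columns, a["leaves"])
                db = mean_column(columns, b["leaves"])
                if dissimilarity(da, db) <= t:
                    # Merge the pair; the cluster radius is this iteration's threshold.
                    parent = {"leaves": a["leaves"] + b["leaves"],
                              "children": [a, b], "radius": t}
                    nodes = [n for n in nodes if n is not a and n is not b]
                    nodes.append(parent)
                    merged = True
                    break
    if len(nodes) == 1:
        return nodes[0]
    # Anything still unmerged is gathered under an artificial root.
    return {"leaves": [l for n in nodes for l in n["leaves"]],
            "children": nodes, "radius": 1.0}
```

With loose enough thresholds every dimension ends up under a single root; tighter thresholds leave more top-level clusters.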
Step 2: Dimension Hierarchy Navigation and Modification
Next, users can navigate through the hierarchical dimension cluster tree to gain a better understanding of it, and can also interactively modify its structure. The tree is visualized in the Dimension Reduction dialog using a radial space-filling technique named InterRing, which provides a suite of navigation and modification tools. These tools are introduced in the following sections of this help file.
Step 3: Dimension Cluster Selection
Next, users interactively select dimension clusters of interest from the hierarchy to construct a lower dimensional subspace. InterRing provides several selection mechanisms to facilitate this; they are introduced in later sections. Selected clusters are highlighted using the system highlighting color.
Step 4: Representative Dimension Generation
In this step, a representative dimension (RD) is automatically created for each selected dimension cluster. The selected dimension clusters construct the lower dimensional space through these RDs. RDs are chosen to best reflect the aggregate characteristics of their associated clusters. In this release, the RD of a non-leaf node is the average of all the original dimensions in the cluster, and the RD of a leaf node is the dimension itself. Users can select a single leaf node of a cluster instead of the cluster itself so that the generated lower dimensional space may be more meaningful.
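The averaging rule for this release can be illustrated with a minimal sketch. The function name and the dictionary-of-columns layout are hypothetical, and the columns are assumed to be already normalized to the range 0 to 1.

```python
def representative_dimension(columns, cluster):
    """Return the RD of a cluster given as a list of original dimension names.

    columns maps each dimension name to its list of normalized values.
    """
    if len(cluster) == 1:
        # Leaf node: the RD is the dimension itself.
        return list(columns[cluster[0]])
    # Non-leaf node: average the member dimensions for each data item.
    rows = zip(*(columns[name] for name in cluster))
    return [sum(row) / len(row) for row in rows]
```

For example, a cluster of two dimensions with values [0.0, 1.0] and [1.0, 0.0] has the RD [0.5, 0.5], which shows why picking one leaf can sometimes be more informative than the average.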
Step 5: Data Projection and Visualization
Finally, after the user clicks the "Apply" button in the dialog, the data set is projected from the original high dimensional space to a lower dimensional space (LD space) composed of the RDs of the selected clusters. We call this projection in the LD space the mapped data set. The mapped data set is treated as an ordinary data set in the LD space and is visualized in the main window using the current multidimensional visualization technique. You can change the visualization technique freely, just as you can for the original data set. To convey further dimension cluster characteristics in the LD space, such as the dissimilarity between dimensions within a cluster, we attach this information to the mapped data set and provide the option to display it.
We call this dissimilarity visualization. We can perform dissimilarity visualization from two different viewpoints: that of the individual data items, or that of the whole data set. We name the former the "local degree of dissimilarity (LDOD)" and the latter the "global degree of dissimilarity (GDOD)". They are defined as follows:
LDOD - the degree of dissimilarity for a single data item in a dimension cluster. We use a mean, a maximum, and a minimum value to describe it. The mean is the mapped image of the data item on the representative dimension. The minimum and maximum are the smallest and largest of the data item's values on the original dimensions belonging to the dimension cluster. Note that all the dimensions have been normalized so that values lie between 0 and 1.
GDOD - the degree of dissimilarity for the entire data set in a dimension cluster. It is a scalar value and can be calculated from the similarity measures between each pair of dimensions in the cluster. We use a simplified approach: we directly use the radius of a dimension cluster as its GDOD. A dimension cluster's radius is initially assigned as the similarity threshold of the iteration in which the cluster is formed by the VHDR automatic dimension clustering approach.
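As a small illustration of the two definitions above, the following sketch computes the LDOD triple for one data item and reads the GDOD from a cluster's radius. The function names and the dictionary layouts are hypothetical; values are assumed to be normalized already.

```python
def ldod(item, cluster):
    """LDOD of one data item within a cluster.

    item maps dimension names to normalized values; cluster lists the
    original dimensions that belong to the dimension cluster.
    """
    values = [item[name] for name in cluster]
    return {
        "min": min(values),
        "mean": sum(values) / len(values),  # the item's value on the RD
        "max": max(values),
    }


def gdod(cluster_node):
    """GDOD under the simplified approach: the cluster radius itself."""
    return cluster_node["radius"]
```

A data item whose minimum and maximum are close together is well represented by the RD; a wide gap signals that the cluster's dimensions disagree for that item.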
In the Menu.Options.Dissimilarity Display section, we will introduce the dissimilarity displays in this release and introduce how to select them.
There is a group of check buttons in the dialog. Each of them represents a mode of the InterRing display. When a button is checked, the InterRing display enters the mode that button represents and allows the corresponding interactions. Click a button to check it, and click it again to uncheck it.
In this mode, users can increase the sweep angles of the nodes of interest. You can perform the distortion in two modes: the non-pin mode and the pin mode (to switch between them, use Menu->Options->Circular Distort).
In the non-pin mode, use the left mouse button to drag and drop the edges of the nodes of interest in the InterRing display to increase or decrease their sweep angles.
In the pin mode, first click a node of interest with the right mouse button to pin it, then drag and drop its edges with the left mouse button to increase or decrease its sweep angle. When you pin a node, the previously pinned node is automatically unpinned.
In this mode, users can increase the radius of the levels of interest.
First, click the level of interest with the right mouse button to pin it. Then drag and drop its edges with the left mouse button to increase or decrease its radius. When you pin a level, the previously pinned level is automatically unpinned. You can also click or drag and drop with the left mouse button without pinning anything to distort the nodes that are closest to the cursor.
In this mode, users can rotate the InterRing display by clicking on it. Click with the left mouse button to rotate anti-clockwise. Click with the right mouse button to rotate clockwise.
In this mode, users can hide or show sub-branches of the tree in the InterRing display by clicking on the root node of a sub-branch. Click a node to hide the sub-branch rooted at it. Click it again to expand the sub-branch.
In this mode, users can change the tree structure. Using your left mouse button, drag the root of a sub-branch and release it on the node that you want to be its new parent.
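The drag-and-release operation amounts to moving a sub-branch under a new parent. A minimal sketch of that tree edit follows; the Node class and function names are hypothetical. The check that refuses to drop a branch onto itself or onto one of its own descendants is an assumption needed to keep the structure a tree, and this help file does not state how InterRing handles that case.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child


def is_ancestor_or_self(node, other):
    """True if node is other, or lies on the path from other to the root."""
    while other is not None:
        if other is node:
            return True
        other = other.parent
    return False


def reparent(branch, new_parent):
    """Move the sub-branch rooted at branch under new_parent.

    Returns False (and changes nothing) if branch is the root or if
    new_parent lies inside the branch being moved.
    """
    if branch.parent is None or is_ancestor_or_self(branch, new_parent):
        return False
    branch.parent.children.remove(branch)
    new_parent.add(branch)
    return True
```

After such a change, node colors, sweep angles, and level radii may no longer reflect the new structure, which is what the reassign commands described later are for.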
In this mode, users can select nodes in the tree. These nodes are the dimension clusters that will construct the new lower dimensional space after you click the Apply button.
To select a node, click it using the left mouse button. To unselect it, click it again. Alternatively, click the right mouse button on a node, and a "Select" dialog will pop up. Adjust the scaling bar in it and click OK to select multiple nodes in the sub-branch rooted at that node.
Selected clusters are highlighted using the system "highlight-1" color (system colors are the colors in the "Color Requester" dialog; you can change colors in this dialog).
If you have selected some nodes in the InterRing display, you will get your new lower dimensional space in the display of the main window. It is composed of the dimension clusters you selected in the InterRing display.
Raise the Dissimilarity Display dialog to select different methods to convey dimension cluster characteristics when visualizing the data set in lower dimensional spaces.
You can select the "Wide Axes" button if you are in the (flat or hierarchical) Parallel Coordinates or (flat or hierarchical) Scatterplot Matrices mode.
With this button selected, if you are in the Parallel Coordinates mode, the axis width of a representative dimension is proportional to the GDOD of the dimension cluster it represents. A wider axis represents a dimension cluster with a larger GDOD. In the flat Scatterplot Matrices mode, GDOD is mapped to the width of the frames of the plots.
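The proportional mapping from GDOD to axis width can be sketched as below. The function name, the pixel unit, and the choice to give the largest GDOD a fixed maximum width are all assumptions made for illustration; the tool's actual scaling constants are not documented here.

```python
def axis_widths(gdods, max_width=12.0):
    """Map each cluster's GDOD to an axis width proportional to it.

    gdods maps cluster names to GDOD values; the cluster with the
    largest GDOD receives max_width (an assumed drawing constant).
    """
    largest = max(gdods.values())
    if largest == 0:
        # All clusters are leaves or perfectly tight: no extra width.
        return {name: 0.0 for name in gdods}
    return {name: max_width * g / largest for name, g in gdods.items()}
```

For instance, with GDODs of 0.25 and 0.5 the second axis is drawn twice as wide as the first.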
You can select the "Three Axes" button if you are in the flat Parallel Coordinates, flat Scatterplot Matrices, or flat Star Glyph mode to visualize LDOD.
In the flat Parallel Coordinates mode, two extra axes are displayed around a representative dimension to indicate the minimum and maximum of the corresponding dimension cluster for every data point.
Good correlation within a cluster would manifest itself as nearly horizontal lines through the 3 axes, while lines with steep slope indicate areas of poor correlation.
In the flat Scatterplot Matrices mode, if you select the "Three Axes" button, diagonal plots will be used to represent LDOD.
The minimum and maximum of the dimension cluster are mapped to the x and y coordinates of the diagonal plot of its representative dimension. Thus, in the diagonal plots, a data item whose maximum and minimum are equal is drawn as a point on the diagonal. Conversely, a data item with a large LDOD has a large difference between maximum and minimum, and thus between its x and y coordinates, so it lies a significant distance from the diagonal. A diagonal plot whose points spread out away from the diagonal therefore indicates low correlation within that dimension cluster.
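The geometry of the diagonal plot can be made concrete with a short sketch. It assumes the minimum is placed on the x axis and the maximum on the y axis (the order suggested by the wording above) and uses hypothetical function names.

```python
import math


def diagonal_plot_point(min_value, max_value):
    """Position of one data item in the diagonal plot of an RD."""
    return (min_value, max_value)  # assumed: x = minimum, y = maximum


def distance_from_diagonal(x, y):
    """Perpendicular distance of the point (x, y) from the line y = x."""
    return abs(y - x) / math.sqrt(2)
```

A data item with equal minimum and maximum has distance 0, while the largest possible spread (minimum 0, maximum 1) gives a distance of about 0.707 in normalized coordinates.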
In the flat Star Glyph mode, the minimum and maximum of the dimension clusters are visualized using the system "Grid1" and "Grid2" colors. The length of the line segment from the star center to the beginning of the "Grid1" color is proportional to the minimum value of the cluster. The length of the line segment from the star center to the end of the "Grid1" color is proportional to the mean value of the cluster. The length of the line segment from the star center to the end of the "Grid2" color is proportional to the maximum value of the cluster.
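One way to read the star glyph description is that each ray carries two colored segments. The sketch below computes their extents along a ray; the function name and the ray_length parameter are hypothetical, and it assumes that the "Grid2" segment starts where the "Grid1" segment ends, which the text implies but does not state.

```python
def ray_segments(min_value, mean_value, max_value, ray_length=50.0):
    """Colored segments along one star glyph ray, as (color, start, end).

    Distances are measured from the star center; the inputs are the
    normalized LDOD values of one data item for one dimension cluster.
    """
    grid1_start = ray_length * min_value
    grid1_end = ray_length * mean_value
    grid2_end = ray_length * max_value
    return [("Grid1", grid1_start, grid1_end),
            ("Grid2", grid1_end, grid2_end)]
```

Short colored segments on a ray thus indicate a data item for which the cluster's dimensions agree closely.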
You can select the "Mean Band" button if you are in the flat Parallel Coordinates mode. A band is added around each data point, extending from the minimum to the maximum on each representative dimension. You can reduce overlaps using the "Band Extent Scaling" scroll bar in the Dimension Reduction dialog.
You can select Uniform Color or Node Color for the lower dimensional space display. If you choose Uniform Color, all the axes in the lower dimensional space display will use the system "Grid1" color. If you choose Node Color, the axes in the lower dimensional space display will have the same colors as their corresponding nodes in the dimension hierarchy shown in the InterRing display.
To choose between the Non-Pin and Pin modes when performing circular distortion.
To enable or disable circular distortion feedback and roll up/drill down feedback, and to show or hide the names of selected nodes.
To choose between selecting according to Entries and selecting according to Radius.
To switch the structure-based brushing mode between "cover all leaves" and "leaves only and filter out similar leaves".
In the "cover all leaves" mode, for each leaf node in the sub-branch to which structure-based brushing has been applied, either the leaf itself or one of its ancestors will be selected. In the other mode, only leaf nodes will be selected. If several leaves are similar to each other according to the dissimilarity threshold, only one of them will be selected. Unimportant leaves will also be filtered out.
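The difference between the two brushing modes can be sketched on a small cluster tree. This is a loose illustration under several assumptions: nodes are dictionaries with hypothetical "name", "radius", and "children" keys, a cluster counts as "similar" when its radius does not exceed the threshold, the first leaf found stands in for a group of similar leaves, and the filtering of unimportant leaves is left out because importance is not defined in this section.

```python
def cover_all_leaves(node, threshold):
    """Select the highest nodes whose radius is within the threshold.

    Every leaf is covered by itself or by exactly one selected ancestor.
    """
    if not node["children"] or node["radius"] <= threshold:
        return [node["name"]]
    picked = []
    for child in node["children"]:
        picked.extend(cover_all_leaves(child, threshold))
    return picked


def first_leaf(node):
    """Name of the first leaf reached by always descending to child 0."""
    while node["children"]:
        node = node["children"][0]
    return node["name"]


def leaves_only(node, threshold):
    """Select leaves only, keeping one leaf per group of similar leaves."""
    if not node["children"] or node["radius"] <= threshold:
        return [first_leaf(node)]
    picked = []
    for child in node["children"]:
        picked.extend(leaves_only(child, threshold))
    return picked
```

On a tree where leaves a and b form a tight cluster ab beside a distant leaf c, the first mode would select ab and c, while the second would select a and c.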
To reorder the dimension hierarchy according to their importance...
If dimension spacing is on, after the "Apply" button is pressed, the dimensions in the Parallel Coordinates and Star Glyphs displays will be placed according to the similarities between adjacent dimensions. The more similar two adjacent dimensions are, the closer they are placed to each other.
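A simple way to picture dimension spacing is to make the gap between adjacent axes proportional to their dissimilarity. The sketch below does this for a row of parallel axes; the function name, the linear mapping, and the fallback to even spacing are assumptions, not the tool's documented behavior.

```python
def axis_positions(dissimilarities, total_width=1.0):
    """Place n + 1 axes so that gap i is proportional to dissimilarity i.

    dissimilarities[i] is the dissimilarity between adjacent axes i and
    i + 1; the first axis sits at 0 and the last at total_width.
    """
    n = len(dissimilarities)
    total = sum(dissimilarities)
    if total == 0:
        # All neighbors identical: fall back to even spacing.
        return [total_width * i / n for i in range(n + 1)]
    positions = [0.0]
    for d in dissimilarities:
        positions.append(positions[-1] + total_width * d / total)
    return positions
```

With dissimilarities 0.25 and 0.75, the first two axes sit three times closer together than the last two.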
To reassign colors to all nodes of the dimension hierarchy according to the current tree structure. It can be used after you modify the hierarchy.
To reassign sweep angles to all nodes of the dimension hierarchy according to the current tree structure.
To reassign radii to all levels of the dimension hierarchy according to the current tree structure.
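In a radial space-filling layout, angles are typically handed down from parent to children. The sketch below shows one common scheme, in which each child receives a share of its parent's angle proportional to the number of leaves beneath it. Whether InterRing uses exactly this rule is an assumption, as are the function names and the dictionary node layout.

```python
def count_leaves(node):
    """Number of leaf nodes in the sub-branch rooted at node."""
    if not node["children"]:
        return 1
    return sum(count_leaves(child) for child in node["children"])


def assign_sweep_angles(node, start=0.0, sweep=360.0, out=None):
    """Return a dict mapping node name to (start angle, sweep angle) in degrees."""
    if out is None:
        out = {}
    out[node["name"]] = (start, sweep)
    total = count_leaves(node)
    child_start = start
    for child in node["children"]:
        # Each child gets a slice proportional to its leaf count.
        child_sweep = sweep * count_leaves(child) / total
        assign_sweep_angles(child, child_start, child_sweep, out)
        child_start += child_sweep
    return out
```

Running such a pass after modifying the hierarchy restores a layout that depends only on the current tree structure, discarding any earlier circular distortion.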
To unselect all the currently selected nodes.
To return to the original high dimensional space.
The message bar gives you many useful "how to" and "what is it" hints.