XmdvTool File Formats
1. Flat (Non-Hierarchical) Format
All versions of XmdvTool support a flat data format: an ASCII text
file whose alphanumeric fields are separated by blanks or newlines.
The data, structured as rows and columns of integer or real values,
is preceded by a simple header giving the data dimensions, the labels,
and the range of values for each dimension. Each dimension also has
a cardinality, which indicates how many bins will be used
for dimensional stacking (if you are uncertain how to specify
this, set it to a relatively small integer, say 2 to 4).
The basic flat data format, normally given a .okc extension, is
structured as follows:
Int_N_number_of_dimensions
Int_M_number_of_datapoints
String_fieldName_dimension1
String_fieldName_dimension2
...
String_fieldName_dimensionN
Float_minimum_dimension1 Float_maximum_dimension1 Int_cardinality_dimension1_in_dimstack
Float_minimum_dimension2 Float_maximum_dimension2 Int_cardinality_dimension2_in_dimstack
...
Float_minimum_dimensionN Float_maximum_dimensionN Int_cardinality_dimensionN_in_dimstack
Float_value_dimension1_datapoint1 Float_value_dimension2_datapoint1 ... Float_value_dimensionN_datapoint1
Float_value_dimension1_datapoint2 Float_value_dimension2_datapoint2 ... Float_value_dimensionN_datapoint2
...
Float_value_dimension1_datapointM Float_value_dimension2_datapointM ... Float_value_dimensionN_datapointM
Note: Blanks are not allowed in field names. If a field name
contains "(", the system ignores everything after it on the
same line.
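The .okc layout above could be read with a short script along these
lines. This is a minimal sketch, not XmdvTool's own reader: the names
OkcData and parse_okc are hypothetical, and cutting every line at the
first "(" is only one plausible reading of the note about field names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OkcData:
    names: List[str]          # one label per dimension
    minimums: List[float]     # lower bound of each dimension
    maximums: List[float]     # upper bound of each dimension
    cardinalities: List[int]  # bins per dimension for dimensional stacking
    rows: List[List[float]]   # M datapoints, each holding N values


def parse_okc(text: str) -> OkcData:
    """Parse the flat .okc layout (hypothetical helper, not part of XmdvTool)."""
    # Assumption: anything from "(" to the end of a line is ignored,
    # mirroring the note about "(" in field names.
    tokens: List[str] = []
    for line in text.splitlines():
        tokens.extend(line.split("(", 1)[0].split())

    n_dims, n_points = int(tokens[0]), int(tokens[1])
    pos = 2

    names = tokens[pos:pos + n_dims]
    pos += n_dims

    minimums, maximums, cardinalities = [], [], []
    for _ in range(n_dims):
        minimums.append(float(tokens[pos]))
        maximums.append(float(tokens[pos + 1]))
        cardinalities.append(int(tokens[pos + 2]))
        pos += 3

    values = [float(t) for t in tokens[pos:pos + n_dims * n_points]]
    if len(values) != n_dims * n_points:
        raise ValueError("file holds fewer values than the header declares")

    # Fields may be split by blanks or newlines, so rows are rebuilt by count.
    rows = [values[i * n_dims:(i + 1) * n_dims] for i in range(n_points)]
    return OkcData(names, minimums, maximums, cardinalities, rows)


# Made-up sample: 2 dimensions, 3 datapoints.
SAMPLE = """2
3
height(cm)
weight
150.0 200.0 4
40.0 120.0 4
160.0 55.0
175.5 70.2
190.0 95.0
"""

if __name__ == "__main__":
    data = parse_okc(SAMPLE)
    print(data.names, len(data.rows))
```

Because the values are regrouped by count rather than by line, a file
that wraps one datapoint across several lines parses the same way.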
2. Hierarchical Data Format
In XmdvTool 4.0, we introduced the concept of hierarchical
visualization of multivariate data and provided tools for creating
hierarchical clusters from flat data files. There are many techniques
for creating such hierarchies: through the natural structure of the
data (e.g., a file system), binning of dimensions (e.g., data
cube methods), or algorithmic techniques (clustering, partitioning).
To use the hierarchical approaches, the hierarchy must be supplied
in a file with a .cf extension, placed in the same directory
as its corresponding .okc file.
Format of a .cf file:
Int_L_number_of_nodes Int_N_number_of_dimensions
Int_id_node1 Int_parent_node1 Int_entries_node1 Float_sx1_node1 ... Float_sxN_node1 Float_sqrtRadius_node1
Int_id_node2 Int_parent_node2 Int_entries_node2 Float_sx1_node2 ... Float_sxN_node2 Float_sqrtRadius_node2
...
Int_id_nodeL Int_parent_nodeL Int_entries_nodeL Float_sx1_nodeL ... Float_sxN_nodeL Float_sqrtRadius_nodeL
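Each node record in the .cf layout has 3 + N + 1 fields, which makes
it straightforward to read by token count. The following is a minimal
sketch under that reading. CfNode, parse_cf and children_of are
hypothetical names, and the parent value -1 used for the root in the
sample is an assumption; the format description does not say how the
root's parent is encoded.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CfNode:
    node_id: int
    parent: int
    entries: int
    sx: List[float]     # the N Float_sx values, kept uninterpreted
    sqrt_radius: float


def parse_cf(text: str) -> List[CfNode]:
    """Parse the .cf layout (hypothetical helper, not part of XmdvTool)."""
    tokens = text.split()
    n_nodes, n_dims = int(tokens[0]), int(tokens[1])
    width = 3 + n_dims + 1  # id, parent, entries, sx1..sxN, sqrtRadius
    body = tokens[2:]
    if len(body) != n_nodes * width:
        raise ValueError("token count does not match the declared header")

    nodes = []
    for i in range(n_nodes):
        rec = body[i * width:(i + 1) * width]
        nodes.append(CfNode(
            node_id=int(rec[0]),
            parent=int(rec[1]),
            entries=int(rec[2]),
            sx=[float(t) for t in rec[3:3 + n_dims]],
            sqrt_radius=float(rec[3 + n_dims]),
        ))
    return nodes


def children_of(nodes: List[CfNode]) -> Dict[int, List[int]]:
    """Group node ids by their parent id to expose the hierarchy."""
    index: Dict[int, List[int]] = {}
    for node in nodes:
        index.setdefault(node.parent, []).append(node.node_id)
    return index


# Made-up sample: 3 nodes, 2 dimensions; root parent -1 is an assumption.
SAMPLE_CF = """3 2
0 -1 5 10.0 20.0 1.5
1 0 3 6.0 12.0 0.8
2 0 2 4.0 8.0 0.5
"""

if __name__ == "__main__":
    nodes = parse_cf(SAMPLE_CF)
    print(children_of(nodes))
```

The children_of index is one simple way to walk the cluster tree from
any node without assuming a particular ordering of records in the file.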
Download a tool to translate a *.okc file to a *.cf file
In XmdvTool 4.1, we refined this format to include more aggregation
information. Specifically, rather than assigning a single value to a
cluster (the radius, as used in the Birch clustering algorithm), we
specify extents for each dimension. The data file for the hierarchical
approaches therefore has a .cg extension and should be placed in the
same directory as its corresponding .okc file.
Format of a *.cg file:
Int_L_number_of_nodes Int_N_number_of_dimensions
Int_id_node1 Int_parent_node1 Int_entries_node1 Float_sx1_node1 ... Float_sxN_node1 Float_sqrtRadius_node1 Float_max1_node1 ... Float_maxN_node1 Float_min1_node1 ... Float_minN_node1
Int_id_node2 Int_parent_node2 Int_entries_node2 Float_sx1_node2 ... Float_sxN_node2 Float_sqrtRadius_node2 Float_max1_node2 ... Float_maxN_node2 Float_min1_node2 ... Float_minN_node2
...
Int_id_nodeL Int_parent_nodeL Int_entries_nodeL Float_sx1_nodeL ... Float_sxN_nodeL Float_sqrtRadius_nodeL Float_max1_nodeL ... Float_maxN_nodeL Float_min1_nodeL ... Float_minN_nodeL
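A .cg record extends the .cf record with N maximum values followed by
N minimum values, for 3N + 4 fields per node. The sketch below reads
that layout by token count. The name parse_cg and the dictionary keys
are hypothetical, and the sample values are made up. Note that the
maxima come before the minima in each record.

```python
from typing import Any, Dict, List


def parse_cg(text: str) -> List[Dict[str, Any]]:
    """Parse the .cg layout (hypothetical helper, not part of XmdvTool)."""
    tokens = text.split()
    n_nodes, n_dims = int(tokens[0]), int(tokens[1])
    # id, parent, entries, sx1..sxN, sqrtRadius, max1..maxN, min1..minN
    width = 3 * n_dims + 4
    body = tokens[2:]
    if len(body) != n_nodes * width:
        raise ValueError("token count does not match the declared header")

    sx_end = 3 + n_dims            # index of the sqrtRadius field
    max_start = sx_end + 1         # first per-dimension maximum
    min_start = max_start + n_dims # first per-dimension minimum

    nodes = []
    for i in range(n_nodes):
        rec = body[i * width:(i + 1) * width]
        nodes.append({
            "id": int(rec[0]),
            "parent": int(rec[1]),
            "entries": int(rec[2]),
            "sx": [float(t) for t in rec[3:sx_end]],
            "sqrt_radius": float(rec[sx_end]),
            "max": [float(t) for t in rec[max_start:min_start]],
            "min": [float(t) for t in rec[min_start:]],
        })
    return nodes


# Made-up sample: 2 nodes, 2 dimensions; root parent -1 is an assumption.
SAMPLE_CG = """2 2
0 -1 5 10.0 20.0 1.5 3.0 6.0 1.0 2.0
1 0 3 6.0 12.0 0.8 2.5 5.0 1.5 3.0
"""

if __name__ == "__main__":
    for node in parse_cg(SAMPLE_CG):
        print(node["id"], node["min"], node["max"])
```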
Download a tool to translate a *.cf file to a *.cg file
You can find examples of .okc, .cf and .cg files on the download page.