Scanbio DataSet (Renewed on 07/15/2005)

Name

Scanbio

URL

scanbio.tar.gz

scanbio.zip

Dimensions

8

Records

1356

Description

 

This is bioinformatics data drawn from a database of single nucleotide polymorphisms (SNPs).
From http://www.cs.wpi.edu/~ruiz/Publications/SPSR+01:genepi/revised_submitted_02_02.pdf:
We have developed a Multivariate Data Visualization (MDV) program, Scansort (Ward, unpublished; available at matt@cs.wpi.edu), which sorts the SNP loci using differences between genotype frequencies of affected and unaffected subjects. The program generates one record for each SNP as follows: {diff, 11h, 12h, 22h, 11s, 12s, 22s, d11, d12, d22, ID}, where ID is the SNP index in the original order; ijh is the frequency of unaffected (healthy) individuals with genotype ij; ijs is the frequency of affected (sick) individuals with genotype ij; and dij = |ijh - ijs|. diff is the sorted SNP index computed as

diff = 4 d11 + d12 + d22
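The record computation quoted above can be sketched in Python as follows. This is only an illustration, not the Scansort implementation: the names SnpRecord, genotype_diffs, diff_score and sort_by_diff are hypothetical, the term "4 d11" is read as 4 times d11, and sorting from largest to smallest diff is an assumption, since the excerpt does not state the order. Nothing is assumed here about the file layout inside the archives.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SnpRecord:
    """One SNP with genotype frequencies (hypothetical container)."""
    snp_id: int                           # ID: SNP index in the original order
    healthy: Tuple[float, float, float]   # (11h, 12h, 22h)
    sick: Tuple[float, float, float]      # (11s, 12s, 22s)


def genotype_diffs(rec: SnpRecord) -> Tuple[float, float, float]:
    """Return (d11, d12, d22), where dij = |ijh - ijs|."""
    d11, d12, d22 = (abs(h - s) for h, s in zip(rec.healthy, rec.sick))
    return d11, d12, d22


def diff_score(rec: SnpRecord) -> float:
    """diff = 4 d11 + d12 + d22, reading '4 d11' as a product (assumption)."""
    d11, d12, d22 = genotype_diffs(rec)
    return 4 * d11 + d12 + d22


def sort_by_diff(records: List[SnpRecord]) -> List[SnpRecord]:
    """Order SNPs by diff, largest first (the direction is an assumption)."""
    return sorted(records, key=diff_score, reverse=True)


# Made-up frequencies, for illustration only; not taken from the dataset.
example = [
    SnpRecord(snp_id=2, healthy=(0.5, 0.5, 0.0), sick=(0.5, 0.25, 0.25)),
    SnpRecord(snp_id=1, healthy=(0.5, 0.25, 0.25), sick=(0.25, 0.5, 0.25)),
]
ranked = sort_by_diff(example)
```

Because d11 carries a weight of 4 under this reading, a shift in the 11 genotype frequency moves a SNP further up the ranking than an equal shift in the 12 or 22 frequency.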

Dimension Description

Order
ID
11h
12h
22h
11s
12s
22s

Source