GPX Tutorial

Environment
  System: SUN Solaris 9
  Web Server: Apache HTTP server 2.0.50
  CGI: Perl 5.8.4
  Programming: Java 1.3
File Format
  1. Your data file should be a tab-delimited ASCII text file.
  2. All fields should be tab ('\t') separated.
  3. For best results, consider log-transforming your data. It is not necessary to normalize the data; normalization is done within our system.
  4. The first column is used to identify the genes in your data file.
  5. The second column is used to annotate the genes in your data file. If no annotation is available for a gene, leave the field blank.
  6. All other columns contain the expression levels as numerical values. If some values are missing in your data, leave those fields blank.
  7. If you have any questions about the data format, take a look at an example (open it in Excel to get a better view).
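
The rules above can be checked before uploading a file. The following is a minimal sketch in Python, not part of GPX itself: the function name parse_expression_table, the record layout, and the choice of base-2 logarithm are all assumptions made for illustration, and the sketch assumes the file has no header row.

```python
import csv
import io
import math


def parse_expression_table(text, log_transform=False):
    """Parse tab-delimited expression data into a list of gene records.

    Column 1 is the gene identifier, column 2 the (possibly blank)
    annotation, and every remaining column an expression level.
    Blank expression fields are returned as None (missing values).
    """
    genes = []
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for line_number, row in enumerate(reader, start=1):
        if not row:
            continue  # skip completely empty lines
        gene_id = row[0].strip()
        if not gene_id:
            raise ValueError(f"line {line_number}: missing gene identifier")
        annotation = row[1].strip() if len(row) > 1 else ""
        values = []
        for cell in row[2:]:
            cell = cell.strip()
            if cell == "":
                values.append(None)  # missing value left blank in the file
                continue
            value = float(cell)  # raises ValueError on non-numeric text
            if log_transform:
                if value <= 0:
                    raise ValueError(
                        f"line {line_number}: cannot log-transform {value}"
                    )
                value = math.log2(value)  # base 2 is an assumption
            values.append(value)
        genes.append({"id": gene_id, "annotation": annotation, "values": values})
    return genes


# Two hypothetical genes: the first has a missing last value,
# the second has no annotation and a missing middle value.
SAMPLE = "g1\tkinase\t1.0\t4.0\t\ng2\t\t2.0\t\t8.0\n"
records = parse_expression_table(SAMPLE, log_transform=True)
```

Keeping missing values as None, rather than dropping them, preserves the column positions, so each value stays aligned with its time point.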
Demo Files
  1. Iyer's data set: [Original site]
    Contains mRNA transcript levels measured during the response of human fibroblasts to serum. It consists of 8600 genes monitored over a 12-point time series. Only 517 genes passed the significance test and were used in clustering. The authors reported ten clusters.
  2. Cho's data set: [Original Site]
    Contains mRNA transcript levels measured during the mitotic cell cycle of the budding yeast S. cerevisiae. It consists of 386 genes measured at 17 time points. The authors assigned the genes to five cell-cycle phases.
  3. Spellman's data set: [Original Site]
    Contains 6276 mRNA transcript levels measured during three time series of the budding yeast S. cerevisiae. 800 cell-cycle-regulated genes were identified in 5 cell-cycle phases.

© 2004 Daxin Jiang. All rights reserved.
Department of Computer Science and Engineering, State University of New York at Buffalo
Last Modified: 9/6/2004