ContactPerson: djiang3@cse.buffalo.edu
Remote host: tchaikovsky.cse.buffalo.edu
### Begin Citation ### Do not delete this line ###
%R 2005-04
%U /scratch/djiang3/paper.ps
%A Tang, C., Ramanathan, M., Jiang, D., and Zhang, A.
%T A Semi-Supervised Learning Method for Coherent Pattern Detection from Gene-Sample-Time Series Datasets
%D March 09, 2005
%I Department of Computer Science and Engineering, SUNY Buffalo
%K Gene-sample-time microarray data, semi-supervised learning
%Y Algorithms
%X DNA microarrays provide simultaneous, semi-quantitative measurements of the expression levels of thousands of genes from a single experimental sample. The availability of such data sets can enhance our understanding of gene function and regulation if the patterns underlying gene expression data can be identified. In this paper we study the problem of coherent pattern detection from gene-sample-time series expression data sets. These data sets result from microarray experiments in which a gene expression time series is obtained on multiple samples; the gene identities represent the first dimension, the sample properties represent the second dimension, and time represents the third dimension. Such Gene-Sample-Time Series (GST) data arise naturally in microarray experimental designs, e.g., when the pharmacodynamics of gene expression in responder and non-responder groups is investigated. A new semi-supervised learning method is proposed to search for coherent blocks in gene-sample-time series data sets. Each block contains a subset of genes and a subset of samples such that the patterns within the block are coherent along the time series. The coherent blocks may identify the samples corresponding to some phenotypes (e.g., disease states) and suggest candidate genes correlated with those phenotypes.
We empirically evaluate the performance of our approach on a real microarray data set on the pharmacodynamics of gene expression in multiple sclerosis patients after interferon beta treatment.