Tuesday, September 23, 2014
Location: Salon 1-3

Translational Bioinformatics Panel
Chairs: Orly Alter and May D. Wang
Panelists: Orly Alter, Matthew J. Brauer, Thomas G. Graeber, and May D. Wang

Discovery of Principles of Nature from Matrix and Tensor Modeling of Large-Scale Molecular Biological Data

Orly Alter
USTAR Associate Professor of Bioengineering and Human Genetics
Scientific Computing and Imaging (SCI) Institute, University of Utah

I will briefly describe the use of matrix and tensor decompositions in the simultaneous modeling of different types of large-scale molecular biological data, from different studies of cell division and cancer and from different organisms, to computationally predict previously unknown physical, cellular, and evolutionary mechanisms that govern the activity of DNA and RNA. I will present novel multi-matrix and multi-tensor generalizations of the singular value decomposition, as well as experimental verification and validation of some of the computational predictions. Finally, I will describe a laboratory test, based on one of these discoveries, that is on the verge of being implemented in a clinical setting. These models bring physicians a step closer to one day being able to predict and control the progression of cell division and cancer as readily as NASA engineers plot the trajectories of spacecraft today.

Integrating Data from Discovery Research, Preclinical Studies and Clinical Trials

Matthew J. Brauer
Computational Biologist and Head of the Scientific Computing Group
Department of Bioinformatics and Computational Biology, Genentech, Inc.

Biological discovery is only the first step in the process of drug development. Even if a scientific idea has been well developed in the lab, integrating data from in vitro, in vivo, and clinical sources into a coherent picture remains a major challenge of translational bioinformatics. I will describe our efforts to develop biomarkers for an oncology indication, discuss the difficulties facing the field, and present possible future directions for maximizing the value of clinical trial data.

Rank-Rank Hypergeometric Overlap (RRHO) Gene Expression Signature Comparison and the Benefits of Cross-Species Analysis

Thomas G. Graeber
Associate Professor of Molecular and Medical Pharmacology
Crump Institute for Molecular Imaging and UCLA Metabolomics Center, UCLA

The Rank-Rank Hypergeometric Overlap (RRHO) bioinformatic algorithm was developed for gene expression signature comparison in cases where the similarity is relatively weak but statistically significant. The RRHO approach provides a statistical measure of overlap and a graphical map of the pattern of correlation between expression profiles. Whereas previous techniques require choosing a fixed threshold of differential expression, RRHO uses the full list of genes ranked by their degree of differential expression. In this sense, RRHO is a two-dimensional analog of Gene Set Enrichment Analysis (GSEA). Translational bioinformatic uses of RRHO in pharmaceutical research include drug response comparisons to guide drug development, molecular signature-based validation of a mouse model of a human disease, comparisons of neuron developmental stages, and prioritization of leads from genome-wide association studies (GWAS). RRHO is available at http://systems.crump.ucla.edu/rankrank
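
The rank-rank idea in the abstract above can be illustrated with a minimal sketch. This is not the implementation distributed at the URL above: the function names hypergeom_tail and rrho_map, the step parameter, and the restriction to a one-sided over-enrichment test without multiple-testing correction are all assumptions made for illustration. For each pair of rank cutoffs (i, j), the sketch counts the genes shared by the top i of one list and the top j of the other, and scores that overlap with a hypergeometric tail probability.

```python
import math
import sys


def hypergeom_tail(k, population, successes, draws):
    """Return P(X >= k) for a hypergeometric X, using exact integer binomials."""
    upper = min(successes, draws)
    numerator = sum(
        math.comb(successes, x) * math.comb(population - successes, draws - x)
        for x in range(k, upper + 1)
    )
    return numerator / math.comb(population, draws)


def rrho_map(ranked_a, ranked_b, step=1):
    """Return a grid of -log10 p-values for overlap at every pair of rank cutoffs.

    ranked_a and ranked_b are hypothetical inputs: the same set of gene
    identifiers, each ordered by degree of differential expression in one
    experiment. Only over-enrichment is tested (an assumption of this sketch).
    """
    if len(ranked_a) != len(set(ranked_a)) or len(ranked_a) != len(ranked_b):
        raise ValueError("each list must contain every gene exactly once")
    if set(ranked_a) != set(ranked_b):
        raise ValueError("both lists must rank the same set of genes")

    n = len(ranked_a)
    cutoffs = range(step, n + 1, step)
    grid = []
    for i in cutoffs:
        top_a = set(ranked_a[:i])
        row = []
        for j in cutoffs:
            overlap = len(top_a.intersection(ranked_b[:j]))
            p = hypergeom_tail(overlap, n, i, j)
            # Guard against underflow to zero before taking the logarithm.
            row.append(-math.log10(max(p, sys.float_info.min)))
        grid.append(row)
    return grid
```

Calling rrho_map on two rankings of the same genes yields a grid that could be rendered as a heat map; large values near the low-rank corner would indicate that the most differentially expressed genes in the two experiments overlap more than expected by chance. The step parameter trades resolution for speed, since the number of hypergeometric tests grows with the square of the number of cutoffs.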

Comprehensive RNA-Seq Data Analysis Pipeline Investigation for Translational Genomics

May D. Wang
Associate Professor, Kavli Fellow, and Georgia Research Alliance Distinguished Cancer Scholar
Coulter Department of Biomedical Engineering, Department of Electrical and Computer Engineering, and the Winship Cancer Institute, Georgia Institute of Technology and Emory University

As RNA-seq technology becomes available for translational genomics, finding the proper data analysis pipelines remains a critical challenge. At the FDA's Sequencing Quality Control (SEQC) Consortium, we investigated 278 RNA-seq data analysis pipelines to determine their impact on gene expression accuracy and precision, sensitivity in detecting low-expression genes, specificity in detecting differentially expressed genes, and downstream prediction performance. We found that the quality of gene expression estimates and the statistical power of downstream analysis were significantly affected by the interaction among multiple pipeline components. We established a guideline for selecting RNA-seq data analysis pipelines for improved reproducibility and effective decision making.

Panel Discussion