Chancellor's Professor in the Department of Computer Science
Over the past three decades, machine learning approaches have had a profound influence on many fields, including bioinformatics. We will provide a brief historical perspective on machine learning and its applications to proteomics, particularly structural proteomics, and discuss why structural proteomics is important for machine learning. We will then present state-of-the-art machine learning methods for predicting protein structures and structural features, from secondary structure to contact maps. We will stress and demonstrate the importance of combining supervised and unsupervised learning, and of using deep and modular architectures capable of integrating information over space and "time" at multiple scales. Finally, we will describe two proteomic applications that have benefited from statistical machine learning methods: (1) the discovery of new drug leads for neglected diseases; and (2) the development of high-throughput platforms to study the immune response, with applications to antigen discovery and vaccine development.
Regents-GRA Eminent Scholar Chair and
Professor of bioinformatics and computational biology
We have recently discovered that the genomic locations of genes in bacteria are highly constrained by the cellular processes in which they are involved. For the first time, we understand that the locations of genes follow both global and local rules. This realization has led to a new paradigm for tackling and solving some very challenging genomic analysis problems. I will discuss this discovery and a number of applications that we are currently pursuing, including the assignment of genes to pathway holes and complete genome assembly.
Martha L. Bulyk
Phone: (617) 525-4725
The interactions between sequence-specific transcription factors (TFs) and their DNA binding sites are an integral part of the gene regulatory networks within cells. My group developed a highly parallel in vitro microarray technology, termed protein binding microarrays (PBMs), for characterizing the sequence specificities of DNA-protein interactions at high resolution. Using PBMs, we have determined the DNA binding specificities of hundreds of TFs from a wide range of species. More recently, we have used the PBM technology to investigate TF heterodimers and higher-order complexes. The PBM data have permitted us to identify novel TFs and their DNA binding sequence preferences, predict the target genes and condition-specific regulatory roles of TFs, predict and analyze tissue-specific transcriptional enhancers, investigate functional divergence of paralogous TFs within a TF family, investigate the molecular determinants of TF-DNA recognition specificity, and distinguish direct from indirect TF-DNA interactions in vivo. Notably, not all DNA binding sites of a TF function equally. Further analyses of TFs and cis-regulatory elements are likely to reveal features of cis-regulatory sequences that are important in gene regulation.