Sequence analysis tool development

Recent advances in sequencing technologies have enabled systematic, genome-wide readouts of cell function. Bioinformatics approaches are critical to take full advantage of these technologies. Our laboratory strives to develop the tools to analyze, integrate, visualize and fully leverage advances in genome-wide experimental technologies. We have developed and continue to enhance the Scripture toolkit for short-read analysis (ChIP-Seq and RNA-Seq) and the SiPhy suite for comparative sequence analysis. Both tools have been critical in our exploration of the functional landscape of the human genome.

The SiPhy comparative sequence analysis method

Evolution of non-coding genes

Large intergenic non-coding RNAs (lincRNAs) are spliced, polyadenylated and capped transcripts that do not overlap annotated protein-coding genes and have little to no protein-coding potential. We recently identified about 1500 lincRNAs using both epigenetic signatures of expression and transcriptional profiling by RNA-Seq. LincRNAs are an integral part of the cell’s transcriptional network. Interestingly, while lincRNAs display clear signatures of selection, their conservation profiles are markedly different from those of protein-coding genes. Our lab aims to integrate functional data such as protein-RNA, RNA-DNA and RNA-RNA interactions with comparative analysis to understand the evolution of these interactions and how they have changed the molecular circuitry of the cell. In this project we collaborate closely with the Guttman Laboratory.

Scripture reconstructs lincRNAs across mammals

Gene regulation

It is now clear that most phenotypic changes observed across vertebrates are due not to changes in protein-coding genes but to changes in gene regulation. However, how gene regulation is encoded in the genome is only now beginning to be understood. How the binding of a transcription factor to a promoter or enhancer affects target gene expression is still unclear. Only by studying the interplay between regulatory elements and their targets can we evaluate the functional role of cis-regulatory elements. Sequencing assays now allow us to monitor transcription factor binding (ChIP-Seq) and cellular output (RNA-Seq) at an unprecedented scale. We are currently studying this interplay using the response of bone marrow-derived dendritic cells (BMDCs) to pathogen stimuli as our biological system. So far, our integrated analysis of temporal datasets of transcription factor binding and gene expression has shown that binding of different factors is responsible for subtle expression patterns that control specific pathways. These pathways have very distinct forms of regulation. A minority of pathways are regulated by a few transcription factors (e.g. Stat1 and Stat2) whose targets are highly responsive to knockdown of these factors and have highly conserved binding sites. In contrast, most pathways are controlled by a larger set of redundant transcription factors whose binding has an additive effect on expression: the number, rather than the type, of factors bound gives rise to different expression levels. Our group continues to work closely with experimental groups to further enhance and characterize these datasets in order to crack the regulatory code of mammalian immune cells.
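The additive pattern described above, in which expression tracks the number of bound factors rather than their identity, can be illustrated with a minimal sketch. This is not the analysis used in our work. The gene names, factor names, expression values and the helper `mean_expression_by_bound_count` are all hypothetical, chosen only to show the kind of summary one might compute from ChIP-Seq binding calls and RNA-Seq expression levels.

```python
from collections import defaultdict


def mean_expression_by_bound_count(binding, expression):
    """Group genes by how many factors bind them and average their expression.

    binding:    dict mapping gene -> set of bound transcription factors
    expression: dict mapping gene -> expression level (arbitrary units)
    """
    groups = defaultdict(list)
    for gene, level in expression.items():
        # Genes with no binding calls count as having zero bound factors.
        n_bound = len(binding.get(gene, set()))
        groups[n_bound].append(level)
    return {n: sum(levels) / len(levels) for n, levels in sorted(groups.items())}


# Hypothetical toy data, not real measurements.
binding = {
    "geneA": {"TF1"},
    "geneB": {"TF2"},
    "geneC": {"TF1", "TF2"},
    "geneD": {"TF1", "TF2", "TF3"},
}
expression = {"geneA": 2.0, "geneB": 2.5, "geneC": 4.0, "geneD": 6.0, "geneE": 0.5}

profile = mean_expression_by_bound_count(binding, expression)
for n_bound, mean_level in profile.items():
    print(f"{n_bound} factors bound: mean expression {mean_level:.2f}")
```

In this toy example geneA and geneB are bound by different single factors yet fall into the same group, which is the sense in which the count of bound factors, not their type, is the variable of interest.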