CS700: Graduate Seminar in Computer Science & Informatics

Comparative analysis of regulatory region activity between human and mouse
Olgert Denas, Department of Mathematics and Computer Science

The extent to which the regions that regulate gene expression (cis-regulatory modules, or CRMs) are under evolutionary constraint within mammals is not well understood. At the sequence level, different elements and classes of elements show very different levels of constraint. Phenomena such as transcription factor binding site turnover allow substantial sequence-level change while still preserving function. Recent studies focusing on specific transcription factors have found that transcription factor occupancy is frequently not conserved. The comprehensive functional genomic annotations being produced by the ENCODE and Mouse ENCODE projects provide a unique opportunity to evaluate systematically the relationship between conservation of sequence and conservation of function. Here we focus on ChIP-seq data for specific transcription factors. We use human-mouse sequence alignments to estimate how much of the human regulatory material is conserved in mouse. Across all annotations and cell lines, we find that, on average, ~30% of regulatory regions are conserved at the sequence level in mouse. Conservation varies more among factors than among cell types. If we consider only occupancy in analogous cell types, only a small fraction of regions show conservation of occupancy. While such poor conservation is usually attributed to turnover, we show that a portion of these elements is being reused in mouse in other contexts: although they are not active in the analogous mouse cell type, many still show regulatory activity in other mouse cell types.
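
The distinction drawn above between sequence conservation, occupancy conservation, and reuse in another context can be illustrated with a minimal Python sketch. It is not the pipeline used in this work. It assumes each human region has already been mapped to mouse coordinates through a human-mouse alignment (or to `None` when it does not align), and that mouse ChIP-seq peaks are available per cell type. The names `Interval`, `overlaps`, and `classify`, the category labels, the cell type labels, and all coordinates are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Interval:
    """A genomic interval (hypothetical helper type for this sketch)."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def classify(mapped: Optional[Interval],
             mouse_peaks: Dict[str, List[Interval]],
             analogous: str) -> str:
    """Classify one human region by what happens at its mouse ortholog.

    mapped      -- the region in mouse coordinates, or None if it did not align
    mouse_peaks -- mouse occupancy peaks keyed by cell type
    analogous   -- the mouse cell type matching the human cell type
    """
    if mapped is None:
        return "not_aligned"
    # Collect every mouse cell type with a peak overlapping the mapped region.
    active = {cell for cell, peaks in mouse_peaks.items()
              if any(overlaps(mapped, p) for p in peaks)}
    if analogous in active:
        return "occupancy_conserved"
    if active:
        return "reused_in_other_context"
    return "sequence_only"


# Toy data, invented for illustration only.
mouse_peaks = {
    "erythroid": [Interval("chr7", 100, 200)],
    "liver": [Interval("chr7", 500, 600)],
}
regions = [Interval("chr7", 150, 250), Interval("chr7", 550, 650),
           Interval("chr7", 900, 950), None]
labels = [classify(r, mouse_peaks, "erythroid") for r in regions]
```

Counting the labels over all regions would give the kind of fractions discussed above, with `reused_in_other_context` capturing elements that are inactive in the analogous mouse cell type but occupied elsewhere.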