Summer Research Training Program in Big Data Science

The 2016 BD2K-LINCS DCIC Summer Research Training Program in Biomedical Big Data Science runs June 6 to August 12.

Illuminating the Druggable Genome

Mount Sinai's Knowledge Management Center for Illuminating the Druggable Genome.

Projects and Grants

The Ma'ayan Laboratory applies computational and mathematical methods to study the complexity of regulatory networks in mammalian cells. We apply graph-theory algorithms, machine-learning techniques and dynamical modeling to study how intracellular regulatory systems function as networks to control cellular processes such as differentiation, de-differentiation, apoptosis and proliferation. We develop software systems to help experimental biologists form novel hypotheses from high-throughput data, and develop theories about the structure and function of regulatory networks in mammalian systems. 

We lead two NIH-funded Centers: the BD2K-LINCS Data Coordination and Integration Center (DCIC), and the Knowledge Management Center for Illuminating the Druggable Genome.

Research Aims:

Assemble the largest novel collection of gene set libraries

Gene Set Enrichment Analysis and Gene Ontology analysis are central to all biological investigations that measure gene and protein expression at the global scale. Until recently, such analyses were limited to pathway enrichment and/or Gene Ontology enrichment. We showed that enrichment analysis can be expanded to use data from many biological domains. By developing the tools Kinase Enrichment Analysis (KEA), ChIP-X Enrichment Analysis (ChEA), Lists2Networks and Enrichr, we demonstrated that many resources can be converted into useful gene set libraries, and that these libraries can better inform analyses of genome-wide expression studies. So far, over 25,000 unique users have used the enrichment analysis software tools we developed.
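To make the idea of enrichment against a gene set library concrete, here is a minimal sketch in Python. It scores the overlap between an input gene list and each gene set with an upper-tail hypergeometric test, which is one common choice for over-representation analysis and not necessarily the statistic used by KEA, ChEA, Lists2Networks or Enrichr. The function names, the toy library and the background size are all hypothetical, and the sketch assumes every gene belongs to the stated background.

```python
from math import comb


def enrichment_p_value(input_genes, gene_set, background_size):
    """Return (overlap, p) where p = P(overlap >= observed) under a
    hypergeometric null. Assumes all genes lie within the background."""
    input_genes = set(input_genes)
    gene_set = set(gene_set)
    k = len(input_genes & gene_set)   # observed overlap
    n = len(input_genes)              # size of the input list
    K = len(gene_set)                 # size of the library gene set
    N = background_size               # assumed number of background genes
    # Sum the probability of every overlap at least as large as the observed one.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return k, tail / comb(N, n)


def rank_library(input_genes, library, background_size):
    """Score every term in a {term: genes} library, most significant first."""
    results = []
    for term, genes in library.items():
        k, p = enrichment_p_value(input_genes, genes, background_size)
        results.append((term, k, p))
    return sorted(results, key=lambda r: r[2])


# Hypothetical toy library with placeholder gene names.
toy_library = {
    "term1": {"GENE_A", "GENE_B", "GENE_C"},
    "term2": {"GENE_X", "GENE_Y", "GENE_Z"},
}
toy_input = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}

for term, overlap, p in rank_library(toy_input, toy_library, background_size=20):
    print(term, overlap, p)
```

In practice the p-values across a whole library would also need a multiple-testing correction, which this sketch omits.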

Develop novel methods to identify differentially expressed genes, perform gene set enrichment analysis and set up benchmarks for such methods

Identifying differentially expressed genes, and then performing gene set enrichment analysis to identify biological themes in gene expression data, are among the key statistical tasks in systems biology and genomics. We develop multivariate methods to better identify the more “correct” differentially expressed genes from genomics studies, and to better perform enrichment analysis. Using a novel benchmarking strategy that we developed, we show that methods such as limma, SAM and DESeq for differential expression, and GSEA for gene set enrichment analysis, can be compared fairly and their performance evaluated.

Understand the structure and dynamics of cellular regulatory networks

Analysis of the gene sets and networks we have collected can uncover design principles and modules that make up the complex organizational structure of mammalian cells. Analysis of Big Data in the field also points to the existence of various experimental biases and computational limitations. We aim to develop new theories and new algorithms to better extract knowledge from such complex data. Many of the theoretical observations we extracted from the topologies of biological networks and gene sets are manifestations of general design principles observed in many complex systems, not just in biological networks, and we are interested in understanding how such principles emerge and how they are related.

Contact Us

Avi Ma'ayan, PhD
Tel: 212-241-1153
Send e-mail

Annenberg Building
1468 Madison Avenue
19-54 (Office) | 19-50 (Lab)
One Gustave L. Levy Place
Box 1603
New York, NY 10029

In the News

Genetics: Big Hopes for Big Data
News in Nature | Outlook

NIH Launches a United Ecosystem for Big Data
Article in Biomedical Computation Review

Stem Cell (Re)Programming: Computing New Recipes
Article in Biomedical Computation Review

Center to Seek New Therapeutics by Integrating Gene, Protein Databases
Mount Sinai press release

Systems Pharmacology Approaches for Drug and Cancer Research

ESCAPE: Database for Integrating High Content Published Data Collected from Human and Mouse Embryonic Stem Cells
Article in RNA-Seq Blog

Society of Toxicology 2013 Annual Meeting
News article in Drug Discovery News

New Computational Method to Help Organize Scientific Data
Press release on

Mount Sinai Algorithm Predicts Drug Side Effects
Press release on

Mutations in 3 Genes Linked to Autism Spectrum Disorders
Press release on

HIPK2 Regulator Protein Plays a Crucial Role in Kidney Fibrosis
Press release on

New Database Could Speed Up Drug Discovery
Tech news feature on CNET

Animating Molecular Biology
Article in Biomedical Computation Review

Systematic Tracking of Cell Fate Changes
News and views article in Nature Biotechnology

Computational Honeycombs Drip with Data
News item in NIGMS Computing Life

Molecular Movies: New Software Animates Gene Expression Data
Technology observation on Scientific American Online

Stem Cells, Systems Biology and Human Feedback
News feature in Nature Reports Stem Cells