Mount Sinai Center for Bioinformatics


We collaborate with researchers both within the Icahn School of Medicine at Mount Sinai and elsewhere by analyzing their data with the tools and pipelines we have developed. In particular, the Center addresses the need to analyze, visualize, and mine data from omics studies, such as transcriptomics, epigenomics, proteomics, and metabolomics, for drug discovery.

Software Tools

We have developed several popular web-based software tools that can be used to discover new knowledge from data and to predict small molecules as novel leads. They support a variety of projects involving different data types.

Gene-List Enrichment Analysis

Enrichr is an integrative web-based and mobile gene-list enrichment analysis tool. It includes more than 120 gene-set libraries, an alternative approach to ranking enriched terms, and several interactive visualizations that display enrichment results using the JavaScript library Data-Driven Documents (D3). Enrichr is open source and freely available online. Users can easily embed it in any tool that performs gene-list analysis.
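As a rough illustration of what gene-list enrichment analysis computes, the sketch below ranks the terms of a gene-set library by a hypergeometric over-representation p-value. It is a generic textbook test, not Enrichr's implementation or its alternative ranking approach; the function names, the toy library, and the background size are all hypothetical.

```python
from math import comb


def hypergeom_pvalue(overlap, list_size, set_size, background_size):
    """P(X >= overlap) when list_size genes are drawn from a background
    that contains set_size members of the gene set."""
    max_overlap = min(list_size, set_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(set_size, k) * comb(background_size - set_size, list_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


def enrich(gene_list, library, background_size):
    """Return (term, overlapping genes, p-value) tuples, smallest p-value first."""
    genes = set(gene_list)
    results = []
    for term, members in library.items():
        overlap = genes & set(members)
        p = hypergeom_pvalue(len(overlap), len(genes), len(members), background_size)
        results.append((term, sorted(overlap), p))
    return sorted(results, key=lambda r: r[2])


# Toy, made-up library and gene list for illustration only.
toy_library = {
    "term_A": ["G1", "G2", "G3", "G4"],
    "term_B": ["G5", "G6", "G7", "G8"],
}
ranked = enrich(["G1", "G2", "G3", "G9"], toy_library, background_size=20)
for term, overlap, p in ranked:
    print(term, overlap, round(p, 4))
```

In this toy run, the term sharing three of its four genes with the input list ranks above the term with no overlap. A real analysis would also correct the p-values for testing many terms at once.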

Enrichr: a comprehensive gene set enrichment analysis web server 2016 update
Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool

L1000 Characteristic Direction Signature Search Engine

L1000CDS2 finds consensus signatures that match a user’s input gene lists or input signatures. The underlying dataset is the LINCS L1000 collection of small-molecule expression profiles generated at the Broad Institute by the Connectivity Map team. We computed the differentially expressed genes for these profiles using our multivariate method, the Characteristic Direction.
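To make the idea of matching an input signature against a reference collection concrete, here is a minimal sketch that scores reference signatures by cosine similarity to a query vector. Cosine similarity is an assumption chosen for illustration, since the text above does not state the scoring used by the search engine. The compound names, vectors, and the `mode` parameter are hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length expression vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def search(query, references, mode="reverse"):
    """Rank reference signatures against the query.

    mode="mimic" puts the most similar signatures first;
    mode="reverse" puts the most opposite signatures first.
    """
    scored = [(name, cosine(query, vec)) for name, vec in references.items()]
    return sorted(scored, key=lambda s: s[1], reverse=(mode == "mimic"))


# Made-up reference signatures over three genes, for illustration only.
references = {
    "compound_X": [1.0, -1.0, 0.5],
    "compound_Y": [-1.0, 1.0, -0.5],
    "compound_Z": [0.0, 0.0, 1.0],
}
query = [2.0, -2.0, 1.0]

print(search(query, references, mode="reverse")[0][0])
print(search(query, references, mode="mimic")[0][0])
```

Here `compound_Y` points in exactly the opposite direction to the query and so tops the "reverse" ranking, while `compound_X` is parallel to it and tops the "mimic" ranking.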

L1000CDS2: LINCS L1000 characteristic direction signatures search engine

Biological Knowledge Engine

The Harmonizome is a biological knowledge engine built on top of information about genes and proteins from 114 datasets. To create it, we distilled information from the original datasets into attribute tables that define significant associations between genes and attributes. Depending on the dataset, attributes can be genes, proteins, cell lines, tissues, experimental perturbations, diseases, phenotypes, or drugs. Gene and protein identifiers were mapped to NCBI Entrez Gene Symbols, and attributes were mapped to appropriate ontologies. We also computed gene-gene and attribute-attribute similarity networks from the attribute tables. These attribute tables and similarity networks can be integrated to perform many types of computational analyses for knowledge discovery and hypothesis generation.
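The step from an attribute table to a gene-gene similarity network can be sketched as follows. The example treats each gene's attributes as a set and uses the Jaccard index as the similarity score. The Jaccard index is an assumption for illustration only, as the similarity measure used for the Harmonizome is not stated here. The table contents and function names are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets: shared items divided by all items."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def gene_similarity_network(attribute_table, threshold=0.0):
    """Return {(gene1, gene2): score} for gene pairs scoring above threshold."""
    edges = {}
    for g1, g2 in combinations(sorted(attribute_table), 2):
        score = jaccard(attribute_table[g1], attribute_table[g2])
        if score > threshold:
            edges[(g1, g2)] = score
    return edges


# Made-up gene-to-attribute associations (here, tissues) for illustration.
toy_table = {
    "GENE_A": {"liver", "kidney", "heart"},
    "GENE_B": {"liver", "kidney"},
    "GENE_C": {"brain"},
}
network = gene_similarity_network(toy_table)
print(network)
```

An attribute-attribute network could be built the same way after inverting the table so that each attribute maps to the set of genes associated with it.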

The harmonizome: a collection of processed datasets gathered to serve and mine knowledge about genes and proteins

Gene Expression and Enrichment Vector Analyzer

GEN3VA is a web-based resource for analyzing collections of themed gene expression signatures. It performs bulk enrichment analyses to produce an interactive web-based report with a variety of visualizations. For example, we used GEN3VA to automatically create a report of 49 gene expression signatures collected from studies that compared normal tissues to tissues from patients or mouse models of amyotrophic lateral sclerosis (ALS). This report provides a tabular view of all 49 gene expression signatures with their associated metadata, along with principal component analysis, hierarchical clustering, and enrichment analyses.

For the enrichment analysis component of the report, GEN3VA submits every gene signature from the collection to Enrichr and L1000CDS2. Enrichr performs enrichment analysis against many gene-set libraries, including pathway databases, the Gene Ontology, and gene sets regulated by transcription factors. L1000CDS2 queries the gene signatures against the LINCS L1000 dataset to identify small molecules that can reverse or mimic the input expression signatures. Users can view the results of these analyses in interactive heatmaps, which may help reveal regulatory mechanisms that are unique to one signature or common to several, as well as candidate small molecules for further experimental testing.
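The aggregation idea behind such a report, finding terms that are common to all signatures versus unique to one, can be sketched as below. This is not GEN3VA's code and it does not call Enrichr or L1000CDS2. The per-signature term lists are made-up placeholders standing in for enrichment results, and `summarize_terms` is a hypothetical helper.

```python
from collections import defaultdict


def summarize_terms(enrichment_by_signature):
    """Split enriched terms into those shared by every signature and those
    that appear in exactly one signature."""
    term_to_signatures = defaultdict(set)
    for signature, terms in enrichment_by_signature.items():
        for term in terms:
            term_to_signatures[term].add(signature)
    n = len(enrichment_by_signature)
    common = sorted(t for t, s in term_to_signatures.items() if len(s) == n)
    unique = sorted(t for t, s in term_to_signatures.items() if len(s) == 1)
    return common, unique


# Placeholder enrichment results, one list of top terms per signature.
toy_results = {
    "signature_1": ["term_A", "term_B"],
    "signature_2": ["term_A", "term_C"],
    "signature_3": ["term_A", "term_B"],
}
common, unique = summarize_terms(toy_results)
print("common:", common)
print("unique:", unique)
```

The same term-to-signature mapping is the kind of data a term-by-signature heatmap would display, with one row per term and one column per signature.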

GEN3VA: aggregation and analysis of gene expression signatures from related studies