Software Tools

We develop computational approaches for organizing, visualizing, analyzing, and modeling data from high-throughput molecular profiling experiments. These approaches help experimental systems biologists form rational hypotheses for further experimentation. We analyze high-dimensional data collected for projects that integrate results from multiple layers of regulation (genomics, transcriptomics, and proteomics). Algorithms and datasets are delivered as software so that our methodologies can reach the interested systems biology research community. Below are some of the software tools we have developed:


Enrichr is an integrative web-based and mobile gene-list enrichment analysis tool that includes over 30 gene-set libraries, an alternative approach to rank enriched terms, and various interactive visualization approaches to display enrichment results using the JavaScript library Data Driven Documents (D3). Enrichr is open source and freely available online. The software can also be embedded easily into any tools that perform gene list analysis.
Website: Enrichr | Publication: PMID: 23586463


Drug Pair Seeker (DPS) is a Java program that attempts to predict and prioritize pairs of drugs using the Connectivity Map dataset. Users can enter lists of up and down differentially expressed genes from their experiments to receive a ranked list of drug combinations that are predicted to either reverse or aggravate the gene expression state of the cells or tissue of interest using a simple formula.
Website: DPS


Network2Canvas (N2C) is a web application that provides an alternative way to view networks. N2C visualizes networks by placing nodes on a square toroidal canvas. The network nodes are clustered on the canvas using simulated annealing to maximize local connections, and each node's brightness is made proportional to its local fitness. The interactive canvas is implemented in HTML5 with the JavaScript library Data-Driven Documents (D3). We applied N2C to visualize 30 canvases made from human and mouse gene-set libraries and 6 canvases made from Food and Drug Administration (FDA)-approved drug-set libraries.
Website: N2C | Publication: PMID: 23749960
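
The placement step described above can be illustrated with a minimal sketch. It is not the N2C implementation: the fitness function (number of edges whose endpoints land on neighboring cells), the swap move, the cooling schedule, and all names are assumptions chosen for illustration, and node brightness is not covered.

```python
import math
import random

def torus_neighbors(cell, side):
    """Return the 4 neighbors of a cell on a side x side grid that wraps around."""
    r, c = divmod(cell, side)
    return {
        ((r - 1) % side) * side + c,
        ((r + 1) % side) * side + c,
        r * side + (c - 1) % side,
        r * side + (c + 1) % side,
    }

def fitness(placement, edges, side):
    """Count edges whose two endpoints sit on neighboring cells of the torus."""
    cell_of = {node: cell for cell, node in enumerate(placement)}
    return sum(1 for a, b in edges
               if cell_of[b] in torus_neighbors(cell_of[a], side))

def anneal(nodes, edges, side, steps=2000, t_start=1.0, t_end=0.01, seed=0):
    """Simulated annealing over node swaps; assumes len(nodes) == side * side."""
    rng = random.Random(seed)
    placement = list(nodes)
    rng.shuffle(placement)
    current = fitness(placement, edges, side)
    best, best_placement = current, placement[:]
    for step in range(steps):
        # Geometric cooling from t_start down towards t_end.
        temperature = t_start * (t_end / t_start) ** (step / steps)
        i, j = rng.sample(range(len(placement)), 2)
        placement[i], placement[j] = placement[j], placement[i]
        candidate = fitness(placement, edges, side)
        # Always accept improvements; accept worse moves with Metropolis probability.
        if candidate >= current or rng.random() < math.exp((candidate - current) / temperature):
            current = candidate
            if current > best:
                best, best_placement = current, placement[:]
        else:
            placement[i], placement[j] = placement[j], placement[i]  # undo the swap
    return best_placement, best

# Hypothetical toy network on a 3 x 3 toroidal canvas.
nodes = list("ABCDEFGHI")
edges = [("A", "B"), ("B", "C"), ("C", "A"), ("D", "E"), ("E", "F"), ("G", "H")]
best_placement, best = anneal(nodes, edges, side=3)
```

Recomputing the fitness from scratch after every swap keeps the sketch short; a larger canvas would need an incremental update that only rescores the two swapped nodes.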



The ChIP-X Enrichment Analysis (ChEA) database contains manually extracted datasets of transcription-factor/target-gene interactions from over 100 experiments, such as ChIP-chip, ChIP-seq, and ChIP-PET, applied to mammalian cells. We use the database as the prior biological knowledge gene-list library when performing gene-list enrichment analysis of mRNA expression data. The system is delivered as web-based interactive software: users input lists of mammalian genes, and the program computes over-representation of transcription factor targets from the ChEA database.
Websites: ChEA web interface | ChEA command-line version | Publication: PMID: 20709693
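
As an illustration of what an over-representation computation can look like, the sketch below ranks transcription factors by an upper-tail hypergeometric p-value. The choice of test, the function names, the toy library, and the background size are all assumptions; the description above does not specify how ChEA scores enrichment.

```python
from math import comb

def overrepresentation_p(overlap, list_size, target_size, background_size):
    """Upper-tail hypergeometric p-value: P(X >= overlap)."""
    max_overlap = min(list_size, target_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(target_size, k) * comb(background_size - target_size, list_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total

def rank_factors(gene_list, library, background_size):
    """Score every factor in the library and sort by ascending p-value."""
    genes = set(gene_list)
    results = []
    for factor, targets in library.items():
        hits = genes & targets
        p = overrepresentation_p(len(hits), len(genes), len(targets), background_size)
        results.append((factor, len(hits), p))
    return sorted(results, key=lambda r: r[2])

# Hypothetical toy library of transcription-factor target sets.
library = {
    "TF_A": {"G1", "G2", "G3", "G4"},
    "TF_B": {"G5", "G6", "G7", "G8"},
}
ranking = rank_factors(["G1", "G2", "G3", "G9"], library, background_size=20)
```

Here `TF_A` ranks first because three of the four input genes are among its targets, while `TF_B` shares none and receives a p-value of 1.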



Kinase Enrichment Analysis (KEA) is a web-based tool with an underlying database that lets users link lists of mammalian proteins/genes with the kinases that phosphorylate them. The system draws from several available kinase-substrate databases to compute kinase enrichment probability based on the distribution of kinase-substrate proportions in the background kinase-substrate database compared with kinases found to be associated with an input list of genes/proteins.
Websites: KEA web-interface | KEA command-line version | Publication: PMID: 19176546


LINCS Canvas Browser (LCB) is a web-based tool that enables users to explore thousands of genome-wide gene expression experiments applied to breast cancer cell lines. The browser visualizes results from L1000 experiments in which drugs or endogenous ligands were applied to six human breast cancer cell lines at different concentrations, with expression measured at different time points. The results are organized by cell line and batch, and perturbations that induced similar responses are clustered together on a canvas. Clicking on a specific experiment on the canvas of experiments displayed on the left brings up enrichment analyses on various canvases on the right.
Website: LCB



Grid Analysis of Time-series Expression (GATE) is a computational software platform for integrated visualization and analysis of expression time-series. Given a high-dimensional time-series dataset, GATE employs a clustering algorithm which creates movies of expression dynamics by assigning individual genes/proteins to hexagons on a hexagonal array and dynamically coloring each hexagon according to the expression level of the molecular species to which it is associated. Additionally, in order to infer potential regulatory control mechanisms from patterns of time-series correlations, GATE allows interactive interrogation of the movies with a wide variety of background knowledge datasets.
Website: GATE | Publication: PMID: 19892805


Genes2FANs (G2F) is a web-based tool and database that uses 14 carefully constructed functional association networks (FANs) and a large-scale protein-protein interaction (PPI) network to build subnetworks that connect input lists of human and mouse genes. The FANs are created from mammalian gene-set libraries in which mouse genes are converted to their human orthologs. The tool takes as input a list of human or mouse Entrez gene symbols and produces a subnetwork and a ranked list of the intermediate genes used to connect the query input list. In addition, users can enter any PubMed search term; the system then automatically converts the returned results to a gene list using GeneRIF, and this gene list is used as input to generate a subnetwork from the user's PubMed query.
Website: G2F | Publication: PMID: 22748121


Sets2Networks (S2N) is a general method for network inference from repeated observations of sets of related entities. Given experimental observations of sets of related entities, S2N infers the underlying network of binary interactions between these entities by generating an ensemble of networks consistent with the data; the frequency of occurrence of a given interaction throughout this ensemble is interpreted as the probability that the interaction is present in the underlying real network.
Website: S2N | Publication: PMID: 22824380
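
The final step described above, turning an ensemble of candidate networks into interaction probabilities, can be sketched as follows. Generating an ensemble that is consistent with the observed sets is the hard part of the method and is not shown; the toy ensemble and the function name are made up for illustration.

```python
from collections import Counter

def edge_probabilities(ensemble):
    """Fraction of networks in the ensemble that contain each undirected edge."""
    counts = Counter()
    for network in ensemble:
        # Normalize each edge to a sorted tuple so (A, B) and (B, A) match,
        # and use a set so an edge counts at most once per network.
        counts.update({tuple(sorted(edge)) for edge in network})
    return {edge: n / len(ensemble) for edge, n in counts.items()}

# Hypothetical ensemble of four candidate networks over entities A, B, C.
ensemble = [
    {("A", "B"), ("B", "C")},
    {("A", "B"), ("A", "C")},
    {("B", "A"), ("B", "C")},
    {("A", "B")},
]
probs = edge_probabilities(ensemble)
```

In this toy ensemble the A-B interaction appears in every network and gets probability 1.0, while B-C appears in half of them and A-C in a quarter.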



Expression2Kinases (X2K) is a method to identify upstream regulators likely responsible for observed patterns in genome-wide gene expression. By integrating ChIP-seq/chip and position-weight-matrix (PWM) data, protein-protein interactions, and kinase-substrate phosphorylation reactions, X2K can better identify regulatory mechanisms upstream of genome-wide differences in gene expression. X2K first infers the most likely transcription factors that regulate the differences in gene expression. It then uses protein-protein interactions to connect the identified transcription factors through additional proteins, building transcriptional regulatory subnetworks centered on these factors. Finally, it uses kinase-substrate protein phosphorylation reactions to identify and rank candidate protein kinases that most likely regulate the formation of the identified transcriptional complexes.
Website: X2K | Publication: PMID: 22080467



Lists2Networks (L2N) is a web-based system that allows users to upload and analyze lists of mammalian gene-sets in a client-server software application. Within their workspace, users can examine the overlap among the lists they upload, manipulate lists with different set operations, expand lists using existing mammalian networks of protein-protein interactions, co-expression correlations, or background knowledge annotation correlations, and apply simple gene-set enrichment analyses on many gene lists at once against many prior knowledge datasets.
Website: L2N | Publication: PMID: 20152038
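
The overlap and set-operation features mentioned above map directly onto ordinary set arithmetic. The sketch below is only an illustration with made-up list names and contents; the Jaccard index is an assumed choice of overlap measure, not something the description specifies.

```python
from itertools import combinations

def pairwise_overlap(gene_lists):
    """For every pair of lists, return (shared gene count, Jaccard index)."""
    sets = {name: set(genes) for name, genes in gene_lists.items()}
    table = {}
    for a, b in combinations(sorted(sets), 2):
        shared = sets[a] & sets[b]
        union = sets[a] | sets[b]
        jaccard = len(shared) / len(union) if union else 0.0
        table[(a, b)] = (len(shared), jaccard)
    return table

# Hypothetical uploaded gene lists.
lists = {
    "list1": ["STAT3", "MYC", "TP53"],
    "list2": ["MYC", "TP53", "EGFR", "AKT1"],
    "list3": ["SOX2"],
}
overlaps = pairwise_overlap(lists)

# Basic set operations on the same lists.
union_all = set().union(*lists.values())                 # genes in any list
only_list1 = set(lists["list1"]) - set(lists["list2"])   # in list1 but not list2
```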



Genes2Networks (G2N) is a command-line software tool that can be used to place lists of mammalian genes in the context of background mammalian signalome and interactome networks. The input to the program is a list of human Entrez Gene symbols and background networks in SIG format. The output includes: (a) all identified interactions for the genes/proteins, (b) a subnetwork connecting the genes/proteins through intermediate components, and (c) a ranking of the specificity with which intermediate components interact with the list of genes/proteins.
Websites: G2N command-line version | G2N web interface | Publication: PMID: 17916244
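
The three outputs listed above can be illustrated on a toy network. This sketch is not the G2N algorithm: it only considers intermediates that directly link at least two input genes, and it ranks them by the fraction of their interactions that involve input genes, which is a simple stand-in for the specificity ranking. The node names and the plain edge-list input (rather than SIG files) are assumptions.

```python
from collections import defaultdict

def build_adjacency(interactions):
    """Build an undirected adjacency map from (a, b) interaction pairs."""
    adjacency = defaultdict(set)
    for a, b in interactions:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency

def connect_seeds(seeds, interactions):
    """Return direct seed-seed interactions and ranked intermediate nodes."""
    seeds = set(seeds)
    adjacency = build_adjacency(interactions)
    direct = {tuple(sorted((a, b))) for a, b in interactions
              if a in seeds and b in seeds}
    ranked = []
    for node, partners in adjacency.items():
        if node in seeds:
            continue
        seed_partners = partners & seeds
        if len(seed_partners) >= 2:  # the node bridges at least two input genes
            specificity = len(seed_partners) / len(partners)
            ranked.append((node, specificity, sorted(seed_partners)))
    # Most specific intermediates first; ties broken alphabetically.
    ranked.sort(key=lambda r: (-r[1], r[0]))
    return direct, ranked

# Hypothetical background network and input list.
interactions = [
    ("S1", "S2"), ("S1", "X"), ("S3", "X"),
    ("S2", "HUB"), ("S3", "HUB"), ("HUB", "N1"), ("HUB", "N2"),
    ("S1", "Y"),
]
direct, ranked = connect_seeds(["S1", "S2", "S3"], interactions)
```

In this example `X` outranks `HUB`: both bridge two input genes, but all of `X`'s interactions involve input genes, whereas half of `HUB`'s go elsewhere.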


Flash-based Network Viewer (FNV) is a tool for visualizing small to moderately sized biological networks and pathways. FNV can also be used to embed pathways inside PDF files for the communication of pathways in soft publication materials.
Website: FNV | Publication: PMID: 21349871


Genes2WordCloud (G2W) is a word-cloud generator and viewer based on WordCram and implemented using Java, Processing, AJAX, MySQL, and PHP. Text is fetched from several sources and then processed to extract the most relevant terms, with weights computed from word frequencies.
Website: G2W | Publication: PMID: 21995939
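
A frequency-based term weighting of the kind described above might look like the following sketch. The tokenization, the tiny stop-word list, and the normalization by the most frequent term are assumptions for illustration, not the weighting that G2W implements.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "of", "and", "in", "a", "is", "to"}

def term_weights(text, top_n=5):
    """Return the top_n terms with weights scaled so the top term has weight 1.0."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    if not counts:
        return []
    top = counts.most_common(top_n)
    max_count = top[0][1]
    return [(word, count / max_count) for word, count in top]

# Made-up input text.
text = ("Kinase signaling in the cell: the kinase binds the receptor, "
        "and the receptor activates signaling of the kinase.")
weights = term_weights(text, top_n=3)
```

The resulting weights could then drive font size in a word cloud, with the heaviest term drawn largest.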



Sig2BioPAX is a command-line Java program that can be used to convert structured text files describing molecular interactions into the BioPAX Level 3 standard format.
Website: Sig2BioPAX | Publication: PMID: 21418653



Signaling Networks Analysis and Visualization (SNAVI) is a Windows-based desktop application that implements standard network analysis methods to compute clustering, connectivity distribution, and network motif detection, and provides means to visualize networks and network motifs. SNAVI can generate linked web pages from network datasets loaded in text format, and it can create networks from lists of gene or protein names. SNAVI is a useful tool for analyzing, visualizing, and sharing cell signaling data. It is free, open-source software.
Website: SNAVI | Publication: PMID: 19154595


SequenceViewer is a software system that includes two bioinformatics tools: DNAViewer and RepeatsViewer. These Windows-based programs create graphical images from text DNA sequences in FASTA format. The images can help researchers identify repeating patterns and letter concentrations in different regions of DNA, and may give clues about DNA physical structure and help identify novel promoter binding sites.
Website: SequenceViewer



AJAX Viewer for Signaling Networks (AVIS) is a visualization tool for viewing and sharing intracellular signaling, gene regulation and protein interaction networks. AVIS is implemented as an AJAX enabled syndicated Google gadget. It allows any webpage to render an image from a text file representation of signaling, gene regulatory, or protein interaction networks.
Website: AVIS | Publication: PMID: 17855420


PubMed Alert Me! is a software utility that allows users to enter a list of PubMed queries. Once a list of queries is configured, the program runs either daily or weekly. It searches PubMed and if it finds new matching published papers, the program sends an e-mail notification with a list of links to the new articles.
Website: PubMed Alert Me! | Publication: PMID: 18402930


Databases and Datasets


ESCAPE is a mammalian embryonic stem cell specific database created by collecting and integrating data reporting results from various published studies that profiled human and mouse ESCs including: protein-DNA binding interactions extracted from ChIP-seq/chip experiments, gene regulatory interactions from loss/gain-of-function studies followed by genome-wide mRNA expression profiling, protein interactions from immunoprecipitation followed by mass-spectrometry proteomics, a list of potential pluripotency regulators from RNA interference screens, ESC-specific proteins and phosphoproteins with specified phosphosites from proteomics and phosphoproteomics studies, time-course genome-wide mRNA microarray datasets from differentiating mouse ESCs, and histone modification status from genome-wide studies.
Website: ESCAPE | Publication: PMID: 23794736



Adhesome is a comprehensive literature derived biochemical network developed in collaboration with Benny Geiger's Lab. The network is made of known interactions and cellular components composing the focal adhesion complex in mammalian cells. The Adhesome website provides a reference and supporting materials to the analysis published in Nature Cell Biology.
Website: Adhesome | Publication: PMID: 17671451


Neuronal Signalome consists of cell signaling interactions extracted from literature describing components and interactions in mammalian neurons. This network integrates cell signaling pathways specific to mammalian neurons. 
Website: Neuronal signaling network | latest network version | Publication: PMID: 16099987


Presynaptome consists of literature-based protein-protein interactions extracted from low-throughput experimental studies reporting interactions in mammalian presynaptic nerve terminals.
Website: Presynaptome | Publication: PMID: 19562802

Additional databases and datasets generated by the lab are available for download from the Systems Biology Center New York's Databases and Datasets page.

Network Datasets for Download: in SIG file format

Contact Us

Avi Ma'ayan, PhD
Tel: 212-659-1739
Fax: 212-831-0114

Icahn Medical Institute
1425 Madison Avenue
Room 12-78 (Office), 12-76 (Lab)
One Gustave L. Levy Place
Box 1215
New York, NY 10029

In the News

Systems Pharmacology Approaches for Drug and Cancer Research

ESCAPE: Database for Integrating High Content Published Data Collected from Human and Mouse Embryonic Stem Cells
Article in RNA-Seq Blog

Society of Toxicology 2013 Annual Meeting
News article in Drug Discovery News

New Computational Method to Help Organize Scientific Data
Press release on

Mount Sinai Algorithm Predicts Drug Side Effects
Press release on

Mutations in 3 Genes Linked to Autism Spectrum Disorders
Press release on

HIPK2 Regulator Protein Plays a Crucial Role in Kidney Fibrosis
Press release on

New Database Could Speed Up Drug Discovery
Tech news feature on CNET

Animating Molecular Biology
Article in Biomedical Computation Review

Systematic Tracking of Cell Fate Changes
News and views article in Nature Biotechnology

Computational Honeycombs Drip with Data
News item in NIGMS Computing Life

Molecular Movies: New Software Animates Gene Expression Data
Technology observation on Scientific American Online

Stem Cells, Systems Biology and Human Feedback
News feature in Nature Reports Stem Cells