Mount Sinai Center for Bioinformatics


At the Mount Sinai Center for Bioinformatics, we develop algorithms, pipelines, web-based software systems, and databases that help experimental biologists analyze their data more effectively by unravelling the regulatory networks within mammalian cells. We use mathematical and computational methods, such as machine learning and dimensionality reduction, to organize data for further discovery and to make predictions.

We were awarded funding from the National Institutes of Health that supports our research:

Our scientific goals for this project are to:

  • Evaluate the quality of relevant computational and experimental methods by benchmarking
  • Develop methods to correct for inherent biases within different types of omics experiments
  • Determine how to integrate data from various datasets collected by different laboratories to increase knowledge extraction and address concerns regarding reproducibility
  • Understand how different layers of human cellular regulatory networks correlate and interact

Our technical goals for this grant are to:

  • Develop new standards for data annotation for large-scale experiments
  • Collect and prepare the largest possible set of annotated molecular cellular signatures for data reuse
  • Generate methods to connect cellular and organismal phenotypes with molecular cellular signatures
  • Create web-based data visualization methods to interact with large genomics and proteomics datasets
  • Produce educational materials and events for training the next generation of data scientists in molecular genomic and proteomic biomedicine

Learn more about the grant

The goal of this project is to collect, process, and serve attributes about druggable targets from four protein families: protein kinases, G-protein coupled receptors, nuclear receptors, and ion channels. We focus on genes and proteins that have not been studied extensively but have the potential to become useful drug targets.

We also collect, process, and serve data for all other mammalian genes and proteins; drugs, small molecules, and other perturbagens; phenotypes, diseases, and side effects; and clinical genomics datasets from patient cohorts. These processed data enable us to identify links within and across these sources of information and to encourage new discoveries. We develop and apply clustering and classification algorithms, as well as workflow analyses, to help predict whether targeting these under-studied proteins is applicable to translational work in personalized medicine.

Learn more about the grant

Visit the Harmonizome, a web-based system that serves all the data processed for this project

FAIRshake is a web-based software toolkit for assessing how well biomedical digital research objects comply with the FAIR (Findable, Accessible, Interoperable, Reusable) guiding principles. FAIRshake also functions as a repository that stores and serves FAIR assessments. FAIRness assessments of different types of digital objects can be communicated visually via the FAIR insignia. The insignia identifies areas of strength and weakness in the FAIRness of a digital object, guiding digital object producers on how to improve the FAIRness of their products.

The FAIRshake toolkit consists of the FAIRshake website, through which assessments are completed and insignias minted; the FAIRshake Google Chrome browser extension; the FAIRshake Bookmarklet; and FAIRshake APIs for direct programmatic access to the information within the FAIRshake database. The Chrome extension and Bookmarklet make it easy to display and perform FAIR assessments on any relevant website. We hope that the NIH Data Commons Pilot Phase Consortium (DCPPC) will find the FAIRshake platform helpful for enhancing the FAIRness of the digital objects they serve, and that the system will also support FAIR evaluations of digital objects in other projects.
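To illustrate how an assessment can point to areas of strength and weakness, here is a minimal Python sketch that averages metric scores within each FAIR principle (Findable, Accessible, Interoperable, Reusable) and reports the weakest one. The metric names, the scores, and the `summarize` and `weakest` helpers are hypothetical and chosen only for illustration; this is not the FAIRshake API, rubric, or scoring method.

```python
from collections import defaultdict

# Hypothetical assessment answers: (FAIR principle, metric, score in [0, 1]).
# These metrics and values are illustrative only, not FAIRshake data.
answers = [
    ("Findable", "has a persistent identifier", 1.0),
    ("Findable", "indexed in a searchable resource", 0.5),
    ("Accessible", "retrievable via an open protocol", 1.0),
    ("Interoperable", "uses a community vocabulary", 0.0),
    ("Reusable", "has a clear usage license", 1.0),
]


def summarize(answers):
    """Average the metric scores within each FAIR principle."""
    grouped = defaultdict(list)
    for principle, _metric, score in answers:
        grouped[principle].append(score)
    return {p: sum(scores) / len(scores) for p, scores in grouped.items()}


def weakest(summary):
    """Return the principle with the lowest average score."""
    return min(summary, key=summary.get)


summary = summarize(answers)
for principle, score in summary.items():
    print(f"{principle}: {score:.2f}")
print("Weakest area:", weakest(summary))
```

A per-principle summary like this is one simple way to show a producer where to focus improvements; a real assessment may weight metrics differently or report them individually.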

Learn more about the award

Visit our project's YouTube channel for FAIRshake demos