Mount Sinai Center for Bioinformatics

Education

Members of the Mount Sinai Center for Bioinformatics develop graduate-level courses both within the Graduate School of Biomedical Sciences and through Coursera. We also offer an annual summer research training program geared towards undergraduate and graduate students interested in participating in cutting-edge research projects aimed at solving data-intensive biomedical problems. The major aim of our education and outreach activities is to engage the larger research community and train the next generation of Bioinformaticians and Biomedical Data Scientists.

Courses and Research Training Opportunities

We provide access to our Center’s resources through education, outreach, and training programs aimed at various scientific communities.

The 2025 Summer Research Training Program in Biomedical Big Data Science is a research-intensive ten-week training program for undergraduate and master's students interested in participating in cutting-edge research projects aimed at solving data-intensive biomedical problems. Summer fellows training in the Ma'ayan Laboratory conduct faculty-mentored independent research projects in the following areas: data harmonization, machine learning, cloud computing, and dynamic data visualization.

The benefits of participating in the training program are:

  • Direct research experience with projects aimed at solving data-intensive biomedical problems
  • An $8,000 salary for the ten-week training period
  • Interaction with the Center's computational experts through weekly meetings, enrichment lectures, and a project presentation session

We are looking for applicants who:

  • Are U.S. citizens or U.S. permanent residents
  • Are undergraduate or master's students in good academic standing
  • Are available to work full-time (40 hours per week) on their research project in the Ma’ayan Laboratory and take part in all program activities (e.g., weekly meetings, enrichment lectures, and the project presentation session)
  • Are majoring in Computer Science, Informatics, Mathematics, Statistics, Physics, Engineering, Chemistry/Chemical Sciences, or Biological Sciences
  • Have an interest in Biomedical Big Data Science

We strongly encourage women and members of underrepresented groups to apply.

A full-time postdoctoral position is available in the Ma’ayan Laboratory of Computational Systems Biology within the Mount Sinai Center for Bioinformatics. The Ma’ayan Laboratory conducts multidisciplinary, NIH-funded research that uses Big Data analytics to develop a better understanding of drug action in human cells, build molecular regulatory networks from high-content genome-wide data, and predict optimized therapeutics for individual patients across several complex diseases.

The successful candidate will collaborate with an interdisciplinary team to develop tools and algorithms for the analysis, integration, and visualization of large-scale biological omics datasets. The datasets include genomics, transcriptomics, epigenomics, proteomics, and metabolomics. In addition, the position involves the application of machine learning, including deep learning, to mining electronic medical records and combining such data with omics datasets.

Candidates are required to have a recent PhD in Biomedical Science, Computer Science, Mathematics, Biostatistics, Statistics, Physics, or Engineering, and relevant experience with applications to biology.

  • Experience with machine learning, multithread programming, and cloud computing
  • Experience developing and deploying web-based and mobile apps
  • Experience with bioinformatics research projects
  • Knowledge of Python, R, Java, JavaScript, Node.js, MongoDB, MySQL, Docker

To apply, please e-mail your CV, research statement, and the names and contact information of three references to: sherry.jenkins@mssm.edu

Avi Ma’ayan, PhD, is the course director for two massive open online courses (MOOCs) on the Coursera platform. As of October 2022, these courses had over 258,750 unique visitors and 24,364 enrolled students.

Big Data Science with the BD2K-LINCS Data Coordination and Integration Center

In this course, students organize, analyze, visualize, and integrate LINCS data with other relevant publicly available resources. We discuss the various centers that collect data for LINCS, looking at their experimental procedures and data types. We then cover the design and collection of metadata and how metadata is linked to ontologies, followed by basic data processing and data normalization methods to clean and harmonize LINCS data. We examine how the data is served as JSON through RESTful APIs, which involves exploring concepts from client-server computing. Most importantly, the course focuses on various bioinformatics methods of analysis, including unsupervised clustering, gene-set enrichment analyses, Bayesian integration, network visualization, and supervised machine learning applications to LINCS data and other relevant Big Data from molecular biomedicine.

Network Analysis in Systems Biology

This MOOC is an introduction to the data integration and statistical methods used in contemporary systems biology, bioinformatics, and systems pharmacology research. The course covers methods to process raw data from genome-wide mRNA expression studies (microarrays and RNA-sequencing) including data normalization, differential expression, clustering, enrichment analysis, and network construction. We provide practical tutorials for using tools and setting up pipelines, and cover the mathematics behind the methods applied within the tools.

This course is most appropriate for beginning graduate students and advanced undergraduates majoring in fields such as biology, mathematics, physics, chemistry, computer science, and biomedical and electrical engineering. It would also be useful for researchers who encounter large datasets in their own research. The course presents software, applications, and tools developed by the Ma’ayan Laboratory as well as other freely available data analysis and visualization tools.

The aim of the course is to enable participants to apply these methods to their own data and projects. For participants who do not work in the field, the course introduces the current research challenges in computational systems biology.
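To give a flavor of the enrichment analysis step listed among the course topics, the following is a minimal sketch of an over-representation test based on the hypergeometric distribution. It is an illustration only, not the course's or the Ma’ayan Laboratory's implementation: the function names, the toy gene identifiers, and the gene-set names are all hypothetical, and real analyses typically also correct for multiple testing.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, set_size, query_size, overlap):
    """Right-tail hypergeometric p-value.

    Probability of seeing at least `overlap` annotated genes when
    `query_size` genes are drawn without replacement from a universe of
    `universe_size` genes, `set_size` of which belong to the gene set.
    """
    max_overlap = min(set_size, query_size)
    total = comb(universe_size, query_size)
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, query_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


def enrich(query_genes, gene_sets, universe_size):
    """Rank gene sets by enrichment p-value for a query gene list.

    `gene_sets` maps a set name to an iterable of gene identifiers.
    Returns (name, overlap, p_value) tuples, most significant first.
    """
    query = set(query_genes)
    results = []
    for name, members in gene_sets.items():
        members = set(members)
        overlap = len(query & members)
        p = hypergeom_enrichment_p(
            universe_size, len(members), len(query), overlap
        )
        results.append((name, overlap, p))
    return sorted(results, key=lambda r: r[2])


# Toy, made-up example: identifiers and set names are placeholders.
toy_sets = {
    "hypothetical_pathway_A": ["g1", "g2", "g3"],
    "hypothetical_pathway_B": ["g7", "g8", "g9"],
}
ranked = enrich(["g1", "g2", "g3"], toy_sets, universe_size=100)
for name, overlap, p in ranked:
    print(f"{name}\toverlap={overlap}\tp={p:.3g}")
```

In this toy run the query overlaps completely with the first set and not at all with the second, so the first set ranks at the top with a very small p-value while the second receives a p-value of 1.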

We offer two graduate-level Big Data courses at the Graduate School of Biomedical Sciences.

Programming for Big Data Biomedicine

The course covers computational methodologies applied to analyze data in the broad fields of bioinformatics and big data science. Topics covered include RNA-seq and proteomics data analysis, Machine Learning, Deep Learning, Text Mining, Python and Jupyter Notebooks, Appyters, cloud computing, data visualization, network analysis, version control, and Knowledge Graphs. Students are required to complete small programming assignments throughout the course. The course uses Jupyter Notebooks and Appyters to run most tutorials.

Data Mining and Network Analysis

This course covers machine learning applications in systems biology, including unsupervised clustering and supervised learning; analysis of the topology of biological regulatory networks; and a survey of how these approaches are applied to study biological molecular networks. Papers that combine computational predictions with experimental validation are highlighted, and we present the use of software tools to analyze proteomics and genomics data collected for the LINCS project.
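As a small illustration of what analyzing network topology can involve, the sketch below computes two common measures, node degree and the local clustering coefficient, for an undirected network given as an edge list. This is a generic example under stated assumptions, not material taken from the course: the function names and the toy node labels are hypothetical, and regulatory networks are often directed, which this sketch ignores.

```python
from collections import defaultdict
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map from (node, node) pairs.

    Self-loops are skipped and duplicate edges collapse automatically.
    """
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def degree(adj, node):
    """Number of distinct neighbors of `node`."""
    return len(adj.get(node, set()))


def local_clustering(adj, node):
    """Fraction of pairs of neighbors of `node` that are themselves linked.

    Defined as 0.0 for nodes with fewer than two neighbors.
    """
    neighbors = adj.get(node, set())
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


# Toy network with placeholder labels: a triangle A-B-C plus a pendant node D.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
adj = build_adjacency(toy_edges)
for node in sorted(adj):
    print(node, degree(adj, node), round(local_clustering(adj, node), 3))
```

Highly connected nodes (hubs) and tightly clustered neighborhoods are typical starting points when inspecting molecular networks; dedicated graph libraries provide these and many other measures for larger datasets.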