While neurogenomics studies have largely depended on bulk sequencing data for the past 10 to 15 years, advancements in computational power and tools are now enabling researchers to study genomics from the perspective of single cells. The Single Cell Neurogenomics Group, under the direction of Donghoon Lee, PhD, develops sophisticated computational and machine learning approaches to capture the underlying biological and pathological processes of neuropsychiatric diseases at the cellular level.
A Deep Look Inside Single Cell Mechanisms
Bulk sequencing has provided enormous gains in our knowledge of neurogenomics, but its drawback is that it measures a composite, heterogeneous mixture of cells. It is difficult to decipher biological function from aggregated information coming from a mixture of different cells. With the advent of single cell technology, we are able to measure omics profiles from single cells across a large breadth of samples. We are now able to study the function of specific cell types in the brain and understand how they contribute to disease pathology. Through the lens of a single cell, we can precisely measure the molecular readout from individual cells and their processes, gathering transcriptomic, regulatory, and proteomic data. By combining clinical data with single cell data, we can develop a comprehensive understanding of how neuropsychiatric diseases such as Alzheimer’s disease manifest across the lifespan of affected individuals.
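To illustrate why an aggregated measurement can hide a cell-type-specific signal, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the 95/5 split between an abundant and a rare cell type, the Poisson count model, and the expression levels are hypothetical and are not taken from the group's data.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_cells(n_cells, mean_expr):
    """Draw Poisson counts for a single gene across n_cells cells."""
    return rng.poisson(mean_expr, size=n_cells)

# Hypothetical tissue composition: 95% abundant cell type, 5% rare cell type.
n_abundant, n_rare = 9500, 500

# Assumption: in disease, the gene doubles its expression in the rare type only.
control = {
    "abundant": simulate_cells(n_abundant, 10.0),
    "rare": simulate_cells(n_rare, 10.0),
}
disease = {
    "abundant": simulate_cells(n_abundant, 10.0),
    "rare": simulate_cells(n_rare, 20.0),
}

def bulk_mean(sample):
    """Average over all cells, mimicking a bulk measurement of the mixture."""
    return np.concatenate(list(sample.values())).mean()

# Fold change seen by a bulk-style average versus within the rare cell type.
bulk_fold = bulk_mean(disease) / bulk_mean(control)
rare_fold = disease["rare"].mean() / control["rare"].mean()

print(f"bulk fold change:      {bulk_fold:.2f}")
print(f"rare-type fold change: {rare_fold:.2f}")
```

Under these assumptions the bulk-style average shifts only slightly, while the change within the rare cell type is roughly twofold, which is the kind of signal that single cell resolution is meant to recover.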
The amount of data we gather is massive, far exceeding that of bulk sequencing. Because of this enormous scale, we are developing innovative algorithmic tools to decipher this information and understand the underlying biology. Our group has the advantage of access to the Mount Sinai High Performance Computing group and their supercomputer, Minerva, which provides the processing power to handle these tasks.
Our goal is to understand, at the single-cell level, how neuropsychiatric and neurodegenerative diseases manifest. If we can pinpoint which cells are driving the pathogenesis of these diseases and isolate the mechanisms involved, we may be able to modulate or reverse these effects. By translating our findings from basic science to the bedside, we hope eventually to build a pathway for personalized medicine.
Performing Multiple Assays Simultaneously on a Single Cell
We now have the ability to conduct multi-modal single cell sequencing, which allows us to perform multiple assays on a single cell. Previously, researchers could conduct only one assay at a time, for instance, measuring transcript levels through RNA-seq or chromatin accessibility through ATAC-seq. The capability to measure more than one modality in a single cell is deepening our knowledge of the genetic, molecular, and cellular processes underlying neuropsychiatric diseases.
Currently, we are gathering data from one of the largest collections of brain samples to be studied, involving approximately 1,700 postmortem donors. We are measuring population-level single-cell transcriptomics, and for a subset, we are simultaneously measuring transcriptome and regulatory data and integrating this molecular information with digital pathology and clinical symptoms. This yields approximately 8 million cells that we are examining at single cell resolution.
We are also studying the immune cells of the brain in fresh brain tissue from approximately 100 donors, including patients with Alzheimer’s disease and controls. We are conducting simultaneous single cell RNA-seq and ATAC-seq and integrating these data with human brain immune cell multi-omics, including whole-genome sequencing, Hi-C, ATAC-seq, ChIP-seq, RNA-seq, ISO-seq, and proteomics.
We are also examining the developmental trajectory of brain cells from the fetal stage to adulthood, combining spatial transcriptomics with multi-modal RNA-seq and ATAC-seq at the single cell level. Capturing both the temporal and the spatial perspective yields data that, until now, have not been available to the research community.
Deep Learning Networks to Predict the Path of Disease
The study of individual cells of the brain helps us pinpoint which brain regions and cell types are contributing to disease. With the amount of data at our disposal, our group is leveraging the power of machine learning to decipher the underlying biology. We are training deep learning networks for single cell applications. Each cell is a single observation, and we require many examples to train the network to predict the course of pathogenesis. A trained deep learning model lets us manipulate the network to understand the connections between its components. We can use it to knock out specific genes, enhancers, or promoters and observe the effect on disease in silico. By understanding how the disease manifests, we can attempt to reverse the direction of disease progression. The integrative model suggests candidate drug targets and enables us to validate them. This ultimately provides a prioritized list of candidates for further research.
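The idea of an in silico knockout can be sketched on a toy example. The code below is not the group's model: a small logistic regression trained by gradient descent stands in for a deep network, the "cells" and "genes" are synthetic, and the assumption that only gene 0 drives the disease label is built into the simulated data. The function names `fit_logistic` and `knockout_effect` are hypothetical helpers introduced for this illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic data: 2,000 "cells" by 5 "genes" (all values hypothetical).
n_cells, n_genes = 2000, 5
X = rng.normal(loc=1.0, scale=1.0, size=(n_cells, n_genes))

# Assumption baked into the simulation: only gene 0 drives the disease label.
logits_true = 2.0 * X[:, 0] - 2.0
y = (rng.random(n_cells) < 1.0 / (1.0 + np.exp(-logits_true))).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.5, n_steps=2000):
    """Fit logistic regression weights and bias by plain gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(n_steps):
        p = sigmoid(X @ w + b)
        w -= lr * (X.T @ (p - y)) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

def knockout_effect(X, w, b, gene):
    """Change in mean predicted disease probability after zeroing one gene."""
    X_ko = X.copy()
    X_ko[:, gene] = 0.0  # the in silico knockout: set expression to zero
    return sigmoid(X_ko @ w + b).mean() - sigmoid(X @ w + b).mean()

w, b = fit_logistic(X, y)
effects = [knockout_effect(X, w, b, g) for g in range(n_genes)]

# Rank genes by how strongly their knockout shifts the predicted outcome.
ranking = np.argsort(-np.abs(effects))
for g in ranking:
    print(f"gene {g}: change in mean predicted probability = {effects[g]:+.3f}")
```

In this toy setting, knocking out gene 0 lowers the predicted disease probability substantially while the other genes have little effect, so the ranking doubles as a simple prioritized candidate list. A real analysis would need a validated model and perturbations that respect the biology, which this sketch does not attempt.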