Genomics Core Facility

The Genomics Core Facility at the Icahn School of Medicine at Mount Sinai operates a diverse, world-class suite of next-generation sequencing (NGS) platforms. It is directed by Robert Sebra, Ph.D., alongside Associate Director Kristin Beaumont, Ph.D. (single cell molecular biology & sequencing) and Assistant Director Mike Beaumont, Ph.D. (physiology and functional validation), who guide sequencing/molecular and single cell technologies. Since 2013, the team has published over 250 collaborative high-impact publications and has played a substantial role in submitting dozens of funded grants across disease foci including cancer, inherited disease, structural variation, infectious disease and innovative technology development.

The Center for Advanced Genomic Technology facility's NGS suite includes platforms from Illumina (2 NovaSeq 6000s, 2 NextSeq 550s, 2 MiSeqs, 2 MiniSeqs, 1 HiScan), PacBio (1 Revio, 1 Sequel IIe), IonTorrent (3 Ion S5XL instruments, 3 Ion Chefs), 10X Genomics (1 Chromium X instrument, 2 Chromium iX, 2 CytAssists, 1 Xenium), MissionBio (Tapestri instruments), and Element (1 AVITI/24 platform). Combining these sequencing platforms supports a broader range of clinical and scientific applications by generating flexible and robust data across genetic loci of varying complexity. Beyond bulk DNA and RNA sequencing methods, the lab also has equipment and expertise centered on single cell and low-input characterization using 1 Element system, 4 Chromium 10X Genomics instruments, and 1 MissionBio Tapestri system. With this equipment, single cells can be isolated from viable tissue and processed by a variety of molecular methods for sequencing, post amplification, at a capacity of tens of thousands of cells per day from individual samples. Current example projects include characterization of single cells derived from patient tumor biopsies across primary and metastatic sites to better understand tumor heterogeneity, as well as characterization of single cells isolated from various regions of brain, heart, and other tissues for discovery and characterization of niche functional populations. The team comprises 23 staff and faculty housed in a >4500 sq ft innovation laboratory that includes instrumentation, wet bench space, a cell biology space with a BSL2 lab, and the basic science lab.

Computational Resources

High Performance Computing (HPC) cluster: A central High-Performance Computing (HPC) facility with 40-Petabyte storage capacity called “Minerva” is available for secondary and tertiary data analysis. The Minerva cluster consists of 25,584 Intel Platinum processors with 70 teraflop peak speed and 408 NVIDIA GPUs, interconnected through a 400 Gb/s NDR Fat-Tree InfiniBand network. Each compute node is equipped with at least 512 GB of memory, with 33 1.5-TB high-memory nodes. The hardware accessible for analysis is optimized for parallel jobs that are CPU-bound such as NGS read mapping, as well as parallel jobs such as Bayesian network reconstruction that are memory-bound. In addition, the high-speed InfiniBand interconnects enable jobs requiring substantial shared memory, such as all-by-all comparisons of splice-form specific RNA-seq results to generate isoform-specific co-expression networks. Minerva is connected to Globus, a secure and high-performance data transfer service, for fast data transfers to external sites. Access to ISMMS computational resources is restricted by firewalls and external access is provided through secure shell/ftp with two-factor authentication.

The software and programming environments offered on Minerva are cutting edge, including community standards such as Linux and MPI. The cluster runs resource managers and schedulers to balance job workload, optimized to process as many jobs as possible for the highest overall machine utilization, job throughput, and job success rate. Minerva is operated with over 95% uptime, using scalable and reproducible configuration management techniques. Long-term archival storage is provided by a high-capacity Tivoli Storage Manager (TSM) system and protects against data loss. One copy of the tapes is kept off-site (New Jersey) and one stays onsite at Mount Sinai. All data on the tapes are encrypted. In accordance with Sinai policy, tapes are kept for at least six years.

The Minerva file system provides extensive data capacity (approximately 32 PB of storage overall) for high performance storing and accessing research data, based on IBM Spectrum Scale (formerly known as GPFS). The Center for Advanced Genomic Technology has a dedicated allocation of 1480 TB for various projects in parallel.

Project Submission Form

Our submission form includes our most up-to-date selection of assays, as well as contact information for all teams. Please read the Center policies on the right side and check for the correct section on the left before submitting your request. Sample submission information, including shipping address, will be provided after your project is approved by the team.

Single Cell and Spatial Analysis

Contact: Kristin Beaumont, PhD: Associate Professor and Associate Director (Kristin.beaumont@mssm.edu)

Submission form: Third section (blue). Please read the Center policies on the right side before submitting your request. Sample submission information, including shipping address, will be provided after your project is approved.

We offer a wide range of standard and custom single cell and spatial transcriptomic analysis approaches, designed around access to the following technology:

Sample Prep for Single-Cell and Single-Nuclei Analysis

Miltenyi GentleMACS Octodissociator for dissociation of tissues into single cell suspensions
Levitas LeviCell sorting system for live cell/nuclei enrichment prior to single cell analysis
S2 Singulator for nuclei isolation from various fresh, fixed and frozen tissue types

Single-Cell/Nuclei Analysis

10X Genomics Chromium system for high throughput 3’/5’ RNA Sequencing of thousands to tens of thousands of single cells or nuclei for subpopulation identification and expression analyses. We offer all immune profiling, CITESeq, ATACSeq, multiome RNA/ATACSeq and additional development based single cell assays using these instruments.

Input requirements: Debris-free suspension of at least 500,000 cells (>80% viability) for scRNASeq and scATACSeq and at least 1e6 cells (>80% viability) for CITESeq. Suspension should be at a concentration of 1e6 cells/mL in PBS + 0.04% BSA.

MissionBio Tapestri platform for commercial and custom targeted amplicon DNA-sequencing at the single cell level. Input requirements vary by experiment; please contact us for details.

Spatial Transcriptomics

10X Genomics Visium HD and Visium HD 3': We are a Certified Service Provider for these assays, which are used for whole transcriptome spatial profiling of tissues up to 6.5 mm x 6.5 mm
10X Genomics Xenium: Targeted, in situ spatial transcriptomic profiling of fresh frozen or FFPE preserved tissues at subcellular resolution
Element AVITI24: Used for multiomic spatial profiling of cells (morphology, protein and gene expression) - can be combined with perturbation analysis
Stellaromics - Coming soon!

Input requirements: High quality fresh frozen tissue embedded in OCT or FFPE blocks - please contact for details

Microarrays

Contact: Irene Salib (irene.salib@mssm.edu)

Submission form: Fourth section (teal). Please read the Center policies on the right side before submitting your request. Sample submission information, including shipping address, will be provided after your project is approved.

The Department of Genetics and Genomic Sciences has made a substantial investment in high-throughput processing equipment for large BeadArray projects. We operate two TECAN Evo liquid handling robots, capable of processing 24 chips per run. Our Illumina HiScan microarray scanner is a top-of-the-line system with a robotic arm that allows for 24-hour microarray data acquisition. With the existing equipment and facilities, the genomics core can process up to 600 samples per week.

If you would like to submit samples for BeadArray analysis, please see the sample submission page for location and pricing. Guidance for experimental design is also available.

For more information, see Illumina's site or view our specification sheet.

About Microarrays

DNA microarrays are a well-established technology for genome-wide characterization of gene transcription, single nucleotide variation, copy number variation, and epigenetic cytosine methylation. Our genomics core employs the most robust and accurate method on the market: the Illumina BeadArray platform.

Traditionally, microarrays were glass slides printed with short DNA strands. This method has serious drawbacks such as uneven spot morphology, signal bias due to spot position, and low design flexibility. BeadArrays are a unique approach to microarray technology, using glass slides with micrometer-sized holes to house oligonucleotide-coated beads. The BeadArrays are used as a standard array would be, with the exception of a molecular decoding step that is performed by the scanner.

The BeadArray platform can be used for genotyping, gene transcription quantification, and cytosine methylation.

Human genotyping studies are most often performed using the MethylationEPIC v2.0 (950k SNPs). These chips are also capable of copy number variant (CNV) estimation. Custom content chips for genome-wide association studies (GWAS) are also available. For more information, please contact us with details of your project.

Gene expression analysis chips are available for humans, mice, and rats. All chips contain all known genes and many regulatory RNAs. As of March 2012, microRNA chips and custom gene expression services have been discontinued by Illumina.

Genome-wide methylation scanning in human samples (including stem cells and tumor cells) can be accomplished with the human methylation 450K array. This array covers CpG islands, sites known to be methylated in promoters, DNase hypersensitive sites and miRNA promoter regions.

Learn more about formalin-fixed, paraffin-embedded (FFPE) samples that can be used for genotyping studies.

Ion Torrent

		Samples per chip/per assay
	510	520	530	540	550
Oncomine v3	N/A	1	2	8	16
Oncomine Myeloid	N/A	4	12	N/A	N/A
HotSpot v2	4	8	26	84	N/A
ReproSeq	16	24	96	N/A	N/A
AmpliSeq Whole Transcriptome	N/A	N/A	2	8	16
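
The per-chip capacities in the table above can be turned into a quick estimate of how many chips a project would consume. The following is a minimal sketch only: the dictionary simply transcribes the table, and the function name `chips_needed` is hypothetical, not part of any facility tooling. Actual chip choice and batching should be confirmed with the team.

```python
import math

# Samples per chip, transcribed from the table above.
# None marks combinations listed as "N/A".
SAMPLES_PER_CHIP = {
    "Oncomine v3": {"510": None, "520": 1, "530": 2, "540": 8, "550": 16},
    "Oncomine Myeloid": {"510": None, "520": 4, "530": 12, "540": None, "550": None},
    "HotSpot v2": {"510": 4, "520": 8, "530": 26, "540": 84, "550": None},
    "ReproSeq": {"510": 16, "520": 24, "530": 96, "540": None, "550": None},
    "AmpliSeq Whole Transcriptome": {"510": None, "520": None, "530": 2, "540": 8, "550": 16},
}


def chips_needed(assay: str, chip: str, n_samples: int) -> int:
    """Return the number of chips required to run n_samples of an assay."""
    capacity = SAMPLES_PER_CHIP[assay][chip]
    if capacity is None:
        raise ValueError(f"{assay} is not listed for the {chip} chip")
    if n_samples < 0:
        raise ValueError("n_samples must be non-negative")
    # Partially filled chips still count as a whole chip.
    return math.ceil(n_samples / capacity)
```

For example, `chips_needed("HotSpot v2", "530", 60)` rounds 60/26 up to 3 chips.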

Contact: Ethan Ellis (ethan.ellis@mssm.edu)

Submission form: First section (purple). Please read the Center policies on the right side before submitting your request. Sample submission information, including shipping address, will be provided after your project is approved.

We perform Ion Torrent sequencing on pre-existing and custom gene panels on the S5/XL system, including AmpliSeq and AmpliSeq HD based panels. Our most frequently used panels are listed below. Please reach out to our team to learn more about these or other, more custom options.

Oncomine Comprehensive Assay Plus

A targeted, next-generation sequencing assay that provides a comprehensive genomic profiling solution appropriate for tissues from solid tumors. It targets a panel of 517 oncogenes and allows concurrent analysis of DNA and RNA to detect multiple types of variants in a single workflow, including single nucleotide variants (SNVs), insertions and deletions (InDels), copy number variants (CNVs), BRCA1 and BRCA2 large genomic rearrangements (LGRs), genomic instability metric (GIM), microsatellite instability (MSI), tumor mutational burden (TMB), loss-of-heterozygosity (LOH), and gene fusions featuring FusionSync. This panel is FFPE compatible and can accept DNA, RNA, or matched DNA/RNA samples with input as low as 20 ng for each sample type.

Oncomine Comprehensive Assay v3

A targeted next-generation sequencing assay based on the latest clinical oncology research for targeted solid tumor applications. This assay enables the detection of relevant SNVs, CNVs, gene fusions, and indels from 161 unique oncogenes. This panel is FFPE compatible and can accept DNA, RNA, or matched DNA/RNA samples with input as low as 10 ng for each sample type.

AmpliSeq Whole Transcriptome

A targeted next-generation sequencing assay for gene expression analysis which enables the simultaneous measurement of the expression levels of over 20,000 human RefSeq genes in a single assay. This panel is FFPE compatible and can accept RNA input as low as 0.1 ng of high-quality RNA, or 10 ng of FFPE RNA for each sample.

Pacific Biosciences SMRT sequencing

PacBio Systems	SMRTcell (movie length)	Read Length	Total Raw Data Output	HiFi Data Output	Expected Total Reads	Apps
Sequel IIe	SMRTcell 8M (15h)	1-5 kb	100-300 Gb	5-15 Gb	3-5 M	IsoSeq (+ Targeted IsoSeq), Amplicons
	SMRTcell 8M (24h)	3-15 kb	200-500 Gb	25 Gb	2-4 M	IsoSeq (+ Targeted IsoSeq), Amplicons, Twist, HLA
	SMRTcell 8M (30h)	10-30 kb	300-600 Gb	20-30 Gb	1-3 M	CLR WGS, HiFi WGS (+ Low-Input), Microbial Sequencing, Kinnex, PureTarget
Revio (with SPRQ chemistry)	SMRTcell 25M (12h)	500 bp – 5 kb	N/A	20-60 Gb	7-12 M	Targeted IsoSeq, Amplicons
	SMRTcell 25M (24h)	5-20 kb	N/A	70-120 Gb	7-10 M	HiFi WGS (+ Low-Input), Amplicons, Twist, HLA, PureTarget
	SMRTcell 25M (30h)	10-25 kb	N/A	80-150 Gb	5-10 M (x 8, 12, or 16 for Kinnex)	HiFi WGS, Kinnex(IsoSeq, scisoSeq, 16S, Amplicons)

Contact person: Ethan Ellis (ethan.ellis@mssm.edu)

Submission form: Second section (pink). Please read the Center policies on the right side before submitting your request. Sample submission information, including shipping address, will be provided after your project is approved.

We offer single molecule, real-time (SMRT) sequencing on the PacBio Sequel IIe and Revio systems. SMRT sequencing is characterized by long read lengths and high sequence accuracy, and can be used to sequence templates ranging from 500 bp to 30 kb. Common applications include de novo genome assembly, full-length transcriptome profiling and highly accurate amplicon sequencing. Please reach out to our team to learn more about the options listed below, or to inquire about other applications that may be under development.

Library Preparation Methods available:

HiFi whole genome sequencing

HiFi WGS uses a library of 10-18 kb fragments that is tightly size-selected prior to sequencing on either the Sequel IIe or Revio platforms. During sequencing, each molecule is sequenced repeatedly, and these multiple “passes” of the sequencing polymerase are collapsed in primary data processing to generate a single highly accurate (>99.9%) circular consensus sequence (CCS) per molecule. These data can be used to characterize SNPs and structural variants when compared to a reference genome.

HMW gDNA extraction services are provided through the PacBio pipeline.

Input requirements (native prep): For native strand sequencing and calls on 5mC and 6mA, at least 2 μg HMW gDNA with average fragment sizes >30 kb

Input requirements (PCR-based prep): For lower input samples, we offer the PCR-based AmpliFi prep that requires at least 50 ng HMW gDNA
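
The HiFi data outputs listed in the PacBio table above can be converted into a rough sequencing depth estimate for planning purposes. This is a minimal sketch, not a facility tool: the function names are hypothetical, the 3.1 Gb human genome size is an assumption supplied for illustration, and real yields depend on library quality and insert size.

```python
import math


def expected_coverage(hifi_yield_gb: float, genome_size_gb: float) -> float:
    """Mean depth = total HiFi bases divided by genome size (both in Gb)."""
    if genome_size_gb <= 0:
        raise ValueError("genome_size_gb must be positive")
    return hifi_yield_gb / genome_size_gb


def smrt_cells_needed(target_coverage: float, genome_size_gb: float,
                      yield_per_cell_gb: float) -> int:
    """Number of SMRT cells needed to reach a target mean depth."""
    if yield_per_cell_gb <= 0:
        raise ValueError("yield_per_cell_gb must be positive")
    return math.ceil(target_coverage * genome_size_gb / yield_per_cell_gb)


# Assumed example: human genome (~3.1 Gb) on a Revio 24h SMRT cell,
# using the conservative end of the 70-120 Gb HiFi range from the table.
HUMAN_GENOME_GB = 3.1
print(round(expected_coverage(70, HUMAN_GENOME_GB), 1))  # depth from one cell
print(smrt_cells_needed(30, HUMAN_GENOME_GB, 70))        # cells for 30x
```

Planning against the lower end of the yield range is the more cautious choice, since a run that exceeds it only adds depth.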

Twist targeted sequencing

The Twist protocol uses a probe-based pulldown approach to enrich for selected loci and provides haplotype-level resolution across connected regions. This strategy provides coverage across only regions of interest, allowing for high plex-to-cost ratio and highly customizable coverage. Probe design and ordering can take up to 6 weeks, so please contact us ahead of time.

Input requirements: 2 μg HMW gDNA per sample

Isoform sequencing (IsoSeq) + Kinnex

IsoSeq libraries are constructed from cDNA generated by oligo dT priming of total RNA, in order to capture all polyadenylated transcripts. Sequencing data generation results in highly accurate HiFi reads per isoform, and initial primary analysis steps trim primer sequences and polyA tails, remove concatemers and perform de novo (reference-free) full-length isoform predictions. These data are typically used to generate novel reference transcriptomes, examine alternative splicing patterns or to characterize alternate promoter, exon and UTR usage under different experimental conditions. The Kinnex protocol can be used to drastically increase read count per sequencing run.

Input requirements: At least 500 ng total RNA, RIN > 7 or DV200 > 90%

Single-Cell IsoSeq with 10X or ArgenTag + Kinnex

cDNA generated through 10X or ArgenTag preps can be concatenated using the Kinnex protocol and sequenced at high throughput using Revio with SPRQ chemistry. The integration of long-read sequencing with single-cell technology brings isoform-level resolution to single-cell analysis, allowing for transcriptome profiling of individual cells. These data can determine which populations are generating which isoforms, adding a new layer of valuable biological insight to single-cell analysis. Contact our team to discuss which single-cell assay would best suit your experimental needs.

Input requirements (10X + Kinnex): 100 ng of unfragmented 10X cDNA. 10X cDNA can be generated at the CAGT or provided from a previous prep. Supported assays include 10X Universal 3’, Universal 5’, and Multiome assays. Contact us to inquire about other 10X assays.

Input requirements (ArgenTag + Kinnex): 100 ng of ArgenTag cDNA, or tissue for ArgenTag prep. Contact our team to discuss options.

Amplicon sequencing

Amplicons generated from cDNA, DNA or bisulfite-treated DNA can be sequenced on the PacBio systems with high accuracy (>99.9%) and high contiguity, allowing for phasing of variants across the full amplicon length and providing haplotyping capabilities within a complex mixture. In the case of cDNA amplicons, targeted IsoSeq can be used to examine differential isoform usage or to resolve fusion transcripts at disease-relevant gene loci. Contact our team to discuss experimental design options and multiplexing strategies.

Input requirements: At least 500 ng nucleic acid template

HLA genotyping

The Human leukocyte antigen (HLA) genes are some of the most polymorphic in the genome and play a key role in determining the quality of immune responses in the context of infectious disease, cancer and autoimmune disorders. HLA profiling allows investigators to assess the association of HLA alleles and resilience/susceptibility to disease. Using commercially available reagents and software from GenDx, our team offers profiling of class I and class II HLA genes. Depending on the profiling and throughput requested, up to 96 subject samples can be profiled per run.

Input requirements: 500-2000 ng gDNA from each individual to be profiled, dependent on loci screened

Pipeline FAQs:

HMW gDNA and RNA extraction services are also available and can be performed from cells, tissue, blood, OCT, buccal swabs, any vertebrate/invertebrate specimens and plants.

QC of DNA and RNA should be performed using a system that accurately quantifies DNA and RNA molecules with nucleic acid-specific kits, such as the Qubit Fluorometer or the Agilent systems, including Bioanalyzer, TapeStation, Fragment Analyzer or Femto Pulse. We do not recommend using a Nanodrop, as these systems measure all nucleotides in a solution and may vastly overestimate the quantity of material in a sample.

All sequencing services include all necessary QC and size selection of libraries.

Bioinformatics Services

Demultiplexing of raw data and raw data delivery are always provided via secure FTP link as part of the sequencing cost. If requested, we are also able to perform all analysis pipelines available within SMRTLink, the Pacific Biosciences software associated with the Sequel and Revio systems, which are listed below. In these cases, an additional fee is included to cover compute costs.

Custom bioinformatic services are also available upon request, including large genome assemblies, HiFi genome assemblies, transcriptome annotation, targeted IsoSeq analysis and repeat expansion analysis, among others.

Circular Consensus Sequencing (CCS)

The CCS algorithm removes residual sequencing error by collapsing multiple sequencing passes over the same molecule, resulting in individual sequences (also known as HiFi reads) with accuracy > 99.9%. Data are delivered as CCS reads in BAM format. Revio delivers only HiFi reads; there is no option for subreads delivery. Sequel IIe first generates subreads, which can be converted to HiFi reads using CCS.

Hierarchical Genome Assembly (HGAP)

Typically used to assemble small, haploid genomes. Data delivered as FASTA/FASTQ assemblies.

Microbial Assembly (MA)

Similar to HGAP, but further tuned for particular features of microbial genomes, including circularizing assemblies and resolution of extrachromosomal features, such as plasmids. Data delivered as FASTA/FASTQ assemblies.

HiFi Target Enrichment

Designed for targeted DNA sequencing assays such as Twist, Target Enrichment maps reads to a chosen genome and provides structural information, including variant calls and haplotype phasing, about the targeted region.

Base Modifications

Using the kinetics of base addition during SMRT-sequencing of native (not amplified) DNA templates, this analysis allows for the detection of DNA modifications and associated motifs, including 5-mC, 5-hmC, 5-fC, 5-caC, 4-mC, 6-mA, 8-oxoG and 8-oxoA. All Revio runs will provide 5mC and 6mA tags by default.

IsoSeq

Initial CCS processing generates highly accurate reads per molecule sequenced. These reads are trimmed of primers and polyA tails, PCR artifacts (i.e. concatemers, heteroduplexes) are removed, and full-length isoforms are predicted de novo (i.e. reference-free).

Single-Cell IsoSeq

IsoSeq and single-cell read classification are combined to obtain a sample’s transcriptome at the cellular level. First, the data are filtered for full-length transcripts with intact polyA tails and identifiable 10X or ArgenTag molecular barcodes. True cells are predicted based on how many UMIs are found per cell barcode, and the transcripts in these cells are further filtered and categorized using IsoSeq tools.
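
The idea of predicting true cells from UMI counts per barcode can be illustrated with a simple threshold rule. This is a toy sketch under stated assumptions, not the method used by the IsoSeq tools or by our pipeline: the function name `call_cells`, the `expected_cells` parameter, and the 10% cutoff relative to a robust top count are all hypothetical choices for illustration. Production tools typically use more refined knee-finding or percentile methods.

```python
from collections import defaultdict


def call_cells(records, expected_cells, fraction=0.1):
    """Return the set of barcodes called as cells.

    records: iterable of (cell_barcode, umi) pairs, one per read.
    expected_cells: rough number of cells loaded (assumed known).
    fraction: a barcode is kept if its distinct-UMI count is at least
              this fraction of a robust "top" count.
    """
    # Collapse duplicate reads: count distinct UMIs per barcode.
    umis_per_barcode = defaultdict(set)
    for barcode, umi in records:
        umis_per_barcode[barcode].add(umi)
    if not umis_per_barcode or expected_cells < 1:
        return set()

    counts = sorted((len(u) for u in umis_per_barcode.values()), reverse=True)
    top = counts[:expected_cells]
    # Use the count at the 1% rank of the top barcodes rather than the
    # maximum, so a single outlier barcode does not set the threshold.
    anchor = top[int(0.01 * len(top))]
    threshold = fraction * anchor
    return {bc for bc, u in umis_per_barcode.items() if len(u) >= threshold}
```

Barcodes with only a handful of UMIs, which usually reflect ambient RNA rather than intact cells, fall below the threshold and are dropped.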

Kinnex

Unprocessed data from Kinnex preps is provided as 8-, 12-, or 16-mer reads, including Kinnex adapter sequences. Read segmentation can be requested as a standalone service to break apart the concatemers and clip the Kinnex adapter sequences if further processing is not requested, or it can be included with downstream IsoSeq or scIsoSeq analysis.

Illumina Next Generation Sequencing

Contact person: Ethan Ellis (ethan.ellis@mssm.edu)

Submission form: First section (purple). Please read the Center policies on the right side before submitting your request. Sample submission information, including shipping address, will be provided after your project is approved.

We operate a full suite of Illumina NGS library preparation and sequencing pipelines, including DNA and RNA sequencing, as well as epigenetic profiling. We accept projects for library preparation and sequencing, including pre-made libraries, and offer extraction services for projects that require them. We provide troubleshooting assistance when needed to ensure successful completion of your project and we will work with you for any custom requests that are not listed below.

Pricing is determined by the assay(s) requested and the total number of samples. Please reach out and we will generate a custom cost estimate for your project.

Library Preparation Methods Available:

RNA methods

Stranded mRNA Sequencing
RNA-seq library preparation that investigates mRNAs for gene expression analysis. This method is compatible with any species that has mRNA with polyadenylated 3’ ends. Not FFPE compatible. 
Input requirements: total RNA with RIN > 7; 25 ng to 1 µg input.

Stranded Total RNA Sequencing
RNA-seq library preparation that captures protein coding mRNA as well as long non-coding RNA. This method is compatible with low yield, as well as low quality, degraded, and FFPE RNA samples. This method specifically depletes abundant cytoplasmic RNA species (i.e. rRNA, mtRNA, globin RNA) from human, mouse, and rat origin but may also be applicable to a variety of eukaryotic species (see link for species compatibility).
Input requirements: 250 pg to 1 µg total RNA input with RIN greater than or equal to 7, 1 ng to 1 µg total RNA input from FFPE or with RIN between 2 and 7. Minimum integrity is DV200 >30%.

smRNA Sequencing
Library preparation method used to study small non-coding RNAs (including miRNAs) responsible for gene silencing and post-transcriptional regulation of gene expression.
Input: RIN>8, 1 ng–2 µg input of total RNA or enriched small RNA samples.

DNA methods

Whole Genome Sequencing
DNA preparation method that enables whole genome sequencing, FFPE compatible.
Input requirements: 100 – 500 ng of intact gDNA for human or other large genomes. As low as 10 ng for small genomes.

Whole Exome and Targeted Sequencing
DNA library preparation method for whole exome or targeted sequencing, to investigate protein coding regions or other targets of interest in a genome. Coding regions are captured by a panel of complementary oligonucleotides for enrichment. FFPE compatible.
Input requirements: 10 ng to 1 µg genomic DNA, 50 ng minimum for human whole exome samples.

Amplicon Sequencing:

We provide two methods for amplicon library preparation: (1) Full length amplicon sequencing using direct ligation for small amplicons (< 500 bp); (2) Fragmentation and library preparation for larger amplicons (~ 10 – 20 kb)

Input requirements: (1) Small amplicons (< 500 bp): 1 ng per sample; (2) Large amplicons (~ 10 – 20 kb): 10 ng per sample

ChIP-Seq
Library preparation of DNA recovered from the ChIP technique to produce amplified libraries for sequencing on the Illumina sequencing platforms.
Input requirements: 0.1 - 10 ng ssDNA or dsDNA eluted from a ChIP protocol; optimized for input fragments of 200 - 400 bp.

ATAC-Seq

Chromatin accessibility profiling assay, compatible with isolated nuclei, cells or tissue samples. Please reach out to coordinate cell handling and sample drop-off prior to submission.

User-Prepared Libraries
We are happy to accept and sequence user prepared libraries. Please submit >15nM library in >20 µl water or EB (must be EDTA free) for direct QC and loading. Sequencing parameters and specifications can be discussed with our team.
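
Because the submission minimum is stated in nM while most fluorometric assays report ng/µL, a conversion is often needed before submitting. The sketch below uses the standard approximation of 660 g/mol per base pair of double-stranded DNA. The function names are hypothetical, and the mean fragment size is assumed to come from a TapeStation or Bioanalyzer trace.

```python
def library_molarity_nm(conc_ng_per_ul: float, mean_fragment_bp: float) -> float:
    """Convert a dsDNA library concentration from ng/uL to nM.

    nM = (ng/uL * 1e6) / (660 g/mol/bp * mean fragment length in bp)
    """
    if conc_ng_per_ul < 0 or mean_fragment_bp <= 0:
        raise ValueError("concentration must be >= 0 and fragment size > 0")
    return conc_ng_per_ul * 1e6 / (660.0 * mean_fragment_bp)


def meets_submission_minimum(conc_ng_per_ul: float, mean_fragment_bp: float,
                             volume_ul: float,
                             min_nm: float = 15.0,
                             min_volume_ul: float = 20.0) -> bool:
    """Check a library against the >15 nM in >20 uL guideline stated above."""
    molarity = library_molarity_nm(conc_ng_per_ul, mean_fragment_bp)
    return molarity > min_nm and volume_ul > min_volume_ul
```

As an example, a 6.6 ng/µL library with a 400 bp mean fragment size works out to about 25 nM, so 25 µL of it would meet the guideline, whereas a 2 ng/µL library of the same size (about 7.6 nM) would not.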

Extractions
DNA and RNA extractions from fresh, frozen or FFPE samples are supported, as well as custom requests.  QIAcube

QC-Only Services:

Submission form: Fifth section (navy). Please read the Center policies on the right side before submitting your request. Sample submission information, including shipping address, will be provided after your project is approved.

Qubit: Gold standard quantification of double-stranded DNA and RNA. Available in both broad range (DNA: 0.2 - 4000 ng/µL; RNA: 0.5 - 1200 ng/µL) and high sensitivity (DNA: 0.005 - 120 ng/µL; RNA: 0.2 - 200 ng/µL) configurations.

Agilent TapeStation and Agilent Bioanalyzer: Quantitation and visualization of DNA/RNA mass, integrity and size distribution. Provides RIN, DIN, and DV200 measurements of integrity critical for selecting appropriate assays.

qPCR: Quantitation of specific templates. Available instrumentation provides throughput for 384-well plates.

Illumina Flowcells

Illumina Instruments	Total Cycles	Output	Reads Passing Filter
NextSeq	High-Output 300 cycle	100–120 Gb	Up to 400 M
	High-Output 150 cycle	50–60 Gb	Up to 400 M
	High-Output 75 cycle	25–30 Gb	Up to 400 M
	Mid-Output 300 cycle	32.5–39 Gb	Up to 130 M
	Mid-Output 150 cycle	16.25–19.5 Gb	Up to 130 M
MiSeq	v3 150	3.3–3.8 Gb	22–25 M
	v2 300	4.5–5.1 Gb	12-15 M
	v2 500	7.5–8.5 Gb	12-15 M
	v3 600	13.2–15 Gb	22–25 M
	v2 Micro 300	1.2 Gb	4 M
	v2 Nano 300	300 Mb	1 M
MiniSeq	High-Output 300 cycle	6.6–7.5 Gb	22–25 M
	High-Output 150 cycle	3.3–3.75 Gb	22–25 M
	High-Output 75 cycle	1.65–1.875 Gb	22–25 M
	Mid-Output 300 cycle	2.1–2.4 Gb	7–8 M
NovaSeq	SP 100	65–80 Gb	650–800 M
	SP 200	134–167 Gb	650–800 M
	SP 300	200–250 Gb	650–800 M
	SP 500	325-400 Gb	650–800 M
	S1 100	134–167 Gb	1.3–1.6 B
	S1 200	266–333 Gb	1.3–1.6 B
	S1 300	400–500 Gb	1.3–1.6 B
	S2 100	333–417 Gb	3.3 B–4.1 B
	S2 200	667–833 Gb	3.3 B–4.1 B
	S2 300	1000–1250 Gb	3.3 B–4.1 B
	S4 200	1600–2000 Gb	8-10 B
	S4 300	2400–3000 Gb	8-10 B

Projects do not share a flowcell with other projects.
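
Since each project occupies its own flowcell, it can help to estimate which flowcell from the table above fits a given sample count and per-sample read target. The following is a rough sketch only: the function names are hypothetical, read counts are transcribed from the table, and it ignores index overhead, PhiX spike-in and read-length requirements, all of which should be discussed with the team.

```python
import math
from typing import Optional

# Reads passing filter, in millions, from the table above (smallest first).
# NovaSeq entries use the lower end of each listed range. NextSeq entries are
# listed only as "up to" values, so they are upper bounds and real yields
# may be lower.
FLOWCELL_READS_M = [
    ("NextSeq Mid-Output", 130),
    ("NextSeq High-Output", 400),
    ("NovaSeq SP", 650),
    ("NovaSeq S1", 1300),
    ("NovaSeq S2", 3300),
    ("NovaSeq S4", 8000),
]


def smallest_flowcell(n_samples: int, reads_per_sample_m: float) -> Optional[str]:
    """Return the smallest listed flowcell covering the total read target,
    or None if a single flowcell is not enough."""
    required = n_samples * reads_per_sample_m
    for name, reads in FLOWCELL_READS_M:
        if reads >= required:
            return name
    return None


def max_samples(flowcell: str, reads_per_sample_m: float) -> int:
    """Return how many samples fit on a flowcell at a per-sample read target."""
    reads = dict(FLOWCELL_READS_M)[flowcell]
    return math.floor(reads / reads_per_sample_m)
```

For instance, 24 samples at 30 M reads each need 720 M reads, which exceeds the SP lower bound of 650 M and therefore maps to an S1 flowcell in this sketch.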

Bioinformatics

Contact: Hardik Shah (hardik.shah@mssm.edu)

The CAGT provides a comprehensive suite of bioinformatics services to translate data from NGS and other cutting-edge platforms into scientific insights. The bioinformatic pipelines are hosted and executed on the Minerva HPC cluster. Our standard primary analysis pipelines include:

Element Biosciences – AVITI and Teton CytoProfiling

Regular demultiplexing
Visualization for Teton CytoProfiling data

Takara Bio – Cogent NGS Analysis Pipeline

Single-cell sequencing analysis pipeline that processes Takara Bio datasets and generates a cell-by-gene expression matrix, QC metrics, clustering results, and interactive HTML reports for data exploration and downstream analysis

Mission Bio – Tapestri Pipeline

Single-cell DNA-seq pipeline designed for targeted variant detection. The primary output is a cell-by-variant genotype matrix, enabling clonal architecture, mutation co-occurrence, and phylogenetic analyses at single-cell resolution

10X Genomics – Single-Cell Gene Expression, Immune Profiling, Multiome, Spatial

10X Cell Ranger: scRNA-seq analysis pipeline that processes raw sequencing data to produce a cell-by-gene expression matrix, clustering results, dimensionality reduction analyses, and comprehensive QC metrics
10X Cell Ranger ARC: Single-cell multiome ATAC + gene expression analysis
10X Space Ranger: A spatial transcriptomics analysis pipeline that integrates sequencing data with tissue imaging. Produces spatial gene expression matrix, tissue coordinates, clustering results, and QC metrics, allowing gene expression to be mapped back to tissue architecture

RAPiD

Nextflow-based RNA pipeline that generates gene-by-sample expression matrices, raw gene count matrices, differential expression results, and interactive HTML reports summarizing QC metrics and downstream analyses. The pipeline supports extensive customization through configurable parameters

DSeek

Nextflow-based DNA-seq pipeline for variant discovery. It performs alignment, quality control, and variant calling, producing outputs such as BAM files, VCF files, and associated QC reports for downstream genomic analyses
Also provides peak calling for ATAC-seq and ChIP-seq

TDIsoSeq

An IsoSeq analysis pipeline with the goal of identifying full-length transcript isoforms from long-read sequencing data. Outputs include isoform-by-sample expression matrices, transcript annotations, and novel isoform discovery results, enabling comprehensive transcriptome characterization

Custom Services

Beyond standard analyses, the Center for Advanced Genomic Technology also provides bioinformatic consulting and support for customized analyses, for example collaboratively creating customized references to identify HIV content in single-cell transcriptomic analysis. Other common support includes differential expression analysis, visualization for single-cell and spatial transcriptomic data, and cell-type deconvolution.