10X Genomics – Guide to Getting Started

This page is intended to familiarize Gene Expression Center clients with the many aspects of the various 10X Genomics single cell/single nuclei/spatial transcriptomics assays that are available through the University of Wisconsin-Madison Biotechnology Center. Please see the tabs below to go over each of these topics in detail. Note that this page deals primarily with the fresh cell and spatial transcriptomics assays. Please see this page for information on the Flex assay (formerly known as Fixed RNA) for fixed samples.

Last updated: 2025/03/04

Please note: This video was recorded on 1/3/2023. Sequencing options and costs have changed considerably since then with the acquisition of the NovaSeq X+ instrument. Please see this page for a broad overview of how the options (and pricing) have changed as a result.


Available 10X Genomics kits/library preparation services

A note about these kits: We only maintain a stock of the 3′ Gene Expression kits. For any other assay, we will order the kits specifically for your project. We ask that you provide a funding string at the time we place the order so that we can eventually bill for the cost of the kits even if your project plans do not end up proceeding. While 10X Genomics is usually able to deliver kits a day or two after they are ordered, they do suffer from occasional back-order issues. If this happens, we will keep in touch with you to let you know when we receive the kits, so that you are not bringing over samples on a day when we do not have the reagents to process them.

Single cell/single nucleus 3′ Gene Expression – Also known as Universal 3′ Gene Expression. The standard “workhorse” kit for single cell/nucleus RNA sequencing. The kit employs polyA-based capture of mRNA at the 3′ end to generate dual indexed libraries containing both a cell barcode identifying the cell of origin as well as a unique molecular identifier (UMI), which will be unique to every transcript captured. “Feature barcoding” add-on modules are available for cell surface protein expression (analogous to CITE-seq) and sample multiplexing. Available in standard singleplex format and the new “On-Chip Multiplexing” format.

Single cell 5′ Gene Expression/Immune Profiling – Also known as Universal 5′ Gene Expression. This kit also generates single cell RNA-seq libraries, but it captures transcripts at the 5′ end via the template switch oligo (TSO) sequence added to that end of the transcript during a template-switching reverse transcription reaction. The main reason to choose this kit over the 3′ Gene Expression kit is the immune repertoire profiling add-on module, which allows for the parallel PCR enrichment and library preparation of B cell/T cell receptor V(D)J sequences. Currently, 10X Genomics only sells these modules for human and mouse V(D)J amplification. “Feature barcoding” add-on modules are available for cell surface protein expression and CRISPR screening. Available in standard singleplex format and the new “On-Chip Multiplexing” format.

Single nucleus ATAC Sequencing – This kit is used to generate ATAC-seq libraries for measurement of chromatin accessibility at single-nucleus resolution.

Single nucleus Multiome ATAC + Gene Expression – This kit uses gel beads with capture oligos for both mRNA polyA tails and transposed DNA for the parallel preparation of ATAC-seq and 3′ Gene Expression libraries from the same nucleus. At approximately the mid-point of the library preparation process, the samples undergo an initial PCR amplification to produce sufficient material for library construction, and the pool is then split to produce ATAC-seq and gene expression libraries from the same initial sample.

Visium Spatial Transcriptomics (CytAssist) – The CytAssist Visium assays use a dual probe-based method of capturing transcripts and are currently only available for human and mouse. When both probes are able to bind to their adjacent target sites, they are ligated together, and this ligated probe construct becomes the template for capture on the Visium slide and subsequent library preparation. Before starting the Visium FFPE workflow, an RNA quality assessment is performed to determine the fraction of RNA in the sample that is above 200 bp in length (“DV200”); 10X Genomics recommends a minimum DV200 value of 30%. This assessment can also be done by the GEC for users who lack access to an instrument such as the Bioanalyzer or Tapestation. Visium projects are done in collaboration with the TRIP Lab, which handles the initial sectioning and imaging of the tissue. Standard CytAssist slides are available with 6.5 x 6.5mm capture areas or 11 x 11mm capture areas. The newest version, Visium HD, uses 6.5 x 6.5mm capture areas with an unbroken lawn of capture oligos for achieving purported single-cell resolution.

Information on the probe sets and the number of targeted genes can be found here.

(The images in this section were taken from their respective User Guides provided by 10X Genomics; you can find their Support database containing these documents and more here.)

Do I need to include biological replicates, or can I treat single cells as replicates for statistical testing?

This section is provided by Anthony Veltri, Ph.D., a staff member in the UW Biotechnology Center Bioinformatics Resource Core

Just like any other experiment, biological replicates are necessary to perform statistical tests comparing gene expression or cell population size between conditions. Although single cell data is made up of thousands of individual cells, each cell cannot be considered a replicate because of correlations between cells within samples. Treating cells as replicates can greatly increase the false-positive rate of statistical tests for differential gene expression. The practice of analyzing pooled samples can lead to a statistical mistake called sacrificial pseudoreplication, which confounds the variation between samples and the variation within samples (i.e. significant differences could either be due to a robust treatment effect or one outlier sample in the pool).[1]

A commonly used correction for this is called “pseudobulking,” where between-sample variation is accounted for by applying traditional bulk RNA-seq differential expression testing methods to read counts that have been summed or averaged within samples for each cell type. One study[2] describes simulations to compare false positive rates of statistical tests that do account for biological samples versus statistical tests that do not. The authors found that false positive rates ranged between ~0.3 and ~0.8 when samples were analyzed without consideration for sample variation, whereas the pseudobulk correction method had a false positive rate between ~0.02 and ~0.03.
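As a rough illustration of the pseudobulking idea described above, the following Python sketch sums per-cell counts within each (sample, cell type) group. The record layout, sample names, and gene names are invented for the example and do not come from any particular analysis package; a real analysis would typically work from the count matrix produced by the pipeline.

```python
from collections import defaultdict

def pseudobulk(cells):
    """Sum per-cell gene counts within each (sample, cell_type) group.

    `cells` is an iterable of dicts with hypothetical keys:
    'sample', 'cell_type', and 'counts' (a gene -> count mapping).
    """
    totals = defaultdict(lambda: defaultdict(int))
    for cell in cells:
        key = (cell["sample"], cell["cell_type"])
        for gene, count in cell["counts"].items():
            totals[key][gene] += count
    # Convert the nested defaultdicts to plain dicts for easier inspection.
    return {key: dict(genes) for key, genes in totals.items()}

# Toy data: two cells from one control sample, one cell from a treated sample.
toy_cells = [
    {"sample": "ctrl_1", "cell_type": "T cell", "counts": {"GeneA": 3, "GeneB": 0}},
    {"sample": "ctrl_1", "cell_type": "T cell", "counts": {"GeneA": 1, "GeneB": 2}},
    {"sample": "treat_1", "cell_type": "T cell", "counts": {"GeneA": 5, "GeneB": 1}},
]

bulk = pseudobulk(toy_cells)
print(bulk[("ctrl_1", "T cell")])
```

The per-sample totals would then be passed to a bulk RNA-seq differential expression method, so that the samples (not the individual cells) serve as the replicates.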

Further recommended reading on this topic includes the following publications:

· Gibson G (2022) Perspectives on rigor and reproducibility in single cell genomics. PLoS Genet 18(5): e1010210. https://doi.org/10.1371/journal.pgen.1010210

The discussion under the headings “Evaluation of significance” and “Covariate adjustment” is particularly relevant.

· Squair, J.W., Gautier, M., Kathe, C. et al. Confronting false discoveries in single-cell differential expression. Nat Commun 12, 5692 (2021). https://doi.org/10.1038/s41467-021-25960-2

The authors test common single-cell differential expression methods. Figure 4 demonstrates the false positive DE genes that arise from not handling variation between replicates. In particular, note the poor results of the “Wilcox” (Wilcoxon rank sum) test, which is the default statistical test for Seurat’s FindMarkers function.

In summary, failing to account for the variation between biological samples when statistically testing condition-dependent effects strongly increases false positive differential expression results in single-cell data. While the literature is full of examples of this problem, reviewers are becoming increasingly aware of it as the technology becomes cheaper and more mainstream. It is likely that single-cell experiments without biological replicates will be much more difficult to publish in peer-reviewed journals in the future.

[1] Heffner, R. A., Butler, M. J., & Reilly, C. K. (1996). Pseudoreplication revisited. Ecology, 77(8), 2558-2562.

[2] Zimmerman, K. D., Espeland, M. A., & Langefeld, C. D. (2021). A practical solution to pseudoreplication bias in single-cell studies. Nature communications, 12(1), 738.

How do I schedule a submission for a single cell project?

For most projects, you can e-mail us at gecinfo@biotech.wisc.edu to schedule a submission. If you have not used UWBC services recently, we will first need to add you to our submission system, which you can use to log in and create a submission request. Please give us at least one week’s notice, as staff may already have their projects scheduled for the week. Outside of our peak busy season in November and December, there is generally not too much risk of scheduling conflicts, but we advise you to schedule your submission with us as early as possible to try to ensure that your desired submission window is available.

How does the single cell capture work in the 10X Genomics workflows?

10X Genomics uses a droplet-based method of cell capture. The Chromium instrument uses single-use microfluidic chips to combine the gel beads bearing the capture oligos with partitioning oil, the reverse transcription master mix, and the cells into a unit called a gel bead-in-emulsion, or GEM. In order to limit the number of GEMs containing multiple cells (called multiplets), the cells are delivered at such a limiting dilution that the majority of GEMs will contain no cells at all. Importantly, the capture efficiency of this process is such that in the newest version of the assay, only 70-80% of the cells loaded into the assay will be captured – i.e. to capture 20,000 cells, we will load approximately 29,000. In order to avoid an increasing burden of multiplets, 10X Genomics recommends a maximum target of 20,000 captured cells per reaction (though with multiplexing this can be increased).
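The loading arithmetic above can be sketched in a few lines of Python. The function name is hypothetical, and the capture efficiency is passed in as an assumption rather than a fixed property of the assay; the ~29,000 figure quoted above corresponds to the lower end of the 70-80% range.

```python
def cells_to_load(target_captured, capture_efficiency):
    """Estimate how many cells to load to capture `target_captured` cells.

    `capture_efficiency` is the assumed fraction of loaded cells that are
    captured (e.g. 0.7 to 0.8 for the range quoted in the text).
    """
    if not 0 < capture_efficiency <= 1:
        raise ValueError("capture_efficiency must be in (0, 1]")
    return round(target_captured / capture_efficiency)

# Targeting 20,000 captured cells at the two ends of the quoted range.
low_efficiency_load = cells_to_load(20_000, 0.7)
high_efficiency_load = cells_to_load(20_000, 0.8)
print(low_efficiency_load, high_efficiency_load)
```

In practice the loading volumes are taken from the 10X Genomics loading table rather than computed this way; the sketch only shows why more cells are loaded than are expected to be captured.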

How many cells are needed, and at what concentration?

When possible, we ask for a minimum of ~100,000-150,000 cells at a concentration of around 1,000 – 1,600 cells/uL. Under these conditions, it is typically possible to count the sample twice (at minimum), load the assay to target anywhere in the 500 – 10,000 cells per sample capture range, and repeat the run if needed in the rare event of a clog in the microfluidics of the chip. Please take note of the loading table linked at the bottom of this section for an overview of the concentrations and sample volumes needed to aim for specific recovery targets, and note that under normal conditions, counting the sample in duplicate on our cell counter will use 36 uL of the sample volume.

If you are sorting samples prior to bringing them to the GEC, we ask that you sort at least ~150,000-200,000 cells, if possible. Again, this depends somewhat on how many cells per sample you would like us to collect. Sorted samples often need to be concentrated to be used for single cell submissions, and due to the loss of cells from the harsh nature of the sort and the concentration process, it is common for the number of cells we measure on our cell counter to be approximately 50% of the number of “sorted events” reported by the Flow lab.

When the number of available cells is around 50,000 or less, it becomes increasingly difficult to obtain duplicate counts and load the assay (especially if you are hoping to target the high end of the capture range), and it becomes more likely that we will not have the volume to repeat the run if we do experience a clog. For cases where you are only able to sort a small number of cells (~10,000 or less), we can do a “sort-and-load” run where the cells are sorted into a small volume and loaded directly into the assay without measuring them on the Luna FX7. We can approximate the number of captured cells based on the volume and the number of sorted events, although this approach carries the risk of not being able to assess their quality before loading.
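For planning purposes, the numbers in this section can be combined into a quick back-of-the-envelope estimate. The Python sketch below is only illustrative: the function and constant names are hypothetical, the 36 uL counting volume and the ~50% sort recovery are the approximate figures quoted above, and actual values will vary from sample to sample.

```python
COUNTING_VOLUME_UL = 36        # volume used by a duplicate count (from the text)
SORT_RECOVERY_FRACTION = 0.5   # approximate counted cells per sorted event (from the text)

def cells_after_counting(concentration_per_ul, volume_ul):
    """Cells left for loading once the duplicate count has used its volume."""
    remaining_ul = max(volume_ul - COUNTING_VOLUME_UL, 0)
    return concentration_per_ul * remaining_ul

def expected_counted_cells(sorted_events):
    """Rough number of cells expected on the counter after a sort."""
    return int(sorted_events * SORT_RECOVERY_FRACTION)

# Example: 100 uL at 1,200 cells/uL, and a sort reporting 200,000 events.
remaining = cells_after_counting(1_200, 100)
counted = expected_counted_cells(200_000)
print(remaining, counted)
```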

Related document(s): 3′ v4 cell concentration versus targeted cell recovery table (from the 3’ v4 User Guide, CG000731, Rev A)

How should samples be prepared for submission to the Gene Expression Center?

Due to sample type-specific characteristics, we currently leave the preparation of single cell or single nuclei suspensions to the submitting lab. In the related documents below, you will find links to a number of sample prep resources from 10X Genomics. Additionally, the Worthington Tissue Dissociation Database is a great resource, with protocols for many different tissues and species. As you are developing your protocol, we are happy to schedule “mock” measurements with you to look at your samples on the Countess and get a sense of where they are at in terms of concentration and viability/quality.

When preparing your samples, it is important that they are delivered in buffer that is free of any components that might inhibit the reverse transcription reaction (e.g. EDTA at concentrations above 0.1 mM). 10X Genomics recommends PBS with 0.04% BSA, if possible.

Ideally the cell viability for your samples should be above 90%. Due to the high costs of the reagents for these preps, we caution against proceeding with samples below 75% viability, as it becomes increasingly difficult to hit our cells per sample target and the level of background from “ambient” RNA increases (more on that further down this page).

Related document(s):
10X Genomics Single Cell Protocols: Cell Preparation Guide
Acceptable sample buffer options, per the above document (CG00053, Rev D)
How can I isolate nuclei for 3’ Gene Expression profiling?
What are the best practices for working with nuclei samples for 3’ single-cell gene expression?
Are RNase inhibitors required in the preparation of my sample?

What are the limits on the number of samples per run? How many cells per sample can be captured?

For the Chromium iX currently housed in the GEC, the compatible 10X Genomics chips can hold eight samples per run. We can run consecutive chips to process more than eight samples in a project, although the need to obtain reliable counts on every sample prior to loading means that increasing the number of samples risks extending the time the samples are sitting on ice to the point where the quality may begin to drift in unpredictable ways. The scaling costs of the reagents and the required sequencing (more on this below) also pose challenges to many labs.

In the standard-throughput assays, for non-multiplexed samples, the optimized range for capture is 500 – 20,000 cells per sample. If the volume of cells permits, the same sample can be loaded into multiple lanes of a chip to increase the number of cells captured, although each lane used will incur an additional sample’s worth of reagent costs. As discussed below in the section on sample multiplexing, the number of cells captured in a given channel can be pushed above 20,000, though there are some potential drawbacks.
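The lane and chip arithmetic described above can be sketched as follows. The function names are hypothetical; the 20,000-cell ceiling and the eight lanes per chip are the figures quoted in this section for the standard, non-multiplexed assays.

```python
MAX_CELLS_PER_LANE = 20_000   # upper end of the capture range quoted above
LANES_PER_CHIP = 8            # samples (lanes) per chip quoted above

def lanes_needed(target_cells, max_per_lane=MAX_CELLS_PER_LANE):
    """Number of lanes required to capture `target_cells` from one sample."""
    if target_cells <= 0:
        raise ValueError("target_cells must be positive")
    return -(-target_cells // max_per_lane)   # integer ceiling division

def chips_needed(total_lanes, lanes_per_chip=LANES_PER_CHIP):
    """Number of chips required to run `total_lanes` lanes."""
    return -(-total_lanes // lanes_per_chip)

# Example: three samples, each targeting 50,000 captured cells.
lanes_per_sample = lanes_needed(50_000)
total_chips = chips_needed(3 * lanes_per_sample)
print(lanes_per_sample, total_chips)
```

Each lane carries its own reagent cost, so in this example the three samples would be billed as nine samples' worth of reagents across two chips.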

If your experience with bulk RNA sequencing has led you to plan for a large number of biological replicate samples, this may not be necessary. While the standards of the field are constantly evolving, much of the replication in a single cell experiment comes from the cells themselves, and including a large number of biological replicates may not strictly be necessary to obtain valuable, useful data (though please see the section above on biological replicates for the statistical caveats of treating cells as replicates). We encourage you to look at current publications in your field, especially those in journals that you might seek to publish your work in eventually, for insight into how many replicates you should consider including. The Bioinformatics Resource Core here at UWBC is also a fantastic resource for this aspect of your experimental design.

If you do find yourself considering a large number of samples for your initial single cell experiment, we would strongly encourage you to first plan for a small pilot experiment with 1-2 samples, if time permits. While it will add additional upfront costs to your project, you can learn a lot about how effective your sample prep process is; how your cells behave in the assay; and what read depth is sufficient to get the answers to your experimental questions – all factors that can potentially offer ways to better design your larger experiment to get the best value for your money.

What options are available for sample multiplexing?

There are currently two main options available for cell multiplexing. The first is hashing, which involves staining cells with antibodies conjugated to hashtag oligos that are captured during the same process that captures transcripts from the cell. BioLegend’s TotalSeq B and C product lines are designed to be natively compatible with the 3’ and 5’ gene expression assays, respectively, and if you wish to use hashing we strongly suggest you use either TotalSeq B or C if possible. The TotalSeq A product line can also be used, although this requires additional reagents from other vendors (custom oligos and master mix) and “off-protocol” work.

10X Genomics also recently introduced a new version of the 3′ and 5′ assays using an approach called On-Chip Multiplexing. While this approach comes with a much lower per-sample reagent cost and allows for multiplexing without any additional sample labeling steps, it also has some notable limitations – most importantly, it comes with a reduced cap of 5,000 captured cells per sample, compared to the 20,000 cells that can be targeted in the standard assays.

Related document(s):
Hashing application note from BioLegend

How do batch submissions work (e.g. for time course experiments)?

The 3’ and 5’ gene expression assays have two main pause points within the workflow. After the reverse transcription reaction, the samples are stored at -20C, where they are considered stable for one week. On the next day of work, they will be taken up through the cDNA amplification and cleanup, after which they are considered stable for four weeks.

Either of these stopping points can be an opportunity to merge batches together to save money on labor/consumables costs, although the initial portion(s) of the prep that cannot be batched will still carry their own costs for labor and consumables. If you are able to bring batches to us within a week of the initial submission, we can batch them together for day two and onward; if you need more time between batches, we can batch them together for the final day of library construction if they are submitted within four weeks of the first batch.
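The batching windows for the 3’ and 5’ gene expression assays can be summarized as a simple decision rule. The Python sketch below uses a hypothetical function name and return strings, and treats “within a week” and “within four weeks” as 7 and 28 days; actual scheduling should always be confirmed with GEC staff.

```python
def batching_option(days_since_first_batch):
    """Return the earliest point at which a later batch can join the first.

    Assumes the one-week and four-week stability windows quoted above
    for the 3' and 5' gene expression assays.
    """
    if days_since_first_batch < 0:
        raise ValueError("days_since_first_batch cannot be negative")
    if days_since_first_batch <= 7:
        return "batch from day two (cDNA amplification) onward"
    if days_since_first_batch <= 28:
        return "batch for the final day of library construction"
    return "no library prep batching; process separately"

for days in (3, 14, 40):
    print(days, "->", batching_option(days))
```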

The other assays have similar batching opportunities. For the current version of the ATAC protocol (v2), the samples can be stored at -20C for one week after the initial Chromium instrument run and GEM incubation, and we can also pause after the following cleanup step for two weeks if needed. For the multiome protocol, the samples are stored at -80C for up to four weeks after the reverse transcription reaction, and then as it splits into the gene expression and ATAC library prep there are similar pause points, providing ample opportunities for batching.

In all cases, the final libraries are considered stable indefinitely at -20C, so even if we are unable to batch them for library construction, you can hold samples for later sequencing if you plan to submit additional samples down the line.

How long does it take to get sequencing data? What are the QC checkpoints during the prep?

As discussed in the batching section, most of the 10X Genomics library preps are split into three days. Preparation of single nuclei ATAC libraries typically takes two days, while Multiome projects or projects with add-ons such as feature barcoding/cell surface protein libraries may require an extra day to finish the additional set of libraries. Though we try to work through this process as quickly as possible, these three days of work may occasionally be spread over a slightly longer period depending on our current workload and staffing situation at the time of submission.

Once we are finished with the libraries, we will pass them to the DNA Sequencing facility for a relatively low-depth MiSeq sequencing run, which provides important library QC information. The turnaround time for the MiSeq tends to be approximately 3-5 business days. Depending on the type of library, we may then send you a summary of the MiSeq data and a recommendation for or against proceeding with the NovaSeq. At this point we will wait for confirmation from you before we submit paperwork to DNA Sequencing to move the samples into the NovaSeq queue. ATAC and Visium libraries will typically go directly into the NovaSeq queue after the MiSeq is completed. The typical turnaround time for NovaSeq data is approximately 1-2 weeks.

Throughout this process we will have four QC checkpoints. The first is the examination of the cells/nuclei on our Countess II cell counter, where we will assess factors such as viability, proportion of single cells versus aggregates, presence of debris, etc. Proceeding beyond this point commits you to the cost of the reagents for the library prep, even if the samples fail at a later checkpoint. The second QC checkpoint is the cDNA yield at the end of the second section of the library prep. Though this is highly variable depending on the cell type and sample condition, very low cDNA yield can be an indicator of poor quality initial samples or issues with the sample prep process. Unfortunately this result can occasionally occur even with cells that look intact on our cell counter, an outcome that seems to correlate with excessively lengthy or stressful sample prep conditions.

The third QC checkpoint is the final library QC, during which we will measure the library yield and evaluate the libraries on the Agilent Tapestation. In almost all cases where the cDNA yield was good, the final library QC will look fine. The final QC checkpoint is the MiSeq data.

How does the initial analysis of the NovaSeq data work?

When the sequencing of your libraries is finished, the DNA Sequencing core will run the bcl2fastq pipeline to generate FASTQ files from the raw sequencing reads, which is what they will release to your group. The Cell Ranger pipeline then takes those FASTQ files, aligns them to the selected reference transcriptome for each sample/library, and counts the UMI (individual transcript) barcodes observed for each gene under each cell barcode, giving you a measure of the expression level of each gene in each cell.
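To make the counting step more concrete, here is a toy Python sketch of the general idea of UMI counting: reads sharing the same cell barcode, gene, and UMI are collapsed into a single transcript. This is not Cell Ranger's implementation, which also handles alignment, barcode and UMI error correction, and cell calling; the barcodes, UMIs, and gene names below are invented.

```python
from collections import defaultdict

def count_umis(reads):
    """Count distinct UMIs per (cell barcode, gene) pair.

    `reads` is an iterable of (cell_barcode, umi, gene) tuples, assumed to
    have already been aligned and assigned to a gene.
    """
    seen = defaultdict(set)
    for barcode, umi, gene in reads:
        seen[(barcode, gene)].add(umi)
    return {key: len(umis) for key, umis in seen.items()}

toy_reads = [
    ("AAAC", "UMI1", "GeneA"),
    ("AAAC", "UMI1", "GeneA"),  # same barcode, gene, and UMI: a PCR duplicate
    ("AAAC", "UMI2", "GeneA"),
    ("TTTG", "UMI1", "GeneA"),
    ("TTTG", "UMI3", "GeneB"),
]

matrix = count_umis(toy_reads)
print(matrix)
```

Here the cell with barcode AAAC ends up with two GeneA transcripts rather than three reads, which is why UMI counts are less affected by amplification bias than raw read counts.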

It will also generate what is called a web summary html file, which is a very nice summary of QC statistics for a given library (e.g. how many cells were captured; what fraction of reads for that library could be mapped to “real” cells versus background; how many genes and transcripts were detected; etc.) in addition to giving you a preview of how the cells clustered into groups by their broad expression patterns.

Cell Ranger will also generate cloupe files, which can be imported into their Loupe Browser software for a fairly user-friendly way to explore the data and learn more about the cell clusters (such as what genes are defining the individual clusters).

10X Genomics has some very handy tutorials for learning how Cell Ranger and Loupe Browser work, complete with example data sets for you to practice with.

Please understand: GEC staff are not trained bioinformaticians, and we are not involved in the downstream sequencing data analysis. We can discuss some of the QC aspects of the results and offer our interpretation, but we strongly encourage you to take advantage of the UWBC Bioinformatics Resource Core’s analysis service for the initial Cell Ranger pipeline, even if you or a colleague are familiar with these data sets. In addition to the Cell Ranger service being relatively inexpensive compared to the rest of the costs for the project, the BRC’s knowledge and experience will be of much more use to you if the analysis requires any troubleshooting with libraries from more challenging samples.