Nanopore sequencing accuracy
For many years, Oxford Nanopore has continuously iterated on our technology to improve its performance. We continue to improve the nanopore sensing system through updates to analytical methods and new chemistries. This page explains what to expect from the nanopore sequencing system and which tools to choose to achieve these results.
What is sequencing accuracy?
Accuracy is a generic term that might refer to different aspects of DNA and RNA sequencing performance. Typically, it refers to the accuracy at a single read level or at the consensus level, combining the information from multiple reads of a DNA/RNA region into a single high-quality sequence. Depending on the application, other relevant factors to consider are the proportion of the genome covered and the ability to detect epigenetic modifications. Usually, genomic research focuses either on resequencing and mapping to a reference genome or reconstructing unknown genomes through de novo assembly. For mapping-based projects, changes compared with the reference sequences are used for inference, hence variant calling becomes the main focus. For de novo assembly, quality is estimated by the accuracy of the reconstructed sequence and other metrics such as N50.
Variant calling accuracy
Variant calling identifies differences from a reference sequence and is crucial in understanding how genotypes drive phenotypes. Nanopore technology can sequence any length of DNA and RNA molecule, offering unprecedented resolution of complex structural variants and efficient haplotype phasing of variants.
Measuring the accuracy of variant calling is critical to ensure that the genetic variants identified are biological differences and not artefacts. Accuracy is commonly measured with the F1 score, the harmonic mean of precision (the proportion of called variants that are actually variants) and sensitivity, or recall (the proportion of all true variants that are correctly called). This metric is especially useful when you want to balance the trade-off between identifying as many variants as possible (high sensitivity) and ensuring that the variants identified are truly variants (high precision).
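As a minimal sketch of the definition above, the F1 score can be computed from counts of true positives, false positives and false negatives. The function name and the counts in the example are illustrative assumptions, not figures from this page.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Harmonic mean of precision and recall for a set of variant calls."""
    called = true_pos + false_pos   # all variants the caller reported
    actual = true_pos + false_neg   # all variants in the truth set
    if called == 0 or actual == 0 or true_pos == 0:
        return 0.0                  # no correct calls: F1 is defined as 0 here
    precision = true_pos / called   # called variants that are real
    recall = true_pos / actual      # real variants that were called
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 90 correct calls, 10 false calls, 30 missed variants.
# Precision is 0.90 and recall is 0.75.
print(f"{f1_score(90, 10, 30):.3f}")
```

Because the harmonic mean is pulled towards the lower of the two values, a caller cannot reach a high F1 score by maximising precision or recall alone.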
Learn more about accuracy measures.
Read more about structural variation and small variant calling & phasing.
Nanopore sequencing achieves:
Base modification accuracy
The four DNA bases (A, C, G, T) and RNA bases (A, C, G, U) can undergo biological modifications like methylation, impacting gene expression and contributing to diseases such as cancer. Oxford Nanopore’s technology allows for direct, real-time sequencing and detection of these modifications for both DNA and RNA (e.g. 5mC, 5hmC, 6mA, 4mC for DNA, and m6A for RNA) without additional experiments or preparation, unlike legacy methods, such as bisulphite sequencing, that have several limitations.
Read more about direct detection of DNA and RNA base modifications.
Assembly accuracy
Assembly accuracy refers to the degree to which a reconstructed sequence of DNA or RNA matches the true biological sequence from which it was derived. This involves building a consensus sequence from multiple DNA/RNA reads, enhancing accuracy and creating a reliable sequence for further analysis.
Find out more about assembly & whole-genome sequencing.
Nanopore sequencing achieves:
Flow cell | Library preparation kit | Assembly accuracy | Sequencing & basecalling parameters | Analysis tools | Sample |
---|---|---|---|---|---|
PromethION R10.4.1 | Ligation Sequencing Kit V14, Ultra-long Sequencing Kit V14 | Telomere-to-telomere (T2T): Q42*, 18 full chromosome haplotype-resolved, N50 >135 Mb | 400 bps, 5 kHz, simplex SUP, duplex | Assembly with Verkko, phasing with Gfase | Human HG002 |
MinION R10.4.1 | Ligation Sequencing Kit V14 | Q50 at 10–20x | 400 bps, 4 kHz, simplex SUP | Assembly with Flye | Zymo mock community (bacterial) |
*Generated by combining approx. 40x duplex, 40x ultra-long and 40x Pore-C
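N50, quoted in the table above, is the contig length at which contigs of that length or longer together contain at least half of all assembled bases. The sketch below shows one common way to compute it. The function name and the contig lengths are made up for illustration and are not data from this page.

```python
def n50(contig_lengths: list[int]) -> int:
    """Length L such that contigs of length >= L hold at least half of all bases."""
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down, accumulating bases.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:    # reached half of the assembly
            return length
    return 0  # unreachable for non-empty input; kept for completeness


# Toy assembly of eight contigs totalling 32 bases: the two longest
# contigs (8 + 8) already cover half of it.
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

A higher N50 indicates a more contiguous assembly, but it says nothing about per-base correctness, which is why it is reported alongside the Q score.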
Covering all of the genome
To create an accurate picture of the genome, it is important for a sequencing technology to reach all parts of it, even the parts which are difficult to map. Genomes are littered with repetitive and low-complexity regions, which are difficult to sequence and align using legacy technologies. For example, it is estimated that short-read technology reaches only 92% of the human genome, leaving 8% that contains many disease-relevant genes excluded from the dataset.
Nanopore technology has been shown to reduce these ‘dark’ areas of the genome by 81%, shedding light on parts of the genome not sequenced by any other technology (Ebbert et al., 2019), and giving a more complete picture. The extensive genome mapping capabilities of nanopore data manage to achieve 99.49% genome coverage (Uddin et al., 2024). Ultra-long nanopore sequencing reads were central to completing the human genome, resolving repetitive regions that were unattainable with other technologies (Nurk et al., 2022).
Raw read and single molecule accuracy
Nanopore sequencing uses direct electronic analysis of native DNA and RNA molecules to generate raw reads, eliminating PCR bias. Basecalling algorithms based on machine learning have improved over time, yielding increasingly accurate reads. Raw read accuracy is the accuracy achieved when a single DNA or RNA strand is read once. Most applications focus on variant calling, consensus accuracy, or other metrics in which the information from several reads is combined. These metrics benefit from higher raw read accuracy but can also be improved in other ways (e.g. increased genome coverage).
Duplex sequencing suits applications in which single-molecule sequencing is relevant; by reading both DNA strands, the reads generated achieve high single-molecule accuracy. Our latest Q20+ chemistry enables the generation of duplex reads: the second strand can follow the first through the same nanopore, producing information from two orthogonal signals that are merged into one consensus sequence. The single-molecule accuracy of duplex reads is ~Q30 or higher. A specific basecaller for nanopore duplex reads is available.
Nanopore sequencing achieves:
Flow cell | Library preparation kit | Sequencing & basecalling parameters | Sample | Accuracy | Output |
---|---|---|---|---|---|
PromethION R10.4.1 | Ligation Sequencing Kit V14 | 400 bps, 5 kHz, HAC basecalling | Human HG002 | 99.0% (Q20) | ●●● |
PromethION R10.4.1 | Ligation Sequencing Kit V14 | 400 bps, 5 kHz, SUP basecalling | Human HG002 | 99.5% (Q23) | ●●● |
PromethION R10.4.1 | Ligation Sequencing Kit V14 | 400 bps, 5 kHz, Duplex basecalling | Human HG002 | >99.9% (Q30) | ● |
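The Q values in the table are Phred-scaled quality scores, which relate to accuracy by Q = -10 · log10(1 - accuracy). A short sketch of the conversion in both directions follows. The function names are illustrative assumptions.

```python
import math


def accuracy_to_q(accuracy: float) -> float:
    """Convert a fractional accuracy (0 <= accuracy < 1) to a Phred Q score."""
    if not 0.0 <= accuracy < 1.0:
        raise ValueError("accuracy must be in [0, 1)")
    return -10.0 * math.log10(1.0 - accuracy)


def q_to_accuracy(q: float) -> float:
    """Convert a Phred Q score to a fractional accuracy."""
    return 1.0 - 10.0 ** (-q / 10.0)


# Accuracies quoted in the table above, converted to Q scores.
for acc in (0.990, 0.995, 0.999):
    print(f"{acc:.1%} -> Q{accuracy_to_q(acc):.1f}")
```

Each step of 10 in Q corresponds to a tenfold reduction in error rate, so Q30 means roughly one error per 1,000 bases compared with one per 100 at Q20.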
Tuning accuracy for your experimental need
Optimise accuracy according to your requirements by selecting simplex or duplex reads and the most suitable basecalling model. Simplex reads are generated by reading a single DNA/RNA strand through a nanopore and achieve Q20 accuracy; their accuracy can be fine-tuned with the following basecalling models:
- Fast basecalling: fastest and least computationally intense. Highest compatibility with real-time basecalling on device.
- High accuracy basecalling (HAC): highly accurate, with intermediate speed and computational requirements. Good compatibility with real-time basecalling on device.
- Super accuracy basecalling (SUP): the most accurate and the most computationally intense.
Duplex reads achieve Q30 accuracy by combining the information from both DNA strands. A specific basecaller for nanopore duplex reads is available, with computational requirements similar to those of the SUP basecalling model.
Available datasets
Oxford Nanopore Technologies provides open access to a range of nanopore sequencing datasets through its initiative hosted on Amazon Web Services (AWS), called ‘ont-open-data’. This initiative allows researchers worldwide to explore and utilise extensive sequencing data to enhance their genomic studies. For example, the dataset for the human genome sample GM24385 (HG002) is one of the available resources, which has been utilised in numerous research applications, reflecting Oxford Nanopore's commitment to supporting the scientific community by providing freely accessible, high-quality data.