Provisional Spring 2026 issue. Scores are based on public information, pending reader-supported independent testing.
genomereviews
Volume IX · Issue 4 · Spring 2026 · Independent · Reader-supported · No affiliate links · Saturday, April 18, 2026
The field guide

Your genome is 3.2 billion letters long. Here is how a lab reads them.

Whole genome sequencing is less one technology than a chain of them — each handing off to the next, each with its own failure modes. We explain them in the order your sample moves through the lab, with diagrams you can poke.

First, a distinction

WGS is not 23andMe.

A whole genome sequence (WGS) reads (nearly) every position in your DNA. A genotyping array, the method used by 23andMe, AncestryDNA and most ancestry services, reads ~600,000 pre-selected positions known to vary between people.

Arrays are cheap because they ignore 99.98% of your genome. WGS costs roughly 3–8× more (about $250–600 versus about $80 in 2026) because it doesn't. The consequences are cumulative: WGS can detect rare or novel variants, structural changes, and anything the array designer didn't think to include.

Figure: Positions read (scaled)
Genotyping array · 600k sites · ≈ 0.02% of the genome
WGS · 3.2B bases · ≈ 98% of the genome (mappable regions)

Array cost: ~$80
WGS cost, 2026: ~$250–600

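The percentages above are simple division, and it is worth seeing how small the array's share really is. A minimal sketch using this article's round figures (600,000 array sites, 3.2 billion bases); the function and variable names are illustrative, not from any lab's software:

```python
# Round figures taken from the text above.
GENOME_BASES = 3_200_000_000   # approximate length of the human genome
ARRAY_SITES = 600_000          # positions read by a typical genotyping array


def fraction_read(sites: int, genome: int = GENOME_BASES) -> float:
    """Return the share of the genome covered by `sites` positions."""
    return sites / genome


array_share = fraction_read(ARRAY_SITES)
print(f"Array reads   {array_share:.4%} of the genome")
print(f"Array ignores {1 - array_share:.2%} of the genome")
```

The second line is where the 99.98% figure comes from.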
The pipeline

From spit tube to VCF, stage by stage.

Each stage takes 1–10 days in a commercial lab, and the queue for sequencing itself is the longest.

The two technologies

Short-read vs long-read.

Short-read sequencers (Illumina) read DNA in ~150-letter chunks, cheaply and with a per-base error rate of about 0.1%. Long-read sequencers (PacBio, Oxford Nanopore) read 10,000+ letters at a time, with higher error rates (1–5%) and higher prices.

The trade-off matters in repetitive regions: stretches of DNA where short reads can't tell this copy from that copy. About 8% of the human genome falls into this category. If a variant that matters to your family history sits in one of those regions, only long-read sequencing resolves it.
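To see why a short read gets lost in a repeat, here is a toy sketch. It uses exact string matching on a made-up 40-letter reference; real aligners use indexed, mismatch-tolerant search, so treat this as an illustration of the ambiguity and not of how any aligner works. All sequences and names are invented for the example.

```python
def find_all(reference: str, read: str) -> list[int]:
    """Return every 0-based position where `read` matches `reference` exactly."""
    positions = []
    start = reference.find(read)
    while start != -1:
        positions.append(start)
        start = reference.find(read, start + 1)  # allow overlapping matches
    return positions


# Toy reference: two unique flanks around three copies of the same repeat unit.
REPEAT = "ACGTTGCA"
reference = "GGATCCTA" + REPEAT * 3 + "TTCAGGAC"

short_read = REPEAT                        # lies entirely inside the repeat
long_read = "CCTA" + REPEAT * 3 + "TTCA"   # spans the repeat and both flanks

print(find_all(reference, short_read))  # [8, 16, 24]: three equally good placements
print(find_all(reference, long_read))   # [4]: a single placement
```

The short read fits three places equally well, so a variant inside it cannot be assigned to a particular copy. The long read reaches unique sequence on both sides and lands in exactly one place.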

Figure: Reading a 3,000-base region
Short-read · 150 bases
Long-read · 15,000 bases

Error rate: 0.1% short-read vs 1–5% long-read
Cost / genome: ~$200 short-read vs ~$1,000 long-read

Coverage, explained

What “30×” actually means.

“30×” means each position in your genome is read, on average, 30 times. It's an average because the machine doesn't distribute reads evenly — some regions get 50×, some get 10×, some get 0×.
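The average itself is just total bases sequenced divided by genome length. The sketch below works that out for 150-base reads and then asks how uneven depth would be if reads landed purely at random (a Poisson model). That model is an assumption made here for illustration: real coverage is lumpier than Poisson, which is how some regions end up at 0×. The function names are hypothetical.

```python
import math

GENOME_BASES = 3_200_000_000  # approximate genome length used in this article
READ_LENGTH = 150             # typical short-read length


def mean_coverage(n_reads: int, read_length: int = READ_LENGTH,
                  genome: int = GENOME_BASES) -> float:
    """Average depth = total bases sequenced / genome length."""
    return n_reads * read_length / genome


def reads_needed(target_depth: float, read_length: int = READ_LENGTH,
                 genome: int = GENOME_BASES) -> int:
    """Number of reads required to reach `target_depth` on average."""
    return math.ceil(target_depth * genome / read_length)


def poisson_prob_at_most(k: int, mean_depth: float) -> float:
    """P(depth <= k) if reads landed uniformly at random (idealized model)."""
    return sum(math.exp(-mean_depth) * mean_depth**i / math.factorial(i)
               for i in range(k + 1))


n = reads_needed(30)
print(f"{n:,} reads of {READ_LENGTH} bases give about {mean_coverage(n):.1f}x")
print(f"Idealized share of positions at 10x or below: "
      f"{poisson_prob_at_most(10, 30):.6f}")
```

Under these assumptions a 30× genome takes 640 million short reads. Even in the idealized model a small share of positions falls well below the average.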

The deeper the coverage, the more confident the variant caller can be that a difference is real and not a sequencing error. 30× is the consumer standard. 100× is used for clinical variant confirmation. 500× is the floor for some cancer assays.

Figure: Confidence at depth
30% · Ancestry-only
55% · Trait survey
15× · 82% · Most SNPs confident
30× · 96% · Consumer standard
60× · 99% · Clinical confirmation
100× · 99.7% · Variant discovery

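Why does depth buy confidence? One way to build intuition is a coin-flip model. At a heterozygous site each read shows the variant about half the time, while at an unchanged site a read shows a different letter only through error. The sketch below assumes a hypothetical calling rule (at least 3 supporting reads) and the 0.1% short-read error rate quoted earlier. It is not how any particular variant caller works, and it is not meant to reproduce the percentages in the figure above.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


MIN_ALT_READS = 3    # hypothetical rule: need 3+ reads showing the variant
ERROR_RATE = 0.001   # 0.1% per-base error, the short-read figure in this article

for depth in (5, 15, 30, 60, 100):
    # Chance a true heterozygous variant collects enough supporting reads.
    detect = prob_at_least(MIN_ALT_READS, depth, 0.5)
    # Chance that sequencing errors alone produce that many odd reads.
    false_alarm = prob_at_least(MIN_ALT_READS, depth, ERROR_RATE)
    print(f"{depth:>3}x  detect={detect:.4f}  false_alarm={false_alarm:.2e}")
```

At 5× the variant is caught only half the time, and by 30× a miss is vanishingly rare. The fixed threshold is a simplification: real callers also weigh base quality and the fraction of reads supporting each allele.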
Glossary

Words the industry uses without defining.

ACMG
American College of Medical Genetics and Genomics. ACMG SF (Secondary Findings) v3.3 is the standard list of medically actionable genes labs are expected to report back on when they turn up.
BAM
Binary Alignment Map: sequencing reads aligned to a reference genome, stored in compressed binary form. The typical intermediate file.
CLIA
US regulatory certification for clinical laboratories. A marker of clinical-grade testing.
coverage
How many times, on average, each base is read. 30× is the consumer standard; 100× is used for hard-to-call variants and some cancer assays; 1× (low-pass) is suitable for genealogy but not for clinical variant calls.
CRAM
Newer compressed alignment format — stores only differences from the reference. ~4–10× smaller than BAM for the same data. MyHeritage's LP-WGS ships CRAM files.
FASTQ
Raw sequencing read file: each read's bases plus a quality score for every base. The unprocessed output of the sequencer.
ISO 15189
International medical-laboratory quality standard. Some labs (Dante Labs among them) cite ISO 15189 accreditation as a clinical-quality signal.
long-read
Sequencing method (e.g. PacBio, Oxford Nanopore) producing reads of 10,000+ bases. Better for structural variants; more expensive; higher per-read error rates.
LP-WGS
Low-pass whole genome sequencing — WGS at ~1× average depth. Covers the whole genome but with much lower per-position confidence; used today for cost-efficient genealogy applications, not for clinical variant calling.
pharmacogenomics
How your genetics influence your response to specific drugs — one of WGS's most actionable outputs.
PRS
Polygenic Risk Score — a statistical estimate of disease risk summed across many small-effect variants. Useful as a probability estimate; not a medical diagnosis.
reference genome
A consensus human genome (e.g. GRCh38, or the newer T2T-CHM13) that your reads are compared against.
short-read
Sequencing method (e.g. Illumina) that reads DNA in fragments of ~150 bases. Cheap, accurate, but weak in repetitive regions.
variant
A position where your genome differs from the reference. Most are harmless; a few are medically meaningful.
VCF
Variant Call Format — the compact list of positions where your genome differs from the reference.
WGS
Whole Genome Sequencing — reading (nearly) all 3 billion base pairs of your DNA, as opposed to genotyping arrays which sample ~600,000 known positions.
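Two of the file formats above are plain text and easy to look inside. A minimal sketch, assuming the common layouts: a FASTQ record is four lines with qualities in Phred+33 encoding, and a VCF data line is tab-separated with eight fixed columns. The sample records are made up, and the helper names are hypothetical; for real files a maintained parser is the safer choice.

```python
def parse_fastq_record(lines: list[str]) -> dict:
    """Parse one 4-line FASTQ record; qualities assumed Phred+33 encoded."""
    header, sequence, separator, quality = (line.rstrip("\n") for line in lines)
    if not header.startswith("@") or not separator.startswith("+"):
        raise ValueError("not a FASTQ record")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality lengths differ")
    phred = [ord(ch) - 33 for ch in quality]  # one quality score per base
    return {"id": header[1:], "sequence": sequence, "phred": phred}


def parse_vcf_line(line: str) -> dict:
    """Parse the eight fixed columns of one VCF data line (tab-separated)."""
    chrom, pos, vid, ref, alt, qual, filt, info = line.rstrip("\n").split("\t")[:8]
    return {"chrom": chrom, "pos": int(pos), "id": vid, "ref": ref,
            "alt": alt.split(","), "qual": qual, "filter": filt, "info": info}


# Made-up records for illustration only.
fastq = ["@read_001\n", "GATTACA\n", "+\n", "IIIIII5\n"]
vcf = "chr1\t12345\t.\tA\tG\t50\tPASS\tDP=30\n"

read = parse_fastq_record(fastq)
variant = parse_vcf_line(vcf)
print(read["id"], read["sequence"], read["phred"])
print(variant["chrom"], variant["pos"], variant["ref"], variant["alt"])
```

A Phred score Q corresponds to an error probability of 10^(-Q/10), so the final base in the sample read (Q20) is a 1-in-100 guess while the others (Q40) are 1-in-10,000.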
© 2026 Genome Reviews · ISSN 2769-4102