Complete Mitochondrial DNA Sequences

The revised Cambridge Reference Sequence (rCRS) is GenBank number NC_012920.
Please use this new number when citing the rCRS in publications. The rCRS is a reference sequence, not a "consensus" sequence. It is a single reference individual from haplogroup H2a2 and has been used as a standard for reporting variants for over 30 years. See Bandelt's new 2013 rCRS review, link below.

View MITOMAP's fully annotated rCRS sequence here.

The Cambridge Reference Sequence, revised & original:

Revised Cambridge Reference Sequence ("rCRS")
  GenBank #: NC_012920
  Fasta format: rCRS-fasta
  Article links: Bandelt's 2013 rCRS review (PDF); Andrews et al 1999 (PubMed)

  The rCRS is available as sequence number NC_012920 (formerly AC_000021.2) in GenBank's RefSeq database. This specific rCRS is the most commonly used and standard comparison sequence for human mtDNA research. It is 16569 bp in length, which includes one spacer at position 3107 to preserve the historical CRS position numbering. For publications, please cite NC_012920 as the rCRS.

  A duplicate of the rCRS exists as J01415.2. This is a fully corrected update of the original Cambridge sequence and is identical to NC_012920.

Original Cambridge Reference Sequence ("CRS"): J01415 (no longer available)
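The effect of the spacer on position numbering can be sketched in a few lines of Python. This is a minimal illustration, not MITOMAP code: the function names `real_base_count` and `to_ungapped_index` are hypothetical, and the only facts taken from the text are the 16569 bp length and the spacer at position 3107.

```python
# Minimal sketch: how a spacer position affects rCRS numbering.
# Names below are hypothetical; only the length (16569) and the spacer
# position (3107) come from the text above.

RCRS_LENGTH = 16569
RCRS_SPACERS = (3107,)


def real_base_count(length=RCRS_LENGTH, spacers=RCRS_SPACERS):
    """Number of actual nucleotides once spacer positions are excluded."""
    return length - len(spacers)


def to_ungapped_index(pos, length=RCRS_LENGTH, spacers=RCRS_SPACERS):
    """Map a 1-based reference position to a 0-based index in the
    spacer-free sequence. Returns None if pos is itself a spacer."""
    if not 1 <= pos <= length:
        raise ValueError(f"position {pos} is outside 1..{length}")
    if pos in spacers:
        return None
    # Shift left by one for every spacer that precedes this position.
    return pos - 1 - sum(1 for s in spacers if s < pos)


print(real_base_count())        # actual nucleotides in the rCRS
print(to_ungapped_index(3106))  # last base before the spacer
print(to_ungapped_index(3107))  # the spacer itself
print(to_ungapped_index(3108))  # first base after the spacer
```

Positions before 3107 map directly (position minus one), while every position after it shifts by one, which is why tools that strip the spacer report coordinates that are off by one beyond that point.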

Other comparison hmtDNAs in GenBank & elsewhere:

  • Root Sequence "RSRS" of Behar et al, 2012. This is an inferred sequence between haplogroups L0 and L1′2′3′4′5′6, used by many for better rooting of phylogenetic trees. It is available on the Phylotree web site as a fasta file (not in GenBank).
    SNP differences in the RSRS from the rCRS are listed here (rCRS on the left, RSRS on the right): A73G, T146C, T152C, T195C, G247A, A263G, A523d, C524d, A750G, G769A, T825A, G1018A, A1438G, A2706G, G2758A, T2885C, C3594T, A4104G, C4312T, A4769G, C7028T, A7146G, C7256T, G7521A, C8468T, C8655T, A8701G, A8860G, T9540C, A10398G, C10664T, G10688A, T10810C, T10873C, T10915C, G11719A, G11914A, C12705T, A13105G, A13276G, C13506T, C13650T, C14766T, A15326G, G16129A, C16187T, T16189C, C16223T, A16230G, C16278T, T16311C, T16519C. See also Phylotree's summary table of differences. The RSRS sequence is 16569 bp in length, which includes three spacers (positions 523, 524 and 3107) to preserve the historical CRS position numbering.
  • African (Yoruba) Sequence AF347015, formerly NC_001807.4. This L3e sequence is 16571 bp in length and has over 40 nucleotide differences from the rCRS. Some sequencing chips (for example, Affymetrix Genome-Wide Human SNP Array 6.0; Illumina 550 v.1, 550 v.3, 610 v.1) have used nucleotide numbers based on the Yoruban sequence. To convert Yoruban-based position numbers to rCRS-based position numbers, follow the rules in the Yoruba Conversion table.
  • African (Uganda) Sequence D38112. This L0a sequence is 16559 bp in length and has over 90 nucleotide differences from the rCRS.
  • Swedish Sequence X93334. This U5a sequence is 16570 bp in length and has over 30 nucleotide differences from the rCRS.
  • Japanese Sequence AB055387. This B5b sequence is 16554 bp in length and has over 50 nucleotide differences from the rCRS.
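The variant notation used in the RSRS difference list above (reference base, position, new base, with "d" marking a deletion) can be parsed mechanically. The following Python sketch is illustrative only: `Variant`, `parse_variant` and `apply_variants` are hypothetical names, and writing "N" at deleted positions is an assumption made here so that the numbering stays aligned, mirroring the spacers described in the text.

```python
import re
from typing import Iterable, NamedTuple


class Variant(NamedTuple):
    ref: str  # base in the reference (left-hand side of the notation)
    pos: int  # 1-based reference position
    alt: str  # replacement base, or "d" for a deletion


_VARIANT_RE = re.compile(r"^([ACGT])(\d+)([ACGTd])$")


def parse_variant(token: str) -> Variant:
    """Parse a token such as 'A73G' or 'A523d'."""
    match = _VARIANT_RE.match(token.strip())
    if match is None:
        raise ValueError(f"unrecognised variant: {token!r}")
    return Variant(match.group(1), int(match.group(2)), match.group(3))


def apply_variants(reference: str, variants: Iterable[Variant], gap: str = "N") -> str:
    """Apply substitutions and deletions to a reference string.

    Deleted positions are overwritten with `gap` rather than removed, so the
    result keeps the reference numbering (an assumption of this sketch).
    """
    bases = list(reference)
    for v in variants:
        idx = v.pos - 1
        if bases[idx] != v.ref:
            raise ValueError(f"expected {v.ref} at {v.pos}, found {bases[idx]}")
        bases[idx] = gap if v.alt == "d" else v.alt
    return "".join(bases)


# The first few differences from the list above.
tokens = "A73G, T146C, T152C, T195C, G247A, A263G, A523d, C524d".split(", ")
parsed = [parse_variant(t) for t in tokens]
print([v.pos for v in parsed if v.alt == "d"])  # positions recorded as deletions

# Toy reference (not real mtDNA) to show the mechanics.
print(apply_variants("ACGTA", [parse_variant("A1G"), parse_variant("G3d")]))
```

Checking the reference base before each change catches lists that were written against a different reference, such as Yoruban-based numbering.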

Other mtDNA sequences: Mitobank

To find >41,000 complete human mtDNAs in GenBank: execute search.
*Includes sequences that have a complete or near-complete coding region but may lack the control region (15400 nucleotides minimum).

To find >69,000 human Control Region sequences in GenBank: execute search.
*Includes sequences that contain all or part of the control region and are 400-1600 nucleotides in length.
Partial and full sequences are also available for Homo sapiens neanderthalensis, Homo heidelbergensis and Homo sp. Altai mtDNA.
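The length thresholds quoted in the two search notes above can be expressed as a simple pre-filter. This Python sketch is an illustration under stated assumptions: `classify_by_length` and its category labels are hypothetical, and length alone cannot confirm which region a record actually covers, so it only approximates the searches described.

```python
# Length thresholds taken from the search notes above.
COMPLETE_MIN = 15400                # minimum length for "complete" genomes
CONTROL_REGION_RANGE = (400, 1600)  # inclusive range for control-region records


def classify_by_length(n_bases: int) -> str:
    """Assign a rough category to a human mtDNA record by its length.

    The labels are hypothetical names used only in this sketch.
    """
    if n_bases < 0:
        raise ValueError("sequence length cannot be negative")
    if n_bases >= COMPLETE_MIN:
        return "complete-or-near-complete"
    low, high = CONTROL_REGION_RANGE
    if low <= n_bases <= high:
        return "control-region"
    return "other"


for length in (16569, 15400, 1122, 5000):
    print(length, classify_by_length(length))
```

A record that passes the length check would still need its annotation inspected to confirm that it covers the coding region or the control region.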

To find >26,000 complete non-human mtDNA genomes in GenBank: execute search.

Representative complete mitochondrial genomes of >8,000 different eukaryotic organisms are listed at NCBI.

