Correction: Initial sequencing and analysis of the human genome

doi:10.1038/35087627

Correction
Open access
Published: 01 August 2001

International Human Genome Sequencing Consortium

Nature volume 412, pages 565–566 (2001)

The Original Article was published on 01 February 2001

Nature 409, 860–921 (2001).

We have identified several items requiring correction or clarification in our paper on the sequencing of the human genome.

• Six additional authors should have been included: Pieter de Jong, Joseph J. Catanese, and Kazutoyo Osoegawa (Department of Cancer Genetics, Roswell Park Cancer Institute, Buffalo, New York 14263, USA; present address: Children's Hospital Oakland Research Institute, 747 52nd Street, Oakland, California 94609, USA) and Hiroaki Shizuya, Sangdun Choi and Yu-Juin Chen (Division of Biology, California Institute of Technology, Pasadena, California 91125, USA). These investigators and their laboratories constructed the high-quality BAC libraries that were crucial in sequencing the genome, as described in Table 1. These libraries were not previously published. We apologize to our colleagues for this omission.

• The Supplementary Information on Nature's website has been revised. Changes to the original Supplementary Information are available in the Supplementary Information to this Correction. We have added 7 additional investigators to the full list of authors. We have also added 79 additional references, citing previously published sequences that were included in the draft genome sequence.

• Table 27 reported 18 instances of apparently novel paralogues of genes encoding drug targets. We have carefully reviewed these 18 cases and found that two are incorrect: a paralogue of an insulin-like growth factor-1 receptor gene and a paralogue of the calcitonin-related polypeptide alpha gene. In both cases, we had incorrectly recorded the chromosomal location sequence of the known gene, thereby erroneously giving rise to an apparent paralogue (the first instance was identified by J. Englebrecht and C. Kristensen (personal communication)). Of the 16 remaining apparent paralogues, two (calcium channel paralogue IGI_M1_ctg17137_10 and heparan N-deacetylase/N-sulphotransferase paralogue IGI_M1_ctg13263_18) have so far been confirmed as bona fide genes¹,².

• Several correspondents have written to point out that a handful of clones listed as human sequence in the HTG division of GenBank (established to house ‘unfinished’ sequence data) are actually mouse sequence (about two dozen out of 30,000 clones). They asked whether these clones give rise to contamination in the human draft sequence. As noted in the paper, we used computer programs to identify and eliminate instances of such contamination (with mouse sequence, vector sequence, and so on) before assembling the draft genome sequence. In reviewing the work, we identified one mouse clone that slipped through the filter. This clone has been eliminated in subsequent assemblies (http://genome.cse.ucsc.edu/). Because the draft sequence remains an imperfect partial product, we welcome additional comments that could help in improving it.

• The discussion of possible horizontal gene transfer from bacterial genomes to vertebrate genomes has provoked considerable debate³,⁴,⁵. We reported 113 instances of human genes that had reasonably close homologues in bacteria, but either had no homologue or only a weaker homologue in non-vertebrate eukaryotes for which extensive genomic sequence was available. We suggested two hypotheses to explain these data: horizontal gene transfer (HGT) from bacteria to human or gene loss in the other lineages. We had no data to distinguish between these hypotheses, although we suggested that the former was a more “parsimonious” explanation as it involved fewer independent events. In the introduction we stated that this seemed “likely”.

Several correspondents have undertaken a more comprehensive analysis and have argued that a significant proportion of the cases can be explained by gene loss³,⁴,⁵. We agree. We believe that the two hypotheses cannot be distinguished on the basis of parsimony, because too little is known about the relative rates of HGT and gene loss in evolution. Instead, extensive sequence data from many additional organisms will be required to assess definitively the provenance of each gene.

We note that the process of HGT into the vertebrate genome from other organisms has clearly occurred on multiple occasions, as seen from the sudden arrival of many DNA transposons with strong similarities to other organisms. The most recent documented cases occurred subsequent to the eutherian radiation (see Fig. 19).

• A key reference concerning 3′-transduction by LINE elements was omitted on page 887. The sentence citing references 205 and 206 should also have cited Goodier et al.⁶.

• In Fig. 33, the unit on the y axis should be bp, not kb. The legend should read: “Sequence properties of segmental duplications. Distribution of length and per cent nucleotide identity are shown as a function of the number of aligned bp from the finished vs finished human genomic sequence dataset. Intrachromosomal (blue), interchromosomal (red).”

• In Fig. 41, the legend should begin: “For each of the 27 common domain families, the number of different Pfam domain types that co-occur with the family in each of the five eukaryotic proteomes. The 27 families were chosen to include the 10 most common domain families in each proteome. The data are ranked …”

• In Table 22, the entry 81,126 should be 8,126.

• On page 898, line 31, the final phrase of the sentence (“… and the representativeness of currently ‘known’ human genes”) should be deleted. The sentence should read: “Before discussing the gene predictions for the human genome, it is useful to consider background issues, including previous estimates of the number of human genes and lessons learned from worms and flies”.

• On page 900, line 38, remove “(see above)”.

• We failed to acknowledge the crucial role of sequence editing software, which has been widely used for inspection and subsequent finishing of the sequence assemblies. The two principal programs used were CONSED⁷ and GAP4⁸.

References

1. Burgess, D. L. et al. A cluster of three novel Ca²⁺ channel gamma subunit genes on chromosome 19q13.4: Evolution and expression profile of the gamma subunit gene family. Genomics 71, 339–350 (2001).
2. Aikawa, J. et al. Multiple isozymes of heparan sulfate/heparin GlcNAc N-deacetylase/GlcN N-sulfotransferase. Structure and activity of the fourth member, NDST4. J. Biol. Chem. 276, 5876–5882 (2001).
3. Salzberg, S. L. et al. Microbial genes in the human genome: Lateral transfer or gene loss? Science 292, 1903–1906 (2001).
4. Stanhope, M. J. et al. Phylogenetic analyses do not support horizontal gene transfers from bacteria to vertebrates. Nature 411, 940–944 (2001).
5. Roelofs, J. & Van Haastert, P. J. M. Genes lost during evolution. Nature 411, 1013–1014 (2001).
6. Goodier, J. L., Ostertag, E. M. & Kazazian, H. H. Transduction of 3′-flanking sequences is common in L1 retrotransposition. Hum. Mol. Genet. 9, 653–657 (2000).
7. Gordon, D., Abajian, C. & Green, P. Consed: a graphical tool for sequence finishing. Genome Res. 8, 195–202 (1998).
8. Staden, R., Beal, K. F. & Bonfield, J. K. The Staden package, 1998. Methods Mol. Biol. 132, 115–130 (2000).

Additional information

The online version of the original article can be found at 10.1038/35057062

Supplementary information

Revised Supplementary Information

Rights and permissions

This article is distributed under the terms of the Creative Commons Attribution-Non-Commercial-Share Alike licence (http://creativecommons.org/licenses/by-nc-sa/3.0/).

About this article

Cite this article

International Human Genome Sequencing Consortium. Correction: Initial sequencing and analysis of the human genome. Nature 412, 565–566 (2001). https://doi.org/10.1038/35087627

Published: 01 August 2001
Issue Date: 02 August 2001
DOI: https://doi.org/10.1038/35087627
