
This is an archive page. The links are no longer being updated.

Testimony on "The Human Genome Project: How Private Sector Developments Affect the Government Program" by Francis S. Collins, M.D., Ph.D.
Director, National Human Genome Research Institute
National Institutes of Health
U.S. Department of Health and Human Services

Before the House Committee on Science, Subcommittee on Energy and the Environment
June 17, 1998

I am Dr. Francis Collins, Director of the National Human Genome Research Institute (NHGRI) of the National Institutes of Health. I appreciate the opportunity to appear before the Subcommittee today to discuss the Human Genome Project and the implications of the recent announcement by a private company of their intentions to carry out large-scale sequencing of the human genome.

The NHGRI is one of the 22 Institutes and Centers that comprise the federation of federal research entities known as the National Institutes of Health (NIH). The vast majority of research dollars appropriated to the NIH flow out to the scientific community across the Nation, primarily in the form of peer-reviewed research grants. Today, that community numbers more than 50,000 investigators affiliated with nearly 2,000 universities, hospitals, and other research facilities located in all 50 states, the District of Columbia, Puerto Rico, Guam, the Virgin Islands, and certain points abroad.

The NHGRI is the lead Institute at the NIH with responsibility for the Human Genome Project (HGP). The HGP officially began in October of 1990 as a 15-year program to characterize in detail the complete set of human genetic instructions (the "genome"). The central aim of the project, which the federal government funds through programs at the NIH's National Human Genome Research Institute and the Department of Energy, is to arm health researchers with powerful gene-finding and DNA analysis tools to unravel and understand the myriad human diseases that have their roots in DNA. Now that the project is at its halfway mark, its tools have underpinned virtually all gene discoveries of this decade.

The Human Genome Project's success stems largely from a unique and rigorous planning process that sets ambitious research goals, time lines and budgets. The first joint NIH/DOE plan, which covered years 1991-1995, included goals for:

  • physical and genetic maps;
  • experimental DNA sequencing of the fruit fly, a roundworm, yeast, and the bacterium E. coli;
  • computer management of research data; and
  • studies of the ethical, legal, and social implications (ELSI) of these new abilities to read genetic information.

Because of the rapid pace of genome research and technology development, scientists met many of those initial goals ahead of schedule and under budget. So the research plan was updated again in 1993 to establish new NIH-DOE goals through 1998. All of these goals have now been met or exceeded. Original expectations were that the NIH cost of these activities from FY 1991-97 would exceed $1 billion in 1991 dollars. I am pleased to report that the cost has been about 25 percent less than that projection.

Gene Discovery

Today, with Human Genome Project tools, it is possible to track down a disease-related gene even when nothing is known about the biochemical problems of the disease or how the gene works. This technique, based on identifying the position of a gene in the chromosome and then isolating it, is commonly referred to as positional cloning and was successfully used for the first time in 1986. Now, the increasing detail and quality of genome maps have reduced the time it takes to find a disease gene from years, to months, to weeks, to sometimes just days, and scientists are using the tools to discover dozens of disease genes each year.

An Example - Parkinson's Disease

The isolation of a gene for Parkinson's disease (PD) last year demonstrated the power of this new discovery method and showed conclusively that changes in DNA can cause PD in some families. Only two years ago, the National Institute of Neurological Disorders and Stroke held a workshop to explore using genetic approaches to understand PD. A team led by scientists in NHGRI's Division of Intramural Research (DIR) began large-scale genetic analysis of DNA from members of a large Italian family containing almost 600 people, more than 60 of whom have been diagnosed with Parkinson's. In nine days, NHGRI gene hunters mapped the gene to a region of chromosome 4, which contained approximately 100 genes. One of the several genes in that interval had already been identified on the gene map and was known to encode a protein called alpha-synuclein.

In just a few months, the researchers showed conclusively that an altered alpha-synuclein gene caused Parkinson's disease in the study families. Many have hailed this as the most significant advance in Parkinson's disease research in 30 years. Just last month, a Japanese research team used genome mapping tools to isolate another gene, this time on chromosome 6, that, when altered, appears to predispose the individual to a rare juvenile form of Parkinson's disease.

Ethical, Legal, and Social Implications

NHGRI has established productive partnerships among consumers, scientists, and policy makers to help reduce the possibility that genetic information will be used to harm an individual or family members and ensure that it will be of benefit to both patients and providers. As an integral part of the Human Genome Project, the NHGRI and the DOE have each set aside a portion of their funding to anticipate, analyze, and address the ethical, legal, and social implications (ELSI) of the Project's new advances in human genetics. The current goals of the ELSI program are to improve the understanding of these issues through research and education, to stimulate informed public discussion, and to develop policy options intended to ensure that genetic information is used for the benefit of individuals and society. Because genetic information is personal, powerful, and potentially predictive, it can be used to stigmatize and discriminate against people. Genetic information must be private.

DNA Sequencing

If the letters representing the 3 billion bases in the human genome were printed out in books, and the books were stacked one on top of the other, they would reach as high as the Washington Monument. The current major goal of the Human Genome Project is to read the order, letter by letter, of those 3 billion bases.

Sequencing was once done by hand as a series of chemical reactions, a slow and costly method. In 1990, when the HGP began, the sequencing cost was $10/base. Now, because of public investment and collaboration with the private sector, machines read the sequence fragments quickly and efficiently. As a result, the sequencing cost has been dramatically reduced to roughly $0.50/base for high-quality "finished" sequence.
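As a rough illustration of what the per-base figures above imply, the short Python sketch below multiplies each quoted cost by the approximately 3 billion bases in the genome. The function name, and the simplification of paying the quoted price exactly once for every base, are assumptions made for illustration only; the results are back-of-the-envelope arithmetic, not budget figures from the Human Genome Project.

```python
# Figures quoted in the testimony (prices held in cents to keep the arithmetic exact).
GENOME_BASES = 3_000_000_000   # about 3 billion bases in the human genome
COST_1990_CENTS = 1000         # $10 per base when the HGP began in 1990
COST_1998_CENTS = 50           # roughly $0.50 per base for "finished" sequence

def naive_genome_cost_dollars(cents_per_base, bases=GENOME_BASES):
    """Hypothetical helper: cost in whole dollars of paying the per-base price once per base."""
    return cents_per_base * bases // 100

fold_reduction = COST_1990_CENTS // COST_1998_CENTS
cost_1990 = naive_genome_cost_dollars(COST_1990_CENTS)
cost_1998 = naive_genome_cost_dollars(COST_1998_CENTS)

print(f"Fold reduction in per-base cost: {fold_reduction}x")
print(f"Naive genome cost at the 1990 rate: ${cost_1990:,}")
print(f"Naive genome cost at the 1998 rate: ${cost_1998:,}")
```

Under these assumptions the per-base cost has fallen 20-fold, and the naive cost of one pass through the genome falls from $30 billion to $1.5 billion.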

Using a strategy referred to as "shotgun" sequencing, an investigator takes each page of those books stacked as tall as the Washington Monument, and randomly cuts the text into small fragments. These fragments are small enough for sequencing machines to read. To get long stretches of contiguous DNA, investigators must then reassemble these sequenced fragments back into sentences, paragraphs, chapters, and books. The reassembly of this puzzle is carried out largely by sophisticated computer programs.
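The cut-and-reassemble idea can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the software used by any sequencing center: the function names are hypothetical, the "genome" is an ordinary English sentence, the fragments are cut at regular overlapping intervals and then shuffled (rather than cut truly at random), and the sentence contains no repeated stretch of five or more characters, so a simple greedy merge on the longest overlap is enough to rebuild it.

```python
import random

def shotgun_fragments(text, length=13, step=5, seed=0):
    """Cut text into overlapping fragments, then shuffle so their positions are lost."""
    starts = range(0, len(text) - length + 1, step)
    fragments = [text[s:s + length] for s in starts]
    random.Random(seed).shuffle(fragments)
    return fragments

def overlap(a, b, min_overlap):
    """Length of the longest suffix of a that equals a prefix of b (0 if below min_overlap)."""
    for k in range(min(len(a), len(b)), min_overlap - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(fragments, min_overlap=5):
    """Repeatedly merge the pair of pieces with the longest overlap."""
    contigs = list(fragments)
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_overlap)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left to join: the remaining pieces are separated by gaps
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for n, c in enumerate(contigs) if n not in (best_i, best_j)]
        contigs.append(merged)
    return contigs

text = "the quick brown fox jumps over the lazy dog"
fragments = shotgun_fragments(text)
assembled = greedy_assemble(fragments)
print(fragments)
print(assembled)
```

With this sentence the shuffled fragments merge back into a single piece identical to the original. If the text contained long repeated passages, or if some stretch were not covered by any fragment, the same procedure could join the wrong pieces or stop with several disconnected pieces.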

The sequencing strategy the public genome project uses employs shotgun sequencing of DNA fragments that already have been carefully mapped and catalogued. This process makes reassembling the sequenced fragments into contiguous sequence easier because the origin of each fragment is known. In addition, scientists periodically encounter DNA fragments that are particularly difficult to sequence. To return to the analogy, it is much easier, takes less time, and is less costly to assemble the text in "finished" form if all the fragments are known to have come from the same chapter.

In 1996, NHGRI began pilot projects to test strategies and technologies for full-scale sequencing of the human genome. We now have undertaken human sequencing in earnest. As a result, investigators have deposited almost 150 million bases of "finished" high-quality human DNA sequence in GenBank, the publicly funded database supported by the National Library of Medicine. In accordance with the agreed-upon standards of the international genomic community, all NIH-DOE funded sequencers have agreed to a rapid data release policy under which new sequence data are submitted to publicly accessible data banks within 24 hours. If one includes "finished" and "close-to-finished" sequence, over 300 million bases, or 10 percent, of the human DNA sequence have been deposited in GenBank.

In order to meet the standards adopted by the international genomic community, the sequence produced must have four characteristics, the "4 A's" of the Human Genome Project:

  1. The sequence must be accurate, that is, the DNA spellings must be correct. The publicly funded genome effort will ensure accuracy of 99.99 percent or better.

  2. The sequence must be assembled. Large-scale sequencing relies on the accurate assembly of smaller lengths of sequenced DNA into longer, genomic-scale pieces, so DNA will be assembled into long pieces that reflect the original genomic DNA.

  3. Because human DNA sequence must also be affordable, a portion of our research funds focuses on technology development to reduce the cost as much as possible.

  4. Finally, high-quality, finished human DNA sequence must be accessible. In order to be useful, sequence data needs to be rapidly available to the entire research community.

Research Planning

Informed by a series of workshops over the past year that reviewed research progress and identified genome research opportunities, Human Genome Project leaders recently met with more than 100 representatives from a range of scientific disciplines to develop the next 5-year plan, scheduled to begin in the fall of 1998. With both the physical and genetic maps complete, and human DNA sequencing pilot projects underway, goals of the 1998-2003 draft plan considered at that meeting focused on:

  • completing a full, highly accurate and contiguous human genome DNA sequence;
  • further development of technologies for steadily increasing sequencing capacity and reducing costs;
  • studies of variations in human DNA;
  • studies of how large sets of genes function;
  • studies of the similarities and differences between the human genome and those of important laboratory animals;
  • improved computer methods for data management; and
  • studies regarding the ethical, legal and social implications of the HGP.

Private Sector Developments

Just prior to the HGP planning meeting, industry researchers from The Institute for Genomic Research (TIGR) and Perkin-Elmer, Inc. announced a plan to apply a DNA sequencing strategy they had used on micro-organisms to produce a "rough draft" of the human genome sequence. The sequencing strategy recently proposed by Perkin-Elmer, Inc. and TIGR differs from the public effort in two significant ways: quality and access.

First, that strategy, called "whole-genome shotgun sequencing", employs fragments that have not been mapped or catalogued prior to sequencing. Because scientists will not know where in the long chain of 3 billion base pairs the fragment might belong, the task of reassembling the fragments becomes far more difficult. This difficulty in reassembly inevitably will lead to gaps and misassemblies in the sequence. Some of these may occur in DNA regions with great biological significance. The private sector approach does not propose to fill in all the gaps left by these unsequenced fragments, thereby creating a product that will be incomplete for many research uses.
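A small Python sketch can illustrate why knowing a fragment's origin simplifies reassembly. Everything in it is invented for illustration: the two short "mapped regions", the repeated stretch they share, and the helper function are assumptions, and exact string matching is a deliberate simplification of how real assembly software compares sequences.

```python
def placements(fragment, sequence):
    """Hypothetical helper: every start position where fragment matches sequence exactly."""
    positions = []
    start = sequence.find(fragment)
    while start != -1:
        positions.append(start)
        start = sequence.find(fragment, start + 1)
    return positions

# Two invented mapped regions that happen to share the same repeated stretch.
REPEAT = "GATTACAGATT"
region_a = "ACGTTGCA" + REPEAT + "CCGTA"
region_b = "TTGACC" + REPEAT + "AGGCT"
toy_genome = region_a + region_b

fragment = "TTACAG"  # a short read that falls inside the repeated stretch

unmapped = placements(fragment, toy_genome)  # origin unknown: search the whole toy genome
mapped = placements(fragment, region_a)      # origin known: search only the region it came from

print("Candidate positions without a map:", unmapped)
print("Candidate positions within the known region:", mapped)
```

In this toy case the fragment fits at two places when its origin is unknown but at only one place when it is known to come from the first region. The example shows only the source of the ambiguity; it does not measure how often such ambiguity would arise in the human genome.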

Second, release of sequence data from the Perkin-Elmer-TIGR effort will occur quarterly, rather than daily. The policy of daily release of DNA sequence data by publicly-funded efforts was arrived at because of the great interest in the scientific community in gaining access to this highly valuable information. Any delay can result in wasted effort in research.

Deliberations on Five-Year Research Plan

Because the industry plan seemed to parallel some aspects of the federal Human Genome Project, planners and advisors to the NIH-DOE program have been debating extensively how the two proposals could be matched up. The scientists at the recent planning meeting on the draft HGP 5-Year Plan concluded that, while the two projects should complement one another, the federal project should continue its plans to provide high-quality human DNA sequence as soon as possible and that all data should be freely accessible. Those conclusions rested on a few key factors:

  • The industry effort may not deliver the product in the time and manner proposed. The industry approach to sequencing has not been tried on large and complex genomes, such as the human, and depends on newly developed and unproven machines. Data to evaluate the "whole genome" shotgun approach will initially come from a trial project on the fruit fly, Drosophila, but are not expected on the human for at least 12 to 18 months;

  • The industry plan will produce a large amount of highly useful sequence data, but this plan will yield a qualitatively different product that will likely contain tens of thousands of gaps;

  • The industry plan calls for release of sequence data on a quarterly basis, and patenting of 100-300 "gene systems." While quarterly data release is commendable, the plan is not as strong as the standards established by the international sequencing community which require release of data within 24 hours and discourage patenting. Further, some concerns were expressed that the private effort's commitment to data release might diminish over time, if business pressures came to the forefront.

In view of those concerns, advisors at the planning meeting enthusiastically made several unanimous recommendations:

  • The publicly funded genome project should continue with plans to provide a complete, high-quality human DNA sequence by the year 2005, and sooner if at all possible;

  • All possible steps must be taken to ensure that all sequence data remain in the public domain;

  • The publicly funded effort should take advantage of technology advances to increase sequencing capacity as much as possible as soon as possible to meet research needs, both for sequencing of the human and model organisms; and

  • The sequencing of DNA regions of high utility and research interest should be emphasized.

Now, Human Genome Project leaders at the NIH and DOE are considering that advice as they put the final touches on the new research plan, which will be published in the fall of 1998. The complete plan will contain details for all of the Human Genome Project's goals, including sequencing, gene function, human variation, technology development, and ethical, legal, and social implications.

The private and public genome sequencing efforts should not be seen as engaged in a race. In fact, scientists at TIGR and Perkin-Elmer have expressed their enthusiasm for a continued vigorous public effort on the HGP, and have conveyed their willingness to collaborate with NIH and DOE on the production of the complete human sequence. The NIH and DOE welcome this collaborative approach, as the whole should be greater than the sum of the parts.


Mr. Chairman, I commend you, and the Members of this Subcommittee, for convening this hearing today. The impact on the future of biology of knowing the order of all 3 billion human DNA bases has been compared to Mendeleev's establishment of the Periodic Table of the Elements in the 19th century and the advances in chemistry that followed. The complete set of human genes, the biologic periodic table, will make it possible to begin to understand how they function and interact. Rapidly evolving technologies, comparable to those used in the semiconductor industry, will allow scientists to build detectors that analyze tens of thousands of genes in a single experiment. Scientists will use the powerful new tools to reveal the secrets of disease susceptibility. This knowledge will in turn allow researchers to create broad new opportunities for preventive medicine, lay the foundation needed to develop and better target effective therapeutics, and provide unprecedented information about the origin and migration of human populations.

The investment of substantial funds by the private sector in human sequencing reaffirms the enormous value of Human Genome Project products and is a testament to the success and value of the tools already developed by the publicly supported project. For the reasons outlined above, it is not yet known what role this new endeavor will play over the long term in providing the publicly available, detailed "A-to-Z" instruction book ultimately promised by the Human Genome Project. Project leaders at the National Institutes of Health and the Department of Energy look forward to close cooperation with Perkin-Elmer and TIGR as the new initiative unfolds over the next few years.

This concludes my remarks. I would be pleased to answer any questions.
