What have we found?


Cover of Science journal

Reprinted with permission from Science vol 291, no 5507, 16 February 2001. Image: Ann Elliot Cutting. http://www.science.org

The sequence of a ‘working draft’ of the human genome was published in the scientific journals Science and Nature in early 2001. This was a special event, as the public and private research efforts published their results at the same time.

The final draft was completed in 2003, and the final papers were published in 2006. However, the data will be analysed for many years to come.

Analysis of the draft sequence revealed a vast amount of information, including:

  • the average human gene consists of 3000 nucleotide bases, but sizes vary greatly
  • the largest known human gene has 2.4 million bases
  • chromosome 1 has the most genes (2968) and the Y chromosome has the fewest (231)
  • the order of 99.9% of nucleotide bases is exactly the same in all people
  • the functions of more than 50% of discovered genes remain unknown
  • less than 2% of the genome codes for the production of proteins
  • gene-rich areas of the genome are mostly made up of G and C bases, whereas gene-poor regions are mostly made up of A and T bases
  • at least 50% of the genome consists of repetitive base sequences that appear to have no direct function, but over time reshape the genome by:
    • rearranging it
    • creating new genes
    • modifying and reshuffling existing genes.
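
The observation above about G and C bases is usually expressed as 'GC content': the fraction of bases in a stretch of DNA that are G or C. The following is a minimal sketch of how that fraction could be calculated. The function name gc_content and the two short example sequences are made up for illustration and are not real genome data.

```python
def gc_content(sequence: str) -> float:
    """Return the fraction of bases in `sequence` that are G or C."""
    bases = sequence.upper()
    if not bases:
        raise ValueError("sequence must not be empty")
    # Count every base that is either G or C.
    gc_count = sum(1 for base in bases if base in "GC")
    return gc_count / len(bases)


# Two invented stretches of DNA, purely for illustration.
gc_rich = "GCGCGGCCATGCGC"
at_rich = "ATATTAAATGCATA"

print(gc_content(gc_rich))
print(gc_content(at_rich))
```

A higher value suggests a stretch is more like the gene-rich regions described above, and a lower value suggests it is more like the gene-poor regions. Real analyses work on much longer stretches than these.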

Much is still unknown about our genome. Some of the things we still don’t fully understand are:

  • exact gene number, locations and functions
  • how genes are regulated
  • the amount, distribution, information content and functions of ‘noncoding’ DNA (DNA that does not code for a protein product)
  • how gene expression, protein expression and post-translational events are orchestrated
  • how genes and proteins are evolutionarily conserved amongst different organisms
  • how genetic variation among individuals is correlated with health and disease.

How many genes did you say?

  • The size of genomes differs from one organism to the next. The human genome contains about 3.1 billion base pairs and about 30,000 genes.
  • The largest known genome belongs to a microscopic amoeba, Amoeba dubia, closely followed by the lungfish and the Easter lily.
  • Three quarters of the Japanese pufferfish's 31,000 genes have direct human counterparts.
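
To put some of the percentages in this article into perspective, they can be converted into rough numbers of bases. This is a back-of-the-envelope sketch only: it assumes the approximate genome size of 3.1 billion base pairs given above, and the variable names are chosen for illustration.

```python
# Approximate size of the human genome, as quoted in this article.
GENOME_BASE_PAIRS = 3_100_000_000

# If 99.9% of bases are identical in all people, about 0.1%
# (1 base in every 1000) can differ between two individuals.
differing_bases = GENOME_BASE_PAIRS // 1000

# "Less than 2%" of the genome codes for proteins, so this is an upper bound.
coding_bases_upper_bound = GENOME_BASE_PAIRS * 2 // 100

print(f"{differing_bases:,}")           # 3,100,000
print(f"{coding_bases_upper_bound:,}")  # 62,000,000
```

So a difference of only 0.1% still amounts to roughly 3 million bases, and the protein-coding portion of the genome is at most about 62 million bases.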