What have we found?


Cover of Science journal

Reprinted with permission from Science vol 291, no 5507, 16 February 2001. Image: Ann Elliot Cutting. http://www.science.org

The sequence of a ‘working draft’ of the human genome was published in the journals Science and Nature in early 2001. This was a special event, because the public and private research efforts published their results at the same time.

Analysis of the draft sequence revealed a vast amount of information including:

  • the average human gene consists of 3,000 nucleotide bases, but sizes vary greatly – the largest known human gene has 2.4 million bases,
  • the order of 99.9% of nucleotide bases is exactly the same in all people,
  • the functions of over 50% of discovered genes remain unknown,
  • less than 2% of the genome codes for proteins,
  • much of the genome consists of repetitive base sequences. These repeats appear to have no direct function, but over time they reshape the genome by rearranging it, creating new genes, and modifying and reshuffling existing genes,
  • gene-rich areas of the genome are predominantly made up of G and C bases, whereas gene-poor regions are mainly composed of A and T bases,
  • chromosome 1 has the most genes (2968), whereas the Y chromosome has the fewest (231).
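
The observation about G and C bases in gene-rich regions rests on a simple measurement usually called GC content: the fraction of bases in a stretch of DNA that are G or C. The short Python sketch below shows how that fraction could be calculated. The function name gc_content and the two example sequences are made up for illustration and are not taken from the genome analysis itself.

```python
def gc_content(sequence: str) -> float:
    """Return the fraction of bases in `sequence` that are G or C."""
    bases = sequence.upper()
    if not bases:
        raise ValueError("sequence must not be empty")
    # Count every base that is a G or a C, then divide by the total length.
    gc_count = sum(1 for base in bases if base in "GC")
    return gc_count / len(bases)


# Hypothetical example sequences, not real genomic data.
gc_rich_region = "GCGCGGCCATGC"
at_rich_region = "ATATTAAATGCA"

print(f"GC-rich example: {gc_content(gc_rich_region):.2f}")
print(f"AT-rich example: {gc_content(at_rich_region):.2f}")
```

A region with a value well above 0.5 is dominated by G and C bases, while a value well below 0.5 means A and T bases dominate.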

Much is still unknown about our genome. Some of the things we still don’t know are:

  • the exact number of genes in the human genome,
  • the exact location, function and regulation of these genes,
  • the amount, distribution, information content and functions of ‘non-coding’ DNA, that is, DNA that does not code for a protein product,
  • how gene expression, protein expression and post-translational events are orchestrated,
  • how well genes and proteins are conserved, in evolutionary terms, amongst different organisms,
  • how genetic variation between individuals correlates with health and disease.

How many genes did you say?

  • The size of genomes differs from one organism to the next. The human genome contains more than 3.2 billion base pairs and about 30,000 genes.
  • The largest known genome belongs to a microscopic amoeba, Amoeba dubia, which is closely followed in size by the lungfish and the Easter lily.
  • Three quarters of the Japanese pufferfish's 31,000 genes have direct human counterparts.