Understanding the relationship between the genome and disease requires understanding the structure of the genome, the variation between the genomes of individuals, and the features of an individual's genome (the 'genotype'). The individual genome then needs to be related to the physical attributes of the individual (the 'phenotype'). For the human genome, a phenotype may be a personal trait such as height or eye colour, or a disease manifestation such as cystic fibrosis or a raised risk of Alzheimer's disease. In other organisms, a phenotype might be a pathogenic bacterium's toxicity or a plant's drought resistance.
Genetic variations between humans are key to differences in phenotype. Variation between individuals' genomes is the subject of substantial genomic research. Specific types of variation include:
The Human Genome Project was a 13-year project that resulted in the first complete map of a human genome. The project involved leading academic centres throughout the world, including the UK's Wellcome Trust and the US National Institutes of Health (NIH).
In 2001, the draft genome sequence was published in Nature, and in 2004, the completed analysis of the sequence was published.
The new sequence identified almost all known genes (99.74%) and defined 22,287 'gene loci'. Previously it had been believed that there were as many as 100,000 genes. The finished genome now serves as a reference for researchers conducting analyses of the genome.
International HapMap Project
The International HapMap Project was started in 2002. Its goal was to identify and catalogue genetic similarities and differences in human beings, by comparing the genetic sequences of different individuals to identify chromosomal regions where genetic variants are shared.
Using the publicly available information in the HapMap, researchers are able to find genes that affect health, disease, and individual responses to medications and environmental factors. The Project is a collaboration among scientists and funding agencies from Japan, the United Kingdom, Canada, China, Nigeria, and the United States.
In October 2007, phase 2 of the HapMap project was reported in Nature. Three million more single nucleotide polymorphisms (SNPs) were identified, representing between 25% and 33% of all human SNPs with a frequency of more than 5%.
In 2007, the journal Science named Human Genetic Variation as Breakthrough of the Year. The journal commented that improvements in DNA sequencing technology will enable even deeper analysis of genetic variation:
"New technologies that are slashing the costs of sequencing and genome analyses will make possible the simultaneous genome-wide search for SNPs and other DNA alterations in individuals. Already, the unexpected variation within one individual's published genome has revealed that we have yet to fully comprehend the degree to which our DNA differs from one person to the next."
Encyclopedia of DNA Elements (ENCODE) study
The Encyclopedia of DNA Elements (ENCODE) study, published in June 2007, represented a major advance in understanding of the human genome. It found that the areas of the genome that were not genes, previously sometimes referred to as 'junk DNA', are critical to the regulation and control of DNA processes. This confirmed the idea that the workings of DNA were more complex than originally thought, and highlighted the need for more detailed research into the genome.
Large-scale Genome Sequencing programme: Medical Sequencing projects
The US National Human Genome Research Institute (NHGRI), part of the National Institutes of Health (NIH), funds several sequencing projects. Key projects include:
Wellcome Trust Case Control Consortium
Phase 1 of the Wellcome Trust Case Control Consortium (WTCCC) analysed DNA samples from 17,000 people and reviewed 8 major diseases. In April 2008, phase 2 of the WTCCC project was announced to analyse the DNA of 120,000 people, and in January 2009 the WTCCC phase 3 was funded for 4 diseases and 30,000 samples. These studies use microarray technologies rather than sequencing technologies to identify genetic variations associated with diseases. These data are a foundation for further 'deep sequencing' studies.
International Cancer Genome Consortium
In April 2008, a new consortium was announced to generate high-quality data on the genomes of at least 50 different cancers. The consortium includes researchers from ten countries, including the UK's Wellcome Trust Sanger Institute. Each of the projects will use specimens from roughly 500 patients. The project comprises the Cancer Genome Project and the Cancer Genome Atlas.
As of 2011, the complete sequence of a whole genome no longer guarantees publication in a leading journal such as Nature or Science, because so many sequencing projects are now in progress. Many research projects that examine multiple whole genomes are underway, more complex genomes (such as certain polyploid plant genomes) are being tackled by international consortia, and whole human genome data is starting to be introduced into clinical practice in some specialist centres.
Two major online resources for information about the genome and genomic research are:
The leading UK charity, the Wellcome Trust
The US National Human Genome Research Institute, part of the National Institutes of Health