January 21, 2014. Where are we today with the promise of personal genomics? To great fanfare, Illumina announced last week that the $1000 genome had arrived (1). This milestone ushers in a new future where anyone and everyone will get their genome “done” (i.e., sequenced) and use that information to manage their health, relationships, and daily activities. Or will it? What can we actually do with our genomic information today? Despite the recent breakthroughs in sequencing technology, we have yet to develop the tools and methodologies that connect gene sequences to traits in a way that makes the data useful. Without interpretation, one’s genome sequence is meaningless. To address this challenge, we recently started Molquant, an algorithm-driven enterprise that builds powerful tools to improve genome interpretation, the next big challenge in personal genomics.
The Promise of Personal Genomics
With more than 1300 genome-wide association studies (GWAS) published, one might imagine that we have a whole trove of gene:trait linkages. But what can we learn if we want to get sequenced today? Not much, really. To promote whole genome sequencing, Illumina hosts personal genome conferences, where you can get your genome sequenced and interpreted for $5000. At a recent conference, one session provided interpretation for 340 genes affecting 140 traits. In other words, the state of the art in genomic interpretation today links less than 2% of our genes to associated trait data, which is a little disappointing. Even though an individual can get their genome sequenced, there isn’t much information that ties genes to specific traits.
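The “less than 2%” figure can be checked with quick arithmetic. The sketch below assumes a total of roughly 20,000 protein-coding genes; that number is a commonly cited estimate and is not taken from this article, so treat the result as approximate.

```python
# Assumption (not from the article): the human genome has roughly
# 20,000 protein-coding genes. Estimates vary, so this is approximate.
APPROX_PROTEIN_CODING_GENES = 20_000

# From the article: one conference session interpreted 340 genes.
INTERPRETED_GENES = 340


def fraction_interpreted(interpreted: int, total: int) -> float:
    """Return the share of genes that have an interpretation."""
    if total <= 0:
        raise ValueError("total must be positive")
    return interpreted / total


share = fraction_interpreted(INTERPRETED_GENES, APPROX_PROTEIN_CODING_GENES)
print(f"{share:.1%} of genes interpreted")
```

Under that assumption the share comes out just below 2%, consistent with the figure quoted above.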
Consumer genomics leader 23andMe, prior to being muzzled by the FDA, had a similar offering, reporting on over 240 “health conditions and traits.” While 23andMe isn’t actually sequencing yet (it is still too expensive), they assay over 1 million sequence variants, or SNPs, and cover most of the reported trait associations, as well as a few of their own researched associations (e.g. SNPs rs4481887 and rs4309013 in the olfactory receptor gene cluster determine your ability to smell asparagus pee). 23andMe does a fantastic job organizing and interpreting the available information; it’s just that there isn’t much known. Given the limited interpretability to date, how can we begin to approach this complex puzzle?
4001 Gene-linked Disorders
Some clues to where we should be looking can be found in another dataset, the repository of human genetic disorders: Online Mendelian Inheritance in Man (OMIM). As of December 2013, OMIM tallied 4001 disorders for which the molecular basis (gene or genes) is known. These syndromes are typically extremely rare, and appear to represent the extremes of the natural variation in genes seen in “healthy” individuals. These direct links between gene and trait provide important clues to the functions of many more genes than have been identified in GWAS to date.
Clues in Faces
A few illuminating examples lie in the genetics of faces. One’s face is arguably the clearest, albeit complex, trait that is highly genetically determined. We all know how difficult it is to distinguish identical twins; we recognize familial resemblances; and we can often guess someone’s ancestral origins from facial features (a distinction that is increasingly blurred as we all become more mobile, mixing our genes together). Individuals with a particular genetic alteration often exhibit specific facial features (e.g. Down syndrome, the gain of an entire chromosome, produces a recognizable face). Although subtle, many genetic disorders produce specific facial traits as a feature of what are often complex phenotypes. Peter Hammond at University College London has developed morphometric tools to characterize facial traits from a number of genetic disorders associated with intellectual disability, such as 22q11 deletion syndrome; Noonan syndrome (PTPN11 in 50% of cases); Smith-Magenis syndrome (RAI1); and Williams-Beuren syndrome (likely GTF2I).
Source: Hammond et al., Am. J. Hum. Genet. 77:999, 2005
In the studied examples, Hammond’s tools can use facial features to diagnose the syndromes, demonstrating the role of each of the affected genes in face shape.
In 2012, two GWAS papers (2,3) found only five genes linked to three subtle changes in the 3D measurement of faces. However, mutations in two of those genes had previously been linked to facial dysmorphic traits. One is PAX3, the gene associated with Waardenburg syndrome: in the general population, PAX3 variation was associated with the width of the nose bridge, and Waardenburg syndrome PAX3 mutations cause a distinctive “wide nasal bridge” phenotype. The other, PRDM16, was previously identified as causal in a syndrome associated with cleft palate. Both cases again suggest that the genetic syndromes may represent extremes of normal variation.
These clues from human disorders suggest two things: 1) the genome likely contains a wealth of extractable data linking gene variation to a wide variety of human traits and health conditions; 2) a comprehensive gene:trait catalog built from the monogenic disorders may provide an important tool to aid in interpreting genomic data in the broader population.
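To make the idea of a gene:trait catalog concrete, here is a minimal sketch of how such a lookup might be structured. It is purely illustrative and is not Molquant’s implementation: the class and function names are hypothetical, the entries are limited to the gene and syndrome pairs mentioned in this post, and “GENE_X” is a made-up placeholder for a gene with no catalog entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """One gene-to-disorder link (hypothetical structure)."""
    gene: str
    disorder: str
    note: str


# Entries drawn only from the examples discussed in this post.
CATALOG = [
    CatalogEntry("PTPN11", "Noonan syndrome", "mutated in about 50% of cases"),
    CatalogEntry("RAI1", "Smith-Magenis syndrome", "recognizable facial traits"),
    CatalogEntry("GTF2I", "Williams-Beuren syndrome", "likely gene"),
    CatalogEntry("PAX3", "Waardenburg syndrome", "wide nasal bridge"),
]


def build_index(entries):
    """Group catalog entries by gene symbol for fast lookup."""
    index = {}
    for entry in entries:
        index.setdefault(entry.gene, []).append(entry)
    return index


def annotate(genes_with_variants, index):
    """Return catalog entries for each gene that has one."""
    return {g: index[g] for g in genes_with_variants if g in index}


INDEX = build_index(CATALOG)

# "GENE_X" is a placeholder, not a real gene symbol.
hits = annotate(["PAX3", "GENE_X"], INDEX)
for gene, entries in hits.items():
    for entry in entries:
        print(f"{gene}: {entry.disorder} ({entry.note})")
```

A real catalog would need variant-level detail and evidence grading; the point here is only that known monogenic links can serve as a first-pass annotation for genes seen in an individual’s sequence.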
1 Matthew Herper, “The $1000 Genome Arrives -- For Real This Time,” Forbes, January 14, 2014
2 Paternoster et al., “Genome-wide Association Study of Three-Dimensional Facial Morphology Identifies a Variant in PAX3,” American Journal of Human Genetics 90(3): 478–485, 2012
3 Liu et al., “A Genome-Wide Association Study Identifies Five Loci Influencing Facial Morphology in Europeans,” PLoS Genetics 8(9): e1002932, 2012