PhD, University of Washington
Postdoctoral, Cornell University
Office: 311 Ricks Hall
Website: Visit our Lab Home Page
Research Areas: Computation / Bioinformatics
My colleagues and I study evolution by developing statistical techniques for analyzing DNA and protein sequence data. Our main efforts concern:
(1) Improving probabilistic models of DNA sequence evolution by incorporating phenotype and reconciling these models with population genetics. The relationship between phenotype and survival of the genotype is central to both genetics and evolution. The field of population genetics has a rich body of theory for explaining how within-species genetic variation is shaped by fitness, mutation, recombination, population size, and population structure. However, this theory does not purport to map genotypes to phenotypes, nor does it map phenotypes to fitness. A wide variety of computational biology schemes aim to predict phenotype from genotype. I am working to improve models of molecular evolution by incorporating these computational biology prediction systems. I have concentrated on protein tertiary structure and RNA secondary structure, but am very excited by the potential to quantify the impacts on evolution of diverse other aspects of phenotype. Rather than designing statistical techniques exclusively for understanding within-species genetic variation, I have been attempting to apply population genetic theory to data sets representing sequences from different species. This is a challenging endeavor, but a paucity of intraspecific genetic variation means that many of the most important evolutionary questions can only be addressed via interspecific comparisons.
(2) Evolution of the rate of evolution. Evolutionary analysis of DNA and protein sequences is typically performed either by assuming that all evolutionary lineages change at the same rate or by avoiding any attempt to directly consider the fact that the rate of evolution changes over time. Factors that affect the rate of molecular evolution (e.g., mutation, population size, generation time, selection) change over time, and therefore the rate of molecular evolution is extremely unlikely to be identical for different evolutionary lineages. However, it is reasonable to expect an autocorrelation of rates over time: closely related evolutionary lineages tend to evolve at similar rates, whereas distantly related lineages may evolve at more different rates. My collaborators (especially Hirohisa Kishino of the University of Tokyo) and I are developing methods for estimating dates of evolutionary events from molecular sequence data. These methods do not rely on the restrictive and implausible assumption that rates of evolution have been constant over time.
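The idea of autocorrelated rates can be illustrated with a small simulation. The sketch below is only one possible formalization, assumed here for illustration and not necessarily the model used in the methods described above: the logarithm of a descendant lineage's rate is drawn from a normal distribution centered near the logarithm of its ancestor's rate, with a variance proportional to the time separating them. The function names and the parameter `nu` (how quickly rates drift per unit time) are hypothetical.

```python
import math
import random


def simulate_child_rate(parent_rate, branch_time, nu, rng):
    """Draw a descendant rate that is autocorrelated with its parent's rate.

    Assumption for this sketch: log(child rate) is normal with variance
    nu * branch_time, so short branches keep rates close to the parent's
    and long branches allow more drift. The mean is shifted by -variance/2
    so that the expected child rate equals the parent rate.
    """
    variance = nu * branch_time
    mean_log = math.log(parent_rate) - variance / 2.0
    return math.exp(rng.gauss(mean_log, math.sqrt(variance)))


def simulate_lineage(root_rate, branch_times, nu, seed=0):
    """Simulate rates along one lineage, one rate per successive branch."""
    rng = random.Random(seed)
    rates = [root_rate]
    for t in branch_times:
        rates.append(simulate_child_rate(rates[-1], t, nu, rng))
    return rates


def mean_abs_log_change(branch_time, nu, n, seed=0):
    """Average |log(child/parent)| over n simulated parent-child pairs."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        child = simulate_child_rate(1.0, branch_time, nu, rng)
        total += abs(math.log(child))
    return total / n


if __name__ == "__main__":
    # Rates along a lineage with three branches of increasing duration.
    print(simulate_lineage(1.0, [0.1, 0.5, 2.0], nu=0.5, seed=1))
    # Closely related lineages (short time) differ less than distant ones.
    print(mean_abs_log_change(0.01, 1.0, 2000), mean_abs_log_change(10.0, 1.0, 2000))
```

Setting `nu` to zero recovers the constant-rate assumption as a special case, which is why a model of this kind can be viewed as a relaxation of that assumption rather than a replacement for it.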
Choi SC, Redelings BD, and Thorne JL. (2008). Basing population genetic inferences and models of molecular evolution upon desired stationary distributions of DNA or protein sequences. Phil Trans R Soc B. Oct. 7 [Epub ahead of print]
Choi SC, Stone EA, Kishino H, and Thorne JL. (2008). Estimates of natural selection due to protein tertiary structure inform the ancestry of biallelic loci. Gene. Jul 29 [Epub ahead of print]
Thorne JL. (2007). Protein evolution constraints and model-based techniques to study them. Current Opinion in Structural Biology. 17: 337–341.
Choi SC, Hobolth A, Robinson DM, Kishino H, and Thorne JL. (2007). Quantifying the impact of protein tertiary structure on molecular evolution. Mol Biol Evol. 24: 1769–1782.
Thorne JL, Choi SC, Yu J, Higgs PG, and Kishino H. (2007). Population genetics without intraspecific data. Mol Biol Evol. 24: 1667–1677.