The QMEE CDT Project proposal database

Welcome to the QMEE CDT Project proposal database. This is a live list of project proposals put forward by PIs across the CDT partner institutions.

PIs/Supervisors will continue to add projects to this list over the next few months, so do keep checking back! You can search the projects using the box below: simply enter some text and press Search to do a text search across all the database fields. If you want to search more finely, the search tool also allows you to search on particular details of the project descriptions: you will see these finer search options appear if you click on the search box.

Click on the view button next to a project to get the full proposal description. If you want to download project details, either for all projects, or for a subset you have searched for, then click on the 'Download details' button.

A general likelihood framework for inferring hybridization from genetic marker data
Hybridization between closely related species is common in both plants and animals. A recent survey showed that at least 25% of plant species and 10% of animal species are involved in natural interspecific hybridization and potential introgression [1]. Understanding the hybridization process and its possible consequences (e.g. interspecific gene flow) is important in many research areas, such as speciation, selection, recombination and the maintenance of species boundaries [2], and in the conservation management of endangered species [3].

A typical hybridization study requires identifying purebred and hybrid individuals, and ideally subdividing the latter into categories such as F1, F2, B1, etc. This is difficult using phenotypic data, because hybridization usually involves closely related species or subspecies that are morphologically similar. Purebreds and the different kinds of hybrids can instead be identified from their multilocus genotypes at a number of marker loci. This can be done with a Bayesian method [4] that needs no background information (e.g. allele frequencies) on the parental species and does not require purebred individuals to be identified beforehand as a reference. This method, implemented in the software NewHybrids, has been widely applied in hybridization analyses.

However, NewHybrids has several important limitations that prevent its wider application. First, it assumes that only two parental species contribute to the hybridization, so it does not apply when three or more species are involved. Multi-species hybridization does occur in nature, as in the case of the kestrels on Round Island, Mauritius (Ken Norris, personal communication). Second, it assumes no inbreeding (i.e. Hardy-Weinberg equilibrium) in each parental species. While this is often a good approximation for animals, it is invalid in many plants with partial selfing. Third, it assumes a diploid model, and cannot be applied to the many insect species that have haplo-diploid inheritance.

This project proposes to develop a new general likelihood framework for inferring hybridization that overcomes the above-mentioned limitations of NewHybrids. The main project objectives are:
1. To develop the population genetics model and statistical methodology for inferring multi-species hybridization, allowing for both diploid and haplo-diploid models and for inbreeding.
2. To implement the methodology in a computer program.
3. To simulate genotype data, which are then used to test the power and accuracy of the methodology.
4. To analyse data from kestrels, tropical pitcher plants and other empirical datasets using the method and computer program developed.
5. To improve the computational efficiency (e.g. by using MPI and OpenMP for parallel computation) for large genomic datasets, and to add a GUI for ease of use.

[1] Mallet J. 2005 Hybridization as an invasion of the genome. TREE 20, 229–237.
[2] Hewitt G. M. 1988 Hybrid zones - natural laboratories for evolutionary studies. TREE 3, 158–167.
[3] Goodman S. J. et al. 1999 Introgression through rare hybridization: a genetic study of a hybrid zone between red and sika deer (genus Cervus) in Argyll, Scotland. Genetics 152, 355–371.
[4] Anderson E. C. & Thompson E. A. 2002 A model-based method for identifying species hybrids using multilocus genetic data. Genetics 160, 1217–1229.
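To make the two-species starting point concrete, the following is a minimal Python sketch of a genotype likelihood under hybrid classes, in the spirit of the model in [4]. It is not the NewHybrids implementation, nor the method this project will develop. All function names, the data layout (one allele-frequency dictionary per locus) and the toy data are hypothetical. Allele frequencies are treated as known, whereas the method in [4] does not need them in advance. The way the inbreeding coefficient `f` enters the within-species term is an illustrative assumption only.

```python
# Illustrative sketch; every name and number here is hypothetical.

# For each class: expected proportion of loci whose two gene copies come from
# (both species A, one from each species, both species B).
CLASS_ORIGINS = {
    "pure_A": (1.0, 0.0, 0.0),
    "pure_B": (0.0, 0.0, 1.0),
    "F1": (0.0, 1.0, 0.0),
    "F2": (0.25, 0.5, 0.25),
    "B1_A": (0.5, 0.5, 0.0),  # F1 backcrossed to species A
    "B1_B": (0.0, 0.5, 0.5),  # F1 backcrossed to species B
}


def within_species(a, b, freqs, f=0.0):
    """Probability of genotype (a, b) when both copies come from one species.

    f is an inbreeding coefficient; f = 0 gives Hardy-Weinberg proportions.
    """
    if a == b:
        p = freqs[a]
        return p * p + f * p * (1.0 - p)
    return 2.0 * freqs[a] * freqs[b] * (1.0 - f)


def between_species(a, b, freqs_a, freqs_b):
    """Probability of genotype (a, b) with one copy from each species."""
    if a == b:
        return freqs_a[a] * freqs_b[a]
    return freqs_a[a] * freqs_b[b] + freqs_a[b] * freqs_b[a]


def genotype_likelihood(genotype, origins, freqs_a, freqs_b, f_a=0.0, f_b=0.0):
    """Single-locus likelihood: a mixture over the three origin categories."""
    a, b = genotype
    g_aa, g_ab, g_bb = origins
    return (g_aa * within_species(a, b, freqs_a, f_a)
            + g_ab * between_species(a, b, freqs_a, freqs_b)
            + g_bb * within_species(a, b, freqs_b, f_b))


def class_posteriors(genotypes, freqs_a, freqs_b, f_a=0.0, f_b=0.0):
    """Posterior over classes, assuming a flat prior and independent loci."""
    likes = {}
    for name, origins in CLASS_ORIGINS.items():
        like = 1.0
        for locus, genotype in enumerate(genotypes):
            like *= genotype_likelihood(genotype, origins, freqs_a[locus],
                                        freqs_b[locus], f_a, f_b)
        likes[name] = like
    total = sum(likes.values())
    if total == 0.0:
        raise ValueError("genotypes are impossible under every class")
    return {name: like / total for name, like in likes.items()}


# Toy data: four diagnostic loci, species A fixed for allele 0, B for allele 1.
toy_freqs_a = [{0: 1.0, 1: 0.0}] * 4
toy_freqs_b = [{0: 0.0, 1: 1.0}] * 4
toy_individual = [(0, 1)] * 4  # heterozygous at every locus
toy_posterior = class_posteriors(toy_individual, toy_freqs_a, toy_freqs_b)
```

In this toy case the individual is heterozygous at every diagnostic locus, so F1 receives most of the posterior weight, while F2 and both backcross classes remain possible with lower probability. The fixed three-category table is exactly what ties this model to two parental species.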
Jinliang Wang
Tim Barraclough
Professor Ken Norris, Institute of Zoology, Zoological Society of London. Email:
Development of mathematical theory, Computing, Quantitative data analysis
Jinliang Wang
Multiple quantitative skills are involved in conducting the project. These include (1) modelling using population and quantitative genetics; (2) development of statistical methodology (maximum likelihood); (3) development of computational algorithms (e.g. simulated annealing); (4) computer programming (including parallel coding with OpenMP and MPI); (5) computer simulations.
Bayesian methods are available to identify hybrids and their hybrid classes (e.g. F1, F2, B1, B2) from marker data. However, they assume only two parental species and cannot handle the more general situation in which three or more species participate in the hybridization. This project will develop a more general method that applies to hybridization among any number of parental species.
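One way to see what moving beyond two parental species involves is to describe each parent by the probability that a gamete carries, at a given locus, a gene copy from each species. The Python sketch below is only an illustration under that assumption (independent loci, no linkage) and is not the proposal's model. The function names and the three-species example are hypothetical.

```python
# Illustrative sketch; names and example values are hypothetical.

def origin_pair_probs(gamete_u, gamete_v):
    """Probability that an offspring's two gene copies at a locus come from
    the unordered species pair (i, j), given each parent's gamete ancestry
    (a tuple of probabilities over the k species, summing to 1)."""
    k = len(gamete_u)
    probs = {}
    for i in range(k):
        for j in range(i, k):
            if i == j:
                probs[(i, j)] = gamete_u[i] * gamete_v[i]
            else:
                probs[(i, j)] = (gamete_u[i] * gamete_v[j]
                                 + gamete_u[j] * gamete_v[i])
    return probs


def gamete_ancestry(pair_probs, k):
    """Gamete ancestry of an individual with the given origin-pair
    probabilities: each of its two gene copies is transmitted with
    probability 1/2."""
    gamete = [0.0] * k
    for (i, j), p in pair_probs.items():
        gamete[i] += 0.5 * p
        gamete[j] += 0.5 * p
    return tuple(gamete)


# Three species indexed 0, 1, 2.
pure = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

f1_01 = origin_pair_probs(pure[0], pure[1])       # F1 between species 0 and 1
f1_gamete = gamete_ancestry(f1_01, 3)             # (0.5, 0.5, 0.0)
f2_01 = origin_pair_probs(f1_gamete, f1_gamete)   # F2 within that pair
three_way = origin_pair_probs(f1_gamete, pure[2]) # F1 crossed with species 2
```

With two species the three origin categories of an F2 are recovered as 1/4, 1/2 and 1/4. With k species there are k(k+1)/2 categories, and the three-way cross above puts half its weight on each of the pairs (0, 2) and (1, 2), a class that has no counterpart in a two-species model.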
The project addresses the problem of hybrid identification from genetic marker data. The resulting method is expected to have wide applications in evolutionary studies, such as identifying the parental species involved in hybridization, detecting introgression, and studying speciation by hybridization.
Many multi-species hybridization analyses are hampered by the lack of a suitable analysis method. A good example is the kestrels on Round Island, Mauritius, where hybridization occurs among more than two kestrel species that migrate to the island to breed. Current methods are either inapplicable (NewHybrids) or have low power (STRUCTURE).
This project aims to develop a new method for inferring multi-species hybridization and to implement it in software. We expect the method and software to be widely used and thus to have a high impact on the study of hybridization.
This project combines population genetics modelling, the development of statistical methodology and computational algorithms (e.g. simulated annealing for combinatorial optimization), computer programming and simulation. Both supervisors have extensive experience in these areas, and these quantitative techniques have been used successfully to reconstruct pedigrees from genetic marker data.
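As a rough illustration of simulated annealing applied to a combinatorial assignment problem, here is a generic Python sketch. It is not the supervisors' algorithm. The function `anneal`, its parameters, the geometric cooling schedule and the toy scoring function are all assumptions made for the example. In a real analysis the score would be a log-likelihood of the marker data given the assignment.

```python
import math
import random


def anneal(initial, labels, log_score, steps=5000,
           t_start=1.0, t_end=0.01, seed=0):
    """Search for a label assignment with a high log_score.

    initial   : list of starting labels, one per individual
    labels    : list of labels an individual may take
    log_score : function mapping an assignment to a number (higher is better)
    """
    rng = random.Random(seed)
    current = list(initial)
    current_score = log_score(current)
    best, best_score = list(current), current_score
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        # Propose a new label for one randomly chosen individual.
        i = rng.randrange(len(current))
        old_label = current[i]
        current[i] = rng.choice(labels)
        new_score = log_score(current)
        delta = new_score - current_score
        # Always accept improvements; accept a worse move with
        # probability exp(delta / temp).
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current_score = new_score
            if current_score > best_score:
                best, best_score = list(current), current_score
        else:
            current[i] = old_label  # reject the move
    return best, best_score


# Toy problem (hypothetical): the score counts agreements with a hidden
# target assignment, standing in for a log-likelihood.
toy_labels = ["pure_A", "pure_B", "F1"]
toy_target = ["pure_A", "F1", "pure_B", "F1", "pure_A", "pure_B"]


def toy_score(assignment):
    return sum(1 for x, y in zip(assignment, toy_target) if x == y)


toy_start = ["pure_A"] * len(toy_target)
toy_best, toy_best_score = anneal(toy_start, toy_labels, toy_score)
```

Accepting some worse moves at high temperature is what lets the search escape local optima. The function returns the best assignment visited, which by construction scores at least as well as the starting point but is not guaranteed to be the global optimum.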
Population ecology, Population genetics and evolution, Ecological/Evolutionary tools, technology & methods
The student will gain training in theoretical population genetics, statistical methodology development, computer programming, software development, computational algorithms and computer simulations. These quantitative skills are transferable to other ecological, evolutionary and conservation study areas.
ZSL and Imperial College London
2017-10-02 12:02:52