Project

To infect or not to infect? Predict lifestyle from gene content

Many fungal pathogens behave as neutral or even beneficial partners on plants that are not their hosts. Host-microbe interactions are mediated by effector proteins that are secreted by the microbe and may, for example, manipulate the host's immune responses. These effectors are often very dynamic: they are frequently lost and gained throughout evolution, and they accumulate mutations, e.g. to escape recognition by the host. As a result, effector repertoires differ widely between and within species, but strains that are known to infect the same host often have similar effector profiles. This suggests that it should be possible to predict host preference from effector content. In this project, you will train simple machine-learning algorithms, such as k-nearest neighbours (k-NN) or random forests, on a dataset of approximately 700 Fusarium oxysporum genomes. You will take into account not only the presence and absence of effectors, but also the specific genotypes of universal effectors that are associated with preference for a host.

Research aims

  • Develop a tool that improves on current host-range predictions based on effector profiles.
  • Identify keystone effectors, or sets of effectors, that are predictive of host range.

Techniques used

  • Prepare the dataset: predict effector profiles using existing software (FoEc2).
  • Add missing host preference labels where possible, based on literature.
  • Split up groups of universal effectors based on genotype.
  • Train k-NN, random forests or an algorithm of your choice on this dataset.
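
To give a feel for the last two steps, here is a minimal sketch in plain Python of a k-NN classifier on effector profiles. Everything in it is an assumption made for illustration: the strain names, effector names, host labels, the "effector:genotype" naming for universal effectors split by genotype, and the choice of Jaccard distance. It does not reflect the output format of FoEc2 or the actual dataset.

```python
from collections import Counter

# Hypothetical toy data: strain -> (set of effector features, host label).
# A feature is either an effector name (effector present) or
# "effector:genotype" for a universal effector split by genotype.
STRAINS = {
    "strain_A": ({"eff1", "eff2", "univ1:g1"}, "tomato"),
    "strain_B": ({"eff1", "eff2", "eff3", "univ1:g1"}, "tomato"),
    "strain_C": ({"eff4", "eff5", "univ1:g2"}, "banana"),
    "strain_D": ({"eff4", "eff5", "eff6", "univ1:g2"}, "banana"),
    "strain_E": ({"eff1", "eff3", "univ1:g1"}, "tomato"),
    "strain_F": ({"eff5", "eff6", "univ1:g2"}, "banana"),
}


def jaccard_distance(a, b):
    """Distance between two feature sets: 1 - |shared| / |combined|."""
    union = a | b
    if not union:
        return 0.0  # two empty profiles are treated as identical
    return 1.0 - len(a & b) / len(union)


def knn_predict(query, training, k=3):
    """Predict a host label by majority vote among the k nearest profiles.

    `training` is a list of (feature_set, label) pairs.
    """
    neighbours = sorted(
        training, key=lambda item: jaccard_distance(query, item[0])
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(strains, k=3):
    """Hold out each strain in turn and predict it from all the others."""
    correct = 0
    for name, (features, label) in strains.items():
        training = [value for other, value in strains.items() if other != name]
        if knn_predict(features, training, k) == label:
            correct += 1
    return correct / len(strains)


if __name__ == "__main__":
    new_profile = {"eff1", "eff2", "univ1:g1"}
    print(knn_predict(new_profile, list(STRAINS.values())))
    print(leave_one_out_accuracy(STRAINS))
```

On real data, a library implementation such as scikit-learn's KNeighborsClassifier or RandomForestClassifier would be the more practical choice, and strains that are closely related should probably be kept together when splitting into training and test sets so that accuracy is not overestimated.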