Publications

Development of genomic resources for ornamental lilies (Lilium L.)

Shahin, A.

Summary

Lily (Lilium L.) is a perennial bulbous ornamental belonging to the subclass Monocotyledonae and the family Liliaceae. According to Dutch auction statistics, lily is the fifth most important cut flower and, based on acreage, the second most important flower bulb crop. Lily has been used extensively for cytogenetic studies, but molecular genetic studies are limited. The heterogenic nature and the very large and complex genome (36 Gb) of lily might be the reason for this. To improve the efficiency of breeding and selection in lily, and to lay the basis for genetic studies in Lilium, genomic resources are needed.

Next-generation sequencing (NGS) technology (454 pyro-sequencing) was used to sequence the transcriptomes (RNA-seq) of four lily cultivars: ‘Connecticut King’, ‘White Fox’, ‘Star Gazer’, and Trumpet, which belong to the four most important hybrid groups: Asiatic, Longiflorum, Oriental, and Trumpet, respectively. This yielded 52,172 unigenes with an average length of 555 bp, which were used for a wide range of genetic and genomic studies: SNP marker identification for genetic mapping, gene annotation, and comparative genomic studies.

Combining NGS with SNP genotyping techniques to accelerate genetic studies is of considerable interest in different species. In this study, thousands of SNPs were identified in the 52,172 lily unigenes. The KASPar genotyping technique (KBiosciences competitive Allele Specific PCR) was used to genotype two lily mapping populations, LA (L. longiflorum ‘White Fox’ x Asiatic hybrid ‘Connecticut King’) and AA (‘Connecticut King’ x ‘Orlito’), using 225 SNP markers selected from ‘Connecticut King’ unigenes. The genotyping success rate was 75.5% (170 SNP markers worked), the polymorphic SNP rate was 45% (102 SNP markers), and the mapped SNP marker rate was 42% (94 SNPs mapped) in the LA population and 38% (85 SNPs mapped) in the AA population. Thus, we validated a subset of the putative SNP markers and showed the usability of this type of marker for improving genetic maps of complex genomes such as that of lily.
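The percentages quoted above all use the 225 selected SNP markers as the denominator. The short sketch below recomputes them from the marker counts given in the text; the variable and function names are illustrative only, and the results agree with the quoted figures to within rounding.

```python
# Marker counts taken from the summary; all rates are relative to the
# 225 SNP markers selected from 'Connecticut King' unigenes.
TOTAL_MARKERS = 225

counts = {
    "genotyped successfully": 170,
    "polymorphic": 102,
    "mapped in LA": 94,
    "mapped in AA": 85,
}


def rate(count: int, total: int = TOTAL_MARKERS) -> float:
    """Return count as a percentage of total."""
    return 100.0 * count / total


rates = {label: rate(n) for label, n in counts.items()}

for label, pct in rates.items():
    print(f"{label}: {counts[label]}/{TOTAL_MARKERS} = {pct:.1f}%")
```

Expressing every rate against the same denominator makes the successive losses (failed assays, monomorphic markers, unmapped markers) directly comparable between the two populations.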

The SNP markers, together with the available AFLP (amplified fragment length polymorphism), DArT (diversity arrays technology), and NBS (nucleotide binding site) markers, were used to build reference genetic maps for these two lily populations. These are the first reasonably saturated maps for lily; they cover 89% of the lily genome with an average density of one marker per 4 cM. The availability of more SNP markers for genotyping opens the door to further enriching these genetic maps and thus improving the marker density.

The genetic maps were used to map several horticultural traits in Lilium and to understand their genetics. Fusarium oxysporum and lily mottle virus (LMoV) cause very serious diseases in Lilium and as such are important targets for breeding. Six putative QTLs (quantitative trait loci) were identified for Fusarium resistance in the AA population, of which QTL1 was the strongest (explaining ~25% of the phenotypic variation). QTL1 was also confirmed in the LA population. Thus, QTL1 is a strong and reliable QTL in both populations, and it can be used to develop markers for most of the Fusarium resistance for molecular assisted breeding (MAB) applications. LMoV resistance was mapped as a marker on the AA genetic map; however, no closely linked markers have been identified yet (the closest marker lies 9 cM from the trait).

Several ornamental traits were phenotyped and mapped: lily flower color ‘carotene’ (LFCc), flower spots (lfs), stem color (LSC), antherless phenotype (lal), and flower direction (up-side facing, lfd). Some of these traits (spots, antherless, and flower direction) proved to be recessive and controlled by a single gene. Developing markers for recessive traits is valuable because such markers allow the identification of suitable breeding parents, so that the presence of the recessive trait can be either enhanced or repressed.

Flower longevity is a more complex trait because it is a function of the number of buds per inflorescence, the expansion and opening of the buds, the life span of the individual flowers, and the life span of the leaves. Moreover, senescence in Lilium is ethylene-insensitive, and the regulator(s) of its vase life are not yet known. Our study showed that the vase life of individual lily flowers increased significantly after exogenous application of sugars. Furthermore, the abscisic acid (ABA) level increased dramatically in lily flowers at senescence compared with anthesis. This indicates that ABA might be the main regulator of vase life in lily. However, more experiments are needed to confirm this conclusion.

The genomic resources developed for lily, together with genomic resources developed in the same way for Tulipa L., offer a valuable source of information for comparative genomic studies within and between these two genera. We took the first step towards linking the molecular genetic maps of Lilium and Tulipa using transcriptome sequences generated by 454 pyro-sequencing. Orthologous genes between lily and tulip (10,913 unigenes) were identified based on sequence data of four lily cultivars and five tulip cultivars. Next, based on these orthologous sequences, common SNP and EST-SSR markers were generated between the parents of the lily mapping populations (AA and LA) and the parents of the tulip mapping population (‘Kees Nelis’ (T. gesneriana) x ‘Cantata’ (T. fosteriana)). A total of 229 common SNP and 140 common EST-SSR markers were identified. Genotyping and mapping these markers in the populations of both genera will link the genetic maps of Lilium and Tulipa and thus give insight into the preservation of gene order, structure, and ‘putative’ functional homology, in addition to evolutionary processes.

These genomic resources can also be used to increase the resolution of, and support for, phylogenetic trees. We selected a set of orthologous genes of Lilium (19 genes, 11,766 bp containing 433 polymorphic sites), of Tulipa (20 genes, 10,347 bp containing 216 polymorphic sites), and of the orthologous genes between the two genera (7 genes, 5,790 bp containing 587 polymorphic sites). These sets are uniquely present in the sequences and informative for estimating the genetic divergence of the two genera; thus, they can be used to genotype more species per genus in order to build genus trees, and perhaps family trees, later on. The nucleotide polymorphism rate of Lilium was twice as high as that of Tulipa: on average, one substitution per 26 bp for Lilium compared with one substitution per 48 bp for Tulipa. NGS provides a valuable source of large numbers of phylogenetically informative substitutions that might revolutionize phylogenetic, population genetic, and biodiversity studies. However, the use of bi-allelic information from multiple loci in phylogenetic studies is still challenging and needs to be studied further.

Moreover, having such a large amount of sequence data allows us to test evolutionary hypotheses such as positive selection: selection during domestication and breeding might be imprinted in the genome of the species, which can be examined based on omega (dN/dS) values. The higher the omega value, the stronger the indication of positive selection. Positive selection was recorded in Lilium and Tulipa genomes when this small subset of gene contigs (46) of the two genera was tested. Our hypothesis could not be confirmed; however, to draw final conclusions on this matter, omega values for many more genes of the two genera have to be measured.
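As a rough illustration of how omega values are usually read, the sketch below computes omega from a pair of dN and dS estimates and applies the conventional thresholds (omega above 1 suggests positive selection, omega below 1 suggests purifying selection). The function names and the example rates are hypothetical and are not taken from this study; in practice dN and dS would be estimated per gene with dedicated software.

```python
def omega(dn: float, ds: float) -> float:
    """Ratio of non-synonymous (dN) to synonymous (dS) substitution rates."""
    if ds <= 0:
        raise ValueError("dS must be positive to compute omega")
    return dn / ds


def interpret(w: float) -> str:
    """Conventional reading of an omega value."""
    if w > 1:
        return "positive selection"
    if w < 1:
        return "purifying selection"
    return "neutral evolution"


# Hypothetical per-gene rate estimates (illustrative values only).
example_genes = {
    "gene_A": (0.5, 0.25),   # dN > dS
    "gene_B": (0.125, 0.5),  # dN < dS
}

for name, (dn, ds) in example_genes.items():
    w = omega(dn, ds)
    print(f"{name}: omega = {w:.2f} ({interpret(w)})")
```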

Finally, a wealth of putative molecular markers (SNPs and SSRs) has become available that can be applied directly to breeding in these genera. SNP markers are important because they are user-friendly, efficient, transferable, and co-dominant. Applying high-throughput genotyping technology to the two lily populations improved the coverage of the two genetic maps. In addition, genotyping the same SNP markers in both populations facilitated comparison of their linkage groups and will allow the construction of a consensus map. Consequently, exchange of genetic knowledge (mainly QTLs) between the two populations will be easier. The thousands of SNPs identified in the four lily cultivars open the door to combining the current linkage mapping studies with association studies, which will directly improve mapping resolution and MAB applications in Lilium.