Project

Solve the genomic jigsaw puzzle: start with the borders

Chromosome-level assemblies are of key importance in studying genome evolution. Recent technological advances that produce long sequencing reads permit chromosome-level assembly for many species. Yet some species, including some model species, still lack a gapless, telomere-to-telomere assembly. In many cases, sub-telomeric regions remain particularly difficult to assemble because of their repetitive nature and lower read coverage, and the evolutionary dynamics of these regions are thus not well studied.

To resolve these difficult regions of the genome and study genome evolution at the sub-telomeres, we will develop a pipeline that selects long reads containing telomeric repeats, clusters these reads based on sequence similarity, assembles them separately, and integrates the resulting chromosome ends into an existing assembly.
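The first step, selecting reads that contain telomeric repeats, could be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's method: the motif TTAGGG, the 200 bp end window, the 0.5 density threshold and the function names are all hypothetical choices, and the correct motif depends on the species. Requiring a minimum density of exact motif copies, rather than a perfect tandem array, is one simple way to tolerate sequencing errors.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def telomere_ends(read, motif="TTAGGG", window=200, min_density=0.5):
    """Return the read ends ('start', 'end') that look telomeric.

    An end counts as telomeric when exact, non-overlapping copies of the
    motif (or its reverse complement) cover at least `min_density` of the
    terminal `window` bases. All defaults are assumptions for illustration.
    """
    rc_motif = reverse_complement(motif)
    ends = set()
    for name, win in (("start", read[:window]), ("end", read[-window:])):
        win = win.upper()
        if not win:
            continue
        # str.count counts non-overlapping exact occurrences
        copies = max(win.count(motif), win.count(rc_motif))
        if copies * len(motif) / len(win) >= min_density:
            ends.add(name)
    return ends
```

For a read that starts with 40 copies of CCCTAA followed by non-repetitive sequence, `telomere_ends` reports only the start as telomeric. In practice the reads would come from a FASTQ file, and a tool such as seqkit or bbduk could do the pre-filtering before a check like this one.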

Research aims

Develop a pipeline that efficiently identifies telomeric repeats in high-error-rate Nanopore reads.
Develop a pipeline that identifies informative positions in read alignments and clusters reads based on differences at these positions.
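To make the second aim concrete, here is a minimal sketch of one way informative positions might be found and used for clustering. It assumes the reads have already been aligned and are available as equal-length strings with "-" for gaps; the function names, the `min_minor_count` threshold and the greedy clustering are hypothetical simplifications, not the approach the project prescribes. A column is treated as informative when its second most common base is seen in at least `min_minor_count` reads, which filters out isolated sequencing errors.

```python
from collections import Counter


def informative_columns(rows, min_minor_count=2):
    """Return alignment columns where a minor base recurs in several reads."""
    if not rows:
        return []
    cols = []
    for i in range(len(rows[0])):
        counts = Counter(r[i] for r in rows if r[i] != "-")
        ranked = counts.most_common(2)
        # Informative only if a second base is supported by enough reads
        if len(ranked) == 2 and ranked[1][1] >= min_minor_count:
            cols.append(i)
    return cols


def cluster_reads(rows, min_minor_count=2, max_mismatches=0):
    """Greedily group reads by their bases at the informative columns.

    Returns (informative column indices, list of clusters of read indices).
    """
    cols = informative_columns(rows, min_minor_count)
    clusters = []  # pairs of (representative signature, member indices)
    for idx, row in enumerate(rows):
        sig = tuple(row[c] for c in cols)
        for rep, members in clusters:
            if sum(a != b for a, b in zip(sig, rep)) <= max_mismatches:
                members.append(idx)
                break
        else:
            clusters.append((sig, [idx]))
    return cols, [members for _, members in clusters]
```

With three reads from one chromosome end and three from another that differ at two columns, a single-read error elsewhere is ignored and the reads fall into two clusters. Real alignments would more likely be parsed from minimap2 output via samtools, and a less greedy clustering method may be preferable.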

Used techniques

Programming in Python (or language of choice)
Bash, assembly tools (e.g. Flye), mapping/alignment tools (e.g. minimap2), samtools, and bbduk, 2fast2q or seqkit for extracting reads with telomeric repeats

Supervisor

dr. L (Like) Fokkens
Assistant professor

Contact person: dr. HC (Lotje) van der Does

Project information

Using conserved telomeric repeats to achieve chromosome-level genome assemblies

Status: Planned

Partners

Laboratory of Phytopathology

Flexible start
