The seminar will be held over Microsoft Teams. To simplify matters, I would like to ask everyone interested in the seminar series to register here: https://forms.gle/C7S1Vcvw8gAVexiP6
|Travis Gagie, Dalhousie University|
|Computational pangenomics with the r-index (and friends)|
|June 1, 2021, 4:00pm CEST|
One of the basic principles of computational genomics is that humans are genetically almost identical. As we sequence more and more people’s DNA, however, we are finding that we may be more diverse than we thought, and ignoring our differences can bias and undermine both research conclusions and medical diagnoses. One solution is to switch from tools using one reference genome to ones using many reference genomes; since so many tools in computational genomics are based on the FM-index, scaling that up seems an obvious place to start. In this talk we will first discuss some of the variants of the FM-index, and alternatives to it, that have been proposed for computational pangenomics in the past twenty years, and how well they maintain three of the FM-index’s key features: theoretical elegance, practicality, and versatility. We will then focus on the r-index, its advantages and disadvantages with respect to those three features, and some recent results and open research problems related to it.
Travis Gagie is an associate professor in the Faculty of Computer Science at Dalhousie University. He has a BSc in Cognitive Science from Queen’s University at Kingston, an MSc in Computer Science from the University of Toronto and a Dr. rer. nat. in Genome Informatics from Bielefeld University. Before moving back to Canada in 2019, he was a pre-doc at the University of Eastern Piedmont; a post-doc at the University of Chile, Aalto University and the University of Helsinki; an associate professor at Diego Portales University; and a visitor at the Italian National Research Council in Pisa, Illumina, the University of A Coruña and the Czech Technical University. His main research interest is compact data structures for bioinformatics.