[RAM 2021] Chasing perfection: validating and polishing strategies for T2T assemblies


Thursday, July 29, 2021, 9:00


L4E48 / Zoom


Come join us for the final RAM 2021 Invited Guest event: an excellent lecture on genome assembly and correction methods by Dr Arang Rhie of the NIH and the Vertebrate Genomes Project, developer of Merqury/meryl and other assembly and polishing/correction software.

Abstract: Advances in long-read sequencing technologies and genome assembly methods have enabled the recent completion of the first Telomere-to-Telomere (T2T) human genome assembly, which resolves complex segmental duplications and large tandem repeats, including centromeric satellite arrays, in a complete hydatidiform mole (CHM13). Although the initial T2T draft assembly was derived from highly accurate sequencing, evaluation revealed evidence of small errors and structural misassemblies. To correct these errors, we designed a novel repeat-aware polishing strategy that made accurate assembly corrections in large repeats without overcorrection, ultimately fixing 51% of the existing errors and improving the assembly QV to 73.9. By comparing our results to standard automated polishing tools, we outline common polishing errors and offer practical suggestions for genome projects with limited resources. We also show how sequencing biases in both PacBio HiFi and Oxford Nanopore Technologies reads cause signature assembly errors that can be corrected with a diverse panel of sequencing technologies.*

Dr Arang Rhie received a BS in computer science in 2009 and an MS in bioinformatics in 2011 from Ewha Womans University. She completed her PhD in 2017 at the Genome Medicine Institute, Department of Biomedical Science, Seoul National University College of Medicine. Her dissertation research aimed to build the first high-quality Korean reference genome for use in medical diagnostics. She is currently a visiting fellow in the Genome Informatics Section at NIH/NHGRI, where her research continues to focus on the reconstruction of true haplotypes from long-read sequencing and other emerging technologies. She is an active member of the Vertebrate Genomes Project (VGP), which aims to generate complete and error-free genome assemblies of all vertebrates.

Zoom Link: https://oist.zoom.us/j/84967336568?pwd=ZkhVWlRCdzBjM3VDSWRneDVXaUhVUT09


*Preprint for the abstract and talk title: https://www.biorxiv.org/content/10.1101/2021.07.02.450803v1

