MERLIN: Identifying Inaccuracies in Multiple Sequence Alignments Using Object Detection - IFIP Open Digital Library
Conference paper, Year: 2022

Abstract

Multiple sequence alignments (MSAs) form the basis of many biological sequence analysis methods. However, they are susceptible to irregularities that result either from the predicted sequences or from natural biological events. In this paper, we propose MERLIN (Msa ERror Localization and IdentificatioN), an object detector that identifies such irregularities using visual representations of MSAs. Our model is built on a state-of-the-art deep learning object detector, YOLOv4, and trained on a set of MSA images from an in-house dataset with automatically annotated errors. It achieves a mean Average Precision of 71.18% in predicting different types of errors within MSAs. A thorough examination of the results showed that our method correctly identifies certain inconsistencies that were missed by the automatic annotation algorithm.
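
For readers unfamiliar with the reported metric: mean Average Precision averages the per-class Average Precision, which in turn depends on matching predicted boxes to annotated boxes by intersection over union (IoU). The following is a minimal sketch of that computation for one class on one image. It assumes axis-aligned boxes given as (x_min, y_min, x_max, y_max), an IoU threshold of 0.5, and all-point interpolation; the function names and these settings are illustrative assumptions, not details taken from the paper.

```python
def iou(a, b):
    """Intersection over union of two boxes (x_min, y_min, x_max, y_max)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(predictions, ground_truth, iou_threshold=0.5):
    """AP for one class. predictions: list of (confidence, box);
    ground_truth: list of boxes. The threshold is an assumed value."""
    if not ground_truth:
        return 0.0
    matched = set()
    hits = []
    # Visit predictions from most to least confident.
    for _, box in sorted(predictions, key=lambda p: p[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truth):
            if idx in matched:
                continue  # each annotated box can be matched only once
            value = iou(box, gt)
            if value > best_iou:
                best_iou, best_idx = value, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            hits.append(True)
        else:
            hits.append(False)
    # Precision and recall after each prediction.
    precisions, recalls, true_positives = [], [], 0
    for rank, hit in enumerate(hits, start=1):
        true_positives += hit
        precisions.append(true_positives / rank)
        recalls.append(true_positives / len(ground_truth))
    # Replace each precision by the maximum precision at any later rank.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Area under the resulting precision-recall curve.
    ap, previous_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap


def mean_average_precision(per_class_ap):
    """Mean of the per-class AP values (one value per error type)."""
    return sum(per_class_ap) / len(per_class_ap)
```

In this sketch, a detector that handles several error types would call `average_precision` once per type and pass the resulting list to `mean_average_precision`.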
Main file: MERLIN_AIAI22_.pdf (464.03 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-04317586 , version 1 (12-10-2022)
hal-04317586 , version 2 (23-10-2023)
hal-04317586 , version 3 (01-12-2023)

Cite

Hiba Khodji, Lucille Herbay, Pierre Collet, Julie D. Thompson, Anne Jeannin-Girardon. MERLIN: Identifying Inaccuracies in Multiple Sequence Alignments Using Object Detection. International Conference on Artificial Intelligence Applications & Innovations, Springer, Jul 2022, Crete Island, Greece. pp.192-203, ⟨10.1007/978-3-031-08333-4_16⟩. ⟨hal-04317586v2⟩