Querying Highly Similar Structured Sequences via Binary Encoding and Word Level Operations - Artificial Intelligence Applications and Innovations - Part II (AIAI 2012)
Conference Papers, Year: 2012

Querying Highly Similar Structured Sequences via Binary Encoding and Word Level Operations

Abstract

In the post-genomic era there has been an explosion in the amount of genomic data available, and the primary research problems have moved from producing interesting biological data to processing and storing this information efficiently. In this paper we present efficient data structures and algorithms for the High Similarity Sequencing Problem. In this problem we are given the sequences $S_0, S_1, \dots, S_k$, where $S_j = e_{j_1} I_{\sigma_1} e_{j_2} I_{\sigma_2} e_{j_3} I_{\sigma_3} \dots e_{j_\ell} I_{\sigma_\ell}$, and must perform pattern matching on the set of sequences. We present time- and memory-efficient data structures that exploit the extensive similarity between the sequences. Our solution leads to a query time of $O\left(m + vk \log \ell + \frac{m\, occ_v\, v}{w} + \frac{PSC(p)\, m}{w}\right)$ with a memory usage of $O(N \log N + vk \log vk)$.
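The abstract does not describe the data structures themselves, so the following is only an illustrative sketch of what a "word level operation" can look like in pattern matching. It uses the classic Shift-And bit-parallel algorithm, which is a swapped-in textbook technique and not the authors' method. The function names `shift_and_search` and `search_all` are hypothetical, and Python integers stand in for machine words (presumably the $w$ in the query bound).

```python
def shift_and_search(text, pattern):
    """Return the start positions of exact occurrences of pattern in text.

    Illustrative Shift-And matcher; a Python int plays the role of the
    machine word that holds one bit per pattern position.
    """
    m = len(pattern)
    if m == 0:
        return []
    # Binary encoding of the pattern: bit i of masks[c] is set
    # exactly when pattern[i] == c.
    masks = {}
    for i, c in enumerate(pattern):
        masks[c] = masks.get(c, 0) | (1 << i)
    accept = 1 << (m - 1)  # bit that signals a full match
    state = 0              # bit i set: pattern[0..i] matches ending here
    hits = []
    for pos, c in enumerate(text):
        # One shift, one OR and one AND advance all partial matches at once.
        state = ((state << 1) | 1) & masks.get(c, 0)
        if state & accept:
            hits.append(pos - m + 1)
    return hits


def search_all(sequences, pattern):
    """Naive baseline: query each sequence independently."""
    return {j: shift_and_search(s, pattern) for j, s in enumerate(sequences)}
```

This baseline scans every sequence in full and ignores the similarity between them, which is the redundancy the paper's data structures are designed to exploit.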
Origin: Files produced by the author(s)

Dates and versions

hal-01523079 , version 1 (16-05-2017)

Cite

Ali Alatabbi, Carl Barton, Costas S. Iliopoulos, Laurent Mouchard. Querying Highly Similar Structured Sequences via Binary Encoding and Word Level Operations. 8th International Conference on Artificial Intelligence Applications and Innovations (AIAI), Sep 2012, Halkidiki, Greece. pp.584-592, ⟨10.1007/978-3-642-33412-2_60⟩. ⟨hal-01523079⟩