Title:
A novel fast multiple nucleotide sequence alignment method based on FM-index.
Authors:
Liu H; School of Computer Science, University of Science and Technology of China and Key Laboratory on High Performance Computing of Anhui, China.
Zou Q; Institute of basic and Frontier Sciences, University of Electronic Science and Technology of China and Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Chengdu, Sichuan, China.
Xu Y; School of Computer Science, University of Science and Technology of China and Key Laboratory on High Performance Computing of Anhui, China.
Source:
Briefings in bioinformatics [Brief Bioinform] 2022 Jan 17; Vol. 23 (1).
Publication type:
Journal Article; Research Support, Non-U.S. Gov't
Language:
English
Imprint Name(s):
Publication: Oxford : Oxford University Press
Original Publication: London ; Birmingham, AL : H. Stewart Publications, [2000-
MeSH Terms:
Base Sequence*
Sequence Alignment*
Sequence Analysis, DNA/*methods
Algorithms ; Databases, Factual ; Genome, Bacterial ; Genome, Human ; Humans ; Research Design ; Software
Contributed Indexing:
Keywords: FM-index; common segments; divide-and-conquer strategy; dividing sequences; multiple sequence alignment
Entry Date(s):
Date Created: 20211211 Date Completed: 20220311 Latest Revision: 20220311
Update Code:
20240105
DOI:
10.1093/bib/bbab519
PMID:
34893794
Scientific journal
Multiple sequence alignment (MSA) is fundamental to many biological applications. However, most classical MSA algorithms struggle to handle large-scale sequence sets, especially sets of long sequences. Some recent aligners therefore adopt a divide-and-conquer strategy that splits long sequences into several shorter sub-sequences. Selecting the common segments (i.e. anchors) at which to divide the sequences is critical, as it directly affects both accuracy and time cost. We therefore propose a novel algorithm, FMAlign, to improve the performance of multiple nucleotide sequence alignment. FMAlign uses an FM-index to extract long common segments at low cost, rather than a space-consuming hash table. After the optimal longer common segments are found, the sequences are divided at those segments. FMAlign has been tested on virus and bacterial genome datasets and on human mitochondrial genome datasets, and compared with existing MSA methods such as MAFFT, HAlign and FAME. The experiments show that our method outperforms the existing methods in running time and achieves high accuracy on long sequence sets. The results demonstrate that our method is applicable to large-scale nucleotide sequence sets in terms of both sequence length and sequence number. The source code and related data are available at https://github.com/iliuh/FMAlign.
(© The Author(s) 2021. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.)
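The abstract names the FM-index as the structure used to locate common segments. As a rough illustration of that general technique only, the sketch below builds a naive FM-index for one nucleotide sequence and uses backward search to find every occurrence of a candidate segment. It is not taken from the FMAlign source code: the function names (build_fm_index, find_occurrences, occurs_in_all) are hypothetical, the index construction is deliberately simple rather than efficient, and how FMAlign actually selects and ranks anchors is not described in this record.

```python
def build_fm_index(text):
    """Build a naive FM-index for `text` (illustrative, not optimized)."""
    text += "$"  # sentinel; sorts before A, C, G, T
    # Suffix array by direct suffix sorting (fine for short examples only).
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # Burrows-Wheeler transform: the character preceding each sorted suffix.
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # c_table[ch]: number of characters in text strictly smaller than ch.
    c_table, total = {}, 0
    for ch in alphabet:
        c_table[ch] = total
        total += text.count(ch)
    # occ[ch][i]: number of occurrences of ch in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in alphabet}
    for i, ch in enumerate(bwt):
        for a in alphabet:
            occ[a][i + 1] = occ[a][i] + (1 if a == ch else 0)
    return sa, c_table, occ


def find_occurrences(pattern, index):
    """Return sorted start positions of `pattern` via backward search."""
    sa, c_table, occ = index
    lo, hi = 0, len(sa)  # half-open interval of matching suffix-array rows
    for ch in reversed(pattern):
        if ch not in c_table:
            return []
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


def occurs_in_all(segment, sequences):
    """Hypothetical helper: is `segment` present in every sequence?"""
    return all(find_occurrences(segment, build_fm_index(s)) for s in sequences)


if __name__ == "__main__":
    index = build_fm_index("ACGTACGTTACG")
    print(find_occurrences("ACG", index))
    print(occurs_in_all("CGT", ["ACGTA", "TTCGT", "CGTCGT"]))
```

In a divide-and-conquer aligner of the kind the abstract describes, a segment found in every sequence could serve as an anchor at which each sequence is cut, so that the shorter pieces between anchors are aligned separately. The search cost per query depends on the segment length rather than on the sequence length, which is the usual motivation for an FM-index over a hash table of substrings.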