AI Chat Paper
Note: Please note that the following content is generated by AMiner AI. SciOpen does not take any responsibility related to this content.
{{lang === 'zh_CN' ? '文章概述' : 'Summary'}}
{{lang === 'en_US' ? '中' : 'Eng'}}
Chat more with AI
PDF (4.4 MB)
Collect
Submit Manuscript AI Chat Paper
Show Outline
Outline
Show full outline
Hide outline
Outline
Show full outline
Hide outline
Open Access

A Method for Bio-Sequence Analysis Algorithm Development Based on the PAR Platform

School of Software, Jiangxi Normal University, Nanchang 330022, China
School of Computer and Information Engineering, Jiangxi Normal University, Nanchang 330022, China
Show Author Information

Abstract

The problems of biological sequence analysis have great theoretical and practical value in modern bioinformatics. Numerous solving algorithms are used for these problems, and complex similarities and differences exist among these algorithms for the same problem, causing difficulty for researchers to select the appropriate one. To address this situation, combined with the formal partition-and-recur method, component technology, domain engineering, and generic programming, the paper presents a method for the development of a family of biological sequence analysis algorithms. It designs highly trustworthy reusable domain algorithm components and further assembles them to generate specifific biological sequence analysis algorithms. The experiment of the development of a dynamic programming based LCS algorithm family shows the proposed method enables the improvement of the reliability, understandability, and development efficiency of particular algorithms.

References

[1]
Y. Zhao, R. Gu, and S. Du, Current status and development trend of bioinformatics research, Journal of Medical Informatics, vol. 33, no. 5, pp. 26, 2012.
[2]
S. Liu, Y. P. Wang, W. N. Tong, and S. W. Wei, A fast and memory efficient MLCS algorithm by character merging for DNA sequences alignment, Bioinformatics, vol. 36, no. 4, pp. 10661073, 2020.
[3]
H. Shi and X. Zhang, Component-based design and assembly of heuristic multiple sequence alignment algorithms, Front. Genet, vol. 11, p. 105, 2020.
[4]
S. Liu, Design and implementation of first descending squad based flipping sorting algorithm, Master dissertation, School of Computer Science and Technology, Shandong University, Jinan, China, 2006.
[5]
M. D. Cao, S. H. Nguyen, D. Ganesamoorthy, A. G. Elliott, M. Cooper, and L. J. M. Coin, Scaffolding and completing genome assemblies in real-time with nanopore sequencing, Nature Communications, vol. 8, p. 14515, 2017.
[6]
R. A. Wagner and M. J. Fischer, The string-to-string correction problem, Journal of ACM, vol. 21, no. 1, pp. 168173, 1974.
[7]
D. S. Hirschberg, Algorithms for the longest common subsequence problem, Journal of the ACM, vol. 24, no. 4, pp. 664675, 1977.
[8]
Y. Li, Y. Wang, Z. Zhang, Y. Wang, D. Ma, and J. Huang, A novel fast and memory efficient parallel MLCS algorithm for long and large-scale sequences alignments, in Proc. 2016 IEEE 32nd International Conference on Data Engineering, Helsinki, Finland, 2016, pp. 11701181.
[9]
Y. Chen, A. Wan, and W. Liu, A fast parallel algorithm for finding the longest common sequence of multiple biosequences, BMC Bioinformatics, vol. 7, no. 4, p. S4, 2006.
[10]
Z. Peng and Y. Wang, A novel efficient graph model for the multiple longest common subsequences (MLCS) problem, Frontiers in Genetics, vol. 8, p. 104, 2017.
[11]
N. Nakatsu, Y. Kambayashi, and S. Yajima, A longest common subsequence algorithm suitable for similar text strings, Acta Informatica, vol. 18, no. 2, pp. 171179, 1982.
[12]
E. W. Myers and W. Miller, Optimal alignments in linear space, Computer Applications in the Biosciences: CABIOS, vol. 4, no. 1, pp. 1117, 1988.
[13]
Y. -T. Tsai, The constrained longest common subsequence problem, Information Processing Letters, vol. 88, pp. 173176, 2003.
[14]
C. L. Peng, An approach for solving the constrained longestcommon subsequence problem, http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/search, 2003.
[15]
A. N. Arslan and Ö. Eǧecioǧlu, Algorithms for the constrained longest common subsequence problems, International Journal of Foundations of Computer Science, vol. 16, no. 6, pp. 10991109, 2005.
[16]
Y. C. Chen and K. M. Chao, On the generalized constrained longest common subsequence problems, Journal of Combinatorial Optimization, vol. 21, no. 3, pp. 383392, 2011.
[17]
J. Xue, PAR method and its supporting platform, in Proc. 1st Asian Working Conference on Verified Software (AWCVS 2006), Macao, China, 2006, pp. 2931.
[18]
J. Xue, Y. Zheng, Q. Hu, Z. You, W. Xie, and Z. Cheng, PAR: A practicable formal method and its supporting platform, in Proc. 20th International Conference on Formal Engineering Methods (ICFEM), Gold Coast, Australia, 2018, pp. 7086.
[19]
H. Mili, F. Mili, and A. Mili, Reusing software: Issues and research directions, IEEE Trans. on Software Engineering, vol. 21, no. 6, pp. 528562, 1995.
[20]
E. A. Karlsson, Software Reuse: A Holistic Approach. New York, NY, USA: John Wiley and Sons Ltd, 1995.
[21]
K. C. Kang, S. G. Cohen, J. A. Hess, W. E. Novak, and A. S. Peterson, Feature-oriented domain analysis (FODA) feasibility study, Tech. Rep. CMU/SEI-90-TR-21, Carnegie Mellon University, Pittsburgh, PA, USA, 1990.
Big Data Mining and Analytics
Pages 11-20
Cite this article:
Shi H, Chen H, Yang Q, et al. A Method for Bio-Sequence Analysis Algorithm Development Based on the PAR Platform. Big Data Mining and Analytics, 2023, 6(1): 11-20. https://doi.org/10.26599/BDMA.2022.9020030

791

Views

69

Downloads

2

Crossref

2

Web of Science

3

Scopus

0

CSCD

Altmetrics

Received: 28 June 2022
Revised: 12 August 2022
Accepted: 17 August 2022
Published: 24 November 2022
© The author(s) 2023.

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).

Return