| Sign up

PDF (4.4 MB)

Cite

EndNote(RIS) BibTeX

Collect

Collect

Submit Manuscript

Open Access

A Method for Bio-Sequence Analysis Algorithm Development Based on the PAR Platform

Haipeng Shi^¹, Huan Chen^², Qinghong Yang^², Jun Wang^², Haihe Shi^²()

1School of Software, Jiangxi Normal University, Nanchang 330022, China

2School of Computer and Information Engineering, Jiangxi Normal University, Nanchang 330022, China

Show Author Information

Abstract

The problems of biological sequence analysis have great theoretical and practical value in modern bioinformatics. Numerous solving algorithms are used for these problems, and complex similarities and differences exist among these algorithms for the same problem, causing difficulty for researchers to select the appropriate one. To address this situation, combined with the formal partition-and-recur method, component technology, domain engineering, and generic programming, the paper presents a method for the development of a family of biological sequence analysis algorithms. It designs highly trustworthy reusable domain algorithm components and further assembles them to generate specifific biological sequence analysis algorithms. The experiment of the development of a dynamic programming based LCS algorithm family shows the proposed method enables the improvement of the reliability, understandability, and development efficiency of particular algorithms.

Keywords

partition-and-recur (PAR)domain engineering biological sequences feature model component assembly

References

[1]

Y.

Zhao

, R.

Gu

, and S.

Du

, Current status and development trend of bioinformatics research, Journal of Medical Informatics, vol. 33, no. 5, pp. 2–6, 2012.

[2]

S.

Liu

, Y. P.

Wang

, W. N.

Tong

, and S. W.

Wei

, A fast and memory efficient MLCS algorithm by character merging for DNA sequences alignment, Bioinformatics, vol. 36, no. 4, pp. 1066–1073, 2020.

Crossref Google Scholar

[3]

H.

Shi

and X.

Zhang

, Component-based design and assembly of heuristic multiple sequence alignment algorithms, Front. Genet, vol. 11, p. 105, 2020.

Crossref Google Scholar

[4]

S.

Liu

, Design and implementation of first descending squad based flipping sorting algorithm, Master dissertation, School of Computer Science and Technology, Shandong University, Jinan, China, 2006.

[5]

M. D.

Cao

, S. H.

Nguyen

, D.

Ganesamoorthy

, A. G.

Elliott

, M.

Cooper

, and L. J. M.

Coin

, Scaffolding and completing genome assemblies in real-time with nanopore sequencing, Nature Communications, vol. 8, p. 14515, 2017.

Crossref Google Scholar

[6]

R. A.

Wagner

and M. J.

Fischer

, The string-to-string correction problem, Journal of ACM, vol. 21, no. 1, pp. 168–173, 1974.

Crossref Google Scholar

[7]

D. S.

Hirschberg

, Algorithms for the longest common subsequence problem, Journal of the ACM, vol. 24, no. 4, pp. 664–675, 1977.

Crossref Google Scholar

[8]

Y.

Li

, Y.

Wang

, Z.

Zhang

, Y.

Wang

, D.

Ma

, and J.

Huang

, A novel fast and memory efficient parallel MLCS algorithm for long and large-scale sequences alignments, in Proc. 2016 IEEE 32^nd International Conference on Data Engineering, Helsinki, Finland, 2016, pp. 1170–1181.

Crossref Google Scholar

[9]

Y.

Chen

, A.

Wan

, and W.

Liu

, A fast parallel algorithm for finding the longest common sequence of multiple biosequences, BMC Bioinformatics, vol. 7, no. 4, p. S4, 2006.

Crossref Google Scholar

[10]

Z.

Peng

and Y.

Wang

, A novel efficient graph model for the multiple longest common subsequences (MLCS) problem, Frontiers in Genetics, vol. 8, p. 104, 2017.

Crossref Google Scholar

[11]

N.

Nakatsu

, Y.

Kambayashi

, and S.

Yajima

, A longest common subsequence algorithm suitable for similar text strings, Acta Informatica, vol. 18, no. 2, pp. 171–179, 1982.

Crossref Google Scholar

[12]

E. W.

Myers

and W.

Miller

, Optimal alignments in linear space, Computer Applications in the Biosciences: CABIOS, vol. 4, no. 1, pp. 11–17, 1988.

Crossref Google Scholar

[13]

Y. -T.

Tsai

, The constrained longest common subsequence problem, Information Processing Letters, vol. 88, pp. 173–176, 2003.

Crossref Google Scholar

[14]

C. L.

Peng

, An approach for solving the constrained longestcommon subsequence problem, http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/search, 2003.

[15]

A. N.

Arslan

and Ö.

Eǧecioǧlu

, Algorithms for the constrained longest common subsequence problems, International Journal of Foundations of Computer Science, vol. 16, no. 6, pp. 1099–1109, 2005.

Crossref Google Scholar

[16]

Y. C.

Chen

and K. M.

Chao

, On the generalized constrained longest common subsequence problems, Journal of Combinatorial Optimization, vol. 21, no. 3, pp. 383–392, 2011.

Crossref Google Scholar

[17]

J.

Xue

, PAR method and its supporting platform, in Proc. 1st Asian Working Conference on Verified Software (AWCVS 2006), Macao, China, 2006, pp. 29–31.

[18]

J.

Xue

, Y.

Zheng

, Q.

Hu

, Z.

You

, W.

Xie

, and Z.

Cheng

, PAR: A practicable formal method and its supporting platform, in Proc. 20^th International Conference on Formal Engineering Methods (ICFEM), Gold Coast, Australia, 2018, pp. 70–86.

Crossref Google Scholar

[19]

H.

Mili

, F.

Mili

, and A.

Mili

, Reusing software: Issues and research directions, IEEE Trans. on Software Engineering, vol. 21, no. 6, pp. 528–562, 1995.

Crossref Google Scholar

[20]

E. A.

Karlsson

, Software Reuse: A Holistic Approach. New York, NY, USA: John Wiley and Sons Ltd, 1995.

[21]

K. C.

Kang

, S. G.

Cohen

, J. A.

Hess

, W. E.

Novak

, and A. S.

Peterson

, Feature-oriented domain analysis (FODA) feasibility study, Tech. Rep. CMU/SEI-90-TR-21, Carnegie Mellon University, Pittsburgh, PA, USA, 1990.

Big Data Mining and Analytics

Volume 6 Issue 1,
March 2023

Pages 11-20

DOI: 10.26599/BDMA.2022.9020030

Cite this article:

Shi H, Chen H, Yang Q, et al. A Method for Bio-Sequence Analysis Algorithm Development Based on the PAR Platform. Big Data Mining and Analytics, 2023, 6(1): 11-20. https://doi.org/10.26599/BDMA.2022.9020030

About Us

Learn about Open Access

Tsinghua University Press

Publish with Us

Peer Review Policy

Copyright and Licensing

Article Processing Charge

Contact Us

Journal Collaboration: Yao Meng (Ms.)✉️ +86-10-83470574

Technical Support: Kuo Zhao (Mr.)✉️ +86-10-83470507

Media Contact: Hao Jin (Mr.)✉️ +86-10-83470559

Address: Floor 6, Tower B, Xueyan Building, Shuangqing Road, Haidian District, Beijing 100084, China.

SciOpen——中国科技期刊卓越行动计划支持项目

Copyright © 2025 Tsinghua University Press Ltd.

京ICP备 10035462号-42 京公网安备11010802044758号