PIM-Align: A Processing-in-Memory Architecture for FM-Index Search Algorithm

Xue-Qi Li; Guang-Ming Tan; Ning-Hui Sun

doi:10.1007/s11390-020-0825-3

Regular Paper

State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China

University of Chinese Academy of Sciences, Beijing 100049, China

Abstract

Genomic sequence alignment is the most critical and time-consuming step in genomic analysis. Alignment algorithms generally follow a seed-and-extend model. Acceleration of the extension phase (e.g., the Smith-Waterman algorithm) has been well explored in computing-centric architectures on field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), and graphics processing units (GPUs). Compared with the extension phase, the seeding phase is more critical and essential. However, the seeding phase is memory-bound: on conventional systems it suffers from fine-grained random memory accesses and limited parallelism. In this paper, we argue that the processing-in-memory (PIM) concept could be a viable solution to these problems. This paper describes PIM-Align, an application-driven near-data processing architecture for sequence alignment. To achieve performance proportional to memory capacity by taking advantage of 3D-stacked dynamic random access memory (DRAM) technology, we propose a lightweight message mechanism between different memory partitions and a specialized hardware prefetcher for the memory access patterns of sequence alignment. Our evaluation shows that the proposed architecture achieves 20x and 1820x speedups over the best available ASIC implementation and the software running on a 32-thread CPU, respectively.
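
To make the memory-bound nature of seeding concrete, the following is a minimal sketch of textbook FM-index backward search, the seeding kernel named in the title. It is not the PIM-Align design or the authors' code. The function names (`build_fm_index`, `backward_search`), the naive suffix-array construction, and the uncompressed occurrence table are all illustrative assumptions chosen for readability.

```python
def build_fm_index(text):
    """Build a toy FM-index (suffix array, C table, Occ table); '$' is the sentinel."""
    text += "$"
    # Naive suffix array: fine for a toy example, far too slow for a real genome.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT character = the character preceding each sorted suffix
    # (text[-1] is '$' when i == 0).
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # c_table[c]: number of characters in text strictly smaller than c.
    c_table, total = {}, 0
    for ch in alphabet:
        c_table[ch] = total
        total += text.count(ch)
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in alphabet}
    for i, b in enumerate(bwt):
        for ch in alphabet:
            occ[ch][i + 1] = occ[ch][i] + (b == ch)
    return sa, c_table, occ


def backward_search(pattern, c_table, occ, n):
    """Return the half-open suffix-array interval [lo, hi) of suffixes
    that start with pattern, or (0, 0) if the pattern does not occur."""
    lo, hi = 0, n
    for ch in reversed(pattern):
        if ch not in c_table:
            return 0, 0
        # Each step reads occ at two data-dependent positions; the next
        # positions are unknown until these reads complete.
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return 0, 0
    return lo, hi


if __name__ == "__main__":
    sa, c_table, occ = build_fm_index("ACGTACGT")
    lo, hi = backward_search("CGT", c_table, occ, len(sa))
    print(sorted(sa[lo:hi]))  # start positions of "CGT" in the text
```

Each loop iteration issues two small lookups whose addresses depend on the previous iteration's result, so on a genome-sized index the accesses are serialized and scattered across memory. This is the fine-grained random access pattern the abstract refers to.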

Keywords

accelerator design; genomic sequence alignment; near-memory computing

Electronic Supplementary Material

jcst-36-1-56-Highlights.pdf (917.7 KB)
