AI Chat Paper
Note: Please note that the following content is generated by AMiner AI. SciOpen does not take any responsibility related to this content.
{{lang === 'zh_CN' ? '文章概述' : 'Summary'}}
{{lang === 'en_US' ? '中' : 'Eng'}}
Chat more with AI
PDF (923.9 KB)
Collect
Submit Manuscript AI Chat Paper
Show Outline
Outline
Show full outline
Hide outline
Outline
Show full outline
Hide outline
Open Access

Efficient Scheduling Mapping Algorithm for Row Parallel Coarse-Grained Reconfigurable Architecture

School of Computer and Information Science, Anhui Polytechnic University, Wuhu 241000, China
School of Computer Science and Technology, Shanghai University of Electric Power, Shanghai 200090, China
School of Software Engineering, Tongji University, Shanghai 201804, China
Show Author Information

Abstract

Row Parallel Coarse-Grained Reconfigurable Architecture (RPCGRA) has the advantages of maximum parallelism and programmable flexibility. Designing an efficient algorithm to map the diverse applications onto RPCGRA is difficult due to a number of RPCGRA hardware constraints. To solve this problem, the nodes of the data flow graph must be partitioned and scheduled onto the RPCGRA. In this paper, we present a Depth-First Greedy Mapping (DFGM) algorithm that simultaneously considers the communication costs and the use times of the Reconfigurable Cell Array (RCA). Compared with level breadth mapping, the performance of DFGM is better. The percentage of maximum improvement in the use times of RCA is 33% and the percentage of maximum improvement in non-original input and output times is 64.4% (Given Discrete Cosine Transfor 8 (DCT8), and the area of reconfigurable processing unit is 56). Compared with level-based depth mapping, DFGM also obtains the lowest averages of use times of RCA, non-original input and output times, and the reconfigurable time.

References

[1]
J. M. P. Cardoso, P. C. Diniz, and M. Weinhardt, Compiling for reconfigurable computing: A survey, ACM Computing Surveys, vol. 42, no. 4, pp. 13011365, 2010.
[2]
J. W. Yoon, A. Shrivastava, S. Park, M. Ahn, and Y. Paek, A graph drawing based spatial algorithm for coarse-grained reconfigurable architectures, IEEE Transactions on Very Large Scale Integration Systems, vol. 17, no. 11, pp. 15651578, 2009.
[3]
M. Berekovic, A. Kanstein, B. Mei, and B. D. Sutter, Mapping of nomadic multimedia applications on the ADRES reconfigurable array processor, Microprocessors and Microsystems, vol. 33, no. 4, pp. 290294, 2009.
[4]
R. S. Ferreira, J. M. P. Cardoso, A. Damiany, J. Vendramini, and T. Teixeira, Fast placement and routing by extending coarse grained reconfigurable arrays with Omega networks, Journal of Systems Architecture, vol. 57, no. 8, pp. 761777, 2011.
[5]
R. Krishnamoorthy, K. Varadarajan, and S. K. Nandy, Interconnect-topology independent mapping algorithm for a coarse grained reconfigurable architecture, in Proc. of 2011 International Conference on Field Programmable Technology, New Delhi, India, 2011, pp. 15.
[6]
M. Ahn, J. W. Yoon, Y. Paek, Y. Kim, M. Kiemb, and K. Choi, A spatial mapping algorithm for heterogeneous coarse grained reconfigurable architectures, in Proc. of the Conference on Design, Automation and Test in Europe, Munich, Germany, 2006, pp. 363368.
[7]
G. Ansaloni, K. Tanimura, L. Pozzi, and N. Dutt, Integrated kernel partitioning and scheduling for coarse grained reconfigurable arrays, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 31, no. 12, pp. 18031816, 2012.
[8]
G. Lee, K. Choi, and N. D. Dutt, Mapping multi-domain applications onto coarse grained reconfigurable architectures, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 30, no. 5, pp. 637650, 2011.
[9]
M. Jo, D. Lee, K. Han, and K. Choi, Design of a coarse-grained reconfigurable architecture with floating-point support and comparative study, Integration, the VLSI Journal, vol. 47, no. 2, pp. 232241, 2014.
[10]
W. Kim, Y. Choi, and H. park, Fast modulo scheduler utilizing patternized routes for coarse-grained reconfigurable architectures, ACM Transactions on Architecture and Code Optimization, vol. 10, no. 4, pp. 124, 2013.
[11]
N. J. Chen and J. H. Jiang, Mapping algorithm for coarse-grained reconfigurable multimedia architectures, in Proc. of 19th IEEE International Parallel & Distributed Processing Symposium (IPDPS) Workshop, Shanghai, China, 2012, pp. 281286.
[12]
S. D. Yu, Research on the software/hardware co-design for reconfigurable processor, PhD dissertation, School of Information Science and Technology, Tsinghua University, Beijing, China, 2009.
[13]
H. Singh, M.-H. Lee, G. Lu, F. J. Kurdahi, and E. M. C. Filho, MorphoSys: An integrated reconfigurable system for data parallel and computation intensive applications, IEEE Transactions on Computers, vol. 49, no. 5, pp. 465481, 2000.
[14]
N. J. Chen, J. H. Jiang, X. Chen, Z. Zhou, and Y. Xu, An improved level partitioning algorithm considering minimum execution delay and resource restraints, Acta Electronica Sinica, vol. 40, no. 5, pp. 10551066, 2012.
[15]
J. Xiao, Z. H. Shi, W. D. Zhu, J. H. Jiang, Q. W. Zhou, J. Lou, Y. Huang, Q. Ji, and Z. Sun, Uniform non-Bernoulli sequences oriented locating method for reliability-critical gates, Tsinghua Science and Technology, vol. 26, no. 1, pp. 2435, 2021.
[16]
Y. M. Ouyang, Q. Wang, Z. Li, H. G. Liang, and J. Li, Fault-tolerant design for data efficient retransmission in WiNoC, Tsinghua Science and Technology, vol. 26, no.1, pp. 8594, 2021.
[17]
O. Sangyun, L. Hongsik, and L. Jongeun, Efficient execution of stream graphs on coarse-grained reconfigurable architectures, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 36, no. 12, pp. 19781988, 2017.
[18]
I. Bae, B. Harris, H. Min, and B. Egger, Auto-tuning CNNs for coarse-grained reconfigurable array-based accelerators, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 37, no. 11, pp. 23012310, 2018.
[19]
J. Y. Gu, S. Y. Yin, L. B. liu, and S. J. Wei, Stress-aware loops mapping on CGRAs with dynamic multi-map reconfiguration, IEEE Transactions on Parallel and Distributed Systems, vol. 29, no. 9, pp. 21052120, 2018.
Tsinghua Science and Technology
Pages 724-735
Cite this article:
Chen N, Wang Z, He R, et al. Efficient Scheduling Mapping Algorithm for Row Parallel Coarse-Grained Reconfigurable Architecture. Tsinghua Science and Technology, 2021, 26(5): 724-735. https://doi.org/10.26599/TST.2020.9010035

815

Views

57

Downloads

8

Crossref

9

Web of Science

11

Scopus

2

CSCD

Altmetrics

Received: 30 July 2020
Accepted: 08 September 2020
Published: 20 April 2021
© The author(s) 2021

© The author(s) 2021. The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).

Return