Open Access

GPGPU Cloud: A Paradigm for General Purpose Computing

College of Computer Science and Technology, Jilin University, Changchun 130012, China

Abstract

The Kepler General Purpose GPU (GPGPU) architecture was developed to directly support GPU virtualization and make GPGPU cloud computing more broadly applicable by providing general-purpose computing capability in the form of on-demand virtual resources. This paper describes a baseline GPGPU cloud system built on Kepler GPUs, designed to exploit the hardware's potential while improving task performance. It elaborates a general scheme that divides the whole cloud system into three layers: a cloud layer, a server layer, and a GPGPU layer, and it illustrates the hardware features, task features, scheduling mechanism, and execution mechanism of each layer. Together, these elements provide a better understanding of general-purpose computing on a GPGPU cloud.
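As a rough illustration of the three-layer scheme summarized above, the Python sketch below models how a task might flow from the cloud layer through the server layer to the GPGPU layer. This is a minimal sketch under assumed names, not the paper's implementation: the classes (Cloud, Server, GpgpuDevice, Task), the methods (submit, dispatch, execute), and the least-loaded scheduling policy are all illustrative assumptions.

# Hypothetical sketch of the three-layer GPGPU cloud scheme described
# in the abstract; all names and the scheduling policy are illustrative.
from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    """A unit of general-purpose work submitted to the cloud layer."""
    name: str
    workload: int  # abstract cost units


@dataclass
class GpgpuDevice:
    """GPGPU layer: executes tasks on one (possibly virtualized) GPU."""
    device_id: int
    load: int = 0

    def execute(self, task: Task) -> None:
        # Execution-mechanism placeholder: account for the work and run it.
        self.load += task.workload
        print(f"GPU {self.device_id} executing {task.name}")


@dataclass
class Server:
    """Server layer: owns several GPGPUs and dispatches tasks to them."""
    gpus: List[GpgpuDevice]

    def dispatch(self, task: Task) -> None:
        # Scheduling-mechanism placeholder: pick the least-loaded GPU.
        min(self.gpus, key=lambda g: g.load).execute(task)

    def load(self) -> int:
        return sum(g.load for g in self.gpus)


@dataclass
class Cloud:
    """Cloud layer: exposes on-demand virtual resources to users."""
    servers: List[Server]

    def submit(self, task: Task) -> None:
        # Route the task to the least-loaded server.
        min(self.servers, key=lambda s: s.load()).dispatch(task)


if __name__ == "__main__":
    cloud = Cloud(servers=[
        Server(gpus=[GpgpuDevice(0), GpgpuDevice(1)]),
        Server(gpus=[GpgpuDevice(2)]),
    ])
    for i in range(4):
        cloud.submit(Task(name=f"task-{i}", workload=10 * (i + 1)))

Running the script dispatches four tasks, showing how the cloud layer balances load across servers while each server balances load across its GPGPUs; the paper's actual scheduling and execution mechanisms are more elaborate.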

Tsinghua Science and Technology
Pages 22-33
Cite this article:
Hu L, Che X, Xie Z. GPGPU Cloud: A Paradigm for General Purpose Computing. Tsinghua Science and Technology, 2013, 18(1): 22-33. https://doi.org/10.1109/TST.2013.6449404


Received: 10 December 2012
Accepted: 28 December 2012
Published: 07 February 2013
© The author(s) 2013