Managing software packages in a scientific computing environment is a challenging task, especially on heterogeneous systems. Installing and updating software packages in a sophisticated computing environment is error prone, and on-the-fly testing and performance evaluation are troublesome on a production system. In this paper, we discuss a container-based package management scheme. The newly developed method eases maintenance complexity and reduces human mistakes: the self-containment and isolation features of container technologies help maintain software packages across intricately connected clusters. Deploying the SuperComputing application Store (SCStore) over WAN-connected clusters, including some of the world's largest, shows that the scheme greatly reduces the effort of keeping software environments consistent and helps achieve automation.
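The consistency benefit rests on containers being self-contained and content-addressed: when every cluster runs the same immutable image, divergence can be detected by comparing digests instead of auditing individual packages. A minimal sketch of that idea follows; the function names and the digest-comparison workflow are illustrative assumptions, not SCStore's actual interface.

```python
import hashlib

def image_digest(manifest: bytes) -> str:
    """Content-addressed digest of a container image manifest
    (illustrative; real registries compute this the same way)."""
    return "sha256:" + hashlib.sha256(manifest).hexdigest()

def find_drift(reference: str, cluster_digests: dict) -> list:
    """Return the clusters whose deployed image digest differs from
    the reference digest, i.e. whose software environment drifted."""
    return sorted(name for name, d in cluster_digests.items() if d != reference)

# Example: cluster "b" runs an image that no longer matches the reference.
ref = image_digest(b"app-manifest-v1")
deployed = {"cluster-a": ref,
            "cluster-b": image_digest(b"app-manifest-v0"),
            "cluster-c": ref}
print(find_drift(ref, deployed))  # ['cluster-b']
```

Because the comparison touches only digests, the check scales to many WAN-connected sites without transferring package contents.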
Scientific instruments and simulation programs are generating large amounts of multidimensional array data. Queries with value and dimension subsetting conditions are commonly used by scientists to find useful information in big array data, and storage layout and indexing methods play an important role in supporting such queries efficiently. In this paper, we propose SwiftArray, a new storage layout with indexing techniques to accelerate queries with value and dimension subsetting conditions. In SwiftArray, the multidimensional array is divided into blocks, and each block stores its values sorted. Blocks are placed in the order of a Hilbert space-filling curve to improve data locality for dimension subsetting queries. We propose a 2-D-Bin method to build an index over the blocks' value ranges, which efficiently avoids accessing unnecessary blocks for value subsetting queries. Our evaluations show that SwiftArray outperforms the NetCDF-4 format and the FastBit indexing technique for queries on multidimensional arrays.
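The layout described above combines two ideas: Hilbert-curve ordering of blocks preserves spatial locality for dimension subsetting, and a per-block value summary lets value subsetting skip blocks entirely. A toy sketch of both follows; the class and method names are illustrative, and a simple per-block min/max filter stands in for the paper's 2-D-Bin index.

```python
import bisect

def hilbert_d2xy(order, d):
    """Map a distance d along a 2-D Hilbert curve of the given order
    (side length 2**order) to (x, y) block coordinates. Standard
    iterative construction with quadrant rotation."""
    x = y = 0
    t = d
    s = 1
    while s < (1 << order):
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:          # rotate the quadrant
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

class BlockedArray:
    """Toy SwiftArray-style layout: an n-by-n array is split into
    square blocks, each block keeps its values sorted plus a
    (min, max) summary, and blocks are linearised in Hilbert order."""

    def __init__(self, grid, block):
        n = len(grid)                        # assume n x n, n = block * 2**k
        order = (n // block).bit_length() - 1
        self.blocks = []                     # (sorted values, min, max)
        for d in range((n // block) ** 2):
            bx, by = hilbert_d2xy(order, d)  # Hilbert placement of blocks
            vals = sorted(
                grid[bx * block + i][by * block + j]
                for i in range(block) for j in range(block)
            )
            self.blocks.append((vals, vals[0], vals[-1]))

    def value_subset(self, lo, hi):
        """Count elements with lo <= v <= hi, skipping any block whose
        value range cannot intersect the predicate."""
        total = 0
        for vals, bmin, bmax in self.blocks:
            if bmax < lo or bmin > hi:
                continue                     # pruned without touching data
            total += bisect.bisect_right(vals, hi) - bisect.bisect_left(vals, lo)
        return total
```

Within a surviving block, the sorted storage turns the value predicate into two binary searches, so the cost of a value subsetting query is dominated by the blocks that actually overlap the requested range.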
The Gaussian Copula Probability Density Function (PDF) plays an important role in finance, hydrological modeling, biomedical study, and texture retrieval. However, existing schemes for evaluating the Gaussian Copula PDF are all computationally demanding and generally form the most time-consuming part of the corresponding applications. In this paper, we propose an FPGA-based design to accelerate the computation of the Gaussian Copula PDF. Specifically, the evaluation of the Gaussian Copula PDF is mapped onto a fully-pipelined FPGA dataflow engine through three optimization steps: transforming the calculation pattern, eliminating constant computations from hardware logic, and extending calculations to multiple pipelines. In experiments on 10 typical large-scale data sets, our FPGA-based solution achieves up to a 1870-times speedup over a well-tuned single-core CPU-based solution, and a 610-times speedup over a well-optimized parallel quad-core CPU-based solution when processing two-dimensional data.
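For reference, the scalar computation being accelerated has a short closed form in the bivariate case: map the uniform marginals through the inverse normal CDF, then apply the copula density formula. The sketch below shows that baseline computation in plain Python; it illustrates the workload, not the paper's FPGA implementation, and the function name is ours.

```python
from math import exp, sqrt
from statistics import NormalDist

_std = NormalDist()  # standard normal, provides the inverse CDF

def gaussian_copula_pdf(u, v, rho):
    """Bivariate Gaussian copula density c(u, v; rho) for
    0 < u, v < 1 and -1 < rho < 1."""
    x = _std.inv_cdf(u)          # Phi^{-1}(u)
    y = _std.inv_cdf(v)          # Phi^{-1}(v)
    r2 = rho * rho
    return (1.0 / sqrt(1.0 - r2)) * exp(
        -(r2 * (x * x + y * y) - 2.0 * rho * x * y) / (2.0 * (1.0 - r2))
    )
```

The two inverse-CDF evaluations and the exponential dominate the per-point cost, which is why evaluating this density over large data sets is a natural fit for a fully pipelined dataflow engine.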