Correlation-aware probabilistic data summarization for large-scale multi-block scientific data visualization

Yang Yang; Kecheng Lu; Yu Wu; Yunhai Wang; Yi Cao

doi:10.1007/s41095-022-0304-6

Research Article | Open Access

Yang Yang¹, Kecheng Lu², Yu Wu¹, Yunhai Wang², Yi Cao¹

¹ Institute of Applied Physics and Computational Mathematics, Beijing 100094, China

² School of Computer Science and Technology, Shandong University, Qingdao 266237, China

Abstract

In this paper, we propose a correlation-aware probabilistic data summarization technique to efficiently analyze and visualize large-scale multi-block volume data generated by massively parallel scientific simulations. The core of our technique is correlation modeling of distribution representations of adjacent data blocks using copula functions and accurate data value estimation by combining numerical information, spatial location, and correlation distribution using Bayes’ rule. This effectively preserves statistical properties without merging data blocks in different parallel computing nodes and repartitioning them, thus significantly reducing the computational cost. Furthermore, this enables reconstruction of the original data more accurately than existing methods. We demonstrate the effectiveness of our technique using six datasets, with the largest having one billion grid points. The experimental results show that our approach reduces the data storage cost by approximately one order of magnitude compared to state-of-the-art methods while providing a higher reconstruction accuracy at a lower computational cost.
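The copula idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm. It assumes a Gaussian copula, empirical marginals, and paired samples taken from two adjacent blocks. All function names (`pseudo_observations`, `fit_gaussian_copula`, `conditional_sample`) and the synthetic data are hypothetical, and the Bayes' rule estimation step is not covered.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal used by the Gaussian copula


def pseudo_observations(values):
    """Map samples into (0, 1) via their empirical CDF: rank / (n + 1)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    u = [0.0] * len(values)
    for rank, idx in enumerate(order, start=1):
        u[idx] = rank / (len(values) + 1)
    return u


def fit_gaussian_copula(xs, ys):
    """Estimate the copula correlation from paired samples (assumed setup)."""
    zx = [STD_NORMAL.inv_cdf(u) for u in pseudo_observations(xs)]
    zy = [STD_NORMAL.inv_cdf(u) for u in pseudo_observations(ys)]
    # Normal scores of distinct ranks are symmetric around zero,
    # so no explicit mean subtraction is needed here.
    sxy = sum(a * b for a, b in zip(zx, zy))
    sxx = sum(a * a for a in zx)
    syy = sum(b * b for b in zy)
    return sxy / (sxx * syy) ** 0.5


def empirical_quantile(sorted_values, v):
    """Inverse empirical CDF: the sample at probability v in [0, 1]."""
    idx = min(int(v * len(sorted_values)), len(sorted_values) - 1)
    return sorted_values[idx]


def conditional_sample(x, xs, ys, rho, rng):
    """Draw a value for block B given a value x observed in block A."""
    # Empirical CDF of x within block A, kept strictly inside (0, 1).
    u = (sum(1 for a in xs if a <= x) + 0.5) / (len(xs) + 1)
    z_a = STD_NORMAL.inv_cdf(u)
    # Under a Gaussian copula, z_b | z_a ~ N(rho * z_a, 1 - rho^2).
    sigma = max(0.0, 1.0 - rho * rho) ** 0.5
    z_b = rng.gauss(rho * z_a, sigma)
    v = STD_NORMAL.cdf(z_b)
    return empirical_quantile(sorted(ys), v)


# Synthetic, strongly correlated "adjacent blocks" (illustrative data only).
rng = random.Random(0)
block_a = [rng.gauss(0.0, 1.0) for _ in range(2000)]
block_b = [2.0 * a + rng.gauss(0.0, 0.5) for a in block_a]

rho = fit_gaussian_copula(block_a, block_b)
estimate = conditional_sample(1.0, block_a, block_b, rho, rng)
```

Only the marginal distributions and the single parameter `rho` need to be stored in this sketch, which gives a rough feel for why a copula-based summary can stay compact while still capturing dependence between neighbouring blocks.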

Keywords

correlation-awareness; large-scale data; multi-block methods; probabilistic data summarization

Computational Visual Media

Volume 9, Issue 3, September 2023

Pages 513-529

DOI: 10.1007/s41095-022-0304-6

Cite this article:

Yang Y, Lu K, Wu Y, et al. Correlation-aware probabilistic data summarization for large-scale multi-block scientific data visualization. Computational Visual Media, 2023, 9(3): 513-529. https://doi.org/10.1007/s41095-022-0304-6