Open Access

Dynamic Scene Graph Generation of Point Clouds with Structural Representation Learning

School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China
Standard and Metrology Research Institute, China Academy of Railway Sciences Corporation Limited, Beijing 100081, China

Abstract

Scene graphs of point clouds help to understand object-level relationships in 3D space. Most scene graph generation methods operate on structured 2D data and cannot be applied directly to unstructured 3D point clouds. Existing point-cloud-based methods generate the scene graph from an additional graph structure that requires labor-intensive manual annotation. To address these problems, we explore a method that converts point clouds into structured data and generates graphs without a given structure. Specifically, we cluster points with similar augmented features into groups and establish relationships between the groups, which yields an initial structural representation of the point cloud. In addition, we propose a Dynamic Graph Generation Network (DGGN) to predict the semantic labels of targets at different granularities. It dynamically splits and merges point groups, producing a scene graph with high precision. Experiments show that our methods outperform the baseline methods and output reliable graphs describing object-level relationships without additional manually labeled data.
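The grouping step described in the abstract can be illustrated with a toy sketch. The abstract does not specify the clustering algorithm, the augmented features, or how relationships between groups are established, so everything below is an assumption made for illustration: the function names `cluster_points` and `group_edges`, the thresholds `radius`, `feat_eps`, and `link_radius`, and the use of a simple union-find over point pairs are all hypothetical and are not the authors' DGGN.

```python
from itertools import combinations
from math import dist


def cluster_points(points, features, radius, feat_eps):
    """Group points that are spatially close AND have similar features.

    Hypothetical stand-in for the paper's clustering step: a union-find
    over all point pairs. It is O(n^2), so suitable for toy inputs only.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        close = dist(points[i], points[j]) <= radius
        similar = dist(features[i], features[j]) <= feat_eps
        if close and similar:
            parent[find(i)] = find(j)

    # Relabel roots as consecutive group ids 0, 1, 2, ...
    roots = sorted({find(i) for i in range(len(points))})
    relabel = {root: k for k, root in enumerate(roots)}
    return [relabel[find(i)] for i in range(len(points))]


def group_edges(points, labels, link_radius):
    """Relate two groups if any pair of their points lies within link_radius."""
    edges = set()
    for i, j in combinations(range(len(points)), 2):
        if labels[i] != labels[j] and dist(points[i], points[j]) <= link_radius:
            edges.add((min(labels[i], labels[j]), max(labels[i], labels[j])))
    return sorted(edges)


# Toy scene: four points along a line plus one distant point.
# The one-dimensional feature is an assumed placeholder (e.g. a surface type).
points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.2, 0.0, 0.0),
          (0.3, 0.0, 0.0), (5.0, 5.0, 5.0)]
features = [(0.0,), (0.0,), (1.0,), (1.0,), (0.0,)]

labels = cluster_points(points, features, radius=0.15, feat_eps=0.5)
edges = group_edges(points, labels, link_radius=0.15)
print(labels)
print(edges)
```

In this sketch the groups play the role of graph nodes and the group-to-group links play the role of initial edges. A learned model would then have to refine both, for example by splitting or merging groups, which the sketch does not attempt.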

Tsinghua Science and Technology
Pages 232-243
Cite this article:
Qi C, Yin J, Zhang Z, et al. Dynamic Scene Graph Generation of Point Clouds with Structural Representation Learning. Tsinghua Science and Technology, 2024, 29(1): 232-243. https://doi.org/10.26599/TST.2023.9010002


Received: 25 August 2022
Revised: 09 November 2022
Accepted: 06 January 2023
Published: 21 August 2023
© The author(s) 2024.

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
