Towards robustness and generalization of point cloud representation: A geometry coding method and a large-scale object-level dataset

Mingye Xu; Zhipeng Zhou; Yali Wang; Yu Qiao

doi:10.1007/s41095-022-0305-5


Research Article | Open Access


Mingye Xu^{1,2}, Zhipeng Zhou^{5}, Yali Wang^{1,4}, Yu Qiao^{1,3}

1 Guangdong–Hong Kong–Macao Joint Laboratory of Human–Machine Intelligence-Synergy Systems, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518000, China

2 University of Chinese Academy of Sciences, Beijing 065001, China

3 Shanghai AI Laboratory, Shanghai 200001, China

4 SIAT Branch, Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen 518000, China

5 Alibaba DAMO Academy, Hangzhou 242332, China


Abstract

Robustness and generalization are two challenging problems for learning point cloud representation. To tackle these problems, we first design a novel geometry coding model, which can effectively use an invariant eigengraph to group points with similar geometric information, even when such points are far from each other. We also introduce a large-scale point cloud dataset, PCNet184. It consists of 184 categories and 51,915 synthetic objects, which brings new challenges for point cloud classification, and provides a new benchmark to assess point cloud cross-domain generalization. Finally, we perform extensive experiments on point cloud classification, using ModelNet40, ScanObjectNN, and our PCNet184, and segmentation, using ShapeNetPart and S3DIS. Our method achieves comparable performance to state-of-the-art methods on these datasets, for both supervised and unsupervised learning. Code and our dataset are available at https://github.com/MingyeXu/PCNet184.
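The grouping idea in the abstract, collecting points that share similar local geometry even when they are spatially distant, can be illustrated with a small NumPy sketch. This is not the authors' implementation: it assumes that the per-point geometric descriptor is the eigenvalue triple of the local covariance matrix (the tables below refer to an "eigenvalue space"), and the function names `knn_indices` and `local_eigenvalues` are hypothetical.

```python
import numpy as np

def knn_indices(features, k):
    """Return, for every row, the indices of its k nearest other rows
    (squared Euclidean distance in the given feature space)."""
    diff = features[:, None, :] - features[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    return np.argsort(dist, axis=1)[:, :k]

def local_eigenvalues(points, k):
    """Eigenvalues (descending) of each point's local covariance matrix,
    estimated from the point and its k Euclidean neighbors."""
    idx = knn_indices(points, k)
    eig = np.empty((points.shape[0], 3))
    for i, nbrs in enumerate(idx):
        patch = points[np.append(nbrs, i)]
        centered = patch - patch.mean(axis=0)
        cov = centered.T @ centered / patch.shape[0]
        eig[i] = np.linalg.eigvalsh(cov)[::-1]  # eigvalsh returns ascending order
    return eig

rng = np.random.default_rng(0)
points = rng.normal(size=(256, 3))          # toy cloud, not a dataset sample
eigvals = local_eigenvalues(points, k=16)
eu_neighbors = knn_indices(points, k=16)    # neighbors that are spatially close
ei_neighbors = knn_indices(eigvals, k=16)   # neighbors with similar local shape
```

Because covariance eigenvalues do not change under rotation or translation of the cloud, neighbors found in this descriptor space are the same for any rigid pose of the object, which is one plausible reading of the "invariant" graph mentioned above.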

Keywords

geometry coding; self-supervised learning; point cloud; classification; segmentation; 3D analysis

References

[1] Alqazzaz; Sun, X. F.; Yang; Nokes. Automated brain tumor segmentation on multi-modal MR image using SegNet. Computational Visual Media Vol. 5, No. 2, 209–219, 2019.

[2] Han, W. K.; Wu; Wen, C. L.; Wang; Li. BLNet: Bidirectional learning network for point clouds. Computational Visual Media Vol. 8, No. 4, 585–596, 2022.

[3] Zhang, J. H.; Wang, Y. L.; Zhou, Z. P.; Luan, T. Y.; Wang; Qiao. Learning dynamical human-joint affinity for 3D pose estimation in videos. IEEE Transactions on Image Processing Vol. 30, 7914–7925, 2021.

[4] Wu, Z. R.; Song, S. R.; Khosla; Yu; Zhang, L. G.; Tang, X. O.; Xiao, J. X. 3D ShapeNets: A deep representation for volumetric shapes. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1912–1920, 2015.

[5] Ma; Qin; You, H. X.; Ran, H. X.; Fu. Rethinking network design and local geometry in point cloud: A simple residual MLP framework. arXiv preprint arXiv:2202.07123, 2022.

[6] Thomas; Qi, C. R.; Deschaud, J. E.; Marcotegui; Goulette; Guibas. KPConv: Flexible and deformable convolution for point clouds. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, 6410–6419, 2019.

[7] Liu, Y. C.; Fan; Xiang, S. M.; Pan, C. H. Relation-shape convolutional neural network for point cloud analysis. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 8887–8896, 2019.

[8] Wang, Y. E.; Sun, Y. B.; Liu, Z. W.; Sarma, S. E.; Bronstein, M. M.; Solomon, J. M. Dynamic graph CNN for learning on point clouds. ACM Transactions on Graphics Vol. 38, No. 5, Article No. 146, 2019.

[9] Rao, Y. M.; Lu, J. W.; Zhou. Global-local bidirectional reasoning for unsupervised representation learning of 3D point clouds. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 5375–5384, 2020.

[10] Qin; You, H. X.; Wang, L. C.; Kuo, C. C. J.; Fu. PointDAN: A multi-scale 3D domain adaption network for point cloud representation. In: Proceedings of the 33rd International Conference on Neural Information Processing Systems, Article No. 646, 7192–7203, 2019.

[11] Xu, M. Y.; Zhou, Z. P.; Qiao. Geometry sharing network for 3D point cloud classification and segmentation. Proceedings of the AAAI Conference on Artificial Intelligence Vol. 34, No. 7, 12500–12507, 2020.

[12] Charles, R. Q.; Hao; Mo, K. C.; Guibas, L. J. PointNet: Deep learning on point sets for 3D classification and segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 77–85, 2017.

[13] Qi, C. R.; Yi; Su; Guibas, L. J. PointNet++: Deep hierarchical feature learning on point sets in a metric space. In: Proceedings of the 31st International Conference on Neural Information Processing Systems, 5105–5114, 2017.

[14] Xu, M. T.; Ding, R. Y.; Zhao, H. S.; Qi, X. J. PAConv: Position adaptive convolution with dynamic kernel assembling on point clouds. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 3172–3181, 2021.

[15] Yang; Xu; Chen; Fu, H. B. View suggestion for interactive segmentation of indoor scenes. Computational Visual Media Vol. 3, No. 2, 131–146, 2017.

[16] Li; Bu; Sun; Wu; Di; Chen. PointCNN: Convolution on X-transformed points. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems, 828–838, 2018.

[17] Wu, W. X.; Qi, Z. A.; Li, F. X. PointConv: Deep convolutional networks on 3D point clouds. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 9613–9622, 2019.

[18] Yan; Zheng, C. D.; Li; Wang; Cui, S. G. PointASNL: Robust point clouds processing using nonlocal neural networks with adaptive sampling. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 5588–5597, 2020.

[19] Bengio; Courville; Vincent. Representation learning: A review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence Vol. 35, No. 8, 1798–1828, 2013.

[20] Bachman; Hjelm, R. D.; Buchwalter. Learning representations by maximizing mutual information across views. In: Proceedings of the 33rd International Conference on Neural Information Processing Systems, Article No. 1392, 15535–15545, 2019.

[21] Doersch; Gupta; Efros, A. A. Unsupervised visual representation learning by context prediction. In: Proceedings of the IEEE International Conference on Computer Vision, 1422–1430, 2015.

[22] Doersch; Zisserman. Multi-task self-supervised visual learning. In: Proceedings of the IEEE International Conference on Computer Vision, 2070–2079, 2017.

[23] Hénaff, O. J.; Srinivas; De Fauw; Razavi; Doersch; Ali Eslami, S. M.; Van Den Oord. Data-efficient image recognition with contrastive predictive coding. In: Proceedings of the 37th International Conference on Machine Learning, Article No. 391, 4182–4192, 2020.

[24] Tian, Y. L.; Krishnan; Isola. Contrastive multiview coding. In: Computer Vision – ECCV 2020. Lecture Notes in Computer Science, Vol. 12356. Vedaldi; Bischof; Brox; Frahm, J. M. Eds. Springer Cham, 776–794, 2020.

[25] Yang, Y. Q.; Feng; Shen, Y. R.; Tian. FoldingNet: Point cloud auto-encoder via deep grid deformation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 206–215, 2018.

[26] Zhao, Y. H.; Birdal; Deng, H. W.; Tombari. 3D point capsule networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 1009–1018, 2019.

[27] Zhang; Kadam; Liu; Kuo, C. C. J. Unsupervised feedforward feature (UFF) learning for point cloud classification and segmentation. In: Proceedings of the IEEE International Conference on Visual Communications and Image Processing, 144–147, 2020.

[28] Hassani; Haley. Unsupervised multi-task feature learning on point clouds. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, 8159–8170, 2019.

[29] Xu, M. Y.; Wang, Y. L.; Zhou, Z. P.; Xu, H. B.; Qiao. CP-net: Contour-perturbed reconstruction network for self-supervised point cloud learning. arXiv preprint arXiv:2201.08215, 2022.

[30] Uy, M. A.; Pham, Q. H.; Hua, B. S.; Nguyen; Yeung, S. K. Revisiting point cloud classification: A new benchmark dataset and classification model on real-world data. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, 1588–1597, 2019.

[31] Yi; Kim, V. G.; Ceylan; Shen, I. C.; Yan, M. Y.; Su; Lu, C. W.; Huang, Q. X.; Sheffer; Guibas. A scalable active framework for region annotation in 3D shape collections. ACM Transactions on Graphics Vol. 35, No. 6, Article No. 210, 2016.

[32] Mo, K. C.; Zhu, S. L.; Chang, A. X.; Yi; Tripathi; Guibas, L. J.; Su. PartNet: A large-scale benchmark for fine-grained and hierarchical part-level 3D object understanding. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 909–918, 2019.

[33] Pan; Chen, X. Y.; Cai, Z. A.; Zhang, J. Z.; Zhao, H. Y.; Yi; Liu, Z. W. Variational relational point completion network. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 8520–8529, 2021.

[34] Zhou; Jiang, Z. Q.; Shui, C. J.; Wang, B. Y.; Chaib-draa. Domain generalization via optimal transport with metric similarity learning. arXiv preprint arXiv:2007.10573, 2020.

[35] Dai; Chang, A. X.; Savva; Halber; Funkhouser; Nießner. ScanNet: Richly-annotated 3D reconstructions of indoor scenes. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2432–2443, 2017.

[36] Xu; Fan; Xu; Zeng; Qiao. SpiderCNN: Deep learning on point sets with parameterized convolutional filters. In: Computer Vision – ECCV 2018. Lecture Notes in Computer Science, Vol. 11212. Ferrari; Hebert; Sminchisescu; Weiss, Eds. Springer Cham, 90–105, 2018.

[37] Guo, M. H.; Cai, J. X.; Liu, Z. N.; Mu, T. J.; Martin, R. R.; Hu, S. M. PCT: Point cloud transformer. Computational Visual Media Vol. 7, No. 2, 187–199, 2021.

[38] Xu, M. T.; Zhang, J. H.; Zhou, Z. P.; Xu, M. Y.; Qi, X. J.; Qiao. Learning geometry-disentangled representation for complementary understanding of 3D object point cloud. Proceedings of the AAAI Conference on Artificial Intelligence Vol. 35, No. 4, 3056–3064, 2021.

[39] Hua, B. S.; Tran, M. K.; Yeung, S. K. Pointwise convolutional neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 984–993, 2018.

[40] Simonovsky; Komodakis. Dynamic edge-conditioned filters in convolutional neural networks on graphs. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 29–38, 2017.

[41] Xie, S. N.; Liu, S. N.; Chen, Z. Y.; Tu, Z. W. Attentional ShapeContextNet for point cloud recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 4606–4615, 2018.

[42] Groh; Wieschollek; Lensch, H. P. A. Flex-convolution (million-scale point-cloud learning beyond grid-worlds). arXiv preprint arXiv:1803.07289, 2018.

[43] Li, J. X.; Chen, B. M.; Lee, G. H. SO-net: Self-organizing network for point cloud analysis. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 9397–9406, 2018.

[44] Shen, Y. R.; Feng; Yang, Y. Q.; Tian. Mining point cloud local structures by kernel correlation and graph pooling. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 4548–4557, 2018.

[45] Gadelha; Wang; Maji. Multiresolution tree networks for 3D point cloud processing. In: Computer Vision – ECCV 2018. Lecture Notes in Computer Science, Vol. 11211. Ferrari; Hebert; Sminchisescu; Weiss, Eds. Springer Cham, 105–122, 2018.

[46] Wang; Samari; Siddiqi. Local spectral graph convolution for point set feature learning. In: Computer Vision – ECCV 2018. Lecture Notes in Computer Science, Vol. 11208. Ferrari; Hebert; Sminchisescu; Weiss, Eds. Springer Cham, 56–71, 2018.

[47] Yang, J. C.; Zhang; Ni, B. B.; Li, L. G.; Liu, J. X.; Zhou, M. D.; Tian. Modeling point clouds with self-attention and gumbel subset sampling. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 3318–3327, 2019.

[48] Klokov; Lempitsky. Escape from cells: Deep kd-networks for the recognition of 3D point cloud models. In: Proceedings of the IEEE International Conference on Computer Vision, 863–872, 2017.

[49] Atzmon; Maron; Lipman. Point convolutional neural networks by extension operators. arXiv preprint arXiv:1803.10091, 2018.

[50] Zhong; Han, X. F. Point cloud learning with transformer. arXiv preprint arXiv:2104.13636, 2021.

[51] Zhang; You, H. X.; Kadam; Liu; Kuo, C. C. J. PointHop: An explainable machine learning method for point cloud classification. IEEE Transactions on Multimedia Vol. 22, No. 7, 1744–1755, 2020.

[52] Kingma, D. P.; Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.

[53] Cortes; Vapnik. Support-vector networks. Machine Learning Vol. 20, No. 3, 273–297, 1995.

[54] Ganin; Ustinova; Ajakan; Germain; Larochelle; Laviolette; Marchand; Lempitsky. Domain-adversarial training of neural networks. In: Domain Adaptation in Computer Vision Applications. Advances in Computer Vision and Pattern Recognition. Csurka, Ed. Springer Cham, 189–209, 2017.

[55] Sauder; Sievers. Self-supervised deep learning on point clouds by reconstructing space. In: Proceedings of the 33rd International Conference on Neural Information Processing Systems, Article No. 1161, 12962–12972, 2019.

[56] Achituve; Maron; Chechik. Self-supervised learning for domain adaptation on point clouds. In: Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 123–133, 2021.

[57] Alliegro; Boscaini; Tommasi. Joint supervised and self-supervised learning for 3D real world challenges. In: Proceedings of the 25th International Conference on Pattern Recognition, 6718–6725, 2020.

[58] Wang, S. L.; Suo; Ma, W. C.; Pokrovsky; Urtasun. Deep parametric continuous convolutional neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2589–2597, 2018.

[59] Komarichev; Zhong, Z. C.; Hua. A-CNN: Annularly convolutional neural networks on point clouds. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 7413–7422, 2019.

[60] Zhao, H. S.; Jiang; Fu, C. W.; Jia, J. Y. PointWeb: Enhancing local neighborhood features for point cloud processing. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 5560–5568, 2019.

[61] Zhang, Z. Y.; Hua, B. S.; Yeung, S. K. ShellNet: Efficient point cloud convolutional neural networks using concentric shells statistics. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, 1607–1616, 2019.

[62] Han, W. K.; Wen, C. L.; Wang; Li. Point2Node: Correlation learning of dynamic-node for point cloud feature modeling. Proceedings of the AAAI Conference on Artificial Intelligence Vol. 34, No. 7, 10925–10932, 2020.

[63] Hu, Q. Y.; Yang; Xie, L. H.; Rosa; Guo, Y. L.; Wang, Z. H.; Trigoni; Markham. RandLA-net: Efficient semantic segmentation of large-scale point clouds. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 11105–11114, 2020.

[64] Lin, Y. Q.; Yan, Z. Z.; Huang, H. B.; Du; Liu, L. G.; Cui, S. G.; Han, X. G. FPConv: Learning local flattening for point convolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 4292–4301, 2020.

[65] Huang, Q. G.; Wang, W. Y.; Neumann. Recurrent slice networks for 3D segmentation of point clouds. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2626–2635, 2018.

[66] Chang, A. X.; Funkhouser; Guibas; Hanrahan; Huang, Q. X.; Li, Z. M.; Savarese; Savva; Song, S. R.; Su; et al. ShapeNet: An information-rich 3D model repository. arXiv preprint arXiv:1512.03012, 2015.

[67] Armeni; Sener; Zamir, A. R.; Jiang; Brilakis; Fischer; Savarese. 3D semantic parsing of large-scale indoor spaces. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1534–1543, 2016.

[68] Boulch. ConvPoint: Continuous convolutions for point cloud processing. Computers & Graphics Vol. 88, 24–34, 2020.

[69] Xu, M. Y.; Zhou, Z. P.; Zhang, J. H.; Qiao. Investigate indistinguishable points in semantic segmentation of 3D point cloud. Proceedings of the AAAI Conference on Artificial Intelligence Vol. 35, No. 4, 3047–3055, 2021.

[70] Chen, Y. L.; Hu, V. T.; Gavves; Mensink; Mettes; Yang, P. W.; Snoek, C. G. M. PointMixup: Augmentation for point clouds. In: Computer Vision – ECCV 2020. Lecture Notes in Computer Science, Vol. 12348. Vedaldi; Bischof; Brox; Frahm, J. M. Eds. Springer Cham, 330–345, 2020.

Computational Visual Media

Volume 10, Issue 1, February 2024

Pages 27-43

DOI: 10.1007/s41095-022-0305-5

Cite this article:

Xu M, Zhou Z, Wang Y, et al. Towards robustness and generalization of point cloud representation: A geometry coding method and a large-scale object-level dataset. Computational Visual Media, 2024, 10(1): 27-43. https://doi.org/10.1007/s41095-022-0305-5


Table 1. Comparison of object-level datasets used for 3D point cloud analysis

Dataset	Year	Type	Objects	Classes	RGB	Task
ModelNet10 [4]	2015	Synthetic	3,556	10	no	Classification, normal estimation, retrieval
ModelNet40 [4]	2015	Synthetic	13,823	40	no	Classification, normal estimation, retrieval
ShapeNetPart [31]	2016	Synthetic	15,023	16	no	Part segmentation
ScanNet [35]	2017	Real, scanned	7,879	10	yes	Classification
ScanObjectNN [30]	2019	Real, scanned	2,902	15	yes	Classification
PartNet [32]	2019	Synthetic	26,671	24	no	Fine grained part segmentation
MVP [33]	2021	Synthetic	4,000	16	no	Completion
PCNet184 (ours)	2022	Synthetic	51,915	184	no	Classification

Table 2. Classification results on PCNet184, using 1k points. For each method, we report whether it is supervised and its classification accuracy

Method	Supervised	Accuracy (%)
PointCNN [16]	✓	56.9
PointNet [12]	✓	58.5
SpiderCNN [36]	✓	57.8
PCT [37]	✓	59.6
PointNet++ [13]	✓	61.9
RSCNN [7]	✓	61.2
DGCNN [8]	✓	61.4
GDANet [38]	✓	61.5
Ours	✓	61.5
Ours	$\times$	60.0

Table 3. Generalization analysis for our method. We trained supervised and unsupervised versions of our method on a source dataset and classified objects in a target dataset. Sup = supervised, Unsup = unsupervised

Task	Sup (%)	Unsup (%)
ModelNet40 $\to$ ScanObjectNN	82.2	81.7
PCNet184 $\to$ ScanObjectNN	83.3	82.3
ScanObjectNN $\to$ ModelNet40	89.4	88.6
PCNet184 $\to$ ModelNet40	92.3	90.8

Table 4. Comparison of generalization ability. Methods were trained on PCNet184; we report the classification accuracy (%) for each target dataset. MN40 = ModelNet40, SONN = ScanObjectNN, SN = ScanNet, SN55 = ShapeNet55

Method	MN40	SONN	SN	SN55
PointNet [12]	48.6	74.2	76.5	83.2
PointNet++ [13]	92.0	82.2	85.4	87.8
DGCNN [8]	91.2	79.9	82.8	86.4
RSCNN [7]	92.9	80.2	80.6	85.4
GDANet [38]	91.6	81.2	88.2	87.0
Ours	92.3	83.3	86.4	87.9

Table 5. Classification accuracy using ModelNet40 for training and testing

Method	Supervised	Input	Accuracy (%)
Pointwise-CNN [39]	✓	1k points	86.1
ECC [40]	✓	1k points	87.1
PointNet [12]	✓	1k points	89.2
SCN [41]	✓	1k points	90.0
Flex-Conv [42]	✓	1k points	90.2
PointNet++ [13]	✓	1k points	90.7
SO-Net [43]	✓	2k points	90.9
KCNet [44]	✓	1k points	91.0
MRTNet [45]	✓	1k points	91.2
Spec-GCN [46]	✓	1k points	91.5
PAT(FPS+GSS) [47]	✓	1k points	91.7
Kd-Net [48]	✓	1k points	91.8
SpiderCNN [36]	✓	1k points	92.2
DGCNN [8]	✓	1k points	92.2
PCNN [49]	✓	1k points	92.3
KPConv [6]	✓	1k points	92.9
RSCNN [7]	✓	1k points	93.6
PCT [37]	✓	1k points	93.2
MLMSPT [50]	✓	1k points	92.9
Ours	✓	1k points	92.9
Ours	✓	2k points	93.3
FoldingNet [25]	$\times$	1k points	88.9
PointCapsNet [26]	$\times$	1k points	88.9
PointHop [51]	$\times$	2k points	89.1
PointHop++ [51]	$\times$	2k points	91.1
MultiTask [28]	$\times$	1k points	89.1
UFF [27]	$\times$	2k points	90.4
Ours	$\times$	1k points	91.2

Table 6. Classification results on the ScanObjectNN dataset (OBJ_ONLY), using 1k input points. All methods were trained on ScanObjectNN

Method	Supervised	Accuracy (%)
PointNet [12]	✓	79.2
SpiderCNN [36]	✓	79.5
PointNet++ [13]	✓	84.3
DGCNN [8]	✓	86.2
PointCNN [16]	✓	85.5
Ours	✓	86.9
Ours	$\times$	82.6

Table 7. Comparison of generalization ability on the PointDA-10 dataset (M = ModelNet, S = ShapeNet, $S^{*}$ = ScanNet). Results are averaged over three runs ( $\pm$ standard error of the mean, SEM)

Method	M $\to$ S	M $\to S^{*}$	S $\to$ M	S $\to S^{*}$	$S^{*} \to$ M	$S^{*} \to$ S
DANN [54]	75.3 $\pm$ 0.6	41.5 $\pm$ 0.2	62.5 $\pm$ 1.4	46.1 $\pm$ 2.8	53.3 $\pm$ 1.2	63.2 $\pm$ 1.2
PointDAN [10]	82.5 $\pm$ 0.8	47.7 $\pm$ 1.0	77.0 $\pm$ 0.3	48.5 $\pm$ 2.1	55.6 $\pm$ 0.6	67.2 $\pm$ 2.7
RS [55]	81.5 $\pm$ 1.2	35.2 $\pm$ 5.9	71.9 $\pm$ 1.4	39.8 $\pm$ 0.7	61.0 $\pm$ 3.3	63.6 $\pm$ 3.4
DAE-Global [28]	83.5 $\pm$ 0.8	42.6 $\pm$ 1.4	74.8 $\pm$ 0.8	45.5 $\pm$ 1.6	64.9 $\pm$ 4.4	67.3 $\pm$ 0.6
DAE-Point	82.5 $\pm$ 0.4	40.2 $\pm$ 1.6	76.4 $\pm$ 0.7	50.2 $\pm$ 0.5	66.3 $\pm$ 1.5	66.1 $\pm$ 0.5
DefRec [56]	83.3 $\pm$ 0.2	46.6 $\pm$ 2.0	79.8 $\pm$ 0.5	49.9 $\pm$ 1.8	70.7 $\pm$ 1.4	64.4 $\pm$ 1.2
Alliegro et al. [57]	81.6 $\pm$ 0.6	49.7 $\pm$ 1.4	73.6 $\pm$ 0.5	41.9 $\pm$ 0.9	65.9 $\pm$ 0.7	68.1 $\pm$ 1.6
Ours	83.2 $\pm$ 0.2	54.9 $\pm$ 1.3	75.1 $\pm$ 0.9	53.6 $\pm$ 1.1	73.3 $\pm$ 0.4	74.0 $\pm$ 0.3
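The "mean $\pm$ SEM over three runs" entries in Table 7 follow a standard recipe, sketched below. The per-run accuracies are hypothetical placeholders, not values from the paper, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the source does not say which estimator was used.

```python
import math
import statistics

def mean_and_sem(values):
    """Return the mean and the standard error of the mean (sample std / sqrt(n))."""
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem

# Hypothetical accuracies (%) from three training runs with different seeds.
runs = [83.0, 83.2, 83.4]
mean, sem = mean_and_sem(runs)
print(f"{mean:.1f} ± {sem:.1f}")
```

With only three runs the SEM itself is a noisy estimate, so small differences between methods whose intervals overlap should be read with care.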

Table 8. Semantic segmentation accuracy (%) on the S3DIS dataset with 6-fold cross validation

Method	mIoU	ceil.	floor	wall	beam	col.	win.	door	table	chair	sofa	bookc.	board	clu.
PointNet [12]	47.8	88.0	88.7	69.3	42.4	23.1	47.5	51.6	54.1	42.0	9.6	38.2	29.4	35.2
PCCN [58]	58.3	92.3	96.2	75.9	0.27	6.0	69.5	63.5	66.9	65.6	47.3	68.9	59.1	46.2
PointCNN [16]	65.4	94.8	97.3	75.8	63.3	51.7	58.4	57.2	69.1	71.6	61.2	39.1	52.2	58.6
A-CNN [59]	62.9	92.4	96.4	79.2	59.5	34.2	56.3	65.0	66.5	78.0	28.5	56.9	48.0	56.8
PointWeb [60]	66.7	93.5	94.2	80.8	52.4	41.3	64.9	68.1	71.4	67.1	50.3	62.7	62.2	58.5
KPConv [6]	70.6	93.6	92.4	83.1	63.9	54.3	66.1	76.6	64.0	57.8	74.9	69.3	61.3	60.3
ShellNet [61]	66.8	90.2	93.6	79.9	60.4	44.1	64.9	52.9	71.6	84.7	53.8	64.6	48.6	59.4
Point2Node [62]	70.0	94.1	97.3	83.4	62.7	52.3	72.3	64.3	75.8	70.8	65.7	49.8	60.3	60.9
RandLA-Net [63]	68.5	92.7	95.6	79.2	61.7	47.0	63.1	67.7	68.9	74.2	55.3	63.4	63.0	58.7
FPConv [64]	68.7	94.8	97.5	82.6	42.8	41.8	58.6	73.4	71.0	81.0	59.8	61.9	64.2	64.2
Ours	70.5	93.8	98.0	81.4	56.0	43.1	64.2	71.8	74.9	77.4	65.8	65.4	64.2	59.9
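For readers unfamiliar with the metric in Table 8, the sketch below computes per-class intersection-over-union (IoU) from predicted and ground-truth labels and averages it over classes. The label arrays are toy data and the helper name `per_class_iou` is hypothetical; how the authors aggregate over the six folds is not stated in this section. As a consistency check, averaging the 13 per-class values in the "Ours" row gives the reported mIoU of 70.5.

```python
import numpy as np

def per_class_iou(pred, target, num_classes):
    """IoU per class: |pred ∩ target| / |pred ∪ target|.
    Classes that appear in neither array are returned as NaN."""
    ious = np.full(num_classes, np.nan)
    for c in range(num_classes):
        inter = np.sum((pred == c) & (target == c))
        union = np.sum((pred == c) | (target == c))
        if union > 0:
            ious[c] = inter / union
    return ious

# Toy per-point labels for a 3-class problem (not data from the paper).
pred = np.array([0, 0, 1, 1, 2, 2])
target = np.array([0, 1, 1, 1, 2, 0])
ious = per_class_iou(pred, target, num_classes=3)
miou = np.nanmean(ious)  # mean over the classes that are present

# Per-class IoUs of the "Ours" row in Table 8 (ceiling ... clutter).
ours_row = [93.8, 98.0, 81.4, 56.0, 43.1, 64.2, 71.8,
            74.9, 77.4, 65.8, 65.4, 64.2, 59.9]
ours_miou = round(sum(ours_row) / len(ours_row), 1)
```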

Table 9. Part segmentation accuracy (%) on the ShapeNetPart dataset. We report the mean IoU (mIoU) across all instances and the IoU for each category

Method	Sup.	mIoU	aero	bag	cap	car	chair	earp.	guitar	knife	lamp	laptop	motor	mug	pistol	rocket	skateb.	table
Kd-Net [48]	✓	82.3	80.1	74.6	74.3	70.3	88.6	73.5	90.2	87.2	81.0	94.9	57.4	86.7	78.1	51.8	69.9	80.3
PointNet [12]	✓	83.7	83.4	78.7	82.5	74.9	89.6	73.0	91.5	85.9	80.8	95.3	65.2	93.0	81.2	57.9	72.8	80.6
SCN [41]	✓	84.6	83.8	80.8	83.5	79.3	90.5	69.8	91.7	86.5	82.9	96.0	69.2	93.8	82.5	62.9	74.4	80.8
SO-Net [43]	✓	84.6	81.9	83.5	84.8	78.1	90.8	72.2	90.1	83.6	82.3	95.2	69.3	94.2	80.0	51.6	72.1	82.6
KCNet [44]	✓	84.7	82.8	81.5	86.4	77.6	90.3	76.8	91.0	87.0	84.5	95.5	69.2	94.4	81.6	60.1	75.2	81.3
RS-Net [65]	✓	84.9	82.7	86.4	84.1	78.2	90.4	69.3	91.4	87.0	83.5	95.4	66.0	92.6	81.8	56.1	75.8	82.2
PointNet++ [13]	✓	85.1	82.4	79.0	87.7	77.3	90.8	71.8	91.0	85.9	83.7	95.3	71.6	94.1	81.3	58.7	76.4	82.6
DGCNN [8]	✓	85.1	84.2	83.7	84.4	77.1	90.9	78.5	91.5	87.3	82.9	96.0	67.8	93.3	82.6	59.7	75.5	82.0
SpiderCNN [36]	✓	85.3	83.5	81.0	87.2	77.5	90.7	76.8	91.1	87.3	83.3	95.8	70.2	93.5	82.7	59.7	75.8	82.8
RS-CNN [7]	✓	86.2	83.5	84.8	88.8	79.6	91.2	81.1	91.6	88.4	86.0	96.0	73.7	94.1	83.4	60.5	77.7	83.6
Ours	✓	85.3	82.9	84.3	88.6	78.4	89.7	78.3	91.7	86.7	81.2	95.6	72.8	94.7	83.1	62.3	81.5	83.8
Ours (100%)	$\times$	82.3	80.3	73.7	82.6	75.4	88.0	73.2	89.7	79.9	77.3	94.9	66.4	91.7	73.2	54.9	79.4	80.7
Ours (50%)	$\times$	82.1	79.6	66.1	79.0	72.8	87.7	70.4	90.1	81.7	77.7	94.6	61.9	93.4	78.4	52.7	78.6	80.9
Ours (10%)	$\times$	80.2	77.0	71.2	72.4	72.3	86.7	61.3	89.2	80.9	73.4	94.3	50.1	91.4	74.1	50.8	74.1	79.7

Table 10. Data augmentation analysis for unsupervised classification on the ModelNet40 dataset

Data augmentation	Accuracy (%)
Without data augmentation	89.1
Rotation	90.7
Translation	90.5
Add normally distributed noise	90.2
Point cloud mixup	90.0
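Three of the augmentations listed in Table 10 are simple geometric transforms and can be sketched as follows. The rotation axis, noise scale `sigma`, and clipping bound `clip` are assumptions chosen for illustration; the source does not give the settings used in the experiments. Point cloud mixup [70] is left out here because it additionally needs a point-to-point assignment between two clouds.

```python
import numpy as np

def rotate_z(points, angle):
    """Rotate an (N, 3) cloud by `angle` radians about the z-axis (assumed up axis)."""
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, -s, 0.0],
                    [s,  c, 0.0],
                    [0.0, 0.0, 1.0]])
    return points @ rot.T

def translate(points, offset):
    """Shift every point by the same 3-vector."""
    return points + np.asarray(offset, dtype=float)

def jitter(points, rng, sigma=0.01, clip=0.05):
    """Add normally distributed noise, clipped so no point moves more than `clip` per axis."""
    noise = np.clip(rng.normal(0.0, sigma, size=points.shape), -clip, clip)
    return points + noise

rng = np.random.default_rng(0)
cloud = rng.normal(size=(1024, 3))  # toy cloud standing in for a 1k-point object
augmented = jitter(translate(rotate_z(cloud, rng.uniform(0.0, 2.0 * np.pi)),
                             rng.uniform(-0.2, 0.2, size=3)), rng)
```

Rotation and translation leave pairwise distances unchanged, so any feature built from local covariance eigenvalues is unaffected by them; only the noise term perturbs such features.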

Table 11. Loss ablation study for unsupervised cross-dataset classification (source: ModelNet40, target: ScanObjectNN)

$L_{recon}$	$L_{normal}$	$L_{L2G}$	$L_{domain}$	Accuracy (%)
✓				73.1
✓	✓			80.7
✓	✓	✓		81.4
✓	✓	✓	✓	81.7

Table 12. Architecture design study. “EU” indicates grouping neighbors in Euclidean space, while “EI” indicates grouping neighbors in eigenvalue space. “FPS” indicates whether the farthest point sampling (FPS) down-sampling strategy is used in the classification network

Model	FPS	$k$-NN space	Points	Accuracy (%)
A	On	EU + EI	1024	92.8
B	Off	EU + EI	1024	92.5
C	On	EU + EI	2048	92.9
D	On	EU	1024	92.6
E	On	EI	1024	92.5
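The FPS column in Table 12 refers to farthest point sampling, a common down-sampling step in point cloud networks. The sketch below shows the standard greedy algorithm in NumPy; it is a generic illustration rather than the authors' code, and the `start` index is an assumption (implementations often pick it at random).

```python
import numpy as np

def farthest_point_sampling(points, num_samples, start=0):
    """Greedily pick `num_samples` indices so that each new point is the one
    farthest from all points selected so far."""
    n = points.shape[0]
    selected = np.empty(num_samples, dtype=np.int64)
    min_dist = np.full(n, np.inf)  # squared distance to the nearest selected point
    current = start
    for i in range(num_samples):
        selected[i] = current
        d = np.sum((points - points[current]) ** 2, axis=1)
        min_dist = np.minimum(min_dist, d)
        current = int(np.argmax(min_dist))
    return selected

# Eleven points on a line: the sampler takes one end, the other end, then the middle.
line = np.stack([np.arange(11.0), np.zeros(11), np.zeros(11)], axis=1)
picked = farthest_point_sampling(line, 3)
```

Compared with uniform random sampling, this keeps thin or sparsely sampled parts of an object represented after down-sampling, at the cost of O(N · num_samples) distance evaluations.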

Table 13. Supervised classification accuracy (%) for different intuitive input features. $i$ denotes the index of the anchor point and $j$ denotes its neighbors’ indices ( $x_{i}$ : coordinates, $s_{i}$ : shape context, $λ$ : eigenvalues, $v$ : eigenvectors, $d$ : Euclidean distance)

Input feature	Channel	Accuracy (%)
$x_{j}$	3	92.5
$λ_{j}$	3	85.2
$s_{i}$	8	91.9
$x_{j} - x_{i}$	3	92.6
$x_{j} - x_{i}, x_{j}$	6	92.7
$x_{j} - x_{i}, x_{j}, λ_{j} - λ_{i}, λ_{j}$	12	92.8
$x_{j} - x_{i}, x_{j}, λ_{j} - λ_{i}, λ_{j}, d_{i j}$	13	92.9
$x_{j} - x_{i}, x_{j}, λ_{j} - λ_{i}, λ_{j}, v_{j} - v_{i}, v_{j}$	30	92.4
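The channel counts in Table 13 can be reproduced by concatenating the listed per-neighbor terms, as the following sketch shows. It assumes 3 coordinates, 3 eigenvalues, and three 3-D eigenvectors flattened to 9 values per point, which matches the 12-, 13-, and 30-channel rows; the function name `edge_features` and the array layout are hypothetical.

```python
import numpy as np

def edge_features(x, lam, vec, nbr_idx, use_dist=False, use_vec=False):
    """Build per-neighbor input features.

    x:       (N, 3) coordinates
    lam:     (N, 3) local eigenvalues
    vec:     (N, 9) flattened local eigenvectors
    nbr_idx: (N, k) neighbor indices j for each anchor i
    Returns an (N, k, C) array.
    """
    xi, li = x[:, None, :], lam[:, None, :]
    xj, lj = x[nbr_idx], lam[nbr_idx]
    parts = [xj - xi, xj, lj - li, lj]                 # 3 + 3 + 3 + 3 = 12 channels
    if use_dist:
        # d_ij: Euclidean distance between anchor and neighbor, 1 channel
        parts.append(np.linalg.norm(xj - xi, axis=-1, keepdims=True))
    if use_vec:
        vi, vj = vec[:, None, :], vec[nbr_idx]
        parts += [vj - vi, vj]                         # 9 + 9 = 18 channels
    return np.concatenate(parts, axis=-1)

rng = np.random.default_rng(0)
n, k = 8, 4
x, lam, vec = rng.normal(size=(n, 3)), rng.random((n, 3)), rng.normal(size=(n, 9))
nbr_idx = rng.integers(0, n, size=(n, k))  # random neighbors, for shape checking only
base = edge_features(x, lam, vec, nbr_idx)
with_dist = edge_features(x, lam, vec, nbr_idx, use_dist=True)
with_vec = edge_features(x, lam, vec, nbr_idx, use_vec=True)
```

The table suggests that the 13-channel variant works best, while adding eigenvectors lowers accuracy; one possible reason is that eigenvector directions, unlike eigenvalues, change under rotation and have a sign ambiguity.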