Regular Paper

OAAFormer: Robust and Efficient Point Cloud Registration Through Overlapping-Aware Attention in Transformer

School of Computer Science and Technology, Shandong University, Qingdao 266237, China
School of Information and Technology, Qingdao University of Science and Technology, Qingdao 266061, China
College of Engineering, Texas A&M University, College Station, TX 77843, U.S.A.

Abstract

In the domain of point cloud registration, the coarse-to-fine feature matching paradigm has received significant attention due to its impressive performance. This paradigm involves a two-step process: first, the extraction of multi-level features, and subsequently, the propagation of correspondences from coarse to fine levels. However, this approach faces two notable limitations. First, the Dual Softmax operation may enforce one-to-one correspondences between superpoints, inadvertently excluding other valuable correspondences. Second, it is crucial to closely examine the overlapping areas between point clouds, as only correspondences within these regions decisively determine the actual transformation. Considering these issues, we propose OAAFormer to enhance correspondence quality. On the one hand, we introduce a soft matching mechanism to facilitate the propagation of potentially valuable correspondences from coarse to fine levels. On the other hand, we integrate an overlapping region detection module to minimize mismatches to the greatest extent possible. Furthermore, we introduce a region-wise attention module with linear complexity during the fine-level matching phase, designed to enhance the discriminative capabilities of the extracted features. Tests on the challenging 3DLoMatch benchmark demonstrate that our approach leads to a substantial increase of about 7% in the inlier ratio, as well as an enhancement of 2%–4% in registration recall. Finally, to accelerate the prediction process, we replace the conventional Random Sample Consensus (RANSAC) algorithm with the selection of a limited yet representative set of high-confidence correspondences, resulting in a 100-fold speedup while still maintaining comparable registration performance.
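To make the first limitation concrete: given a superpoint similarity matrix, the Dual Softmax operation multiplies row-wise and column-wise softmax scores, which sharpens the matrix toward mutual-best (one-to-one) assignments, whereas a soft matching scheme keeps several candidates per superpoint so that borderline correspondences survive to the fine level. The following is a minimal NumPy sketch of this contrast under stated assumptions; the function names and top-k formulation are illustrative, not the paper's actual implementation.

```python
import numpy as np

def dual_softmax(scores):
    # Dual Softmax: element-wise product of row-wise and column-wise
    # softmax, which sharpens scores toward one-to-one assignments.
    row = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
    col = np.exp(scores) / np.exp(scores).sum(axis=0, keepdims=True)
    return row * col

def mutual_argmax_matches(conf):
    # Hard one-to-one correspondences: (i, j) is kept only when i and j
    # are each other's best match. Near-ties are discarded, which is
    # how potentially valuable correspondences get excluded.
    matches = []
    for i in range(conf.shape[0]):
        j = int(conf[i].argmax())
        if int(conf[:, j].argmax()) == i:
            matches.append((i, j))
    return matches

def soft_topk_matches(conf, k=2):
    # Soft matching (illustrative): keep the top-k candidates per
    # superpoint so borderline correspondences can still propagate
    # to the fine-level matching stage.
    matches = []
    for i in range(conf.shape[0]):
        for j in np.argsort(conf[i])[::-1][:k]:
            matches.append((i, int(j)))
    return matches
```

With a score matrix containing a near-tie, the mutual-argmax rule returns at most one match per row, while the top-k rule retains the runner-up as well.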
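On the final point: once a small, representative set of high-confidence correspondences is available, a rigid transformation can be recovered in closed form rather than by RANSAC's hypothesize-and-verify sampling, which is where the speedup comes from. A standard closed-form tool for this is the Kabsch (SVD-based) rigid alignment; the sketch below is a generic illustration of that idea under this assumption, not the paper's actual estimator.

```python
import numpy as np

def kabsch(P, Q):
    # Closed-form rigid transform (R, t) aligning correspondences
    # P[i] -> Q[i], via SVD of the centered cross-covariance matrix
    # (Kabsch algorithm). No hypothesis sampling is needed.
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection (det = -1) in the recovered rotation.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cq - R @ cp
    return R, t
```

Given noise-free correspondences this recovers the ground-truth rotation and translation exactly; with a few high-confidence inliers it serves as a fast drop-in for the RANSAC estimation step.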

Electronic Supplementary Material

Download File(s)
JCST-2402-14165-Highlights.pdf (858.2 KB)

Journal of Computer Science and Technology
Pages 755-770
Cite this article:
Gao J-J, Dong Q-J, Wang R-A, et al. OAAFormer: Robust and Efficient Point Cloud Registration Through Overlapping-Aware Attention in Transformer. Journal of Computer Science and Technology, 2024, 39(4): 755-770. https://doi.org/10.1007/s11390-024-4165-6


Received: 01 February 2024
Accepted: 13 June 2024
Published: 20 September 2024
© Institute of Computing Technology, Chinese Academy of Sciences 2024