Cross-view matching of local image features is a fundamental task in visual localization and 3D reconstruction. This study proposes FilterGNN, a transformer-based graph neural network (GNN) that aims to improve the efficiency and accuracy of visual-descriptor matching. Motivated by the high sparsity of correct matches and by coarse-to-fine covisible-area detection, FilterGNN uses cascaded optimal graph-matching filter modules to dynamically reject outlier matches. Moreover, we successfully adapt linear attention in FilterGNN with post-instance-normalization support, which significantly reduces the complexity of complete-graph learning from
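The abstract's linear-attention claim rests on a general reordering trick: instead of materializing the full pairwise attention matrix over N keypoints, a kernelized attention computes a small feature-space summary first. The sketch below contrasts the two forms; it is an illustration of the generic technique only, not FilterGNN's actual implementation, and the ReLU+1 feature map `phi` is a simple stand-in assumption (kernel choices vary across the linear-attention literature).

```python
import numpy as np

def softmax_attention(Q, K, V):
    # Standard attention: builds an N x N score matrix, so cost is O(N^2 * d).
    scores = Q @ K.T / np.sqrt(Q.shape[1])
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    return w @ V

def linear_attention(Q, K, V, eps=1e-6):
    # Kernelized attention: phi(Q) @ (phi(K)^T V) costs O(N * d^2),
    # because the d x d summary KV is independent of the sequence length N.
    phi = lambda x: np.maximum(x, 0.0) + 1.0  # positivity-preserving map (illustrative)
    Qp, Kp = phi(Q), phi(K)
    KV = Kp.T @ V                       # d x d summary, computed once
    Z = Qp @ Kp.sum(axis=0) + eps       # per-query normalizer, shape (N,)
    return (Qp @ KV) / Z[:, None]
```

With N keypoints per image and descriptor dimension d << N, the second form scales linearly in N, which is what makes complete-graph message passing over dense keypoint sets tractable.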
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.