Specificity-preserving RGB-D saliency detection

Tao Zhou; Deng-Ping Fan; Geng Chen; Yi Zhou; Huazhu Fu

doi:10.1007/s41095-022-0268-6

| Sign up

PDF (7.3 MB)

Cite

EndNote(RIS) BibTeX

Collect

Submit Manuscript

Show Outline

Figures (12)

Fig. 1

Fig. 2

Fig. 3

Fig. 4

Fig. 5

Fig. 6

Fig. 7

Fig. 8

Fig. 9

Tables (7)

Table 1

Table 2

Table 3

Table 4

Table 5

Research Article | Open Access

Specificity-preserving RGB-D saliency detection

Tao Zhou^{¹^,²}, Deng-Ping Fan^³

(), Geng Chen^⁴, Yi Zhou^⁵, Huazhu Fu^⁶

1 School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China

2 Key Laboratory of System Control and Information Processing, Ministry of Education, Shanghai, China

3 Computer Vision Lab, ETH Zürich, Zürich, Switzerland

4 School of Computer Science and Engineering, North-western Polytechnical University, Xi’an, China

5 School of Computer Science and Engineering, SoutheastUniversity, Nanjing, China

6 Inception Institute of Artificial Intelligence, Abu Dhabi, United Arab Emirates

Show Author Information

Graphical Abstract

View original image Download original image

Abstract

Salient object detection (SOD) in RGB and depth images has attracted increasing research interest. Existing RGB-D SOD models usually adopt fusion strategies to learn a shared representation from RGB and depth modalities, while few methods explicitly consider how to preserve modality-specific characteristics. In this study, we propose a novel framework, the specificity-preserving network (SPNet), which improves SOD performance by exploring both the shared information and modality-specific properties. Specifically, we use two modality-specific networks and a shared learning network to generate individual and shared saliency prediction maps. To effectively fuse cross-modal features in the shared learning network, we propose a cross-enhanced integration module (CIM) and propagate the fused feature to the next layer to integrate cross-level information. Moreover, to capture rich complementary multi-modal information to boost SOD performance, we use a multi-modal feature aggregation (MFA) module to integrate the modality-specific features from each individual decoder into the shared decoder. By using skip connections between encoder and decoder layers, hierarchical features can be fully combined. Extensive experiments demonstrate that our SPNet outperforms cutting-edge approaches on six popular RGB-D SOD and three camouflaged object detection benchmarks. The project is publicly available at https://github.com/taozh2017/SPNet.

Keywords

salient object detection (SOD)RGB-D cross-enhanced integration module (CIM)multi-modal feature aggregation (MFA)

References

[1]

Peng,

; Li,

; Xiong,

; Hu,

; Ji,

RGBD salient object detection: A benchmark and algorithms. In: Computer Vision – ECCV 2014. Lecture Notes in Computer Science, Vol. 8691. Fleet,

; Pajdla,

; Schiele,

; Tuytelaars,

Eds. Springer Cham, 92–109, 2014.

[2]

Zhu,

J.-Y.

; Wu,

J.-J.

; Xu,

; Chang,

; Tu,

Z. W.

Unsupervised object class discovery via saliency-guided multiple class learning. IEEE Transactions on Pattern Analysis and Machine Intelligence Vol. 37, No. 4, 862–875, 2015.

Crossref Google Scholar

[3]

Rapantzikos,

; Avrithis,

; Kollias,

Dense saliency-based spatiotemporal feature points for action recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1454–1461, 2009.

Crossref

[4]

Shimoda,

; Yanai,

Distinct class-specific saliency maps for weakly supervised semantic segmentation. In: Computer Vision – ECCV 2016. Lecture Notes in Computer Science, Vol. 9908. Leibe,

; Matas,

; Sebe,

; Welling,

Eds. Springer Cham, 218–234, 2016.

Crossref

[5]

Wang,

W. G.

; Shen,

J. B.

; Yang,

R. G.

; Porikli,

Saliency-aware video object segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence Vol. 40, No. 1, 20–33, 2018.

Crossref Google Scholar

[6]

Zhao,

; Oyang,

; Wang,

Person re-identification by saliency learning. IEEE Transactions on Pattern Analysis and Machine Intelligence Vol. 39, No. 2, 356–370, 2017.

Crossref Google Scholar

[7]

Fan,

D. P.

; Lin,

; Zhang,

; Zhu,

M. L.

; Cheng,

M. M.

Rethinking RGB-D salient object detection: Models, data sets, and large-scale benchmarks. IEEE Transactions on Neural Networks and Learning Systems Vol. 32, No. 5, 2075–2089, 2021.

Crossref Google Scholar

[8]

Zhang,

; Fan,

D.-P.

; Dai,

Y. C.

; Yu,

; Zhong,

Y. R.

; Barnes,

; Shao,

RGB-D saliency detection via cascaded mutual information minimization. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, 4318–4327, 2021.

Crossref

[9]

Liu,

; Zhang,

; Wan,

K. Y.

; Shao,

; Han,

J. W.

Visual saliency transformer. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, 4702–4712, 2021.

Crossref

[10]

Zhou,

; Fan,

D. P.

; Cheng,

M. M.

; Shen,

J. B.

; Shao,

RGB-D salient object detection: A survey. Computational Visual Media Vol. 7, No. 1, 37–69, 2021.

Crossref Google Scholar

[11]

Fu,

K. R.

; Fan,

D. P.

; Ji,

G. P.

; Zhao,

Q. J.

; Shen,

J. B.

; Zhu,

Siamese network for RGB-D salient object detection and beyond. IEEE Transactions on Pattern Analysis and Machine Intelligence , 2021.

Crossref Google Scholar

[12]

Zhang,

; Fan,

D.-P.

; Dai,

Y. C.

; Anwar,

, Saleh,

, Aliakbarian,

; Barnes,

Uncertainty inspired RGB-D saliency detection. IEEE Transactions on Pattern Analysis and Machine Intelligence , 2021.

Crossref Google Scholar

[13]

Chen,

; Li,

Y. F.

; Deng,

Y. J.

; Lin,

G. S.

CNN-based RGB-D salient object detection: Learn, select, and fuse. International Journal of Computer Vision Vol. 129, No. 7, 2076–2096, 2021.

Crossref Google Scholar

[14]

Li,

G. Y.

; Liu,

; Chen,

M. Y.

; Bai,

; Lin,

W. S.

; Ling,

H. B.

Hierarchical alternate interaction network for RGB-D salient object detection. IEEE Transactions on Image Processing Vol. 30, 3528–3542, 2021.

Crossref Google Scholar

[15]

Zhao,

Y. F.

; Zhao,

J. W.

; Li,

; Chen,

X. W.

RGB-D salient object detection with ubiquitous target awareness. IEEE Transactions on Image Processing Vol. 30, 7717–7731, 2021.

Crossref Google Scholar

[16]

Ren,

J. Q.

; Gong,

X. J.

; Lu,

; Zhou,

W. H.

; Yang,

M. Y.

Exploiting global priors for RGB-D saliency detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 25–32, 2015.

Crossref

[17]

Song,

H. K.

; Liu,

; Du,

; Sun,

G. L.

; Le Meur,

; Ren,

T. W.

Depth-aware salient object detection and segmentation via multiscale discriminative saliency fusion and bootstrap learning. IEEE Transactions on Image Processing Vol. 26, No. 9, 4204–4216,2017.

Crossref Google Scholar

[18]

Liu,

Z. Y.

; Shi,

; Duan,

Q. T.

; Zhang,

; Zhao,

Salient object detection for RGB-D image by single stream recurrent convolution neural network. Neurocomputing Vol. 363, 46–57, 2019.

Crossref Google Scholar

[19]

Guo,

J. F.

; Ren,

T. W.

; Bei,

Salient object detection for RGB-D image via saliency evolution. In: Proceedings of the IEEE International Conference on Multimedia and Expo, 1–6, 2016.

Crossref

[20]

Wang,

N. N.

; Gong,

X. J.

Adaptive fusion for RGB-D salient object detection. IEEE Access Vol. 7, 55277–55284, 2019.

Crossref Google Scholar

[21]

Ding,

; Liu,

; Huang,

M. K.

; Shi,

; Wang,

X. Y.

Depth-aware saliency detection using convolutional neural networks. Journal of Visual Communication and Image Representation Vol. 61, 1–9, 2019.

Crossref Google Scholar

[22]

Chen,

; Li,

Y. F.

Progressively complementarity-aware fusion network for RGB-D salient object detection. In: Proceedings of the IEEE/CVF Confe-rence on Computer Vision and Pattern Recognition, 3051–3060, 2018.

Crossref

[23]

Liu,

; Hu,

; Zhang,

; Chen,

Two-stream refinement network for RGB-D saliency detection. In: Proceedings of the IEEE International Conference on Image Processing, 3925–3929, 2019.

Crossref

[24]

Chen,

; Li,

Y. F.

Three-stream attention-aware network for RGB-D salient object detection. IEEE Transactions on Image Processing Vol. 28, No. 6, 2825–2835, 2019.

Crossref Google Scholar

[25]

Han,

J. W.

; Chen,

; Liu,

; Yan,

C. G.

; Li,

X. L.

CNNs-based RGB-D saliency detection via cross-view transfer and multiview fusion. IEEE Transactions on Cybernetics Vol. 48, No. 11, 3171–3183, 2018.

Crossref Google Scholar

[26]

Chen,

; Li,

Y. F.

; Su,

Attention-aware cross-modal cross-level fusion network for RGB-D salient object detection. In: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 6821–6826, 2018.

Crossref

[27]

Ji,

; Li,

J. J.

; Yu,

; Zhang,

; Piao,

Y. R.

; Yao,

S. Y.

; Bi,

; Ma,

; Zheng,

; Lu,

; et al. Calibrated RGB-D salient object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 9466–9476, 2021.

Crossref

[28]

Huang,

; Chen,

H. X.

; Zhou,

; Yang,

Y. Z.

; Liu,

B. Y.

Multi-level cross-modal interaction network for RGB-D salient object detection. Neurocomputing Vol. 452, 200–211, 2021.

Crossref Google Scholar

[29]

Chen,

; Li,

Y. F.

; Su,

Multi-modal fusion network with multi-scale multi-path and cross-modal interactions for RGB-D salient object detection. Pattern Recognition Vol. 86, 376–385, 2019.

Crossref Google Scholar

[30]

Zhao,

J.-X.

; Cao,

; Fan,

D.-P.

; Cheng,

M.-M.

; Li,

X.-Y.

; Zhang,

Contrast prior and fluid pyramid integration for RGBD salient object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 3922–3931, 2019.

Crossref

[31]

Zhu,

C. B.

; Cai,

; Huang,

; Li,

T. H.

; Li,

PDNet: Prior-model guided depth-enhanced network for salient object detection. In: Proceedings of the IEEE International Conference on Multimedia and Expo, 199–204, 2019.

Crossref

[32]

Fan,

D. P.

; Zhai,

; Borji,

; Yang,

; Shao,

BBS-Net: RGB-D salient object detection with a bifurcated backbone strategy network. In: Computer Vision – ECCV 2020. Lecture Notes in Computer Science, Vol. 12357. Vedaldi,

; Bischof,

; Brox,

; Frahm,

J. M.

Eds. Springer Cham, 275–292, 2020.

[33]

Zhai,

Y. J.

; Fan,

D.-P.

; Yang,

J. F.

; Borji,

; Shao,

; Han,

J. W.

; Wang,

Bifurcated backbone strategy for RGB-D salient object detection. IEEE Transactions on Image Processing Vol. 30, 8727–8742, 2021.

Crossref Google Scholar

[34]

Hu,

J. L.

; Lu,

J. W.

; Tan,

Y. P.

Sharable and individual multi-view metric learning. IEEE Transactions on Pattern Analysis and Machine Intelligence Vol. 40, No. 9, 2281–2288, 2018.

Crossref Google Scholar

[35]

Lu,

; Wu,

; Liu,

; Zhang,

; Li,

; Chu,

; Yu,

Cross-modality person re-identification with shared-specific feature transfer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 13376–13386, 2020.

Crossref

[36]

Zhou,

; Zhang,

; Peng,

; Bhaskar,

; Yang,

Dual shared-specific multiview subspace clustering. IEEE Transactions on Cybernetics Vol. 50, No. 8, 3517–3530, 2020.

Crossref Google Scholar

[37]

Zhou,

; Fu,

H. Z.

; Chen,

; Shen,

J. B.

; Shao,

Hi-net: Hybrid-fusion network for multi-modal MR image synthesis. IEEE Transactions on Medical Imaging Vol. 39, No. 9, 2772–2781, 2020.

Crossref Google Scholar

[38]

Ronneberger,

; Fischer,

; Brox,

U-Net: Convolutional networks for biomedical image segmentation. In: Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015. Lecture Notes in Computer Science, Vol. 9351. Navab,

; Hornegger,

; Wells,

; Frangi,

Eds. Springer Cham, 234–241, 2015.

Crossref

[39]

Zhou,

; Fu,

; Chen,

; Zhou,

; Fan,

D.-P.

; Shao,

Specificity-preserving RGB-D saliency detection. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, 4661–4671, 2021.

Crossref

[40]

Zhu,

W. J.

; Liang,

; Wei,

Y. C.

; Sun,

Saliency optimization from robust background detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2814–2821, 2014.

Crossref

[41]

Achanta,

; Hemami,

; Estrada,

; Susstrunk,

Frequency-tuned salient region detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1597–1604, 2009.

Crossref

[42]

Zhou,

; Yang,

Z. H.

; Yuan,

; Zhou,

Z. T.

; Hu,

D. W.

Salient region detection via integrating diffusion-based compactness and local contrast. IEEE Transactions on Image Processing Vol. 24, No. 11, 3308–3320, 2015.

Crossref Google Scholar

[43]

Jiang,

Z. L.

; Davis,

L. S.

Submodular salient region detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2043–2050, 2013.

Crossref

[44]

Hou,

Q. B.

; Cheng,

M. M.

; Hu,

X. W.

; Borji,

; Tu,

Z. W.

; Torr,

P. H. S.

Deeply supervised salient object detection with short connections. IEEE Transactions on Pattern Analysis and Machine Intelligence Vol. 41, No. 4, 815–828, 2019.

Crossref Google Scholar

[45]

Wang,

L. Z.

; Wang,

L. J.

; Lu,

H. C.

; Zhang,

P. P.

; Ruan,

Salient object detection with recurrent fully convolutional networks. IEEE Transactions on Pattern Analysis and Machine Intelligence Vol. 41, No. 7, 1734–1746, 2019.

Crossref Google Scholar

[46]

Liu,

; Han,

; Yang,

PiCANet: Learning pixel-wise contextual attention for saliency detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 3089–3098, 2018.

Crossref

[47]

Deng,

; Hu,

; Zhu,

; Xu,

; Qin,

; Han,

; Heng,

P.-A.

^{3}

Net: Recurrent residual refinement network for saliency detection. In: Proceedings of the 27th International Joint Conference on Artificial Intelligence, 684–690, 2018.

Crossref

[48]

Wang,

; Lai,

; Fu,

; Shen,

; Ling,

; Yang,

Salient object detection in the deep learning era: An in-depth survey. IEEE Transactions on Pattern Analysis and Machine Intelligence Vol. 44, No. 6, 3239–3259, 2022.

Crossref Google Scholar

[49]

Wang,

; Ma,

; Chen,

; You,

Edge preserving and multi-scale contextual neural network for salient object detection. IEEE Transactions on Image Processing Vol. 27, No. 1, 121–134, 2018.

Crossref Google Scholar

[50]

Zhang,

P. P.

; Wang,

; Lu,

H. C.

; Wang,

H. Y.

; Ruan,

Amulet: Aggregating multi-level convolutional features for salient object detection. In: Proceedings of the IEEE International Conference on Computer Vision, 202–211, 2017.

Crossref

[51]

Zhang,

; Dai,

; Lu,

H. C.

; He,

; Wang,

A bi-directional message passing model for salient object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 1741–1750, 2018.

Crossref

[52]

Pang,

Y. W.

; Zhao,

X. Q.

; Zhang,

L. H.

; Lu,

H. C.

Multi-scale interactive network for salient object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 9410–9419, 2020.

Crossref

[53]

Lang,

; Nguyen,

T. V.

; Katti,

; Yadati,

; Kankanhalli,

; Yan,

Depth matters: Influence of depth cues on visual saliency. In: Computer Vision – ECCV 2012. Lecture Notes in Computer Science, Vol. 7573. Fitzgibbon,

; Lazebnik,

; Perona,

; Sato,

; Schmid,

Eds. Springer Berlin Heidelberg, 101–115, 2012.

Crossref

[54]

Ju,

; Ge,

; Geng,

; Ren,

; Wu,

Depth saliency based on anisotropic center-surround difference. In: Proceedings of the IEEE International Conference on Image Processing, 1115–1119, 2014.

Crossref

[55]

Desingh,

; Krishna,

K. M.

; Rajan,

; Jawahar,

C. V.

Depth really matters: Improving visual salient region detection with depth. In: Proceedings of the British Machine Vision Conference, 98.1–98.11, 2013.

Crossref

[56]

Zhu,

C. B.

; Li,

; Wang,

W. M.

; Wang,

R. G.

An innovative salient object detection using center-dark channel prior. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, 1509–1515, 2017.

Crossref

[57]

Liang,

F. F.

; Duan,

L. J.

; Ma,

; Qiao,

Y. H.

; Cai,

; Qing,

L. Y.

Stereoscopic saliency model using contrast and depth-guided-background prior. Neurocomputing Vol. 275, 2227–2238, 2018.

Crossref Google Scholar

[58]

Feng,

; Barnes,

; You,

S. D.

; McCarthy,

Local background enclosure for RGB-D salient object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2343–2350, 2016.

Crossref

[59]

Qu,

L. Q.

; He,

S. F.

; Zhang,

J. W.

; Tian,

J. D.

; Tang,

Y. D.

; Yang,

Q. X.

RGBD salient object detection via deep fusion. IEEE Transactions on Image Processing Vol. 26, No. 5, 2274–2285, 2017.

Crossref Google Scholar

[60]

Piao,

Y. R.

; Ji,

; Li,

J. J.

; Zhang,

; Lu,

H. C.

Depth-induced multi-scale recurrent attention network for saliency detection. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, 7253–7262, 2019.

Crossref

[61]

Li,

C. Y.

; Cong,

R. M.

; Piao,

Y. R.

; Xu,

Q. Q.

; Loy,

C. C.

RGB-D salient object detection with cross-modality modulation and selection. In: Computer Vision – ECCV 2020. Lecture Notes in Computer Science, Vol. 12353. Vedaldi,

; Bischof,

; Brox,

; Frahm,

J. M.

Eds. Springer Cham, 225–241, 2020.

[62]

Li,

G. Y.

; Liu,

; Ye,

L. W.

; Wang,

; Ling,

H. B.

Cross-modal weighting network for RGB-D salient object detection. In: Computer Vision – ECCV 2020. Lecture Notes in Computer Science, Vol. 12362. Vedaldi,

; Bischof,

; Brox,

; Frahm,

J. M.

Eds. Springer Cham, 665–681, 2020.

[63]

Chaudhuri,

; Kakade,

S. M.

; Livescu,

; Sridharan,

Multi-view clustering via canonical correlation analysis. In: Proceedings of the 26th Annual International Conference on Machine Learning, 129–136, 2009.

Crossref

[64]

Ding,

C. X.

; Tao,

D. C.

Robust face recognition via multimodal deep face representation. IEEE Transactions on Multimedia Vol. 17, No. 11, 2049–2058, 2015.

Crossref Google Scholar

[65]

Gönen,

; Alpaydın,

Multiple kernel learning algorithms. The Journal of Machine Learning Research Vol. 12, 2211–2268, 2011.

Google Scholar

[66]

White,

; Yu,

; Zhang,

; Schuurmans,

Convex multi-view subspace learning. In: Proceedings of the 25th International Conference on Neural Information Processing Systems, Vol. 1, 1673–1681, 2012.

[67]

Zhang,

C. Q.

; Hu,

Q. H.

; Fu,

H. Z.

; Zhu,

P. F.

; Cao,

X. C.

Latent multi-view subspace clustering. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 4333–4341, 2017.

Crossref

[68]

Ngiam,

; Khosla,

; Kim,

; Nam,

; Lee,

; Ng,

Multimodal deep learning. In: Proceedings of the 28th International Conference on International Conference on Machine Learning, 689–696, 2011.

[69]

Eitel,

; Springenberg,

J. T.

; Spinello,

; Riedmiller,

; Burgard,

Multimodal deep learning for robust RGB-D object recognition. In: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 681–687, 2015.

Crossref

[70]

Gao,

S. H.

; Cheng,

M. M.

; Zhao,

; Zhang,

X. Y.

; Yang,

M. H.

; Torr,

Res2Net: A new multi-scale backbone architecture. IEEE Transactions on Pattern Analysis and Machine Intelligence Vol. 43, No. 2, 652–662, 2021.

Crossref Google Scholar

[71]

Russakovsky,

; Deng,

; Su,

; Krause,

; Satheesh,

; Ma,

; Huang,

; Karpathy,

; Khosla,

; Bernstein,

; et al. ImageNet large scale visual recognition challenge. International Journal of Computer Vision Vol. 115, No. 3, 211–252, 2015.

Crossref Google Scholar

[72]

Wu,

; Su,

; Huang,

Q. M.

Cascaded partial decoder for fast and accurate salient object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 3902–3911, 2019.

Crossref

[73]

Wei,

; Wang,

; Huang,

^{3}

Net: Fusion, feedback and focus for salient object detection. Proceedings of the AAAI Conference on Artificial Intelligence Vol. 34, No. 7, 12321–12328, 2020.

Crossref Google Scholar

[74]

Cheng,

Y. P.

; Fu,

H. Z.

; Wei,

X. X.

; Xiao,

J. J.

; Cao,

X. C.

Depth enhanced saliency detection method. In: Proceedings of the International Conference on Internet Multimedia Computing and Service, 23–27, 2014.

Crossref

[75]

Li,

; Zhu,

C. B.

A three-pathway psychobiological framework of salient object detection using stereo-scopic technology. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, 3008–3014, 2017.

Crossref

[76]

Niu,

Y. Z.

; Geng,

Y. J.

; Li,

X. Q.

; Liu,

Leveraging stereopsis for saliency analysis. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 454–461, 2012.

[77]

Cheng,

M. M.

; Fan,

D. P.

Structure-measure: A new way to evaluate foreground maps. International Journal of Computer Vision Vol. 129, No. 9, 2622–2638, 2021.

Crossref Google Scholar

[78]

Fan,

D.-P.

; Gong,

; Cao,

; Ren,

; Cheng,

M.-M.

; Borji,

Enhanced-alignment measure for binary foreground map valuation. In: Proceedings of the 27th International Joint Conference on Artificial Intelligence, 698–704, 2018.

Crossref

[79]

Perazzi,

; Krähenbühl,

; Pritch,

; Hornung,

Saliency filters: Contrast based filtering for salient region detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 733–740, 2012.

Crossref

[80]

Cong,

R. M.

; Lei,

J. J.

; Zhang,

C. Q.

; Huang,

Q. M.

; Cao,

X. C.

; Hou,

C. P.

Saliency detection for stereoscopic images based on depth confidence analysis and multiple cues fusion. IEEE Signal Processing Letters Vol. 23, No. 6, 819–823, 2016.

Crossref Google Scholar

[81]

Cong,

R. M.

; Lei,

J. J.

; Fu,

H. Z.

; Hou,

J. H.

; Huang,

Q. M.

; Kwong,

Going from RGB to RGBD saliency: A depth-guided transformation model. IEEE Transactions on Cybernetics Vol. 50, No. 8, 3627–3639, 2020.

Crossref Google Scholar

[82]

Jiang,

; Zhou,

Z. T.

; Wang,

; Tang,

; Luo,

cmSalGAN: RGB-D salient object detection with cross-view generative adversarial networks. IEEE Transactions on Multimedia Vol. 23, 1343–1353, 2021.

Crossref Google Scholar

[83]

Li,

C. Y.

; Cong,

R. M.

; Kwong,

; Hou,

J. H.

; Fu,

H. Z.

; Zhu,

G. P.

; Zhang,

; Huang,

ASIF-Net: Attention steered interweave fusion network for RGB-D salient object detection. IEEE Transactions on Cybernetics Vol. 51, No. 1, 88–100, 2021.

Crossref Google Scholar

[84]

Li,

; Liu,

; Ling,

ICNet: Information conversion network for RGB-D based salient object detection. IEEE Transactions on Image Processing Vol. 29, 4873–4884, 2020.

Crossref Google Scholar

[85]

Piao,

Y. R.

; Rong,

Z. K.

; Zhang,

; Ren,

W. S.

; Lu,

H. C.

A2dele: Adaptive and attentive depth distiller for efficient RGB-D salient object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 9057–9066, 2020.

Crossref

[86]

Liu,

; Zhang,

; Han,

J. W.

Learning selective self-mutual attention for RGB-D saliency detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 13753–13762, 2020.

Crossref

[87]

Zhang,

; Ren,

W. S.

; Piao,

Y. R.

; Rong,

Z. K.

; Lu,

H. C.

Select, supplement and focus for RGB-D saliency detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 3469–3478, 2020.

Crossref

[88]

Pang,

; Zhang,

; Zhao,

; Lu,

Hierarchical dynamic filtering network for RGB-D salient object detection. In: Computer Vision – ECCV 2020. Lecture Notes in Computer Science, Vol. 12370. Vedaldi,

; Bischof,

; Brox,

; Frahm,

J. M.

Eds. Springer Cham, 235–252, 2020.

[89]

Luo,

; Li,

; Yang,

; Jiao,

; Cheng,

; Lyu,

Cascade graph neural networks for RGB-D salient object detection. In: Computer Vision – ECCV 2020. Lecture Notes in Computer Science, Vol. 12357. Vedaldi,

; Bischof,

; Brox,

; Frahm,

J. M.

Eds. Springe Cham, 346–364, 2020.

[90]

Ji,

; Li,

; Zhang,

; Piao,

; Lu,

Accurate RGB-D salient object detection via collaborative learning. In: Computer Vision – ECCV 2020. Lecture Notes in Computer Science, Vol. 12363. Vedaldi,

; Bischof,

; Brox,

; Frahm,

J. M.

Eds. Springer Cham, 52–69, 2020.

[91]

Zhao,

; Zhang,

; Pang,

; Lu,

; Zhang,

A single stream network for robust and real-time RGB-D salient object detection. In: Computer Vision – ECCV 2020. Lecture Notes in Computer Science, Vol. 12367. Vedaldi,

; Bischof,

; Brox,

; Frahm,

J. M.

Eds. Springer Cham, 646–662, 2020.

[92]

Chen,

; Fu,

Progressively guided alternate refinement network for RGB-D salient object detection. In: Computer Vision – ECCV 2020. Lecture Notes in Computer Science, Vol. 12353. Vedaldi,

; Bischof,

; Brox,

; Frahm,

J. M.

Eds. Springer Cham, 520–538, 2020.

[93]

Lin,

T. Y.

; Dollár,

; Girshick,

; He,

K. M.

; Hariharan,

; Belongie,

Feature pyramid networks for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 936–944, 2017.

Crossref

[94]

He,

K. M.

; Gkioxari,

; Dollár,

; Girshick,

Mask R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, 2980–2988, 2017.

[95]

Zhao,

H. S.

; Shi,

J. P.

; Qi,

X. J.

; Wang,

X. G.

; Jia,

J. Y.

Pyramid scene parsing network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 6230–6239, 2017.

Crossref

[96]

Qin,

X. B.

; Zhang,

Z. C.

; Huang,

C. Y.

; Gao,

; Dehghan,

; Jagersand,

BASNet: Boundary-aware salient object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 7471–7481, 2019.

Crossref

[97]

Zhao,

; Wu,

X. Q.

Pyramid feature attention network for saliency detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 3080–3089, 2019.

Crossref

[98]

Wu,

; Su,

; Huang,

Q. M.

Cascaded partial decoder for fast and accurate salient object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 3902–3911, 2019.

Crossref

[99]

Zhao,

J. X.

; Liu,

J. J.

; Fan,

D. P.

; Cao,

; Yang,

J. F.

; Cheng,

M. M.

EGNet: Edge guidance network for salient object detection. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, 8778–8787, 2019.

Crossref

[100]

Fan,

D.-P.

; Ji,

G.-P.

; Cheng,

M.-M.

; Shao,

Concealed object detection. IEEE Transactions on Pattern Analysis and Machine Intelligence , 2021.

Crossref Google Scholar

[101]

Sun,

Y. J.

; Chen,

; Zhou,

; Zhang,

; Liu,

Context-aware cross-level fusion network for camouflaged object detection. In: Proceedings of the 30th International Joint Conference on Artificial Intelligence, 1025–1031, 2021.

Crossref

[102]

Li,

; Dong,

; Rigall,

; Zhou,

; Dong,

J. Y.

; Chen,

Marine animal segmentation. IEEE Transactions on Circuits and Systems for Video Technology Vol. 32, No. 4, 2303–2314, 2022.

Crossref Google Scholar

[103]

Zhang,

; Lv,

; Xiang,

; Li,

; Dai,

; Zhong,

Depth confidence-aware camouflaged object detection. arXiv preprint arXiv:2106.13217, 2021.

Google Scholar

[104]

Le,

T. N.

; Nguyen,

T. V.

; Nie,

Z. L.

; Tran,

M. T.

; Sugimoto,

Anabranch network for camouflaged object segmentation. Computer Vision and Image Understanding Vol. 184, 45–56, 2019.

Crossref Google Scholar

[105]

Fan,

D.-P.

; Ji,

G.-P.

; Sun,

; Cheng,

M.-M.

; Shen,

; Shao,

Camouflaged object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2774–2784, 2020.

Crossref

Computational Visual Media

Volume 9 Issue 2,
June 2023

Pages 297-317

DOI: 10.1007/s41095-022-0268-6

Cite this article:

Zhou T, Fan D-P, Chen G, et al. Specificity-preserving RGB-D saliency detection. Computational Visual Media, 2023, 9(2): 297-317. https://doi.org/10.1007/s41095-022-0268-6

Return

10.1007/s41095-022-0268-6.T001Table 1Benchmarking results using 8 representative traditional models and 23 deep models on six public RGB-D saliency detection datasets using four widely used evaluation metrics: $S_{α}$ [77], max $E_{ϕ}$ [78], max $F_{β}$ [41], and $ℳ$ [79]). $↑, ↓$ indicate that larger or smaller is better. The subscript for each model denotes the publication year. Best results are highlighted in bold

Model	NJU2K [54]				STERE [76]				DES [74]				NLPR [1]				SSD [75]				SIP [7]
Model	$S_{α} ↑$	$F_{β} ↑$	$E_{ϕ} ↑$	$ℳ ↓$	$S_{α} ↑$	$F_{β} ↑$	$E_{ϕ} ↑$	$ℳ ↓$	$S_{α} ↑$	$F_{β} ↑$	$E_{ϕ} ↑$	$ℳ ↓$	$S_{α} ↑$	$F_{β} ↑$	$E_{ϕ} ↑$	$ℳ ↓$	$S_{α} ↑$	$F_{β} ↑$	$E_{ϕ} ↑$	$ℳ ↓$	$S_{α} ↑$	$F_{β} ↑$	$E_{ϕ} ↑$	$ℳ ↓$
LHM $_{14}$ [1]	0.514	0.632	0.724	0.205	0.562	0.683	0.771	0.172	0.562	0.511	0.653	0.114	0.630	0.622	0.766	0.108	0.566	0.568	0.717	0.195	0.511	0.574	0.716	0.184
ACSD $_{14}$ [54]	0.699	0.711	0.803	0.202	0.692	0.669	0.806	0.200	0.728	0.756	0.850	0.169	0.673	0.607	0.780	0.179	0.675	0.682	0.785	0.203	0.732	0.763	0.838	0.172
LBE $_{16}$ [58]	0.695	0.748	0.803	0.153	0.660	0.633	0.787	0.250	0.703	0.788	0.890	0.208	0.762	0.745	0.855	0.081	0.621	0.619	0.736	0.278	0.727	0.751	0.853	0.200
DCMC $_{16}$ [80]	0.686	0.715	0.799	0.172	0.731	0.740	0.819	0.148	0.707	0.666	0.773	0.111	0.724	0.648	0.793	0.117	0.704	0.711	0.786	0.169	0.683	0.618	0.743	0.186
SE $_{16}$ [19]	0.664	0.748	0.813	0.169	0.708	0.755	0.846	0.143	0.741	0.741	0.856	0.090	0.756	0.713	0.847	0.091	0.675	0.710	0.800	0.165	0.628	0.661	0.771	0.164
MDSF $_{17}$ [17]	0.748	0.775	0.838	0.157	0.728	0.719	0.809	0.176	0.741	0.746	0.851	0.122	0.805	0.793	0.885	0.095	0.673	0.703	0.779	0.192	0.717	0.698	0.798	0.167
CDCP $_{17}$ [56]	0.669	0.621	0.741	0.180	0.713	0.664	0.786	0.149	0.709	0.631	0.811	0.115	0.669	0.621	0.741	0.180	0.603	0.535	0.700	0.214	0.595	0.505	0.721	0.224
DTM $_{20}$ [81]	0.706	0.716	0.799	0.190	0.747	0.743	0.837	0.168	0.752	0.697	0.858	0.123	0.733	0.677	0.833	0.145	0.677	0.651	0.773	0.199	0.690	0.659	0.778	0.203
DF $_{17}$ [59]	0.763	0.804	0.864	0.141	0.757	0.757	0.847	0.141	0.752	0.766	0.870	0.093	0.802	0.778	0.880	0.085	0.747	0.735	0.828	0.142	0.653	0.657	0.759	0.185
CTMF $_{18}$ [25]	0.849	0.845	0.913	0.085	0.848	0.831	0.912	0.086	0.863	0.844	0.932	0.055	0.860	0.825	0.929	0.056	0.776	0.729	0.865	0.099	0.716	0.694	0.829	0.139
PCF $_{18}$ [22]	0.877	0.872	0.924	0.059	0.875	0.860	0.925	0.064	0.842	0.804	0.893	0.049	0.874	0.841	0.925	0.044	0.841	0.807	0.894	0.062	0.842	0.838	0.901	0.071
AFNet $_{19}$ [20]	0.772	0.775	0.853	0.100	0.825	0.823	0.887	0.075	0.770	0.729	0.881	0.068	0.799	0.771	0.879	0.058	0.714	0.687	0.807	0.118	0.720	0.712	0.819	0.118
CPFP $_{19}$ [30]	0.878	0.877	0.923	0.053	0.879	0.874	0.925	0.051	0.872	0.846	0.923	0.038	0.888	0.867	0.932	0.036	0.807	0.766	0.852	0.082	0.850	0.851	0.903	0.064
MMCI $_{19}$ [29]	0.859	0.853	0.915	0.079	0.873	0.863	0.927	0.068	0.848	0.822	0.928	0.065	0.856	0.815	0.913	0.059	0.813	0.781	0.882	0.082	0.833	0.818	0.897	0.086
TANet $_{19}$ [24]	0.878	0.874	0.925	0.060	0.871	0.861	0.923	0.060	0.858	0.827	0.910	0.046	0.886	0.863	0.941	0.041	0.839	0.810	0.897	0.063	0.835	0.830	0.895	0.075
DMRA $_{19}$ [60]	0.886	0.886	0.927	0.051	0.886	0.886	0.938	0.047	0.900	0.888	0.943	0.030	0.899	0.879	0.947	0.031	0.857	0.844	0.906	0.058	0.806	0.821	0.875	0.085
cmSalGAN $_{20}$ [82]	0.903	0.896	0.940	0.046	0.900	0.894	0.936	0.050	0.913	0.899	0.943	0.028	0.922	0.907	0.957	0.027	0.791	0.735	0.867	0.086	0.865	0.864	0.906	0.064
ASIFNet $_{20}$ [83]	0.889	0.888	0.927	0.047	0.878	0.878	0.927	0.049	0.934	0.935	0.974	0.019	0.906	0.888	0.944	0.030	0.857	0.834	0.884	0.056	0.857	0.859	0.896	0.061
ICNet $_{20}$ [84]	0.894	0.891	0.926	0.052	0.903	0.898	0.942	0.045	0.920	0.913	0.960	0.027	0.923	0.908	0.952	0.028	0.848	0.841	0.902	0.064	0.854	0.857	0.903	0.069
A2dele $_{20}$ [85]	0.871	0.874	0.916	0.051	0.878	0.879	0.928	0.044	0.886	0.872	0.920	0.029	0.898	0.882	0.944	0.029	0.802	0.776	0.861	0.070	0.828	0.833	0.889	0.070
JLDCF $_{20}$ [11]	0.903	0.903	0.944	0.043	0.905	0.901	0.946	0.042	0.929	0.919	0.968	0.022	0.925	0.916	0.962	0.022	0.830	0.795	0.885	0.068	0.879	0.885	0.923	0.051
S $2$ MA $_{20}$ [86]	0.894	0.889	0.930	0.053	0.890	0.882	0.932	0.051	0.941	0.935	0.973	0.021	0.915	0.902	0.953	0.030	0.868	0.848	0.909	0.052	0.872	0.877	0.919	0.057
UCNet $_{20}$ [12]	0.897	0.895	0.936	0.043	0.903	0.899	0.944	0.039	0.933	0.930	0.976	0.018	0.920	0.903	0.956	0.025	0.865	0.854	0.907	0.049	0.875	0.879	0.919	0.051
SSF $_{20}$ [87]	0.899	0.896	0.935	0.043	0.893	0.890	0.936	0.044	0.904	0.884	0.941	0.026	0.914	0.896	0.953	0.026	0.845	0.824	0.897	0.058	0.876	0.882	0.922	0.052
HDFNet $_{20}$ [88]	0.908	0.911	0.944	0.038	0.900	0.900	0.943	0.041	0.926	0.921	0.970	0.021	0.923	0.917	0.963	0.023	0.879	0.870	0.925	0.045	0.886	0.894	0.930	0.047
Cas-GNN $_{20}$ [89]	0.911	0.903	0.933	0.035	0.899	0.901	0.930	0.039	0.905	0.906	0.947	0.028	0.919	0.904	0.947	0.028	0.872	0.862	0.915	0.047	0.875	0.879	0.919	0.051
CMMS $_{20}$ [61]	0.900	0.897	0.936	0.044	0.895	0.893	0.939	0.043	0.937	0.930	0.976	0.018	0.915	0.896	0.949	0.027	0.874	0.864	0.922	0.046	0.872	0.877	0.911	0.058
CoNet $_{20}$ [90]	0.895	0.893	0.937	0.046	0.908	0.905	0.949	0.040	0.909	0.896	0.945	0.028	0.908	0.887	0.945	0.031	0.853	0.840	0.915	0.059	0.858	0.867	0.913	0.063
DANet $_{20}$ [91]	0.899	0.910	0.935	0.045	0.901	0.892	0.937	0.043	0.924	0.928	0.968	0.023	0.915	0.916	0.953	0.028	0.864	0.866	0.914	0.050	0.875	0.892	0.918	0.054
PGAR $_{20}$ [92]	0.909	0.907	0.940	0.042	0.907	0.898	0.939	0.041	0.913	0.902	0.945	0.026	0.930	0.916	0.961	0.024	0.865	0.838	0.898	0.057	0.876	0.876	0.915	0.055
D $3$ Net $_{21}$ [7]	0.900	0.900	0.950	0.041	0.899	0.891	0.938	0.046	0.898	0.885	0.946	0.031	0.912	0.897	0.953	0.030	0.857	0.834	0.910	0.058	0.860	0.861	0.909	0.063
SPNet (ours)	0.925	0.935	0.954	0.028	0.907	0.915	0.944	0.037	0.945	0.950	0.980	0.014	0.927	0.925	0.959	0.021	0.871	0.883	0.915	0.044	0.894	0.916	0.930	0.043

10.1007/s41095-022-0268-6.T002Table 2Results from our model and 13 state-of-the art methods: CTMFF [25], PCF [22], AFNet [20], MMCI [29], CPFP [30], DMRA [60], TANet [24], A2dele [85], UCNet [12], JLDCF [11], S $2$ MA [86], SSF [87], and D $3$ Net [7]) on the ReDWeb-S dataset

Model	CTMF	PCF	AFNet	MMCI	CPFP	DMRA	TANet	A2dele	UCNet	JLDCF	S $2$ MA	SSF	D $3$ Net	Ours
$S_{α} ↑$	0.641	0.655	0.546	0.660	0.685	0.592	0.656	0.641	0.713	0.734	0.711	0.595	0.689	0.710
$F_{β} ↑$	0.607	0.627	0.549	0.641	0.645	0.579	0.623	0.603	0.710	0.727	0.696	0.558	0.673	0.715
$E_{ϕ} ↑$	0.739	0.743	0.693	0.754	0.744	0.721	0.741	0.672	0.794	0.805	0.781	0.710	0.768	0.800
$ℳ ↓$	0.204	0.166	0.213	0.176	0.142	0.188	0.165	0.160	0.130	0.128	0.139	0.189	0.149	0.129

10.1007/s41095-022-0268-6.T003Table 3Results from our model using different backbone networks

Backbone	NJU2K [54]				STERE [76]				DES [74]				NLPR [1]				SSD [75]				SIP [7]
Backbone	$S_{α} ↑$	$F_{β} ↑$	$E_{ϕ} ↑$	$ℳ ↓$	$S_{α} ↑$	$F_{β} ↑$	$E_{ϕ} ↑$	$ℳ ↓$	$S_{α} ↑$	$F_{β} ↑$	$E_{ϕ} ↑$	$ℳ ↓$	$S_{α} ↑$	$F_{β} ↑$	$E_{ϕ} ↑$	$ℳ ↓$	$S_{α} ↑$	$F_{β} ↑$	$E_{ϕ} ↑$	$ℳ ↓$	$S_{α} ↑$	$F_{β} ↑$	$E_{ϕ} ↑$	$ℳ ↓$
ResNet-50	0.922	0.934	0.952	0.030	0.904	0.914	0.942	0.037	0.936	0.944	0.974	0.016	0.930	0.931	0.965	0.020	0.869	0.876	0.906	0.044	0.896	0.916	0.934	0.041
Res2Net-50	0.925	0.935	0.954	0.028	0.907	0.915	0.944	0.037	0.945	0.950	0.980	0.014	0.927	0.925	0.959	0.021	0.871	0.883	0.915	0.044	0.894	0.916	0.930	0.043

10.1007/s41095-022-0268-6.T004Table 4Comparisons of inference time and model size for different methods

Method	Ours	JLDCF	S2MA
Model size (MB)	175.3	124.5	82.7
Inference time (ms)	91.7	21.8	22.1
Method	UCNet	SSF	HDFNet
Model size (MB)	31.3	32.9	153.2
Inference time (ms)	31.8	45.7	57.1

10.1007/s41095-022-0268-6.T005Table 5Quantitative evaluation for ablation studies

	NJU2K [54]		STERE [76]		DES [74]		NLPR [1]		SSD [75]		SIP [7]
	$S_{α} ↑$	$ℳ ↓$	$S_{α} ↑$	$ℳ ↓$	$S_{α} ↑$	$ℳ ↓$	$S_{α} ↑$	$ℳ ↓$	$S_{α} ↑$	$ℳ ↓$	$S_{α} ↑$	$ℳ ↓$
Ours	0.925	0.028	0.907	0.037	0.945	0.014	0.927	0.021	0.871	0.044	0.894	0.043
A1	0.916	0.034	0.898	0.042	0.939	0.016	0.926	0.022	0.869	0.047	0.892	0.044
A2	0.921	0.031	0.895	0.042	0.938	0.016	0.925	0.022	0.865	0.051	0.896	0.042
A3	0.919	0.032	0.895	0.043	0.938	0.016	0.929	0.020	0.864	0.049	0.887	0.048
A4	0.924	0.029	0.903	0.038	0.930	0.019	0.927	0.023	0.867	0.049	0.888	0.046
B1	0.918	0.034	0.901	0.041	0.939	0.017	0.922	0.024	0.858	0.050	0.885	0.048
B2	0.924	0.029	0.900	0.041	0.941	0.015	0.926	0.022	0.864	0.049	0.893	0.044
B3	0.921	0.031	0.903	0.039	0.938	0.016	0.925	0.022	0.863	0.050	0.891	0.045
C1	0.913	0.037	0.900	0.047	0.935	0.019	0.922	0.025	0.861	0.055	0.880	0.051
C2	0.916	0.034	0.906	0.040	0.923	0.021	0.924	0.022	0.866	0.049	0.882	0.051

10.1007/s41095-022-0268-6.T006Table 6Results for different numbers of CIMs

	NJU2K		STERE		DES		NLPR		SSD		SIP
	$S_{α} ↑$	$ℳ ↓$	$S_{α} ↑$	$ℳ ↓$	$S_{α} ↑$	$ℳ ↓$	$S_{α} ↑$	$ℳ ↓$	$S_{α} ↑$	$ℳ ↓$	$S_{α} ↑$	$M ↓$
${CIM}_{1}$	0.918	0.034	0.908	0.039	0.929	0.019	0.928	0.022	0.865	0.047	0.889	0.046
${CIM}_{3}$	0.920	0.032	0.900	0.041	0.935	0.017	0.928	0.021	0.857	0.049	0.891	0.045
Ours	0.925	0.028	0.907	0.037	0.945	0.014	0.927	0.021	0.871	0.044	0.894	0.043

10.1007/s41095-022-0268-6.T007Table 7Results for camouflaged object detection models on benchmark datasets using evaluation metrics $S_{α}$ [77] and $ℳ$ [79]. $↑, ↓$ indicate that larger or smaller is better

Model	CHAMELEON		CAMO		COD10K
Model	$S_{α} ↑$	$ℳ ↓$	$S_{α} ↑$	$ℳ ↓$	$S_{α} ↑$	$ℳ ↓$
FPN [93]	0.794	0.075	0.684	0.131	0.697	0.075
MaskRCNN [94]	0.643	0.099	0.574	0.151	0.613	0.080
PSPNet [95]	0.773	0.085	0.663	0.139	0.678	0.080
PiCANet [46]	0.769	0.085	0.609	0.156	0.649	0.090
BASNet [96]	0.687	0.118	0.618	0.159	0.634	0.105
PFANet [97]	0.679	0.144	0.659	0.172	0.636	0.128
CPD [98]	0.853	0.052	0.726	0.115	0.747	0.059
EGNet [99]	0.848	0.050	0.732	0.104	0.737	0.056
SINet [100]	0.869	0.044	0.751	0.100	0.771	0.051
DANet [91]	0.874	0.043	0.752	0.100	0.765	0.051
HDFNet [88]	0.875	0.032	0.778	0.085	0.779	0.045
SPNet (ours)	0.895	0.027	0.795	0.082	0.797	0.042