65 | 2020 | BiANet [116] | TIP | NLPR(0.7k), NJUD(1.485k) | VGG-16 | Uses a bilateral attention module (BAM) to explore rich foreground and background information from depth maps |
66 | 2020 | ASIF-Net [117] | TCYB | NLPR(0.65k), NJUD(1.4k) | VGG-16 | Integrates the attention steered complementarity from RGB-D images and introduces a global semantic constraint using adversarial learning |
67 | 2020 | Triple-Net [118] | SPL | Triple-Net | ResNet-18 | Uses a triple-complementary network for RGB-D based salient object detection |
68 | 2020 | ICNet [42] | TIP | Triple-Net | VGG-16 | Uses a novel information conversion module to fuse high-level RGB and depth features in an interactive and adaptive way |
69 | 2020 | SDF [119] | TIP | NLPR, NJUD, DES, LFSD(1.5k) | VGG-16 | Proposes an exemplar-driven method to estimate relatively trustworthy depth maps, and uses a selective deep saliency fusion network to effectively integrate RGB images, original depths, and newly estimated depths |
70 | 2020 | GFNet [120] | SPL | NLPR(0.8k), NJUD(1.588k) | Res2Net | Designs a gate fusion block to regularize feature fusion |
71 | 2020 | RGBS [121] | MTAP | NLPR(0.65k), NJUD(1.4k) | VGG-16 | Utilizes a GAN to generate the saliency map |
72 | 2020 | D³Net [38] | TNNLS | NLPR(0.7k), NJUD(1.485k) | VGG-16 | Uses a depth purifier unit and a three-stream feature learning module to employ low-quality depth cue filtering and cross-modal feature learning, respectively |
73 | 2020 | JL-DCF [43] | CVPR | NLPR(0.7k), NJUD(1.5k) | VGG-16, ResNet-101 | Uses a joint learning strategy and a densely-cooperative fusion module to achieve better salient object detection performance |
74 | 2020 | A2dele [40] | CVPR | NLPR(0.7k), NJUD(1.485k) | VGG-16 | Employs a depth distiller to explore ways of using network prediction and attention as two bridges to transfer depth knowledge to RGB images |
75 | 2020 | SSF [39] | CVPR | NLPR(0.7k), NJUD(1.485k), DUT(0.8k) | VGG-16 | Designs a complementary interaction module to select useful representations from the RGB and depth images and then integrate cross-modal features |
76 | 2020 | S²MA [41] | CVPR | NLPR(0.65k), NJUD(1.4k) | VGG-16 | Fuses multi-modal information via self-attention and each other's attention strategies, and reweights the mutual attention term to filter out unreliable information |
77 | 2020 | UC-Net [44] | CVPR | NLPR(0.7k), NJUD(1.5k) | VGG-16 | Uses a probabilistic RGB-D saliency detection network via a conditional VAE to generate multiple saliency maps |
78 | 2020 | CMWNet [122] | ECCV | NLPR(0.65k), NJUD(1.4k) | VGG-16 | Exploits feature interactions using three cross-modal cross-scale weighting modules to improve salient object detection performance |
79 | 2020 | HDFNet [123] | ECCV | NLPR(0.7k), NJUD(1.485k), DUT(0.8k) | VGG-16 | Designs a hierarchical dynamic filtering network to effectively make use of cross-modal fusion information |
80 | 2020 | CAS-GNN [124] | ECCV | NLPR(0.65k), NJUD(1.4k) | VGG-16 | Designs cascaded graph neural networks to exploit useful knowledge from RGB and depth images for building powerful feature embeddings |
81 | 2020 | CMMS [125] | ECCV | NLPR(0.7k), NJUD(1.485k) | VGG-16 | Proposes a cross-modality feature modulation module to enhance feature representations and an adaptive feature selection module to gradually select saliency-related features |
82 | 2020 | DANet [126] | ECCV | NLPR(0.65k), NJUD(1.4k) | VGG-16, VGG-19 | Develops a single-stream network combined with a depth-enhanced dual attention to achieve real-time salient object detection |
83 | 2020 | CoNet [127] | ECCV | NLPR(0.7k), NJUD(1.485k), DUT(0.8k) | ResNet | Develops a collaborative learning framework for RGB-D based salient object detection. Three collaborators (edge detection, coarse salient object detection and depth estimation) are utilized to jointly boost the performance |
84 | 2020 | BBS-Net [128] | ECCV | NLPR(0.65k), NJUD(1.4k) | VGG-16, VGG-19, ResNet-50 | Uses a bifurcated backbone strategy to learn teacher and student features, and utilizes a depth-enhanced module to excavate informative parts of depth cues |
85 | 2020 | ATSA [129] | ECCV | NLPR(0.7k), NJUD(1.485k), DUT(0.8k) | VGG-19 | Proposes an asymmetric two-stream architecture taking account of the inherent differences between RGB and depth data for salient object detection |
86 | 2020 | PGAR [130] | ECCV | NLPR(0.7k), NJUD(1.485k) | VGG-16 | Proposes a progressively guided alternate refinement network to produce a coarse initial prediction using a multi-scale residual block |
87 | 2020 | MCINet [131] | arXiv | NLPR(0.65k), NJUD(1.4k) | ResNet-50 | Develops a novel multi-level cross-modal interaction network for RGB-D salient object detection |
88 | 2020 | DRLF [132] | TIP | NLPR(0.65k), NJUD(1.4k) | VGG-16 | Develops a channel-wise fusion network to conduct multi-net and multi-level selective fusion for RGB-D salient object detection |
89 | 2020 | DQAM [133] | arXiv | NLPR(0.65k), NJUD(1.4k) | Without | Proposes a depth quality assessment solution to conduct “quality-aware” salient object detection for RGB-D images |
90 | 2020 | DQSD [134] | TIP | NLPR(0.65k), NJUD(1.4k) | VGG-19 | Integrates a depth quality aware subnet into a bi-stream structure to assess the depth quality before conducting RGB-D fusion |
91 | 2020 | DASNet [135] | ACM MM | NLPR(0.7k), NJUD(1.5k) | ResNet-50 | Proposes a new perspective of containing the depth constraints in the learning process rather than using depths as inputs |
92 | 2020 | DCMF [136] | TIP | NLPR(0.65k), NJUD(1.4k) | VGG-16, ResNet-50 | Designs a disentangled cross-modal fusion network to expose structural and content representations from RGB and depth images |