[5] Z. Huang, H. Dai, T. Z. Xiang, S. Wang, H. X. Chen, J. Qin, and H. Xiong, Feature shrinkage pyramid for camouflaged object detection with transformers, in Proc. 2023 IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), Vancouver, Canada, 2023, pp. 5557–5566.
[6] Z. Wu, D. P. Paudel, D. P. Fan, J. Wang, S. Wang, C. Demonceaux, R. Timofte, and L. Van Gool, Source-free depth for object pop-out, arXiv preprint arXiv: 2212.05370, 2022.
[7] Y. Zhong, B. Li, L. Tang, S. Kuang, S. Wu, and S. Ding, Detecting camouflaged object in frequency domain, in Proc. 2022 IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 2022, pp. 4494–4503.
[8] H. A. Qadir, Y. Shin, J. Solhusvik, J. Bergsland, L. Aabakken, and I. Balasingham, Polyp detection and segmentation using Mask R-CNN: Does a deeper feature extractor CNN always perform better? in Proc. 2019 13th Int. Symp. on Medical Information and Communication Technology (ISMICT), Oslo, Norway, 2019, pp. 1–6.
[10] B. Dong, W. Wang, D. P. Fan, J. Li, H. Fu, and L. Shao, Polyp-PVT: Polyp segmentation with pyramid vision transformers, arXiv preprint arXiv: 2108.06932, 2021.
[13] D. P. Fan, G. P. Ji, G. Sun, M. M. Cheng, J. Shen, and L. Shao, Camouflaged object detection, in Proc. 2020 IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 2020, pp. 2774–2784.
[14] Y. Pang, X. Zhao, T. Z. Xiang, L. Zhang, and H. Lu, Zoom in and out: A mixed-scale triplet network for camouflaged object detection, in Proc. 2022 IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 2022, pp. 2150–2160.
[15] A. Li, J. Zhang, Y. Lv, B. Liu, T. Zhang, and Y. Dai, Uncertainty-aware joint salient object and camouflaged object detection, in Proc. 2021 IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 2021, pp. 10066–10076.
[16] H. Mei, G. P. Ji, Z. Wei, X. Yang, X. Wei, and D. P. Fan, Camouflaged object segmentation with distraction mining, in Proc. 2021 IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 2021, pp. 8768–8777.
[17] Y. Sun, G. Chen, T. Zhou, Y. Zhang, and N. Liu, Context-aware cross-level fusion network for camouflaged object detection, in Proc. 30th Int. Joint Conf. Artificial Intelligence (IJCAI-21), virtual, 2021, pp. 1025–1031.
[18] Q. Jia, S. Yao, Y. Liu, X. Fan, R. Liu, and Z. Luo, Segment, magnify and reiterate: Detecting camouflaged objects the hard way, in Proc. 2022 IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 2022, pp. 4703–4712.
[19] B. Dai and D. Lin, Contrastive learning for image captioning, in Proc. 31st Conf. Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 2017, pp. 898–907.
[20] Y. Zhang, H. Jiang, Y. Miura, C. D. Manning, and C. Langlotz, Contrastive learning of medical visual representations from paired images and text, in Proc. 9th Int. Conf. Learning Representations (ICLR), virtual, 2021.
[21] M. Kang and J. Park, ContraGAN: Contrastive learning for conditional image generation, in Proc. 34th Conf. Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada, 2020.
[22] T. Chen, S. Kornblith, M. Norouzi, and G. Hinton, A simple framework for contrastive learning of visual representations, arXiv preprint arXiv: 2002.05709, 2020.
[23] Y. Tian, C. Sun, B. Poole, D. Krishnan, and P. Isola, What makes for good views for contrastive learning? in Proc. 34th Conf. Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada, 2020.
[24] P. Khosla, P. Teterwak, C. Wang, A. Sarna, Y. Tian, P. Isola, A. Maschinot, C. Liu, and D. Krishnan, Supervised contrastive learning, in Proc. 34th Conf. Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada, 2020.
[25] R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Ommer, High-resolution image synthesis with latent diffusion models, in Proc. 2022 IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 2022, pp. 10674–10685.
[26] J. Ho, A. Jain, and P. Abbeel, Denoising diffusion probabilistic models, in Proc. 34th Conf. Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada, 2020.
[27] Y. Cao, S. Li, Y. Liu, Z. Yan, Y. Dai, P. S. Yu, and L. Sun, A comprehensive survey of AI-generated content (AIGC): A history of generative AI from GAN to ChatGPT, arXiv preprint arXiv: 2303.04226, 2023.
[28] A. Ramesh, P. Dhariwal, A. Nichol, C. Chu, and M. Chen, Hierarchical text-conditional image generation with CLIP latents, arXiv preprint arXiv: 2204.06125, 2022.
[29] S. Huang, Z. Wang, P. Li, B. Jia, T. Liu, Y. Zhu, W. Liang, and S. C. Zhu, Diffusion-based generation, optimization, and planning in 3D scenes, arXiv preprint arXiv: 2301.06015, 2023.
[30] C. H. Lin, H. Y. Lee, W. Menapace, M. Chai, A. Siarohin, M. H. Yang, and S. Tulyakov, InfiniCity: Infinite-scale city synthesis, arXiv preprint arXiv: 2301.09637, 2023.
[31] D. Huynh, J. Kuen, Z. Lin, J. Gu, and E. Elhamifar, Open-vocabulary instance segmentation via robust cross-modal pseudo-labeling, in Proc. 2022 IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 2022, pp. 7010–7021.
[32] Y. Benigmim, S. Roy, S. Essid, V. Kalogeiton, and S. Lathuilière, One-shot unsupervised domain adaptation with personalized diffusion models, arXiv preprint arXiv: 2303.18080, 2023.
[33] M. Zhang, X. Guo, L. Pan, Z. Cai, F. Hong, H. Li, L. Yang, and Z. Liu, ReMoDiffuse: Retrieval-augmented motion diffusion model, arXiv preprint arXiv: 2304.01116, 2023.
[34] A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, and I. Sutskever, Learning transferable visual models from natural language supervision, in Proc. 38th Int. Conf. Machine Learning, virtual, 2021, pp. 8748–8763.
[35] P. Dhariwal and A. Nichol, Diffusion models beat GANs on image synthesis, arXiv preprint arXiv: 2105.05233, 2021.
[37] C. Nash, J. Menick, S. Dieleman, and P. W. Battaglia, Generating images with sparse representations, in Proc. 38th Int. Conf. Machine Learning, virtual, 2021, pp. 7958–7968.
[38] T. Miyato, T. Kataoka, M. Koyama, and Y. Yoshida, Spectral normalization for generative adversarial networks, in Proc. 6th Int. Conf. Learning Representations (ICLR), Vancouver, Canada, 2018.
[39] C. Meng, Y. Song, J. Song, J. Wu, and S. Ermon, SDEdit: Image synthesis and editing with stochastic differential equations, in Proc. 10th Int. Conf. Learning Representations (ICLR), virtual, 2022.
[40] B. Poole, A. Jain, J. T. Barron, and B. Mildenhall, DreamFusion: Text-to-3D using 2D diffusion, in Proc. 11th Int. Conf. Learning Representations (ICLR), Kigali, Rwanda, 2023.
[41] L. Zhou, Y. Du, and J. Wu, 3D shape generation and completion through point-voxel diffusion, in Proc. 2021 IEEE/CVF Int. Conf. Computer Vision (ICCV), Montreal, Canada, 2021, pp. 5806–5815.
[42] S. Luo and W. Hu, Diffusion probabilistic models for 3D point cloud generation, in Proc. 2021 IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 2021, pp. 2836–2844.
[43] B. L. Trippe, J. Yim, D. Tischer, T. Broderick, D. Baker, R. Barzilay, and T. Jaakkola, Diffusion probabilistic modeling of protein backbones in 3D for the motif-scaffolding problem, in Proc. 11th Int. Conf. Learning Representations (ICLR), Kigali, Rwanda, 2023.
[44] R. Yang, P. Srivastava, and S. Mandt, Diffusion probabilistic modeling for video generation, arXiv preprint arXiv: 2203.09481, 2022.
[45] C. Saharia, W. Chan, H. Chang, C. A. Lee, J. Ho, T. Salimans, D. J. Fleet, and M. Norouzi, Palette: Image-to-image diffusion models, arXiv preprint arXiv: 2111.05826, 2021.
[46] F. Yang, Q. Zhai, X. Li, R. Huang, A. Luo, H. Cheng, and D. P. Fan, Uncertainty-guided transformer reasoning for camouflaged object detection, in Proc. 2021 IEEE/CVF Int. Conf. Computer Vision (ICCV), Montreal, Canada, 2021, pp. 4126–4135.
[48] A. Kirillov, E. Mintun, N. Ravi, H. Mao, C. Rolland, L. Gustafson, T. Xiao, S. Whitehead, A. C. Berg, W. Y. Lo et al., Segment anything, arXiv preprint arXiv: 2304.02643, 2023.
[49] G. P. Ji, D. P. Fan, P. Xu, M. M. Cheng, B. Zhou, and L. Van Gool, SAM struggles in concealed scenes—Empirical study on “segment anything”, arXiv preprint arXiv: 2304.06022, 2023.
[53] P. Skurowski, H. Abdulameer, J. Błaszczyk, T. Depta, A. Kornacki, and P. Kozieł, Animal camouflage analysis: CHAMELEON database, https://www.polsl.pl/rau6/chameleon-database-animal-camouflage-analysis/, 2018.
[54] Y. Lv, J. Zhang, Y. Dai, A. Li, B. Liu, N. Barnes, and D. P. Fan, Simultaneously localize, segment and rank the camouflaged objects, in Proc. 2021 IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 2021, pp. 11586–11596.
[55] W. Liu, X. Shen, C. M. Pun, and X. Cun, Explicit visual prompting for low-level structure segmentations, in Proc. 2023 IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), Vancouver, Canada, 2023, pp. 19434–19445.
[56] T. Salimans, I. Goodfellow, W. Zaremba, V. Cheung, A. Radford, and X. Chen, Improved techniques for training GANs, in Proc. 30th Conf. Neural Information Processing Systems (NIPS 2016), Barcelona, Spain, 2016.
[57] L. Wang, H. Lu, Y. Wang, M. Feng, D. Wang, B. Yin, and X. Ruan, Learning to detect salient objects with image-level supervision, in Proc. 2017 IEEE Conf. Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 2017, pp. 3796–3805.
[59] C. Xia, J. Li, X. Chen, A. Zheng, and Y. Zhang, What is and what is not a salient object? Learning salient object detector by ensembling linear exemplar regressors, in Proc. 2017 IEEE Conf. Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 2017, pp. 4399–4407.