Regular Paper

WavEnhancer: Unifying Wavelet and Transformer for Image Enhancement

Department of Computer and Information Science, University of Macau, Macao 999078, China
Research Center for Biomedical Information Technology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China

Co-First Authors (Zi-Nuo Li was responsible for the conceptualization, data curation, and writing of the original draft; Xu-Hang Chen contributed to the methodology, visualization, and editing of the manuscript.)


Abstract

Image enhancement is a widely used technique in digital image processing that aims to improve the aesthetics and visual quality of images. Traditional enhancement methods based on pixel-level or global-level modifications, however, have limited effectiveness. With the growing popularity of learning-based techniques, many recent studies employ neural networks for image enhancement, yet these approaches largely overlook optimization in the frequency domain. This study addresses that gap by introducing a transformer-based model that enhances images in the wavelet domain. The proposed model refines the different frequency bands of an image while prioritizing local details and high-level features, thereby producing superior enhancement results. The model's performance was assessed through comprehensive benchmark evaluations, and the results show that it outperforms state-of-the-art techniques.
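To make the wavelet-domain idea in the abstract concrete, the sketch below shows one way to decompose an image into frequency sub-bands with a Haar transform and refine each band with a transformer block. This is a minimal, hypothetical PyTorch sketch and not the authors' WavEnhancer implementation; the Haar routines, the BandRefiner module, and all hyper-parameters are illustrative assumptions.

```python
# Hypothetical sketch: wavelet decomposition + per-band transformer refinement.
# NOT the authors' WavEnhancer; module names and hyper-parameters are assumed.
import torch
import torch.nn as nn


def haar_dwt(x):
    """One-level 2D Haar DWT: split a (B, C, H, W) tensor into LL, LH, HL, HH."""
    a = x[:, :, 0::2, 0::2]
    b = x[:, :, 0::2, 1::2]
    c = x[:, :, 1::2, 0::2]
    d = x[:, :, 1::2, 1::2]
    ll = (a + b + c + d) / 2
    lh = (a + b - c - d) / 2
    hl = (a - b + c - d) / 2
    hh = (a - b - c + d) / 2
    return ll, lh, hl, hh


def haar_idwt(ll, lh, hl, hh):
    """Inverse of haar_dwt: reassemble the four sub-bands into the full image."""
    a = (ll + lh + hl + hh) / 2
    b = (ll + lh - hl - hh) / 2
    c = (ll - lh + hl - hh) / 2
    d = (ll - lh - hl + hh) / 2
    B, C, H, W = ll.shape
    out = torch.zeros(B, C, 2 * H, 2 * W, dtype=ll.dtype, device=ll.device)
    out[:, :, 0::2, 0::2] = a
    out[:, :, 0::2, 1::2] = b
    out[:, :, 1::2, 0::2] = c
    out[:, :, 1::2, 1::2] = d
    return out


class BandRefiner(nn.Module):
    """One transformer encoder layer applied to a single wavelet sub-band,
    treating each spatial location as a token."""

    def __init__(self, channels=3, dim=32, heads=4):
        super().__init__()
        self.embed = nn.Conv2d(channels, dim, kernel_size=1)
        self.encoder = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=4 * dim, batch_first=True
        )
        self.project = nn.Conv2d(dim, channels, kernel_size=1)

    def forward(self, band):
        B, C, H, W = band.shape
        tokens = self.embed(band).flatten(2).transpose(1, 2)  # (B, H*W, dim)
        tokens = self.encoder(tokens)
        refined = self.project(tokens.transpose(1, 2).reshape(B, -1, H, W))
        return band + refined  # residual correction of the sub-band


class WaveletTransformerSketch(nn.Module):
    """Decompose, refine each frequency band separately, then reconstruct."""

    def __init__(self):
        super().__init__()
        self.refiners = nn.ModuleList([BandRefiner() for _ in range(4)])

    def forward(self, x):
        bands = haar_dwt(x)
        refined = [r(b) for r, b in zip(self.refiners, bands)]
        return haar_idwt(*refined)


if __name__ == "__main__":
    img = torch.rand(1, 3, 64, 64)  # toy RGB batch
    out = WaveletTransformerSketch()(img)
    print(out.shape)  # torch.Size([1, 3, 64, 64])
```

Each band refiner ends with a residual connection, so the sketch only learns a correction to its sub-band; the paper's actual architecture, attention design, and training losses differ from this illustration.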

Electronic Supplementary Material

JCST-2305-13414-Highlights.pdf (1.2 MB)

Journal of Computer Science and Technology
Pages 336-345
Cite this article:
Li Z-N, Chen X-H, Guo S-N, et al. WavEnhancer: Unifying Wavelet and Transformer for Image Enhancement. Journal of Computer Science and Technology, 2024, 39(2): 336-345. https://doi.org/10.1007/s11390-024-3414-z

Metrics: 225 Views, 9 Crossref, 0 Web of Science, 7 Scopus, 0 CSCD

Received: 19 May 2023
Accepted: 06 January 2024
Published: 30 March 2024
© Institute of Computing Technology, Chinese Academy of Sciences 2024