Regular Paper

WavEnhancer: Unifying Wavelet and Transformer for Image Enhancement

Department of Computer and Information Science, University of Macau, Macao 999078, China
Research Center for Biomedical Information Technology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China

Co-First Authors (Zi-Nuo Li was responsible for the conceptualization, data curation, and writing of the original draft; Xu-Hang Chen contributed to the methodology, visualization, and editing of the manuscript.)


Abstract

Image enhancement is a widely used technique in digital image processing that aims to improve the aesthetics and visual quality of images. Traditional enhancement methods based on pixel-level or global-level modifications, however, have limited effectiveness. With the growing popularity of learning-based techniques, many recent studies employ neural networks for image enhancement, yet these approaches largely overlook optimization in the frequency domain. This study addresses that gap by introducing a transformer-based model that enhances images in the wavelet domain. The proposed model refines the different frequency bands of an image while prioritizing local details and high-level features, thereby producing superior enhancement results. The model's performance was assessed through comprehensive benchmark evaluations, and the results show that it outperforms state-of-the-art techniques.
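To make the wavelet-domain idea in the abstract concrete, the sketch below shows one way to decompose an image into frequency sub-bands with a Haar transform and refine each band with a transformer block. This is a minimal, hypothetical PyTorch sketch and not the authors' WavEnhancer implementation; the Haar routines, the BandRefiner module, and all hyper-parameters are illustrative assumptions.

```python
# Hypothetical sketch: wavelet decomposition + per-band transformer refinement.
# NOT the authors' WavEnhancer; module names and hyper-parameters are assumed.
import torch
import torch.nn as nn


def haar_dwt(x):
    """One-level 2D Haar DWT: split a (B, C, H, W) tensor into LL, LH, HL, HH."""
    a = x[:, :, 0::2, 0::2]
    b = x[:, :, 0::2, 1::2]
    c = x[:, :, 1::2, 0::2]
    d = x[:, :, 1::2, 1::2]
    ll = (a + b + c + d) / 2
    lh = (a + b - c - d) / 2
    hl = (a - b + c - d) / 2
    hh = (a - b - c + d) / 2
    return ll, lh, hl, hh


def haar_idwt(ll, lh, hl, hh):
    """Inverse of haar_dwt: reassemble the four sub-bands into the full image."""
    a = (ll + lh + hl + hh) / 2
    b = (ll + lh - hl - hh) / 2
    c = (ll - lh + hl - hh) / 2
    d = (ll - lh - hl + hh) / 2
    B, C, H, W = ll.shape
    out = torch.zeros(B, C, 2 * H, 2 * W, dtype=ll.dtype, device=ll.device)
    out[:, :, 0::2, 0::2] = a
    out[:, :, 0::2, 1::2] = b
    out[:, :, 1::2, 0::2] = c
    out[:, :, 1::2, 1::2] = d
    return out


class BandRefiner(nn.Module):
    """One transformer encoder layer applied to a single wavelet sub-band,
    treating each spatial location as a token."""

    def __init__(self, channels=3, dim=32, heads=4):
        super().__init__()
        self.embed = nn.Conv2d(channels, dim, kernel_size=1)
        self.encoder = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=4 * dim, batch_first=True
        )
        self.project = nn.Conv2d(dim, channels, kernel_size=1)

    def forward(self, band):
        B, C, H, W = band.shape
        tokens = self.embed(band).flatten(2).transpose(1, 2)  # (B, H*W, dim)
        tokens = self.encoder(tokens)
        refined = self.project(tokens.transpose(1, 2).reshape(B, -1, H, W))
        return band + refined  # residual correction of the sub-band


class WaveletTransformerSketch(nn.Module):
    """Decompose, refine each frequency band separately, then reconstruct."""

    def __init__(self):
        super().__init__()
        self.refiners = nn.ModuleList([BandRefiner() for _ in range(4)])

    def forward(self, x):
        bands = haar_dwt(x)
        refined = [r(b) for r, b in zip(self.refiners, bands)]
        return haar_idwt(*refined)


if __name__ == "__main__":
    img = torch.rand(1, 3, 64, 64)  # toy RGB batch
    out = WaveletTransformerSketch()(img)
    print(out.shape)  # torch.Size([1, 3, 64, 64])
```

Each band refiner ends with a residual connection, so the sketch only learns a correction to its sub-band; the paper's actual architecture, attention design, and training losses differ from this illustration.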

Electronic Supplementary Material

JCST-2305-13414-Highlights.pdf (1.2 MB)

Journal of Computer Science and Technology
Pages 336-345
Cite this article:
Li Z-N, Chen X-H, Guo S-N, et al. WavEnhancer: Unifying Wavelet and Transformer for Image Enhancement. Journal of Computer Science and Technology, 2024, 39(2): 336-345. https://doi.org/10.1007/s11390-024-3414-z

Metrics: 225 Views, 9 Crossref, 0 Web of Science, 7 Scopus, 0 CSCD

Received: 19 May 2023
Accepted: 06 January 2024
Published: 30 March 2024
© Institute of Computing Technology, Chinese Academy of Sciences 2024