3D-aware image synthesis has attained high quality and robust 3D consistency. Existing 3D controllable generative models synthesize 3D-aware images from a single modality, such as a 2D segmentation map or a sketch, but lack fine-grained control over attributes of the generated content, such as texture and age. To improve user-guided controllability, we propose Multi3D, a 3D-aware controllable image synthesis model that supports multi-modal input. Our model controls the geometry of the generated image through a 2D label map, such as a segmentation or sketch map, while simultaneously controlling its appearance through a textual description. To demonstrate the effectiveness of our method, we conducted experiments on multiple datasets, including CelebAMask-HQ, AFHQ-cat, and ShapeNet-car. Qualitative and quantitative evaluations show that our method outperforms existing state-of-the-art methods.