Survey

A Survey on 360° Images and Videos in Mixed Reality: Algorithms and Applications

School of Engineering and Computer Science, Victoria University of Wellington, Wellington 6012, New Zealand
College of Media Engineering, Communication University of Zhejiang, Hangzhou 310018, China
Department of Computer Science, University of Otago, Dunedin 9054, New Zealand

Abstract

Mixed reality technologies provide real-time, immersive experiences and open tremendous opportunities in entertainment, education, and domains that are not directly accessible owing to safety or cost. Research in this field has been in the spotlight in recent years as interest in the metaverse has surged. Recently emerging omnidirectional video streams, i.e., 360° videos, provide an affordable way to capture and present dynamic real-world scenes. In the last decade, fueled by rapid advances in artificial intelligence and computational photography, research interest in mixed reality systems that use 360° videos to deliver richer and more realistic experiences has increased dramatically. In this survey, we cover recent research on 360° image and video processing technologies and their applications in mixed reality. We summarize the contributions of recent work and outline potential future research directions for 360° media in the field of mixed reality.

Electronic Supplementary Material

JCST-2303-13210-Highlights.pdf (168.5 KB)

Journal of Computer Science and Technology
Pages 473-491
Cite this article:
Zhang F, Zhao J, Zhang Y, et al. A Survey on 360° Images and Videos in Mixed Reality: Algorithms and Applications. Journal of Computer Science and Technology, 2023, 38(3): 473-491. https://doi.org/10.1007/s11390-023-3210-1


Received: 06 March 2023
Accepted: 24 May 2023
Published: 30 May 2023
© Institute of Computing Technology, Chinese Academy of Sciences 2023