A Survey on 360° Images and Videos in Mixed Reality: Algorithms and Applications

  • Survey
  • Journal of Computer Science and Technology

Abstract

Mixed reality technologies provide real-time, immersive experiences, opening tremendous opportunities in entertainment and education and enabling experiences that are otherwise inaccessible for safety or cost reasons. Research in this field has been in the spotlight in recent years as the metaverse has gone mainstream. Emerging omnidirectional video streams, i.e., 360° videos, offer an affordable way to capture and present dynamic real-world scenes. Over the last decade, fueled by rapid advances in artificial intelligence and computational photography, research interest in mixed reality systems built on 360° videos has increased dramatically, aiming at richer and more realistic experiences that unlock the true potential of the metaverse. In this survey, we cover recent research on 360° image and video processing technologies and their applications in mixed reality. We summarize the contributions of this recent work and outline potential future research directions for 360° media in mixed reality.



Author information

Corresponding author

Correspondence to Junhong Zhao.

Supplementary Information

ESM 1 (PDF 168 kb)


About this article


Cite this article

Zhang, F., Zhao, J., Zhang, Y. et al. A Survey on 360° Images and Videos in Mixed Reality: Algorithms and Applications. J. Comput. Sci. Technol. 38, 473–491 (2023). https://doi.org/10.1007/s11390-023-3210-1

