Research Article | Open Access

3D modeling and motion parallax for improved videoconferencing

TNList, Tsinghua University, Beijing 100084, China.
School of Computer Science & Informatics, Cardiff University, UK.
Cardiff School of Art & Design, Cardiff Metropolitan University, UK.

Abstract

We consider a face-to-face videoconferencing system that uses a Kinect camera at each end of the link for 3D modeling and an ordinary 2D display for output. The Kinect camera allows a 3D model of each participant to be transmitted; the (assumed static) background is sent separately. Furthermore, the Kinect tracks the receiver's head, allowing our system to render a view of the sender that depends on the receiver's viewpoint. The resulting motion parallax gives receivers a strong impression of 3D viewing as they move, yet the system only needs an ordinary 2D display. This is cheaper than a full 3D system, and avoids disadvantages such as the need to wear shutter glasses or VR headsets, or to sit in the particular position required by an autostereoscopic display. Perceptual studies show that users experience a greater sensation of depth with our system than with a typical 2D videoconferencing system.
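Rendering the sender's 3D model from the receiver's tracked head position is what produces the motion parallax described above. A standard way to realize such viewpoint-dependent rendering is an off-axis (asymmetric) perspective projection whose frustum passes through the physical screen rectangle as seen from the tracked eye. The sketch below is a minimal illustration of that general technique, not code from the paper; the function name and coordinate conventions are our own assumptions (screen-centered frame, +z towards the viewer, metres).

```python
import numpy as np

def off_axis_frustum(eye, screen_w, screen_h, near, far):
    """Asymmetric view frustum for a tracked eye position.

    Hypothetical helper illustrating off-axis projection: `eye` is the
    (x, y, z) head position in a frame centred on the display, with +z
    pointing from the screen towards the viewer. Returns a 4x4
    OpenGL-style projection matrix (column-vector convention).
    """
    ex, ey, ez = eye
    # Project the screen's edges onto the near plane as seen from the eye;
    # moving the head shifts these bounds, skewing the frustum.
    l = (-screen_w / 2 - ex) * near / ez
    r = ( screen_w / 2 - ex) * near / ez
    b = (-screen_h / 2 - ey) * near / ez
    t = ( screen_h / 2 - ey) * near / ez
    # Standard glFrustum-style matrix for the asymmetric viewing volume.
    return np.array([
        [2 * near / (r - l), 0.0, (r + l) / (r - l), 0.0],
        [0.0, 2 * near / (t - b), (t + b) / (t - b), 0.0],
        [0.0, 0.0, -(far + near) / (far - near), -2 * far * near / (far - near)],
        [0.0, 0.0, -1.0, 0.0],
    ])
```

With the head centred in front of the screen the frustum is symmetric; as the head moves sideways the skew terms in the third column become non-zero, so on-screen content shifts with the viewer, which is precisely the motion-parallax cue the system exploits.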

Computational Visual Media
Pages 131-142
Cite this article:
Zhu Z, Martin RR, Pepperell R, et al. 3D modeling and motion parallax for improved videoconferencing. Computational Visual Media, 2016, 2(2): 131-142. https://doi.org/10.1007/s41095-016-0038-4


Revised: 17 November 2015
Accepted: 15 December 2015
Published: 01 March 2016
© The Author(s) 2016

This article is published with open access at Springerlink.com

The articles published in this journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.
