Research Article | Open Access

Joint head pose and facial landmark regression from depth images

Jie Wang¹, Juyong Zhang¹, Changwei Luo¹, Falai Chen¹

¹ University of Science and Technology of China, Hefei, Anhui, 230026, China.

Abstract

This paper presents a method for jointly regressing head pose and facial landmarks from depth images in real time. Our main contributions are threefold. First, we propose a joint optimization scheme in which the pose regression result provides a supervised initialization for cascaded facial landmark regression, while the landmark regression result in turn refines the head pose at each stage. Second, we partition the head pose space into 9 sub-spaces and train a cascaded random forest with a global shape constraint for the facial landmarks in each sub-space. This classification-guided approach effectively handles large pose changes and occlusion. Third, we have built a 3D face database containing 73 subjects, each with 14 expressions in various head poses. Experiments on challenging databases show that our method achieves state-of-the-art performance on both head pose estimation and facial landmark regression.
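The alternating structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: the 9 sub-spaces are modelled as a 3x3 grid over yaw and pitch with hypothetical bin edges, the pose is updated before the landmarks within each stage, and the stage regressors are arbitrary callables standing in for the trained random forests. The names `pose_subspace` and `joint_cascade` are hypothetical.

```python
import numpy as np

# Hypothetical bin edges in degrees. The abstract does not state how the
# 9 sub-spaces are defined; a 3x3 yaw/pitch grid is assumed here.
YAW_EDGES = (-20.0, 20.0)
PITCH_EDGES = (-20.0, 20.0)


def pose_subspace(yaw, pitch):
    """Map (yaw, pitch) in degrees to a sub-space index in 0..8."""
    yaw_bin = int(np.digitize(yaw, YAW_EDGES))        # 0, 1, or 2
    pitch_bin = int(np.digitize(pitch, PITCH_EDGES))  # 0, 1, or 2
    return 3 * pitch_bin + yaw_bin


def joint_cascade(depth, pose, landmarks, pose_stages, landmark_stages):
    """Alternate pose and landmark updates over the cascade stages.

    pose_stages[t](depth, pose, landmarks)        -> pose increment
    landmark_stages[t][k](depth, pose, landmarks) -> landmark increment
                                                     for sub-space k
    Both are placeholders for trained regressors.
    """
    for pose_reg, lm_regs in zip(pose_stages, landmark_stages):
        # Refine the pose using the current landmark estimate.
        pose = pose + pose_reg(depth, pose, landmarks)
        # Select the landmark regressor trained for the current pose sub-space.
        k = pose_subspace(pose[0], pose[1])
        landmarks = landmarks + lm_regs[k](depth, pose, landmarks)
    return pose, landmarks
```

In this sketch `pose` is a vector whose first two entries are yaw and pitch, and each stage adds an increment to the current estimate, which is the usual form of cascaded regression. How the global shape constraint enters the landmark update is not specified in the abstract and is therefore left out.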

Keywords

head pose; facial landmarks; depth images

Computational Visual Media

Volume 3, Issue 3, September 2017

Pages 229-241

DOI: 10.1007/s41095-017-0082-8

Cite this article:

Wang J, Zhang J, Luo C, et al. Joint head pose and facial landmark regression from depth images. Computational Visual Media, 2017, 3(3): 229-241. https://doi.org/10.1007/s41095-017-0082-8
