Regular Paper

PESTA: An Elastic Motion Capture Data Retrieval Method

Zi-Fei Jiang¹, Wei Li¹, Yan Huang¹, Yi-Long Yin¹, C.-C. Jay Kuo², Jing-Liang Peng³,⁴

¹ School of Software, Shandong University, Jinan 250101, China

² Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California, Los Angeles 90089, U.S.A.

³ Shandong Provincial Key Laboratory of Network Based Intelligent Computing, Jinan 250022, China

⁴ School of Information Science and Engineering, University of Jinan, Jinan 250022, China

Abstract

The prevalent use of motion capture (MoCap) produces large volumes of data, which makes MoCap data retrieval crucial for efficient data reuse. MoCap clips are often not neatly segmented or labeled, which makes retrieval harder. To retrieve such data effectively, we propose an elastic content-based retrieval scheme based on unsupervised posture encoding and strided temporal alignment (PESTA). It retrieves similarities at the sub-sequence level, is robust against singular frames, and allows the tradeoff between precision and efficiency to be controlled. First, it learns a dictionary of encoded postures using unsupervised adversarial autoencoder techniques and uses this dictionary to compactly symbolize any MoCap sequence. Second, it performs strided temporal alignment of a query sequence against repository sequences to retrieve the best-matching sub-sequences from the repository. It further extends to finding matches for multiple sub-queries in a long query, with a sharp gain in efficiency at a slight cost in precision. Experiments on two public MoCap datasets and one MoCap dataset that we captured ourselves demonstrate the strong performance of the proposed scheme.
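The two stages named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract gives no algorithmic details, so every name below (`symbolize`, `edit_distance`, `strided_search`) is hypothetical. Nearest-codeword assignment stands in for the adversarial-autoencoder posture encoding, and plain edit distance over fixed-length windows stands in for the paper's temporal alignment. The sketch only shows how symbolizing frames and sliding a query with a configurable stride trades precision for speed.

```python
from typing import List, Optional, Sequence, Tuple


def symbolize(frames: Sequence[Sequence[float]],
              dictionary: Sequence[Sequence[float]]) -> List[int]:
    """Map each frame to the index of its nearest dictionary entry.

    Assumption: a simple nearest-codeword lookup replaces the learned
    posture encoder described in the paper.
    """
    def sqdist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [min(range(len(dictionary)), key=lambda k: sqdist(f, dictionary[k]))
            for f in frames]


def edit_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Classic Levenshtein distance between two symbol sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


def strided_search(query: Sequence[int], repo: Sequence[int],
                   stride: int = 1) -> Optional[Tuple[int, int]]:
    """Return (start, distance) of the best window of len(query) in repo.

    Only every `stride`-th start position is examined, so a larger stride
    does less work but may miss the best-matching window.
    """
    n = len(query)
    best: Optional[Tuple[int, int]] = None
    for start in range(0, max(len(repo) - n, 0) + 1, stride):
        d = edit_distance(query, repo[start:start + n])
        if best is None or d < best[1]:
            best = (start, d)
    return best


# Toy data: 1-D "postures" and a three-entry dictionary (all made up).
dictionary = [(0.0,), (1.0,), (2.0,)]
frames = [(0.1,), (0.9,), (2.2,), (1.1,)]
symbols = symbolize(frames, dictionary)

repo = [0, 0, 1, 2, 1, 0, 2]
query = [1, 2, 1]
exact = strided_search(query, repo, stride=1)
coarse = strided_search(query, repo, stride=3)
```

On this toy data, a stride of 1 finds the exact occurrence of the query, while a stride of 3 skips over it and settles for a worse window, which mirrors the precision-versus-efficiency control mentioned in the abstract.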

Keywords

motion capture (MoCap); content-based retrieval; adversarial autoencoder; temporal alignment

Electronic Supplementary Material

JCST-2302-13140-Highlights.pdf (140.7 KB)

Journal of Computer Science and Technology

Volume 38, Issue 4, July 2023

Pages 867-884

DOI: 10.1007/s11390-023-3140-y

Cite this article:

Jiang Z-F, Li W, Huang Y, et al. PESTA: An Elastic Motion Capture Data Retrieval Method. Journal of Computer Science and Technology, 2023, 38(4): 867-884. https://doi.org/10.1007/s11390-023-3140-y
