Regular Paper

PESTA: An Elastic Motion Capture Data Retrieval Method

School of Software, Shandong University, Jinan 250101, China
Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California, Los Angeles 90089, U.S.A.
Shandong Provincial Key Laboratory of Network Based Intelligent Computing, Jinan 250022, China
School of Information Science and Engineering, University of Jinan, Jinan 250022, China

Abstract

The prevalent use of motion capture (MoCap) produces large volumes of data, making MoCap data retrieval crucial for efficient data reuse. MoCap clips may not be neatly segmented and labeled, which increases the difficulty of retrieval. To retrieve such data effectively, we propose an elastic content-based retrieval scheme based on unsupervised posture encoding and strided temporal alignment (PESTA). It retrieves similarities at the sub-sequence level, is robust against singular frames, and enables control of the tradeoff between precision and efficiency. First, it learns a dictionary of encoded postures using unsupervised adversarial autoencoder techniques and, based on this dictionary, compactly symbolizes any MoCap sequence. Second, it conducts strided temporal alignment to align a query sequence with repository sequences and retrieve the best-matching sub-sequences from the repository. Further, it extends to finding matches for multiple sub-queries in a long query, with sharply improved efficiency at a small cost in precision. The strong performance of the proposed scheme is demonstrated by experiments on two public MoCap datasets and one MoCap dataset captured by ourselves.
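
The two stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: nearest-codeword assignment stands in for the learned adversarial-autoencoder dictionary, plain edit distance is an assumed alignment cost, and the names `symbolize`, `edit_distance`, and `strided_search` are hypothetical. The sketch only shows how a symbolized query might be slid over a symbolized repository sequence with a configurable stride, where a larger stride trades precision for speed.

```python
import numpy as np


def symbolize(frames, codebook):
    """Map each posture vector to the index of its nearest codeword.

    Assumption: a fixed codebook replaces the learned posture dictionary.
    """
    # Pairwise distances, shape (n_frames, n_codewords).
    dists = np.linalg.norm(frames[:, None, :] - codebook[None, :, :], axis=2)
    return dists.argmin(axis=1).tolist()


def edit_distance(a, b):
    """Levenshtein distance between two symbol sequences (assumed cost)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + int(x != y)))  # substitute or match
        prev = cur
    return prev[-1]


def strided_search(query, target, stride=1):
    """Compare the query with windows of the target taken every `stride` symbols.

    Returns (best_start, best_cost). A larger stride checks fewer windows,
    so it runs faster but may miss the best-matching position.
    """
    width = len(query)
    best_start, best_cost = None, float("inf")
    for start in range(0, len(target) - width + 1, stride):
        cost = edit_distance(query, target[start:start + width])
        if cost < best_cost:
            best_start, best_cost = start, cost
    return best_start, best_cost


if __name__ == "__main__":
    codebook = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
    frames = np.array([[0.1, 0.0], [0.9, 1.1], [2.0, 1.9]])
    print(symbolize(frames, codebook))
    print(strided_search([1, 2, 3], [0, 0, 1, 2, 3, 0, 0], stride=1))
    print(strided_search([1, 2, 3], [0, 0, 1, 2, 3, 0, 0], stride=3))
```

With stride 1 every window is examined, so the exact occurrence of the query is found; with stride 3 only two windows are examined and the search settles for an approximate match, which mirrors the precision-efficiency tradeoff mentioned in the abstract.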

Electronic Supplementary Material

Download File(s)
JCST-2302-13140-Highlights.pdf (140.7 KB)

Journal of Computer Science and Technology
Pages 867-884
Cite this article:
Jiang Z-F, Li W, Huang Y, et al. PESTA: An Elastic Motion Capture Data Retrieval Method. Journal of Computer Science and Technology, 2023, 38(4): 867-884. https://doi.org/10.1007/s11390-023-3140-y


Received: 01 February 2023
Accepted: 26 April 2023
Published: 06 December 2023
© Institute of Computing Technology, Chinese Academy of Sciences 2023