Data-driven garment animation is a current topic of interest in the computer graphics industry. Existing approaches generally establish the mapping between a single human pose or a temporal pose sequence, and garment deformation, but it is difficult to quickly generate diverse clothed human animations. We address this problem with a method to automatically synthesize dressed human animations with temporal consistency from a specified human motion label. At the heart of our method is a two-stage strategy. Specifically, we first learn a latent space encoding the sequence-level distribution of human motions utilizing a transformer-based conditional variational autoencoder (Transformer-CVAE). Then a garment simulator synthesizes dynamic garment shapes using a transformer encoder–decoder architecture. Since the learned latent space comes from varied human motions, our method can generate a variety of styles of motions given a specific motion label. By means of a novel beginning of sequence (BOS) learning strategy and a self-supervised refinement procedure, our garment simulator is capable of efficiently synthesizing garment deformation sequences corresponding to the generated human motions while maintaining temporal and spatial consistency. We verify our ideasexperimentally. This is the first generative model that directly dresses human animation.
- Article type
- Year
- Co-author
Synthesizing garment dynamics according to body motions is a vital technique in computer graphics. Physics-based simulation depends on an accurate model of the law of kinetics of cloth, which is time-consuming, hard to implement, and complex to control. Existing data-driven approaches either lack temporal consistency, or fail to handle garments that are different from body topology. In this paper, we present a motion-inspired real-time garment synthesis workflow that enables high-level control of garment shape. Given a sequence of body motions, our workflow is able to generate corresponding garment dynamics with both spatial and temporal coherence. To that end, we develop a transformer-based garment synthesis network to learn the mapping from body motions to garment dynamics. Frame-level attention is employed to capture the dependency of garments and body motions. Moreover, a post-processing procedure is further taken to perform penetration removal and auto-texturing. Then, textured clothing animation that is collision-free and temporally-consistent is generated. We quantitatively and qualitatively evaluated our proposed workflow from different aspects. Extensive experiments demonstrate that our network is able to deliver clothing dynamics which retain the wrinkles from the physics-based simulation, while running 1000 times faster. Besides, our workflow achieved superior synthesis performance compared with alternative approaches. To stimulate further research in this direction, our code will be publicly available soon.