Data-driven garment animation is a topic of active interest in the computer graphics industry. Existing approaches generally establish a mapping from a single human pose, or a temporal pose sequence, to garment deformation, which makes it difficult to quickly generate diverse clothed human animations. We address this problem with a method that automatically synthesizes temporally consistent dressed human animations from a specified human motion label. At the heart of our method is a two-stage strategy. Specifically, we first learn a latent space encoding the sequence-level distribution of human motions using a transformer-based conditional variational autoencoder (Transformer-CVAE). Then, a garment simulator synthesizes dynamic garment shapes with a transformer encoder-decoder architecture. Because the learned latent space is trained on varied human motions, our method can generate motions in a variety of styles for a given motion label. By means of a novel beginning-of-sequence (BOS) learning strategy and a self-supervised refinement procedure, our garment simulator efficiently synthesizes garment deformation sequences corresponding to the generated human motions while maintaining temporal and spatial consistency. We verify our ideas experimentally. This is the first generative model that directly dresses human animation.
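To make the first stage of the two-stage strategy concrete, the sketch below outlines a sequence-level Transformer-CVAE that encodes a label-conditioned pose sequence into a latent code and decodes it back into motion. All module sizes, the pooled label token, and the two-token decoder memory are illustrative assumptions, not the paper's released architecture.

```python
# Minimal sketch of a sequence-level Transformer-CVAE for motion generation.
# Hypothetical shapes and hyperparameters; not the authors' code.
import torch
import torch.nn as nn

class TransformerCVAE(nn.Module):
    def __init__(self, pose_dim=63, n_labels=10, d_model=256, latent_dim=64,
                 n_heads=4, n_layers=4, max_len=120):
        super().__init__()
        self.pose_embed = nn.Linear(pose_dim, d_model)
        self.label_embed = nn.Embedding(n_labels, d_model)
        self.pos_embed = nn.Parameter(torch.zeros(1, max_len + 1, d_model))
        enc_layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, n_layers)
        self.to_mu = nn.Linear(d_model, latent_dim)
        self.to_logvar = nn.Linear(d_model, latent_dim)
        dec_layer = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
        self.decoder = nn.TransformerDecoder(dec_layer, n_layers)
        self.from_latent = nn.Linear(latent_dim, d_model)
        self.query_embed = nn.Parameter(torch.zeros(1, max_len, d_model))
        self.to_pose = nn.Linear(d_model, pose_dim)

    def encode(self, poses, label):
        # poses: (B, T, pose_dim); label: (B,) integer motion labels
        tokens = self.pose_embed(poses)
        label_tok = self.label_embed(label).unsqueeze(1)         # (B, 1, d_model)
        x = torch.cat([label_tok, tokens], dim=1)
        x = x + self.pos_embed[:, : x.size(1)]
        h = self.encoder(x)[:, 0]                                # pooled label token
        return self.to_mu(h), self.to_logvar(h)

    def decode(self, z, label, n_frames):
        # Decoder memory holds the latent code and the motion label.
        memory = torch.stack([self.from_latent(z),
                              self.label_embed(label)], dim=1)   # (B, 2, d_model)
        queries = self.query_embed[:, :n_frames].expand(z.size(0), -1, -1)
        return self.to_pose(self.decoder(queries, memory))       # (B, T, pose_dim)

    def forward(self, poses, label):
        mu, logvar = self.encode(poses, label)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        recon = self.decode(z, label, poses.size(1))
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return recon, kl
```

At inference time, sampling z from the prior for a fixed label is what would yield varied motion styles under the same motion label.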
Synthesizing garment dynamics according to body motion is a vital technique in computer graphics. Physics-based simulation depends on an accurate model of cloth dynamics and is time-consuming, hard to implement, and complex to control. Existing data-driven approaches either lack temporal consistency or fail to handle garments whose topology differs from that of the body. In this paper, we present a motion-inspired real-time garment synthesis workflow that enables high-level control of garment shape. Given a sequence of body motions, our workflow generates the corresponding garment dynamics with both spatial and temporal coherence. To that end, we develop a transformer-based garment synthesis network that learns the mapping from body motions to garment dynamics. Frame-level attention is employed to capture the dependency between garments and body motions. A post-processing procedure is then applied to perform penetration removal and auto-texturing, producing textured clothing animation that is collision-free and temporally consistent. We evaluate the proposed workflow quantitatively and qualitatively from different aspects. Extensive experiments demonstrate that our network delivers clothing dynamics that retain the wrinkles of the physics-based simulation while running 1000 times faster. In addition, our workflow achieves superior synthesis performance compared with alternative approaches. To stimulate further research in this direction, our code will be publicly available soon.
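The penetration-removal step of the post-processing can be pictured with a simple routine: garment vertices found inside the body are pushed back onto the surface along the nearest body vertex normal. The function below is a minimal sketch under assumed vertex and normal array layouts; it illustrates the idea rather than the paper's exact procedure.

```python
# Illustrative penetration removal: garment vertices at or below the body
# surface are pushed outward along the nearest body vertex normal.
# Assumed data layout; not necessarily the paper's post-processing step.
import numpy as np
from scipy.spatial import cKDTree

def remove_penetrations(garment_v, body_v, body_n, eps=2e-3):
    """garment_v: (Ng, 3), body_v: (Nb, 3), body_n: (Nb, 3) unit normals."""
    tree = cKDTree(body_v)
    _, idx = tree.query(garment_v)            # nearest body vertex per garment vertex
    offset = garment_v - body_v[idx]          # vector from body surface to garment
    depth = np.einsum('ij,ij->i', offset, body_n[idx])
    inside = depth < eps                      # vertex penetrating or too close
    garment_v = garment_v.copy()
    # Project penetrating vertices back onto the surface plus a small margin.
    garment_v[inside] += (eps - depth[inside])[:, None] * body_n[idx[inside]]
    return garment_v
```

A per-vertex projection like this is cheap enough to run per frame, which is consistent with keeping the overall workflow real-time.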
Grasp detection is a visual recognition task in which a robot uses its sensors to detect graspable objects in its environment. Despite steady progress in robotic grasping, achieving both real-time and highly accurate grasp detection remains difficult. In this paper, we propose a real-time robotic grasp detection method that accurately predicts potential grasps for parallel-plate robotic grippers from RGB images. Our work employs an end-to-end convolutional neural network consisting of a feature descriptor and a grasp detector, and, for the first time, we add an attention mechanism to the grasp detection task, enabling the network to focus on grasp regions rather than the background. We further present an angular label smoothing strategy to enhance the fault tolerance of the network. We evaluate our grasp detection method quantitatively and qualitatively from different aspects on the public Cornell and Jacquard datasets. Extensive experiments demonstrate that our method achieves superior performance to state-of-the-art methods. In particular, it ranks first on both the Cornell and Jacquard datasets, achieving accuracies of 98.9% and 95.6%, respectively, at real-time speed.
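One plausible way to realize angular label smoothing is to replace the one-hot orientation bin with a soft target that spreads probability mass to neighbouring bins while respecting the periodicity of grasp angles. The snippet below sketches that idea; the bin count and the Gaussian kernel are assumptions, not the paper's exact settings.

```python
# Hedged sketch of angular label smoothing: neighbouring orientation bins
# receive part of the probability mass, with wrap-around because grasp
# angles are periodic. Bin count and kernel width are assumptions.
import numpy as np

def smooth_angle_label(angle_rad, n_bins=18, sigma_bins=1.0):
    """Return a soft label over n_bins orientation classes covering [0, pi)."""
    bin_width = np.pi / n_bins
    target = (angle_rad % np.pi) / bin_width         # fractional bin index
    bins = np.arange(n_bins)
    # Circular distance between each bin centre and the target bin.
    d = np.minimum(np.abs(bins - target), n_bins - np.abs(bins - target))
    label = np.exp(-0.5 * (d / sigma_bins) ** 2)     # Gaussian over angular bins
    return label / label.sum()

# Example: a grasp at 90 degrees peaks at bin 9, with mass spread to bins 8 and 10,
# so a slightly mispredicted orientation is penalized less than a distant one.
print(smooth_angle_label(np.deg2rad(90)).round(3))
```

Training against such soft targets tolerates small orientation errors, which is the fault-tolerance effect the abstract attributes to the angular label smoothing strategy.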