A popular and challenging task in video research, frame interpolation aims to increase the frame rate of video. Most existing methods employ a fixed motion model, e.g., linear, quadratic, or cubic, to estimate the intermediate warping field. However, such fixed motion models cannot faithfully represent the complicated non-linear motions in the real world or rendered animations. Instead, we present an adaptive flow prediction module to better approximate the complex motions in video. Furthermore, interpolating just one intermediate frame between consecutive input frames may be insufficient for complicated non-linear motions. To enable multi-frame interpolation, we introduce time as a control variable when interpolating frames between original ones in our generic adaptive flow prediction module. Qualitative and quantitative experimental results show that our method produces high-quality results and outperforms existing state-of-the-art methods on popular public datasets.
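The abstract's key idea is conditioning flow prediction on an arbitrary time t rather than assuming a fixed motion model. Below is a minimal PyTorch sketch of that idea, not the paper's architecture: a small network (`TimeConditionedFlowNet`, a hypothetical name) takes both frames plus a broadcast time channel and predicts a warping field, so one model can synthesize any number of intermediate frames.

```python
# Minimal sketch (not the authors' implementation): a flow predictor
# conditioned on time t in (0, 1), enabling multi-frame interpolation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TimeConditionedFlowNet(nn.Module):
    """Predicts a backward warping field for time t from two input frames."""
    def __init__(self, ch=32):
        super().__init__()
        # Input channels: frame0 (3) + frame1 (3) + broadcast time map (1).
        self.net = nn.Sequential(
            nn.Conv2d(7, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 2, 3, padding=1),  # 2-channel flow (dx, dy)
        )

    def forward(self, frame0, frame1, t):
        b, _, h, w = frame0.shape
        t_map = torch.full((b, 1, h, w), float(t), device=frame0.device)
        return self.net(torch.cat([frame0, frame1, t_map], dim=1))

def backward_warp(frame, flow):
    """Sample `frame` at positions displaced by `flow` (in pixels)."""
    b, _, h, w = frame.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack((xs, ys), dim=0).float().to(frame.device)  # (2,h,w)
    pos = grid.unsqueeze(0) + flow                                # (b,2,h,w)
    # Normalize coordinates to [-1, 1] as required by grid_sample.
    pos_x = 2 * pos[:, 0] / (w - 1) - 1
    pos_y = 2 * pos[:, 1] / (h - 1) - 1
    return F.grid_sample(frame, torch.stack((pos_x, pos_y), dim=-1),
                         align_corners=True)

# Usage: one network, arbitrarily many intermediate frames.
net = TimeConditionedFlowNet()
f0, f1 = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)
mids = [backward_warp(f0, net(f0, f1, t)) for t in (0.25, 0.5, 0.75)]
```

A full interpolator would typically warp both frames and blend them; this sketch warps only the first frame to keep the time-conditioning idea in focus.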
Learning-based techniques have recently been shown to be effective for denoising Monte Carlo renderings. However, there remains a quality gap to state-of-the-art handcrafted denoisers. In this paper, we propose a deep residual learning based method that outperforms both state-of-the-art handcrafted denoisers and learning-based denoisers. Unlike existing learning-based methods, which are indirect in nature (e.g., estimating the parameters and kernel weights of an explicit feature-based filter), we directly map the noisy input pixels to the smoothed output. Using this direct mapping formulation, we demonstrate that even a simple, standard ResNet and three common auxiliary features (depth, normal, and albedo) are sufficient to achieve high-quality denoising. This minimal requirement on auxiliary data simplifies both training and integration of our method into most production rendering pipelines. We have evaluated our method on unseen images created by a different renderer, and consistently superior denoising quality is obtained in all cases.
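To make the "direct mapping" formulation concrete, here is a minimal PyTorch sketch of the ingredients the abstract names, with assumed channel counts and depth (the paper's exact network and hyperparameters are not specified here): a plain ResNet that maps noisy radiance plus the three auxiliary buffers straight to the denoised image, with no explicit filter in between.

```python
# Minimal sketch (an assumption, not the paper's exact network): a plain
# ResNet mapping noisy pixels + auxiliary features directly to clean RGB.
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1),
        )
    def forward(self, x):
        return x + self.body(x)  # residual connection

class DirectDenoiser(nn.Module):
    # Input channels: noisy RGB (3) + depth (1) + normal (3) + albedo (3).
    def __init__(self, ch=64, blocks=8):
        super().__init__()
        self.head = nn.Conv2d(10, ch, 3, padding=1)
        self.body = nn.Sequential(*[ResBlock(ch) for _ in range(blocks)])
        self.tail = nn.Conv2d(ch, 3, 3, padding=1)
    def forward(self, noisy, depth, normal, albedo):
        x = torch.cat([noisy, depth, normal, albedo], dim=1)
        return self.tail(self.body(self.head(x)))  # denoised RGB

denoiser = DirectDenoiser()
out = denoiser(torch.rand(1, 3, 128, 128), torch.rand(1, 1, 128, 128),
               torch.rand(1, 3, 128, 128), torch.rand(1, 3, 128, 128))
```

The design point the abstract makes is that nothing filter-specific is needed: the auxiliary buffers are simply concatenated as extra input channels and the network regresses the output pixels directly.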
Shape matching plays an important role in various computer vision and graphics applications such as shape retrieval, object detection, image editing, image retrieval, etc. However, detecting shapes in cluttered images is still quite challenging due to incomplete edges and changing perspective. In this paper, we propose a novel approach that can efficiently identify a queried shape in a cluttered image. The core idea is to acquire the transformation from the queried shape to the cluttered image by summarising all point-to-point transformations between the queried shape and the image. To do so, we adopt a point-based shape descriptor, the pyramid of arc-length descriptor (PAD), to identify point pairs between the queried shape and the image that have similar local shapes. We further calculate the transformations between the identified point pairs based on PAD. Finally, we summarise all transformations in a 4D transformation histogram and search for the main cluster. Our method can handle both closed shapes and open curves, and is resistant to partial occlusions. Experiments show that our method can robustly detect shapes in images in the presence of partial occlusions, fragile edges, and cluttered backgrounds.
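The voting step is the part that lends itself to a short sketch. Below is a NumPy illustration of accumulating per-pair similarity transforms (scale, rotation, translation) into a 4D histogram and taking the dominant bin; the bin counts and parameter ranges are illustrative assumptions, and the PAD matching that produces the votes is omitted.

```python
# Minimal sketch of the 4D transformation-histogram voting (hypothetical
# bin counts and ranges; the PAD-based matching step is not shown).
import numpy as np

def dominant_transform(transforms, bins=(20, 36, 32, 32),
                       ranges=((0.5, 2.0), (-np.pi, np.pi),
                               (0, 640), (0, 480))):
    """transforms: (N, 4) array of (scale, rotation, tx, ty) votes."""
    hist, edges = np.histogramdd(transforms, bins=bins, range=ranges)
    idx = np.unravel_index(np.argmax(hist), hist.shape)
    # Return the center of the most-voted 4D bin as the main cluster.
    return tuple((edges[d][i] + edges[d][i + 1]) / 2
                 for d, i in enumerate(idx))

# Synthetic votes clustered around one transform, as matched pairs on a
# correctly detected shape would produce.
votes = np.column_stack([np.random.uniform(0.9, 1.1, 200),   # scale
                         np.random.uniform(0.4, 0.6, 200),   # rotation
                         np.random.uniform(300, 310, 200),   # tx
                         np.random.uniform(200, 210, 200)])  # ty
print(dominant_transform(votes))
```

This also makes the occlusion resistance plausible: votes from occluded or spurious pairs scatter across the histogram, while pairs on the visible part of the shape still concentrate in one bin.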
Due to the lack of color in manga (Japanese comics), black-and-white textures are often used to enrich the visual experience. With the rising need to digitize manga, segmenting texture regions from manga has become an indispensable basis for almost all manga processing, from vectorization to colorization. Unfortunately, such texture segmentation is not easy since textures in manga are composed of lines and exhibit similar features to structural lines (contour lines). Consequently, texture segmentation is currently still performed manually, which is labor-intensive and time-consuming. Various texture features have been proposed for measuring texture similarity, but they cannot achieve precise boundaries since boundary pixels exhibit different features from inner pixels. In this paper, we propose a novel method which also adopts texture features to estimate texture regions. Unlike existing methods, the estimated texture region is only regarded as an initial, imprecise texture region. We expand the initial texture region to the precise boundary based on local smoothness via a graph-cut formulation. This allows our method to extract texture regions with precise boundaries. We have applied our method to various manga images and satisfactory results were achieved in all cases.
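The expand-to-precise-boundary step is a binary graph cut seeded by the imprecise initial region. Here is a deliberately simplified sketch using the PyMaxflow library; the energy terms (a uniform smoothness cost and an appearance-based data cost with thresholds `tau` and `conf`) are stand-ins for the paper's local-smoothness formulation, not the actual method.

```python
# Simplified graph-cut expansion of a rough texture estimate
# (illustrative energy; requires the PyMaxflow package: pip install PyMaxflow).
import numpy as np
import maxflow

def expand_region(gray, init_mask, smooth=2.0, conf=10.0, tau=0.2):
    """gray: float image in [0,1]; init_mask: bool rough texture estimate."""
    mu = gray[init_mask].mean()          # appearance of the initial region
    d = np.abs(gray - mu)                # dissimilarity to that appearance
    g = maxflow.Graph[float]()
    nodes = g.add_grid_nodes(gray.shape)
    # Pairwise term: uniform smoothness cost on the 4-connected grid
    # (the paper modulates this by local smoothness; simplified here).
    g.add_grid_edges(nodes, smooth, symmetric=True)
    # Data term: pixels pay to take the label their appearance disfavors;
    # seeds from the initial estimate are strongly tied to "texture".
    source_cap = conf * np.clip(tau - d, 0, None) + 100.0 * init_mask
    sink_cap = conf * np.clip(d - tau, 0, None)
    g.add_grid_tedges(nodes, source_cap, sink_cap)
    g.maxflow()
    return ~g.get_grid_segments(nodes)   # True where labeled texture

gray = np.ones((40, 40)); gray[8:30, 8:30] = 0.3   # darker texture patch
seed = np.zeros((40, 40), bool); seed[14:22, 14:22] = True
full = expand_region(gray, seed)   # grows the seed out to the patch boundary
```

The smoothness term is what lets the boundary land precisely: the cut prefers to pass where the appearance evidence changes, rather than where the unreliable per-pixel texture features happen to end.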
Cartoons are a worldwide popular visual entertainment medium with a long history. Nowadays, with the boom of electronic devices, there is an increasing need to digitize old classic cartoons as a basis for further editing, including deformation, colorization, etc. To perform such editing, it is essential to extract the structure lines within cartoon images. Traditional edge detection methods are mainly based on gradients. These methods perform poorly in the face of compression artifacts and spatially varying line colors, which cause gradient values to become unreliable. This paper presents the first approach to extracting structure lines in cartoons based on regions. Our method starts by segmenting an image into regions, and then classifies them as edge regions and non-edge regions. Our second main contribution comprises three measures to estimate the likelihood of a region being a non-edge region, based on darkness, local contrast, and shape. Since the likelihoods become unreliable as regions become smaller, we further classify regions using both the likelihoods and the relationships to neighboring regions via a graph-cut formulation. Our method has been evaluated on a wide variety of cartoon images, and convincing results are obtained in all cases.
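As a small illustration of what region-level cues of this kind can look like, the sketch below computes one stand-in formula per cue named in the abstract: darkness, local contrast, and shape. These formulas are assumptions for illustration, not the paper's actual measures.

```python
# Illustrative per-region cues for "non-edge" classification (stand-in
# formulas; higher values suggest a non-edge region).
import numpy as np
from scipy.ndimage import binary_dilation

def nonedge_cues(gray, mask):
    """gray: float image in [0,1]; mask: bool pixels of one region."""
    brightness = gray[mask].mean()            # structure lines are dark
    # Local contrast against a thin ring of surrounding pixels:
    # structure lines contrast strongly with their neighborhood.
    ring = binary_dilation(mask, iterations=2) & ~mask
    low_contrast = 1.0 - abs(gray[mask].mean() - gray[ring].mean())
    # Shape: structure lines are thin and elongated, so a near-square
    # bounding box hints at a non-edge region.
    ys, xs = np.nonzero(mask)
    h, w = ys.ptp() + 1, xs.ptp() + 1
    aspect = min(h, w) / max(h, w)
    return brightness, low_contrast, aspect

img = np.ones((32, 32)); img[10:12, 4:28] = 0.1   # a dark, thin stroke
print(nonedge_cues(img, img < 0.5))               # all three cues low
```

In the demo, the dark thin stroke scores low on all three cues, i.e., it looks like an edge region, which matches the abstract's observation that such per-region evidence degrades for small regions and therefore needs the graph-cut step over neighboring regions.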