Open Access Research Article
Removing fences from sweep motion videos using global 3D reconstruction and fence-aware light field rendering
Computational Visual Media 2019, 5 (1): 21-32
Published: 08 April 2019

Diminishing the appearance of a fence in an image is a challenging research problem, owing both to the characteristics of fences (thinness, lack of texture, etc.) and to the need to restore the occluded background. In this paper, we describe a fence removal method for an image sequence captured by a user making a sweep motion, during which the occluded background is potentially observed. To exploit the geometric and appearance information in consecutive images, we use two well-known approaches: structure from motion and light field rendering. Results on real image sequences show that our method stably segments fences and preserves background details for various fence and background combinations. A new, frame-coherent video without the fence can then be produced.
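The paper itself includes no code; the following is a minimal sketch of the fence-aware aggregation idea, assuming the frames have already been warped into a common reference view (e.g., using camera poses from a structure-from-motion pipeline) and that per-frame fence masks are available. All names are illustrative, not the authors' API.

```python
# Sketch: composite a fence-free view from registered frames (assumption:
# frames are pre-aligned to one reference view; fence_masks is True where
# a fence occludes the pixel in that frame).
import numpy as np

def remove_fence(frames, fence_masks):
    """frames: (N, H, W, 3) float images aligned to the reference view.
    fence_masks: (N, H, W) bool, True on fence pixels.
    Returns an (H, W, 3) composite built from unoccluded observations."""
    frames = np.asarray(frames, dtype=np.float64)
    valid = ~np.asarray(fence_masks)                 # (N, H, W)
    w = valid[..., None].astype(np.float64)          # broadcast over RGB
    num = (frames * w).sum(axis=0)
    den = np.clip(w.sum(axis=0), 1e-6, None)         # guard pixels fenced in all frames
    return num / den                                 # per-pixel mean of clear views
```

A per-pixel median over the valid observations would be a natural, more outlier-robust alternative to the mean used here.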

Open Access Research Article
FusionMLS: Highly dynamic 3D reconstruction with consumer-grade RGB-D cameras
Computational Visual Media 2018, 4 (4): 287-303
Published: 22 August 2018

Multi-view dynamic three-dimensional reconstruction has typically required custom shutter-synchronized camera rigs to capture scenes containing rapid movements or complex topology changes. In this paper, we demonstrate that multiple unsynchronized low-cost RGB-D cameras can be used for the same purpose. To alleviate the issues caused by unsynchronized shutters, we propose a novel depth frame interpolation technique that allows synchronized data capture from highly dynamic 3D scenes. To manage the resulting huge number of input depth images, we also introduce an efficient moving least squares-based volumetric reconstruction method that generates triangle meshes of the scene. Our approach does not store the reconstruction volume in memory, making it memory-efficient and scalable to large scenes. Our implementation is entirely GPU-based and runs in real time. Results obtained with real data demonstrate the effectiveness of the proposed method and its advantages over state-of-the-art approaches.
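To make the shutter-synchronization problem concrete, the sketch below interpolates two depth frames to a common target timestamp. This is a naive per-pixel linear stand-in, not the paper's interpolation technique; the function name and the zero-depth-as-invalid convention are assumptions.

```python
# Sketch: resample one camera's depth stream at a shared target time t,
# given the two frames that bracket it. Real depth interpolation for
# fast motion would need motion compensation; this is the simplest form.
import numpy as np

def interpolate_depth(d0, d1, t0, t1, t):
    """d0, d1: (H, W) depth frames captured at times t0 < t1.
    Returns a depth frame estimated at time t, with 0 marking invalid pixels."""
    alpha = (t - t0) / (t1 - t0)
    valid = (d0 > 0) & (d1 > 0)       # blend only where both frames have depth
    out = np.zeros_like(d0, dtype=np.float64)
    out[valid] = (1.0 - alpha) * d0[valid] + alpha * d1[valid]
    return out
```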

Open Access Research Article
Robust camera pose estimation by viewpoint classification using deep learning
Computational Visual Media 2017, 3 (2): 189-198
Published: 06 December 2016

Camera pose estimation with respect to a target scene is an important technology for superimposing virtual information in augmented reality (AR). However, it is difficult to estimate the camera pose for all possible view angles because feature descriptors such as SIFT are not completely invariant to changes of viewpoint. We propose a novel method for robust camera pose estimation using multiple feature descriptor databases, one generated for each partitioned viewpoint, within which the feature descriptor of each keypoint is almost invariant. Our method estimates the viewpoint class of each input image using deep learning, based on a set of training images prepared for each viewpoint class. We give two ways to prepare these images for training and database generation. In the first, images are generated using a projection matrix to ensure robust learning across a range of environments with changing backgrounds. The second uses real images to learn a given environment around a planar pattern. Our evaluation confirms that our approach increases the number of correct matches and the accuracy of camera pose estimation compared to the conventional method.
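A minimal PyTorch sketch of the viewpoint-classification idea follows. The paper does not specify its network, so the architecture, layer sizes, and names here are illustrative: a small CNN predicts a viewpoint class, and that class index selects which per-viewpoint descriptor database to match against.

```python
# Sketch: classify the viewpoint of an input image, then match features
# only against the descriptor database built for that viewpoint class.
import torch
import torch.nn as nn

class ViewpointClassifier(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, num_classes),
        )

    def forward(self, x):              # x: (B, 3, H, W)
        return self.net(x)             # logits over viewpoint classes

# Usage (hypothetical): pick the database for the predicted viewpoint,
# then run conventional SIFT matching and PnP pose estimation against it.
# model = ViewpointClassifier(num_classes=8)
# cls = model(image_batch).argmax(dim=1)    # index into databases[cls]
```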

Open Access Research Article
Synthesis of a stroboscopic image from a hand-held camera sequence for a sports analysis
Computational Visual Media 2016, 2 (3): 277-289
Published: 01 June 2016

This paper presents a method for synthesizing a stroboscopic image of a moving sports player from a hand-held camera sequence. The method has three steps: background image synthesis, stroboscopic image synthesis, and removal of the player's shadow. In the background synthesis step, all input frames, with the player's bounding box masked out, are stitched together to generate a background image; the player is detected with an HOG-based person detector. In the stroboscopic synthesis step, a stroboscopic image is composited from the background image, the input frames, and masks of the player. In the shadow removal step, we use mean-shift to remove the player's shadow, which would otherwise degrade the analysis. In our previous work, background synthesis was time-consuming; here, using the HOG-detected bounding box and image subtraction to generate the player masks improves both computational speed and accuracy. These are the main improvements and points of novelty over our previous method. In experiments, we confirmed the effectiveness of the proposed method, measured the player's speed and stride length, and produced a footprint image. The image sequences were captured under a simple condition: no other people appeared in the background and the camera operator stood still, so no motion parallax occurred. In addition, we applied the synthesis method to various scenes to confirm its versatility.
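A minimal sketch of the compositing step, assuming the input frames are already aligned to the synthesized background and that per-frame player masks (e.g., from the HOG detection and image subtraction described above) are given; the function and parameter names are illustrative.

```python
# Sketch: paste the masked player region from each aligned frame onto the
# stitched background to form the stroboscopic image.
import numpy as np

def strobe_composite(background, frames, player_masks):
    """background: (H, W, 3); frames: sequence of (H, W, 3) aligned frames;
    player_masks: sequence of (H, W) bool masks of the player per frame.
    Returns the stroboscopic composite as an (H, W, 3) float image."""
    out = background.astype(np.float64).copy()
    for frame, mask in zip(frames, player_masks):
        out[mask] = frame[mask]        # later poses overwrite earlier ones
    return out
```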
