Zhidong Deng

Downloads: 95 Citations: 18 Articles: 3

Publication Fields

Information Sciences

Publications

Open Access Issue

CasNet: A Cascade Coarse-to-Fine Network for Semantic Segmentation

Zhenyang Wang, Zhidong Deng, Shiyao Wang

Tsinghua Science and Technology 2019, 24(2): 207-215

Published: 31 December 2018

Abstract

Downloads: 32

Semantic segmentation is a fundamental topic in computer vision. Because it requires dense predictions for an entire image, a network can hardly achieve good performance across various kinds of scenes. In this paper, we propose a cascade coarse-to-fine network called CasNet, which focuses on regions that are difficult to label at the pixel level. CasNet comprises three branches. The first branch is designed to produce coarse predictions for easy-to-label pixel regions. The second learns to distinguish the relatively difficult-to-label pixels from the entire image. Finally, the last branch generates the final predictions by combining the coarse and fine prediction results through a weighting coefficient estimated by the second branch. The three branches focus on their own objectives and collaboratively learn to predict from coarse to fine. To evaluate the performance of the proposed network, we conduct experiments on two public datasets: SIFT Flow and Stanford Background. We show that the three branches can be trained in an end-to-end manner, and the experimental results demonstrate that the proposed CasNet outperforms existing state-of-the-art models, achieving prediction accuracies of 91.6% and 89.7% on SIFT Flow and Stanford Background, respectively.
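
The abstract does not give the exact fusion rule, so the following is only a minimal sketch of one plausible reading: a per-pixel convex combination of coarse and fine class probabilities, with the weight taken from a sigmoid over a difficulty score. The function name `fuse_predictions`, the array shapes, and the sigmoid/convex-combination form are all assumptions for illustration, not the authors' implementation.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def fuse_predictions(coarse_logits, fine_logits, difficulty_logits):
    """Blend coarse and fine per-pixel class probabilities.

    coarse_logits, fine_logits: arrays of shape (H, W, C).
    difficulty_logits: array of shape (H, W); a larger value means the
        pixel is treated as harder to label (assumed convention).
    Returns an (H, W, C) array of class probabilities.
    """
    # Assumed form of the weighting coefficient: sigmoid of the difficulty score.
    weight = 1.0 / (1.0 + np.exp(-difficulty_logits))
    weight = weight[..., None]  # broadcast over the class axis

    p_coarse = softmax(coarse_logits)
    p_fine = softmax(fine_logits)

    # Easy pixels (weight near 0) keep the coarse prediction;
    # hard pixels (weight near 1) rely on the fine prediction.
    return (1.0 - weight) * p_coarse + weight * p_fine
```

Because both inputs are probability distributions and the weight lies in (0, 1), the fused output is again a valid distribution at every pixel, so a per-pixel label can be read off with `argmax` over the last axis.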

Open Access Issue

A New Algorithm for the Establishing Data Association Between a Camera and a 2-D LIDAR

Lipu Zhou, Zhidong Deng

Tsinghua Science and Technology 2014, 19(3): 314-322

Published: 18 June 2014

Abstract

Downloads: 23

In this paper, we propose a new algorithm to establish the data association between a camera and a 2-D LIght Detection And Ranging sensor (LIDAR). In contrast to previous works, where data association is established by calibrating the intrinsic parameters of the camera and the extrinsic parameters between the camera and the LIDAR, we formulate the map between laser points and pixels as a 2-D homography. The line-point correspondence is employed to construct a geometric constraint on the homography matrix. This makes a checkerboard unnecessary: any object with a straight boundary can serve as an effective target. The calculation of the 2-D homography matrix consists of a linear least-squares solution of a homogeneous system followed by a nonlinear minimization of the geometric error in the image plane. Since the measurement quality affects the accuracy of the result, we investigate the equivalent constraint and show that placing the calibration target near the 2-D LIDAR provides sufficient constraints to calculate the 2-D homography matrix. Simulation and experimental results validate that the proposed algorithm is robust and accurate. Compared with previous works, which require two calibration processes and special calibration targets such as a checkerboard, our method is more flexible and easier to perform.
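
The linear step described in this abstract can be illustrated with a short sketch. It assumes each laser point p (homogeneous, in the scan plane) maps to a pixel that lies on a known image line l, which gives one linear constraint l^T H p = 0 per correspondence; stacking these and taking the smallest right singular vector yields H up to scale. The function name `estimate_homography_from_line_point` and the row normalization are assumptions for illustration, and the nonlinear refinement of the geometric error mentioned in the abstract is not included.

```python
import numpy as np


def estimate_homography_from_line_point(points, lines):
    """Linear least-squares estimate of a 3x3 homography H (up to scale).

    points: (N, 3) homogeneous laser points in the LIDAR scan plane.
    lines:  (N, 3) homogeneous image lines; lines[k] should contain the
            pixel that points[k] maps to, i.e. lines[k] @ H @ points[k] == 0.
    Requires N >= 8 correspondences in general position.
    """
    points = np.asarray(points, dtype=float)
    lines = np.asarray(lines, dtype=float)
    if points.ndim != 2 or points.shape[1] != 3 or points.shape != lines.shape:
        raise ValueError("points and lines must both have shape (N, 3)")
    if len(points) < 8:
        raise ValueError("at least 8 line-point correspondences are needed")

    # Homogeneous vectors are scale-free, so normalise rows for conditioning.
    points = points / np.linalg.norm(points, axis=1, keepdims=True)
    lines = lines / np.linalg.norm(lines, axis=1, keepdims=True)

    # l^T H p = sum_ij l_i H_ij p_j, so each row of A is the outer product
    # of l and p flattened in the same row-major order as H.
    A = np.einsum("ni,nj->nij", lines, points).reshape(len(points), 9)

    # The minimiser of ||A h|| subject to ||h|| = 1 is the right singular
    # vector associated with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / np.linalg.norm(H)
```

The result is defined only up to scale and sign, so any comparison against a reference homography should normalise both matrices first.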

Open Access Issue

C-HMAX: Artificial Cognitive Model Inspired by the Color Vision Mechanism of the Human Brain

Bo Yang, Lipu Zhou, Zhidong Deng

Tsinghua Science and Technology 2013, 18(1): 51-56

Published: 07 February 2013

Abstract

Downloads: 38

Artificial cognitive models and computational neuroscience methods have garnered great interest from both neurologists and leading analysts in recent years. Among the cognitive models, HMAX has been widely used in computer vision systems for its robust shape and texture features inspired by the ventral stream of the human brain. This work presents a Color-HMAX (C-HMAX) model, based on the HMAX model, which imitates the color vision mechanism of the human brain that the HMAX model does not include. C-HMAX is then applied to the German Traffic Sign Recognition Benchmark (GTSRB), which has 43 categories and 51 840 sample traffic signs, and achieves an accuracy of 98.41%, higher than most other models, including linear discriminant analysis and a multi-scale convolutional neural network.
