BING: Binarized normed gradients for objectness estimation at 300fps

Ming-Ming Cheng; Yun Liu; Wen-Yan Lin; Ziming Zhang; Paul L. Rosin; Philip H. S. Torr

doi:10.1007/s41095-018-0120-1

Research Article | Open Access

Ming-Ming Cheng¹, Yun Liu¹, Wen-Yan Lin², Ziming Zhang³, Paul L. Rosin⁴, Philip H. S. Torr⁵

1 CCS, Nankai University, Tianjin 300350, China.

2 Institute for Infocomm Research, Singapore, 138632.

3 MERL, Cambridge, MA 02139-1955, US.

4 Cardiff University, Wales, CF24 3AA, UK.

5 University of Oxford, Oxford, OX1 3PJ, UK.

* These authors contributed equally to this work.

Abstract

Training a generic objectness measure to produce object proposals has recently attracted significant interest. We observe that generic objects with well-defined closed boundaries can be detected by looking at the norm of gradients, after suitably resizing their corresponding image windows to a small fixed size. Based on this observation, and for computational reasons, we propose to resize the window to $8 \times 8$ and use the norm of the gradients as a simple 64D feature to describe it, for explicitly training a generic objectness measure. We further show how the binarized version of this feature, namely binarized normed gradients (BING), can be used for efficient objectness estimation, which requires only a few atomic operations (e.g., add, bitwise shift). To improve the localization quality of the proposals while maintaining efficiency, we propose a novel fast segmentation method and demonstrate its effectiveness in improving BING’s localization performance when used in multi-thresholding straddling expansion (MTSE) post-processing. On the challenging PASCAL VOC2007 dataset, using 1000 proposals per image and an intersection-over-union threshold of 0.5, our proposal method achieves a 95.6% object detection rate and 78.6% mean average best overlap in less than 0.005 seconds per image.
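The 64D feature and the overlap criterion mentioned in the abstract can be illustrated with a short sketch. This is not the authors' implementation: the nearest-neighbour resize, the [-1, 0, 1] difference filter with clamped borders, the clipped norm min(|gx| + |gy|, 255), the choice of keeping the top 4 bits of each value, and all function names are assumptions made purely for illustration.

```python
def resize_nearest(gray, out_h=8, out_w=8):
    """Nearest-neighbour resize of a 2-D list of intensities (assumed helper)."""
    in_h, in_w = len(gray), len(gray[0])
    return [
        [gray[(r * in_h) // out_h][(c * in_w) // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def normed_gradient(gray):
    """Per-pixel min(|gx| + |gy|, 255) from [-1, 0, 1] differences.

    Border pixels reuse their own value as the missing neighbour (assumption).
    """
    h, w = len(gray), len(gray[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            gx = gray[r][min(c + 1, w - 1)] - gray[r][max(c - 1, 0)]
            gy = gray[min(r + 1, h - 1)][c] - gray[max(r - 1, 0)][c]
            row.append(min(abs(gx) + abs(gy), 255))
        out.append(row)
    return out


def ng_feature(window):
    """Resize a window to 8x8 and flatten its normed gradients to 64 values."""
    small = resize_nearest(window, 8, 8)
    return [v for row in normed_gradient(small) for v in row]


def binarize(feature, n_bits=4):
    """Split each byte-valued entry into its top n_bits bit planes.

    Plane k holds bit (7 - k) of every entry, so the planes approximate the
    feature with an error below 2 ** (8 - n_bits).
    """
    return [[(v >> (7 - k)) & 1 for v in feature] for k in range(n_bits)]


def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0
```

For example, calling `ng_feature` on a 16 × 16 window containing a bright square returns 64 values that are largest along the square's boundary, and a proposal would count as a detection under the abstract's criterion when `iou(proposal, ground_truth) >= 0.5`.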

Keywords

object proposals; objectness; visual attention; category agnostic proposals

Computational Visual Media

Volume 5, Issue 1, March 2019

Pages 3-20

Cite this article:

Cheng M-M, Liu Y, Lin W-Y, et al. BING: Binarized normed gradients for objectness estimation at 300fps. Computational Visual Media, 2019, 5(1): 3-20. https://doi.org/10.1007/s41095-018-0120-1