SECOND: Sparsely Embedded Convolutional Detection

Yan Yan; Yuyin Mao; Bo Li

doi:10.3390/s18103337

journal article Open Access Oct 06, 2018

Sensors Vol. 18 No. 10 pp. 3337 · MDPI AG

Abstract

LiDAR-based or RGB-D-based object detection is used in numerous applications, ranging from autonomous driving to robot vision. Voxel-based 3D convolutional networks have been used for some time to enhance the retention of information when processing point cloud LiDAR data. However, problems remain, including a slow inference speed and low orientation estimation performance. We therefore investigate an improved sparse convolution method for such networks, which significantly increases the speed of both training and inference. We also introduce a new form of angle loss regression to improve the orientation estimation performance and a new data augmentation approach that can enhance the convergence speed and performance. The proposed network produces state-of-the-art results on the KITTI 3D object detection benchmarks while maintaining a fast inference speed.
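The abstract does not state the form of the angle loss, so the following is only a minimal sketch of why a periodic angle loss can help orientation regression. It assumes a sine-of-difference formulation passed through a smooth L1 penalty; the function names `smooth_l1`, `sine_angle_loss`, and `naive_angle_loss` and the `beta` parameter are hypothetical and chosen for illustration.

```python
import math


def smooth_l1(x: float, beta: float = 1.0) -> float:
    """Smooth L1 (Huber-style) penalty: quadratic near zero, linear elsewhere."""
    ax = abs(x)
    return 0.5 * ax * ax / beta if ax < beta else ax - 0.5 * beta


def sine_angle_loss(theta_pred: float, theta_true: float) -> float:
    """Assumed periodic angle loss: smooth L1 applied to sin(pred - true).

    Because sin has period 2*pi and sin(pi) = 0, headings that differ by a
    half turn (which describe the same box footprint) incur almost no penalty.
    """
    return smooth_l1(math.sin(theta_pred - theta_true))


def naive_angle_loss(theta_pred: float, theta_true: float) -> float:
    """Baseline for comparison: smooth L1 on the raw angle difference."""
    return smooth_l1(theta_pred - theta_true)


if __name__ == "__main__":
    # A box flipped by pi: large naive penalty, near-zero sine-based penalty.
    print(naive_angle_loss(0.0, math.pi))
    print(sine_angle_loss(0.0, math.pi))
```

Under this assumption, a prediction rotated by a half turn is treated as nearly correct, so a separate cue would be needed to tell the front of an object from its back.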

Details

Published: Oct 06, 2018
Vol/Issue: 18(10)
Pages: 3337

Authors

Yan Yan

State Key Laboratory of Power Transmission Equipment and System Security and New Technology, Chongqing University, Chongqing 400044, China; TrunkTech Co., Ltd., No. 3, Danling street, ZhongGuan Town, HaiDian District, Beijing 100089, China

Yuyin Mao

State Key Laboratory of Power Transmission Equipment and System Security and New Technology, Chongqing University, Chongqing 400044, China

Bo Li

TrunkTech Co., Ltd., No. 3, Danling street, ZhongGuan Town, HaiDian District, Beijing 100089, China

Cite This Article

Yan Yan, Yuyin Mao, Bo Li (2018). SECOND: Sparsely Embedded Convolutional Detection. Sensors, 18(10), 3337. https://doi.org/10.3390/s18103337
