Fast Scene Understanding

We propose a network architecture for efficient scene understanding. This work makes three main contributions. The first is an Improved Guided Upsampling Module that can replace in toto the decoder part of common semantic segmentation networks.
The second is a new module based on spatial sampling to perform instance segmentation. It provides very fast instance segmentation, needing only thresholding as a post-processing step at inference time. Finally, we propose a novel, efficient network design that includes the new modules, and we test it on different datasets for outdoor scene understanding. To our knowledge, our network is one of the most efficient architectures for scene understanding published to date; it is also 8.6% more accurate than the fastest competitor on semantic segmentation and almost five times faster than the most efficient network for instance segmentation.
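
To illustrate what "only thresholding as a post-processing step" could look like in practice, here is a minimal sketch. It is not the paper's implementation: it assumes, purely for illustration, that the network emits one score map per candidate instance with values in [0, 1]. The function name threshold_instances and the parameters threshold and min_pixels are hypothetical.

```python
from typing import List

ScoreMap = List[List[float]]  # one H x W map of per-pixel scores in [0, 1]
Mask = List[List[bool]]       # one H x W binary instance mask


def threshold_instances(
    score_maps: List[ScoreMap],
    threshold: float = 0.5,   # hypothetical cut-off, not taken from the paper
    min_pixels: int = 1,      # hypothetical filter for empty candidates
) -> List[Mask]:
    """Turn per-instance score maps into binary masks by thresholding alone.

    Assumption: each score map corresponds to one candidate instance.
    No clustering or other grouping step is applied.
    """
    masks: List[Mask] = []
    for score_map in score_maps:
        # A pixel belongs to the instance when its score exceeds the threshold.
        mask = [[score > threshold for score in row] for row in score_map]
        # Drop candidates that end up with too few foreground pixels.
        if sum(sum(row) for row in mask) >= min_pixels:
            masks.append(mask)
    return masks


if __name__ == "__main__":
    maps = [
        [[0.9, 0.2], [0.8, 0.1]],  # candidate with two confident pixels
        [[0.1, 0.1], [0.2, 0.3]],  # candidate with no pixel above 0.5
    ]
    for mask in threshold_instances(maps):
        print(mask)
```

The point of the sketch is that the cost of this step is a single pass over the output maps, which is consistent with the speed claim above; the actual output format of the network is described in the paper.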

Presented at CVPR 2019 Workshop on Autonomous Driving
arXiv: https://arxiv.org/abs/1905.09033

Publications

1. Spatial Sampling Network for Fast Scene Understanding
(Davide Mazzini, Raimondo Schettini) In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2019.

@inproceedings{Mazzini_2019_CVPR_Workshops,
 author = {Mazzini, Davide and Schettini, Raimondo},
 year = {2019},
 month = {6},
 title = {Spatial Sampling Network for Fast Scene Understanding},
 booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops}}

2. Guided Upsampling Network for Real-Time Semantic Segmentation
(Davide Mazzini) Accepted in July 2018.

@unpublished{gun,
 author = {Mazzini, Davide},
 year = {2018},
 note = {Accepted in July 2018},
 title = {Guided Upsampling Network for Real-Time Semantic Segmentation},
 projectref = {http://www.ivl.disco.unimib.it/activities/semantic-segmentation/}}