Benchmarking Algorithms for Food Localization and Semantic Segmentation
We create a new dataset composed of 5,000 images of 50 diverse food categories. Each image carries accurate pixel-wise annotations. The dataset is then augmented with the same 5,000 images rendered under different acquisition distortions (illuminant change, JPEG compression, Gaussian noise, and Gaussian blur), bringing the total to 120,000 images.
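As a rough illustration of three of the distortions named above, the following minimal NumPy sketch applies Gaussian noise, Gaussian blur, and a simple illuminant change to an RGB image stored as floats in [0, 1]. The function names, the per-channel gain model for the illuminant, and all parameter values are assumptions made for this example; they are not taken from the dataset's actual rendering pipeline. JPEG compression is omitted because it needs an image codec.

```python
import numpy as np


def add_gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    noisy = img + rng.normal(0.0, sigma, size=img.shape)
    return np.clip(noisy, 0.0, 1.0)


def gaussian_kernel(sigma):
    """Return a normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()


def gaussian_blur(img, sigma):
    """Blur an H x W x C image with a separable Gaussian filter."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    # Replicate border pixels so the output keeps the input size.
    padded = np.pad(img, ((r, r), (r, r), (0, 0)), mode="edge")
    # Filter along the vertical axis, then along the horizontal axis.
    out = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, padded)
    out = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, out)
    return out


def change_illuminant(img, gains):
    """Scale each color channel by its own gain (a diagonal illuminant model)."""
    return np.clip(img * np.asarray(gains, dtype=float), 0.0, 1.0)


# Hypothetical usage on a random stand-in image.
rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))
noisy = add_gaussian_noise(image, sigma=0.05, rng=rng)
blurred = gaussian_blur(image, sigma=1.0)
warm = change_illuminant(image, gains=(1.2, 1.0, 0.8))
```

Each function returns an image of the same shape as its input, so distorted copies can reuse the original pixel-wise annotations unchanged.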
Deep Learning for Logo Recognition
In this project we present a method for logo recognition based on deep learning. Our recognition …
Fast Scene Understanding
We propose a network architecture to perform efficient scene understanding. This work presents three main …
Food 524 Database
Food524DB is the largest publicly available food dataset, with 247,636 images in 524 food classes obtained by merging food classes from existing state-of-the-art datasets. It can be used for food recognition.
Automatic food recognition is an important task for supporting users in their daily dietary monitoring and for keeping track of their food consumption. We have designed datasets and algorithms for automatic dietary monitoring of canteen customers based on robust computer vision techniques.
The Food-475 database is one of the largest publicly available food databases, with 475 food classes and 247,636 images obtained by merging four publicly available food databases.
Hierarchical car classification
We address the task of classifying car images at multiple levels of detail, ranging from …
Local Detectors and Compact Descriptors for Visual Search
A review of existing methods for local visual detectors and descriptors.
Real Time Semantic Segmentation
Semantic segmentation architectures are mainly built upon an encoder-decoder structure. These models perform subsequent downsampling …
Semi-automatic and Automatic Video Annotation
We have developed iVAT, an interactive Video Annotation Tool. It supports manual, semi-automatic, and automatic annotation by letting the user interact with various detection algorithms.