Tutorial: Vision for robotics
Tim Patten (TU Vienna, Austria), Cesar Cadena (ETH Zurich, Switzerland)

Robot perception is the act of interpreting sensor data to generate an awareness of the surrounding environment, and robot vision specifically interprets the data from onboard cameras. In recent years, robot vision has made significant leaps forward due to the availability of inexpensive RGB-D cameras, which has enabled direct perception of the 3D world, and the exploitation of deep learning, which has established the state of the art for many semantic tasks. Despite this progress, there is a frustrating performance gap between computer vision algorithms tested in the lab and those deployed in the wild.

The aim of this tutorial is to cover the concepts, methods, applications and challenges of vision for robotics in order to both expose and bridge the gap between computer vision and robot vision. The tutorial will be divided into two sessions. The first session will give an overview of simple robotic platforms that are suitable for different tasks, to encourage computer vision experts to test their work in real-world domains. The second session will cover semantic understanding of the environment, with a focus on modern deep learning approaches that are frequently applied in robotics.

Part I: Which robotic platform should I use?

This talk will look in detail at available robot platforms that are suitable for robotic vision experimentation. The focus will be on out-of-the-box platforms that can be set up easily and used directly to deploy vision algorithms. The talk will include an overview of system prerequisites and software frameworks, and a description of common platforms that are usable with minimal hardware knowledge.

Part II: Semantics and deep learning for robotic perception

This talk will look at the typical vision tasks faced by robots and clarify how methods and results differ when algorithms are applied in the lab as opposed to the real world. It will first cover the challenges of employing standard techniques and then give an overview of frequently used methods in the robotics context by describing what does and does not work. The talk will also present the state of the art for vision tasks, such as object detection, as well as robotics-related tasks, such as grasp point estimation.