Colloquium Cognitive Systems

Exploiting Representations and Models for Data-Efficient Reinforcement Learning
Dr. Joschka Boedecker, Uni Freiburg


Abstract:
Reinforcement Learning provides an approach to learn approximately optimal behavior for an artificial agent even when the dynamics of a given task are unknown. There have been impressive applications of this technology in recent years, including beating the best human players at the game of Go. However, current approaches often require millions of samples to reach high levels of performance, which makes them difficult to apply to real-world devices such as robots. We discuss how to learn representations from data that facilitate control tasks, and how to use learned dynamics models to generate "imagined" experiences that significantly speed up the learning process.
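
To make the second idea concrete, below is a minimal Dyna-style sketch of learning from "imagined" experience: a model of the environment is learned from real transitions and then replayed to generate extra training updates. This is an illustrative toy, not the method presented in the talk (which combines uncertainty estimates with deep networks for continuous control); the chain environment and all hyperparameters are assumptions made for this example.

# Minimal Dyna-style sketch: tabular Q-learning augmented with "imagined"
# transitions replayed from a learned model. Illustrative only; the toy
# chain environment and hyperparameters are assumptions for this example.
import random
from collections import defaultdict

N_STATES, GOAL = 10, 9           # chain MDP: walk left/right, reward at GOAL
ACTIONS = (-1, +1)

def step(s, a):
    s2 = min(max(s + a, 0), N_STATES - 1)
    return s2, (1.0 if s2 == GOAL else 0.0), s2 == GOAL

Q = defaultdict(float)           # state-action values
model = {}                       # learned dynamics model: (s, a) -> (s', r)

def greedy(s):
    best = max(Q[(s, a)] for a in ACTIONS)
    return random.choice([a for a in ACTIONS if Q[(s, a)] == best])

def q_update(s, a, r, s2, alpha=0.1, gamma=0.95):
    target = r + gamma * max(Q[(s2, b)] for b in ACTIONS)
    Q[(s, a)] += alpha * (target - Q[(s, a)])

for episode in range(100):
    s = 0
    for _ in range(500):         # cap episode length
        a = random.choice(ACTIONS) if random.random() < 0.1 else greedy(s)
        s2, r, done = step(s, a)
        q_update(s, a, r, s2)    # learn from the real transition
        model[(s, a)] = (s2, r)  # update the dynamics model
        for _ in range(10):      # imagined experience: replay the model
            ms, ma = random.choice(list(model))
            ms2, mr = model[(ms, ma)]
            q_update(ms, ma, mr, ms2)
        s = s2
        if done:
            break

print("greedy actions per state:", [greedy(s) for s in range(N_STATES)])

The same pattern scales toward the setting discussed in the talk by replacing the Q-table with a neural network and the lookup-table model with a learned neural dynamics model, ideally querying the model only where its predictions can be trusted.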

Bio: Joschka Boedecker studied computer science at the University of Koblenz-Landau, Germany, and artificial intelligence at the University of Georgia, USA. He received his PhD degree in engineering from Osaka University, Japan, in 2011, and continued to work there as a postdoc until 2012. In 2013, he joined the Machine Learning Lab of the University of Freiburg, Germany, which he led as interim professor from 2015 to 2017. Since fall 2017, he has been assistant professor of neurorobotics at the University of Freiburg. His research interests lie at the intersection of machine learning and robotics, with a focus on deep reinforcement learning.

Literature:
* Manuel Watter, Jost Springenberg, Joschka Boedecker, Martin Riedmiller (2015) Embed to Control: A Locally Linear Latent Dynamics Model for Control from Raw Images. In Advances in Neural Information Processing Systems 28. pp. 2728–2736.
Link: papers.nips.cc/paper/5964-embed-to-control-a-locally-linear-latent-dynamics-model-for-control-from-raw-images.pdf
* Gabriel Kalweit, Joschka Boedecker (2017) Uncertainty-driven Imagination for Continuous Deep Reinforcement Learning. Proceedings of the 1st Annual Conference on Robot Learning, PMLR 78:195-206.
Link: proceedings.mlr.press/v78/kalweit17a/kalweit17a.pdf