Blog

Quick Summary of New DeepMind Paper I Found Interesting

Allan Grosvenor, CEO of MSBAI

February 07, 2020


I'm very interested in AI methods that improve learning efficiency. I found this new paper from DeepMind interesting and thought you might too.

Motivation:

Humans explicitly learn notions of objects, relations, geometry and cardinality in a task-agnostic manner and re-purpose this knowledge for future tasks.

  • Transporter learns keypoints across commonly used RL environments; the proposed architecture is robust to varying number, size & motion of objects
  • Using learned keypoints as state input leads to policies that perform better than model-free & model-based RL baselines (see the sketch after this list)
  • Demonstrates drastic reductions in search complexity
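On the "keypoints as state input" point: instead of raw pixels, the policy consumes the K (x, y) keypoint coordinates (and, in the paper, their corresponding image features). Here's a minimal sketch of that interface, assuming a gym-style 84×84 frame; extract_keypoints and KeypointPolicy are placeholder names for illustration, not the paper's actual implementation:

```python
import numpy as np

def extract_keypoints(frame: np.ndarray, k: int = 5) -> np.ndarray:
    """Stand-in for the Transporter keypoint encoder. In the paper this is a
    learned, task-agnostic CNN; here it just returns k dummy (x, y) pairs."""
    h, w, _ = frame.shape
    return np.random.rand(k, 2) * [w, h]            # placeholder coordinates

class KeypointPolicy:
    """Toy policy mapping flattened keypoint coordinates to one of n_actions
    discrete actions through a single linear layer (illustrative only)."""
    def __init__(self, k: int, n_actions: int):
        self.weights = np.random.randn(k * 2, n_actions) * 0.01

    def act(self, keypoints: np.ndarray) -> int:
        logits = keypoints.reshape(-1) @ self.weights
        return int(np.argmax(logits))

# The environment still returns pixels, but the agent only ever sees the
# low-dimensional keypoint state.
frame = np.zeros((84, 84, 3), dtype=np.float32)     # fake Atari-sized frame
policy = KeypointPolicy(k=5, n_actions=18)
action = policy.act(extract_keypoints(frame))
```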

First hypothesis: task-agnostic learning of object keypoints can enable fast learning of goal-directed policies.

Second hypothesis: learned keypoints can enable significantly better task-independent exploration.

Search efficiency improvement:

A random-action agent would need to search a space of 18^100 raw action sequences (18 actions over a 100-step episode). An agent that observes 5 keypoints and commits to keypoint options of length T = 20 only has to search (5×4)^(100/20) = 20^5 option sequences, a search-space reduction of well over 10^100.
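A quick sanity check of that arithmetic (the 18 actions, 100-step horizon, 5 keypoints and T = 20 are the paper's example; the script below is just illustrative bookkeeping):

```python
import math

# Random agent: one of 18 raw actions at each of 100 time steps.
raw_space = 18 ** 100                              # ~10^126 trajectories

# Keypoint agent: pick one of 5 keypoints and one of 4 directions,
# then commit to that option for T = 20 steps at a time.
option_space = (5 * 4) ** (100 // 20)              # 20^5 ~ 3.2 million

print(f"raw search space    ~ 10^{math.log10(raw_space):.0f}")
print(f"option search space ~ 10^{math.log10(option_space):.0f}")
print(f"reduction           ~ 10^{math.log10(raw_space / option_space):.0f}")
```

The reduction works out to roughly 119 orders of magnitude.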

Results:

  • Transporter consistently tracks the salient object keypoints over long time horizons and outperforms existing unsupervised keypoint-detection baselines
  • Using the learned keypoints and corresponding features within a reinforcement learning context can lead to data-efficient learning in Atari games
  • Surprisingly, the learned options model is able to play several Atari games via random sampling of options, made possible by learning skills that move the discovered game avatar as far as possible without dying (see the sketch after this list)
  • Learned keypoint options consistently outperform the random actions baseline by a large margin
  • Most notably, this is achieved without rewards or (extrinsic) task-directed learning, so the learned keypoints are stable enough to support complex object-oriented skills in the Atari domain
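To make "random sampling of options" concrete: each option corresponds to picking one of the tracked keypoints (e.g. the discovered avatar) and a direction, then acting for a fixed number of low-level steps so that the chosen keypoint moves that way. Below is a rough sketch of such an exploration loop under a gym-style API; env, extract_keypoints and action_that_moves_keypoint are assumptions standing in for the paper's learned components, not its actual code:

```python
import random

K_KEYPOINTS = 5
DIRECTIONS = ["up", "down", "left", "right"]
OPTION_LENGTH = 20            # T low-level steps per option

def explore_with_keypoint_options(env, extract_keypoints,
                                  action_that_moves_keypoint, horizon=100):
    """Task-independent exploration: repeatedly sample a (keypoint, direction)
    option at random and execute it for OPTION_LENGTH raw steps."""
    obs = env.reset()
    for _ in range(horizon // OPTION_LENGTH):
        kp_index = random.randrange(K_KEYPOINTS)      # which object to move
        direction = random.choice(DIRECTIONS)         # which way to move it
        for _ in range(OPTION_LENGTH):
            keypoints = extract_keypoints(obs)
            # Low-level controller: choose the raw action expected to push
            # the selected keypoint in the selected direction (learned in
            # the paper, assumed given here).
            action = action_that_moves_keypoint(keypoints[kp_index], direction)
            obs, _, done, _ = env.step(action)
            if done:                                  # e.g. the avatar died
                obs = env.reset()
                break
```

No rewards are consulted anywhere in this loop, which is the point of the last bullet: the exploration signal comes entirely from the keypoints themselves.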