Unsupervised Reinforcement Learning With Expert Demonstration!

Reinforcement Learning 15/02/2021

3 main points
✔️ Proposes GCSL, a supervised learning approach to reinforcement learning for goal-reaching tasks
✔️ Generates supervised training data for the policy by relabeling the data it has collected (Hindsight Relabelling)
✔️ Performs as well as or better than conventional reinforcement learning and other comparison methods on a variety of tasks

Learning to Reach Goals via Iterated Supervised Learning
written by Dibya Ghosh, Abhishek Gupta, Ashwin Reddy, Justin Fu, Coline Devin, Benjamin Eysenbach, Sergey Levine
(Submitted on 12 Dec 2019 (v1), last revised 2 Oct 2020 (this version, v4))
Comments: Accepted to arXiv.
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

Introduction

Here we introduce a paper accepted for ICLR 2020. Reinforcement Learning (RL) has difficulty learning goal-reaching tasks, especially when the reward is sparse. Imitation Learning, on the other hand, can solve such tasks by supervised learning on expert demonstrations, but those demonstrations must first be collected.

In this article, we introduce Goal-Conditioned Supervised Learning (GCSL), which learns a policy without expert demonstrations: it relabels the data collected by the policy being learned and then performs supervised learning on that relabeled data.
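The relabeling idea can be illustrated with a small sketch. This is not the authors' implementation: the function name `relabel_trajectory`, the `(state, goal, action)` tuple layout, and the toy one-dimensional trajectory are assumptions made for illustration. The point it shows is that any state the policy actually reached later in a trajectory can be treated, in hindsight, as the goal for the actions that led there, which yields supervised training pairs without expert demonstrations.

```python
from typing import Any, List, Sequence, Tuple

# (state, goal, action): the input is (state, goal), the target label is action.
Example = Tuple[Any, Any, Any]


def relabel_trajectory(states: Sequence[Any], actions: Sequence[Any]) -> List[Example]:
    """Hindsight relabeling of one trajectory (illustrative sketch).

    `states` holds s_0 .. s_T and `actions` holds a_0 .. a_{T-1}.
    For every step t and every later step h > t, the state s_h that was
    actually reached is treated as the goal, and a_t as the label.
    """
    if len(states) != len(actions) + 1:
        raise ValueError("expected one more state than actions")

    examples: List[Example] = []
    for t, action in enumerate(actions):
        for h in range(t + 1, len(states)):
            examples.append((states[t], states[h], action))
    return examples


if __name__ == "__main__":
    # Toy trajectory on a line: the agent moved right three times.
    data = relabel_trajectory([0, 1, 2, 3], ["right", "right", "right"])
    for state, goal, action in data:
        print(f"at {state}, to reach {goal}, take {action}")
```

The resulting tuples could then be fed to any supervised learner that predicts the action from the state and goal. Collecting new trajectories with the updated policy and repeating the process gives the iterated scheme the paper title refers to.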


山田
