Supervised Reinforcement Learning Without Expert Demonstrations!
3 main points
✔️ Proposes GCSL, a reinforcement learning method for goal-reaching tasks that is trained with supervised learning
✔️ Generates supervised training data for the policy by relabeling the data it collects (hindsight relabeling)
✔️ Performs as well as or better than standard reinforcement learning baselines on a variety of tasks
Learning to Reach Goals via Iterated Supervised Learning
written by Dibya Ghosh, Abhishek Gupta, Ashwin Reddy, Justin Fu, Coline Devin, Benjamin Eysenbach, Sergey Levine
(Submitted on 12 Dec 2019 (v1), last revised 2 Oct 2020 (this version, v4))
Comments: Accepted to arXiv.
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Introduction
Here we introduce a paper accepted to ICLR 2021. Reinforcement learning (RL) struggles to learn goal-reaching tasks, especially when the reward is sparse. Imitation learning, on the other hand, can solve such tasks with supervised learning, but it requires collecting expert demonstrations.
In this article, we introduce goal-conditioned supervised learning (GCSL), which learns a policy without any expert demonstrations. The data collected by the policy being learned is relabeled so that states it actually reached are treated as goals, and the relabeled data is then used for supervised learning.
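The relabeling step can be illustrated with a small Python sketch. It is a minimal illustration, not the authors' implementation: the function name `relabel_trajectory`, the tuple layout, and the toy trajectory are assumptions made for this example. The idea it shows is that any state reached later in a trajectory can serve as the goal for an earlier state-action pair.

```python
import random


def relabel_trajectory(states, actions, rng=random):
    """Hindsight relabeling sketch (hypothetical helper, not the paper's code).

    `states` has one more entry than `actions`: states[t + 1] is the state
    reached after taking actions[t] in states[t].
    Returns a list of (state, action, goal, horizon) training examples.
    """
    num_steps = len(actions)
    examples = []
    for t in range(num_steps):
        # Pick any later time step; the state reached there becomes the goal.
        later = rng.randint(t + 1, num_steps)  # inclusive on both ends
        horizon = later - t
        examples.append((states[t], actions[t], states[later], horizon))
    return examples


# Toy trajectory on a 1-D chain: each action moves one step to the right.
toy_states = [0, 1, 2, 3]
toy_actions = [1, 1, 1]
for example in relabel_trajectory(toy_states, toy_actions, random.Random(0)):
    print(example)
```

Each resulting tuple can be used as a supervised example: the input is the state together with the goal (and, optionally, the horizon), and the target is the action that was taken.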