Path integral guided policy search

Type: Article

Publication Date: 2017-05-01

Citations: 135

DOI: https://doi.org/10.1109/icra.2017.7989384

Abstract

We present a policy search method for learning complex feedback control policies that map from high-dimensional sensory inputs to motor torques, for manipulation tasks with discontinuous contact dynamics. We build on a prior technique called guided policy search (GPS), which iteratively optimizes a set of local policies for specific instances of a task, and uses these to train a complex, high-dimensional global policy that generalizes across task instances. We extend GPS in the following ways: (1) we propose the use of a model-free local optimizer based on path integral stochastic optimal control (PI²), which enables us to learn local policies for tasks with highly discontinuous contact dynamics; and (2) we enable GPS to train on a new set of task instances in every iteration by using on-policy sampling: this increases the diversity of the instances that the policy is trained on, and is crucial for achieving good generalization. We show that these contributions enable us to learn deep neural network policies that can directly perform torque control from visual input. We validate the method on a challenging door opening task and a pick-and-place task, and we demonstrate that our approach substantially outperforms the prior LQR-based local policy optimizer on these tasks. Furthermore, we show that on-policy sampling significantly increases the generalization ability of these policies.
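
The abstract names path integral stochastic optimal control as the model-free local optimizer but gives no formulas. As a rough illustration only, the sketch below shows the core idea usually associated with path-integral-style updates: sampled controls are reweighted by the exponentiated negative of their trajectory cost, and the new mean is the weighted average. The function names, the `temperature` parameter, and the toy data are all assumptions made for this example, not taken from the paper, and the sketch leaves out everything else the full method involves.

```python
import math


def pi2_weights(costs, temperature=1.0):
    """Turn trajectory costs into probabilities; lower cost means higher weight.

    Hypothetical helper: a softmax over negative costs, which is the
    reweighting step commonly used in path-integral-style updates.
    """
    min_cost = min(costs)  # shift by the minimum for numerical stability
    exps = [math.exp(-(c - min_cost) / temperature) for c in costs]
    total = sum(exps)
    return [e / total for e in exps]


def pi2_mean_update(samples, costs, temperature=1.0):
    """Return the probability-weighted average of the sampled control vectors."""
    weights = pi2_weights(costs, temperature)
    dim = len(samples[0])
    return [sum(w * s[d] for w, s in zip(weights, samples)) for d in range(dim)]


# Toy data (assumed): three sampled 2-D controls and their trajectory costs.
samples = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
costs = [3.0, 1.0, 2.0]
weights = pi2_weights(costs)
new_mean = pi2_mean_update(samples, costs)
```

Because the update needs only sampled costs and no gradient of the dynamics, a scheme of this kind can be applied where contact makes the dynamics discontinuous, which is the motivation the abstract gives for replacing the LQR-based optimizer.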

Locations

  • arXiv (Cornell University)

Similar Works

  • Path Integral Guided Policy Search (2016). Yevgen Chebotar, Mrinal Kalakrishnan, Ali Abdullah Yahya, Adrian Li, Stefan Schaal, Sergey Levine
  • Guided Policy Search via Approximate Mirror Descent (2016). William Montgomery, Sergey Levine
  • Learning Robust Manipulation Skills with Guided Policy Search via Generative Motor Reflexes (2018). Philipp Ennen, Pia Bresenitz, René Vossen, Frank Hees
  • Learning Robust Manipulation Skills with Guided Policy Search via Generative Motor Reflexes (2019). Philipp Ennen, Pia Bresenitz, René Vossen, Frank Hees
  • Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization (2016). Chelsea Finn, Sergey Levine, Pieter Abbeel
  • Enhancing Task Performance of Learned Simplified Models via Reinforcement Learning (2023). Hien X. Bui, Michael Posa
  • Guided Policy Search as Approximate Mirror Descent (2016). William Montgomery, Sergey Levine
  • Combining Model-Based and Model-Free Updates for Trajectory-Centric Reinforcement Learning (2017). Yevgen Chebotar, Karol Hausman, Marvin Zhang, Gaurav S. Sukhatme, Stefan Schaal, Sergey Levine
  • Learning Contact-Rich Manipulation Skills with Guided Policy Search (2015). Sergey Levine, Nolan Wagener, Pieter Abbeel
  • A Comparison of Action Spaces for Learning Manipulation Tasks (2019). Patrick Varin, Lev Grossman, Scott Kuindersma
  • Learning Dexterous Manipulation from Suboptimal Experts (2020). Rae Jeong, Jost Tobias Springenberg, Jackie Kay, Daniel Zheng, Yuxiang Zhou, Alexandre Galashov, Nicolas Heess, Francesco Nori
  • Enhancing Exploration with Diffusion Policies in Hybrid Off-Policy RL: Application to Non-Prehensile Manipulation (2024). H. A. Le, Manal M. Gabriel, Tai Hoang, Gerhard Neumann, Ngo Anh Vien
  • LVIS: Learning from Value Function Intervals for Contact-Aware Robot Controllers (2018). Robin Deits, Twan Koolen, Russ Tedrake

Cited by (83)

  • Optimal adaptive inspection and maintenance planning for deteriorating structural systems (2021). Elizabeth Bismut, Dániel Straub
  • Supervised Learning and Reinforcement Learning of Feedback Models for Reactive Behaviors: Tactile Feedback Testbed (2020). Giovanni Sutanto, Katharina Rombach, Yevgen Chebotar, Zhe Su, Stefan Schaal, Gaurav S. Sukhatme, Franziska Meier
  • Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks (2019). Michelle A. Lee, Yuke Zhu, Krishnan Srinivasan, Parth Shah, Silvio Savarese, Li Fei-Fei, Animesh Garg, Jeannette Bohg
  • Reinforcement Learning on Variable Impedance Controller for High-Precision Robotic Assembly (2019). Jianlan Luo, Eugen Solowjow, Chengtao Wen, Juan Aparicio Ojea, Alice M. Agogino, Aviv Tamar, Pieter Abbeel
  • Critic PI2: Master Continuous Planning via Policy Improvement with Path Integrals and Deep Actor-Critic Reinforcement Learning (2021). He Ba, Jiajun Fan, Xian Guo, Jianye Hao
  • Reinforcement learning for non-prehensile manipulation: Transfer from simulation to physical system (2018). Kendall Lowrey, Svetoslav Kolev, Jeremy Dao, Aravind Rajeswaran, Emanuel Todorov
  • Model-based Lookahead Reinforcement Learning (2019). Zhang-Wei Hong, Joni Pajarinen, Jan Peters
  • Prescribed Performance Control Guided Policy Improvement for Satisfying Signal Temporal Logic Tasks (2019). Péter Várnai, Dimos V. Dimarogonas
  • Curiosity-Driven Experience Prioritization via Density Estimation (2019). Rui Zhao, Volker Tresp
  • Region Growing Curriculum Generation for Reinforcement Learning (2018). Artem Molchanov, Karol Hausman, Gaurav S. Sukhatme
  • Robot Playing Kendama with Model-Based and Model-Free Reinforcement Learning (2020). Shidi Li
  • End-To-End Robotic Reinforcement Learning without Reward Engineering (2019). Avi Singh, Larry Yang, Chelsea Finn, Sergey Levine
  • Collective robot reinforcement learning with distributed asynchronous guided policy search (2017). Ali Abdullah Yahya, Adrian Li, Mrinal Kalakrishnan, Yevgen Chebotar, Sergey Levine
  • ACDER: Augmented Curiosity-Driven Experience Replay (2020). Boyao Li, Tao Lü, Jiayi Li, Ning Lu, Yinghao Cai, Shuo Wang
  • Learning Modular Robot Control Policies (2023). Julian Whitman, Matthew Travers, Howie Choset
  • Experience Augmentation: Boosting and Accelerating Off-Policy Multi-Agent Reinforcement Learning (2020). Zhenhui Ye, Yining Chen, Guanghua Song, Bowei Yang, Shen Fan
  • Reinforcement and Imitation Learning for Diverse Visuomotor Skills (2018). Yuke Zhu, Ziyu Wang, Josh Merel, Andrei A. Rusu, Tom Erez, Serkan Cabi, Saran Tunyasuvunakool, János Kramár, Raia Hadsell, Nando de Freitas
  • A Survey of Behavior Learning Applications in Robotics -- State of the Art and Perspectives (2019). Alexander Fabisch, Christoph Petzoldt, Marc Otto, Frank Kirchner
  • Hybrid Control for Learning Motor Skills (2021). Ian Abraham, Alexander Broad, Allison Pinosky, Brenna Argall, Todd Murphey
  • Reinforcement learning using expectation maximization based guided policy search for stochastic dynamics (2021). Prakash Mallick, Zhiyong Chen, Mohsen Zamani
  • Unsupervised Perceptual Rewards for Imitation Learning (2017). Pierre Sermanet, Kelvin Xu, Sergey Levine
  • Guided Policy Improvement for Satisfying STL Tasks using Funnel Adaptation (2020). Péter Várnai, Dimos V. Dimarogonas
  • A multilevel approach for stochastic nonlinear optimal control (2020). Ajay Jasra, Jeremy Heng, Yaxian Xu, Adrian N. Bishop
  • Reinforcement Learning Using Expectation Maximization Based Guided Policy Search for Stochastic Dynamics (2020). Prakash Mallick, Zhiyong Chen, Mohsen Zamani
  • Reset-Free Reinforcement Learning via Multi-Task Learning: Learning Dexterous Manipulation Behaviors without Human Intervention (2021). Abhishek Gupta, Justin Yu, Tony Z. Zhao, Vikash Kumar, Aaron Rovinsky, Kelvin Xu, T. Devlin, Sergey Levine
  • Local Policy Optimization for Trajectory-Centric Reinforcement Learning (2020). Patrik Kolaric, Devesh K. Jha, Arvind U. Raghunathan, Frank L. Lewis, Mouhacine Benosman, Diego Romeres, Daniel Nikovski
  • Deep predictive policy training using reinforcement learning (2017). Ali Ghadirzadeh, Atsuto Maki, Danica Kragić, Mårten Björkman
  • How to train your robot with deep reinforcement learning: lessons we have learned (2021). Julian Ibarz, Jie Tan, Chelsea Finn, Mrinal Kalakrishnan, Peter Pástor, Sergey Levine
  • Motion Planning Networks: Bridging the Gap Between Learning-Based and Classical Motion Planners (2020). Ahmed H. Qureshi, Yinglong Miao, Anthony Simeonov, Michael C. Yip
  • Making Sense of Vision and Touch: Learning Multimodal Representations for Contact-Rich Tasks (2020). Michelle A. Lee, Yuke Zhu, Peter Zachares, Matthew Tan, Krishnan Srinivasan, Silvio Savarese, Li Fei-Fei, Animesh Garg, Jeannette Bohg
  • Complex Robotic Manipulation via Graph-Based Hindsight Goal Generation (2021). Zhenshan Bing, Matthias Brucker, Fabrice O. Morin, Rui Li, Xiaojie Su, Kai Huang, Alois Knoll
  • Complex Robotic Manipulation via Graph-Based Hindsight Goal Generation (2020). Zhenshan Bing, Matthias Brucker, Fabrice O. Morin, Kai Huang, Alois Knoll
  • Benchmarking Model-Based Reinforcement Learning (2019). Tingwu Wang, Xuchan Bao, Ignasi Clavera, Jerrick Hoang, Yeming Wen, Eric Langlois, Shunshi Zhang, Guodong Zhang, Pieter Abbeel, Jimmy Ba
  • Planning under Uncertainty to Goal Distributions (2020). Adam Conkey, Tucker Hermans
  • robo-gym – An Open Source Toolkit for Distributed Deep Reinforcement Learning on Real and Simulated Robots (2020). Matteo Lucchi, Friedemann Zindler, Stephan Mühlbacher-Karrer, Horst Pichler
  • Active Learning of Discrete-Time Dynamics for Uncertainty-Aware Model Predictive Control (2023). Alessandro Saviolo, Jonathan Frey, Abhishek Rathod, Moritz Diehl, Giuseppe Loianno