[5-Minute Paper] Continuous Control With Deep Reinforcement Learning

  • Paper title: Continuous Control With Deep Reinforcement Learning

Title and author information

What problem does it solve?

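  This paper brings Deep Q-Learning into the Deterministic Policy Gradient (DPG) algorithm. If you already know DPG, the paper essentially uses ideas from DQN to improve how DPG's critic, the action-value function Q(s, a), is learned. It removes a key limitation of DQN: because DQN has to find the action that maximizes the action-value function, it can only be applied to discrete action spaces.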
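  As a rough illustration of why the maximization matters, the two learning targets can be written side by side. The notation below follows the usual DDPG convention (θ^Q and θ^μ for the online critic and actor, primes for the target networks); treat it as a sketch for orientation rather than a restatement of every detail in the paper.

```latex
% DQN-style target: needs a max over actions, which is only
% cheap to evaluate when the action set is discrete.
\[
  y_t^{\mathrm{DQN}} = r_t + \gamma \max_{a'} Q'(s_{t+1}, a' \mid \theta^{Q'})
\]

% DDPG target: the target actor \mu' proposes the action,
% so no maximization over a continuous action space is required.
\[
  y_t^{\mathrm{DDPG}} = r_t + \gamma \,
    Q'\bigl(s_{t+1}, \mu'(s_{t+1} \mid \theta^{\mu'}) \mid \theta^{Q'}\bigr)
\]

% Deterministic policy gradient used to update the actor.
\[
  \nabla_{\theta^{\mu}} J \approx
  \mathbb{E}_{s_t}\Bigl[
    \nabla_a Q(s, a \mid \theta^{Q})\big|_{s=s_t,\, a=\mu(s_t)}
    \, \nabla_{\theta^{\mu}} \mu(s \mid \theta^{\mu})\big|_{s=s_t}
  \Bigr]
\]
```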

Background

  It is essentially a combination of two papers, the one behind DQN and the one behind DPG.

What method is used?

  I know DDPG too well to want to write much more about it, so here is the pseudocode:

DDPG algorithm
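  For readers who prefer code to pseudocode, one DDPG update can be sketched in plain Python. This is a minimal toy under loud assumptions, not the paper's implementation: the actor and critic are hypothetical scalar linear models (`actor`, `critic`) so that the gradients can be written by hand, plain gradient steps stand in for the optimizer, and the replay data is random. The discount factor, soft-update rate, learning rates, and minibatch size of 64 follow the values reported in the paper.

```python
import random
from collections import deque

GAMMA = 0.99  # discount factor
TAU = 0.001   # soft target-update rate


def soft_update(target, online, tau=TAU):
    """Move target parameters a small step toward the online parameters."""
    return [tau * w + (1.0 - tau) * w_t for w, w_t in zip(online, target)]


def actor(k, s):
    """Hypothetical toy deterministic policy: mu(s) = k[0] * s."""
    return k[0] * s


def critic(w, s, a):
    """Hypothetical toy critic: Q(s, a) = w[0]*s + w[1]*a + w[2]."""
    return w[0] * s + w[1] * a + w[2]


def ddpg_step(batch, k, w, k_targ, w_targ, lr_actor=1e-4, lr_critic=1e-3):
    """One DDPG update on a minibatch of (s, a, r, s2) transitions."""
    n = len(batch)

    # Critic: minimize the mean of (Q(s, a) - y)^2 with
    # y = r + gamma * Q'(s2, mu'(s2)), computed from the target networks.
    grad_w = [0.0, 0.0, 0.0]
    for s, a, r, s2 in batch:
        y = r + GAMMA * critic(w_targ, s2, actor(k_targ, s2))
        err = critic(w, s, a) - y
        for i, x in enumerate((s, a, 1.0)):
            grad_w[i] += 2.0 * err * x / n
    new_w = [wi - lr_critic * g for wi, g in zip(w, grad_w)]

    # Actor: ascend dQ/da (at a = mu(s)) times dmu/dk.
    # For the toy models, dQ/da = w[1] and dmu/dk = s.
    grad_k = sum(new_w[1] * s for s, _, _, _ in batch) / n
    new_k = [k[0] + lr_actor * grad_k]

    # Target networks slowly track the online networks.
    return (new_k, new_w,
            soft_update(k_targ, new_k), soft_update(w_targ, new_w))


# Toy usage: fill a replay buffer with made-up transitions and take one step.
rng = random.Random(0)
buffer = deque(maxlen=10_000)
for _ in range(256):
    s, a = rng.uniform(-1, 1), rng.uniform(-1, 1)
    buffer.append((s, a, -(a - s) ** 2, rng.uniform(-1, 1)))

k, w = [0.0], [0.0, 0.0, 0.0]
k_targ, w_targ = list(k), list(w)
batch = rng.sample(list(buffer), 64)
k, w, k_targ, w_targ = ddpg_step(batch, k, w, k_targ, w_targ)
```

  The sketch leaves out exploration noise and terminal-state handling. A real implementation would use neural networks with automatic differentiation in place of the hand-written gradients.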

What results were achieved?

  The experimental results are shown in the figure below:

Experimental results figure

Publication information? Author information?

  This paper was published at ICLR 2016. The first author is Timothy P. Lillicrap, a Research Scientist at Google DeepMind.

  Research focuses on machine learning and statistics for optimal control and decision making, as well as using these mathematical frameworks to understand how the brain learns. In recent work, I've developed new algorithms and approaches for exploiting deep neural networks in the context of reinforcement learning, and new recurrent memory architectures for one-shot learning. Applications of this work include approaches for recognizing images from a single example, visual question answering, deep learning for robotics problems, and playing games such as Go and StarCraft. I'm also fascinated by the development of deep network models that might shed light on how robust feedback control laws are learned and employed by the central nervous system.

Author photo

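My WeChat official account: 深度学习与先进智能决策 (Deep Learning and Advanced Intelligent Decision-Making)
WeChat official account ID: MultiAgent1024
About the account: it mainly shares research on deep learning, machine game playing, reinforcement learning, and related topics. Follow along, and feel free to join in to learn and exchange ideas.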