Reinforcement Learning_Code_Simplest Actor-Critic

2023-04-12 21:59 Author: 別叫我小紅

The following results and code implement the simplest actor-critic algorithm in Gymnasium's Cart Pole environment. More actor-critic algorithms will be added as I work through the OpenAI Spinning Up tutorial.

RESULTS:

The simplest actor-critic algorithm takes many steps to converge, which may be caused by the large variance of the sampled policy gradient. Subtracting a baseline when updating the policy, the trick used in A2C, may alleviate this.
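To make the baseline remark concrete, here is a minimal sketch in plain Python. It is not taken from the code of this post: the two-action softmax policy, the action values `q_values`, and the helper `grad_estimator_stats` are all hypothetical, chosen only for illustration. It computes the exact mean and variance of the single-sample gradient estimate for one logit, with and without the state value as a baseline.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def grad_estimator_stats(logits, q_values, baseline):
    """Exact mean and variance, under a ~ pi, of the estimate
    d/d logits[0] log pi(a) * (q(a) - baseline)."""
    pi = softmax(logits)
    weighted = []
    for a, (p, q) in enumerate(zip(pi, q_values)):
        # Softmax score function for logit 0: 1[a == 0] - pi[0]
        score = (1.0 if a == 0 else 0.0) - pi[0]
        weighted.append((p, score * (q - baseline)))
    mean = sum(p * g for p, g in weighted)
    var = sum(p * (g - mean) ** 2 for p, g in weighted)
    return mean, var


# Hypothetical numbers: both actions are good, one slightly better.
logits = [0.0, 0.0]
q_values = [10.0, 11.0]
state_value = sum(p * q for p, q in zip(softmax(logits), q_values))

mean_plain, var_plain = grad_estimator_stats(logits, q_values, 0.0)
mean_base, var_base = grad_estimator_stats(logits, q_values, state_value)

print("no baseline:  mean", mean_plain, "variance", var_plain)
print("V(s) baseline: mean", mean_base, "variance", var_base)
```

In this toy case the expected gradient is the same with and without the baseline, while the variance of the estimate is much smaller once the state value is subtracted. With more actions or a less favourable baseline the variance would not necessarily drop to zero, but the expectation would still be unchanged.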

Visualizations of (i) changes in score and value approximation loss, and (ii) animation results.

Fig. 1. Changes in score and value approximation loss.

Fig. 2. Animation result, which achieved a score of 357 points.

CODE:

NetWork.py

QACAgent.py

train_and_test.py

The above code is mainly based on Lecture 7 of David Silver's course [1], Chapter 10 of Shiyu Zhao's Mathematical Foundation of Reinforcement Learning [2], and Chapter 10 of Hands-on Reinforcement Learning [3].
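As a rough illustration of the kind of update a simplest actor-critic (QAC) agent performs, the following is a minimal tabular sketch in plain Python. It is not the content of QACAgent.py: the function `qac_update`, the table layout, the step sizes, and the order of the actor and critic updates are all assumptions made for this example, and the Cart Pole implementation would use neural networks rather than tables.

```python
import math


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(x - m) for x in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def qac_update(theta, q, s, a, r, s_next, a_next, done,
               alpha_actor=0.1, alpha_critic=0.1, gamma=0.9):
    """One QAC step on tabular parameters, updated in place.

    theta[s][a]: action preferences of the softmax policy (actor).
    q[s][a]:     action-value estimates (critic).
    """
    pi = softmax(theta[s])
    q_sa = q[s][a]

    # Actor: theta <- theta + alpha * grad log pi(a|s) * q(s, a)
    for b in range(len(theta[s])):
        grad_log_pi = (1.0 if b == a else 0.0) - pi[b]
        theta[s][b] += alpha_actor * grad_log_pi * q_sa

    # Critic: SARSA-style TD update toward r + gamma * q(s', a')
    target = r if done else r + gamma * q[s_next][a_next]
    q[s][a] += alpha_critic * (target - q_sa)


# Hypothetical two-state, two-action example.
theta = [[0.0, 0.0], [0.0, 0.0]]
q = [[1.0, 0.0], [0.0, 2.0]]
qac_update(theta, q, s=0, a=0, r=1.0, s_next=1, a_next=1, done=False)
print(theta[0], q[0][0])
```

Because the actor step is weighted by the raw action value q(s, a) rather than an advantage, every sampled action with a positive value is reinforced, which is one way to see why this variant can be noisy.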

References

[1] https://www.davidsilver.uk/teaching/

[2] https://github.com/MathFoundationRL/Book-Mathmatical-Foundation-of-Reinforcement-Learning

[3] https://hrl.boyuai.com/

Tags: Reinforcement Learning
