Modification of reward signals and rl_trainer for SAC (#2433)
* Adds evaluate_batch to reward signals. Evaluates on minibatch rather than on BrainInfo.
* Changes the way reward signal results are reported in rl_trainer so that we get the pure, unprocessed environment reward separate from the reward signals.
* Moves end_episode to rl_trainer
* Fixed bug with BCModule with RNN
/develop-gpu-test
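The first two points of the commit message can be illustrated with a minimal, self-contained sketch. It is not the ml-agents implementation: `ToyExtrinsicSignal`, `collect_rewards`, the `"environment_rewards"` key, and the fields of `RewardSignalResult` as written here are all assumptions made for illustration. It only shows the general idea of evaluating a reward signal on a minibatch and reporting the raw environment reward separately from the signal's output.

```python
from typing import Dict, List, NamedTuple


class RewardSignalResult(NamedTuple):
    # Hypothetical result container for this sketch.
    scaled_reward: List[float]
    unscaled_reward: List[float]


class ToyExtrinsicSignal:
    """Hypothetical stand-in for a reward signal; not the ml-agents class."""

    def __init__(self, strength: float) -> None:
        self.strength = strength

    def evaluate_batch(self, mini_batch: Dict[str, List[float]]) -> RewardSignalResult:
        # Read rewards straight from the minibatch instead of a BrainInfo object.
        unscaled = list(mini_batch["environment_rewards"])
        scaled = [self.strength * r for r in unscaled]
        return RewardSignalResult(scaled, unscaled)


def collect_rewards(
    signals: Dict[str, ToyExtrinsicSignal],
    mini_batch: Dict[str, List[float]],
) -> Dict[str, List[float]]:
    """Report the raw environment reward separately from each signal's output."""
    report = {"environment": list(mini_batch["environment_rewards"])}
    for name, signal in signals.items():
        report[name] = signal.evaluate_batch(mini_batch).scaled_reward
    return report
```

Calling `collect_rewards({"extrinsic": ToyExtrinsicSignal(0.5)}, batch)` on a batch of rewards returns one entry for the untouched environment reward and one entry per signal, so the two kinds of values are never mixed.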
GitHub
5 years ago
Current commit
689765d6
11 files changed, 215 insertions and 118 deletions
2 ml-agents/mlagents/trainers/components/bc/module.py
60 ml-agents/mlagents/trainers/components/reward_signals/curiosity/signal.py
9 ml-agents/mlagents/trainers/components/reward_signals/extrinsic/signal.py
2 ml-agents/mlagents/trainers/components/reward_signals/gail/model.py
57 ml-agents/mlagents/trainers/components/reward_signals/gail/signal.py
15 ml-agents/mlagents/trainers/components/reward_signals/reward_signal.py
22 ml-agents/mlagents/trainers/ppo/trainer.py
53 ml-agents/mlagents/trainers/rl_trainer.py
57 ml-agents/mlagents/trainers/tests/mock_brain.py
47 ml-agents/mlagents/trainers/tests/test_reward_signals.py
9 ml-agents/mlagents/trainers/tests/test_rl_trainer.py