ml-agents

Unity 机器学习代理工具包 (ML-Agents) 是一个开源项目，它使游戏和模拟能够作为训练智能代理的环境。

unity3d unity unity-tech reinforcement-le deep-learning deep-reinforcement-learning neural-networks

目录树: 7450bd0f

文件历史

GitHub 3b866e9f Use Clipped Gaussian (#649 ) This PR makes the following changes: * Moves clipping of continuous control model into model itself. Output is now always [-1, 1]. * Internal model values are now clipped between [-3, 3] before being rescaled to [-1, 1] for output. * This improves training performance by providing a wider range of values within which the pdf of the gaussian can fall. Output of [-1, 1] is used to be more environment-creator friendly. * Fixes issue where epsilon was erroneously being used to reconstruct old probabilities during PPO update, leading to reduced learning performance. * Introduce ScaleAction() function within python to easily rescale values from [-1, 1] to arbitrary range. * Re-train all CC models using improved algorithm. All performance levels are equal or improved. In the case of Crawler, improvement is drastic. * Update documentation appropriately. * Made miscellaneous minor code style and optimization improvements within environments.	7 年前
..
Banana.bytes	Use Clipped Gaussian (#649)	7 年前
Banana.bytes.meta	Use Clipped Gaussian (#649)	7 年前

GitHub 3b866e9f Use Clipped Gaussian (#649 )

This PR makes the following changes:

* Moves clipping of continuous control model into model itself. Output is now always [-1, 1].
* Internal model values are now clipped between [-3, 3] before being rescaled to [-1, 1] for output.  * This improves training performance by providing a wider range of values within which the pdf of the gaussian can fall. Output of [-1, 1] is used to be more environment-creator friendly.
* Fixes issue where epsilon was erroneously being used to reconstruct old probabilities during PPO update, leading to reduced learning performance.
* Introduce ScaleAction() function within python to easily rescale values from [-1, 1] to arbitrary range.
* Re-train all CC models using improved algorithm. All performance levels are equal or improved. In the case of Crawler, improvement is drastic.
* Update documentation appropriately.
* Made miscellaneous minor code style and optimization improvements within environments.

7 年前

Banana.bytes

Use Clipped Gaussian (#649)

7 年前

Banana.bytes.meta

Use Clipped Gaussian (#649)

7 年前