
update document (#4237)

small fix to documentation formatting
/MLA-1734-demo-provider
GitHub, 4 years ago
Current commit: d302ddb9
3 files changed, 5 insertions and 5 deletions
  1. docs/Learning-Environment-Create-New.md (4 changed lines)
  2. docs/ML-Agents-Overview.md (2 changed lines)
  3. gym-unity/README.md (4 changed lines)

docs/Learning-Environment-Create-New.md (4 changed lines)


1. Right click in Hierarchy window, select 3D Object > Cube.
1. Name the GameObject "Target"
1. Select the Target Cube to view its properties in the Inspector window.
- 1. Set Transform to Position = `3, 0.5, 3)`, Rotation = `(0, 0, 0)`, Scale =
+ 1. Set Transform to Position = `(3, 0.5, 3)`, Rotation = `(0, 0, 0)`, Scale =
`(1, 1, 1)`.

1. In the Unity Project window, double-click the `RollerAgent` script to open it
in your code editor.
1. In the editor, add the `using Unity.MLAgents;` and
- `using Unity.MLAgents.Sensors` statements and then change the base class from
+ `using Unity.MLAgents.Sensors;` statements and then change the base class from
`MonoBehaviour` to `Agent`.
1. Delete the `Update()` method, but we will use the `Start()` function, so
leave it alone for now.

docs/ML-Agents-Overview.md (2 changed lines)


In reinforcement learning, the end goal for the Agent is to discover a behavior
(a Policy) that maximizes a reward. You will need to provide the agent one or
- more reward signals to use during training.Typically, a reward is defined by
+ more reward signals to use during training. Typically, a reward is defined by
your environment, and corresponds to reaching some goal. These are what we refer
to as _extrinsic_ rewards, as they are defined external of the learning
algorithm.
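
To make the idea concrete, here is a minimal, purely illustrative sketch (not part of the ML-Agents API or of this change; it assumes a generic gym-style `env` and a `policy` callable) of the cumulative extrinsic reward that the Agent's Policy is trained to maximize:

```python
# Illustrative only: a generic gym-style episode loop. The reward comes back
# from the environment itself, which is what makes it an _extrinsic_ reward.
def episode_return(env, policy):
    obs = env.reset()
    total_reward, done = 0.0, False
    while not done:
        action = policy(obs)                         # the behavior being learned
        obs, reward, done, info = env.step(action)   # reward defined by the environment
        total_reward += reward                       # return the Agent tries to maximize
    return total_reward
```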

gym-unity/README.md (4 changed lines)


```python
from gym_unity.envs import UnityToGymWrapper
- env = UnityToGymWrapper(unity_environment, uint8_visual, allow_multiple_obs)
+ env = UnityToGymWrapper(unity_environment, uint8_visual, flatten_branched, allow_multiple_obs)
```
- `unity_environment` refers to the Unity environment to be wrapped.

- `allow_multiple_obs` will return a list of observations. The first elements contain the visual observations and the
last element contains the array of vector observations. If False the environment returns a single array (containing
- a single visual observations, if present, otherwise the vector observation)
+ a single visual observations, if present, otherwise the vector observation). Defaults to `False`.
The returned environment `env` will function as a gym.
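
As a rough usage sketch (the executable path is a placeholder and the particular argument values are illustrative, not part of the change above), the wrapper can be combined with `UnityEnvironment` along these lines:

```python
from mlagents_envs.environment import UnityEnvironment
from gym_unity.envs import UnityToGymWrapper

# Launch a built Unity environment (replace the path with your own build).
unity_environment = UnityEnvironment(file_name="./builds/RollerBall")

# Wrap it so it behaves like a regular gym environment.
# With allow_multiple_obs=True, reset()/step() return a list of observations:
# visual observations first, the vector observation last.
env = UnityToGymWrapper(unity_environment, uint8_visual=False,
                        flatten_branched=False, allow_multiple_obs=True)

obs = env.reset()                                       # list of observations
obs, reward, done, info = env.step(env.action_space.sample())
env.close()
```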
