
Documentation change on gridworld

/goal-conditioning/grid-world
vincentpierre, 3 years ago
Current commit
5a4137a8
2 files changed, 225 insertions and 414 deletions
  1. 16
      docs/Learning-Environment-Examples.md
  2. 623
      docs/images/gridworld.png

16
docs/Learning-Environment-Examples.md


![GridWorld](images/gridworld.png)
- - Set-up: A version of the classic grid-world task. Scene contains agent, goal,
+ - Set-up: A version of the grid-world task. Scene contains agent, goal,
- - Goal: The agent must navigate the grid to the goal while avoiding the
-   obstacles.
+ - Goal: The agent must navigate the grid to the appropriate goal while
+   avoiding the obstacles.
- - +1.0 if the agent navigates to the goal position of the grid (episode ends).
- - -1.0 if the agent navigates to an obstacle (episode ends).
+ - +1.0 if the agent navigates to the correct goal (episode ends).
+ - -1.0 if the agent navigates to an incorrect goal (episode ends).
- Behavior Parameters:
- Vector Observation space: None
- Actions: 1 discrete action branch with 5 actions, corresponding to movement in

checkbox within the `trueAgent` GameObject). The trained model file provided
was generated with action masking turned on.
- Visual Observations: One corresponding to top-down view of GridWorld.
- - Float Properties: Three, corresponding to grid size, number of obstacles, and
-   number of goals.
+ - Goal Signal : A one hot vector corresponding to which color is the correct goal
+   for the Agent
+ - Float Properties: Three, corresponding to grid size, number of green goals, and
+   number of red goals.
- Benchmark Mean Reward: 0.8
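
As a rough illustration of the goal signal and terminal rewards described in this change, here is a minimal Python sketch. It is not ML-Agents code: the `GoalColor` enum, its index order, and both function names are hypothetical, and the two colors are assumed from the green and red goal counts listed under Float Properties.

```python
from enum import Enum
from typing import List, Tuple


class GoalColor(Enum):
    # Hypothetical index order; the real environment may order colors differently.
    GREEN = 0
    RED = 1


def one_hot_goal_signal(goal: GoalColor) -> List[float]:
    """Encode the correct goal color as a one-hot vector."""
    signal = [0.0] * len(GoalColor)
    signal[goal.value] = 1.0
    return signal


def terminal_reward(reached: GoalColor, correct: GoalColor) -> Tuple[float, bool]:
    """Return (reward, episode_done) when the agent steps onto a goal cell.

    +1.0 for the correct goal, -1.0 for an incorrect one; either way the
    episode ends, matching the reward bullets in the documentation.
    """
    reward = 1.0 if reached is correct else -1.0
    return reward, True


if __name__ == "__main__":
    correct = GoalColor.GREEN
    print(one_hot_goal_signal(correct))
    print(terminal_reward(GoalColor.RED, correct))
```

The sketch keeps the goal signal separate from the reward logic so that the same observation encoding could be reused if more goal colors were added; that separation is a design assumption made here, not something stated in the commit.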
## Push Block

623
docs/images/gridworld.png

Width: 934  |  Height: 594  |  Size: 71 KiB