
* Add benchmark thresholds for example environments

Arthur Juliani 7 years ago
2 files changed, with 19 insertions and 8 deletions


* Vector Action space: (Discrete) Two possible actions (Move left, move right).
* Visual Observations: 0
* Reset Parameters: None
* Benchmark Mean Reward: 0.94
## [3DBall: 3D Balance Ball](https://youtu.be/dheeCO29-EI)

* Vector Action space: (Continuous) Size of 2, with one value corresponding to X-rotation, and the other to Z-rotation.
* Visual Observations: 0
* Reset Parameters: None
* Benchmark Mean Reward: 100
## [GridWorld](https://youtu.be/gu8HE9WKEVI)

* Vector Action space: (Discrete) Size of 4, corresponding to movement in cardinal directions.
* Visual Observations: One corresponding to top-down view of GridWorld.
* Reset Parameters: Three, corresponding to grid size, number of obstacles, and number of goals.
* Benchmark Mean Reward: 0.8
## [Tennis](https://youtu.be/RDaIh7JX6RI)

* Vector Action space: (Continuous) Size of 2, corresponding to movement toward net or away from net, and jumping.
* Visual Observations: None
* Reset Parameters: One, corresponding to size of ball.
* Benchmark Mean Reward: 2.5
## [Push Block](https://youtu.be/jKdw216ZgoE)

* Vector Action space: (Continuous) Size of 2, corresponding to movement in X and Z directions.
* Visual Observations: None.
* Reset Parameters: None.
* Benchmark Mean Reward: 4.5
## [Wall Jump](https://youtu.be/NITLug2DIWQ)

* Vector Action space: (Discrete) Size of 74, corresponding to 14 raycasts each detecting 4 possible objects, plus the global position of the agent and whether or not the agent is grounded.
* Visual Observations: None.
* Reset Parameters: 4, corresponding to the height of the possible walls.
* Benchmark Mean Reward (Big & Small Wall Brain): 0.8
## [Reacher](https://youtu.be/2N9EoF6pQyE)

* Goal: Each agent must move its hand to the goal location and keep it there.
-* Agents: The environment contains 32 agents linked to a single brain.
+* Agents: The environment contains 10 agents linked to a single brain.
* Agent Reward Function (independent):
* +0.1 Each step agent's hand is in goal location.
* Brains: One brain with the following observation/action space.

* Reset Parameters: Two, corresponding to goal size, and goal movement speed.
* Benchmark Mean Reward: 30
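
The Reacher reward and benchmark above can be related with a little arithmetic. This is only a sketch, assuming the benchmark is the mean cumulative reward per agent over an episode and that the +0.1 per-step reward is the only reward term; the variable names are illustrative.

```python
# Values taken from the Reacher entry above.
REWARD_PER_STEP_IN_GOAL = 0.1   # +0.1 for each step the hand is in the goal
REACHER_BENCHMARK = 30          # benchmark mean reward

# Assumption: the benchmark is a per-episode cumulative reward, so dividing
# by the per-step reward gives the number of steps spent in the goal.
steps_in_goal = round(REACHER_BENCHMARK / REWARD_PER_STEP_IN_GOAL)
print(steps_in_goal)
```

Under those assumptions, an agent at the benchmark keeps its hand in the goal for roughly 300 steps per episode.
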
## [Crawler](https://youtu.be/ftLliaeooYI)

* Vector Action space: (Continuous) Size of 12, corresponding to torque applicable to 12 joints.
* Visual Observations: None
* Reset Parameters: None
* Benchmark Mean Reward: 2000
## [Banana Collector](https://youtu.be/heVMs3t9qSk)

* Vector Action space: (Continuous) Size of 3, corresponding to forward movement, y-axis rotation, and whether to use laser to disable other agents.
* Visual Observations (Optional; None by default): First-person view for each agent.
* Reset Parameters: None
* Benchmark Mean Reward: 10
## [Hallway](https://youtu.be/53GyfpPQRUQ)

* Vector Action space: (Discrete) 4 corresponding to agent rotation and forward/backward movement.
* Visual Observations (Optional): First-person view for the agent.
* Reset Parameters: None
* Benchmark Mean Reward: 0.7
## [Bouncer](https://youtu.be/Tkv-c-b1b2I)

* Vector Action space: (Continuous) 3 corresponding to agent force applied for the jump.
* Visual Observations: None
* Reset Parameters: None
* Benchmark Mean Reward: 2.5
## [Soccer Twos](https://youtu.be/Hg3nmYD3DjQ)

* Goalie: 4 corresponding to forward, backward, sideways movement.
* Visual Observations: None
* Reset Parameters: None
* Benchmark Mean Reward (Striker & Goalie Brain): 0 (the two means are inverses of each other and crisscross during training)
## Walker

* Vector Action space: (Continuous) Size of 39, corresponding to target rotations applicable to the joints.
* Visual Observations: None
* Reset Parameters: None
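
The benchmark values listed above can be collected into a lookup table for checking a training run. The following is a minimal sketch, assuming a run "reaches" a benchmark when its mean reward is at least the listed value. The dictionary and the `meets_benchmark` helper are illustrative names, not part of ML-Agents, and only environments whose name and benchmark both appear above are included.

```python
# Benchmark mean rewards copied from the environment entries above.
# BENCHMARK_MEAN_REWARD and meets_benchmark are hypothetical helpers.
BENCHMARK_MEAN_REWARD = {
    "3DBall": 100,
    "GridWorld": 0.8,
    "Tennis": 2.5,
    "Push Block": 4.5,
    "Wall Jump": 0.8,
    "Reacher": 30,
    "Crawler": 2000,
    "Banana Collector": 10,
    "Hallway": 0.7,
    "Bouncer": 2.5,
    "Soccer Twos": 0,
}


def meets_benchmark(environment: str, mean_reward: float) -> bool:
    """Return True if mean_reward is at least the listed benchmark.

    Assumption: "meeting" a benchmark means mean_reward >= benchmark.
    Raises KeyError for environments without a listed benchmark.
    """
    return mean_reward >= BENCHMARK_MEAN_REWARD[environment]


if __name__ == "__main__":
    print(meets_benchmark("3DBall", 100.0))
    print(meets_benchmark("Crawler", 1500.0))
```
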


accomplish the task. From there, we can slowly add to the difficulty of the task by
increasing the size of the wall, until the agent can complete the initially
near-impossible task of scaling the wall. We are including just such an environment with
-ML-Agents 0.2, called Wall Area.
+ML-Agents 0.2, called Wall Jump.

-"measure" : "reward",
-"thresholds" : [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
+"measure" : "progress",
+"thresholds" : [0.1, 0.3, 0.5],
-"min_wall_height" : [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5],
-"max_wall_height" : [1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0]
+"big_wall_min_height" : [0.0, 4.0, 6.0, 8.0],
+"big_wall_max_height" : [4.0, 7.0, 8.0, 8.0],
+"small_wall_height" : [1.5, 2.0, 2.5, 4.0]
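
To make the relationship between thresholds and wall heights concrete, here is a simplified sketch in Python using the `progress` thresholds and the `big_wall_*` / `small_wall_height` lists above. It assumes the measure is a number between 0 and 1 and that the lesson index is simply the count of thresholds already passed. It does not reproduce the actual ML-Agents curriculum code, and `THRESHOLDS`, `PARAMETERS`, `lesson_index`, and `lesson_parameters` are illustrative names.

```python
# Values copied from the curriculum fragment above.
THRESHOLDS = [0.1, 0.3, 0.5]
PARAMETERS = {
    "big_wall_min_height": [0.0, 4.0, 6.0, 8.0],
    "big_wall_max_height": [4.0, 7.0, 8.0, 8.0],
    "small_wall_height": [1.5, 2.0, 2.5, 4.0],
}

# N thresholds separate N + 1 lessons, so every parameter list
# needs exactly one more value than there are thresholds.
for name, values in PARAMETERS.items():
    assert len(values) == len(THRESHOLDS) + 1, name


def lesson_index(measure: float, thresholds: list) -> int:
    """Count how many thresholds the measure has passed (assumed rule)."""
    return sum(measure > t for t in thresholds)


def lesson_parameters(measure: float) -> dict:
    """Look up the reset parameters for the lesson selected by measure."""
    i = lesson_index(measure, THRESHOLDS)
    return {name: values[i] for name, values in PARAMETERS.items()}


if __name__ == "__main__":
    for measure in (0.0, 0.2, 0.35, 0.9):
        print(measure, lesson_parameters(measure))
```

With these values, a measure of 0.35 has passed two thresholds, so the third entry of each list is selected: a big wall between 6.0 and 8.0 and a small wall of height 2.5. The same length check shows why the earlier configuration paired 9 thresholds with 10 wall heights.
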
