update benchmarks based on new models

/hh-develop-loco-walker-variable-speed
GitHub, 4 years ago
Current commit
59717d0e
1 file changed, with 3 insertions and 3 deletions
  1. docs/Learning-Environment-Examples.md (6 changes)



- Body velocity matches goal velocity. (normalized between (0,1))
- Head direction alignment with goal direction. (normalized between (0,1))
- Behavior Parameters:
- - Vector Observation space: 238 variables corresponding to position, rotation,
+ - Vector Observation space: 243 variables corresponding to position, rotation,
velocity, and angular velocities of each limb, along with goal direction.
- Vector Action space: (Continuous) Size of 39, corresponding to target
rotations and strength applicable to the joints.

- Recommended Minimum: 3
- Recommended Maximum: 20
- Benchmark Mean Reward for `WalkerDynamic`: 2500
- - Benchmark Mean Reward for `WalkerDynamicVariableSpeed`: 1200
+ - Benchmark Mean Reward for `WalkerDynamicVariableSpeed`: 2500
- - Benchmark Mean Reward for `WalkerStaticVariableSpeed`: 3000
+ - Benchmark Mean Reward for `WalkerStaticVariableSpeed`: 3500
