
Merge branch 'master' into fix-numti-env-delayed-spawn

/MLA-1734-demo-provider
GitHub 4 years ago
Current commit
bd4bc66b
3 files changed, with 7 insertions and 5 deletions
  1. com.unity.ml-agents/CHANGELOG.md (1 line changed)
  2. ml-agents/mlagents/trainers/tests/torch/test_hybrid.py (9 lines changed)
  3. ml-agents/mlagents/trainers/torch/distributions.py (2 lines changed)

com.unity.ml-agents/CHANGELOG.md (1 line changed)


 #### ml-agents / ml-agents-envs / gym-unity (Python)
 - Fixed a bug that would cause an exception when `RunOptions` was deserialized via `pickle`. (#4842)
 - Fixed a bug that can cause a crash if a behavior can appear during training in multi-environment training. (#4872)
+- Fixed the computation of entropy for continuous actions. (#4869)
 ## [1.7.2-preview] - 2020-12-22

ml-agents/mlagents/trainers/tests/torch/test_hybrid.py (9 lines changed)


     env = SimpleEnvironment([BRAIN_NAME], action_sizes=action_size, step_size=0.8)
     new_network_settings = attr.evolve(PPO_TORCH_CONFIG.network_settings)
     new_hyperparams = attr.evolve(
-        PPO_TORCH_CONFIG.hyperparameters, batch_size=64, buffer_size=1024
+        PPO_TORCH_CONFIG.hyperparameters,
+        batch_size=64,
+        buffer_size=1024,
+        learning_rate=1e-3,
     )
     config = attr.evolve(
         PPO_TORCH_CONFIG,

     )
-    check_environment_trains(
-        env, {BRAIN_NAME: config}, success_threshold=0.9, training_seed=1212
-    )
+    check_environment_trains(env, {BRAIN_NAME: config}, success_threshold=0.9)
 @pytest.mark.parametrize("num_visual", [1, 2])

ml-agents/mlagents/trainers/torch/distributions.py (2 lines changed)


     def entropy(self):
         return torch.mean(
-            0.5 * torch.log(2 * math.pi * math.e * self.std + EPSILON),
+            0.5 * torch.log(2 * math.pi * math.e * self.std ** 2 + EPSILON),
             dim=1,
             keepdim=True,
         )  # Use equivalent behavior to TF
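
The change to `entropy` squares `self.std`. This matches the closed-form differential entropy of a Gaussian, 0.5 * log(2 * pi * e * sigma^2), which depends on the variance rather than the standard deviation. The following is a minimal, standard-library-only sketch that checks both variants against a numerical integral. The function names are hypothetical and not part of ml-agents, and the `EPSILON` term and the mean over action dimensions are left out for brevity.

```python
import math


def entropy_with_variance(std: float) -> float:
    """Closed-form entropy of N(mu, std^2) in nats (uses std ** 2, as in the fix)."""
    return 0.5 * math.log(2 * math.pi * math.e * std ** 2)


def entropy_without_square(std: float) -> float:
    """The pre-fix expression, which passes std where the variance belongs."""
    return 0.5 * math.log(2 * math.pi * math.e * std)


def numerical_entropy(std: float, n: int = 20_000, width: float = 12.0) -> float:
    """Approximate -integral p(x) log p(x) dx with a midpoint sum over [-width*std, width*std]."""
    lo = -width * std
    dx = (2 * width * std) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * dx
        # Log-density of a zero-mean Gaussian; the mean does not affect entropy.
        log_p = -0.5 * (x / std) ** 2 - math.log(std * math.sqrt(2 * math.pi))
        total -= math.exp(log_p) * log_p * dx
    return total


if __name__ == "__main__":
    for std in (0.5, 1.0, 2.0):
        print(
            std,
            round(numerical_entropy(std), 4),
            round(entropy_with_variance(std), 4),
            round(entropy_without_square(std), 4),
        )
```

The two closed-form expressions agree only at `std == 1`. For any other value they differ by 0.5 * log(std), so under these assumptions the pre-fix expression overstates the entropy when `std < 1` and understates it when `std > 1`.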
