
Merge pull request #4872 from Unity-Technologies/fix-numti-env-delayed-spawn

[Bug Fix] Fix crash if spawn is delayed in multi-env
/MLA-1734-demo-provider
GitHub 4 years ago
Current commit
db4436e9
3 files changed, with 33 insertions and 2 deletions
  1. 1
      com.unity.ml-agents/CHANGELOG.md
  2. 7
      ml-agents/mlagents/trainers/subprocess_env_manager.py
  3. 27
      ml-agents/mlagents/trainers/tests/test_subprocess_env_manager.py

1
com.unity.ml-agents/CHANGELOG.md


 - Fix a compile warning about using an obsolete enum in `GrpcExtensions.cs`. (#4812)
 #### ml-agents / ml-agents-envs / gym-unity (Python)
 - Fixed a bug that would cause an exception when `RunOptions` was deserialized via `pickle`. (#4842)
+- Fixed a bug that can cause a crash if a behavior can appear during training in multi-environment training. (#4872)
 - Fixed the computation of entropy for continuous actions. (#4869)

7
ml-agents/mlagents/trainers/subprocess_env_manager.py


     @property
     def training_behaviors(self) -> Dict[BehaviorName, BehaviorSpec]:
-        self.env_workers[0].send(EnvironmentCommand.BEHAVIOR_SPECS)
-        return self.env_workers[0].recv().payload
+        result: Dict[BehaviorName, BehaviorSpec] = {}
+        for worker in self.env_workers:
+            worker.send(EnvironmentCommand.BEHAVIOR_SPECS)
+            result.update(worker.recv().payload)
+        return result
 
     def close(self) -> None:
         logger.debug("SubprocessEnvManager closing.")
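
Note on the hunk above: the sketch below illustrates why querying only the first worker can miss a behavior that another environment has already spawned. It is a minimal illustration, not the ml-agents implementation. `FakeWorker`, `first_worker_only`, `all_workers`, and the behavior names are hypothetical stand-ins, and behavior specs are reduced to plain strings.

```python
from typing import Dict, List


class FakeWorker:
    """Hypothetical stand-in for an environment worker (not the ml-agents class)."""

    def __init__(self, specs: Dict[str, str]) -> None:
        self._specs = specs

    def behavior_specs(self) -> Dict[str, str]:
        # Return a copy so callers cannot mutate the worker's own state.
        return dict(self._specs)


def first_worker_only(workers: List[FakeWorker]) -> Dict[str, str]:
    # Mirrors the removed lines: only the first worker is asked.
    return workers[0].behavior_specs()


def all_workers(workers: List[FakeWorker]) -> Dict[str, str]:
    # Mirrors the added lines: merge the specs reported by every worker.
    result: Dict[str, str] = {}
    for worker in workers:
        result.update(worker.behavior_specs())
    return result


# Assumed scenario: worker 0 has not spawned "Late" yet, worker 1 has.
workers = [
    FakeWorker({"Early": "spec"}),
    FakeWorker({"Early": "spec", "Late": "spec"}),
]
old_view = first_worker_only(workers)
new_view = all_workers(workers)
```

Under these assumptions `old_view` lacks the `"Late"` key while `new_view` contains both behaviors. Because the merge uses `dict.update`, a behavior name reported by several workers keeps the value from the last worker that reported it.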

27
ml-agents/mlagents/trainers/tests/test_subprocess_env_manager.py


     @mock.patch(
         "mlagents.trainers.subprocess_env_manager.SubprocessEnvManager.create_worker"
     )
+    def test_training_behaviors_collects_results_from_all_envs(
+        self, mock_create_worker
+    ):
+        def create_worker_mock(worker_id, step_queue, env_factor, engine_c):
+            return MockEnvWorker(
+                worker_id,
+                EnvironmentResponse(
+                    EnvironmentCommand.RESET, worker_id, {f"key{worker_id}": worker_id}
+                ),
+            )
+
+        mock_create_worker.side_effect = create_worker_mock
+        manager = SubprocessEnvManager(
+            mock_env_factory, EngineConfig.default_config(), 4
+        )
+
+        res = manager.training_behaviors
+        for env in manager.env_workers:
+            env.send.assert_called_with(EnvironmentCommand.BEHAVIOR_SPECS)
+            env.recv.assert_called()
+        for worker_id in range(4):
+            assert f"key{worker_id}" in res
+            assert res[f"key{worker_id}"] == worker_id
+
+    @mock.patch(
+        "mlagents.trainers.subprocess_env_manager.SubprocessEnvManager.create_worker"
+    )
     def test_step_takes_steps_for_all_non_waiting_envs(self, mock_create_worker):
         mock_create_worker.side_effect = create_worker_mock
         manager = SubprocessEnvManager(
