[bug-fix] Empty ignored trajectory queues, make sure queues don't overflow (#3451)

/asymm-envs
GitHub 5 years ago
Current commit
6876a1d6
3 files changed, 22 insertions and 5 deletions
  1. com.unity.ml-agents/CHANGELOG.md (2 changes)
  2. ml-agents/mlagents/trainers/ghost/trainer.py (23 changes)
  3. ml-agents/mlagents/trainers/tests/test_ghost.py (2 changes)

2
com.unity.ml-agents/CHANGELOG.md


- Update Barracuda to 0.6.0-preview
### Bugfixes
- Fixed an issue which caused self-play training sessions to consume a lot of memory. (#3451)
## [0.14.0-preview] - 2020-02-13

23
ml-agents/mlagents/trainers/ghost/trainer.py


         self.internal_policy_queues: List[AgentManagerQueue[Policy]] = []
         self.internal_trajectory_queues: List[AgentManagerQueue[Trajectory]] = []
+        self.ignored_trajectory_queues: List[AgentManagerQueue[Trajectory]] = []
         self.learning_policy_queues: Dict[str, AgentManagerQueue[Policy]] = {}
         # assign ghost's stats collection to wrapped trainer's

             self.trajectory_queues, self.internal_trajectory_queues
         ):
             try:
-                t = traj_queue.get_nowait()
-                # adds to wrapped trainers queue
-                internal_traj_queue.put(t)
-                self._process_trajectory(t)
+                # We grab at most the maximum length of the queue.
+                # This ensures that even if the queue is being filled faster than it is
+                # being emptied, the trajectories in the queue are on-policy.
+                for _ in range(traj_queue.maxlen):
+                    t = traj_queue.get_nowait()
+                    # adds to wrapped trainers queue
+                    internal_traj_queue.put(t)
+                    self._process_trajectory(t)
             except AgentManagerQueue.Empty:
                 pass
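
The bounded drain in this hunk can be tried in isolation with the standard library. The following is a minimal sketch, not the ml-agents implementation: it assumes `queue.Queue` (with its `maxsize` attribute) as a stand-in for `AgentManagerQueue` and its `maxlen`, and `drain_bounded`, `handle`, and the demo variables are hypothetical names introduced for illustration.

```python
import queue
from typing import Callable, List, TypeVar

T = TypeVar("T")


def drain_bounded(q: "queue.Queue[T]", handle: Callable[[T], None]) -> int:
    """Pop at most q.maxsize items without blocking and pass each to handle.

    Assumes q was created with maxsize > 0. Capping the loop at the queue's
    capacity means one call cannot spin forever, even if a producer keeps
    refilling the queue while it is being drained.
    """
    handled = 0
    try:
        for _ in range(q.maxsize):
            item = q.get_nowait()  # raises queue.Empty once the queue is drained
            handle(item)
            handled += 1
    except queue.Empty:
        pass
    return handled


# Demo: fill a small queue, then drain it in one call.
demo_queue: "queue.Queue[int]" = queue.Queue(maxsize=3)
for i in range(3):
    demo_queue.put_nowait(i)

seen: List[int] = []
handled_count = drain_bounded(demo_queue, seen.append)
```

Compared with the single `get_nowait()` call that this hunk replaces, the loop removes up to a full queue's worth of items per call, so a consumer that is invoked less often than the producer writes does not fall steadily behind.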

         if self.get_step - self.last_swap > self.steps_between_swap:
             self._swap_snapshots()
             self.last_swap = self.get_step
+        # Dump trajectories from non-learning policy
+        for traj_queue in self.ignored_trajectory_queues:
+            try:
+                for _ in range(traj_queue.maxlen):
+                    traj_queue.get_nowait()
+            except AgentManagerQueue.Empty:
+                pass
     def end_episode(self):
         self.trainer.end_episode()

             self.internal_trajectory_queues.append(internal_trajectory_queue)
             self.trainer.subscribe_trajectory_queue(internal_trajectory_queue)
+        else:
+            self.ignored_trajectory_queues.append(trajectory_queue)
 # Taken from https://github.com/Unity-Technologies/ml-agents/pull/1975 and
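
The subscribe-then-discard pattern added to this file can be sketched on its own. This is an illustrative sketch under stated assumptions, not the trainer's code: the condition that selects the learning queue is not visible in the diff, so the behavior-id comparison below is an assumption, `queue.Queue` stands in for `AgentManagerQueue`, and `QueueRouter`, `subscribe`, and `dump_ignored` are hypothetical names.

```python
import queue
from typing import List


class QueueRouter:
    """Keeps queues for the learning behavior and remembers the rest so
    they can be emptied instead of growing without bound."""

    def __init__(self, learning_id: str) -> None:
        self.learning_id = learning_id
        self.learning_queues: List[queue.Queue] = []
        self.ignored_queues: List[queue.Queue] = []

    def subscribe(self, behavior_id: str, q: queue.Queue) -> None:
        # Assumed routing rule: only the learning behavior's queue is consumed.
        if behavior_id == self.learning_id:
            self.learning_queues.append(q)
        else:
            self.ignored_queues.append(q)

    def dump_ignored(self) -> int:
        """Discard everything currently held in the ignored queues."""
        dropped = 0
        for q in self.ignored_queues:
            try:
                for _ in range(q.maxsize):  # assumes maxsize > 0
                    q.get_nowait()
                    dropped += 1
            except queue.Empty:
                pass
        return dropped


# Demo: one learning queue and one ignored queue, two items each.
router = QueueRouter("learner")
learner_q: queue.Queue = queue.Queue(maxsize=4)
ghost_q: queue.Queue = queue.Queue(maxsize=4)
router.subscribe("learner", learner_q)
router.subscribe("ghost", ghost_q)
for item in ("a", "b"):
    learner_q.put_nowait(item)
    ghost_q.put_nowait(item)

dropped_count = router.dump_ignored()
```

Tracking the ignored queues explicitly is what makes them drainable: a queue that nobody holds a reference to on the consumer side keeps every item its producer writes, which matches the memory growth described in the CHANGELOG entry.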

2
ml-agents/mlagents/trainers/tests/test_ghost.py


     # Check that ghost trainer ignored off policy queue
     assert trainer.trainer.update_buffer.num_experiences == 15
+    # Check that it emptied the queue
+    assert trajectory_queue1.empty()
 def test_publish_queue(dummy_config):
