Merge commit 'fbcdd83c087135f870e785cc72e5ff9a7e898e3a' into develop-splitpolicyoptimizer

/develop/nopreviousactions
Ervin Teng, 5 years ago
Current commit: 1859f252
6 files changed, with 25 insertions and 23 deletions
  1. Project/Assets/ML-Agents/Examples/Hallway/Scripts/HallwayAgent.cs (2 changes)
  2. com.unity.ml-agents/CHANGELOG.md (2 changes)
  3. com.unity.ml-agents/Runtime/Agent.cs (7 changes)
  4. com.unity.ml-agents/Tests/Editor/MLAgentsEditModeTest.cs (8 changes)
  5. config/gail_config.yaml (27 changes)
  6. docs/Migrating.md (2 changes)

Project/Assets/ML-Agents/Examples/Hallway/Scripts/HallwayAgent.cs (2 changes)


{
if (useVectorObs)
{
- sensor.AddObservation(GetStepCount() / (float)maxStep);
+ sensor.AddObservation(StepCount / (float)maxStep);
}
}

com.unity.ml-agents/CHANGELOG.md (2 changes)


- The stepping logic for the Agent and the Academy has been simplified (#3448)
- Update Barracuda to 0.6.0-preview
- The checkpoint file suffix was changed from `.cptk` to `.ckpt` (#3470)
+ - The method `GetStepCount()` on the Agent class has been replaced with the property getter `StepCount`
+ - Updated the `gail_config.yaml` to work with per-Agent steps (#3475)
## [0.14.0-preview] - 2020-02-13

com.unity.ml-agents/Runtime/Agent.cs (7 changes)


m_Brain = m_PolicyFactory.GeneratePolicy(Heuristic);
}
/// <summary>
- /// Current episode number.
+ /// Current step count.
- public int GetStepCount()
+ public int StepCount
- return m_StepCount;
+ get { return m_StepCount; }
}
/// <summary>

com.unity.ml-agents/Tests/Editor/MLAgentsEditModeTest.cs (8 changes)


Assert.AreEqual(i, aca.TotalStepCount);
- Assert.AreEqual(agent2StepSinceReset, agent2.GetStepCount());
+ Assert.AreEqual(agent2StepSinceReset, agent2.StepCount);
Assert.AreEqual(numberAgent1Reset, agent1.agentResetCalls);
Assert.AreEqual(numberAgent2Reset, agent2.agentResetCalls);

expectedAgentStepCount += 1;
// If the next step will put the agent at maxSteps, we expect it to reset
- if (agent1.GetStepCount() == maxStep - 1 || (i == 0))
+ if (agent1.StepCount == maxStep - 1 || (i == 0))
- if (agent1.GetStepCount() == maxStep - 1)
+ if (agent1.StepCount == maxStep - 1)
{
expectedAgentActionSinceReset = 0;
expectedCollectObsCallsSinceReset = 0;

- Assert.AreEqual(expectedAgentStepCount, agent1.GetStepCount());
+ Assert.AreEqual(expectedAgentStepCount, agent1.StepCount);
Assert.AreEqual(expectedResets, agent1.agentResetCalls);
Assert.AreEqual(expectedAgentAction, agent1.agentActionCalls);
Assert.AreEqual(expectedAgentActionSinceReset, agent1.agentActionCallsSinceLastReset);

config/gail_config.yaml (27 changes)


num_layers: 2
time_horizon: 64
sequence_length: 64
- summary_freq: 1000
+ summary_freq: 10000
use_recurrent: false
reward_signals:
extrinsic:

Pyramids:
- summary_freq: 2000
+ summary_freq: 30000
time_horizon: 128
batch_size: 128
buffer_size: 2048

- max_steps: 5.0e5
+ max_steps: 1.0e7
- steps: 10000
+ steps: 150000
reward_signals:
extrinsic:
strength: 1.0

time_horizon: 1000
batch_size: 2024
buffer_size: 20240
- max_steps: 1e6
- summary_freq: 3000
+ max_steps: 1e7
+ summary_freq: 30000
- steps: 5000
+ steps: 50000
reward_signals:
gail:
strength: 1.0

PushBlock:
- max_steps: 5.0e4
+ max_steps: 1.5e7
- summary_freq: 2000
+ summary_freq: 60000
time_horizon: 64
num_layers: 2
reward_signals:

encoding_size: 128
- demo_path: Project/Assets/ML-Agents/Examples/PushBlock/Demos/ExpertPush.demo
+ demo_path: Project/Assets/Demonstrations/PushblockDemo.demo
Hallway:
use_recurrent: true

num_epoch: 3
buffer_size: 1024
batch_size: 128
- max_steps: 5.0e5
- summary_freq: 1000
+ max_steps: 1.0e7
+ summary_freq: 10000
time_horizon: 64
reward_signals:
extrinsic:

FoodCollector:
batch_size: 64
summary_freq: 1000
- max_steps: 5.0e4
+ max_steps: 2.0e6
use_recurrent: false
hidden_units: 128
learning_rate: 3.0e-4

docs/Migrating.md (2 changes)


* The `Monitor` class has been moved to the Examples Project. (It was prone to errors during testing)
* The `MLAgents.Sensor` namespace has been removed. All sensors now belong to the `MLAgents` namespace.
* The `SetActionMask` method must now be called on the optional `ActionMasker` argument of the `CollectObservations` method. (We now consider an action mask as a type of observation)
+ * The method `GetStepCount()` on the Agent class has been replaced with the property getter `StepCount`
+ * Replace all calls to `Agent.GetStepCount()` with `Agent.StepCount`
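
The migration step above amounts to a mechanical call-site change. The following is a minimal sketch of that change, not the actual ML-Agents code: `SketchAgent`, its `Step()` helper, and `MigrationExample` are hypothetical stand-ins introduced only so the example compiles without Unity. Only the `StepCount` property and the `StepCount / (float)maxStep` expression are taken from this diff.

```csharp
using System;

// Hypothetical stand-in for the ML-Agents Agent class.
// Only the step counter touched by this commit is modeled.
public class SketchAgent
{
    int m_StepCount;

    // After this commit: a read-only property replaces the GetStepCount() method.
    public int StepCount
    {
        get { return m_StepCount; }
    }

    // Hypothetical helper (not part of the diff) so the sketch can advance the counter.
    public void Step()
    {
        m_StepCount += 1;
    }
}

public static class MigrationExample
{
    // Mirrors the HallwayAgent observation: the fraction of the episode elapsed.
    public static float NormalizedStep(SketchAgent agent, int maxStep)
    {
        // Before the change this read: agent.GetStepCount() / (float)maxStep
        return agent.StepCount / (float)maxStep;
    }

    public static void Main()
    {
        var agent = new SketchAgent();
        agent.Step();
        agent.Step();
        // Two steps taken out of a hypothetical maxStep of 4.
        Console.WriteLine(NormalizedStep(agent, 4));
    }
}
```

Because `StepCount` is exposed with a getter only, call sites drop the parentheses and cannot assign to it; no other change to the calling code should be needed under these assumptions.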
## Migrating from 0.13 to 0.14
