
Folded ODD Feature Into Agents

/develop-generalizationTraining-TrainerController
Marwan Mattar 7 years ago
Current commit
3ceaa337
5 files changed, with 90 insertions and 290 deletions
  1. 26
      docs/Learning-Environment-Design-Agents.md
  2. 3
      docs/ML-Agents-Overview.md
  3. 136
      docs/images/agent.png
  4. 176
      docs/images/ml-agents-ODD.png
  5. 39
      docs/Feature-On-Demand-Decisions.md

26
docs/Learning-Environment-Design-Agents.md


To control the frequency of step-based decision making, set the **Decision Frequency** value for the Agent object in the Unity Inspector window. Agents using the same Brain instance can use a different frequency. During simulation steps in which no decision is requested, the agent receives the same action chosen by the previous decision.
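The fixed-frequency behavior described above can be sketched as a small simulation. This is a toy model in plain Python, not the ML-Agents API: `simulate_fixed_frequency` and `policy` are hypothetical names, and the choice to make the first decision at step 0 is an assumption.

```python
def simulate_fixed_frequency(num_steps, decision_frequency, policy):
    """Toy model: a new decision every `decision_frequency` steps, an action every step."""
    log = []
    current_action = None
    for step in range(num_steps):
        # Assumption: decisions fall on steps 0, N, 2N, ...
        decided = step % decision_frequency == 0
        if decided:
            current_action = policy(step)  # stands in for the Brain's decision
        # On steps without a decision, the previous action is reused.
        log.append((step, decided, current_action))
    return log


log = simulate_fixed_frequency(
    num_steps=10, decision_frequency=5, policy=lambda s: f"action@{s}"
)
for step, decided, action in log:
    print(step, "decide" if decided else "reuse", action)
```

With a frequency of 5, steps 1 through 4 reuse the action chosen at step 0, and steps 6 through 9 reuse the one chosen at step 5.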
### On Demand Decision Making
On demand decision making allows agents to request decisions from their
brains only when needed instead of receiving decisions at a fixed
frequency. This is useful when the agents commit to an action for a
variable number of steps or when the agents cannot make decisions
at the same time. This is typically the case for turn-based games, games
where agents must react to events or games where agents can take
actions of variable duration.
When you turn on **On Demand Decisions** for an agent, your agent code must call the `Agent.RequestDecision()` function. This function call starts one iteration of the observation-decision-action-reward cycle. The Brain invokes the agent's `CollectObservations()` method, makes a decision and returns it by calling the `AgentAction()` method. The Brain waits for the agent to request the next decision before starting another iteration.
See [On Demand Decision Making](Feature-On-Demand-Decision.md).
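The order of calls in one on-demand iteration can be illustrated with a toy model. This is a sketch in plain Python rather than Unity code: `ToyBrain`, `ToyAgent`, and `simulation_step` are hypothetical stand-ins, the "every third step" turn rule is invented for illustration, and the decision is resolved in the same step as the request for simplicity.

```python
class ToyBrain:
    """Stand-in for a Brain: maps an observation to an action."""

    def decide(self, observation):
        return "play" if observation["my_turn"] else "pass"


class ToyAgent:
    def __init__(self, brain):
        self.brain = brain
        self.pending = False  # set when a decision has been requested
        self.events = []      # records the order of calls

    def request_decision(self):
        self.pending = True

    def collect_observations(self, step):
        self.events.append((step, "CollectObservations"))
        return {"my_turn": True}

    def agent_action(self, step, action):
        self.events.append((step, f"AgentAction({action})"))

    def simulation_step(self, step):
        # Nothing happens until the agent asks for a decision.
        if not self.pending:
            return
        self.pending = False
        observation = self.collect_observations(step)
        action = self.brain.decide(observation)
        self.agent_action(step, action)


agent = ToyAgent(ToyBrain())
for step in range(6):
    if step % 3 == 0:  # hypothetical rule: this agent's turn comes every third step
        agent.request_decision()
    agent.simulation_step(step)
print(agent.events)
```

Only steps 0 and 3 produce an observation and an action; on the other steps the toy Brain does nothing because no decision was requested.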

* `Max Step` - The per-agent maximum number of steps. Once this number is reached, the agent will be reset if `Reset On Done` is checked.
* `Reset On Done` - Whether the agent's `AgentReset()` function should be called when the agent reaches its `Max Step` count or is marked as done in code.
* `On Demand Decision` - Whether the agent requests decisions at a fixed step interval or explicitly requests decisions by calling `RequestDecision()`.
* If not checked, the Agent will request a new
decision every `Decision Frequency` steps and
perform an action every step. In the example above,
`CollectObservations()` will be called every 5 steps and
`AgentAction()` will be called at every step. This means that the
Agent will reuse the decision the Brain has given it.
* If checked, the Agent controls when to receive
decisions and take actions. To do so, the Agent may use one or both of the following methods:
* `RequestDecision()` Signals that the Agent is requesting a decision.
This causes the Agent to collect its observations and ask the Brain for a
decision at the next step of the simulation. Note that when an Agent
requests a decision, it also requests an action.
This is to ensure that all decisions lead to an action during training.
* `RequestAction()` Signals that the Agent is requesting an action. The
action provided to the Agent in this case is the same action that was
provided the last time it requested a decision.
* `Decision Frequency` - The number of steps between decision requests. Not used if `On Demand Decision` is true.
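The difference between `RequestDecision()` and `RequestAction()` described in the list above can be sketched with a toy model. This is plain Python, not the ML-Agents implementation: `OnDemandAgent`, `policy`, and `step` are hypothetical names, and requests are resolved in the same step rather than the next one to keep the sketch short.

```python
class OnDemandAgent:
    """Toy model of an agent with on-demand decisions enabled."""

    def __init__(self, policy):
        self.policy = policy      # stands in for the Brain
        self.last_action = None
        self._want_decision = False
        self._want_action = False
        self.history = []         # actions actually performed

    def request_decision(self):
        self._want_decision = True
        self._want_action = True  # a decision always leads to an action

    def request_action(self):
        self._want_action = True  # reuse the most recent decision

    def step(self, observation):
        if self._want_decision:
            self.last_action = self.policy(observation)
        if self._want_action:
            self.history.append(self.last_action)
        self._want_decision = False
        self._want_action = False


agent = OnDemandAgent(policy=lambda obs: f"move-{obs}")

agent.request_decision()  # new decision, applied as an action
agent.step(0)
agent.request_action()    # same action as the last decision
agent.step(1)
agent.step(2)             # no request: no action this step
agent.request_decision()
agent.step(3)

print(agent.history)
```

In this sketch the second action repeats the first decision, the idle step performs nothing, and a fresh decision is made only at the final request.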
## Instantiating an Agent at Runtime

3
docs/ML-Agents-Overview.md


must react to events or games where agents can take actions of variable
duration. Switching between decision taking at every step and
on-demand-decision is one button click away. You can learn more about the
on-demand-decision feature [here](Feature-On-Demand-Decisions.md).
on-demand-decision feature
[here](Learning-Environment-Design-Agents.md#on-demand-decision-making).
* **Memory-enhanced Agents** - In some scenarios, agents must learn to
remember the past in order to take the

136
docs/images/agent.png

Width: 526  |  Height: 160  |  Size: 20 KiB

176
docs/images/ml-agents-ODD.png


39
docs/Feature-On-Demand-Decisions.md


# On Demand Decision Making
## Description
On demand decision making allows agents to request decisions from their
brains only when needed instead of receiving decisions at a fixed
frequency. This is useful when the agents commit to an action for a
variable number of steps or when the agents cannot make decisions
at the same time. This is typically the case for turn-based games, games
where agents must react to events or games where agents can take
actions of variable duration.
## How to use
To enable or disable on demand decision making, use the checkbox called
`On Demand Decisions` in the Agent Inspector.
<p align="center">
<img src="images/ml-agents-ODD.png"
alt="On Demand Decision"
width="500" border="10" />
</p>
* If `On Demand Decisions` is not checked, the Agent will request a new
decision every `Decision Frequency` steps and
perform an action every step. In the example above,
`CollectObservations()` will be called every 5 steps and
`AgentAction()` will be called at every step. This means that the
Agent will reuse the decision the Brain has given it.
* If `On Demand Decisions` is checked, the Agent controls when to receive
decisions and take actions. To do so, the Agent may use one or both of the following methods:
* `RequestDecision()` Signals that the Agent is requesting a decision.
This causes the Agent to collect its observations and ask the Brain for a
decision at the next step of the simulation. Note that when an Agent
requests a decision, it also requests an action.
This is to ensure that all decisions lead to an action during training.
* `RequestAction()` Signals that the Agent is requesting an action. The
action provided to the Agent in this case is the same action that was
provided the last time it requested a decision.