An action is an instruction from the brain that the agent carries out. The action is passed to the agent as a parameter when the Academy invokes the agent's `AgentAction()` function. When you specify that the vector action space is **Continuous**, the action parameter passed to the agent is an array of control signals with length equal to the `Vector Action Space Size` property. When you specify a **Discrete** vector action space type, the action parameter is an array of integers, each of which is an index into a list or table of commands. The number of indices in the array equals the number of branches defined in the `Branches` property; each branch corresponds to its own action table, and each value in the `Branches` array sets the size of the corresponding table. Set the `Vector Action Space Size` and `Vector Action Space Type` properties on the Brain object assigned to the agent (using the Unity Editor Inspector window).
For example, if you designed an agent to move in two dimensions, you could use either continuous or discrete vector actions. In the continuous case, you would set the vector action size to two (one for each dimension), and the agent's brain would create an action with two floating point values. In the discrete case, you would use one branch with a size of four (one for each direction), and the brain would create an action array containing a single element with a value ranging from zero to three. Alternatively, you could create two branches of size two (one for horizontal movement and one for vertical movement), and the brain would create an action array containing two elements with values ranging from zero to one.
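As a minimal sketch of the continuous case, an `AgentAction()` override could read the two floats directly as movement along each axis. The clamping range, force multiplier, and use of a `Rigidbody` below are illustrative assumptions, not part of the API:

```csharp
public override void AgentAction(float[] vectorAction, string textAction)
{
    // Continuous case: one float control signal per dimension.
    float moveX = Mathf.Clamp(vectorAction[0], -1f, 1f);
    float moveZ = Mathf.Clamp(vectorAction[1], -1f, 1f);

    // Illustrative assumption: apply the two control signals as a force on a Rigidbody.
    GetComponent<Rigidbody>().AddForce(new Vector3(moveX, 0f, moveZ) * 10f);
}
```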
Note that when you are programming actions for an agent, it is often helpful to test your action logic using a **Player** brain, which lets you map keyboard commands to actions. See [Brains](Learning-Environment-Design-Brains.md).
### Discrete Action Space
When an agent uses a brain set to the **Discrete** vector action space, the action parameter passed to the agent's `AgentAction()` function is an array of indices, one per branch. With the discrete vector action space, `Branches` is an array of integers in which each value is the number of possible actions for the corresponding branch.
For example, if we wanted an agent that can move in a plane and jump, we could define two branches (one for motion and one for jumping) because we want the agent to be able to move __and__ jump concurrently.
We define the first branch to have 5 possible actions (don't move, go left, go right, go backward, go forward) and the second branch to have 2 possible actions (don't jump, jump). The `AgentAction()` method would look something like this:
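The sketch below assumes the `AgentAction(float[] vectorAction, string textAction)` override; the force values and the `IsGrounded()` helper are illustrative assumptions. The important point is that element 0 of the action array holds the index chosen for the movement branch and element 1 holds the index chosen for the jump branch:

```csharp
public override void AgentAction(float[] vectorAction, string textAction)
{
    // Element 0: chosen index for the movement branch (0-4).
    // Element 1: chosen index for the jump branch (0-1).
    int movement = Mathf.FloorToInt(vectorAction[0]);
    int jump = Mathf.FloorToInt(vectorAction[1]);

    Vector3 direction = Vector3.zero;

    // Look up the index in the movement action table.
    if (movement == 1) { direction.x = -1f; } // go left
    if (movement == 2) { direction.x = 1f; }  // go right
    if (movement == 3) { direction.z = -1f; } // go backward
    if (movement == 4) { direction.z = 1f; }  // go forward

    // Look up the index in the jump action table (IsGrounded() is an assumed helper).
    if (jump == 1 && IsGrounded()) { direction.y = 1f; }

    // Illustrative assumption: apply the decoded action as forces on a Rigidbody.
    GetComponent<Rigidbody>().AddForce(
        new Vector3(direction.x * 40f, direction.y * 300f, direction.z * 40f));
}
```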
* `Visual Observations` - Describes height, width, and whether to grayscale visual observations for the Brain.
* `Vector Action`
* `Space Type` - Corresponds to whether the action vector contains integer indices (Discrete) or a series of real-valued floats (Continuous).
* `Space Size` (Continuous) - Length of the action vector for the brain.
* `Branches` (Discrete) - An array of integers defining multiple concurrent discrete actions. Each value in the `Branches` array is the number of possible discrete values for the corresponding action branch.
* `Action Descriptions` - A list of strings used to name the available actions for the Brain.
* `Type of Brain` - Describes how the Brain will decide actions.
* `External` - Actions are decided by an external process, such as the PPO training process.
* `Recurrent Input Node Name` : If your graph uses a recurrent input / memory as input and outputs new recurrent input / memory, you must specify the name of the input placeholder here.
* `Recurrent Output Node Name` : If your graph uses a recurrent input / memory as input and outputs new recurrent input / memory, you must specify the name of the output placeholder here.
* `Observation Placeholder Name` : If your graph uses observations as input, you must specify it here. Note that the number of observations is equal to the length of `Camera Resolutions` in the brain parameters.
* `Action Node Name` : Specify the name of the placeholder corresponding to the actions of the brain in your graph. If the action space type is continuous, the output must be a one-dimensional tensor of floats of length `Action Space Size`; if the action space type is discrete, the output must be a one-dimensional tensor of ints with the same length as the `Branches` array.
* `Graph Placeholder` : If your graph takes additional inputs that are fixed (example: noise level) you can specify them here. Note that in your graph, these must correspond to one dimensional tensors of int or float of size 1.
* `Name` : Corresponds to the name of the placeholder.
* `Value Type` : Either Integer or Floating Point.
The Decision interface defines two methods, `Decide()` and `MakeMemory()`.
The `Decide()` method receives an agent's current state, consisting of the agent's observations, reward, memory and other aspects of the agent's state, and must return an array containing the action that the agent should take. The format of the returned action array depends on the **Vector Action Space Type**. When using a **Continuous** action space, the action array is a float array with a length equal to the **Vector Action Space Size** setting. When using a **Discrete** action space, the action array is an integer array with the same length as the `Branches` array. In the discrete action space, the values of the **Branches** array define the number of discrete values that your `Decide()` function can return for each branch; these values don't need to be consecutive integers.
The `MakeMemory()` function allows you to pass data forward to the next iteration of an agent's decision making process. The array you return from `MakeMemory()` is passed to the `Decide()` function in the next iteration. You can use the memory to allow the agent's decision process to take past actions and observations into account when making the current decision. If your heuristic logic does not require memory, just return an empty array.
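As a hedged sketch of a stateless heuristic for a two-branch discrete action space: the `Decide()`/`MakeMemory()` signatures shown here (vector observations, visual observations, reward, done flag, and memory) are an assumption about the Decision interface in this version, and the steering rule is purely illustrative:

```csharp
using System.Collections.Generic;
using UnityEngine;

public class HeuristicLogic : MonoBehaviour, Decision
{
    public float[] Decide(
        List<float> vectorObs,
        List<Texture2D> visualObs,
        float reward,
        bool done,
        List<float> memory)
    {
        // Discrete action space with two branches: return one index per branch.
        // Branch 0 (movement, 5 options) and branch 1 (jump, 2 options), as in the example above.
        int movement = vectorObs[0] < 0f ? 2 : 1; // assumed rule: steer based on the first observation
        int jump = 0;                             // this simple heuristic never jumps
        return new float[] { movement, jump };
    }

    public List<float> MakeMemory(
        List<float> vectorObs,
        List<Texture2D> visualObs,
        float reward,
        bool done,
        List<float> memory)
    {
        // No memory is needed for this stateless heuristic.
        return new List<float>();
    }
}
```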
|| **Index** | The element of the agent's action vector to set when this key is pressed. The index value cannot exceed the size of the Action Space (minus 1, since it is an array index).|
|| **Value** | The value to send to the agent as its action for the specified index when the mapped key is pressed. All other members of the action vector are set to 0. |
|**Discrete Player Actions**|| The mapping for the discrete vector action space. Shown when the action space is **Discrete**.|
|| **Default Action** | The value to send when no keys are pressed.|
|| **Branch Index** | The element of the agent's action vector to set when this key is pressed. The index value cannot exceed the number of branches (minus 1, since it is an array index).|
|| **Value** | The value to send to the agent as its action when the mapped key is pressed. Cannot exceed the size of the associated branch (minus 1, since it is an index into that branch's action table).|
For more information about the Unity input system, see [Input](https://docs.unity3d.com/ScriptReference/Input.html).
* Brains: Two brains, each with the following observation/action space.
* Vector Observation space: Size of 74, corresponding to 14 ray casts each detecting 4 possible objects, plus the global position of the agent and whether or not the agent is grounded.
* Vector Action space: (Discrete) 4 Branches:
* Forward Motion (3 possible actions: Forward, Backwards, No Action)
* Rotation (3 possible actions: Rotate Left, Rotate Right, No Action)
* Side Motion (3 possible actions: Left, Right, No Action)
* Jump (2 possible actions: Jump, No Action)
* Visual Observations: None.
* Reset Parameters: 4, corresponding to the height of the possible walls.
* Benchmark Mean Reward (Big & Small Wall Brain): 0.8
* -1 for interaction with blue banana.
* Brains: One brain with the following observation/action space.
* Vector Observation space: 53 corresponding to velocity of agent (2), whether agent is frozen and/or shot its laser (2), plus ray-based perception of objects around agent's forward direction (49; 7 raycast angles with 7 measurements for each).
* Vector Action space: (Discrete) 4 Branches:
* Forward Motion (3 possible actions: Forward, Backwards, No Action)
* Side Motion (3 possible actions: Left, Right, No Action)
* Rotation (3 possible actions: Rotate Left, Rotate Right, No Action)
* Laser (2 possible actions: Laser, No Action)
* Visual Observations (Optional): First-person camera per-agent. Use `VisualBanana` scene.
* Reset Parameters: None.
* Benchmark Mean Reward: 10
* -0.0003 Existential penalty.
* Brains: One brain with the following observation/action space:
* Vector Observation space: 30 corresponding to local ray-casts detecting objects, goals, and walls.
* Vector Action space: (Discrete) 1 Branch, 4 actions corresponding to agent rotation and forward/backward movement.
* Visual Observations (Optional): First-person view for the agent. Use `VisualHallway` scene.
* Reset Parameters: None.
* Benchmark Mean Reward: 0.7
* +0.001 Existential bonus.
* Brains: Two brains with the following observation/action space:
* Vector Observation space: 112 corresponding to local 14 ray casts, each detecting 7 possible object types, along with the object's distance. Perception is in 180 degree view from front of agent.
* Vector Action space: (Discrete) One Branch:
* Striker: 6 actions corresponding to forward, backward, sideways movement, as well as rotation.
* Goalie: 4 actions corresponding to forward, backward, sideways movement.
* Visual Observations: None.
* Reset Parameters: None
* Benchmark Mean Reward (Striker & Goalie Brain): 0 (the means will be inverses of each other and will criss-cross during training)
* **`local_done`** : A list as long as the number of agents using the brain containing `done` flags (whether or not the agent is done).
* **`max_reached`** : A list as long as the number of agents using the brain containing true if the agents reached their max steps.
* **`agents`** : A list of the unique ids of the agents using the brain.
* **`previous_actions`** : A two-dimensional numpy array of dimension `(batch size, vector action size)` if the vector action space is continuous and `(batch size, number of branches)` if the vector action space is discrete.
Once loaded, the UnityEnvironment object, referenced by the variable named `env` in this example, can be used in the following way: