2.2 KiB
Creating an Academy
An Academy orchestrates all the Agent objects in a Unity scene. Every scene containing Agents must contain a single Academy. To use an Academy, you must create your own subclass. However, all the methods you can override are optional.
Use the Academy methods to:
- Initialize the environment after the scene loads
- Reset the environment
- Change things in the environment at each simulation step
See Reinforcement Learning in Unity for a description of the timing of these method calls during a simulation.
Initializing an Academy
Initialization is performed once in an Academy object's life cycle. Use the
InitializeAcademy()
method for any logic you would normally perform in the
standard Unity Start()
or Awake()
methods.
Note: Because the base Academy implements a Awake()
function, you must not
implement your own. Because of the way the Unity MonoBehaviour class is defined,
implementing your own Awake()
function hides the base class version and Unity
will call yours instead. Likewise, do not implement a FixedUpdate()
function
in your Academy subclass.
Resetting an Environment
Implement an AcademyReset()
function to alter the environment at the start of
each episode. For example, you might want to reset an Agent to its starting
position or move a goal to a random position. An environment resets when the
reset()
method is called on the Python UnityEnvironment
.
When you reset an environment, consider the factors that should change so that training is generalizable to different conditions. For example, if you were training a maze-solving agent, you would probably want to change the maze itself for each training episode. Otherwise, the agent would probably on learn to solve one, particular maze, not mazes in general.
Controlling an Environment
The AcademyStep()
function is called at every step in the simulation before
any Agents are updated. Use this function to update objects in the environment
at every step or during the episode between environment resets. For example, if
you want to add elements to the environment at random intervals, you can put the
logic for creating them in the AcademyStep()
function.