Migrating to ML-Agents v0.3
ML-Agents v0.3 introduces many new features and improvements that change both the training process and the Unity API, which makes it incompatible with environments built using older versions. This page highlights those changes for users familiar with v0.1 or v0.2 to ensure a smooth transition.
Important
- ML-Agents is no longer compatible with Python 2.
Python Training
- The training script `ppo.py` and the `PPO.ipynb` Python notebook have been replaced with a single `learn.py` script as the launching point for training with ML-Agents. For more information on using `learn.py`, see here.
- Hyperparameters for training brains are now stored in the `trainer_config.yaml` file. For more information on using this file, see here.
Unity API
- Modifications to an Agent's rewards must now be done using either `AddReward()` or `SetReward()`.
- Setting an Agent to done now requires the use of the `Done()` method.
- `CollectStates()` has been replaced by `CollectObservations()`, which no longer returns a list of floats.
- To collect observations, call `AddVectorObs()` within `CollectObservations()`. Note that you can call `AddVectorObs()` with floats, integers, lists and arrays of floats, Vector3 and Quaternions.
- `AgentStep()` has been replaced by `AgentAction()`.
- `WaitTime()` has been removed.
- The `Frame Skip` field of the Academy is replaced by the Agent's `Decision Frequency` field, enabling agents to make decisions at different frequencies.
- The names of the inputs in the Internal Brain have been changed. You must replace `state` with `vector_observation` and `observation` with `visual_observation`. In addition, you must remove the `epsilon` placeholder.
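
When migrating a larger project, it can help to locate leftover uses of the replaced or removed methods before opening Unity. The following is a minimal sketch of such a check, assuming the Agent scripts are available as plain text. The helper `find_deprecated_calls` and the sample string are hypothetical illustrations and are not part of ML-Agents; only the method names come from the list above.

```python
import re

# Old method name -> migration hint, taken from the Unity API list above.
DEPRECATED_CALLS = {
    "CollectStates": "replace with CollectObservations() and call AddVectorObs() inside it",
    "AgentStep": "replace with AgentAction()",
    "WaitTime": "removed in v0.3",
}


def find_deprecated_calls(source):
    """Return (line_number, method_name, hint) for each old method name found.

    `source` is the text of a C# script. This is a plain text search,
    not a C# parser, so matches inside comments or strings are reported too.
    """
    findings = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for name, hint in DEPRECATED_CALLS.items():
            # \b stops "AgentStep" from matching inside a longer identifier;
            # the trailing "(" restricts matches to declarations and calls.
            if re.search(r"\b" + re.escape(name) + r"\s*\(", line):
                findings.append((line_number, name, hint))
    return findings


if __name__ == "__main__":
    # Hypothetical snippet, only for demonstration.
    sample = "CollectStates()\nDone();\nAgentStep(act)\n"
    for line_number, name, hint in find_deprecated_calls(sample):
        print("line {}: {}() -> {}".format(line_number, name, hint))
```

Reward handling and the `Done()` requirement are not covered by this sketch, because the notes above do not name the old mechanism they replace; those changes need a manual review.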
Semantics
In order to more closely align with the terminology used in the Reinforcement Learning field, and to be more descriptive, we have changed the names of some of the concepts used in ML-Agents. The changes are highlighted in the table below.
| Old - v0.2 and earlier | New - v0.3 and later |
| --- | --- |
| State | Vector Observation |
| Observation | Visual Observation |
| Action | Vector Action |
| N/A | Text Observation |
| N/A | Text Action |
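
The first two rows of the table line up with the Internal Brain input renames listed under Unity API (`state` to `vector_observation`, `observation` to `visual_observation`, and the removal of `epsilon`). A minimal sketch of that mapping, assuming the input names of a model are available as a plain list of strings; the helper `migrate_input_names` is hypothetical and not part of ML-Agents:

```python
# Old Internal Brain input name -> new name, per the migration notes.
RENAMED_INPUTS = {
    "state": "vector_observation",
    "observation": "visual_observation",
}

# Inputs that must no longer be present in v0.3.
REMOVED_INPUTS = {"epsilon"}


def migrate_input_names(names):
    """Map a list of v0.2 input names to their v0.3 equivalents.

    Renamed inputs are translated, removed inputs are dropped, and any
    other name is passed through unchanged.
    """
    migrated = []
    for name in names:
        if name in REMOVED_INPUTS:
            continue  # no longer used in v0.3
        migrated.append(RENAMED_INPUTS.get(name, name))
    return migrated
```

For example, `migrate_input_names(["state", "observation", "epsilon"])` yields `["vector_observation", "visual_observation"]`.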