Arthur Juliani
5 years ago
Current commit
c577ce26
83 files changed, with 12,547 insertions and 5,713 deletions
Changed files (change count per file):

- Project/Assets/ML-Agents/Examples/3DBall/Demos/Expert3DBall.demo.meta (2)
- Project/Assets/ML-Agents/Examples/3DBall/Demos/Expert3DBallHard.demo.meta (2)
- Project/Assets/ML-Agents/Examples/Basic/Demos/ExpertBasic.demo.meta (2)
- Project/Assets/ML-Agents/Examples/Bouncer/Demos/ExpertBouncer.demo.meta (2)
- Project/Assets/ML-Agents/Examples/Crawler/Demos/ExpertCrawlerDyn.demo.meta (2)
- Project/Assets/ML-Agents/Examples/Crawler/Demos/ExpertCrawlerSta.demo.meta (2)
- Project/Assets/ML-Agents/Examples/FoodCollector/Demos/ExpertFood.demo.meta (2)
- Project/Assets/ML-Agents/Examples/GridWorld/Demos/ExpertGrid.demo.meta (2)
- Project/Assets/ML-Agents/Examples/Hallway/Demos/ExpertHallway.demo.meta (2)
- Project/Assets/ML-Agents/Examples/PushBlock/Demos/ExpertPush.demo.meta (2)
- Project/Assets/ML-Agents/Examples/Pyramids/Demos/ExpertPyramid.demo.meta (2)
- Project/Assets/ML-Agents/Examples/Reacher/Demos/ExpertReacher.demo.meta (2)
- Project/Assets/ML-Agents/Examples/Soccer/Prefabs/SoccerFieldTwos.prefab (72)
- Project/Assets/ML-Agents/Examples/Soccer/Scripts/AgentSoccer.cs (71)
- Project/Assets/ML-Agents/Examples/Soccer/Scripts/SoccerFieldArea.cs (12)
- Project/Assets/ML-Agents/Examples/Soccer/TFModels/SoccerTwos.nn (1001)
- Project/Assets/ML-Agents/Examples/Soccer/TFModels/Goalie.nn.meta (2)
- Project/Assets/ML-Agents/Examples/Tennis/Demos/ExpertTennis.demo.meta (2)
- Project/Assets/ML-Agents/Examples/Walker/Demos/ExpertWalker.demo.meta (2)
- Project/ProjectSettings/ProjectVersion.txt (2)
- com.unity.ml-agents/CHANGELOG.md (307)
- com.unity.ml-agents/Editor/DemonstrationDrawer.cs (78)
- com.unity.ml-agents/Editor/DemonstrationImporter.cs (26)
- com.unity.ml-agents/Runtime/Agent.cs (21)
- com.unity.ml-agents/Runtime/Communicator/GrpcExtensions.cs (68)
- com.unity.ml-agents/Runtime/Demonstrations/DemonstrationWriter.cs (2)
- com.unity.ml-agents/Runtime/Policies/HeuristicPolicy.cs (17)
- com.unity.ml-agents/Runtime/Timer.cs (35)
- com.unity.ml-agents/Tests/Editor/TimerTest.cs (43)
- config/trainer_config.yaml (36)
- docs/FAQ.md (92)
- docs/Getting-Started.md (346)
- docs/Installation-Anaconda-Windows.md (120)
- docs/Learning-Environment-Create-New.md (489)
- docs/Learning-Environment-Design-Agents.md (508)
- docs/Learning-Environment-Examples.md (674)
- docs/Learning-Environment-Executable.md (84)
- docs/ML-Agents-Overview.md (39)
- docs/Readme.md (100)
- docs/Training-Imitation-Learning.md (6)
- docs/Training-ML-Agents.md (368)
- docs/Training-Self-Play.md (24)
- docs/Training-on-Amazon-Web-Service.md (147)
- docs/Training-on-Microsoft-Azure.md (145)
- docs/Using-Docker.md (36)
- docs/Using-Tensorboard.md (82)
- docs/Using-Virtual-Environment.md (66)
- docs/images/demo_inspector.png (257)
- docs/images/docker_build_settings.png (999)
- docs/images/learning_environment_basic.png (198)
- docs/images/learning_environment_example.png (545)
- docs/images/unity_package_json.png (604)
- docs/images/unity_package_manager_window.png (999)
- ml-agents/mlagents/trainers/learn.py (95)
- utils/validate_versions.py (24)
- Project/Assets/ML-Agents/Examples/Soccer/Prefabs/StrikersVsGoalieField.prefab (1001)
- Project/Assets/ML-Agents/Examples/Soccer/Prefabs/StrikersVsGoalieField.prefab.meta (8)
- Project/Assets/ML-Agents/Examples/Soccer/Scenes/StrikersVsGoalie.unity (919)
- Project/Assets/ML-Agents/Examples/Soccer/Scenes/StrikersVsGoalie.unity.meta (8)
- Project/Assets/ML-Agents/Examples/Soccer/TFModels/Goalie.nn (1001)
- Project/Assets/ML-Agents/Examples/Soccer/TFModels/SoccerTwos.nn.meta (11)
- Project/Assets/ML-Agents/Examples/Soccer/TFModels/Striker.nn (1001)
- Project/Assets/ML-Agents/Examples/Soccer/TFModels/Striker.nn.meta (11)
- com.unity.ml-agents/Runtime/Demonstrations/DemonstrationMetaData.cs (22)
- com.unity.ml-agents/Runtime/Demonstrations/DemonstrationMetaData.cs.meta (11)
- com.unity.ml-agents/Runtime/Demonstrations/DemonstrationSummary.cs (37)
- config/curricula/soccer.yaml (7)
- docs/images/learning_environment_full.png (122)
- docs/images/roller-ball-agent.png (1001)
- docs/images/roller-ball-floor.png (932)
- docs/images/roller-ball-hierarchy.png (115)
- docs/images/roller-ball-projects.png (163)
- docs/images/roller-ball-target.png (803)
- docs/images/strikersvsgoalie.png (938)
- com.unity.ml-agents/Runtime/Demonstrations/Demonstration.cs (38)
- docs/images/mlagents-NewProject.png (86)
- docs/images/mlagents-NewTutBlock.png (388)
- docs/images/mlagents-NewTutFloor.png (345)
- docs/images/mlagents-NewTutSphere.png (333)
- docs/Training-on-Microsoft-Azure-Custom-Instance.md (91)
- /Project/Assets/ML-Agents/Examples/Soccer/TFModels/SoccerTwos.nn (0)
- /Project/Assets/ML-Agents/Examples/Soccer/TFModels/Goalie.nn.meta (0)
- /com.unity.ml-agents/Runtime/Demonstrations/DemonstrationSummary.cs.meta (0)
Project/ProjectSettings/ProjectVersion.txt: m_EditorVersion updated from 2018.4.17f1 to 2018.4.18f1.

com.unity.ml-agents/CHANGELOG.md:
# Changelog

The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/), and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html).
- The `--load` and `--train` command-line flags have been deprecated. Training now happens by default; use `--resume` to resume training instead. (#3705)
- The Jupyter notebooks have been removed from the repository.
- Introduced `SideChannelUtils` to register, unregister, and access side channels.
- `Academy.FloatProperties` was removed; please use `SideChannelUtils.GetSideChannel<FloatPropertiesChannel>()` instead.
- Removed the multi-agent gym option from the gym wrapper. For multi-agent scenarios, use the [Low Level Python API](../docs/Python-API.md).
- The low level Python API has changed. See the [Low Level Python API documentation](../docs/Python-API.md) for more information. If you use `mlagents-learn` for training, this should be a transparent change.
- Added the ability to start training (initialize model weights) from a previous run ID. (#3710)
- The internal event `Academy.AgentSetStatus` was renamed to `Academy.AgentPreStep` and made public.
- The offset logic was removed from DecisionRequester.
- The signature of `Agent.Heuristic()` was changed to take a `float[]` as a parameter instead of returning the array. This was done to prevent a common source of error where users would return arrays of the wrong size (see the sketch after this list).
- The communication API version has been bumped to 1.0.0 and will use [Semantic Versioning](https://semver.org/) to do compatibility checks for communication between Unity and the Python process.
- The obsolete `Agent` methods `GiveModel`, `Done`, `InitializeAgent`, `AgentAction`, and `AgentReset` have been removed.
- The GhostTrainer has been extended to support asymmetric games, and the asymmetric example environment Strikers Vs. Goalie has been added.
- The format of the console output has changed slightly and now matches the name of the model/summary directory. (#3630, #3616)
- Added a feature to allow sending stats from C# environments to TensorBoard (and other Python StatsWriters). To do this from your code, use `SideChannelUtils.GetSideChannel<StatsSideChannel>().AddStat(key, value)`. (#3660)
- Renamed the 'Generalization' feature to 'Environment Parameter Randomization'.
- Timer files now contain a dictionary of metadata, including things like the package version numbers.
- SideChannel IncomingMessages methods now take an optional default argument, which is used when trying to read more data than the message contains.
- The way that UnityEnvironment decides the port was changed. If no port is specified, the behavior will depend on the `file_name` parameter: if it is `None`, 5004 (the editor port) will be used; otherwise 5005 (the base environment port) will be used.
- Fixed an issue where exceptions from environments provided a returncode of 0. (#3680)
- Running `mlagents-learn` with the same `--run-id` twice will no longer overwrite the existing files. (#3705)
- `StackingSensor` was changed from `internal` visibility to `public`.
- Updated Barracuda to 0.6.3-preview.
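The new `Heuristic()` signature and the `SideChannelUtils` accessors mentioned above can be seen together in a short sketch. This is a minimal illustration based only on the entries in this changelog; the `ExampleAgent` class, the `target_speed` key, and the `GetPropertyWithDefault` call are assumptions for the example, not code from this commit.

```csharp
using MLAgents;
using MLAgents.SideChannels;
using UnityEngine;

public class ExampleAgent : Agent
{
    // New signature: actions are written into the provided buffer instead of
    // being returned, so the array is always the expected size.
    public override void Heuristic(float[] actionsOut)
    {
        actionsOut[0] = Input.GetAxis("Horizontal");
        actionsOut[1] = Input.GetAxis("Vertical");
    }

    void FixedUpdate()
    {
        // Replaces the removed Academy.FloatProperties accessor.
        var floatProps = SideChannelUtils.GetSideChannel<FloatPropertiesChannel>();
        var targetSpeed = floatProps.GetPropertyWithDefault("target_speed", 1.0f);

        // Sends a custom stat to TensorBoard and other Python StatsWriters (#3660).
        SideChannelUtils.GetSideChannel<StatsSideChannel>().AddStat("Example/TargetSpeed", targetSpeed);
    }
}
```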

### Bug Fixes

- Fixed a display bug when viewing Demonstration files in the inspector. The shapes of the observations in the file now display correctly. (#3771)
- Raised the wall in the CrawlerStatic scene to prevent the Agent from falling off. (#3650)
- Fixed an issue where specifying `vis_encode_type` was required only for SAC. (#3677)
- Fixed the reported entropy values for continuous actions. (#3684)
- Fixed an issue where switching models using `SetModel()` during training would use an excessive amount of memory. (#3664)
- Environment subprocesses now close immediately on timeout or wrong API version. (#3679)
- Fixed an issue in the gym wrapper that would raise an exception if an Agent called EndEpisode multiple times in the same step. (#3700)
- Fixed an issue where logging output was not visible; logging levels are now set consistently. (#3703)

- `Agent.CollectObservations` now takes a VectorSensor argument. (#3352, #3389)
- Added the `Agent.CollectDiscreteActionMasks` virtual method with a `DiscreteActionMasker` argument to specify which discrete actions are unavailable to the Agent. (#3525)
- Beta support for ONNX export was added. If the `tf2onnx` python package is installed, models will be saved to `.onnx` as well as `.nn` format. Note that Barracuda 0.6.0 or later is required to import the `.onnx` files properly.
- Multi-GPU training and the `--multi-gpu` option have been removed temporarily. (#3345)
- All Sensor related code has been moved to the namespace `MLAgents.Sensors`.
- All SideChannel related code has been moved to the namespace `MLAgents.SideChannels`.
- `BrainParameters` and `SpaceType` have been removed from the public API.
- `BehaviorParameters` have been removed from the public API.
- The following methods in the `Agent` class have been deprecated and will be removed in a later release (a sketch of the updated overrides follows this list):
  - `InitializeAgent()` was renamed to `Initialize()`
  - `AgentAction()` was renamed to `OnActionReceived()`
  - `AgentReset()` was renamed to `OnEpisodeBegin()`
  - `Done()` was renamed to `EndEpisode()`
  - `GiveModel()` was renamed to `SetModel()`
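Taken together, the renamed `Agent` methods and the new observation and masking arguments look roughly like the following. This is a hypothetical `CubeAgent`, not code from an example environment; the observation values and the masked action index are placeholders.

```csharp
using MLAgents;
using MLAgents.Sensors;
using UnityEngine;

public class CubeAgent : Agent
{
    Rigidbody m_Body;

    public override void Initialize()                     // was InitializeAgent()
    {
        m_Body = GetComponent<Rigidbody>();
    }

    public override void OnEpisodeBegin()                 // was AgentReset()
    {
        m_Body.velocity = Vector3.zero;
        transform.localPosition = Vector3.zero;
    }

    public override void CollectObservations(VectorSensor sensor)
    {
        // Observations are now written to the VectorSensor argument (#3352, #3389).
        sensor.AddObservation(transform.localPosition);
        sensor.AddObservation(m_Body.velocity.x);
    }

    public override void CollectDiscreteActionMasks(DiscreteActionMasker actionMasker)
    {
        // Marks action index 1 of branch 0 as unavailable this step (#3525).
        actionMasker.SetMask(0, new[] { 1 });
    }

    public override void OnActionReceived(float[] vectorAction)   // was AgentAction()
    {
        if (transform.localPosition.y < 0f)
        {
            SetReward(-1f);
            EndEpisode();                                  // was Done()
        }
    }
}
```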

- Monitor.cs was moved to Examples. (#3372)
- Automatic stepping for the Academy is now controlled from the AutomaticSteppingEnabled property. (#3376)
- The GetEpisodeCount, GetStepCount, and GetTotalStepCount methods of Academy were changed to the EpisodeCount, StepCount, and TotalStepCount properties, respectively. (#3376)
- Several classes were changed from public to internal visibility. (#3390)
- Academy.RegisterSideChannel and UnregisterSideChannel methods were added. (#3391)
- A tutorial on adding custom SideChannels was added. (#3391)
- The stepping logic for the Agent and the Academy has been simplified. (#3448)
- Updated Barracuda to 0.6.1-preview.
- The interface for `RayPerceptionSensor.PerceiveStatic()` was changed to take an input class and write to an output class, and the method was renamed to `Perceive()`.
- The checkpoint file suffix was changed from `.cptk` to `.ckpt`. (#3470)
- The command-line argument used to determine the port that an environment will listen on was changed from `--port` to `--mlagents-port`.
- `DemonstrationRecorder` can now record observations outside of the editor.
- `DemonstrationRecorder` now has an optional path for the demonstrations. This will default to `Application.dataPath` if not set.
- `DemonstrationStore` was changed to accept a `Stream` for its constructor, and was renamed to `DemonstrationWriter`.
- The method `GetStepCount()` on the Agent class has been replaced with the property getter `StepCount`.
- `RayPerceptionSensorComponent` and related classes now display the debug gizmos whenever the Agent is selected (not just in Play mode).
- Most fields on `RayPerceptionSensorComponent` can now be changed while the editor is in Play mode. The exceptions to this are fields that affect the number of observations.
- Most fields on `CameraSensorComponent` and `RenderTextureSensorComponent` were changed to private and replaced by properties with the same name.
- Unused static methods from the `Utilities` class (ShiftLeft, ReplaceRange, AddRangeNoAlloc, and GetSensorFloatObservationSize) were removed.
- The `Agent` class is no longer abstract.
- SensorBase was moved out of the package and into the Examples directory.
- `AgentInfo.actionMasks` has been renamed to `AgentInfo.discreteActionMasks`.
- `DecisionRequester` has been made internal (you can still use the DecisionRequesterComponent from the inspector). `RepeatAction` was renamed `TakeActionsBetweenDecisions` for clarity. (#3555)
- The `IFloatProperties` interface has been removed.
- Fixed #3579.
- Improved inference performance for models with multiple action branches. (#3598)
- Fixed an issue when using GAIL with fewer than `batch_size` demonstrations. (#3591)
- The interfaces to the `SideChannel` classes (on C# and Python) have changed to use the new `IncomingMessage` and `OutgoingMessage` classes. These should make reading and writing data to the channel easier. (#3596) See the sketch after this list.
- Updated the ExpertPyramid.demo example demonstration file. (#3613)
- Updated the project version for the example environments to 2018.4.18f1. (#3618)
- Changed the Product Name in the example environments to remove spaces, so that the default build executable file doesn't contain spaces. (#3612)
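The `IncomingMessage`/`OutgoingMessage` interface and the side channel registration described above can be combined into a small custom channel. This is a sketch in the spirit of the custom SideChannels tutorial mentioned earlier; the GUID and string payload are arbitrary, and the exact member signatures (`ChannelId`, `OnMessageReceived`, `QueueMessageToSend`, `RegisterSideChannel`) should be treated as assumptions rather than verbatim package API.

```csharp
using System;
using MLAgents.SideChannels;
using UnityEngine;

// A minimal channel that logs strings from Python and can send strings back.
public class StringLogSideChannel : SideChannel
{
    public StringLogSideChannel()
    {
        // Arbitrary GUID identifying this channel on both the C# and Python sides.
        ChannelId = new Guid("621f0a70-4f87-11ea-a6bf-784f4387d1f7");
    }

    protected override void OnMessageReceived(IncomingMessage msg)
    {
        // Read methods can also take an optional default, used when the
        // message contains less data than requested.
        var receivedString = msg.ReadString();
        Debug.Log("From Python: " + receivedString);
    }

    public void SendStringToPython(string stringToSend)
    {
        using (var msgOut = new OutgoingMessage())
        {
            msgOut.WriteString(stringToSend);
            QueueMessageToSend(msgOut);
        }
    }
}

public class SideChannelRegistration : MonoBehaviour
{
    StringLogSideChannel m_Channel;

    void Awake()
    {
        m_Channel = new StringLogSideChannel();
        SideChannelUtils.RegisterSideChannel(m_Channel);
    }

    void OnDestroy()
    {
        SideChannelUtils.UnregisterSideChannel(m_Channel);
    }
}
```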

- Fixed an issue which caused self-play training sessions to consume a lot of memory. (#3451)
- Fixed an IndexError when using GAIL or behavioral cloning with demonstrations recorded with 0.14.0 or later. (#3464)
- Fixed a bug with the rewards of multiple Agents in the gym interface. (#3471, #3496)

- A new self-play mechanism for training agents in adversarial scenarios was added. (#3194)
- Tennis and Soccer environments were refactored to enable training with self-play. (#3194, #3331)
- The UnitySDK folder was split into a Unity Package (com.unity.ml-agents), and our examples were moved to the Project folder. (#3267)
- In order to reduce the size of the API, several classes and methods were marked as internal or private. Some public fields on the Agent were trimmed. (#3342, #3353, #3269)
- The Decision Period and on-demand decision checkboxes were removed from the Agent; on-demand decision is now the default. (#3243)
- Calling Done() on the Agent will reset it immediately and call the AgentReset virtual method. (#3291, #3242)
- The "Reset on Done" setting in AgentParameters was removed; this is now always true. The AgentOnDone virtual method on the Agent was removed. (#3311, #3222)
- Trainer steps are now counted per-Agent, not per-environment as in previous versions. For instance, if you have 10 Agents in the scene, 20 environment steps now correspond to 200 steps as printed in the terminal and in TensorBoard. (#3113)

- Curriculum config files are now YAML formatted, and all curricula for a training run are combined into a single file. (#3186)
- ML-Agents components, such as BehaviorParameters and various Sensor implementations, now appear in the Components menu. (#3231)
- Exceptions are now raised in Unity (in debug mode only) if NaN observations or rewards are passed. (#3221)
- The RayPerception MonoBehaviour, which was previously deprecated, was removed. (#3304)
- Uncompressed visual observations (i.e. 3D float arrays) are now supported. CameraSensorComponent and RenderTextureSensor now have an option to write uncompressed observations. (#3148)
- The Agent's handling of observations during training was improved so that an extra copy of the observations is no longer maintained. (#3229)
- The error message for missing trainer config files was improved to include the absolute path. (#3230)

- A bug that caused RayPerceptionSensor to behave inconsistently with transforms that have non-1 scale was fixed. (#3321)
- Some small bugfixes to tensorflow_to_barracuda.py were backported from the Barracuda release. (#3341)
- The base port in the Jupyter notebook example was updated to use the same port that the editor uses. (#3283)

### This is the first release of _Unity Package ML-Agents_.

_Short description of this release_

docs/Getting-Started.md:

# Getting Started Guide

This guide walks through the end-to-end process of opening one of the ML-Agents toolkit [example environments](Learning-Environment-Examples.md) in Unity, training an Agent in it, and embedding the trained model into the Unity environment. The toolkit includes a number of example environments that illustrate the different ways ML-Agents can be used; they can also serve as templates for new environments or as ways to test new ML algorithms. After reading this tutorial, you should be able to explore and train any of the example environments. If you are not familiar with the [Unity Engine](https://unity3d.com/unity), view our [Background: Unity](Background-Unity.md) page for helpful pointers. Additionally, if you're not familiar with machine learning, view our [Background: Machine Learning](Background-Machine-Learning.md) page for a brief overview and helpful pointers.

For this guide, we'll use the **3D Balance Ball** environment, which contains a number of agent cubes and balls (all copies of each other). Each agent cube tries to keep its ball from falling by rotating either horizontally or vertically. In this environment, an agent cube is an **Agent** that receives a reward for every step that it balances the ball and is penalized with a negative reward for dropping the ball. The goal of the training process is to have the agents learn to balance the ball on their head.
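To make the reward structure concrete, here is a simplified sketch of how such a per-step reward and drop penalty could be written in an Agent subclass. It is illustrative only, assuming a hypothetical `BalanceAgentSketch`; it is not the Ball3DAgent implementation used by the example environment.

```csharp
using MLAgents;
using UnityEngine;

public class BalanceAgentSketch : Agent
{
    public Transform ball;

    public override void OnActionReceived(float[] vectorAction)
    {
        // Rotate the cube based on the two continuous actions (details omitted).

        if (ball.localPosition.y < -0.5f)   // the ball fell off the cube
        {
            SetReward(-1f);                 // penalty for dropping the ball
            EndEpisode();
        }
        else
        {
            SetReward(0.1f);                // small reward for each balanced step
        }
    }
}
```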