* made BrainParameters a class to set default values
Modified the error message if the state is discrete
* Add discrete state support to PPO and provide discrete state example environment
* Add flexibility to continuous control as well
* Finish PPO flexible model generation implementation
* Fix formatting
* Support color observations
* Add best practices document
* bug fix for non square observations
* Update Readme.md
* Remove scipy dependency
* Add installation doc
* added broadcast to the player and heuristic brain.
Allows the python API to record actions taken along with the states and rewards
* removed the broadcast checkbox
Added a Handshake method for the communicator
The academy will try to handshake regardless of the brains present
Player and Heuristic brains will send their information through the communicator but will not receive commands
* bug fix : The environment only requests actions from external brains when unique
* added warning in case no brains are set to external
* fix on the instantiation of coreBrains,
fix on the conversion of actions to arrays in the BrainInfo received from step
* default discrete action is now 0
bug fix for discrete broadcast action (the action size should be one in Agents.cs)
modified Tennis so that the default action is no action
modified the TemplateDecision.cs to ensure non-null values are sent from Decide() and MakeMemory()
* minor fixes
* need to convert the s...
Summary of changes in setting up TF Sharp
1) Make sure to press "enter" after entering "ENABLE_TENSORFLOW" flag.
2) Save the project.
3) Check to make sure the TF asset files are installed in the project.
Greatly simplified GridWorld code. It now also only uses a visual observation rather than state vector in order to demonstrate learning purely from a visual input.
* Add support for stacking past n states to allow network to learn temporal dependencies.
* Add Banana Collector environment for demonstrating partially observable multi-agent environments.
* Add 3DBall Hard which lacks velocity information in state representation. Used as test for LSTM and state-stacking features.
* Rework Tennis environment to be continuous control and trainable in 100k steps.
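The state-stacking feature mentioned in this list can be illustrated with a small sketch. This is not the project's implementation; `StateStacker`, `state_size`, and `num_stacked` are hypothetical names, and the zero-filled initial history is an assumption made so the output size stays constant from the first step.

```python
from collections import deque
from typing import Deque, List


class StateStacker:
    """Keeps the last `num_stacked` state vectors and concatenates them."""

    def __init__(self, state_size: int, num_stacked: int) -> None:
        self.state_size = state_size
        # Assumption: history starts zero-filled so the stacked size is fixed.
        self.history: Deque[List[float]] = deque(
            ([0.0] * state_size for _ in range(num_stacked)), maxlen=num_stacked
        )

    def push(self, state: List[float]) -> List[float]:
        if len(state) != self.state_size:
            raise ValueError("unexpected state size")
        # Appending to a full deque drops the oldest entry automatically.
        self.history.append(list(state))
        # Flatten oldest-to-newest into a single vector for the network.
        return [value for past in self.history for value in past]


stacker = StateStacker(state_size=2, num_stacked=3)
stacker.push([1.0, 2.0])
stacked = stacker.push([3.0, 4.0])
```

With an environment such as 3DBall Hard, where velocity is absent from the state, consecutive positions in the stacked vector let the network infer motion.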
* [Semantics] Modified the semantics for the documentation
* [Semantics] Updated the images
* [Semantics] Made further changes to the docs based on the comments received
- Mostly ensures consistency with our other guides, in addition to including some more detail.
- Added an image to showcase the Linux Build Support for Unity.
- Updated the Installation guide to reference the Linux Build Support component.
* [Documentation] Added the On Demand Decision documentation.
* [Fixes] Corrected grammar mistakes
* [Documentation] Adding what kinds of games ODD is useful for
* [Documentation] Added the LSTM documentation
* [Documentation] Fix the line breaks
* [Documentation] Modified the doc given feedback
* [Documentation] Improvements based on PR comments
* [Documentation] Removed reference to PPO and BC
* [New Bouncer] Revamped the Bouncer to be in 3D
* [Bouncer Configuration file] Added the BouncerBrain configuration
* [Documentation] Added the Bouncer to the documentation page
* [Fixes] Fixed lines too long and the documentation typo
* Slight adjustments to bouncer environment
* Don't default to internal brain on bouncer
- Incorporated feedback provided offline
- Fixed capitalizations of Agent/agent
- Re-organized trainers and features sections (renamed files accordingly)
- Change Agent Editor (code) to ODD feature
- Added a summary and next steps section
- Cleaned up text
- Renamed file
- Updated ML-Agents-Overview to point to new file
- Updated figure to showcase the new “On Demand Decisions” checkbox text
- For docker run commands, ensured each flag is on its own line (for readability)
- Standardized capitalization for “Docker” and “image”
- Removed lingering empty line
- Minor rewordings
* [Documentation] Added description on how to add visual observations
* [Documentation] Forgot a paragraph
* [Documentation] Addressed comments
* [Documentation] Addressed comments, again
* Minor changes to ensure a common visual language.
* Agents are blue (or additionally red in competitive scenarios).
* Interactable objects are orange.
* Goals are green when objects, and checkerboards when places.
* Not everything perfectly follows this, but things are mostly consistent now.
* Renamed "Banana" folder to "BananaCollectors"
* Ensured all brains were set to "Player"
* Moved non-shared assets out of the "SharedAssets" folder.
There were some important things that should have been mentioned in this tutorial, and it took me a while to figure them out. Most importantly, it was never mentioned how to properly end a training session in the Anaconda prompt to receive an exported .bytes file.
I added more comments explaining why Python 3.5 and TensorFlow 1.4.0 are used, plus comments and links for Visual Studio 2015.
I erased the docopt installation and the part for the folders:
Had to paste this “lib” folder from C:\Program Files (x86)\Microsoft Visual Studio 14.0\lib
Into C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\PlatformSDK\lib
Because this might have been only a problem for me. Thanks,
Fixes the following issues:
* Missing component reference in BananaRL environment.
* Neural Network for multiple visual observations was not properly generated.
* Episode time-out value estimate bootstrapping used incorrect observation as input.
* [CoreBrain] Bug fix in the internal brain
Discrete vector observations did not have the right size
* [Docs] Removed all references to the unitypackages other than the TensorFlowSharp.unitypackage.
* [Basic] Updated the bytes file of basic
* [Docs] Addressed comments
* [Docs] Re-addressed the comments
* [Bug Fix] Scaling the visual input between 0 and 1
* [Comments] Added comments to the BatchVisualObservations method of the CoreInternalBrain.
* [Renaming] Renamed BlackAndWhite to blackAndWhite
This PR makes the following changes:
* Moves clipping of continuous control model into model itself. Output is now always [-1, 1].
* Internal model values are now clipped to [-3, 3] before being rescaled to [-1, 1] for output. This improves training performance by providing a wider range of values within which the pdf of the Gaussian can fall. Output of [-1, 1] is used to be more environment-creator friendly.
* Fixes issue where epsilon was erroneously being used to reconstruct old probabilities during PPO update, leading to reduced learning performance.
* Introduce ScaleAction() function within python to easily rescale values from [-1, 1] to arbitrary range.
* Re-train all CC models using improved algorithm. All performance levels are equal or improved. In the case of Crawler, improvement is drastic.
* Update documentation appropriately.
* Made miscellaneous minor code style and optimization improvements within environments.
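The clipping and rescaling described in this PR can be sketched as follows. The function names `squash_action` and `scale_action` are hypothetical stand-ins (the PR's own helper is called ScaleAction(), whose exact signature is not shown here), and a plain linear map is assumed for both steps.

```python
def squash_action(raw: float, clip: float = 3.0) -> float:
    """Clip a raw model value to [-clip, clip], then rescale it to [-1, 1]."""
    clipped = max(-clip, min(clip, raw))
    return clipped / clip


def scale_action(action: float, low: float, high: float) -> float:
    """Map an action in [-1, 1] linearly onto the range [low, high]."""
    return low + (action + 1.0) * 0.5 * (high - low)


# Example: a raw value of 6.0 saturates at 1.0, which maps to the top of the range.
torque = scale_action(squash_action(6.0), low=0.0, high=10.0)
```

Keeping the model output in [-1, 1] and rescaling on the environment side means the trained model does not need to know each environment's action range.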
* First draft of Azure support docs
* Correcting links to other docs
* Adding additional links and cleaning instructions
* Adding references to Azure docs in other appropriate places
- Indent the section about providing actions to multiple brains to be in line with the rest of the step() docs.
- Move the line about what step() returns closer to the top of the docs so it's harder to overlook.
- Add a small code snippet about how to get BrainInfo belonging to a specific brain and how to get data from that BrainInfo object.
* [Refactor] Fixed line indentation
* Removed the library Newtonsoft.Json from the monitor
* Replaced calls to JSON conversion with manual conversion
* [Modified] The Monitor now has multiple Log methods that take different object types
* some random change so that I can create this PR
* docs update for TensorFlowSharp new version
* changed the links to the new unitypackage file
* resolved conflicts, updated the pictures for CUDA 9.0
* fixed a typo
* resolved arthur's comment
* blurred the usernames
* modified the AWS doc
* resolved Vince's comment
* Adds implementation of Curiosity-driven Exploration by Self-supervised Prediction (https://arxiv.org/abs/1705.05363) to PPO trainer.
* To enable, set use_curiosity flag to true in hyperparameter file.
* Includes refactor of unitytrainers model code to accommodate new feature.
* Adds new Pyramids environment (w/ documentation). Environment contains sparse reward, and can only be solved using PPO+Curiosity.
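For readers unfamiliar with the referenced paper, the intrinsic reward is based on the forward model's prediction error in feature space. The sketch below shows only that final step; `curiosity_reward` and its `strength` parameter are hypothetical names, and the feature encoder and forward model that would produce the two vectors are assumed to exist elsewhere.

```python
from typing import Sequence


def curiosity_reward(
    predicted_next: Sequence[float],
    actual_next: Sequence[float],
    strength: float = 0.01,
) -> float:
    """Intrinsic reward: scaled half squared error between the predicted and
    the actual encoding of the next state."""
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted_next, actual_next))
    return strength * 0.5 * squared_error


# A perfectly predicted transition earns no intrinsic reward.
familiar = curiosity_reward([1.0, 2.0], [1.0, 2.0])
# A poorly predicted transition earns a larger one.
surprising = curiosity_reward([0.0, 0.0], [3.0, 4.0], strength=1.0)
```

In a sparse-reward environment such as Pyramids, this signal rewards the agent for reaching states it cannot yet predict.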
* Revamps agent code for walker and crawler environments to use shared JointDriveController system.
* Crawler has been reworked to be very cute.
* Crawler & Walker environments have been reworked to be visually consistent.
* Added Dynamic Crawler scene.
* All scenes re-trained and new models added.
* Documentation changes.
* Added missing declaration to docs sample code.
* Added pretrained model as default graph in Internal brain of Tennis scene
* Disabled PlayerBrain in Tennis by default.
* Removed accidental config.
* Update Feature-Monitor.md
Adds details on 1) how to activate the monitor and 2) how to be able to use a target transform to display the information on.
* Update Feature-Monitor.md
Updated with suggestions from reviewer.
- Dockerfile pulls in the mlagents directory now.
- Installs mlagents package locally with `pip install .`.
- Clients should now place trainer configs in unity-volume.
* [Initial Commit]
Modified the model.py file and the ppo/trainer.py file to use masked actions
* Preliminary modifications to the python side of the code to enable action masking
* Preliminary modifications to the C# side of the code to enable action masking
* Preliminary modifications to the communication side of the code to enable action masking
* Implemented action masking for BC
Note : The actions of the teacher are not masked
* More error messages for the action masking
* fix pytests
* Added Documentation
* Address comment
* Addressed Comments on docs
* Addressed second comment on docs
* Addressed comments for the python side of the code
* Created the action masker and associated unit tests
* Addressed comments on the C# side
* Addressed the comment regarding action_masking_name
* Addressed the comments
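The idea behind action masking can be sketched in a few lines: disallowed discrete actions get zero probability and the remaining probabilities are renormalized. This is an illustration under that assumption, not the code from this PR; `masked_probabilities` and its arguments are hypothetical names.

```python
import math
from typing import List, Sequence


def masked_probabilities(logits: Sequence[float], allowed: Sequence[bool]) -> List[float]:
    """Softmax over the allowed actions only; masked actions get probability 0."""
    if not any(allowed):
        raise ValueError("at least one action must be allowed")
    # Subtract the largest allowed logit for numerical stability.
    peak = max(logit for logit, ok in zip(logits, allowed) if ok)
    weights = [math.exp(logit - peak) if ok else 0.0 for logit, ok in zip(logits, allowed)]
    total = sum(weights)
    return [weight / total for weight in weights]


# The middle action is masked, so the probability mass is split between the others.
probs = masked_probabilities([0.0, 0.0, 0.0], [True, False, True])
```

Sampling from `probs` can never select a masked action, which is the guarantee an environment needs when some moves are illegal in a given state.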
- Highlighting Python snippet in FAQ.
- Fixing import in Python API doc.
- ml-agents-protobuf now automatically places generated files in
the correct directory.
* GridWorld now uses action masking
* Addressed the comments
* addressed comments
* Added checkbox to turn action masking on/off (#1146)
* Added checkbox to turn action masking on/off
* Fix to handle the no-action option
* Added comment to GridWorld mentioning the use of action masking. (#1153)
* Fixing learn.py, trainer_controller.py, and Docker
- learn.py has been moved under trainers.
- this was a two line change
- learn.py will no longer be run as a main method
- docopt arguments are strings by default. learn.py now uses
this assumption to correctly parse arguments.
- trainer_controller.py now considers the Docker volume when
accepting a trainer config file path.
- the Docker container now uses mlagents-learn.
* Removing extraneous unity-volume ref.
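Because option values arrive from docopt as strings, numeric options have to be converted explicitly. A minimal sketch of that conversion follows; the `options` dictionary stands in for docopt's parsed output, and the option names and defaults are illustrative assumptions rather than a description of learn.py.

```python
from typing import Dict, Optional


def parse_int_option(options: Dict[str, Optional[str]], name: str, default: int) -> int:
    """Convert a string-valued option to int, falling back to a default
    when the option was not supplied (value is None)."""
    raw = options.get(name)
    if raw is None:
        return default
    return int(raw)


# Stand-in for the dictionary a docopt-style parser would return.
options = {"--seed": "7", "--save-freq": None}
seed = parse_int_option(options, "--seed", default=-1)
save_freq = parse_int_option(options, "--save-freq", default=50000)
```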
* Make project version 2017.4
* updated the documentation
* added the upgrade notes for 2017.1 to 2017.4
* removed the .10f1
* fix the typo and make the language nicer
* resolved the comments
* Wrapping lines.
* Wording.
* resolved part of jeff's comment
* resolved part of jeff's comment
* fixed the link
* Update FAQ.md
Missing "an".
* Missing "an".
The calculation of observation vectors is faulty. The old calculation does not reflect distances to the edges, and its results are not confined to the range -1 to 1. Since computing distances would have been difficult in one line, I replaced it with the relative position of the ball (using only two vectors instead of four). I ran 500K-step reinforcement training runs before and after the change and got enormously improved results. Contact me for screenshots of the TensorBoard or just use the debugger and do the math.
* Initial Commit
Ported most functionalities, still need to :
- Documentation
- Add Comments
- Custom drawer for BrainParameters
- Fix the UnitTests
- Review Functionalities
* Added Custom Drawer for the Brain Parameters
* Improvements to the HubDrawer
* Modified the Brain Editors
* Minor bug fixes and UI changes
* Modified the Help Boxes of the Drawers
* Modified Brain class, renamed Initialize and made DecideAction virtual
* Fix the UnityTests
* Simpler Brain creation menu
* Renamed Internal Brain to Learning Brain
* modified the parameters to remove reference to External or Internal in the Protobuf objects
* Updated the protobuf generated files
* Fix the Pytests
* Removed the graph scope from the Learning Brain
* cleaner logic than try catch
* Removed the isExternal field of the brain and put the isTraining logic into LearningBrain and Training Hub
* Modified how the Brain finds the A...
* pull/1294 from has-taiar
* removed the left bracket
* moved the windows link position
* update the windows doc
* resolved the comments, changed the pip install . to pip install -e . , added the package explanation to the Windows installation doc
* Resolved the comments
* add the 'the'
* added faq to the aws doc
* added the link
* added some faq and updated the temp ami id
* resolved the comments, updated one of the faq along with the scriptable object update
* added one other cause raise in issues
* fixed line change
* Fix Typo #1323
* First update to the docs
* Addressed comments
* remove references to TF#
* Replaced the references to TF# with new document.
* Edited the FAQ
The check for whether an agent has fallen off the platform was using a wrong value of 1 instead of 0.
This meant that the agent immediately started in a falling state and entered a thrashing cycle of resetting itself.
* Documentation Update
* addressed comments
* new images for the recorder
* Improvements to the docs
* Address the comments
* Core_ML typo
* Updated the links to inference repo
* Put back Inference-Engine.md
* fix typos : brain
* Readd deleted file
* fix typos
* Addressed comments
* Simplified rewards and observations; Determined better settings for training within a reasonable amount of time.
* Simplified Agent rewards; Added training section that discusses hyperparameters.
* Added note about DecisionFrequency.
* Updated screenshots and a small clarification in the text.
* Tested and updated using v0.6.
* Update a couple of images, minor text edit.
* Replace with more recent training stats.
* resolve a couple of minor review comments.
* Increased the recommended batch and buffer size hyperparameter values.
* Fix 2 typos.
* Wording and filepath changes to tutorials
* Retake editor images to match v0.6
Retake editor images so that the filepaths and Brain names match what they actually are.
* Add blurb about using the --load flag in the intro guide, and typo fix.
* Add section in tutorial to create multiple area learning environment.
* Add mention of Done() method in agent design
* Add option to set gym visual observation to uint8
* Add option to flatten branched discrete actions
* Add game_over variable to gym wrapper
* Add guide on how to use Dopamine with the gym wrapper and comparisons with Baselines and PPO
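Flattening branched discrete actions amounts to enumerating every combination of branch values and giving each one a single index. The sketch below assumes that approach; `build_action_lookup` is a hypothetical helper, not the gym wrapper's actual API.

```python
from itertools import product
from typing import Dict, List, Tuple


def build_action_lookup(branch_sizes: List[int]) -> Dict[int, Tuple[int, ...]]:
    """Map each flat action index to one combination of per-branch actions."""
    combos = product(*(range(size) for size in branch_sizes))
    return {index: combo for index, combo in enumerate(combos)}


# Two branches with 2 and 3 choices flatten into 6 single discrete actions.
lookup = build_action_lookup([2, 3])
branched_action = lookup[4]
```

An algorithm that only supports a single discrete action space picks a flat index, and the wrapper translates it back to the per-branch tuple before stepping the environment.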
* Switched default Mac GFX API to Metal
* Added Barracuda pre-0.1.5
* Added basic integration with Barracuda Inference Engine
* Use predefined outputs the same way as for TF engine
* Fixed discrete action + LSTM support
* Switch Unity Mac Editor to Metal GFX API
* Fixed null model handling
* All examples converted to support Barracuda
* Added model conversion from Tensorflow to Barracuda
copied the barracuda.py file to ml-agents/mlagents/trainers
copied the tensorflow_to_barracuda.py file to ml-agents/mlagents/trainers
modified the tensorflow_to_barracuda.py file so it could be called from mlagents
modified ml-agents/mlagents/trainers/policy.py to convert the tf models to barracuda compatible .bytes file
* Added missing iOS BLAS plugin
* Added forgotten prefab changes
* Removed GLCore GFX backend for Mac, because it doesn't support Compute shaders
* Exposed GPU support for LearningBrain inference
...
* added the pypiwin32 package
* fixed the break on mac, fixed part of pytest above version 4
* added something to the windows to help unstuck people
* resolved the comment
* Added RenderTexture support for visual observations
* Cleaned up new ObservationToTexture function
* Added check for width/height of RenderTexture
* Added check to hide HelpBox unless both cameras and RenderTextures are used
* Added documentation for Visual Observations using RenderTextures
* Added GridWorldRenderTexture Example scene
* Adjusted image size of doc images
* Added GridWorld example reference
* Fixed missing reference in the GridWorldRenderTexture scene and resaved the agent prefab
* Fix prefab instantiation and render timing in GridWorldRenderTexture
* Added screenshot and reworded documentation
* Unchecked control box
* Rename renderTexture
* Make RenderTexture scene default for GridWorld
Co-authored-by: Mads Johansen <pyjamads@gmail.com>
We need to document the meaning of the two new flags added for
multi-environment training. We may also want to add more specific
instructions for people wanting to speed up training in the future.
* update title caps
* Rename Custom-Protos.md to Creating-Custom-Protobuf-Messages.md
* Updated with custom protobuf messages
* Cleanup against our doc guidelines
* Minor text revision
* Create Training-Concurrent-Unity-Instances
* Rename Training-Concurrent-Unity-Instances to Training-Concurrent-Unity-Instances.md
* update to right format for --num-envs
* added link to concurrent unity instances
* Update and rename Training-Concurrent-Unity-Instances.md to Training-Using-Concurrent-Unity-Instances.md
* Added considerations section
* Update Training-Using-Concurrent-Unity-Instances.md
* cleaned up language to match doc
* minor updates
* retroactive migration from 0.6 to 0.7
* Updated from 0.7 to 0.8 migration
* Minor typo
* minor fix
* accidentally duplicated step
* updated with new features list
* Update Learning-Environment-Create-New.md
Section : Final Editor Setup - Step 3. It says:
Drag the Brain RollerBallPlayer from the Project window to the RollerAgent Brain field.
Should say:
Drag the Brain RollerBallBrain from the Project window to the RollerAgent Brain field.
* Develop black format fix (#1998)
* fixed the format
* changed the circleci config
* [Gym] Added no_graphics argument (#1997)
> Added the no_graphics argument to the gym interface. #1413
* [Documentation] SetReward method (#1996)
Added a paragraph in the docs/Learning-Environment-Design-Agents.md document regarding the use of SetReward and how it is different from AddReward
* [Documentation] Added information for the environments the trainer cannot train with the default configurations (#1995)
* Format gym_unity using black
* Add GetTotalStepCount to the Academy
This will allow the RecordVideos plugin to record based on the current academy step
* fixup! Add GetTotalStepCount to the Academy
* Add the video recorder to the documentation
* Update Learning-Environment-Create-New.md
- Clarify that training is done in the original ml-agents project folder
- Remove mistype
- In the future it could help to show the user that they can copy the config folder and run training in a new project folder so they don't have to mix project settings in the original config folder
* Update Learning-Environment-Create-New.md
Add file paths
- Fix re-install directions to include -e modifier
- Move re-install directions from creating-custom... to protobuf readme
- Add how to see confirmation that install worked
* Create new class (RewardSignal) that represents a reward signal.
* Add value heads for each reward signal in the PPO model.
* Make summaries agnostic to the type of reward signals, and log weighted rewards per reward signal.
* Move extrinsic and curiosity rewards into this new structure.
* Allow defining multiple reward signals in YAML file. Add documentation for this new structure.
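How several reward signals might be weighted and logged separately can be sketched as below. This is an illustration only: `RewardSignalConfig`, its `strength` and `gamma` fields, and `weighted_rewards` are assumed names chosen for the example, not the classes introduced by this change.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class RewardSignalConfig:
    strength: float  # weight applied to this signal's raw reward (assumed field)
    gamma: float  # discount factor used for this signal's value estimate (assumed field)


def weighted_rewards(
    raw: Dict[str, float], signals: Dict[str, RewardSignalConfig]
) -> Dict[str, float]:
    """Scale each signal's raw reward by its strength, keeping signals separate
    so they can be logged and bootstrapped independently."""
    return {name: signals[name].strength * raw[name] for name in signals}


signals = {
    "extrinsic": RewardSignalConfig(strength=1.0, gamma=0.99),
    "curiosity": RewardSignalConfig(strength=0.01, gamma=0.99),
}
weighted = weighted_rewards({"extrinsic": 2.0, "curiosity": 5.0}, signals)
```

Keeping the per-signal values in a dictionary, rather than summing them immediately, mirrors the idea of one value head per reward signal.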
* Using-Docker.md is missing a backslash in 3DBall command
Hi,
Just a quick edit because a backslash seems to be missing from the 3DBall command example.
* Added interactive options and Tensorboard documentation for Docker training
Based on the new reward signals architecture, add BC pretrainer and GAIL for PPO. Main changes:
- A new GAILRewardSignal and GAILModel for GAIL/VAIL
- A BCModule component (not a reward signal) to do pretraining during RL
- Documentation for both of these
- Change to Demo Loader that lets you load multiple demo files in a folder
- Example Demo files for all of our tested sample environments (for future regression testing)
* Don't 0 value bootstrap for GAIL and Curiosity
* Add gradient penalties to GAN to help with stability
* Add gail_config.yaml with GAIL examples
* Cleaned up trainer_config.yaml and unnecessary gammas
* Documentation updates
* Code cleanup
* Add Sampler and SamplerManager
* Enable resampling of reset parameters during training
* Documentation for Sampler and example YAML configuration file
* Fix naming conventions for consistency
* Add generalization link to ML-Agents Overview
* Add generalization to main Readme
* Include types of samplers available for use
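Resampling reset parameters during training can be pictured as a set of named samplers that are all drawn from whenever the environment resets. The classes below are hypothetical stand-ins written for this illustration (they are not the Sampler and SamplerManager added here), and the parameter names are made up.

```python
import random
from typing import Dict


class UniformSampler:
    """Draws a value uniformly from [min_value, max_value]."""

    def __init__(self, min_value: float, max_value: float, rng: random.Random) -> None:
        self.min_value = min_value
        self.max_value = max_value
        self.rng = rng

    def sample(self) -> float:
        return self.rng.uniform(self.min_value, self.max_value)


class ResetParameterSamplers:
    """Holds one sampler per reset parameter and draws them all at once."""

    def __init__(self, samplers: Dict[str, UniformSampler]) -> None:
        self.samplers = samplers

    def sample_all(self) -> Dict[str, float]:
        return {name: sampler.sample() for name, sampler in self.samplers.items()}


rng = random.Random(0)  # seeded so a run can be repeated
manager = ResetParameterSamplers(
    {"scale": UniformSampler(0.5, 2.0, rng), "gravity": UniformSampler(7.0, 12.0, rng)}
)
new_reset_parameters = manager.sample_all()
```

Drawing fresh values at each resampling point exposes the agent to a range of environment variations, which is what the generalization feature relies on.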
* add kor ver of README.md and empty docs, images
* add Installation.md translated to korean
* Fixed main readme docs and move all the English documents in the docs folder
* modify contents of 'Installation.md' and add kr version 'Installation-Windows.md'(not completed) with related image
* completed 1st translation of 'Installation-Windows.md' and added related images for korean docs
* add kr version 'Using-Docker.md'(not completed)
* translate Training-PPO.md to Korean
* Change word about epsilon in Training-PPO.md
* Fix Training PPO about epsilon
* completed korean translation of 'Using-Docker.md'
* Training Imitation Learning translation to Korean is finished! Also, information about the translators is added
* modified all 'blogs.unity3d.com/' to 'blogs.unity3d.com/kr'
* removed all non-translated doc
* add translator information
* Included explicit version # for ZN
* added explicit version for KR docs
* minor fix in installation doc
* Consistency with numbers for reset parameters
* Removed extra verbiage. minor consistency
* minor consistency
* Cleaned up IL language
* moved parameter sampling above in list
* Cleaned up language in Env Parameter sampling
* Cleaned up migrating content
* updated consistency of Reset Parameter Sampling
* Rename Training-Generalization-Learning.md to Training-Generalization-Reinforcement-Learning-Agents.md
* Updated doc link for generalization
* Rename Training-Generalization-Reinforcement-Learning-Agents.md to Training-Generalized-Reinforcement-Learning-Agents.md
* Re-wrote the intro paragraph for generalization
* add titles, cleaned up language for reset params
* Update Training-Generalized-Reinforcement-Learning-Agents.md
* cleanup of generalization doc
* More cleanu...
* Add Soft Actor-Critic model, trainer, and policy and sac_trainer_config.yaml
* Add documentation for SAC and tweak PPO documentation to reference the new pages.
* Add tests for SAC, change simple_rl test to run both PPO and SAC.
* check using xargs
* fix broken BC link
* install npm, run precommit before unit tests
* try to install npm
* try a node image build
* add workflow
* don't use precommit on node run
* sudo make me a sandwich
* pass config arg
* revert CI order change
* retry precommit
* sudo apt-get
* sudo npm
* make sure fails on bad link
* cleanup and refix link
* relax versions, add python 3.7 to CI
* add workflows
* try parameterized circleci build, disable slow test
* fix workflow
* fix (?) pyversion
* set job name, fix pip freeze output
* test_requirements.txt
* fix install
* fix paths (again) - should use pushd popd instead
* use pushd and popd
* sort deps, restore unit test, cleanup CI
* relax versions more
* clean up versions in docs
* test older libs for 3.6, newer for 3.7
* pip: progress bar off
* fix gym-unity pip install
* try cat'ing setups for checksum
* dont use fallback (temporarily)
* dont turn off progress bar before upgrading pip
* PR feedback
* add parameter descriptions in CI config
* Initial Commit
* Remove the Academy Done flag from the protobuf definitions
* remove global_done in the environment
* Removed irrelevant unitTests
* Remove the max_step from the Academy inspector
* Removed global_done from the python scripts
* Modified and removed some tests
* This actually breaks neither curriculum nor generalization training
* Replace global_done with reserved.
Addressing Chris Elion's comment regarding the deprecation of the global_done field. We will use a reserved field to make sure the global done does not get replaced in the future causing errors.
* Removed unused fake brain
* Tested that the first call to step was the same as a reset call
* black formatting
* Added documentation changes
* Editing the migrating doc
* Addressing comments on the Migrating doc
* Addressing comments :
- Removing dead code
- Resolving forgotten merged conflicts
- Editing documentations...
* new env styles rebased on develop
* added new trained models
* renamed food collector platforms
* reduce training timescale on WallJump from 100 to 10
* uncheck academy control on walljump
* new banner image
* rename banner file
* new example env images
* add foodCollector image
* change Banana to FoodCollector and update image
* change bouncer description to include green cube
* update image
* update gridworld image
* cleanup prefab names and tags
* updated soccer env to reference purple agent instead of red
* remove unused mats
* rename files
* remove more unused tags
* update image
* change platform to agent cube
* update text. change platform to agents head
* cleanup
* cleaned up weird unused meta files
* add new wall jump nn files and rename a prefab
* walker change stacked states from 5 to 1
walker collects physics observations so stacked states are not need...
* Feature Deprecation : Online Behavioral Cloning
In this PR :
- Delete the online_bc_trainer
- Delete the tests for online bc
- delete the configuration file for online bc training
* Deleting the BCTeacherHelper.cs Script
TODO :
- Remove usages in the scene
- Documentation Edits
*DO NOT MERGE*
* IMPORTANT : REMOVED ALL IL SCENES
- Removed all the IL scenes from the Examples folder
* Removed all mentions of online BC training in the Documentation
* Made a note in the Migrating.md doc about the removal of the Online BC feature.
* Modified the Academy UI to remove the control checkbox and replaced it with a train in the editor checkbox
* Removed the Broadcast functionality from the non-Learning brains
* Bug fix
* Note that the scenes are broken since the BroadcastHub has changed
* Modified the LL-API for Python to remove the broadcasting functionality.
* All unit tests are running
* Modifie...
* Modified the scen...
* 1 to 1 Brain to Agent
This is a work in progress
In this PR :
- Deleted all Brain Objects
- Moved the BrainParameters into the Agent
- Gave the Agent a Heuristic method (see Balance Ball for example)
- Modified the Communicator and ModelRunner : Put can only take one agent at a time
- Made the IBrain Interface with RequestDecision and DecideAction methods
No changes made to Python
[Design Doc](https://docs.google.com/document/d/1hBhBxZ9lepGF4H6fc6Hu6AW7UwOmnyX3trmgI3HpOmo/edit#)
* Removing editorconfig
* Updating BalanceBall scene
* grammar mistake
* Clearing the Agents of the Model runner
* Added Documentation on IBrain
* Modified comments on GiveModel
* Introduced a factory
* Split Learning Brain in two
* Changes to walljump
* Fixing the Unit tests
* Renaming the Brain to Policy
* Heuristic now has priority over training
* Edited code comments
* Fixing bugs
* Develop one to one scene edits...
* Modifying the .proto files
* attempt 1 at refactoring Python
* works for ppo hallway
* changing the documentation
* now works with both sac and ppo both training and inference
* Need to fix the tests
* TODOs :
- Fix the demonstration recorder
- Fix the demonstration loader
- verify the intrinsic reward signals work
- Fix the tests on Python
- Fix the C# tests
* Regenerating the protos
* fix proto typo
* protos and modifying the C# demo recorder
* modified the demo loader
* Demos are loading
* IMPORTANT : THESE ARE THE FILES USED FOR CONVERSION FROM OLD TO NEW FORMAT
* Modified all the demo files
* Fixing all the tests
* fixing ci
* addressing comments
* removing reference to memories in the ll-api
* [WIP] Side Channel initial layout
* Working prototype for raw bytes
* fixing format mistake
* Added some errors and some unit tests in C#
* Added the side channel for the Engine Configuration. (#2958)
* Added the side channel for the Engine Configuration.
Note that this change does not require modifying a lot of files :
- Adding a sender in Python
- Adding a receiver in C#
- subscribe the receiver to the communicator (here is a one liner in the Academy)
- Add the side channel to the Python UnityEnvironment (not represented here)
Adding the side channel to the environment would look like this:
```python
from mlagents.envs.environment import UnityEnvironment
from mlagents.envs.side_channel.raw_bytes_channel import RawBytesChannel
from mlagents.envs.side_channel.engine_configuration_channel import EngineConfigurationChannel
channel0 = RawBytesChannel()
channel1 = EngineConfigurationChanne...
```
* Updated Brain Reference in Training ML Agents
* Removed reference to Brain object
* Update docs/Training-ML-Agents.md
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
* Update docs/Training-ML-Agents.md
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
* initial commit for LL-API
* fixing ml-agents-envs tests
* Implementing action masks
* training is fixed for 3DBall
* Tests all fixed, gym is broken and missing documentation changes
* adding case where no vector obs
* Fixed Gym
* fixing tests of float64
* fixing float64
* reverting some of brain.py
* removing old proto apis
* comment type fixes
* added properties to AgentGroupSpec and edited the notebooks.
* clearing the notebook outputs
* Update gym-unity/gym_unity/tests/test_gym.py
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
* Update gym-unity/gym_unity/tests/test_gym.py
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
* Update ml-agents-envs/mlagents/envs/base_env.py
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
* Update ml-agents-envs/mlagents/envs/base_env.py
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
* addressing first comments
* NaN checks for r...
The "num-runs" command-line option provides the ability to run multiple
identically-configured training runs in separate processes by running
mlagents-learn only once. This is a rarely used ML-Agents feature,
but it adds complexity to other parts of the system by adding the need
to support multiprocessing and managing of ports for the parallel training
runs. It also doesn't provide truly reproducible experiments, since there
is no guarantee of resource isolation between the trials.
This commit removes the --num-runs option, with the idea that users will
manage parallel or sequential runs of the same experiment themselves in the
future.
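With `--num-runs` gone, launching several identically configured runs is left to the user. The following is a minimal sketch of one way to do that from Python, assuming `mlagents-learn` is on the PATH and that each run is given its own `--run-id` and `--base-port`. The helper names, the port spacing, and the config path are illustrative assumptions, not part of ML-Agents.

```python
import subprocess
from typing import List


def build_run_commands(config_path: str, run_prefix: str, num_runs: int,
                       first_port: int = 5005, port_stride: int = 10) -> List[List[str]]:
    """Build one mlagents-learn command per run (hypothetical helper)."""
    commands = []
    for i in range(num_runs):
        commands.append([
            "mlagents-learn",
            config_path,
            f"--run-id={run_prefix}-{i}",
            # Distinct base ports keep parallel runs from colliding.
            f"--base-port={first_port + i * port_stride}",
        ])
    return commands


def launch_all(commands: List[List[str]]) -> List[int]:
    """Start every run in its own process and wait for all of them."""
    processes = [subprocess.Popen(cmd) for cmd in commands]
    return [p.wait() for p in processes]


if __name__ == "__main__":
    # Only print the commands here; call launch_all(...) to actually run them.
    for cmd in build_run_commands("config/trainer_config.yaml", "ball", 3):
        print(" ".join(cmd))
```

As the commit message notes, this gives no resource isolation between runs; that remains the user's responsibility.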
This change adds a new 'mlagents-run-experiment' endpoint which
accepts a single YAML/JSON file providing all of the information that
mlagents-learn accepts via command-line arguments and file inputs.
As part of this change the curriculum configuration is simplified to
accept only a single file for all the curricula in an environment
rather than a file for each behavior.
* Simplifying the Agent reset logic
- Agents will reset in ResetIfDone immediately after being marked Done
- Agents will always request a decision right after reset
- This change implies that additional messages might be sent to Python
* Fixing the Unit Tests
* Added a note in the Migrating.md document
* Trimming some of the methods of the agent but left SetReward
* Fixing bugs
* modifying the environments
* Reintroducing IsDone and IsMaxStepReached
* Updating the Migrating doc
* more details on the Migration
* Made the Agent reset immediately
* fixing the C# tests
* Fixing the tests still
* Trying with incremental episode ids
* deleting buffer rather than using an empty list
* Addressing the comments
* Forgot to edit the comment on AgentInfo
* Updating the migrating doc
* Fixed an obvious bug
* cleaning after an agent is done in agent processor
* Fixing the pytest errors
Convert the UnitySDK to a Packman Package.
- Separate Examples into a sample project.
- Move core UnitySDK Code into com.unity.ml-agents.
- Create asmdefs for the ml-agents package.
- Add package validation tests for win/linux/mac.
- Update protobuf generation scripts.
- Add Barracuda as a package dependency for ML-Agents. (users no longer have to install it themselves).
* Updating version number (#3366)
* updating version number
* fixing version numbers
* migration guide (#3375)
* Reduce num steps for walljump (#3377)
* Fixing the Docs on On Demand Decision
Co-authored-by: Anupam Bhatnagar <anupambhatnagar@gmail.com>
Co-authored-by: Chris Elion <celion@gmail.com>
Co-authored-by: Ervin T. <ervin@unity3d.com>
* Add the VectorSensor to the CollectObservation call
* Example of API change for BalanceBall
* Modified the Examples
* Changes to the migrating doc
* Editing the docs
* Update docs/Learning-Environment-Design-Agents.md
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
* Update docs/Migrating.md
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
* Update docs/Migrating.md
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
* Update docs/Getting-Started-with-Balance-Ball.md
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
* addressing comments
* Removed the MLAgents.Sensor namespace
* Removing the MLAgents.Sensor namespace from the tests
* Editing the migrating docs
Co-authored-by: Chris Elion <celion@gmail.com>
* Update Learning-Environment-Create-New.md (#3356)
* Update Learning-Environment-Create-New.md
In the "Final Editor Setup" , I think there should be a Step to add Decision Parameters Script and it says Decision Period from 1 to 20.
Without this there was no action taken by the RollerAgent. After adding this step it worked for me.
* Update docs/Learning-Environment-Create-New.md
Co-Authored-By: Chris Elion <celion@gmail.com>
* Update docs/Learning-Environment-Create-New.md
Co-Authored-By: Chris Elion <celion@gmail.com>
Co-authored-by: Chris Elion <celion@gmail.com>
* migration fixes
Co-authored-by: Medhavi Monish <39962268+MedhaviMonish@users.noreply.github.com>
* Sentencing Action masking the same as observations
I am rather unsure about the doubling of the CollectObservation methods (and the copy pasta that comes along)
Need to edit the documentation and the migrating doc once we agree we want to do this
* Addressing the comments
* Improvements to the documentation
* Editing the documentation
* Making Register side channel a public method on the Academy
* Adding a new tutorial on how to use side channels
* Update docs/Python-API.md
Co-Authored-By: Chris Goy <christopherg@unity3d.com>
* Add a Deregister method
* Add the deregister to the example
* Update com.unity.ml-agents/Runtime/Academy.cs
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
* Update docs/Python-API.md
Co-Authored-By: Chris Goy <christopherg@unity3d.com>
* Re-test
* Renaming Deregister to Unregister
* Update docs/Python-API.md
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
Co-authored-by: Chris Goy <christopherg@unity3d.com>
Co-authored-by: Chris Elion <celion@gmail.com>
* [Documentation] Adding a few lines to mention that ML-Agents only works with NN generated with our trainers
* Editing the Limitations doc to reflect the comments;
* Make ChannelId a property and renamed ReservedChannelId
* Changes on the Python side for consistency
* Modified the tutorial appropriately
* fixing bugs
* Update ml-agents-envs/mlagents_envs/environment.py
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
* Update com.unity.ml-agents/Runtime/Grpc/RpcCommunicator.cs
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
* Addressing comments
* Update docs/Python-API.md
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
* Added a Utils class on the side channel (#3447)
- No change in user facing API
- Simplifies the code in the side channel implementations as it makes it easier to check if a side channel id is within ranges
- No changes to tests
- No changes to Documentation
* Simplifying
* Fixing a bug
* Replace the int ChannelId with a GUID/UUID ChannelId (#3454)
* renaming channel_type to channel_id
* Making the constant GUID const...
* Added the MLAgents.Demonstrations namespace
* Added the MLAgents.Editor namespace
* Overrode the .demo.meta files due to the change in namespace
* More namespace changes
* Added the sidechannels namespace
* Modified changelog and migrating docs
* Made the BrainParameters internal
* Editing the docs
* [skip-ci] A lot more controversial
* [skip ci] Added formerly serialized as
* Use cached BehaviorParameters
* [skip ci] made the decision requester internal and renamed RepeatAction
* [skip ci] Updated the migration
* Update com.unity.ml-agents/Runtime/DecisionRequester.cs
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
* Run tests
Co-authored-by: Chris Elion <celion@gmail.com>
* Improvements to the main repo Readme: put an emphasis on the Releases section.
* Improving the installation guide.
* Added the first draft of package readme.
* 2 file renames
Installation-Windows -> Installation-Anaconda-Windows to be clearer that it’s about Anaconda on Windows and not just Windows.
index.md -> com.unity.ml-agents.md to be in line with package requirements.
* Split sidechannel docs off from python-api
* Edits of side channel docs for clarity
* Minor adjustments to naming convention for scs
* Update docs readme and add python code tag
* Fix trailing whitespace
* Update docs/Python-API.md
Co-Authored-By: Vincent-Pierre BERGES <vincentpierre@unity3d.com>
* Address comments
Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>
* [skip ci] Renamed methods in the Agent class
WARNING: a user who overrides one of the obsolete methods will see the message: Member `old method` overrides obsolete member `old method`. Add the Obsolete attribute to `old method`. The message will not suggest the new method to override.
* [skip ci] Updated the example environment
* [skip ci] Updated migrating and changelog
* [skip ci] Editing the docs
* [skip ci] Missing docs
* :+1
* Update docs/Getting-Started-with-Balance-Ball.md
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
* Update docs/Learning-Environment-Create-New.md
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
* Update docs/Learning-Environment-Create-New.md
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
* [skip ci] documentation changes
* [skip ci] Update docs/Getting-Started-with-Balance-Ball.md
* [skip ci] Update docs/Getting-Started-with-Balance-Ball.md
* [skip ci] Update docs/Gett...
* [bug-fix] Increase height of wall in CrawlerStatic (#3650)
* [bug-fix] Improve performance for PPO with continuous actions (#3662)
* Corrected a typo in a name of a function (#3670)
OnEpsiodeBegin was corrected to OnEpisodeBegin in Migrating.md document
* Add Academy.AutomaticSteppingEnabled to migration (#3666)
* Fix editor port in Dockerfile (#3674)
* Hotfix memory leak on Python (#3664)
* Hotfix memory leak on Python
* Fixing
* Fixing a bug in the heuristic policy. A decision should not be requested when the agent is done
* [bug-fix] Make Python able to deal with 0-step episodes (#3671)
* adding some comments
Co-authored-by: Ervin T <ervin@unity3d.com>
* Remove vis_encode_type from list of required (#3677)
* Update changelog (#3678)
* Shorten timeout duration for environment close (#3679)
The timeout duration for closing an environment was set to the
same duration as the timeout when waiting ...
* Merge agent & best practices doc. Plus other fixes
* Fix overly long lines
* Merge Getting Started and Basic Guides
* Rename guide and update links appropriately
* Fix broken link
The "docker target" feature and associated command-line flag
--docker-target-name were created for use with the now-deprecated
Docker setup. This feature redirects the paths used by learn.py
for the environment and config files to be based from a directory
other than the current working directory. Additionally it wrapped
the environment execution with xvfb-run.
This commit removes the "docker target" feature because:
* Renaming the paths doesn't fix any problem. Absolute paths can
already be passed for configs and environment executables.
* Use of xserver, Xvfb, or xvfb-run are independent of mlagents-learn
and can be used outside of the mlagents-learn call. Further, xvfb-run
is not the only solution for software rendering.
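The two points above (absolute paths already work, and `xvfb-run` can wrap the call from outside) can be sketched as a small command builder. This is an illustration under assumptions: the helper name is hypothetical, and it assumes `xvfb-run` and `mlagents-learn` are both installed and on the PATH.

```python
from pathlib import Path
from typing import List


def build_headless_command(config: str, env_binary: str, run_id: str) -> List[str]:
    """Wrap mlagents-learn in xvfb-run, passing absolute paths (hypothetical helper)."""
    # Resolving to absolute paths makes the command independent of the
    # working directory, which is what the removed feature tried to provide.
    config_path = str(Path(config).resolve())
    env_path = str(Path(env_binary).resolve())
    return [
        "xvfb-run", "--auto-servernum",
        "mlagents-learn", config_path,
        f"--env={env_path}",
        f"--run-id={run_id}",
    ]


if __name__ == "__main__":
    print(" ".join(build_headless_command("config.yaml", "envs/3DBall", "headless-0")))
```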
* Deprecating Academy.Instance.FloatProperties
* Made the registered side channels a static property and created the sideChannelUtils class to handle side channel stuff
* Clearing the sending message queue in the Academy when the communicator is not on
* addressing comments
* [skip ci] WIP : Modify the base_env.py file
* [skip ci] typo
* [skip ci] renamed some methods
* [skip ci] Incorporated changes from our meeting
* [skip ci] everything is broken
* [skip ci] everything is broken
* [skip ci] formatting
* Fixing the gym tests
* Fixing bug, C# has an error that needs fixing
* Fixing the test
* relaxing the threshold of 0.99 to 0.9
* fixing the C# side
* formatting
* Fixed the llapi integration test
* [Increasing steps for testing]
* Fixing the python tests
* Need __contains__ after all
* changing the max_steps in the tests
* addressing comments
* Making env_manager logic clearer as proposed in the comments
* Remove duplicated logic and added back in episode length (#3728)
* removing mentions of multi-agent in gym and changed the docstring in base_env.py
* Edited the Documentation for the changes to the LLAPI (#3733)
* Edite...
* Fix the main releases table by generating from the script.
* Minor formatting fix
* Removing references to Cloud Training as those docs are now deprecated.
* Removing the link to video tutorials as they are now not current
* Bumping version on the release (#3615)
* Update examples project to 2018.4.18f1 (#3618)
From 2018.4.14f1. An internal package dependency was updated as
a side effect.
* Remove dead components from the examples scenes (#3619) (#3624)
* Improve warnings and exception if using unsupported combo
* add meta file
* fix unit test
* enforce onnx conversion (expect tf2 CI to fail) (#3600)
* Update error message
* Updated the release branch docs (#3621)
* Updated the release branch docs
* Edited the README
* make sure top-level timer is closed before writing
* Remove space from Product Name for examples
In #2588 it was suggested that the space in the Product Name for
our example environments causes confusion when using a default build
because of the need to escape the space in the build filename.
This change removes the space from the Product Name in the project's
player settings.
* [bug-fix] Increase 3dbal...
* Removed the unused images from the images folder. Used the command
```
for f in *; do echo " file : $f" && grep -s -r "$f" /Users/vincentpierre/Documents/ml-agents -i --include *.md --exclude-dir=/Users/vincentpierre/Documents/ml-agents/docs/localized;done
```
to hunt the files down
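For readers who prefer not to rely on the shell one-liner, roughly the same check can be sketched in Python. This is an approximation under assumptions: the function name and the `exclude_dir` default are illustrative, and it matches file names case-insensitively as a plain substring, like `grep -i` does.

```python
from pathlib import Path
from typing import List


def find_unused_images(images_dir: Path, docs_root: Path,
                       exclude_dir: str = "localized") -> List[str]:
    """Return image file names never mentioned in any Markdown file under docs_root."""
    pages = []
    for md_file in docs_root.rglob("*.md"):
        # Skip translated docs, mirroring --exclude-dir in the grep command.
        if exclude_dir in md_file.relative_to(docs_root).parts:
            continue
        pages.append(md_file.read_text(encoding="utf-8", errors="ignore").lower())
    corpus = "\n".join(pages)
    return sorted(
        image.name
        for image in images_dir.iterdir()
        if image.is_file() and image.name.lower() not in corpus
    )
```

Because the match is a substring test, an image whose name is contained in another image's name can be reported as used when it is not; a manual check of the result is still worthwhile.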
* Modified the Unity Editor screenshots
* Addressing comments
* Removed the obsolete methods from the Agent class
* Documentation changes
* [skip ci] Update com.unity.ml-agents/CHANGELOG.md
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
* [skip ci] Update docs/Migrating.md
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
Co-authored-by: Chris Elion <chris.elion@unity3d.com>
Merged the "Overview" sections of a few pages into their respective sections in ML-Agents-Overview:
- Training-Using-Concurrent-Unity-Instances.md
- Training-Self-Play.md
- Training-SAC.md
- Training-PPO.md
- Training-Imitation-Learning.md
- Training-Environment-Parameter-Randomization.md
- Training-Curriculum-Learning.md
- Reward-Signals.md
- Feature-Monitor.md
- Feature-Memory.md
Organized ML-Agents-Overview into Training Methods and Training Options sections.
Follow-up action items (part of a separate PR):
- Smooth over the documentation in ML-Agents-Overview (right now, we somewhat just pasted text from other pages). If we align on the new structure for this page, we can iterate on it.
- Update “Key Components” section with new graph and discuss side channels and revise use of Academy.
- Consolidate “Training-*” docs into Training-ML-Agents to offer a single guide for all hyperparameter selection
* Various doc improvements
For Using-Virtual-Environment.md:
- Made a note regarding updating setuptools and pip.
- Changed lists from "-" to "*"
For Using-Tensorboard.md:
- Changed the ordered list to use "1."
For Training-on-Microsoft-Azure-Custom-Instance.md:
- Deleted as it was not linked anywhere
For FAQ.md
- Removed stale issues given upgrade to 2018.3
For Readme.md
- Added links for Reward Signals, Self-Play and Profiling Trainers
For Learning-Environment-Executable.md
- Changed the ordered list to use "1."
For Learning-Environment-Examples.md
- Minor rewording of intro paragraphs
* consolidating custom instances page in main page
So we have a single page for Azure.
Adding warning note for deprecated docs
* Fixing doc links that are failing CI
* Improvements to Training-ML-Agents
- Removed duplicate documentation
- Moved CLI descriptions to learn.py
- Reorganized "Training with mlagents-learn" into 5 sub-sections
* fixed formatting errors and incorporated minor feedback
* minor improvement
* Minor formatting.
* fixed run-id references
* Keeping link to use Inference consistent with master
Will update the UIE page in a separate PR.
* Squashed commit of the following:
commit 9600d0fbe6684eca69fb5bab84ab0f6754fc8b0f
Author: Marwan Mattar <marwan@unity3d.com>
Date: Tue Apr 14 17:45:33 2020 -0700
Various doc improvements (#3775)
* Various doc improvements
For Using-Virtual-Environment.md:
- Made a note regarding updating setuptools and pip.
- Changed lists from "-" to "*"
For Using-Tensorboard.md:
- Changed the ordered list to use "1."
For Training-on-Microsoft-Azure-Custom-Instance.md:
- Deleted ...
* Improvements to Getting Started guide
- Changed the ordered list to use "1."
- Trimmed down text
- Removed references to Agent APIs
* Incorporating feedback
* Prettier formatting
* Running prettier formatting as a follow-up PR to #3775
Running the files (except Training-ML-Agents.md) through the `prettier` linter.
https://github.com/Unity-Technologies/ml-agents/pull/3775/files
* minor fixes
Changed a header from “Custom Metrics from C#” to “Custom Metrics from Unity”
Fixed formatting in FAQ
* Minor correction.
* Improvements to Learning-Environment-Create-New.md
- Changed the ordered list to use "1."
- Trimmed down text
- Removed reference to materials as those are in the Example Envs project
* Incorporated PR feedback + new images.
* factor in feedback
removed unnecessary configs
updated the agent image
* Formatting fix
* Making Gym a wrapper
* Readding no graphics to the run gym test
* typo
* Modifying the changelog and the migrating doc
* Applying pre-commit
* [skip ci] Update gym-unity/gym_unity/tests/test_gym.py
Co-Authored-By: Chris Elion <chris.elion@unity3d.com>
* Adding a note that the BaseEnv will close when the wrapper closes
* FoRgOt To rUn PrE-ComMiT
Co-authored-by: Chris Elion <chris.elion@unity3d.com>
* Make EnvironmentParameters a first-class citizen in the API
Missing: Python counterparts and testing.
* Minor comment fix to Engine Parameters
* A second minor fix.
* Make EngineConfigChannel Internal and add a singleton/sealed accessor
* Make StatsSideChannel Internal and add a singleton/sealed accessor
* Changes to SideChannelUtils
- Disallow two sidechannels of the same type to be added
- Remove GetSideChannels that return a list as that is now unnecessary
- Make most methods except (register/unregister) internal to limit users impacting the “system-level” side channels
- Add an improved comment to SideChannel.cs
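The registration rules listed above (one side channel per type, with register/unregister as the only public operations) can be illustrated with a small Python sketch. Every class name here is hypothetical and stands in for the C# types; this is not the ML-Agents implementation.

```python
import uuid
from typing import Dict


class ToySideChannel:
    """Hypothetical stand-in for a side channel, identified by a UUID."""

    def __init__(self, channel_id: uuid.UUID) -> None:
        self.channel_id = channel_id


class ToyEngineChannel(ToySideChannel):
    """Hypothetical concrete channel type."""


class ToySideChannelRegistry:
    """Keeps at most one channel per id and per concrete type."""

    def __init__(self) -> None:
        self._channels: Dict[uuid.UUID, ToySideChannel] = {}

    def register(self, channel: ToySideChannel) -> None:
        if channel.channel_id in self._channels:
            raise ValueError(f"Channel id {channel.channel_id} is already registered")
        # Disallow two side channels of the same type.
        if any(type(existing) is type(channel) for existing in self._channels.values()):
            raise ValueError(f"A {type(channel).__name__} is already registered")
        self._channels[channel.channel_id] = channel

    def unregister(self, channel: ToySideChannel) -> None:
        # Unregistering an unknown channel is a no-op in this sketch.
        self._channels.pop(channel.channel_id, None)

    def __len__(self) -> int:
        return len(self._channels)
```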
* Added Dispose methods to system-level sidechannel wrappers
- Specifically to StatsRecorder, EnvironmentParameters and EngineParameters.
- Updated Academy.Dispose to take advantage of these.
- Updated Editor tests to cover all three “system-level” side channels.
Kudos to Unit Tests (TestAcade...
* use py3.8, install cmake for onnx
* disable other circleci tests for now
* install proto for onnx install
* apt-get install protobuf-compiler instead
* skip onnx for python3.8
* use right comparison
* fix up config
* changelog and faq
* Improvements to Key Components section of ML-Agents Overview
- Moved some documentation from Learning-Environment-Design.
- Added the trainers vs LL-API separation.
- Made a note about gym-unity.
- Some update to the Agent/Behavior sections
- Updated diagrams to reflect new side channels. Made Behavior type a consistent color.
* Reorganizing the overview file and creating new (empty) sections
This change defines the new structure for the overview doc. Subsequent commits will fill in the sections and rewrite existing sections.
* Reorganizing the main Training ML-Agents page
Re-organizes into feature-specific sections that somewhat mirror the previous commit of reorganizing the overview doc.
Subsequent commits will populate these empty sections.
* Adding Deep RL
- Update ML-Agents-Overview with description of DeepRL training algorithms
- Describe the common and trainer-specific hyperparams in Training-ML-Agents.
- Removed ...
* Several, small documentation improvements
- Re-organize main repo README
- Minor clean-ups to Python package-specific readme files
- Clean-up to Unity Inference Engine page
- Update to the docs README
- Added a specific cross-platform section in ML-Agents Overview to amplify Barracuda
- Updated the links in Limitations.md to point to the specific subsections
- Cleaned up the Designing a Learning Environment page. Added an intro paragraph.
- Updated the installation guide to specifically call out local installation
- A few minor formatting, spelling errors fixed.
* [bug-fix] Fix issue with initialize not resetting step count (#3962)
* Develop better error message for #3953 (#3963)
* Making the error for wrong number of agents raise consistently
* Better error message for inputs of wrong dimensions
* Fix #3932, stop the editor from going into a loop when a prefab is selected. (#3949)
* Minor doc updates to release
* add unit tests and fix exceptions (#3930)
Co-authored-by: Ervin T <ervin@unity3d.com>
Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>
Co-authored-by: Chris Goy <christopherg@unity3d.com>
* Add v1.0 blog post and update reference paper. (#3947)
* Develop mm fix readme releases (#3966)
* Fix broken link and clean-up Releases section.
* Updated link to be consistent with the table.
* Update one of the bullets for consistency.
* update table, add Versioning doc
* release_2_docs
Co-authored-by: Marwan Mattar <marwan@unity3d.com>
mlagents.trainers.exception.UnityTrainerException: The hyper-parameter memory_size could not be found for the <class 'mlagents.trainers.ppo.trainer.PPOTrainer'> trainer of brain RollerBall.
* Replaced get_behavior_names and get_behavior_spec with behavior_specs property
* Fixing the test
* [ci]
* addressing some comments
* use typing.Mapping (#3948)
* Update ml-agents-envs/mlagents_envs/base_env.py
Co-authored-by: Chris Elion <chris.elion@unity3d.com>
* Adding the documentation
Co-authored-by: Chris Elion <chris.elion@unity3d.com>
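To show how a single read-only mapping property can stand in for a pair of getter methods, here is a minimal sketch. `ToyEnv` and its tuple-valued specs are hypothetical placeholders; the real environment class and spec type are not reproduced here.

```python
from types import MappingProxyType
from typing import Dict, Mapping, Tuple


class ToyEnv:
    """Hypothetical environment exposing its specs as a read-only mapping."""

    def __init__(self) -> None:
        # Placeholder spec: just an observation shape per behavior name.
        self._specs: Dict[str, Tuple[int, ...]] = {"3DBall": (8,)}

    @property
    def behavior_specs(self) -> Mapping[str, Tuple[int, ...]]:
        # MappingProxyType gives callers a live, non-mutable view.
        return MappingProxyType(self._specs)


env = ToyEnv()
names = list(env.behavior_specs)      # covers the old "get names" use case
spec = env.behavior_specs["3DBall"]   # covers the old "get spec" use case
```

Returning a `Mapping` rather than a `dict` signals that callers should only read from it, which fits the `typing.Mapping` change mentioned in the list above.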
* [WIP] Unity Environment Registry
[JIRA ticket](https://jira.unity3d.com/browse/MLA-997)
[Design Document](https://docs.google.com/document/d/1bFQ3_oXsA80FMou8kwqYxC53kqG5L3i0mbTQUH4shY4/edit#)
In This PR : Prototype of the Unity Environment Registry
Uploaded the 3DBall and Basic Environments for mac only
How to use on Python :
```python
from mlagents_envs.registry import UnityEnvRegistry
registry = UnityEnvRegistry()
print(registry["3DBall"].description)
env = registry["3DBall"].make()
env.reset()
for i in range(10):
    print(i)
    env.step()
env.close()
```
* Other approach:
- UnityEnvRegistry is no longer static and needs to be instantiated
- Providing a default_registry that will contain our environments
- Added a functionality to register RemoteRegistryEntry with a yaml file
* Some extra verification of the url : The binary will have a hash of the url in its name to make sure the right environ...
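One way to put a hash of the URL into a binary's name is sketched below. The choice of SHA-256, the 16-character truncation, and the function name are assumptions made for illustration; the commit does not say which hash or naming scheme is used.

```python
import hashlib


def cached_binary_name(name: str, url: str) -> str:
    """Derive a local name that changes whenever the download URL changes."""
    # Assumption: SHA-256 truncated to 16 hex characters; any stable hash would do.
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
    return f"{name}-{digest}"
```

With such a scheme, a cached download made from an older URL is never mistaken for the binary the current URL points to.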
* allow vector observations also when using visual observations
* update changelog
* Update CHANGELOG.md
* Update __init__.py
* remove trailing whitespace
* Fix test case where visual and vector observations are used simultaneously
* fix formatting
* add test for visual and vector observations
* Assert vector action shape
* Fix test environment to return multiple visual observations
* use_visual and allow_multiple_visual_obs are replaced by allow_multiple_obs which allows visual and vector observations to be used simultaneously.
* fixing run_gym.py test
* [ci]
* Added some more tests and made the observation space a tuple when using multiple observations
* Modifying the change log
* Adding to the Migrating doc
* Edits to Migrating.md
* Simplification of the code to generate the observation spaces
* Simplified warning messages
* Adding contr...
* update versions for patch release (#3970)
* update versions for patch release
* Update precommit flake8 (#3961)
* fix changelog
* Release 2 cherry pick (#3971)
* [bug-fix] Fix issue with initialize not resetting step count (#3962)
* Develop better error message for #3953 (#3963)
* Making the error for wrong number of agents raise consistently
* Better error message for inputs of wrong dimensions
* Fix #3932, stop the editor from going into a loop when a prefab is selected. (#3949)
* Minor doc updates to release
* add unit tests and fix exceptions (#3930)
Co-authored-by: Ervin T <ervin@unity3d.com>
Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>
Co-authored-by: Chris Goy <christopherg@unity3d.com>
* update changelog (#3975)
* [docs] Add memory_size hyperparameter (#3973)
* Release 2 docs (#3976)
* Add v1.0 blog post and update reference paper. (#3947)
* Develop mm fix readme rel...
* Update Installation.md
Clarified the implications for new users of NOT cloning the repo during installation
* Update Installation.md
* Removed trailing whitespace
* Added Microsoft cpp dependency for windows users in faq (https://github.com/Unity-Technologies/ml-agents/pull/4033)
* Update docs/FAQ.md
Co-authored-by: Chris Elion <celion@gmail.com>
* Update docs/FAQ.md
Co-authored-by: Chris Elion <celion@gmail.com>
Co-authored-by: Chris Elion <celion@gmail.com>
* about to implement orientation cube
* oCube spawning works. ready to train
* working. about to try com
* ready for training
* add random rot on episode start
* feet now alternate but runs backwards
* still running with right leg in front
* increased joint strength to 40k
* removed texture example
* reduced maxAngVel, enabled enhanced determinism, cont spec
* rebuilt walker ragdoll to scale 1
* rebuilt ragdoll ready
* update walker pair prefab
* fixed bp hierarchy
* added trained model, renamed scene, usecollisioncallbacks
* updated dynamic platforms
* added dynamic walker tf file. max speed 5
* DynamicWalker working. has working nn file
* collect local rotations
* added new dynamic nn file
* hip facing reward
* Create WalkerDynamic.yaml
* fix hip rotation
* about to clean up code
* added dirIndicator and orentCubeGizmo
* clean up
* clea...
* [bug-fix] Fix regression in --initialize-from feature (#4086)
* Fixed text in GettingStarted page specifying the logdir for tensorboard. Previously it pointed to a `summaries` directory, which no longer exists. Results are now saved to the `results` dir. (#4085)
* [refactor] Remove nonfunctional `output_path` option from TrainerSettings (#4087)
* Reverting bug introduced in #4071 (#4101)
Co-authored-by: Scott <Scott.m.jordan91@gmail.com>
Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>
* Clarification in the Heuristic() documentation
The `Heuristic()` method will not be able to write to the action array if the action array passed as argument is reassigned in the method.
For example, doing :
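```csharp
public override void Heuristic(float[] actionsOut)
{
    // Reassigning the parameter only rebinds the local reference;
    // the caller's array is left untouched.
    actionsOut = new float[2];
    actionsOut[0] = 1.0f;
}
```
will not create the action [1, 0] but [0, 0], as the `actionsOut` variable was reassigned.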
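The same pitfall exists in any language that passes array references by value. A small Python analogue, with hypothetical function names, contrasts reassigning the parameter with writing into the buffer that was passed in:

```python
from typing import List


def heuristic_reassign(actions_out: List[float]) -> None:
    # Rebinds the local name to a new list; the caller's list is untouched.
    actions_out = [0.0, 0.0]
    actions_out[0] = 1.0


def heuristic_in_place(actions_out: List[float]) -> None:
    # Writes into the list the caller passed in.
    actions_out[0] = 1.0


buffer_a = [0.0, 0.0]
heuristic_reassign(buffer_a)

buffer_b = [0.0, 0.0]
heuristic_in_place(buffer_b)

print(buffer_a, buffer_b)  # prints [0.0, 0.0] [1.0, 0.0]
```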
* adding to the Agent xml doc
* Updated the badge’s link to point to the newest doc version
* Replaced all of the doc to release_3_doc
* Update to release_3 in installation.md (#4144)
Co-authored-by: Yuan Gao <xiaomaogy88@gmail.com>
Co-authored-by: andrewcoh <54679309+andrewcoh@users.noreply.github.com>
* about to implement orientation cube
* oCube spawning works. ready to train
* working. about to try com
* ready for training
* add random rot on episode start
* feet now alternate but runs backwards
* still running with right leg in front
* increased joint strength to 40k
* removed texture example
* reduced maxAngVel, enabled enhanced determinism, cont spec
* rebuilt walker ragdoll to scale 1
* rebuilt ragdoll ready
* update walker pair prefab
* fixed bp hierarchy
* added trained model, renamed scene, usecollisioncallbacks
* updated dynamic platforms
* added dynamic walker tf file. max speed 5
* DynamicWalker working. has working nn file
* collect local rotations
* added new dynamic nn file
* hip facing reward
* Create WalkerDynamic.yaml
* fix hip rotation
* about to clean up code
* added dirIndicator and orentCubeGizmo
* clean up
* cleanup
* up...
* Update Dockerfile
* Separate send environment data from reset (#4128)
* Fixed a typo on ML-Agents-Overview.md (#4130)
Removed a redundant "to" from the sentence, since it is probably a typo in the document.
* Updated the badge’s link to point to the newest doc version
* Replaced all of the doc to release_3_doc
* Fix 3DBall and 3DBallHard SAC regressions (#4132)
* Move memory validation to settings
* Update docs
* Add settings test
* Update to release_3 in installation.md (#4144)
* rename to SideChannelManager +backcompat (#4137)
* Remove comment about logo with --help (#4148)
* [bugfix] Make FoodCollector heuristic playable (#4147)
* Make FoodCollector heuristic playable
* Update changelog
* script to check for old release links and references (#4153)
* Remove package validation suite from Project (#4146)
* RayPerceptionSensor: handle empty and invalid tags (#4155...
* note about required Windows Python x86-64
Co-authored-by: Arthur Juliani <awjuliani@gmail.com>
Co-authored-by: andrewcoh <54679309+andrewcoh@users.noreply.github.com>
* doc updates
getting started page now uses consistent run-id
re-order create-new docs to have less back/forth between unity and text editor
* add link explaining decisions where we tell the reader to modify its parameter
* Introduced the Constant Parameter Sampler that will be useful later as samplers and floats can be used interchangeably
* Refactored the settings.py to reflect the new format of the config.yaml
* First working version
* Added the unit tests
* Update to Upgrade for Updates
* fixing the tests
* Upgraded the config files
* Fixes
* Additional error catching
* addressing some comments
* Making the code nicer with cattr
* Added and registered an unstructure hook for ParameterRandomization
* Updating C# Walljump
* Adding comments
* Add test for settings export (#4164)
* Add test for settings export
* Update ml-agents/mlagents/trainers/tests/test_settings.py
Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>
Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>
* Including environment parameters for the test for settings export
* First documentation up...
* Update versions for release 4
* Link validation file should ignore itself
* Remove 'unreleased' section from changelog
* Change to 0.18.0 for python versions
* also update extensions package version
Co-authored-by: Chris Elion <chris.elion@unity3d.com>
This change adds an export to .nn for each checkpoint generated by
RLTrainer and adds a NNCheckpointManager to track the generated
checkpoints and final model in training_status.json.
Co-authored-by: Jonathan Harper <jharper+moar@unity3d.com>
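A rough sketch of what tracking checkpoints in a status file could look like follows. The class name, method names, and JSON layout are all assumptions made for illustration; they are not the actual `NNCheckpointManager` API or the real `training_status.json` schema.

```python
import json
from pathlib import Path
from typing import Any, Dict


class ToyCheckpointTracker:
    """Hypothetical tracker that records checkpoints per behavior in a JSON file."""

    def __init__(self, status_file: Path) -> None:
        self.status_file = status_file
        self.status: Dict[str, Dict[str, Any]] = {}

    def _entry(self, behavior: str) -> Dict[str, Any]:
        return self.status.setdefault(
            behavior, {"checkpoints": [], "final_checkpoint": None}
        )

    def add_checkpoint(self, behavior: str, steps: int, file_path: str) -> None:
        # Each intermediate export is appended to the behavior's history.
        self._entry(behavior)["checkpoints"].append(
            {"steps": steps, "file_path": file_path}
        )
        self._write()

    def set_final_model(self, behavior: str, steps: int, file_path: str) -> None:
        self._entry(behavior)["final_checkpoint"] = {
            "steps": steps, "file_path": file_path
        }
        self._write()

    def _write(self) -> None:
        # Rewrite the whole file so it always reflects the latest state.
        self.status_file.write_text(json.dumps(self.status, indent=2))
```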
* [docs] buffer_size parameter clarification
It was not fully clear that it has a different behavior for PPO and SAC. The docs update should improve the understanding.
* [docs] updated buffer_size parameter clarification
Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>
Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>
* Update Using-Tensorboard.md
"--logdir=results" is broken in newer versions of TensorBoard; "--logdir results" without the equal sign works. See https://github.com/tensorflow/tensorboard/issues/686
* Removing equal sign from tensorboard command line params in docs
Co-authored-by: Nancy Iskander <nancyiskanderonline@gmail.com>
* init
* Add reward manager and hurryUpReward
* fix hurry reward/ add awful first training
* Turn off head height and hurry rew
* changed max speed to 15. added small hh rew
* add NaN check for reward manager. start vel penalty
* add bpVel pen
* add new BPVelPen nn file
* remove outdated nn file
* add randomize speed bool
* try reward product
* change coeff to 1
* try avg vel of all bp for reward
* move outside loop
* try linear inverselerp for vel
* add avg rew matchspeed15 nn file. looks much better
* save scene
* no hand penalty, random walk speed
* fix inverse lerp
* try new reward falloff
* cleanup
* added new nn file. don't allow hand contact
* update obsv
* remove hh rew. add trained no-hh model
* add new nn file
* new curve
* add new models. try no reset
* add hh rew
* clamp hh
* zero rewards if ground contact
* switch to approved with movi...
* Added link to training configuration file
Realized the configuration file was not linked on this page
* added clarity on checkpoints saving .nn
Updated the doc to include a note about saving .nn files
* Update docs/Training-Configuration-File.md
Co-authored-by: Chris Elion <chris.elion@unity3d.com>
* initial commit
* works with Pyramids
* added unit tests and a separate config file
* Adding first batch of documentation
* adding in the docs that rnd is only for PyTorch
* adding newline at the end of the config files
* adding some docs
* Code comments
* no normalization of the reward
* Fixing the tests
* [skip ci]
* [skip ci] Make sure RND will only work for Torch by editing the config file
* [skip ci] Additional information in the Documentation
* Remove the _has_updated_once flag
* Moved components to the tf folder and moved the TrainerFactory to the `trainer` folder
* Addressing comments
* Editing the migrating doc
* fixing test
VisualFoodCollector is now an example environment that uses a mix of visual and vector observations and can train with the default config file.
Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>
* Torch setup.py
* Set torch to default
* Make torch default in setup.py
* Remove indents
* Remove other instances of TF being used
* Add tensorboard to setup.py
* Adding correct setup commands for verifying torch is installed (#4524)
* Adding correct setup commands for verifying torch is installed
* Editing the test_requirements to add tf and remove torch
* Develop torchdefault raise outside setup (#4530)
* Torch not imported error to raise at first usage
* [refactor] Use PyTorch TensorBoard utils (#4518)
* Convert stats writer to use PyTorch TB support
* Use common function to print params
* Update test
* Bump tensorboard to 1.15 to fix the tests
* putting tensorboard 1.15.0 as min version requirement
Co-authored-by: vincentpierre <vincentpierre@unity3d.com>
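The "raise at first usage" behavior mentioned above can be sketched with a placeholder object that defers the import error until the module is actually touched. This is a generic pattern, not the ML-Agents implementation; `optional_import` and `_MissingModule` are hypothetical names introduced for this example.

```python
import importlib


class _MissingModule:
    """Stand-in for a module that failed to import; raises on first use."""

    def __init__(self, name, hint=""):
        self._name = name
        self._hint = hint

    def __getattr__(self, attr):
        # Only reached for attributes not set in __init__, i.e. real usage.
        raise ImportError(
            f"{self._name} is required to use '{attr}' but is not installed. "
            f"{self._hint}"
        )


def optional_import(name, hint=""):
    """Return the module if available, else a placeholder that fails lazily."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return _MissingModule(name, hint)
```

With this pattern, importing the package succeeds even when the dependency is absent, and the user sees an install hint only when a code path that needs the dependency runs.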
* [Docs] Initial documentation changes for making...
* Use float64 in GAIL tests
* Use float32 when converting np arrays by default
* Enforce torch 1.7.x or below
* Add comment about Windows install
* Adjust tests
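A version cap such as "torch 1.7.x or below" can be checked at runtime with a comparison on the major and minor components. The sketch below is an assumption about how such a check might be written, not the mechanism the commit actually uses; the function name and defaults are hypothetical.

```python
def torch_version_allowed(version, max_major=1, max_minor=7):
    """Return True if `version` is at or below max_major.max_minor.x."""
    # Strip local build tags such as "+cu110" before parsing.
    release = version.split("+", 1)[0]
    major, minor = (int(part) for part in release.split(".")[:2])
    return (major, minor) <= (max_major, max_minor)
```

Comparing integer tuples rather than strings matters here: a string comparison would wrongly rank "1.10" below "1.7".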
* updated image in learning envs examples
* add link to learning example to match-3
* cleaned up headings
* removed anchor
* Update Match3.md
* Delete match3.png
* added new match3 image
* updated match3 image link
- Actuators can now optionally implement IHeuristicProvider to generate heuristic actions for agents.
Co-authored-by: Chris Elion <chris.elion@unity3d.com>
* Add CreateActuators method to the ActuatorComponent class which wraps the original method. The original method will be removed in the future.
Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>
- updated release tag validation script to automate the updating of files with release tags that need to be changed as part of the pre-commit operation.
* Removing Obsolete methods from the package
* Missing deprecation and modified changelog
* Re-adding the obsolete BrainParameter methods, will need a larger discussion on these
* Removing Action Masker, re-adding the warning when using a non-implemented Heuristic, Removing NumAction from Brain Parameters
* removing documentation and some calls to deprecated methods in the extensions package
* Editing the Changelog to put the unreleased on top
* Removing some scenes: all the Static and all the non-variable-speed environments. Also removed Bouncer, PushBlock, WallJump and Reacher. Removed a bunch of visual environments as well. Removed 3DBallHard and FoodCollector (kept Visual and Grid FoodCollector)
* readding 3DBallHard
* readding pushblock and walljump
* Removing tennis
* removing mentions of removed environments
* removing unused images
* Renaming Crawler demos
* renaming some demo files
* removing and modifying some config files
* new examples image?
* removing Bouncer from build list
* replacing the Bouncer environment with Match3 for llapi tests
* Typo in yamato test
* Add pushblock collab
* Make SimpleMultiAgentGroup public
* Remove GoalDetectTrigger
* Remove GDT meta file
* Remove some comments
* Add training configuration
* Rename behavior
* Add to docs
* Change the reward structure in docs
* Add back GoalDetectTrigger
Co-authored-by: HH <brandonh@unity3d.com>
* Integrate Group Manager to soccer/retrain with POCA (#5115)
* Add Soccer env to changelog
Co-authored-by: andrewcoh <54679309+andrewcoh@users.noreply.github.com>
* Updated Learning-Environment-Create-New.md with a section on parallel unity instances.
* Added trailing whitespace to Learning Environment Create New md file.
* Added trailing whitespace to Learning Environment Create New md file after fixes.
* Minor updates.
* Whitespace fixes.
* Added the goal-conditioned GridWorld to replace the regular GridWorld
* adding missing files
* Code improvements
* Documentation change on gridworld
* resolving conflicts
* new model
* Addressing comments
* comments and renames
* Update docs/Learning-Environment-Examples.md
Co-authored-by: Ervin T. <ervin@unity3d.com>
* adding reference to gridworld in docs about goal signal
Co-authored-by: Chris Elion <chris.elion@unity3d.com>
Co-authored-by: Ervin T. <ervin@unity3d.com>
* LSTM models from 1.x will be incompatible with MLA 2.x
* Adding a test and a new v2 model
* Make the Model Runner raise an error if using 1.0 model with LSTM
* adding a new model for hallway trained with 2.0
* reword error messages
* Only raise if error, not if warning
* Addressing comments: The legacy Barracuda memory generator and applier were removed. All code that checked for (memories + v1.X) has been removed since these will no longer be supported
* Modifying the changelog and the migrating guide with this change
* Fixing the merge issues
Co-authored-by: Chris Elion <chris.elion@unity3d.com>
* initial commit for a fully connected visual encoder
* adding a test
* addressing comments
* Fixing error with minimal size of fully connected network
* adding documentation and changelog
* Python Low Level API Documentation
Added Python Low Level API Documentation in addition to How to Use document. Added link to API documentation in How to Use document.
* Fixed pre-commit issues with docstrings in mlagents-env base_env
* Added local precommit hook to autogenerate markdown documentation using pydoc-markdown
* Updated github precommit workflow to install pydoc-markdown
* Updated github precommit workflow to fix pydoc-markdown install order.
* Some refactoring and docstring updates.
* Removed modules from doc generation as per https://github.com/Unity-Technologies/ml-agents/pull/5325#discussion_r632838268
* Some edits to the documentation (#5369)
* Some edits to the documentation
* fix precommit
* Update ml-agents-envs/mlagents_envs/base_env.py
* regenerating markdown
* Added fixed version to pydoc-markdown precommit install.
* Updated docs readme to add link to new Python API document...
* [WIP] [Fix] Fixing collect observation called on done
* Update com.unity.ml-agents/Runtime/Agent.cs
* ⚠️ Modifying the test of stacking sensor when the agent is done
* modifying the documentation for BufferSensor to specify that AddObservation should be called in the CollectObservations method