GitHub
6a81a2f4
Add Soft Actor-Critic as trainer option ( #2341 )
* Add Soft Actor-Critic model, trainer, and policy and sac_trainer_config.yaml
* Add documentation for SAC and tweak PPO documentation to reference the new pages.
* Add tests for SAC, change simple_rl test to run both PPO and SAC.
5 years ago
GitHub
3df585d9
Fix issue where SAC encoder type is always simple ( #2548 )
5 years ago
GitHub
3683cc1c
Enable learning rate decay to be disabled ( #2567 )
5 years ago
GitHub
832e4a47
Normalize observations when adding experiences ( #2556 )
* Normalize observations when adding experiences
This change moves normalization of vector observations into the trainer's
"add_experiences" interface.
Prior to this change, normalization occurred at inference time. This
was somewhat confusing since usually executing a forward pass shouldn't
have side-effects which would change the training step. Also, in an
asynchronous or distributed setting where we copy the neural network
weights from a trainer to a remote actor / inference worker we'd end up
with training issues because of the weights being different on the trainer
than the workers.
5 years ago
GitHub
67d754c5
Fix flake8 import warnings ( #2584 )
We have been ignoring unused imports and star imports via flake8. These are
both bad practice and grow over time without automated checking. This
commit attempts to fix all existing import errors and add back the corresponding
flake8 checks.
5 years ago
GitHub
cb144f20
small mypy cleanup ( #2637 )
* small mypy cleanup
* sac cleanup
* types for ppo policy init
5 years ago
Jonathan Harper
3fc14963
EXPERIMENTAL horovod support
5 years ago
Jonathan Harper
47893e9c
minor tweaks
5 years ago
GitHub
8e931d8d
Merge branch 'develop' into release-0.10.0
5 years ago
Anupam Bhatnagar
cc208c00
resolving conflicts
5 years ago
Ervin Teng
35669d27
Fix SAC + LSTM Barracuda inference ( #2698 )
5 years ago
Chris Elion
43e23941
rough pass at tf2 support, needs cleanup
5 years ago
Ervin Teng
024e3677
small mypy cleanup ( #2637 )
* small mypy cleanup
* sac cleanup
* types for ppo policy init
5 years ago
Chris Elion
806c77e4
centralize tensorflow imports
5 years ago
Chris Elion
8da16bdb
move compat functions
5 years ago
GitHub
f22c41db
Merge pull request #2704 from Unity-Technologies/hotfix-0.10.1
Merge Hotfix 0.10.1
5 years ago
GitHub
9bac2771
Fix SAC + LSTM Barracuda inference ( #2698 )
5 years ago
Chris Elion
254c7d86
Merge remote-tracking branch 'origin/develop' into try-tf2-support
5 years ago
GitHub
b95c4d1d
check for unnecessary list comprehensions ( #2707 )
5 years ago
GitHub
619465e1
Fix crash when SAC is used with Curiosity and Continuous Actions ( #2740 )
* Add test for curiosity + SAC
* Use actions for all curiosity (need to test on PPO)
* Fix issue with reward signals updating multiple times
* Put curiosity actions in the right placeholder
* Test PPO curiosity update
5 years ago
Chris Elion
3d8a70fb
Merge remote-tracking branch 'origin/develop' into try-tf2-support
5 years ago
GitHub
0fe5adc2
Develop remove memories ( #2795 )
* Initial commit removing memories from C# and deprecating memory fields in proto
* initial changes to Python
* Adding functionalities
* Fixes
* adding the memories to the dictionary
* Fixing bugs
* tweaks
* Resolving bugs
* Recreating the proto
* Addressing comments
* Passing by reference does not work. Do not merge
* Fixing huge bug in Inference
* Applying patches
* fixing tests
* Addressing comments
* Renaming variable to reflect type
* test
5 years ago
GitHub
495873e5
Merge pull request #2833 from Unity-Technologies/release-0.11.0
Release 0.11.0
5 years ago
Chris Elion
691d21e6
Merge remote-tracking branch 'origin/develop' into try-tf2-support
5 years ago
GitHub
c6c01a03
Enable pylint and fix a few things ( #2767 )
* enable pylint, disable some messages and fix a few
* SAC memories in init
5 years ago
Jonathan Harper
8550679d
Merge branch 'develop' into release-0.11.0
5 years ago
GitHub
4da157fe
more pylint fixes ( #2842 )
5 years ago
Chris Elion
fca51de8
Merge remote-tracking branch 'origin/develop' into try-tf2-support
5 years ago
Chris Elion
73a346cb
cleanup
5 years ago
GitHub
f57b7ac6
Allow usage with tensorflow 2.0.0 (via tf.compat.v1) ( #2665 )
5 years ago
Ervin Teng
987e0e3a
Merge tf2 branch
5 years ago
Andrew Cohen
13fe9cf8
Bubbled up indexing of AllBrainInfo to trainer controller from trainers
5 years ago
GitHub
c0453ae1
Merge pull request #2912 from Unity-Technologies/develop-allbraininfo
Bubbled up indexing of AllBrainInfo to trainer controller from trainers
5 years ago
GitHub
99981937
fix errors from new flake8-comprehensions ( #2917 )
5 years ago
GitHub
69d1a033
Develop remove past action communication ( #2913 )
* Modifying the .proto files
* attempt 1 at refactoring Python
* works for ppo hallway
* changing the documentation
* now works with both sac and ppo both training and inference
* Need to fix the tests
* TODOs :
- Fix the demonstration recorder
- Fix the demonstration loader
- verify the intrinsic reward signals work
- Fix the tests on Python
- Fix the C# tests
* Regenerating the protos
* fix proto typo
* protos and modifying the C# demo recorder
* modified the demo loader
* Demos are loading
* IMPORTANT : THESE ARE THE FILES USED FOR CONVERSION FROM OLD TO NEW FORMAT
* Modified all the demo files
* Fixing all the tests
* fixing ci
* addressing comments
* removing reference to memories in the ll-api
5 years ago
Andrew Cohen
e96b80db
receives brain_name and identifier on python side
5 years ago
Ervin Teng
54644477
Merge branch 'develop' of github.com:Unity-Technologies/ml-agents into develop-nomaxstep-test
5 years ago
Ervin Teng
df5ee7bf
Split buffer into two buffers (PPO works)
5 years ago
Ervin Teng
e5459c49
buffer split for SAC
5 years ago
Ervin Teng
3a4fa244
Switch to tanh squash in PPO
5 years ago
Ervin Teng
fd0647a6
Rename append_update_buffer to append_to_update_buffer
5 years ago
Andrew Cohen
bd056007
receives brain_name and identifier on python side
5 years ago
GitHub
d4780a55
Merge pull request #3010 from Unity-Technologies/release-0.12.0-to-master
Merge Release 0.12.0 to master
5 years ago
GitHub
213cd68d
Split Buffer into processing and update buffers ( #2964 )
This is the first in a series of PRs that intend to move the agent processing logic (add_experiences and process_experiences) out of the trainer and into a separate class. The plan is to do so in steps:
- Split the processing buffers (keeping track of agent trajectories and assembling trajectories) and update buffer (complete trajectories to be used for training) within the Trainer (this PR)
- Move the processing buffer and add/process experiences into a separate, outside class
- Change the data type of the update buffer to be a Trajectory
- Place and read Trajectories from queues, add subscription mechanism for both AgentProcessor and Trainers
5 years ago
Ervin Teng
34f9577c
Merge branch 'develop' into develop-agentprocessor
5 years ago
GitHub
35c995e9
Merge pull request #3038 from Unity-Technologies/develop
Merge develop to master
5 years ago
Ervin Teng
9c5fdd31
Stats reporting is working
5 years ago
Ervin Teng
eb4a04a5
Merge branch 'master' into develop-tanhsquash
5 years ago
Andrew Cohen
5097bcc0
receives brain_name and identifier on python side
5 years ago
Ervin Teng
76abf968
Add back max_step logic
5 years ago
Ervin Teng
28eba789
Migrate SAC
5 years ago
Andrew Cohen
8578b0b7
add_policy and create_policy separated
5 years ago
Ervin Teng
f2b3cd7f
Remove dead code
5 years ago
GitHub
36048cb6
Moving Env Manager to Trainers ( #3062 )
The Env Manager is only used by the trainer codebase. The entry point to interact with an environment is UnityEnvironment.
* Moving Env Manager to Trainers
* fix pylint madness
5 years ago
Ervin Teng
c9116ed2
Move some common logic to buffer class
5 years ago
GitHub
42bea858
Improve mypy coverage by adding --namespace-packages ( #3049 )
5 years ago
GitHub
90db165f
Add --namespace-packages to mypy for mlagents ( #3075 )
5 years ago
GitHub
1fa07edb
Remove Standalone Offline BC Training ( #2969 )
5 years ago
Andrew Cohen
614d276f
receives brain_name and identifier on python side
5 years ago
Andrew Cohen
96922f84
receives brain_name and identifier on python side
5 years ago
Chris Elion
fdc810ff
move (first pass)
5 years ago
GitHub
58b6c7c2
Rename mlagents.envs to mlagents_envs ( #3083 )
5 years ago
Ervin Teng
27c2a55b
Lots of test fixes
5 years ago
Ervin Teng
97d66e71
Remove BootstrapExperience
5 years ago
Ervin Teng
324d217b
Move agent_id to Trajectory
5 years ago
Ervin Teng
77ff4822
Add back next_obs
5 years ago
Andrew Cohen
d1edbf43
add_policy and create_policy separated
5 years ago
Ervin Teng
2b811fc8
Properly report value estimates and episode length
5 years ago
GitHub
2fd305e7
Move add_experiences out of trainer, add Trajectories ( #3067 )
5 years ago
Ervin Teng
c330f6f6
Merge branch 'master' into develop-agentprocessor
5 years ago
Andrew Cohen
de902fbb
passes all pytest and C# tests
5 years ago
GitHub
2ac242f7
Remove TrainerMetrics and add CSVWriter using new StatsWriter API ( #3108 )
5 years ago
Ervin Teng
fdf9aea7
Make conversion methods part of NamedTuples
5 years ago
Ervin Teng
6242b67d
Add way to check if trajectory is done or max_reached
5 years ago
GitHub
0b5b1b01
Develop magic string + trajectory ( #3122 )
* added team id and identifier concat to behavior parameters
* splitting brain params into brain name and identifiers
* set team id in prefab
* receives brain_name and identifier on python side
* added team id and identifier concat to behavior parameters
* splitting brain params into brain name and identifiers
* set team id in prefab
* receives brain_name and identifier on python side
* rebased with develop
* Correctly calls concatBehaviorIdentifiers
* added team id and identifier concat to behavior parameters
* splitting brain params into brain name and identifiers
* set team id in prefab
* receives brain_name and identifier on python side
* rebased with develop
* Correctly calls concatBehaviorIdentifiers
* trainer_controller expects name_behavior_ids
* add_policy and create_policy separated
* adjusting tests to expect trainer.add_policy to be called
* fixing tests
* fixed naming ...
5 years ago
GitHub
c7da0139
Fix mypy errors in trainer code. ( #3135 )
5 years ago
Andrew Cohen
082789ea
Merge branch 'master' into develop-magic-string
5 years ago
Andrew Cohen
6a4e7cf9
added ppo/sac_policy attributes to keep up with master
5 years ago
GitHub
e536c09c
Remove unused tf.placeholder ( #3138 )
5 years ago
Ervin Teng
1bd791e5
Merge branch 'master' into develop-agentprocessor
5 years ago
Andrew Cohen
3e76adbd
fixing more ci tests
5 years ago
GitHub
7fbf6b1d
add flake8-bugbear ( #3137 )
* unused loop variables
* change loop variable
5 years ago
GitHub
bec2e8f0
Add Trajectory/Policy Queues, move Trainer logic to advance() ( #3113 )
5 years ago
Ervin Teng
db743971
Move private methods out of trainer, simplify interface
5 years ago
Andrew Cohen
c8514c18
Merge branch 'master' into develop-magic-string
5 years ago
GitHub
45010af3
Add stats reporter class and re-enable missing stats ( #3076 )
5 years ago
Ervin Teng
b3a4e641
Remove some vestigial code
5 years ago
Ervin Teng
48793ec1
Fix test
5 years ago
Ervin Teng
3d25f9d2
Merge branch 'master' into develop-agentprocessor
5 years ago
GitHub
5bc7531b
Get step from policy ( #3223 )
5 years ago
GitHub
d985dded
Merge branch 'master' into merge-release-0.13.0
5 years ago
GitHub
f058b18c
Replace BrainInfos with BatchedStepResult ( #3207 )
5 years ago
Ervin Teng
29f3330f
Merge master into hotfix-0.13.1
5 years ago
GitHub
d52fb483
Merge pull request #3264 from Unity-Technologies/hotfix-0.13.1
Merge hotfix 0.13.1 into master
5 years ago
GitHub
329b23e0
Fix extra summary being written when loading from checkpoint ( #3272 )
* Load next summary properly
* Add tests for add_policy and get_policy
5 years ago
Ervin Teng
0ef40c08
SAC CC working
5 years ago
Ervin Teng
db249ceb
Merge branch 'master' into develop-splitpolicyoptimizer
5 years ago
Ervin Teng
28f7608f
Clean up value head creation
5 years ago
Ervin Teng
b21b3d5c
Use resamp policy for SAC
5 years ago
Ervin Teng
1b6e175c
Fix discrete SAC and clean up policy
5 years ago
Ervin Teng
a5caf4d6
Remove epsilon from everywhere
5 years ago
Ervin Teng
8e300036
Add some typing to optimizer
5 years ago
Ervin Teng
edeceefd
Zeroed version of LSTM working for PPO
5 years ago
Ervin Teng
5ec49542
SAC LSTM isn't broken
5 years ago
Ervin Teng
cfc2f455
Fix BC and tests
5 years ago
Ervin Teng
78671383
Move initialization call around
5 years ago
Ervin Teng
cadf6603
Fix SAC CC and some reward signal tests
5 years ago
Ervin Teng
85249afc
Fix SAC scoping
5 years ago
GitHub
dd86e879
Separate out optimizer creation and policy graph creation ( #3355 )
5 years ago
Ervin Teng
cdd57468
Re-fix scoping and add method to get all variables
5 years ago
Ervin Teng
dcbb90e1
Fix graph init in ghost trainer
5 years ago
Ervin Teng
5f00782b
Clean up some SAC LSTM
5 years ago
Ervin Teng
328476d8
Move check for creation into nn_policy
5 years ago
Ervin Teng
ce110201
Add optional burn-in for SAC as well
5 years ago
Ervin Teng
cbfbff2c
Split optimizer and TFOptimizer
5 years ago
Ervin Teng
4d94e180
Move optimizer to common folder
5 years ago
Ervin Teng
ffdc41bb
Removed floating constants
5 years ago
Ervin Teng
7004604d
Used NamedTuple for create normalization tensors
5 years ago
Ervin Teng
7c0fa1c4
Remove action_holder placeholder
5 years ago
Ervin Teng
1cfc461a
Remove and rename tf_optimizer
5 years ago
Ervin Teng
ff607162
Move learning rate reporting
5 years ago
Ervin Teng
c735e722
Make create critic methods private
5 years ago
GitHub
c145e75b
Split Policy and Optimizer, common Policy for PPO and SAC ( #3345 )
5 years ago
Ervin Teng
da6daebd
Make create losses private
5 years ago
Andrew Cohen
5b0aca29
Merge branch 'master' into soccer-fives
5 years ago
Ervin Teng
14f2a7f2
Rename LearningModel to ModelUtils
5 years ago
Ervin Teng
1156b9b3
Merge branch 'develop-splitpolicyoptimizer' into develop-removeactionholder
5 years ago
Ervin Teng
d57124b4
Merge 'master' into develop-removeactionholder
5 years ago
Ervin Teng
d6eb262c
Rename resample to reparameterize
5 years ago
Ervin Teng
23088088
Remove outdated comment
5 years ago
Ervin Teng
53c25fb1
Move one-hot out of policy and remove selected_actions
5 years ago
Anupam Bhatnagar
e04fcd71
Merge branch 'master' into master-into-release-0.14.1
5 years ago
GitHub
97a1d4b1
[change] Remove the action_holder placeholder from the policy. ( #3492 )
5 years ago
Andrew Cohen
de73baa9
Merge branch 'master' into soccer-fives
5 years ago
GitHub
7d954797
[change] Separate action outputs into OutputDistributions object ( #3514 )
5 years ago
GitHub
e4177de0
[change] Organize trainer files a bit better ( #3538 )
5 years ago
Andrew Cohen
573b1f6d
Merge branch 'master' into soccer-fives
5 years ago
GitHub
cb153a0f
[change] Change warning language when adversarial scene is used without self-play ( #3561 )
5 years ago
Anupam Bhatnagar
f4dbedcf
removed extraneous logging imports and loggers
5 years ago
GitHub
86141eee
Merge pull request #3560 from Unity-Technologies/new-logger
Add timestamps to logs
5 years ago
GitHub
e3af96ca
Merge branch 'master' into develop-demo-load-seek
5 years ago
GitHub
ffd8f855
[bug-fix] Fix crash when demo size is smaller than batch size ( #3591 )
5 years ago
Chris Elion
7f2e815a
Merge remote-tracking branch 'origin/master' into develop-sidechannel-usability
5 years ago
Chris Elion
fa5e7e6d
Merge remote-tracking branch 'origin/master' into develop-BehaviorParams-public
5 years ago
GitHub
873ba7fd
[bug-fix] Fix stats reporting for reward signals in SAC ( #3606 )
5 years ago
GitHub
c42a11c3
[change] Throw a proper error when sequence length is greater than batch size. ( #3583 )
5 years ago
GitHub
94de596b
[change] Remove concatenate in discrete action probabilities to improve inference performance ( #3598 )
5 years ago
Andrew Cohen
b1cfa74d
Merge branch 'master' into develop-test-imitation
5 years ago
GitHub
ec278616
Hotfixes for Release 0.15.1 ( #3698 )
* [bug-fix] Increase height of wall in CrawlerStatic (#3650 )
* [bug-fix] Improve performance for PPO with continuous actions (#3662 )
* Corrected a typo in a name of a function (#3670 )
OnEpsiodeBegin was corrected to OnEpisodeBegin in Migrating.md document
* Add Academy.AutomaticSteppingEnabled to migration (#3666 )
* Fix editor port in Dockerfile (#3674 )
* Hotfix memory leak on Python (#3664 )
* Hotfix memory leak on Python
* Fixing
* Fixing a bug in the heuristic policy. A decision should not be requested when the agent is done
* [bug-fix] Make Python able to deal with 0-step episodes (#3671 )
* adding some comments
Co-authored-by: Ervin T <ervin@unity3d.com>
* Remove vis_encode_type from list of required (#3677 )
* Update changelog (#3678 )
* Shorten timeout duration for environment close (#3679 )
The timeout duration for closing an environment was set to the
same duration as the timeout when waiting ...
5 years ago
Andrew Cohen
53bea15c
Merge branch 'master' into soccer-fives
5 years ago
Andrew Cohen
ac261e36
Merge branch 'master' into self-play-mutex
5 years ago
GitHub
6709a9bf
[change] Clean up trainer interface, clean up GhostTrainer stats ( #3634 )
5 years ago
Andrew Cohen
eefc4811
Merge branch 'master' into self-play-mutex
5 years ago
Andrew Cohen
9f09a65d
team id centric ghost trainer
5 years ago
Ervin Teng
293579dd
Use steps_per_update to determine SAC train interval
5 years ago
Ervin Teng
0fa2f4f7
Don't count buffer_init_steps
5 years ago
Ervin Teng
dbf8f7a5
Fix comment
5 years ago
GitHub
ff32035d
Remove vis_encode_type from list of required ( #3677 )
5 years ago
GitHub
141831da
[bug-fix] Fix entropy computation for GaussianDistribution ( #3684 )
5 years ago
Andrew Cohen
4c9ac553
Merge branch 'master' into self-play-mutex
5 years ago
GitHub
4ecd6ad3
Fix how we set logging levels ( #3703 )
* cleanup logging
* comments and cleanup
* pylint, gym
5 years ago
Andrew Cohen
cd677346
Merge branch 'self-play-mutex' into soccer-2v1
5 years ago
Andrew Cohen
62c87031
Merge branch 'master' into self-play-mutex
5 years ago
Andrew Cohen
59b88be6
Merge branch 'master' into self-play-mutex
5 years ago
GitHub
9cbc3fa2
Asymmetric self-play ( #3653 )
5 years ago
Ervin Teng
06fa3d39
Merge branch 'master' into develop-sac-apex
5 years ago
Anupam Bhatnagar
50e52d9c
Merge branch 'master' into distributed-training
5 years ago
Andrew Cohen
3de78baa
wrapped trainer has internal policy ghost
5 years ago
Ervin Teng
b7151b51
Remove num_update as param
5 years ago
Andrew Cohen
3013774b
alternative to internal-policy fix
5 years ago
Andrew Cohen
a870d453
Merge branch 'self-play-mutex' into soccer-2v1
5 years ago
GitHub
b841c9ab
Wrapped trainer has internal policy in GhostTrainer
5 years ago
Ervin Teng
8b52a2d0
Address comments in docs
5 years ago
Ervin Teng
817aab95
Update steps_per_update documentation
Add constant
Tweak buffer max size
5 years ago
Andrew Cohen
930d6fa3
Merge branch 'self-play-mutex' into soccer-2v1
5 years ago
Ervin Teng
f29b17a9
Don't block one policy queue
Only put policies when policy is actually updated
5 years ago
GitHub
aae58330
Merge branch 'master' into develop-add-inference-examples
5 years ago
Andrew Cohen
b0c506a6
Merge branch 'soccer-2v1' into asymm-envs
5 years ago
Ervin Teng
5e980ec1
Merge branch 'master' into develop-sac-apex
5 years ago
Anupam Bhatnagar
9d7dd3b6
[skip ci] moving step increment to trainer from environment for sac
5 years ago
Andrew Cohen
de0656b6
Merge branch 'internal-policy-ghost' into soccer-2v1
5 years ago
Andrew Cohen
85304aff
Merge branch 'soccer-2v1' into asymm-envs
5 years ago
Andrew Cohen
89db8428
Merge branch 'internal-policy-ghost-alternate' into soccer-2v1
5 years ago
Andrew Cohen
26c0033c
Merge branch 'soccer-2v1' into asymm-envs
5 years ago
GitHub
4d23200b
[refactor] Run Trainers in separate threads ( #3690 )
5 years ago
Arthur Juliani
212e2d1d
Merge remote-tracking branch 'origin/master' into develop-add-fire
5 years ago
GitHub
232519e4
[refactor] Move output artifacts to a single results/ folder ( #3829 )
5 years ago
Chris Elion
68b68396
Merge remote-tracking branch 'origin/master' into release_1_to_master
5 years ago
GitHub
4641038e
Renaming max_step to interrupted in TerminalStep(s) ( #3908 )
5 years ago
vincentpierre
c34dd5b6
Merge branch 'master' into develop-gym-wrapper
5 years ago
Andrew Cohen
a2f8319a
Merge branch 'master' into asymm-envs
5 years ago
Arthur Juliani
89ad3020
Merge remote-tracking branch 'origin/master' into develop-add-fire
# Conflicts:
# ml-agents/mlagents/trainers/policy/tf_policy.py
5 years ago
Andrew Cohen
4a3ad193
Add constant decay to beta and epsilon
5 years ago
GitHub
c5b94ca6
Use LR schedule for beta and epsilon ( #3940 )
5 years ago
Arthur Juliani
2b3a6347
Merge remote-tracking branch 'origin/master' into develop-add-fire
5 years ago
Andrew Cohen
704d0d11
add mede optimizer
4 years ago
Andrew Cohen
88153b61
add mede opt with format
4 years ago
Christopher Goy
ba80b292
format files with pre-commit.
4 years ago
GitHub
f7373172
Merge pull request #4385 from Unity-Technologies/release_2_verified-barracuda-1.0.2
update verified branch with barracuda 1.0.2
4 years ago
vincentpierre
6ddfe74f
Merge branch 'master' into develop-gym-wrapper
5 years ago
GitHub
e92b4f88
[refactor] Structure configuration files into classes ( #3936 )
5 years ago
GitHub
a7323393
[bug-fix] Fix issue with SAC updating too much on resume ( #4038 )
5 years ago
GitHub
5cce69ae
add "the the" to precommit spell check ( #4059 )
5 years ago
Andrew Cohen
e7750fc9
Merge branch 'master' into develop-sampler-refactor
5 years ago
GitHub
09853e13
[refactor] Move checkpoint saving into trainer ( #4034 )
4 years ago
Andrew Cohen
c0f7052b
Merge branch 'master' into develop-sampler-refactor
4 years ago
Andrew Cohen
34ecc7e6
Merge branch 'master' into asymm-envs
5 years ago
GitHub
a1c63c4b
Release 3 Cherry-pick bug-fixes and doc changes from master ( #4102 )
* [bug-fix] Fix regression in --initialize-from feature (#4086 )
* Fixed text in GettingStarted page specifying the logdir for tensorboard. Before it was in a directory summaries which no longer existed. Results are now saved to the results dir. (#4085 )
* [refactor] Remove nonfunctional `output_path` option from TrainerSettings (#4087 )
* Reverting bug introduced in #4071 (#4101 )
Co-authored-by: Scott <Scott.m.jordan91@gmail.com>
Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>
4 years ago
GitHub
8a49e8e0
[refactor] Remove nonfunctional `output_path` option from TrainerSettings ( #4087 )
4 years ago
Anupam Bhatnagar
4afd8f92
first commit
4 years ago
Anupam Bhatnagar
f7a3c06e
[skip ci] updating sac
4 years ago
Anupam Bhatnagar
a4567f27
[skip ci] restore process trajectory super calls
4 years ago
Andrew Cohen
21f871db
Merge branch 'develop-constant-decay' into asymm-envs
5 years ago
Anupam Bhatnagar
26dc42e5
[skip ci]
4 years ago
Anupam Bhatnagar
0aedad7c
fixing should_still_train call in rl_trainer.py
4 years ago
Anupam Bhatnagar
392a84f1
[skip ci] fixing property decorator in sac
4 years ago
Arthur Juliani
9724c9ac
Merge master
4 years ago
Anupam Bhatnagar
24d5f881
first commit
4 years ago
GitHub
45154f52
Pytorch port of SAC ( #4219 )
4 years ago
GitHub
a28e2767
Update add-fire to latest master, including Policy refactor ( #4263 )
* Update Dockerfile
* Separate send environment data from reset (#4128 )
* Fixed a typo on ML-Agents-Overview.md (#4130 )
Fixed redundant "to" word from the sentence since it is probably a typo in document.
* Updated the badge’s link to point to the newest doc version
* Replaced all of the doc to release_3_doc
* Fix 3DBall and 3DBallHard SAC regressions (#4132 )
* Move memory validation to settings
* Update docs
* Add settings test
* Update to release_3 in installation.md (#4144 )
* rename to SideChannelManager +backcompat (#4137 )
* Remove comment about logo with --help (#4148 )
* [bugfix] Make FoodCollector heuristic playable (#4147 )
* Make FoodCollector heuristic playable
* Update changelog
* script to check for old release links and references (#4153 )
* Remove package validation suite from Project (#4146 )
* RayPerceptionSensor: handle empty and invalid tags (#4155...
4 years ago
GitHub
74c99ec8
[refactor] Refactor normalizers and encoders ( #4275 )
* Refactor normalizers and encoders
* Unify Critic and ValueNetwork
* Rename ActionVectorEncoder
* Update docstring of create_encoders
* Add docstring to UnnormalizedInputEncoder
4 years ago
GitHub
93517833
[feature] Fix TF tests, add --torch CLI option, allow run TF without torch installed ( #4305 )
4 years ago
Andrew Cohen
f74d301a
Merge branch 'develop-add-fire' into develop-add-fire-bc
4 years ago
Ruo-Ping Dong
01e60921
add sac checkpoint
4 years ago
vincentpierre
599d7e9f
Merging master
4 years ago
GitHub
3a982317
[add-fire] Add learning rate and beta/epsilon decay to PyTorch ( #4318 )
4 years ago
GitHub
7ddfd81f
Added Reward Providers for Torch ( #4280 )
* Added Reward Providers for Torch
* Use NetworkBody to encode state in the reward providers
* Integrating the reward providers with ppo and torch
* work in progress, integration with PPO. Not training properly Pyramids at the moment
* Integration in PPO
* Removing duplicate file
* Gail and Curiosity working
* addressing comments
* Enforce float32 for tests
* Enforce np.float32 in buffer
4 years ago
Andrew Cohen
bf8b2328
Merge branch 'develop-add-fire' into develop-add-fire-bc
4 years ago
Ervin Teng
37f986c8
Running LSTM for SAC
4 years ago
HH
7afa1761
Merge branch 'master' into hh/develop/ragdoll-updates
4 years ago
Ervin Teng
8ead82e2
Use correct half of memories
4 years ago
Ruo-Ping Dong
71fe4df6
fix formatting and test
4 years ago
Ruo-Ping Dong
09a741c8
small improvement
4 years ago
GitHub
3bcb029b
[refactor] Remove BrainParameters from Python code ( #4138 )
4 years ago
Ruo-Ping Dong
e06812aa
fix tests
4 years ago
HH
0fdac847
Merge branch 'master' into hh/develop/crawler-ragdoll-updates
4 years ago
GitHub
84440f05
Convert checkpoints to .NN ( #4127 )
This change adds an export to .nn for each checkpoint generated by
RLTrainer and adds a NNCheckpointManager to track the generated
checkpoints and final model in training_status.json.
Co-authored-by: Jonathan Harper <jharper+moar@unity3d.com>
4 years ago
Arthur Juliani
6bee0fd1
Merge master
4 years ago
GitHub
129f9ddc
[MLA-427] make pyupgrade convert f-strings too ( #4244 )
* make pyupgrade convert f-strings too
4 years ago
Andrew Cohen
d8c123a0
Merge branch 'master' into sensitivity
4 years ago
GitHub
202db853
Remove unnecessary line ( #4260 )
4 years ago
GitHub
1b098c9a
Refactor TFPolicy and Policy ( #4254 )
* Refactor TFPolicy and Policy
4 years ago
GitHub
380fef57
[refactor] Move TF-specific files to tf/ folder ( #4266 )
4 years ago
GitHub
beb5aca5
[refactor] Make classes except Optimizer framework agnostic ( #4268 )
4 years ago
Andrew Cohen
06e4356c
Merge branch 'master' into sensitivity
4 years ago
Arthur Juliani
1a123641
Merge remote-tracking branch 'origin/master' into r5-master
4 years ago
GitHub
3f44a0bc
cleanup around AdamOptimizer ( #4333 )
* cleanup around AdamOptimizer
* methods to create Optimizer instances
4 years ago
Andrew Cohen
598826fe
Merge branch 'develop-add-fire' into develop-add-fire-bc
4 years ago
Ruo-Ping Dong
d3eb6c46
Merge branch 'develop-add-fire' into develop-add-fire-checkpoint
4 years ago
Ruo-Ping Dong
95858e25
update saver interface and add tests
4 years ago
Anupam Bhatnagar
a5cc4d03
Merge branch 'master' into global-variables
4 years ago
Ruo-Ping Dong
523248be
update
4 years ago
GitHub
f374f87a
[add-fire] Add LSTM to SAC, LSTM fixes and initializations ( #4324 )
4 years ago
Ervin Teng
eeae6d97
Proper initialization and SAC masking
4 years ago
HH
8eaddb61
Merge branch 'master' into hh/develop/loco-walker-variable-speed
4 years ago
Ruo-Ping Dong
59cc1a9f
Merge branch 'develop-add-fire' into develop-add-fire-checkpoint
4 years ago
Ruo-Ping Dong
409a161c
fix bc tests
4 years ago
GitHub
25dc8c3d
Add Saver Class to handle all save/load/checkpoint/export work ( #4323 )
4 years ago
Ervin Teng
13f15086
Merge branch 'develop-add-fire' into develop-add-fire-amrl
4 years ago
Ervin Teng
d65a9326
Merge branch 'master' into develop-add-fire-mm3
4 years ago
Ruo-Ping Dong
d57aa9ab
Merge branch 'develop-add-fire-mm3' into develop-add-fire-checkpoint
4 years ago
Ervin Teng
02d86902
Use zeros_like
4 years ago
GitHub
bd6bcd2f
Merge master and add Saver class for save/load checkpoints
4 years ago
GitHub
6de31a03
[add-fire] Fix masked mean for 2d tensors ( #4364 )
4 years ago
Ervin Teng
5c1717d1
Bugfixes for continuous case
4 years ago
Ervin Teng
42e25b25
Merge branch 'develop-add-fire' into develop-add-fire-memoryclass
4 years ago
Christopher Goy
5a233353
Merge remote-tracking branch 'origin/master' into release_6-to-master
4 years ago
GitHub
49545ce1
Pytorch ghost trainer ( #4370 )
4 years ago
Andrew Cohen
0053713a
fix sac precommit
4 years ago
Andrew Cohen
e7c9ff35
clean up docstrings create policies
4 years ago
Andrew Cohen
039ae17f
capitalize Tensorflow
4 years ago
GitHub
1955af9e
[feature] Add experimental PyTorch support ( #4335 )
* Begin porting work
* Add ResNet and distributions
* Dynamically construct actor and critic
* Initial optimizer port
* Refactoring policy and optimizer
* Resolving a few bugs
* Share more code between tf and torch policies
* Slightly closer to running model
* Training runs, but doesn’t actually work
* Fix a couple additional bugs
* Add conditional sigma for distribution
* Fix normalization
* Support discrete actions as well
* Continuous and discrete now train
* Multi-discrete now working
* Visual observations now train as well
* GRU in-progress and dynamic cnns
* Fix for memories
* Remove unused arg
* Combine actor and critic classes. Initial export.
* Support tf and pytorch alongside one another
* Prepare model for onnx export
* Use LSTM and fix a few merge errors
* Fix bug in probs calculation
* Optimize np -> tensor operations
* Time action sample funct...
4 years ago
vincentpierre
9f51ab14
Saving the reward providers
4 years ago
Ruo-Ping Dong
c47ffc20
Rename saver
4 years ago
vincentpierre
108fac9a
Replace torch.detach().cpu().numpy() with a utils method
4 years ago
HH
d9962254
Merge branch 'master' into hh/develop/loco-walker-variable-speed
4 years ago
GitHub
ec8c24d8
add fire clean up docstrings in create policies ( #4391 )
4 years ago
GitHub
328353bc
Torch : Saving/Loading of the reward providers ( #4405 )
* Saving the reward providers
* adding tests
* Moved the tests around
* Update ml-agents/mlagents/trainers/tests/torch/saver/test_saver_reward_providers.py
* Update ml-agents/mlagents/trainers/tests/torch/saver/test_saver_reward_providers.py
* Update ml-agents/mlagents/trainers/tests/torch/saver/test_saver_reward_providers.py
Co-authored-by: Ruo-Ping (Rachel) Dong <ruoping.dong@unity3d.com>
* Update ml-agents/mlagents/trainers/tests/torch/saver/test_saver_reward_providers.py
Co-authored-by: Ruo-Ping (Rachel) Dong <ruoping.dong@unity3d.com>
Co-authored-by: Ruo-Ping (Rachel) Dong <ruoping.dong@unity3d.com>
4 years ago
vincentpierre
31750e97
Using item() in place of to_numpy()
4 years ago
vincentpierre
fdd343b2
more use of item() and additional tests
4 years ago
Ruo-Ping Dong
88eff042
Merge branch 'master' into develop-saver-name
4 years ago
GitHub
48f217b9
Rename Saver to ModelSaver ( #4402 )
Rename Saver to ModelSaver to avoid confusion with tf.Saver
4 years ago
Anupam Bhatnagar
f4f1a8d9
merge master into trainer-plugin branch
4 years ago
GitHub
498934f9
Replace torch.detach().cpu().numpy() with a utils method ( #4406 )
* Replace torch.detach().cpu().numpy() with a utils method
* Using item() in place of to_numpy()
* more use of item() and additional tests
4 years ago
Ruo-Ping Dong
27fb4270
brain_name to behavior_name
4 years ago
GitHub
bfda9576
Replace brain_name with behavior_name ( #4419 )
brain_name -> behavior_name
some prob -> log_prob in comments
rename files optimizer -> optimizer_tf for tensorflow
4 years ago
Ruo-Ping Dong
fd1dc3a6
Merge branch 'master' into develop-torch-omp
4 years ago
Ruo-Ping Dong
ef3be79e
sac
4 years ago
GitHub
4e93cb6e
[torch] Restructure PyTorch encoders ( #4421 )
* Move linear encoding to NetworkBody
* moved encoders to processors (#4420 )
* fix bad merge
* Get it running
* Replace mentions of visual_encoders
* Remove output_size property
* Fix tests
* Fix some references
* Revert test_simple_rl
* Fix networks test
* Make curiosity test more accommodating
* Rename total_input_size
* [Bug fix] Fix bug in GAIL gradient penalty (#4425 ) (#4426 )
Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>
* Up number of steps
* Rename to visual_processors and vector_processors
Co-authored-by: andrewcoh <54679309+andrewcoh@users.noreply.github.com>
Co-authored-by: Andrew Cohen <andrew.cohen@unity3d.com>
Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>
4 years ago
GitHub
beb5eb30
[bug-fix] Fixes for Torch SAC and tests ( #4408 )
* Fixes for Torch SAC and tests
* Fix recurrent sac test
* Properly update normalization for SAC-continuous
* Fix issue with log ent coef reporting in SAC Torch
4 years ago
GitHub
6f534366
Add torch_utils class, auto-detect CUDA availability ( #4403 )
* Add torch_utils
* Use torch from torch_utils
* Add torch to banned modules in CI
* Better import error handling
* Fix flake8 errors
* Address comments
* Move networks to GPU if enabled
* Switch to torch_utils
* More flake8 problems
* Move reward providers to GPU/CPU
* Remove another set default tensor
* Fix banned import in test
4 years ago
Ervin Teng
916eec4b
Run backwards() of losses in threads
4 years ago
Ervin Teng
9b797d61
Thread inference and not backprop
4 years ago
Ervin Teng
a305a41b
Try futures in Optimizer
4 years ago
Ervin Teng
228ea059
Try futures in Optimizer
4 years ago
Andrew Cohen
3997b14b
Merge branch 'master' into develop-hybrid-actions
4 years ago
Ervin Teng
3e771cbb
Permute visual obs outside of network
4 years ago
Ervin Teng
77c810fb
Fix SAC and make utility method
4 years ago
Ervin Teng
9088c07a
Optimized SAC soft update
4 years ago
Ervin Teng
7754ad7b
Don't run value during inference
4 years ago
Ervin Teng
5495b2b6
Works with continuous
4 years ago
Ervin Teng
52efe509
Discrete and entropy coeff
4 years ago
Ervin Teng
d67b9f95
Remove comment
4 years ago
GitHub
1f179527
Do not keep gradients on the q for the v backup ( #4504 )
4 years ago
vincentpierre
181bdec0
-
4 years ago
GitHub
4e4ad7b0
Don't run value during policy evaluate, optimized soft update function ( #4501 )
* Don't run value during inference
* Execute critic with LSTM
* Address comments
* Unformat
* Optimized soft update
* Move soft update to model utils
* Add test for soft update
4 years ago
Ervin Teng
f9ff3efe
Merge branch 'develop-policyonly' into develop-sac-targetq
4 years ago
GitHub
05fc088d
[refactor] Don't compute grad for q2_p in SAC Optimizer ( #4509 )
4 years ago
HH
a3bf96fd
Merge branch 'master' into hh/develop/gridsensor-tests
4 years ago
GitHub
badca342
Rename NNCheckpoint to ModelCheckpoint as Model can be NN or ONNX ( #4540 )
4 years ago
Ervin Teng
8dec4771
Add hybrid actions to SAC
4 years ago
GitHub
c188781b
[life improvement] Moving Python files around ( #4531 )
* Moved components to the tf folder and moved the TrainerFactory to the `trainer` folder
* Addressing comments
* Editing the migrating doc
* fixing test
4 years ago
Ervin Teng
81342148
Revert "Add hybrid actions to SAC"
This reverts commit a759b36a51df4f8f1fd296f9f148269f0f026e42.
4 years ago
Andrew Cohen
e5f14400
Merge branch 'master' into develop-hybrid-actions-singleton
4 years ago
GitHub
dde34423
[bug-fix] Use proper masking for entropy and policy losses ( #4572 )
* Use proper masking for entropy and policy losses
* Fix dimension
4 years ago
GitHub
a690af74
[refactor] Make PyTorch the default and TensorFlow optional ( #4517 )
* Torch setup.py
* Set torch to default
* Make torch default in setup.py
* Remove indents
* Remove other instances of TF being used
* Add tensorboard to setup.py
* Adding correct setup commands for verifying torch is installed (#4524 )
* Adding correct setup commands for verifying torch is installed
* Editing the test_requirements to add tf and remove torch
* Develop torchdefault raise outside setup (#4530 )
* Torch not imported error to raise at first usage
* Torch not imported error to raise at first usage
* [refactor] Use PyTorch TensorBoard utils (#4518 )
* Convert stats writer to use PyTorch TB support
* Use common function to print params
* Update test
* Bump tensorboard to 1.15 to fix the tests
* putting tensorboard 1.15.0 as min version requirement
Co-authored-by: vincentpierre <vincentpierre@unity3d.com>
* [Docs] Initial documentation changes for making...
4 years ago
Andrew Cohen
8013e544
ignoring Instance of 'AbstractContextManager' has no 'enter_context' member (no-member)
4 years ago
GitHub
cb8e4d25
Add ActionSpec ( #4586 )
Co-authored-by: Ervin T <ervin@unity3d.com>
4 years ago
Andrew Cohen
9689cf2c
remove *_action_* from function names
4 years ago
Andrew Cohen
dc89318d
remove ActionType
4 years ago
vincentpierre
a3a9a56b
Merge branch 'exp-multi-head-attention' into exp-bullet-hell
4 years ago
Ruo-Ping Dong
9e08be87
Merge branch 'master' into release_9_branch_merge
4 years ago
Andrew Cohen
97dfa142
fix action_spec refs
4 years ago
GitHub
b853e5ba
Action buffer ( #4612 )
Co-authored-by: Ervin T <ervin@unity3d.com>
Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>
4 years ago
Ervin Teng
2fc23737
Munchausen RL
4 years ago
GitHub
3c96a3a2
Action Model ( #4580 )
Co-authored-by: Ervin T <ervin@unity3d.com>
Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>
4 years ago
GitHub
88d3ec3e
Merge master into hybrid actions staging branch ( #4704 )
4 years ago
GitHub
23800f33
Merge branch 'master' into develop-action-spec
4 years ago
GitHub
8175d558
[bug-fix] Fix BC module + action clipping ( #4667 )
4 years ago
Ruo-Ping Dong
ee5313e4
Merge branch 'master' into develop-windows-delay
4 years ago
GitHub
f0ed3a38
Cherry-pick BC fixes to Release 10 ( #4668 )
4 years ago
vincentpierre
b863af57
Removing TensorFlow Trainers
4 years ago
Ervin Teng
6c77ac7a
Update SAC, fix PPO batching
4 years ago
GitHub
278911a5
Fix staging tests ( #4708 )
4 years ago
Ervin Teng
1db21cbb
Fix SAC interrupted condition and typing
4 years ago
Ervin Teng
6e6a6b2b
Fix SAC interrupted again
4 years ago
vincentpierre
713e65fb
removing tensorflow testing for pytest and yamato
4 years ago
vincentpierre
2dd34aa5
Formatting
4 years ago
vincentpierre
8f9634c2
Fixing test
4 years ago
Ervin Teng
fdaa8c3d
Merge branch 'develop-unified-obs' into develop-centralizedcritic
4 years ago
Andrew Cohen
056630d7
sac continuous and discrete train
4 years ago
GitHub
990f801a
Develop hybrid action staging ( #4702 )
Co-authored-by: Ervin T <ervin@unity3d.com>
Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>
Co-authored-by: Ruo-Ping Dong <ruoping.dong@unity3d.com>
Co-authored-by: Chris Elion <chris.elion@unity3d.com>
4 years ago
vincentpierre
735fcd52
[WIP] Refactor trainers to use list of obs rather than vec and vis obs
4 years ago
Andrew Cohen
85e4db33
bc tests pass
4 years ago
vincentpierre
7a5cc9ec
Merge master into develop-rm-tf
4 years ago
vincentpierre
c1587bce
Solving merge conflicts
4 years ago
Andrew Cohen
8172b3d6
test_simple_rl/reward providers pass tf/torch
4 years ago
Arthur Juliani
0d2f8887
Merge remote-tracking branch 'origin/master' into goal-conditioning
# Conflicts:
# ml-agents-envs/mlagents_envs/base_env.py
# ml-agents-envs/mlagents_envs/rpc_utils.py
# ml-agents/mlagents/trainers/tests/mock_brain.py
# ml-agents/mlagents/trainers/tests/simple_test_envs.py
4 years ago
Andrew Cohen
73b778cc
rename extract to from_dict
4 years ago
Ervin Teng
25dfd883
Merge branch 'master' into develop-centralizedcritic
4 years ago
GitHub
22658a40
use sensor types to differentiate obs ( #4749 )
4 years ago
Andrew Cohen
3c65b964
fixed recurrent prev_action issue
4 years ago
GitHub
903d3afe
Merge pull request #4707 from Unity-Technologies/develop-rm-tf
Removing TensorFlow Trainers
4 years ago
vincentpierre
8cb050ef
WIP Made initial changes to enable dimension properties and added attention module
4 years ago
Andrew Cohen
498b1ee6
Merge branch 'develop-action-buffer' into develop-hybrid-actions-singleton
4 years ago
GitHub
29d94c7c
Merge pull request #4734 from Unity-Technologies/develop-obs-as-list
Refactor trainers to use list of obs rather than vec and vis obs
4 years ago
vincentpierre
719c969c
addressing comments. ObservationSpec is no longer a list
4 years ago
vincentpierre
4bba4e8e
Renaming ObservationSpec to SensorSpec
4 years ago
Andrew Cohen
c0d01baf
Merge branch 'master' into merge-release11-master
4 years ago
vincentpierre
44ed3258
Merging master
4 years ago
Andrew Cohen
3457cd3c
save only discrete actions as prev
4 years ago
vincentpierre
449712b0
renaming sensor_spec to sensor_specS
4 years ago
Andrew Cohen
35769b53
Merge branch 'develop-action-buffer' into develop-hybrid-actions-singleton
4 years ago
Andrew Cohen
17496265
move AgentAction, ActionLogProbs, and ActionFlattener to separate files
4 years ago
Chris Elion
76ebc20c
Merge remote-tracking branch 'origin/master' into r12-to-master
4 years ago
GitHub
458fee17
Merge pull request #4763 from Unity-Technologies/develop-att
WIP Made initial changes to enable dimension properties and added attention module
4 years ago
Ervin Teng
330fc1d0
Merge branch 'master' into develop-centralizedcritic-mm
4 years ago
vincentpierre
519c5f47
merging master
4 years ago
Ruo-Ping Dong
8ed14762
Merge branch 'develop-hybrid-actions-singleton' into develop-hybrid-actions-csharp
4 years ago
GitHub
7387a77f
remove pylint ( #4836 )
* remove pylint
* remove other pylint disables
4 years ago
Andrew Cohen
1bc2ff96
add weight decay to trainers
4 years ago
Arthur Juliani
0b4b0992
Rename more files
4 years ago
Ervin Teng
aba633b2
Merge branch 'develop-attention-refactor' into develop-centralizedcritic-mm
4 years ago
Arthur Juliani
0a876b9c
Fix typos
4 years ago
Andrew Cohen
ff324d0c
fixed sac recurrent tf simple rl
4 years ago
Ruo-Ping Dong
180d3e20
Merge branch 'develop-centralizedcritic-mm' into develop-cc-teammanager
4 years ago
HH
0024a286
merge ervin's new stuff
4 years ago
GitHub
12e1fc28
[feature] Hybrid SAC ( #4574 )
4 years ago
Andrew Cohen
7af25330
fixed torch test sac
4 years ago
GitHub
67ad9651
Merge pull request #4825 from Unity-Technologies/sensor-types
[WIP] Observation Types
4 years ago
vincentpierre
8660b1c2
merging master
4 years ago
brccabral
457fb612
Merge branch 'master' of https://github.com/Unity-Technologies/ml-agents
4 years ago
vincentpierre
115e944b
adding weight decay for experimentation
4 years ago
vincentpierre
9fbc2e0e
_
4 years ago
vincentpierre
bf16bad6
_
4 years ago
GitHub
64fc7f43
Buffer key enums ( #4907 )
4 years ago
Ervin Teng
b6f88d6d
Merge branch 'develop-base-teammanager' into develop-agentprocessor-teammanager
4 years ago
Ervin Teng
1831044a
Update SAC to use separate policy
4 years ago
Andrew Cohen
8efdeeb0
make critic a property
4 years ago
Ervin Teng
c675393c
Move value network for SAC to device
4 years ago
Andrew Cohen
c74dca9f
add SharedActorCritic
4 years ago
Ruo-Ping Dong
c87bce9e
Merge branch 'master' into develop-base-teammanager
4 years ago
Andrew Cohen
00b891df
fix sac shared
4 years ago
Ervin Teng
ae7643b8
Proper critic memories for PPO
4 years ago
vincentpierre
e1b94b8b
Merge branch 'master' into develop-var-len-obs-feature
4 years ago
Chris Elion
e4f51ca7
Merge remote-tracking branch 'origin/master' into MLA-1734-demo-provider
4 years ago
Ervin Teng
d4438878
Merge branch 'develop-base-teammanager' into develop-agentprocessor-teammanager
4 years ago
Ervin Teng
fd3f05b9
Enable GAIL to decay
4 years ago
Ervin Teng
bb452ffd
Fix SAC
4 years ago
Ervin Teng
e46a86ad
Merge branch 'master' into develop-superpush-int
4 years ago
HH
15d512f9
Merge branch 'master' into hh/develop/dodgeball
4 years ago
GitHub
338af2ec
Move the Critic into the Optimizer ( #4939 )
Co-authored-by: Ervin Teng <ervin@unity3d.com>
4 years ago
HH
4c947151
Merge branch 'main' into hh/develop/dodgeball
4 years ago
Ervin Teng
61781a1a
Merge branch 'main' into develop-agentprocessor-teammanager
4 years ago
Andrew Cohen
9060da06
Merge branch 'develop-agentprocessor-teammanager' into develop-coma2-trainer
4 years ago
Arthur Juliani
06c147f8
Merge remote-tracking branch 'origin/main' into goal-conditioning-new
# Conflicts:
# Project/Assets/ML-Agents/Examples/Crawler/Prefabs/CrawlerBase.prefab
# Project/Assets/ML-Agents/Examples/GridWorld/Prefabs/Area.prefab
# Project/Assets/ML-Agents/Examples/GridWorld/Scenes/GridWorld.unity
# Project/ProjectSettings/TagManager.asset
# com.unity.ml-agents/Runtime/Sensors/CameraSensor.cs
# com.unity.ml-agents/Runtime/Sensors/VectorSensor.cs
# ml-agents/mlagents/trainers/torch/networks.py
# ml-agents/mlagents/trainers/torch/utils.py
4 years ago
Ervin Teng
c8137dcd
Merge branch 'main' into develop-superpush-int
4 years ago
GitHub
f16ce486
Update v2-staging from main (March 15) ( #5123 )
4 years ago
Christopher Goy
921ba4f0
Update v2-staging from main (March 15) ( #5123 )
4 years ago
Christopher Goy
ebe45056
Merge branch 'main' into release_14_branch-to-main
4 years ago
GitHub
fc5d0a3f
[bug-fix] Fix save/restore critic, add test ( #5062 )
* Fix save/restore critic, add test
* Rename module for PPO
* Use correct policy in test
4 years ago
Chris Elion
970f1d40
Merge remote-tracking branch 'origin/v2-staging' into MLA-1634-ObservationSpec
4 years ago
Ervin Teng
1f026c70
Merge branch 'main' into develop-superpush-branch-cleanup
4 years ago
Ervin Teng
ce872033
Revert "Merge branch 'main' into develop-superpush-branch-cleanup"
This reverts commit 5bea802525381f931a5e0f8b8778fe27a12f03af, reversing
changes made to cee3524e85161e13689d95f66bc6bff994d2cdfd.
4 years ago
Andrew Cohen
9e77d7e1
Merge branch 'main' into develop-soccer-groupman
4 years ago
GitHub
62314056
Fix ghost curriculum and make steps private ( #5098 )
* use get step to determine curriculum
* add to CHANGELOG
* Make step in trainer private (#5099 )
Co-authored-by: Ervin T <ervin@unity3d.com>
4 years ago
Ervin Teng
54ffbed6
[cherry-pick] Fix ghost curriculum and make steps private ( #5098 )
* use get step to determine curriculum
* add to CHANGELOG
* Make step in trainer private (#5099 )
Co-authored-by: Ervin T <ervin@unity3d.com>
4 years ago
Andrew Cohen
9176247c
Merge branch 'main' into develop-soccer-groupman-mod
4 years ago
GitHub
e81e038b
Fix end episode for POCA, add warning for group reward if not POCA ( #5113 )
* Fix end episode for POCA, add warning for group reward if not POCA
* Add missing imports
4 years ago
GitHub
63169e2c
[cherry-pick] Fix group rewards for POCA, add warning for non-POCA trainers ( #5120 )
* Fix end episode for POCA, add warning for group reward if not POCA (#5113 )
* Fix end episode for POCA, add warning for group reward if not POCA
* Add missing imports
* Use np.any, which is faster
4 years ago
Ervin Teng
d1c24251
[bug-fix] When agent isn't training, don't clear update buffer ( #5205 )
* Don't clear update buffer, but don't append to it either
* Update changelog
* Address comments
* Make experience replay buffer saving more verbose
(cherry picked from commit 63e7ad44d96b7663b91f005ca1d88f4f3b11dd2a)
4 years ago
Ervin Teng
9e2e2626
[bug-fix] Use correct memories for LSTM SAC ( #5228 )
* Use correct memories for LSTM SAC
* Add some comments
(cherry picked from commit 707730256a6797336ba749f05f7dbf10dadd8126)
4 years ago
Andrew Cohen
18be47e8
Merge branch 'main' into develop-soccer-groupman-mod
4 years ago
GitHub
ff21216d
[bug-fix] When agent isn't training, don't clear update buffer ( #5205 )
* Don't clear update buffer, but don't append to it either
* Update changelog
* Address comments
* Make experience replay buffer saving more verbose
4 years ago
GitHub
2e19759c
Turning some logger.info into logger.debug and remove some logging overhead when not using debug ( #5211 )
* turning some logger.info into logger.debug and remove some logging overhead when not using debug
* Addressing comments
* Adding to changelog
4 years ago
GitHub
6d1b3a64
[bug-fix] Use correct memories for LSTM SAC ( #5228 )
* Use correct memories for LSTM SAC
* Add some comments
4 years ago
vincentpierre
bab3ecb7
First version of MEDE, crawler does not seem to work properly, I suspect the actions make it distinguishable to the discriminator but not to the human eye
4 years ago
Andrew Cohen
d813bfd5
continuous, crawler integrated, new cube
4 years ago
vincentpierre
8da21669
Adding some changes
4 years ago
Andrew Cohen
3e642140
use discrete div
4 years ago
Andrew Cohen
bcee3bf5
no entropy loss
4 years ago
vincentpierre
7c74c967
_
4 years ago
vincentpierre
b4f30613
Adding a variational version
4 years ago
vincentpierre
8450b154
-
4 years ago
vincentpierre
5985959d
Got 2 modes on Walker I think
4 years ago
vincentpierre
4bde393e
Got the walker to walk different based on diversity setting
4 years ago
GitHub
fc6e8c35
[ 🐛 🔨 ] Fix sac target for continuous actions ( #5372 )
* Fix of the target entropy for continuous SAC
* Lowering required steps of test and remove unnecessary unsqueeze
* Changing the target from -dim(a)^2 to -dim(a) by removing implicit broadcasting
4 years ago
vincentpierre
8cdbc17f
modifying SAC to make \alpha converge faster
4 years ago
vincentpierre
983982ee
Removing misleading learning rate
3 years ago