GitHub
e4177de0
[change] Organize trainer files a bit better (#3538)
5 years ago
Andrew Cohen
573b1f6d
Merge branch 'master' into soccer-fives
5 years ago
Anupam Bhatnagar
07b15ae7
[skip-ci] small refactors
5 years ago
Andrew Cohen
ac261e36
Merge branch 'master' into self-play-mutex
5 years ago
Anupam Bhatnagar
9341f7a2
[skip-ci] small refactors
5 years ago
Anupam Bhatnagar
06a54ae8
step increment moved to _update_policy, fixed exit status issue
5 years ago
Anupam Bhatnagar
5d180caf
[skip ci] modify learning rate in horovod optimizer
5 years ago
Anupam Bhatnagar
b3c2d431
[skip ci] minor formatting change
5 years ago
Arthur Juliani
6879bae4
Initial optimizer port
5 years ago
Arthur Juliani
7c3bd376
Refactoring policy and optimizer
5 years ago
Arthur Juliani
2e51260a
Resolving a few bugs
5 years ago
Arthur Juliani
947f0d32
Slightly closer to running model
5 years ago
Arthur Juliani
3c82bf59
Training runs, but doesn’t actually work
5 years ago
Arthur Juliani
8c6f4696
Fix a couple additional bugs
5 years ago
Arthur Juliani
61d671d8
Add conditional sigma for distribution
5 years ago
Arthur Juliani
a5b5b109
Mulkti-discrete now working
5 years ago
Arthur Juliani
5f936990
Visual observations now train as well
5 years ago
Arthur Juliani
1736559f
Combine actor and critic classes. Initial export.
5 years ago
Arthur Juliani
ca887743
Support tf and pytorch alongside one another
5 years ago
Arthur Juliani
be7e55e1
Use LSTM and fix a few merge errors
5 years ago
Arthur Juliani
3eef9d78
Optimize np -> tensor operations
5 years ago
Ervin Teng
72180f9b
Experiment with JIT compiler
5 years ago
GitHub
e92b4f88
[refactor] Structure configuration files into classes (#3936)
5 years ago
Anupam Bhatnagar
4afd8f92
first commit
5 years ago
Arthur Juliani
9724c9ac
Merge master
5 years ago
Anupam Bhatnagar
24d5f881
first commit
5 years ago
GitHub
cde8bd29
Convert List[np.ndarray] to np.ndarray before using torch.as_tensor (#4183)
Big speedup in visual obs
4 years ago
GitHub
05a11c96
Develop add fire exp framework (#4213)
* Experiment branch for comparing torch
* Updates and merging ervin changes
* improvements on experiment_torch.py
* Better printing of results
* preliminary gpu experiment
* Testing gpu
* Prepare to see a lot of commits, because I like my IDE and I am testing on a server and I am using git to sync the two
* Prepare to see a lot of commits, because I like my IDE and I am testing on a server and I am using git to sync the two
* _
* _
* _
* _
* _
* _
* _
* _
* Attempt at gpu on tf. Does not work
* _
* _
* _
* _
* _
* _
* _
* _
* _
* _
* _
* Fixing learn.py
4 years ago
GitHub
a28e2767
Update add-fire to latest master, including Policy refactor (#4263)
* Update Dockerfile
* Separate send environment data from reset (#4128)
* Fixed a typo on ML-Agents-Overview.md (#4130)
Fixed redundant "to" word from the sentence since it is probably a typo in document.
* Updated the badge’s link to point to the newest doc version
* Replaced all of the doc to release_3_doc
* Fix 3DBall and 3DBallHard SAC regressions (#4132)
* Move memory validation to settings
* Update docs
* Add settings test
* Update to release_3 in installation.md (#4144)
* rename to SideChannelManager +backcompat (#4137)
* Remove comment about logo with --help (#4148)
* [bugfix] Make FoodCollector heuristic playable (#4147)
* Make FoodCollector heuristic playable
* Update changelog
* script to check for old release links and references (#4153)
* Remove package validation suite from Project (#4146)
* RayPerceptionSensor: handle empty and invalid tags (#4155...
4 years ago
GitHub
69579611
[refactor] Refactor Actor and Critic classes (#4287)
4 years ago
Andrew Cohen
ccb492dc
ignore precommit/first bc commit
4 years ago
Andrew Cohen
84ea84a6
bc loss for both continuous and disc
4 years ago
Andrew Cohen
f74d301a
Merge branch 'develop-add-fire' into develop-add-fire-bc
4 years ago
Andrew Cohen
22a0cabc
changed path to torch bc module
4 years ago
vincentpierre
599d7e9f
Merging master
5 years ago
GitHub
7ddfd81f
Added Reward Providers for Torch (#4280)
* Added Reward Providers for Torch
* Use NetworkBody to encode state in the reward providers
* Integrating the reward prodiders with ppo and torch
* work in progress, integration with PPO. Not training properly Pyramids at the moment
* Integration in PPO
* Removing duplicate file
* Gail and Curiosity working
* addressing comments
* Enfore float32 for tests
* enfore np.float32 in buffer
4 years ago
Ruo-Ping Dong
79d89158
Merge branch 'develop-add-fire' into develop-add-fire-checkpoint
4 years ago
Andrew Cohen
d8c123a0
Merge branch 'master' into sensitivity
4 years ago
Andrew Cohen
02df39ab
ignore precommit
4 years ago
Andrew Cohen
fa35292c
write hist to tb
4 years ago
GitHub
beb5aca5
[refactor] Make classes except Optimizer framework agnostic (#4268)
4 years ago
Andrew Cohen
06e4356c
Merge branch 'master' into sensitivity
4 years ago
Arthur Juliani
1a123641
Merge remote-tracking branch 'origin/master' into r5-master
4 years ago
GitHub
3f44a0bc
cleanup around AdamOptimizer (#4333)
* cleanup around AdamOptimizer
* methods to creat Optimizer instances
4 years ago
Andrew Cohen
598826fe
Merge branch 'develop-add-fire' into develop-add-fire-bc
4 years ago
Ruo-Ping Dong
d3eb6c46
Merge branch 'develop-add-fire' into develop-add-fire-checkpoint
4 years ago
Anupam Bhatnagar
a5cc4d03
Merge branch 'master' into global-variables
4 years ago
GitHub
6b255790
Behavioral Cloning Pytorch (#4293)
4 years ago
GitHub
f374f87a
[add-fire] Add LSTM to SAC, LSTM fixes and initializations (#4324)
4 years ago
Andrew Cohen
0a7444f9
revert bc default batch/epoch
4 years ago
HH
8eaddb61
Merge branch 'master' into hh/develop/loco-walker-variable-speed
4 years ago
Ruo-Ping Dong
59cc1a9f
Merge branch 'develop-add-fire' into develop-add-fire-checkpoint
4 years ago
Ervin Teng
f4da3592
Add memories and sequence length to critic_pass
4 years ago
Ervin Teng
13f15086
Merge branch 'develop-add-fire' into develop-add-fire-amrl
4 years ago
Ervin Teng
fa0d3cb6
Fix next_obs in get_trajectory_value_estimates
4 years ago
Ervin Teng
d65a9326
Merge branch 'master' into develop-add-fire-mm3
4 years ago
Ruo-Ping Dong
d57aa9ab
Merge branch 'develop-add-fire-mm3' into develop-add-fire-checkpoint
4 years ago
GitHub
bd6bcd2f
Merge master and add Saver class for save/load checkpoints
4 years ago
Ervin Teng
d218bf4d
Merge branch 'develop-add-fire' into develop-add-fire-sac-lst
4 years ago
Ervin Teng
42e25b25
Merge branch 'develop-add-fire' into develop-add-fire-memoryclass
4 years ago
Christopher Goy
5a233353
Merge remote-tracking branch 'origin/master' into release_6-to-master
4 years ago
GitHub
1955af9e
[feature] Add experimental PyTorch support (#4335)
* Begin porting work
* Add ResNet and distributions
* Dynamically construct actor and critic
* Initial optimizer port
* Refactoring policy and optimizer
* Resolving a few bugs
* Share more code between tf and torch policies
* Slightly closer to running model
* Training runs, but doesn’t actually work
* Fix a couple additional bugs
* Add conditional sigma for distribution
* Fix normalization
* Support discrete actions as well
* Continuous and discrete now train
* Mulkti-discrete now working
* Visual observations now train as well
* GRU in-progress and dynamic cnns
* Fix for memories
* Remove unused arg
* Combine actor and critic classes. Initial export.
* Support tf and pytorch alongside one another
* Prepare model for onnx export
* Use LSTM and fix a few merge errors
* Fix bug in probs calculation
* Optimize np -> tensor operations
* Time action sample funct...
4 years ago
vincentpierre
108fac9a
Replace torch.detach().cpu().numpy() with a utils method
4 years ago
HH
d9962254
Merge branch 'master' into hh/develop/loco-walker-variable-speed
4 years ago
Anupam Bhatnagar
f4f1a8d9
merge master into trainer-plugin branch
4 years ago
GitHub
498934f9
Replace torch.detach().cpu().numpy() with a utils method (#4406)
* Replace torch.detach().cpu().numpy() with a utils method
* Using item() in place of to_numpy()
* more use of item() and additional tests
4 years ago
Ruo-Ping Dong
fd1dc3a6
Merge branch 'master' into develop-torch-omp
4 years ago
GitHub
4e93cb6e
[torch] Restructure PyTorch encoders (#4421)
* Move linear encoding to NetworkBody
* moved encoders to processors (#4420)
* fix bad merge
* Get it running
* Replace mentions of visual_encoders
* Remove output_size property
* Fix tests
* Fix some references
* Revert test_simple_rl
* Fix networks test
* Make curiosity test more accomodating
* Rename total_input_size
* [Bug fix] Fix bug in GAIL gradient penalty (#4425) (#4426)
Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>
* Up number of steps
* Rename to visual_processors and vector_processors
Co-authored-by: andrewcoh <54679309+andrewcoh@users.noreply.github.com>
Co-authored-by: Andrew Cohen <andrew.cohen@unity3d.com>
Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>
4 years ago
GitHub
6f534366
Add torch_utils class, auto-detect CUDA availability (#4403)
* Add torch_utils
* Use torch from torch_utils
* Add torch to banned modules in CI
* Better import error handling
* Fix flake8 errors
* Address comments
* Move networks to GPU if enabled
* Switch to torch_utils
* More flake8 problems
* Move reward providers to GPU/CPU
* Remove anothere set default tensor
* Fix banned import in test
4 years ago
Andrew Cohen
3997b14b
Merge branch 'master' into develop-hybrid-actions
4 years ago
Ervin Teng
3e771cbb
Permute visual obs outside of network
4 years ago
Ervin Teng
77c810fb
Fix SAC and make utility method
4 years ago
GitHub
c188781b
[life improvement] Moving Python files around (#4531)
* Moved components to the tf folder and moved the TrainerFactory to the `trainer` folder
* Addressing comments
* Editing the migrating doc
* fixing test
4 years ago
Andrew Cohen
e5f14400
Merge branch 'master' into develop-hybrid-actions-singleton
4 years ago
vincentpierre
d3d4eb90
Trainer with attention
4 years ago
GitHub
b853e5ba
Action buffer (#4612)
Co-authored-by: Ervin T <ervin@unity3d.com>
Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>
4 years ago
Ervin Teng
95bdbba3
Less broken PPO
4 years ago
vincentpierre
b863af57
Removing TensorFlow Trainers
4 years ago
Ervin Teng
5a5bd515
Fix multiple obs
4 years ago
Ervin Teng
fdaa8c3d
Merge branch 'develop-unified-obs' into develop-centralizedcritic
4 years ago
GitHub
990f801a
Develop hybrid action staging (#4702)
Co-authored-by: Ervin T <ervin@unity3d.com>
Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>
Co-authored-by: Ruo-Ping Dong <ruoping.dong@unity3d.com>
Co-authored-by: Chris Elion <chris.elion@unity3d.com>
4 years ago
vincentpierre
735fcd52
[WIP] Refactor trainers to use list of obs rather than vec and vis obs
4 years ago
Ervin Teng
cb4b7ed3
Some minor tweaks but still broken
4 years ago
Ervin Teng
56dcd75a
Get next critic observations into value estimate
4 years ago
Andrew Cohen
4ebc6c44
ml-agents-envs pass
4 years ago
Arthur Juliani
0d2f8887
Merge remote-tracking branch 'origin/master' into goal-conditioning
# Conflicts:
# ml-agents-envs/mlagents_envs/base_env.py
# ml-agents-envs/mlagents_envs/rpc_utils.py
# ml-agents/mlagents/trainers/tests/mock_brain.py
# ml-agents/mlagents/trainers/tests/simple_test_envs.py
4 years ago
GitHub
cc6b4564
Multi Directional Walker and Initial Hypernetwork (#4740)
4 years ago
Ervin Teng
25dfd883
Merge branch 'master' into develop-centralizedcritic
4 years ago
GitHub
22658a40
use sensor types to differentiate obs (#4749)
4 years ago
Andrew Cohen
3c65b964
fixed recurrent prev_action issue
4 years ago
GitHub
903d3afe
Merge pull request #4707 from Unity-Technologies/develop-rm-tf
Removing TensorFlow Trainers
4 years ago
Andrew Cohen
498b1ee6
Merge branch 'develop-action-buffer' into develop-hybrid-actions-singleton
4 years ago
GitHub
29d94c7c
Merge pull request #4734 from Unity-Technologies/develop-obs-as-list
Refactor trainers to use list of obs rather than vec and vis obs
4 years ago
Andrew Cohen
c0d01baf
Merge branch 'master' into merge-release11-master
4 years ago
vincentpierre
44ed3258
Merging master
4 years ago
Andrew Cohen
3457cd3c
save only discrete actions as prev
4 years ago
vincentpierre
449712b0
renaming sensor_spec to sensor_specS
4 years ago
Andrew Cohen
35769b53
Merge branch 'develop-action-buffer' into develop-hybrid-actions-singleton
4 years ago
Chris Elion
76ebc20c
Merge remote-tracking branch 'origin/master' into r12-to-master
4 years ago
GitHub
458fee17
Merge pull request #4763 from Unity-Technologies/develop-att
WIP Made initial changes to enable dimension properties and added attention module
4 years ago
Ervin Teng
330fc1d0
Merge branch 'master' into develop-centralizedcritic-mm
4 years ago
vincentpierre
519c5f47
merging master
4 years ago
Ervin Teng
ad439fb6
Additional changes
4 years ago
Ervin Teng
d02a1033
Some more fixes
4 years ago
Ruo-Ping Dong
8ed14762
Merge branch 'develop-hybrid-actions-singleton' into develop-hybrid-actions-csharp
4 years ago
GitHub
7387a77f
remove pylint (#4836)
* remove pylint
* remove other pylint disables
4 years ago
Arthur Juliani
0b4b0992
Rename more files
4 years ago
Ervin Teng
aba633b2
Merge branch 'develop-attention-refactor' into develop-centralizedcritic-mm
4 years ago
Arthur Juliani
0a876b9c
Fix typos
4 years ago
Ruo-Ping Dong
180d3e20
Merge branch 'develop-centralizedcritic-mm' into develop-cc-teammanager
4 years ago
HH
0024a286
merge ervin's new stuff
4 years ago
Ervin Teng
9c3da1b6
New buffer layout, TeamObsUtil, pad dead agents
4 years ago
GitHub
67ad9651
Merge pull request #4825 from Unity-Technologies/sensor-types
[WIP] Observation Types
4 years ago
vincentpierre
8660b1c2
merging master
4 years ago
Ervin Teng
3daa17a9
Merge branch 'develop-centralizedcritic-mm' into develop-zombieteammanager
4 years ago
Ervin Teng
6b8b3db3
Try subtract marginalized value
4 years ago
Ervin Teng
092ea232
Some more progress - still broken
4 years ago
Ervin Teng
457b2630
I think it's running
4 years ago
brccabral
457fb612
Merge branch 'master' of https://github.com/Unity-Technologies/ml-agents
4 years ago
Andrew Cohen
6e1826f8
might be right
4 years ago
Andrew Cohen
1511588d
forcing this to work
4 years ago
Andrew Cohen
e1fad8a4
buffer error
4 years ago
Andrew Cohen
feb38012
add lambda return and target network
4 years ago
Andrew Cohen
5741f8f6
no target net
4 years ago
Andrew Cohen
a92baab6
add target network back
4 years ago
Andrew Cohen
a4c336c2
value estimator
4 years ago
Andrew Cohen
fce842aa
adding zombie to coma2 brnch
4 years ago
Andrew Cohen
7f491ae7
cloud run with coma2 of held out zombie test env
4 years ago
Andrew Cohen
9af22d30
use only value funcs
4 years ago
Andrew Cohen
95253b47
ntegrate teammate dones
4 years ago
Andrew Cohen
687f411b
try again on cloud
4 years ago
Andrew Cohen
f9ff3fef
shared baseline and v
4 years ago
Ervin Teng
3283b6a1
Remove Q-net for perf
4 years ago
Ervin Teng
b6f88d6d
Merge branch 'develop-base-teammanager' into develop-agentprocessor-teammanager
4 years ago
Andrew Cohen
6bd396ee
add critic to optimizer, ppo runs
4 years ago
Andrew Cohen
3aec18a1
fix precommit errors
4 years ago
Andrew Cohen
8efdeeb0
make critic a property
4 years ago
Ervin Teng
0bde7598
Back out trainer changes
4 years ago
Ervin Teng
514873bf
Use correct memories (t-1 instead of t) for training
4 years ago
Ervin Teng
f3a2a81f
Merge branch 'develop-fix-lstms' into develop-gru
4 years ago
Ervin Teng
219e773b
Merge branch 'develop-fix-lstms' into develop-critic-op-lstm
4 years ago
Ervin Teng
ae7643b8
Proper critic memories for PPO
4 years ago
Ervin Teng
2b0dd850
Still somewhat broken but cleaner
4 years ago
Ervin Teng
64839237
Fix indexing issue
4 years ago
Ervin Teng
21e9785a
Fix padding issues
4 years ago
Ervin Teng
8d834f0b
Fix more indexing bugs
4 years ago
Ervin Teng
4fc0f93e
Code cleanup
4 years ago
Ervin Teng
6a573ebf
Code cleanup
4 years ago
Ervin Teng
f3cec983
Append the right memories
4 years ago
Ervin Teng
a9666a0b
Don't pad when not needed
4 years ago
Ervin Teng
c2883f5b
Pad from back of trajectory
4 years ago
Ervin Teng
e46a86ad
Merge branch 'master' into develop-superpush-int
4 years ago
HH
15d512f9
Merge branch 'master' into hh/develop/dodgeball
4 years ago
GitHub
338af2ec
Move the Critic into the Optimizer (#4939)
Co-authored-by: Ervin Teng <ervin@unity3d.com>
4 years ago
HH
4c947151
Merge branch 'main' into hh/develop/dodgeball
4 years ago
Andrew Cohen
4b58527c
checkout ppo/optimizer from main
4 years ago
Ervin Teng
61781a1a
Merge branch 'main' into develop-agentprocessor-teammanager
4 years ago
GitHub
c1d19e89
Fix gpu pytests (#5019)
* Move tensors to cpu before converting it to numpy
4 years ago
Arthur Juliani
06c147f8
Merge remote-tracking branch 'origin/main' into goal-conditioning-new
# Conflicts:
# Project/Assets/ML-Agents/Examples/Crawler/Prefabs/CrawlerBase.prefab
# Project/Assets/ML-Agents/Examples/GridWorld/Prefabs/Area.prefab
# Project/Assets/ML-Agents/Examples/GridWorld/Scenes/GridWorld.unity
# Project/ProjectSettings/TagManager.asset
# com.unity.ml-agents/Runtime/Sensors/CameraSensor.cs
# com.unity.ml-agents/Runtime/Sensors/VectorSensor.cs
# ml-agents/mlagents/trainers/torch/networks.py
# ml-agents/mlagents/trainers/torch/utils.py
4 years ago
Ervin Teng
fd0dd35c
Merge branch 'main' into develop-coma2-trainer
4 years ago
Ervin Teng
c8137dcd
Merge branch 'main' into develop-superpush-int
4 years ago
Andrew Cohen
131fa328
inital evaluate_by_seq, does not run
4 years ago
Andrew Cohen
67beef88
finished evaluate_by_seq, does not run
4 years ago
Andrew Cohen
8f799687
ignoring precommit, grabbing baseline/critic mems from buffer in trainer
4 years ago
GitHub
f16ce486
Update v2-staging from main (March 15) (#5123)
4 years ago
Christopher Goy
921ba4f0
Update v2-staging from main (March 15) (#5123)
4 years ago
GitHub
ba2af269
[coma2] Make group extrinsic reward part of extrinsic (#5033)
* Make group extrinsic part of extrinsic
* Fix test and init
* Fix tests and bug
* Add baseline loss to TensorBoard
4 years ago
GitHub
d24b0966
[bug-fix] Fix memory leak when using LSTMs (#5048)
* Detach memory before storing
* Add test
* Evaluate with no_grad
4 years ago
Christopher Goy
ebe45056
Merge branch 'main' into release_14_branch-to-main
4 years ago
Chris Elion
970f1d40
Merge remote-tracking branch 'origin/v2-staging' into MLA-1634-ObservationSpec
4 years ago
Ervin Teng
1f026c70
Merge branch 'main' into develop-superpush-branch-cleanup
4 years ago
Ervin Teng
ce872033
Revert "Merge branch 'main' into develop-superpush-branch-cleanup"
This reverts commit 5bea802525381f931a5e0f8b8778fe27a12f03af, reversing
changes made to cee3524e85161e13689d95f66bc6bff994d2cdfd.
4 years ago
GitHub
8f35bdd3
POCA trainer (#5005)
Co-authored-by: Ervin Teng <ervin@unity3d.com>
Co-authored-by: Ruo-Ping Dong <ruoping.dong@unity3d.com>
Co-authored-by: Chris Elion <chris.elion@unity3d.com>
Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>
4 years ago
Andrew Cohen
9e77d7e1
Merge branch 'main' into develop-soccer-groupman
4 years ago
Ervin Teng
c108da4a
[bug-fix] Fix POCA LSTM, pad sequences in the back (#5206)
* Pad buffer at the end
* Fix padding in optimizer value estimate
* Fix additional bugs and POCA
* Fix groupmate obs, add tests
* Update changelog
* Improve tests
* Address comments
* Fix poca test
* Fix buffer test
* Increase entropy for Hallway
* Add EOF newline
* Fix Behavior Name
* Address comments
(cherry picked from commit 2ce6810846ba9268e4fb5fb082fa54e90414c980)
4 years ago
Ervin Teng
d461a66a
Fix padding in optimizer value estimate
4 years ago
Ervin Teng
81b74634
Fix additional bugs and POCA
4 years ago
Ervin Teng
9fd4a81e
Address comments
4 years ago
GitHub
c5589b59
[bug-fix] Fix POCA LSTM, pad sequences in the back (#5206)
* Pad buffer at the end
* Fix padding in optimizer value estimate
* Fix additional bugs and POCA
* Fix groupmate obs, add tests
* Update changelog
* Improve tests
* Address comments
* Fix poca test
* Fix buffer test
* Increase entropy for Hallway
* Add EOF newline
* Fix Behavior Name
* Address comments
4 years ago