Arthur Juliani
dc50162d
Add ResNet and distributions
5 years ago
Arthur Juliani
e166d018
Dynamically construct actor and critic
5 years ago
Arthur Juliani
7c3bd376
Refactoring policy and optimizer
5 years ago
Arthur Juliani
2e51260a
Resolving a few bugs
5 years ago
Arthur Juliani
b997f214
Share more code between tf and torch policies
5 years ago
Arthur Juliani
947f0d32
Slightly closer to running model
5 years ago
Arthur Juliani
3c82bf59
Training runs, but doesn’t actually work
5 years ago
Arthur Juliani
8c6f4696
Fix a couple additional bugs
5 years ago
Arthur Juliani
4a50444f
Support discrete actions as well
5 years ago
Arthur Juliani
a11a79e4
Continuous and discrete now train
5 years ago
Arthur Juliani
82688e5c
GRU in-progress and dynamic cnns
5 years ago
Arthur Juliani
1736559f
Combine actor and critic classes. Initial export.
5 years ago
Arthur Juliani
9835d26c
Prepare model for onnx export
5 years ago
Arthur Juliani
b7be7f04
Fix bug in probs calculation
5 years ago
Arthur Juliani
3eef9d78
Optimize np -> tensor operations
5 years ago
Arthur Juliani
c02e75d6
Time action sample function
5 years ago
Arthur Juliani
039f545a
Small performance improvement during inference
5 years ago
Ervin Teng
565f92ef
Seems to speed it up
5 years ago
Ervin Teng
2fae31e6
Remove another if statement
5 years ago
Ervin Teng
72180f9b
Experiment with JIT compiler
5 years ago
Ervin Teng
f214836a
Changes for speed test
4 years ago
Arthur Juliani
9724c9ac
Merge master
4 years ago
Arthur Juliani
46874cc7
ONNX exporting
4 years ago
Arthur Juliani
5d33aca7
Remove double setting
4 years ago
GitHub
0d80d87a
Fix for discrete actions (#4181)
4 years ago
Ervin Teng
68169434
Fix discrete actions and GridWorld
4 years ago
GitHub
05a11c96
Develop add fire exp framework (#4213)
* Experiment branch for comparing torch
* Updates and merging ervin changes
* improvements on experiment_torch.py
* Better printing of results
* preliminary gpu experiment
* Testing gpu
* Prepare to see a lot of commits, because I like my IDE and I am testing on a server and I am using git to sync the two
* Prepare to see a lot of commits, because I like my IDE and I am testing on a server and I am using git to sync the two
* _
* _
* _
* _
* _
* _
* _
* _
* Attempt at gpu on tf. Does not work
* _
* _
* _
* _
* _
* _
* _
* _
* _
* _
* _
* Fixing learn.py
4 years ago
GitHub
45154f52
Pytorch port of SAC (#4219)
4 years ago
GitHub
a28e2767
Update add-fire to latest master, including Policy refactor (#4263)
* Update Dockerfile
* Separate send environment data from reset (#4128)
* Fixed a typo on ML-Agents-Overview.md (#4130)
Fixed redundant "to" word from the sentence since it is probably a typo in document.
* Updated the badge’s link to point to the newest doc version
* Replaced all of the doc to release_3_doc
* Fix 3DBall and 3DBallHard SAC regressions (#4132)
* Move memory validation to settings
* Update docs
* Add settings test
* Update to release_3 in installation.md (#4144)
* rename to SideChannelManager +backcompat (#4137)
* Remove comment about logo with --help (#4148)
* [bugfix] Make FoodCollector heuristic playable (#4147)
* Make FoodCollector heuristic playable
* Update changelog
* script to check for old release links and references (#4153)
* Remove package validation suite from Project (#4146)
* RayPerceptionSensor: handle empty and invalid tags (#4155...
4 years ago
GitHub
69579611
[refactor] Refactor Actor and Critic classes (#4287)
4 years ago
Ruo-Ping Dong
6feec58a
add Saver class (only TF working)
4 years ago
Ervin Teng
bd97532d
Add normalizer update context
4 years ago
Ruo-Ping Dong
9449d711
fix onnx save path and output_name
4 years ago
Ruo-Ping Dong
6d67f857
move tf and add torch model serialization
4 years ago
Ruo-Ping Dong
01e60921
add sac checkpoint
4 years ago
Ruo-Ping Dong
4e87b422
move checkpoint_path logic to saver
4 years ago
Ervin Teng
884c97ce
Fix policy memory storing
4 years ago
Ruo-Ping Dong
71fe4df6
fix formatting and test
4 years ago
Ruo-Ping Dong
b4713baa
small improvements
4 years ago
Ruo-Ping Dong
79d89158
Merge branch 'develop-add-fire' into develop-add-fire-checkpoint
4 years ago
Ruo-Ping Dong
e06812aa
fix tests
4 years ago
Ruo-Ping Dong
59cc1a9f
Merge branch 'develop-add-fire' into develop-add-fire-checkpoint
4 years ago
Ervin Teng
cb0085a7
Memory size abstraction and fixes
4 years ago
Ervin Teng
d65a9326
Merge branch 'master' into develop-add-fire-mm3
4 years ago
Ruo-Ping Dong
d57aa9ab
Merge branch 'develop-add-fire-mm3' into develop-add-fire-checkpoint
4 years ago
Ervin Teng
42e25b25
Merge branch 'develop-add-fire' into develop-add-fire-memoryclass
4 years ago
GitHub
8985a040
Removing the experiment script from add fire (#4373)
* Removing the experiment script
* Removing the script
4 years ago
Andrew Cohen
b822283f
merge add fire
4 years ago
Ervin Teng
6e946dba
Policy bugfixes and policy tests
4 years ago
Ervin Teng
9ae22c61
Fix SeparateActorCritic export
4 years ago
GitHub
03eac72c
[add-fire] Add tests and fix issues with Policy (#4372)
4 years ago
Andrew Cohen
a65d08c7
ghost trainer tests
4 years ago
Ervin Teng
116303f1
Typing for torch policy
4 years ago
GitHub
49545ce1
Pytorch ghost trainer (#4370)
4 years ago
GitHub
6a1d993f
[add-fire] Memory class abstraction (#4375)
4 years ago
Ervin Teng
a04e68a4
Merge branch 'develop-add-fire' into develop-add-fire-memoryclass
4 years ago
Andrew Cohen
effdec13
return copy of state_dict
4 years ago
vincentpierre
108fac9a
Replace torch.detach().cpu().numpy() with a utils method
4 years ago
Ruo-Ping Dong
27fb4270
brain_name to behavior_name
4 years ago
Ruo-Ping Dong
f5dee9d1
jit for continuous control
4 years ago
GitHub
6f534366
Add torch_utils class, auto-detect CUDA availability (#4403)
* Add torch_utils
* Use torch from torch_utils
* Add torch to banned modules in CI
* Better import error handling
* Fix flake8 errors
* Address comments
* Move networks to GPU if enabled
* Switch to torch_utils
* More flake8 problems
* Move reward providers to GPU/CPU
* Remove another set default tensor
* Fix banned import in test
4 years ago
Ervin Teng
fdc887a1
Some experimental stuff
4 years ago
Ervin Teng
f59f35ea
Remove stuff in policy
4 years ago
Ervin Teng
3e771cbb
Permute visual obs outside of network
4 years ago
Ervin Teng
77c810fb
Fix SAC and make utility method
4 years ago
Ervin Teng
7754ad7b
Don't run value during inference
4 years ago
Ervin Teng
b6095151
Execute critic with LSTM
4 years ago
GitHub
4e4ad7b0
Don't run value during policy evaluate, optimized soft update function (#4501)
* Don't run value during inference
* Execute critic with LSTM
* Address comments
* Unformat
* Optimized soft update
* Move soft update to model utils
* Add test for soft update
4 years ago
Andrew Cohen
643c8e58
ppo extended
4 years ago
Andrew Cohen
db37db34
fixing errors
4 years ago
Andrew Cohen
44c9879e
action models
4 years ago
Andrew Cohen
c494bfcc
trains successfully
4 years ago
Andrew Cohen
190d8e4d
action model as a singleton
4 years ago
Ervin Teng
8dec4771
Add hybrid actions to SAC
4 years ago
Ervin Teng
be159ad3
Make entropy reporting same as TF
4 years ago
Andrew Cohen
e5f14400
Merge branch 'master' into develop-hybrid-actions-singleton
4 years ago
Andrew Cohen
eaecb59e
torch utils to and from buffer
4 years ago
Andrew Cohen
8013e544
ignoring Instance of 'AbstractContextManager' has no 'enter_context' member (no-member)
4 years ago
GitHub
e0ef30a5
[bug-fix] Change entropy computation and loss reporting in Torch to match TF (#4538)
* Proper dimensions for entropy, sum before bonus in PPO
* Make entropy reporting same as TF
* Always use separate critic
* Revert to shared
* Remove unneeded extra line
* Change entropy shape in test
* Change another entropy shape
* Add entropy summing to evaluate_actions
* Add notes about torch.abs(policy_loss)
4 years ago
GitHub
cb8e4d25
Add ActionSpec (#4586)
Co-authored-by: Ervin T <ervin@unity3d.com>
4 years ago
GitHub
b853e5ba
Action buffer (#4612)
Co-authored-by: Ervin T <ervin@unity3d.com>
Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>
4 years ago
GitHub
3c96a3a2
Action Model (#4580)
Co-authored-by: Ervin T <ervin@unity3d.com>
Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>
4 years ago
GitHub
88d3ec3e
Merge master into hybrid actions staging branch (#4704)
4 years ago
GitHub
87a7ccf8
use int64 steps, check for NaN actions (#4607)
* use int64 steps
* check for NaN actions
Co-authored-by: Ruo-Ping Dong <ruoping.dong@unity3d.com>
4 years ago
GitHub
85a7c0f7
[bug-fix] Add clipping to PyTorch policy, fix initialization (#4649)
4 years ago
Ervin Teng
0548057d
Use real clipping (as in TF)
4 years ago
GitHub
8175d558
[bug-fix] Fix BC module + action clipping (#4667)
4 years ago
Ervin Teng
78f88c15
Add clip to export and make optional in policy
4 years ago
Andrew Cohen
3f771e61
add ActionBuffers and utils
4 years ago
Ervin Teng
7a0ebfbd
Pretty broken
4 years ago
Ervin Teng
95bdbba3
Less broken PPO
4 years ago
Ervin Teng
98948c59
Skip critic when given empty memory array
4 years ago
Ervin Teng
4158629e
Properly feed in None rather than empty arrays
4 years ago
Andrew Cohen
bd917c9c
action buffer passes continuous
4 years ago
Andrew Cohen
b36fcf16
discrete runs/cont passes
4 years ago
Andrew Cohen
ad951493
debugging discrete
4 years ago
Andrew Cohen
fcf6471e
2d discrete passes
4 years ago
vincentpierre
735fcd52
[WIP] Refactor trainers to use list of obs rather than vec and vis obs
4 years ago
Ervin Teng
6846af21
Multi-input network
4 years ago
Andrew Cohen
85e4db33
bc tests pass
4 years ago
vincentpierre
93ca1409
fixing the tests
4 years ago
Ervin Teng
cb4b7ed3
Some minor tweaks but still broken
4 years ago
vincentpierre
12619155
added some docstrings
4 years ago
vincentpierre
c1587bce
Solving merge conflicts
4 years ago
GitHub
8ab2e619
update type of evaluate_actions to list tensor (#4747)
4 years ago
GitHub
a0d1c829
Action Docs part2 (#4739)
* reduce usage of "vector action" and "action space"
* more cleanup
* undo GettingStarted change for now
* batch size description
* Apply suggestions from code review
Co-authored-by: andrewcoh <54679309+andrewcoh@users.noreply.github.com>
Co-authored-by: andrewcoh <54679309+andrewcoh@users.noreply.github.com>
4 years ago
GitHub
cc6b4564
Multi Directional Walker and Initial Hypernetwork (#4740)
4 years ago
Ervin Teng
25dfd883
Merge branch 'master' into develop-centralizedcritic
4 years ago
GitHub
ad5f878c
[refactor] Remove critic pass during inference (#4743)
4 years ago
GitHub
22658a40
use sensor types to differentiate obs (#4749)
4 years ago
vincentpierre
14378aa5
Merging master
4 years ago
vincentpierre
0c81006d
addressing comments
4 years ago
vincentpierre
8cb050ef
WIP Made initial changes to enable dimension properties and added attention module
4 years ago
Andrew Cohen
498b1ee6
Merge branch 'develop-action-buffer' into develop-hybrid-actions-singleton
4 years ago
Andrew Cohen
6174c428
move action model to explicit distributions
4 years ago
Andrew Cohen
1d234d1d
bc works
4 years ago
vincentpierre
719c969c
addressing comments. ObservationSpec is no longer a list
4 years ago
vincentpierre
4bba4e8e
Renaming ObservationSpec to SensorSpec
4 years ago
Andrew Cohen
e81e68de
comms agent and fixed hallway
4 years ago
vincentpierre
44ed3258
Merging master
4 years ago
vincentpierre
449712b0
renaming sensor_spec to sensor_specS
4 years ago
Andrew Cohen
35769b53
Merge branch 'develop-action-buffer' into develop-hybrid-actions-singleton
4 years ago
Andrew Cohen
17496265
move AgentAction, ActionLogProbs, and ActionFlattener to separate files
4 years ago
vincentpierre
36cc4665
Removing some vis and vec fields from policy.py
4 years ago
Ervin Teng
330fc1d0
Merge branch 'master' into develop-centralizedcritic-mm
4 years ago
Andrew Cohen
60309d8f
fix torch policy tests
4 years ago
vincentpierre
519c5f47
merging master
4 years ago
Andrew Cohen
7ba10239
remove action spec attribute from policy
4 years ago
GitHub
7387a77f
remove pylint (#4836)
* remove pylint
* remove other pylint disables
4 years ago
Arthur Juliani
0b4b0992
Rename more files
4 years ago
Ervin Teng
aba633b2
Merge branch 'develop-attention-refactor' into develop-centralizedcritic-mm
4 years ago
Arthur Juliani
0a876b9c
Fix typos
4 years ago
Arthur Juliani
e3de0406
Plurals
4 years ago
GitHub
67ad9651
Merge pull request #4825 from Unity-Technologies/sensor-types
[WIP] Observation Types
4 years ago
Ervin Teng
457b2630
I think it's running
4 years ago
Andrew Cohen
6e1826f8
might be right
4 years ago
vincentpierre
52b011d6
_
4 years ago
Andrew Cohen
a4c336c2
value estimator
4 years ago
Andrew Cohen
9af22d30
use only value funcs
4 years ago
Ervin Teng
3283b6a1
Remove Q-net for perf
4 years ago
Ervin Teng
b6f88d6d
Merge branch 'develop-base-teammanager' into develop-agentprocessor-teammanager
4 years ago
Andrew Cohen
f73b9dba
update policy to not use critic
4 years ago
Andrew Cohen
9b92f5fb
remove commented code
4 years ago
Andrew Cohen
c74dca9f
add SharedActorCritic
4 years ago
Andrew Cohen
00b891df
fix sac shared
4 years ago
Ervin Teng
e46a86ad
Merge branch 'master' into develop-superpush-int
4 years ago