708 commits (754492dc-3b47-41ad-9e2f-39a67d1c9de0)

Author SHA1 Message Date
GitHub 8317a659 Behavioral Cloning & Trainers Reorg (#328) 7 years ago
GitHub e11dae1d Python Testing & Image Inference Improvements (#353) 7 years ago
eshvk 030ac5c5 [cleanup] Add a new type hint to call a dictionary of BrainInfo objects as an AllBrainInfo. Propagate this hint to all methods. Some pep8 cleanups. 7 years ago
GitHub 9ad4182e Merge pull request #366 from Unity-Technologies/feature/cleanup 7 years ago
Arthur Juliani c3644f56 Buffer fix for properly masking gradients 7 years ago
GitHub f8d27dc5 Merge branch 'development-0.3' into feature/LSTM2 7 years ago
GitHub 2bba53b8 Merge pull request #367 from Unity-Technologies/feature/LSTM2 7 years ago
GitHub 99103b29 Use `curr_brain_info` 7 years ago
GitHub f134016b On Demand Decision (#308) 7 years ago
GitHub dcf58f75 Feature/previous text action (#375) 7 years ago
GitHub e0d5b1b0 Fix for when not using teacher helper (#379) 7 years ago
GitHub a7c9096f [Semantics] Modified the placeholder names (#381) 7 years ago
Vincent Gao 02df3b34 resolved conflicts 7 years ago
GitHub 5bdef358 [Fix] Must take mean of entropy to avoid errors what number of agents change during training (#407) 7 years ago
Marwan Mattar ba6911c3 Merge branch 'development-0.3' into dev-api-doc-academy 7 years ago
GitHub 848b8a58 Fix PPO regression (#434) 7 years ago
Joe Ward 9163a54a resolved merge conflict with dev-0.3 branch 7 years ago
vincentpierre e5a59e9b [Refactor] renamed is_continuous to is_continuous_action and added is_continuous_observation to decrease confusion 7 years ago
eshvk 2d2eb64b [containers] Enables container support for scenes that use visual observations 7 years ago
GitHub 74064891 Merge pull request #520 from Unity-Technologies/feature-trainer-ppo-is-continuous 7 years ago
GitHub e43c069e Merge pull request #547 from Unity-Technologies/develop-feature-docker-improvements 7 years ago
GitHub 237b41f9 Hotfix 0.3.0c (#618) 7 years ago
GitHub 78d411f6 Merge pull request #619 from Unity-Technologies/develop 7 years ago
GitHub 1a449e98 Hotfix 0.3.1b (#637) 7 years ago
vincentpierre 076c8744 Report means instead of totals for losses (#580) 7 years ago
GitHub b2675216 Hotfix 0.3.1b (#656) 7 years ago
GitHub 755be43e [Cold Fix] Making the episode length and mean reward more accurate for the first episode (#657) 7 years ago
GitHub 3b866e9f Use Clipped Gaussian (#649) 7 years ago
Arthur Juliani 9477eaa9 Develop fix cumulative reward (#725) 7 years ago
GitHub 702d98c6 [Fix] The summary writer is now implemented in the abtract trainer class. (#806) 7 years ago
GitHub c17937ef Curiosity Driven Exploration & Pyramids Environments (#739) 7 years ago
vincentpierre a22c0f65 [fixing encoding_size] 7 years ago
Arthur Juliani d7338050 Enable concurrent sessions 7 years ago
Arthur Juliani 5d402be9 Minor Optimizations (#836) 7 years ago
GitHub 8526dcfc Fix for visual observations (#847) 7 years ago
GitHub 0f65e272 [Addresses #842] (#849) 7 years ago
GitHub 47fc38ab Additional Tests & Bug Fixes (#854) 7 years ago
GitHub 6e6e8d96 Fix for CC models w/ RNN and Curiosity (#860) 7 years ago
GitHub b5722dc9 Fix for visual observation w/ curiosity (#873) 7 years ago
vincentpierre 4c6439d5 [Attempted fix] 7 years ago
GitHub 6df07946 Fix for Discrete observations + Curiosity (#866) 7 years ago
GitHub 68d6170f Error message when using ODD and Curiosity (#883) 7 years ago
GitHub bf858cd6 Merge pull request #884 from Unity-Technologies/release-v0.4 7 years ago
GitHub 4b3c6c9f Merge pull request #885 from Unity-Technologies/release-v0.4 7 years ago
Arthur Juliani 5e48766d Remove discrete observations 7 years ago
Arthur Juliani b46b8708 Rename function 7 years ago
GitHub b6fe0bca Merge pull request #906 from Unity-Technologies/develop-no-discrete-obs 7 years ago
Arthur Juliani 195ac934 Merge branch 'develop' into develop-runs 7 years ago
vincentpierre e47cec56 [Initial Commit] 7 years ago
unityjeffrey 0d67f311 changed ml agents to ml-agents 7 years ago
unityjeffrey 19fb437a changed to Unity ML-Agents Toolkit (english) 7 years ago
GitHub 7b9a2905 Merge pull request #916 from Unity-Technologies/hotfix-trademarkupdate 7 years ago
Arthur Juliani 9701c3db Merge branch 'hotfix-0' into release-v0.4-fix-curiosity-odd 7 years ago
Arthur Juliani 0c6411c2 Use switch between old and new behavior 7 years ago
Arthur Juliani 1bfbf67a Simplify approach 7 years ago
Arthur Juliani cfb7cfef Code clean-up 7 years ago
Arthur Juliani 083cbff5 Add to docstring 7 years ago
Arthur Juliani c31f63b5 Fix typo 7 years ago
GitHub 3b5af6b2 Merge pull request #937 from Unity-Technologies/release-v0.4-fix-curiosity-odd 7 years ago
GitHub f155d661 Merge pull request #908 from Unity-Technologies/hotfix-0 7 years ago
GitHub e50ac7ae Merge branch 'develop' into hotfix-0 7 years ago
GitHub b36e6a2e Merge pull request #946 from Unity-Technologies/hotfix-0 7 years ago
Deric Pang 8380f2f2 Moved curriculum code out of environment code. 6 years ago
Deric Pang ae944381 Removing print statements. 6 years ago
Deric Pang 798c8bf9 Removing print statements. 6 years ago
GitHub 2d715dc5 Revert "Release v0.5 (#1202)" (#1221) 6 years ago
GitHub 4e73f770 Merge branch 'develop' into hotfix-0.4b 6 years ago
Arthur Juliani 1eb701af Merge remote-tracking branch 'origin/develop' into develop-value-estimates-ppo 6 years ago
Arthur Juliani f52d5a92 Merge remote-tracking branch 'origin/develop' into develop-runs 6 years ago
GitHub 1e21c143 Merge pull request #934 from Unity-Technologies/develop-value-estimates-ppo 6 years ago
GitHub ef3025e6 Merge pull request #1004 from Unity-Technologies/develop-runs 6 years ago
GitHub 7d0990cf Fix MultiBrain bug that was introduced with the value estimates (#1018) 6 years ago
Arthur Juliani 52865022 [Fix bug 1040] (#1062) 6 years ago
Deric Pang 6eba6940 Merge remote-tracking branch 'upstream/develop' into develop-trainer-controller-cleanup 6 years ago
Arthur Juliani 3659bbcd Develop multi discrete (#1022) 6 years ago
Arthur Juliani fee02a84 Attempted fix for #1059 (#1089) 6 years ago
Deric Pang 634280a6 Fixed imports, all tests are passing. 6 years ago
Arthur Juliani 17224292 Fix for Curiosity with ODD (#1107) 6 years ago
GitHub ded0d8c7 Develop action masking (#1080) 6 years ago
Deric Pang e55b1764 Merge remote-tracking branch 'upstream/develop' into develop-flat-code-restructure 6 years ago
Deric Pang e0e02ae6 Merge remote-tracking branch 'upstream/develop' into develop-flat-code-restructure 6 years ago
Deric Pang cdb41480 Merge remote-tracking branch 'upstream/develop' into develop-flat-code-restructure 6 years ago
GitHub 3900ed66 Merge pull request #1083 from Unity-Technologies/develop-flat-code-restructure 6 years ago
GitHub fbf92810 Refactor Trainers to use Policy (#1098) 6 years ago
GitHub 10d2a19d Release v0.5 (Develop) (#1203) 6 years ago
GitHub f8df71a0 Revert "Release v0.5 (Develop) (#1203)" (#1222) 6 years ago
GitHub ab5c49e8 Release v0.5 delete unityagents (#1151) 6 years ago
GitHub 2d4b4209 Use single scope declaration for models (#1160) 6 years ago
GitHub 29084e77 Curriculum learning reward thresholding bug fix (#1141) 6 years ago
GitHub 25495874 Merge pull request #1223 from Unity-Technologies/release-v0.5 6 years ago
GitHub 560f1bd7 Merge pull request #1224 from Unity-Technologies/release-v0.5 6 years ago
GitHub d2c320dd Remove graph scope (#1205) 6 years ago
GitHub 3c9603d6 Demonstration Recorder (#1240) 6 years ago
GitHub 840417ff Use organized tags for tensorboard stats (#1248) 6 years ago
GitHub 6c354d16 New Learning Brain (#1303) 6 years ago
vincentpierre 1045b6e7 Fix continuous curriosity 6 years ago
GitHub 547f0e98 Merge pull request #1361 from Unity-Technologies/release-v0.6 6 years ago
GitHub a196dde2 Merge pull request #1494 from Unity-Technologies/release-v0.6 6 years ago
GitHub b6c97cb6 Fix for divide-by-zero error with Discrete Actions (#1520) 6 years ago
GitHub 8b1f0a38 Merge pull request #1589 from Unity-Technologies/hotfix-0.6.0a 6 years ago
GitHub c0c289cc Merge pull request #1588 from Unity-Technologies/hotfix-0.6.0a 6 years ago
GitHub c258b1c3 Move 'take_action' into Policy class (#1669) 6 years ago
Ervin T b30f4c90 Split `mlagents` into two packages (#1812) 6 years ago
eshvk cc9bdf17 Added logging per Brain of time to update policy, time elapsed during training, time to collect experiences, buffer length, average return 6 years ago
eshvk fb04c40c Reorganize to make metrics collection more accurate 6 years ago
GitHub a0b44f1b Merge pull request #1858 from Unity-Technologies/develop-esh-metrics 6 years ago
GitHub 93760bc4 Adds SubprocessUnityEnvironment for parallel envs (#1751) 6 years ago
GitHub 2d1bda57 Merge pull request #1931 from Unity-Technologies/release-v0.8 6 years ago
eshvk ef8009d9 Python code reformat via [`black`](https://github.com/ambv/black). 6 years ago
GitHub 70d14910 Merge pull request #1934 from Unity-Technologies/develop-black 6 years ago
Vincent(Yuan) Gao a15763f8 Clear cumulative_returns_since_policy_update (#2120) 6 years ago
GitHub a4d5b2d3 Doc/comment cleanup - Fix some occurrences of 'the the' (#2119) 6 years ago
GitHub d5f6b7f8 Merge pull request #2157 from Unity-Technologies/release-v0.8.2 6 years ago
GitHub 2671e1a0 Enable mypy in precommit checks (#2177) 6 years ago
GitHub 40c7fc48 Merge branch 'develop' into protobuf_update 6 years ago
GitHub 4ac79742 Refactor reward signals into separate class (#2144) 6 years ago
Jonathan Harper 177ee5b8 Remove unused "last reward" logic, TF nodes 6 years ago
GitHub b05c9ac1 Add environment manager for parallel environments (#2209) 5 years ago
Chris Elion bb7773c1 add flake8 to precommit 5 years ago
GitHub 84d9d622 python timers (#2180) 5 years ago
GitHub 9c50abcf GAIL and Pretraining (#2118) 5 years ago
GitHub 1c18bd18 Swap 0 set and reward buffer append (#2273) 5 years ago
GitHub a5b7cf95 Fix get_value_estimate and buffer append (#2276) 5 years ago
Chris Elion 5d07ca1f Merge remote-tracking branch 'origin/develop' into enable-flake8 5 years ago
Chris Elion dfdf7b83 fix whitespace and line breaks 5 years ago
GitHub f8041534 Merge pull request #2236 from Unity-Technologies/enable-flake8 5 years ago
GitHub be4292fb Add different types of visual encoder (nature cnn/resnet) 5 years ago
GitHub 6a212f73 Improvements for GAIL (#2296) 5 years ago
GitHub 9eb3f049 Cleanup unused code in TrainerController (#2315) 5 years ago
GitHub 6225317d refactor vis_encoder_type and add to doc 5 years ago
GitHub 53475207 Merge pull request #2380 from Unity-Technologies/release-0.9.0 5 years ago
GitHub a9fe719c Add Multi-GPU implementation for PPO (#2288) 5 years ago
GitHub d7ebaae1 Return list instead of np array for make_mini_batch() (#2371) 5 years ago
GitHub 7b69bd14 Refactor Trainer and Model (#2360) 5 years ago
Ervin Teng 072d2ef8 Merge latest develop 5 years ago
GitHub bd7eb286 Update reward signals in parallel with policy (#2362) 5 years ago
GitHub 689765d6 Modification of reward signals and rl_trainer for SAC (#2433) 5 years ago
GitHub 43696d60 Fix bug in add_rewards_output and add test (#2442) 5 years ago
GitHub 0a163871 Merge pull request #2469 from Unity-Technologies/release-0.9.2 5 years ago
GitHub 3683cc1c Enable learning rate decay to be disabled (#2567) 5 years ago
GitHub 832e4a47 Normalize observations when adding experiences (#2556) 5 years ago
GitHub 67d754c5 Fix flake8 import warnings (#2584) 5 years ago
GitHub 36ed3c16 Fix issue exporting graph with multi-GPU (#2573) 5 years ago
GitHub cb144f20 small mypy cleanup (#2637) 5 years ago
Jonathan Harper 3fc14963 EXPERIMENTAL horovod support 5 years ago
Jonathan Harper 47893e9c minor tweaks 5 years ago
GitHub b2fa2268 Merge pull request #2648 from Unity-Technologies/release-0.10.0 5 years ago
GitHub 8e931d8d Merge branch 'develop' into release-0.10.0 5 years ago
Ervin Teng 094cbe4d Fix bug when batch size is a non-multiple of sequence length (#2661) 5 years ago
Anupam Bhatnagar cc208c00 resolving conflicts 5 years ago
GitHub b2a2047e Fix bug when batch size is a non-multiple of sequence length (#2661) 5 years ago
Chris Elion 43e23941 rough pass at tf2 support, needs cleanup 5 years ago
Ervin Teng 024e3677 small mypy cleanup (#2637) 5 years ago
Chris Elion 806c77e4 centralize tensorflow imports 5 years ago
GitHub f22c41db Merge pull request #2704 from Unity-Technologies/hotfix-0.10.1 5 years ago
Anupam Bhatnagar b733b34c resolving conflicts 5 years ago
Chris Elion a1967c19 Merge remote-tracking branch 'origin/develop' into try-tf2-support 5 years ago
GitHub 5d3e05d1 Fix "memory leak" during inference (#2722) 5 years ago
Ervin Teng 12a1e306 start on tf2 policy 5 years ago
Ervin Teng e185844f Start on TF 2 policy 5 years ago
Chris Elion 3d8a70fb Merge remote-tracking branch 'origin/develop' into try-tf2-support 5 years ago
GitHub 0fe5adc2 Develop remove memories (#2795) 5 years ago
GitHub 495873e5 Merge pull request #2833 from Unity-Technologies/release-0.11.0 5 years ago
Chris Elion 691d21e6 Merge remote-tracking branch 'origin/develop' into try-tf2-support 5 years ago
GitHub c6c01a03 Enable pylint and fix a few things (#2767) 5 years ago
Jonathan Harper 8550679d Merge branch 'develop' into release-0.11.0 5 years ago
GitHub 4da157fe more pylint fixes (#2842) 5 years ago
Chris Elion fca51de8 Merge remote-tracking branch 'origin/develop' into try-tf2-support 5 years ago
GitHub bf68edcf ingore attribute-defined-outside-init in multi_gpu_policy (#2876) 5 years ago
Chris Elion 73a346cb cleanup 5 years ago
GitHub f57b7ac6 Allow usage with tensorflow 2.0.0 (via tf.compat.v1) (#2665) 5 years ago
Chris Elion 7353ad22 Merge remote-tracking branch 'origin/develop' into try-tf2-support 5 years ago
Ervin Teng 987e0e3a Merge tf2 branch 5 years ago
Ervin Teng 748c250e Somewhat running 5 years ago
Andrew Cohen 13fe9cf8 Bubbled up indexing of AllBrainInfo to trainer controller from trainers 5 years ago
Ervin Teng 9dbbfd77 Somewhat running 5 years ago
Ervin Teng 5e6de46f Add normalizer 5 years ago
GitHub c0453ae1 Merge pull request #2912 from Unity-Technologies/develop-allbraininfo 5 years ago
Ervin Teng 5e1c1a00 Tweaks to Policy 5 years ago
GitHub 99981937 fix errors from new flake8-comprehensions (#2917) 5 years ago
Ervin Teng a665daed It's mostly training 5 years ago
Ervin Teng 3eb1e9c2 Pytorch port of continuous PPO 5 years ago
Ervin Teng d46b60b3 Add ReLU to the dense 5 years ago
Ervin Teng ed2c35b9 Remove some comments 5 years ago
Ervin Teng 135a5bb4 Add dummy save methods 5 years ago
GitHub 69d1a033 Develop remove past action communication (#2913) 5 years ago
Andrew Cohen e96b80db recieves brain_name and identifier on python side 5 years ago
Ervin Teng 437c6c2f Add dummy save methods 5 years ago
Ervin Teng d983a636 Speed up a bit faster 5 years ago
Ervin Teng 54644477 Merge branch 'develop' of github.com:Unity-Technologies/ml-agents into develop-nomaxstep-test 5 years ago
Ervin Teng df5ee7bf Split buffer into two buffers (PPO works) 5 years ago
Ervin Teng 3a4fa244 Switch to tanh squash in PPO 5 years ago
Ervin Teng fd0647a6 Rename append_update_buffer to append_to_update_buffer 5 years ago
Andrew Cohen bd056007 recieves brain_name and identifier on python side 5 years ago
GitHub d4780a55 Merge pull request #3010 from Unity-Technologies/release-0.12.0-to-master 5 years ago
GitHub 652488d9 check for numpy float64 (#2948) 5 years ago
GitHub 681093cf cherry pick PR#3032 (#3066) 5 years ago
GitHub 213cd68d Split Buffer into processing and update buffers (#2964) 5 years ago
Ervin Teng 34f9577c Merge branch 'develop' into develop-agentprocessor 5 years ago
Ervin Teng 2c9376bc Convert to trajectory 5 years ago
Ervin Teng 9e661f0c Looks like it's training 5 years ago
GitHub ef2514ba Develop cold fix recurrent (#3032) 5 years ago
GitHub 35c995e9 Merge pull request #3038 from Unity-Technologies/develop 5 years ago
Ervin Teng a97ffb47 Attempt reward reporting 5 years ago
Ervin Teng 9c5fdd31 Stats reporting is working 5 years ago
Ervin Teng eb4a04a5 Merge branch 'master' into develop-tanhsquash 5 years ago
GitHub 3b4b0d55 Remove random normal epsilon (#3039) 5 years ago
Ervin Teng e0e57188 Clean up some stuff 5 years ago
Ervin Teng b501f75b reduce sum to do squashing properly 5 years ago
Andrew Cohen 5097bcc0 recieves brain_name and identifier on python side 5 years ago
Ervin Teng f94365a2 No longer using ProcessingBuffer for PPO 5 years ago
Ervin Teng 8b3b9e6c Move trajectory and related functions to trajectory.py 5 years ago
Ervin Teng 76abf968 Add back max_step logic 5 years ago
Ervin Teng 88b1123a Merge branch 'master' of github.com:Unity-Technologies/ml-agents into develop-agentprocessor 5 years ago
Andrew Cohen 8578b0b7 add_policy and create_policy separated 5 years ago
GitHub 36048cb6 Moving Env Manager to Trainers (#3062) The Env Manager is only used by the trainer codebase. The entry point to interact with an environment is UnityEnvironment. 5 years ago
Ervin Teng c9116ed2 Move some common logic to buffer class 5 years ago
GitHub 90db165f Add --namespace-packages to mypy for mlagents (#3075) 5 years ago
Ervin Teng c7632aa7 Fix some bugs for visual obs 5 years ago
GitHub 1fa07edb Remove Standalone Offline BC Training (#2969) 5 years ago
Andrew Cohen 614d276f recieves brain_name and identifier on python side 5 years ago
Ervin Teng 5ab2563b Fixes for recurrent 5 years ago
Andrew Cohen 96922f84 recieves brain_name and identifier on python side 5 years ago
Chris Elion fdc810ff move (first pass) 5 years ago
GitHub 58b6c7c2 Rename mlagents.envs to mlagents_envs (#3083) 5 years ago
Ervin Teng 27c2a55b Lots of test fixes 5 years ago
Ervin Teng 97d66e71 Remove BootstrapExperience 5 years ago
Ervin Teng 324d217b Move agent_id to Trajectory 5 years ago
Ervin Teng 77ff4822 Add back next_obs 5 years ago
Andrew Cohen d1edbf43 add_policy and create_policy separated 5 years ago
Ervin Teng 2b811fc8 Properly report value estimates and episode length 5 years ago
GitHub 2fd305e7 Move add_experiences out of trainer, add Trajectories (#3067) 5 years ago
Ervin Teng c330f6f6 Merge branch 'master' into develop-agentprocessor 5 years ago
Andrew Cohen de902fbb passes all pytest and C# tests 5 years ago
GitHub 2ac242f7 Remove TrainerMetrics and add CSVWriter using new StatsWriter API (#3108) 5 years ago
Ervin Teng fdf9aea7 Make conversion methods part of NamedTuples 5 years ago
Ervin Teng 6242b67d Add way to check if trajectory is done or max_reached 5 years ago
GitHub 0b5b1b01 Develop magic string + trajectory (#3122) 5 years ago
GitHub c7da0139 Fix mypy errors in trainer code. (#3135) 5 years ago
Andrew Cohen 082789ea Merge branch 'master' into develop-magic-string 5 years ago
Andrew Cohen 6a4e7cf9 added ppo/sac_policy attributes to keep up with master 5 years ago
Ervin Teng 1bd791e5 Merge branch 'master' into develop-agentprocessor 5 years ago
Andrew Cohen 3e76adbd fixing more ci tests 5 years ago
Ervin Teng e577d5ea Fix some mypy issues and remove unused code 5 years ago
Andrew Cohen c3a92afa fixing ci ppo_policy 5 years ago
Ervin Teng 9e0ef912 Fixed value estimate bug 5 years ago
GitHub bec2e8f0 Add Trajectory/Policy Queues, move Trainer logic to advance() (#3113) 5 years ago
Ervin Teng db743971 Move private methods out of trainer, simplify interface 5 years ago
Andrew Cohen c8514c18 Merge branch 'master' into develop-magic-string 5 years ago
GitHub 45010af3 Add stats reporter class and re-enable missing stats (#3076) 5 years ago
Ervin Teng b3a4e641 Remove some vestigial code 5 years ago
Ervin Teng 48793ec1 Fix test 5 years ago
Ervin Teng 3d25f9d2 Merge branch 'master' into develop-agentprocessor 5 years ago
GitHub 5bc7531b Get step from policy (#3223) 5 years ago
GitHub d985dded Merge branch 'master' into merge-release-0.13.0 5 years ago
Ervin Teng 35d73d1d Split value and policy networks 5 years ago
GitHub f058b18c Replace BrainInfos with BatchedStepResult (#3207) 5 years ago
Ervin Teng 03c750a7 Move some functionality to optimizer 5 years ago
Ervin Teng 2c1ef594 Move some functionality to optimizer-black 5 years ago
Ervin Teng 6688453b Move some functionality to optimizer-black 5 years ago
Ervin Teng 91ffde5f More incremental steps to separation 5 years ago
Ervin Teng cd74e51b More progress 5 years ago
Ervin Teng 2373cae8 Move methods into common optimizer 5 years ago
Ervin Teng 76ad64d7 Some more bugfixes 5 years ago
Ervin Teng bc04f9dc Working continuous updates 5 years ago
Ervin Teng 29f3330f Merge master into hotfix-0.13.1 5 years ago
Ervin Teng 17dc17e5 Discrete PPO working 5 years ago
GitHub d52fb483 Merge pull request #3264 from Unity-Technologies/hotfix-0.13.1 5 years ago
Ervin Teng 2b63415e Clean up policy files 5 years ago
Ervin Teng 9ad99eb6 Combined model and policy for PPO 5 years ago
GitHub 329b23e0 Fix extra summary being written when loading from checkpoint (#3272) 5 years ago
Ervin Teng 6baaf980 Remove PPO model 5 years ago
Ervin Teng e912fa47 Simplify creation of optimizer, breaks multi-GPU 5 years ago
Ervin Teng 164732a9 Move optimizer creation to Trainer, fix some of the reward signals 5 years ago
Ervin Teng 151e3b1c Move policy to common location, remove epsilon 5 years ago
Ervin Teng d9fe2f9c Unified policy 5 years ago
Ervin Teng 0ef40c08 SAC CC working 5 years ago
Ervin Teng db249ceb Merge branch 'master' into develop-splitpolicyoptimizer 5 years ago
Ervin Teng 28f7608f Clean up value head creation 5 years ago
Ervin Teng edeceefd Zeroed version of LSTM working for PPO 5 years ago
Ervin Teng 649c4185 Zero out memory 5 years ago
Ervin Teng 7f53bf8b Cleanup LSTM code 5 years ago
Ervin Teng 5ec49542 SAC LSTM isn't broken 5 years ago
Ervin Teng 7d616651 Add burn-in for memory PPO 5 years ago
Ervin Teng 4871f49c Fix comments for PPO 5 years ago
Ervin Teng cfc2f455 Fix BC and tests 5 years ago
Ervin Teng 78671383 Move initialization call around 5 years ago
GitHub dd86e879 Separate out optimizer creation and policy graph creation (#3355) 5 years ago
Ervin Teng dcbb90e1 Fix graph init in ghost trainer 5 years ago
Ervin Teng 14720e2d Remove burn-in 5 years ago
Ervin Teng 328476d8 Move check for creation into nn_policy 5 years ago
Ervin Teng ce110201 Add optional burn-in for SAC as well 5 years ago
Ervin Teng cbfbff2c Split optimizer and TFOptimizer 5 years ago
Ervin Teng 4d94e180 Move optimizer to common folder 5 years ago
Ervin Teng 00017bab Temporarily remove multi-GPU 5 years ago
Ervin Teng 441e6a0c Add typing to optimizer, rename self.tf_optimizer 5 years ago
Ervin Teng ffdc41bb Removed floating constants 5 years ago
Ervin Teng 7c0fa1c4 Remove action_holder placeholder 5 years ago
Ervin Teng be9d772e Add option to not condition sigma on obs 5 years ago
Ervin Teng 30e4424c Fix PPO optimizer creation 5 years ago
Ervin Teng ff607162 Move learning rate reporting 5 years ago
Ervin Teng 88998fc9 Add add_policy docstrings 5 years ago
Ervin Teng c735e722 Make create critic methods private 5 years ago
GitHub c145e75b Split Policy and Optimizer, common Policy for PPO and SAC (#3345) 5 years ago
Ervin Teng da6daebd Make create losses private 5 years ago
Andrew Cohen 5b0aca29 Merge branch 'master' into soccer-fives 5 years ago
Ervin Teng 14f2a7f2 Rename LearningModel to ModelUtils 5 years ago
Ervin Teng 1156b9b3 Merge branch 'develop-splitpolicyoptimizer' into develop-removeactionholder 5 years ago
Ervin Teng 53c25fb1 Move one-hot out of policy and remove selected_actions 5 years ago
Anupam Bhatnagar e04fcd71 Merge branch 'master' into master-into-release-0.14.1 5 years ago
GitHub 97a1d4b1 [change] Remove the action_holder placeholder from the policy. (#3492) 5 years ago
Andrew Cohen de73baa9 Merge branch 'master' into soccer-fives 5 years ago
GitHub 7d954797 [change] Separate action outputs into OutputDistributions object (#3514) 5 years ago
GitHub e4177de0 [change] Organize trainer files a bit better (#3538) 5 years ago
GitHub 870338b4 [bug-fix] Fix issue with more than one continuous actions (#3547) 5 years ago
Andrew Cohen 573b1f6d Merge branch 'master' into soccer-fives 5 years ago
GitHub cb153a0f [change] Change warning language when adversarial scene is used without self-play (#3561) 5 years ago
Anupam Bhatnagar f4dbedcf removed extraneous logging imports and loggers 5 years ago
GitHub 86141eee Merge pull request #3560 from Unity-Technologies/new-logger 5 years ago
Anupam Bhatnagar e8e0078e first commit 5 years ago
Anupam Bhatnagar 07b15ae7 [skip-ci] small refactors 5 years ago
GitHub e3af96ca Merge branch 'master' into develop-demo-load-seek 5 years ago
GitHub c42a11c3 [change] Throw a proper error when sequence length is greater than batch size. (#3583) 5 years ago
GitHub 94de596b [change] Remove concatenate in discrete action probabilities to improve inference performance (#3598) 5 years ago
Andrew Cohen b1cfa74d Merge branch 'master' into develop-test-imitation 5 years ago
GitHub ec278616 Hotfixes for Release 0.15.1 (#3698) 5 years ago
Andrew Cohen 53bea15c Merge branch 'master' into soccer-fives 5 years ago
Andrew Cohen ac261e36 Merge branch 'master' into self-play-mutex 5 years ago
GitHub 6709a9bf [change] Clean up trainer interface, clean up GhostTrainer stats (#3634) 5 years ago
Andrew Cohen eefc4811 Merge branch 'master' into self-play-mutex 5 years ago
Andrew Cohen 9f09a65d team id centric ghost trainer 5 years ago
GitHub 4ecd6ad3 Fix how we set logging levels (#3703) 5 years ago
Andrew Cohen 59b88be6 Merge branch 'master' into self-play-mutex 5 years ago
GitHub 9cbc3fa2 Asymmetric self-play (#3653) 5 years ago
Ervin Teng 06fa3d39 Merge branch 'master' into develop-sac-apex 5 years ago
Anupam Bhatnagar 50e52d9c Merge branch 'master' into distributed-training 5 years ago
Andrew Cohen 3de78baa wrapped trainer has internal policy ghost 5 years ago
Andrew Cohen 3013774b alternative to internal-policy fix 5 years ago
Anupam Bhatnagar 001fce2a first commit 5 years ago
Anupam Bhatnagar 9341f7a2 [skip-ci] small refactors 5 years ago
GitHub b841c9ab Wrapped trainer has internal policy in GhostTrainer 5 years ago
Andrew Cohen 930d6fa3 Merge branch 'self-play-mutex' into soccer-2v1 5 years ago
Ervin Teng f29b17a9 Don't block one policy queue 5 years ago
Anupam Bhatnagar eb9f3f19 [skip ci] replace buffer length by buffer size 5 years ago
GitHub aae58330 Merge branch 'master' into develop-add-inference-examples 5 years ago
Anupam Bhatnagar 7ae32cc2 [skip ci] replace buffer length by buffer size 5 years ago
Andrew Cohen b0c506a6 Merge branch 'soccer-2v1' into asymm-envs 5 years ago
Anupam Bhatnagar ac80ec82 [skip ci] increment steps on training 5 years ago
Anupam Bhatnagar d49ceecc [skip ci] moving summary writer to update_policy 5 years ago
Anupam Bhatnagar 95ba923d [skip ci] fix first summary statement output 5 years ago
Anupam Bhatnagar e8d09d00 [skip ci] increment steps on training 5 years ago
Ervin Teng 5e980ec1 Merge branch 'master' into develop-sac-apex 5 years ago
Anupam Bhatnagar 45bac63e [skip ci] more fixes 5 years ago
Anupam Bhatnagar 86e16a64 [skip ci] tweaking 3dball configs 5 years ago
Anupam Bhatnagar 2c68e921 [skip ci] fix first summary statement output 5 years ago
Anupam Bhatnagar 9d7dd3b6 [skip ci] moving step increment to trainer from environment for sac 5 years ago
Andrew Cohen de0656b6 Merge branch 'internal-policy-ghost' into soccer-2v1 5 years ago
Andrew Cohen 85304aff Merge branch 'soccer-2v1' into asymm-envs 5 years ago
Andrew Cohen 89db8428 Merge branch 'internal-policy-ghost-alternate' into soccer-2v1 5 years ago
Andrew Cohen 26c0033c Merge branch 'soccer-2v1' into asymm-envs 5 years ago
Arthur Juliani 6879bae4 Initial optimizer port 5 years ago
GitHub 4d23200b [refactor] Run Trainers in separate threads (#3690) 5 years ago
Arthur Juliani 7c3bd376 Refactoring policy and optimizer 5 years ago
Arthur Juliani 2e51260a Resolving a few bugs 5 years ago
Arthur Juliani 947f0d32 Slightly closer to running model 5 years ago
Arthur Juliani 3c82bf59 Training runs, but doesn’t actually work 5 years ago
Arthur Juliani 8c6f4696 Fix a couple additional bugs 5 years ago
Arthur Juliani 61d671d8 Add conditional sigma for distribution 5 years ago
Arthur Juliani 4a50444f Support discrete actions as well 5 years ago
Arthur Juliani a11a79e4 Continuous and discrete now train 5 years ago
Arthur Juliani a5b5b109 Mulkti-discrete now working 5 years ago
Arthur Juliani 5f936990 Visual observations now train as well 5 years ago
Arthur Juliani 212e2d1d Merge remote-tracking branch 'origin/master' into develop-add-fire 5 years ago
GitHub 232519e4 [refactor] Move output artifacts to a single results/ folder (#3829) 5 years ago
Arthur Juliani 82688e5c GRU in-progress and dynamic cnns 5 years ago
Arthur Juliani 29223931 Fix for memories 5 years ago
Arthur Juliani 1736559f Combine actor and critic classes. Initial export. 5 years ago
Arthur Juliani ca887743 Support tf and pytorch alongside one another 5 years ago
Arthur Juliani 9835d26c Prepare model for onnx export 5 years ago
GitHub 422247a0 update versions for patch release (#3970) 5 years ago
Chris Elion 68b68396 Merge remote-tracking branch 'origin/master' into release_1_to_master 5 years ago
GitHub 4641038e Renaming max_step to interrupted in TermialStep(s) (#3908) 5 years ago
vincentpierre c34dd5b6 Merge branch 'master' into develop-gym-wrapper 5 years ago
Andrew Cohen a2f8319a Merge branch 'master' into asymm-envs 5 years ago
Arthur Juliani 89ad3020 Merge remote-tracking branch 'origin/master' into develop-add-fire 5 years ago
Arthur Juliani be7e55e1 Use LSTM and fix a few merge errors 5 years ago
Andrew Cohen 4a3ad193 Add constant decay to beta and epsilon 5 years ago
GitHub c5b94ca6 Use LR schedule for beta and epsilon (#3940) 5 years ago
Arthur Juliani 2b3a6347 Merge remote-tracking branch 'origin/master' into develop-add-fire 5 years ago
Arthur Juliani b7be7f04 Fix bug in probs calculation 5 years ago
Arthur Juliani 3eef9d78 Optimize np -> tensor operations 5 years ago
Christopher Goy ba80b292 format files with pre-commit. 4 years ago
GitHub e274bcf6 Update precommit flake8 (#3961) 5 years ago
GitHub f7373172 Merge pull request #4385 from Unity-Technologies/release_2_verified-barracuda-1.0.2 4 years ago
Ervin Teng 72180f9b Experiment with JIT compiler 5 years ago
Andrew Cohen 1e50c76e calculating gradient norms 5 years ago
vincentpierre 6ddfe74f Merge branch 'master' into develop-gym-wrapper 5 years ago
Andrew Cohen 0e965a4d sensitivity 5 年前
Andrew Cohen c1f91b5a slightly nicer output 5 年前
Andrew Cohen 23b84dea ignoring commit checks but write to csv 5 年前
Andrew Cohen 61aa9915 write to csv 5 年前
Andrew Cohen d794964f constant beta 5 年前
Arthur Juliani 28e095e0 Merge remote-tracking branch 'origin/master' into develop-add-fire 5 年前
Ervin Teng f214836a Changes for speed test 5 年前
Andrew Cohen 13c2a209 added opp, decay eps removed 5 年前
GitHub e92b4f88 [refactor] Structure configuration files into classes (#3936) 5 年前
Andrew Cohen 50e4585f fixed beta 5 年前
GitHub 09853e13 [refactor] Move checkpoint saving into trainer (#4034) 5 年前
GitHub 7229214c [cleanup] Remove unused param keys (#4067) 5 年前
Andrew Cohen c0f7052b Merge branch 'master' into develop-sampler-refactor 5 年前
Andrew Cohen 34ecc7e6 Merge branch 'master' into asymm-envs 5 年前
GitHub a1c63c4b Release 3 Cherry-pick bug-fixes and doc changes from master (#4102) 5 年前
GitHub 8a49e8e0 [refactor] Remove nonfunctional `output_path` option from TrainerSettings (#4087) 5 年前
Anupam Bhatnagar 4afd8f92 first commit 5 年前
Andrew Cohen 21f871db Merge branch 'develop-constant-decay' into asymm-envs 5 年前
Anupam Bhatnagar 8b6c19ae [skip ci] adding should_still_train method to ppo 5 年前
Anupam Bhatnagar 392a84f1 [skip ci] fixing property decorator in sac 5 年前
Arthur Juliani 9724c9ac Merge master 5 年前
Arthur Juliani 46874cc7 ONNX exporting 5 年前
GitHub 0d80d87a Fix for discrete actions (#4181) 4 年前
Anupam Bhatnagar 24d5f881 first commit 5 年前
GitHub cde8bd29 Convert List[np.ndarray] to np.ndarray before using torch.as_tensor (#4183) 4 年前
GitHub 05a11c96 Develop add fire exp framework (#4213) 4 年前
GitHub 45154f52 Pytorch port of SAC (#4219) 4 年前
GitHub a28e2767 Update add-fire to latest master, including Policy refactor (#4263) 4 年前
GitHub 69579611 [refactor] Refactor Actor and Critic classes (#4287) 4 年前
Ruo-Ping Dong 6feec58a add Saver class (only TF working) 4 年前
GitHub 93517833 [feature] Fix TF tests, add --torch CLI option, allow run TF without torch installed (#4305) 4 年前
Andrew Cohen f74d301a Merge branch 'develop-add-fire' into develop-add-fire-bc 4 年前
vincentpierre 599d7e9f Merging master 5 年前
GitHub 3a982317 [add-fire] Add learning rate and beta/epsilon decay to PyTorch (#4318) 4 年前
GitHub 7ddfd81f Added Reward Providers for Torch (#4280) 4 年前
Andrew Cohen bf8b2328 Merge branch 'develop-add-fire' into develop-add-fire-bc 4 年前
HH 7afa1761 Merge branch 'master' into hh/develop/ragdoll-updates 5 年前
Ruo-Ping Dong 71fe4df6 fix formatting and test 4 年前
Ruo-Ping Dong 09a741c8 small improvement 4 年前
Ruo-Ping Dong 79d89158 Merge branch 'develop-add-fire' into develop-add-fire-checkpoint 4 years ago
GitHub 3bcb029b [refactor] Remove BrainParameters from Python code (#4138) 4 years ago
Ruo-Ping Dong e06812aa fix tests 4 years ago
HH 0fdac847 Merge branch 'master' into hh/develop/crawler-ragdoll-updates 5 years ago
GitHub 84440f05 Convert checkpoints to .NN (#4127) 4 years ago
Arthur Juliani 6bee0fd1 Merge master 4 years ago
GitHub 1f5eb9da add pyupgrade to pre-commit and run (#4239) 4 years ago
GitHub 129f9ddc [MLA-427] make pyupgrade convert f-strings too (#4244) 4 years ago
HH 9e6edb6c try new reward falloff 4 years ago
HH c3c83920 cleanup 4 years ago
Andrew Cohen d8c123a0 Merge branch 'master' into sensitivity 4 years ago
Andrew Cohen 02df39ab ignore precommit 4 years ago
Andrew Cohen fa35292c write hist to tb 4 years ago
GitHub 1b098c9a Refactor TFPolicy and Policy (#4254) 4 years ago
GitHub 380fef57 [refactor] Move TF-specific files to tf/ folder (#4266) 4 years ago
GitHub beb5aca5 [refactor] Make classes except Optimizer framework agnostic (#4268) 4 years ago
Andrew Cohen 06e4356c Merge branch 'master' into sensitivity 4 years ago
Arthur Juliani 1a123641 Merge remote-tracking branch 'origin/master' into r5-master 4 years ago
GitHub 3f44a0bc cleanup around AdamOptimizer (#4333) 4 years ago
Andrew Cohen 598826fe Merge branch 'develop-add-fire' into develop-add-fire-bc 4 years ago
Ruo-Ping Dong d3eb6c46 Merge branch 'develop-add-fire' into develop-add-fire-checkpoint 4 years ago
Ervin Teng eaa59cf4 Use loss masks in PPO. 4 years ago
Ruo-Ping Dong 95858e25 update saver interface and add tests 4 years ago
Anupam Bhatnagar a5cc4d03 Merge branch 'master' into global-variables 4 years ago
Ervin Teng a48a0af4 Proper shape of masks 4 years ago
Ruo-Ping Dong 523248be update 4 years ago
GitHub f374f87a [add-fire] Add LSTM to SAC, LSTM fixes and initializations (#4324) 4 years ago
Ervin Teng 1d4bc99e Proper mask mean for PPO 4 years ago
Ervin Teng 6ba23234 Fix dtype for actions 4 years ago
HH 8eaddb61 Merge branch 'master' into hh/develop/loco-walker-variable-speed 4 years ago
Ruo-Ping Dong 59cc1a9f Merge branch 'develop-add-fire' into develop-add-fire-checkpoint 4 years ago
Ruo-Ping Dong 409a161c fix bc tests 4 years ago
GitHub 25dc8c3d Add Saver Class to handle all save/load/checkpoint/export work (#4323) 4 years ago
Ervin Teng d65a9326 Merge branch 'master' into develop-add-fire-mm3 4 years ago
Ruo-Ping Dong d57aa9ab Merge branch 'develop-add-fire-mm3' into develop-add-fire-checkpoint 4 years ago
GitHub bd6bcd2f Merge master and add Saver class for save/load checkpoints 4 years ago
Ervin Teng f8b40b9b Don't flatten when there are multiple continuous actions 4 years ago
GitHub 6de31a03 [add-fire] Fix masked mean for 2d tensors (#4364) 4 years ago
Ervin Teng 5c1717d1 Bugfixes for continuous case 4 years ago
Ervin Teng 42e25b25 Merge branch 'develop-add-fire' into develop-add-fire-memoryclass 4 years ago
GitHub 8985a040 Removing the experiment script from add fire (#4373) 4 years ago
Christopher Goy 5a233353 Merge remote-tracking branch 'origin/master' into release_6-to-master 4 years ago
Andrew Cohen a65d08c7 ghost trainer tests 4 years ago
GitHub 49545ce1 Pytorch ghost trainer (#4370) 4 years ago
Ervin Teng a04e68a4 Merge branch 'develop-add-fire' into develop-add-fire-memoryclass 4 years ago
HH c72553c8 reset these to master 4 years ago
Andrew Cohen fcec6734 added comments 4 years ago
GitHub 0d0d2ead [add-fire] Revert unneeded changes back to master (#4389) 4 years ago
Ervin Teng 987ea2d0 Revert unneeded changes back to master 4 years ago
Andrew Cohen e7c9ff35 clean up docstrings create policies 4 years ago
Andrew Cohen 039ae17f capitalize Tensorflow 4 years ago
GitHub 1955af9e [feature] Add experimental PyTorch support (#4335) 4 years ago
vincentpierre 9f51ab14 Saving the reward providers 4 years ago
Ruo-Ping Dong c47ffc20 Rename saver 4 years ago
vincentpierre 108fac9a Replace torch.detach().cpu().numpy() with a utils method 4 years ago
HH d9962254 Merge branch 'master' into hh/develop/loco-walker-variable-speed 4 years ago
GitHub ec8c24d8 add fire clean up docstrings in create policies (#4391) 4 years ago
GitHub 328353bc Torch : Saving/Loading of the reward providers (#4405) 4 years ago
vincentpierre 31750e97 Using item() in place of to_numpy() 4 years ago
Ruo-Ping Dong 88eff042 Merge branch 'master' into develop-saver-name 4 years ago
GitHub 48f217b9 Rename Saver to ModelSaver (#4402) 4 years ago
Anupam Bhatnagar f4f1a8d9 merge master into trainer-plugin branch 4 years ago
GitHub 498934f9 Replace torch.detach().cpu().numpy() with a utils method (#4406) 4 years ago
Ruo-Ping Dong 27fb4270 brain_name to behavior_name 4 years ago
GitHub bfda9576 Replace brain_name with behavior_name (#4419) 4 years ago
Ruo-Ping Dong fd1dc3a6 Merge branch 'master' into develop-torch-omp 4 years ago
Ruo-Ping Dong f5dee9d1 jit for continuous control 4 years ago
GitHub 4e93cb6e [torch] Restructure PyTorch encoders (#4421) 4 years ago
GitHub 6f534366 Add torch_utils class, auto-detect CUDA availability (#4403) 4 years ago
Ruo-Ping Dong fb50b0ec add wb 4 years ago
Andrew Cohen 3997b14b Merge branch 'master' into develop-hybrid-actions 4 years ago
Ervin Teng 3e771cbb Permute visual obs outside of network 4 years ago
Ervin Teng 77c810fb Fix SAC and make utility method 4 years ago
vincentpierre 181bdec0 - 4 years ago
Andrew Cohen 643c8e58 ppo extended 4 years ago
Andrew Cohen 44c9879e action models 4 years ago
Ervin Teng e8431a6d Proper dimensions for entropy, sum before bonus in PPO 4 years ago
GitHub c188781b [life improvement] Moving Python files around (#4531) 4 years ago
Ervin Teng be159ad3 Make entropy reporting same as TF 4 years ago
Ervin Teng b3e15d30 Always use separate critic 4 years ago
Ervin Teng bbf7b71d Revert to shared 4 years ago
Andrew Cohen e5f14400 Merge branch 'master' into develop-hybrid-actions-singleton 4 years ago
GitHub a690af74 [refactor] Make PyTorch the default and TensorFlow optional (#4517) 4 years ago
Andrew Cohen eaecb59e torch utils to and from buffer 4 years ago
Andrew Cohen 8013e544 ignoring Instance of 'AbstractContextManager' has no 'enter_context' member (no-member) 4 years ago
GitHub e0ef30a5 [bug-fix] Change entropy computation and loss reporting in Torch to match TF (#4538) 4 years ago
GitHub cb8e4d25 Add ActionSpec (#4586) 4 years ago
Andrew Cohen 9689cf2c remove *_action_* from function names 4 years ago
vincentpierre a3a9a56b Merge branch 'exp-multi-head-attention' into exp-bullet-hell 4 years ago
Ruo-Ping Dong 9e08be87 Merge branch 'master' into release_9_branch_merge 4 years ago
vincentpierre d3d4eb90 Trainer with attention 4 years ago
vincentpierre 7ef3c9a1 Trainer with attention 4 years ago
GitHub b853e5ba Action buffer (#4612) 4 years ago
GitHub 3c96a3a2 Action Model (#4580) 4 years ago
GitHub 88d3ec3e Merge master into hybrid actions staging branch (#4704) 4 years ago
GitHub 23800f33 Merge branch 'master' into develop-action-spec 4 years ago
GitHub 85a7c0f7 [bug-fix] Add clipping to PyTorch policy, fix initialization (#4649) 4 years ago
Ervin Teng 184f27c6 Make buffer type-agnostic 4 years ago
Ervin Teng 0548057d Use real clipping (as in TF) 4 years ago
Ervin Teng 0cdb2040 Use tanh squash 4 years ago
GitHub 3ab45b3f [bug-fix] Separate critic only for PPO (#4661) 4 years ago
GitHub 2a8c6800 [bug-fix] Add clipping to PyTorch policy, fix initialization (#4649) (#4662) 4 years ago
Ruo-Ping Dong 953cb6bb Merge branch 'master' into develop-windows-delay 4 years ago
Ervin Teng 2be74856 Double policy loss for no reason 4 years ago
GitHub f1206bed Cherry-pick separate critic only for PPO (#4661) (#4666) 4 years ago
Ervin Teng 3b15cc32 Multiprocessing but Stats are quite broken 4 years ago
Ervin Teng 3eba7423 Increase initialization 4 years ago
Andrew Cohen 3f771e61 add ActionBuffers and utils 4 years ago
Ervin Teng 3765c15a Merge branch 'develop-multitype-buffer' into develop-unified-obs 4 years ago
Ervin Teng 7a0ebfbd Pretty broken 4 years ago
Ervin Teng 95bdbba3 Less broken PPO 4 years ago
vincentpierre b863af57 Removing TensorFlow Trainers 4 years ago
Ervin Teng 3b614302 Merge branch 'develop-multitype-buffer' into develop-centralizedcritic 4 years ago
Ervin Teng 6c77ac7a Update SAC, fix PPO batching 4 years ago
vincentpierre 713e65fb removing tensorflow testing for pytest and yamato 4 years ago
Andrew Cohen bd917c9c action buffer passes continuous 4 years ago
vincentpierre 2dd34aa5 Formatting 4 years ago
Andrew Cohen ad951493 debugging discrete 4 years ago
Andrew Cohen fcf6471e 2d discrete passes 4 years ago
Ervin Teng fdaa8c3d Merge branch 'develop-unified-obs' into develop-centralizedcritic 4 years ago
Andrew Cohen 056630d7 sac continuous and discrete train 4 years ago
GitHub 990f801a Develop hybrid action staging (#4702) 4 years ago
vincentpierre 735fcd52 [WIP] Refactor trainers to use list of obs rather than vec and vis obs 4 years ago
Ervin Teng 6846af21 Multi-input network 4 years ago
vincentpierre 93ca1409 fixing the tests 4 years ago
vincentpierre 7a5cc9ec Merge master into develop-rm-tf 4 years ago
Ervin Teng 56dcd75a Get next critic observations into value estimate 4 years ago
vincentpierre c1587bce Solving merge conflicts 4 years ago
Andrew Cohen 8172b3d6 test_simple_rl/reward providers pass tf/torch 4 years ago
Arthur Juliani 0d2f8887 Merge remote-tracking branch 'origin/master' into goal-conditioning 4 years ago
Andrew Cohen 73b778cc rename extract to from_dict 4 years ago
GitHub cc6b4564 Multi Directional Walker and Initial Hypernetwork (#4740) 4 years ago
Ervin Teng 25dfd883 Merge branch 'master' into develop-centralizedcritic 4 years ago
Andrew Cohen cd73cce2 test_trajectory fixed 4 years ago
GitHub 22658a40 use sensor types to differentiate obs (#4749) 4 years ago
GitHub 903d3afe Merge pull request #4707 from Unity-Technologies/develop-rm-tf 4 years ago
Andrew Cohen 498b1ee6 Merge branch 'develop-action-buffer' into develop-hybrid-actions-singleton 4 years ago
GitHub d2d46103 Remove print from ppo tf opti 4 years ago
Andrew Cohen 68b98915 Merge branch 'develop-action-buffer' of https://github.com/Unity-Technologies/ml-agents into develop-action-buffer 4 years ago
GitHub 29d94c7c Merge pull request #4734 from Unity-Technologies/develop-obs-as-list 4 years ago
Andrew Cohen 1d234d1d bc works 4 years ago
Andrew Cohen c0d01baf Merge branch 'master' into merge-release11-master 4 years ago
Andrew Cohen e81e68de comms agent and fixed hallway 4 years ago
vincentpierre 44ed3258 Merging master 4 years ago
Andrew Cohen ca5a5194 soccer comms on the cloud 4 years ago
Andrew Cohen 3457cd3c save only discrete actions as prev 4 years ago
Andrew Cohen 12828bdc remove tau from diff for 4 years ago
Andrew Cohen 55e928cf fix 16 envs soccer 4 years ago
vincentpierre 449712b0 renaming sensor_spec to sensor_specS 4 years ago
Andrew Cohen 35769b53 Merge branch 'develop-action-buffer' into develop-hybrid-actions-singleton 4 years ago
Andrew Cohen c843e3d4 hallway collab exps on cloud 4 years ago
Andrew Cohen a20287f7 continuous comms 4 years ago
Andrew Cohen 14ea0ad2 comment out comms in ppo optimizer 4 years ago
Andrew Cohen 17496265 move AgentAction, ActionLogProbs, and ActionFlattener to separate files 4 years ago
Chris Elion 76ebc20c Merge remote-tracking branch 'origin/master' into r12-to-master 4 years ago
Andrew Cohen f57875e0 layer norm 4 years ago
GitHub 458fee17 Merge pull request #4763 from Unity-Technologies/develop-att 4 years ago
Andrew Cohen bc77c990 layer norm and weight decay with fixed architecture 4 years ago
Ervin Teng 330fc1d0 Merge branch 'master' into develop-centralizedcritic-mm 4 years ago
vincentpierre 519c5f47 merging master 4 years ago
Ruo-Ping Dong 8ed14762 Merge branch 'develop-hybrid-actions-singleton' into develop-hybrid-actions-csharp 4 years ago
Andrew Cohen 96c01a63 custom layer norm 4 years ago
GitHub cc948a41 Policy output actiontuple (#4651) 4 years ago
GitHub 14129a08 [MLA-470] Barracuda + TF cleanup (#4837) 4 years ago
Andrew Cohen 1bc2ff96 add weight decay to trainers 4 years ago
Arthur Juliani 0b4b0992 Rename more files 4 years ago
Ervin Teng aba633b2 Merge branch 'develop-attention-refactor' into develop-centralizedcritic-mm 4 years ago
Ruo-Ping Dong a7d04be6 Merge branch 'develop-hybrid-actions-singleton' into develop-hybrid-actions-csharp 4 years ago
Ruo-Ping Dong 180d3e20 Merge branch 'develop-centralizedcritic-mm' into develop-cc-teammanager 4 years ago
HH 0024a286 merge ervin's new stuff 4 years ago
Ervin Teng 9c3da1b6 New buffer layout, TeamObsUtil, pad dead agents 4 years ago
GitHub 67ad9651 Merge pull request #4825 from Unity-Technologies/sensor-types 4 years ago
vincentpierre 8660b1c2 merging master 4 years ago
Ervin Teng 3daa17a9 Merge branch 'develop-centralizedcritic-mm' into develop-zombieteammanager 4 years ago
Ervin Teng 6b8b3db3 Try subtract marginalized value 4 years ago
Ervin Teng 2203fc0e Bootstrap if teammates not done 4 years ago
Ervin Teng 092ea232 Some more progress - still broken 4 years ago
Ervin Teng 457b2630 I think it's running 4 years ago
Ervin Teng 3e481f7d Fix issue with team_actions 4 years ago
brccabral 457fb612 Merge branch 'master' of https://github.com/Unity-Technologies/ml-agents 4 years ago
Ervin Teng 0919a32d Add next action and next team obs 4 years ago
Andrew Cohen 07e92563 Merge branch 'develop-centralizedcritic-counterfact' into develop-coma2 4 years ago
Andrew Cohen 6e1826f8 might be right 4 years ago
vincentpierre 52b011d6 _ 4 years ago
vincentpierre 5f9ea5ea _ 4 years ago
vincentpierre 6f3ea7b8 _ 4 years ago
Andrew Cohen feb38012 add lambda return and target network 4 years ago
Andrew Cohen 5741f8f6 no target net 4 years ago
Andrew Cohen 79c658d2 remove normalize advantages 4 years ago
Andrew Cohen a92baab6 add target network back 4 years ago
Andrew Cohen a4c336c2 value estimator 4 years ago
vincentpierre 115e944b adding weight decay for experimentation 4 years ago
Andrew Cohen d1285626 add target net 4 years ago
Andrew Cohen bd341f7f no target, increase lambda 4 years ago
Andrew Cohen bdd73403 remove prints 4 years ago
Andrew Cohen 8a5d291f use v return 4 years ago
Andrew Cohen 6b2a6c5f use target net 4 years ago
Andrew Cohen fce842aa adding zombie to coma2 brnch 4 years ago
Andrew Cohen 7f491ae7 cloud run with coma2 of held out zombie test env 4 years ago
Andrew Cohen 9af22d30 use only value funcs 4 years ago
Andrew Cohen a3453c5d target of baseline is returns_v 4 years ago
Andrew Cohen 511a9a7e no baseline 4 years ago
Andrew Cohen e3239529 remove target update 4 years ago
Andrew Cohen 95253b47 ntegrate teammate dones 4 years ago
Andrew Cohen 2c3147b9 add value clipping 4 years ago
Andrew Cohen 687f411b try again on cloud 4 years ago
Andrew Cohen b0bf7817 clipping values and updated zombie 4 years ago
Andrew Cohen b5271926 remove value head clipping 4 years ago
Ervin Teng a4eaebcb Add trust region to COMA updates 4 years ago
Ervin Teng bca6c92c Add clipping, use same network for value 4 years ago
Ervin Teng 3283b6a1 Remove Q-net for perf 4 years ago
Ervin Teng 3aefac39 Use GAE again 4 years ago
GitHub 64fc7f43 Buffer key enums (#4907) 4 years ago
Andrew Cohen b08318f9 add clipping 4 years ago
Ervin Teng adad5183 Weight decay, regularizaton loss 4 years ago
Ervin Teng 4fe8d036 Try reduce bias 4 years ago
Andrew Cohen 39592650 remove clipping 4 years ago
Ervin Teng 2be83146 Use same network 4 years ago
Ervin Teng 6094613d try reduce bias more 4 years ago
Andrew Cohen 74885bab add local reward to plot 4 years ago
Ervin Teng ac4dc336 Remove reg loss, still stable 4 years ago
Andrew Cohen c08fefbc reduce initialization weights 4 years ago
Ervin Teng 64b34759 Black format 4 years ago
Ervin Teng 1cf27871 Merge branch 'develop-coma2-samenet' into develop-coma2-samenet-sum 4 years ago
Ervin Teng b6f88d6d Merge branch 'develop-base-teammanager' into develop-agentprocessor-teammanager 4 years ago
Andrew Cohen 6bd396ee add critic to optimizer, ppo runs 4 years ago
Andrew Cohen 3aec18a1 fix precommit errors 4 years ago
Andrew Cohen 8efdeeb0 make critic a property 4 years ago
Ervin Teng 0bde7598 Back out trainer changes 4 years ago
Andrew Cohen c74dca9f add SharedActorCritic 4 years ago
Ruo-Ping Dong c87bce9e Merge branch 'master' into develop-base-teammanager 4 years ago
Ervin Teng a9116382 Bug fixes 4 years ago
Andrew Cohen 98d647de MultiInputNetBody 4 years ago
Ervin Teng ae7643b8 Proper critic memories for PPO 4 years ago
vincentpierre e1b94b8b Merge branch 'master' into develop-var-len-obs-feature 4 years ago
Chris Elion e4f51ca7 Merge remote-tracking branch 'origin/master' into MLA-1734-demo-provider 4 years ago
Ervin Teng d4438878 Merge branch 'develop-base-teammanager' into develop-agentprocessor-teammanager 4 years ago
Ervin Teng fd3f05b9 Enable GAIL to decay 4 years ago
Ervin Teng 97842f81 Fix non-lstm PPO 4 years ago
Ervin Teng e46a86ad Merge branch 'master' into develop-superpush-int 4 years ago
HH 15d512f9 Merge branch 'master' into hh/develop/dodgeball 4 years ago
Ervin Teng 9bc88c41 Running COMA (not sure if learning) 4 years ago
Ervin Teng 2f209c12 Buffer fixes 4 years ago
GitHub 338af2ec Move the Critic into the Optimizer (#4939) 4 years ago
HH 4c947151 Merge branch 'main' into hh/develop/dodgeball 4 years ago
Andrew Cohen 4b58527c checkout ppo/optimizer from main 4 years ago
Ervin Teng 61781a1a Merge branch 'main' into develop-agentprocessor-teammanager 4 years ago
Andrew Cohen 9060da06 Merge branch 'develop-agentprocessor-teammanager' into develop-coma2-trainer 4 years ago
Arthur Juliani 06c147f8 Merge remote-tracking branch 'origin/main' into goal-conditioning-new 4 years ago
GitHub d36a5242 Python Dataflow for Group Manager (#4926) 4 years ago
Ervin Teng c8137dcd Merge branch 'main' into develop-superpush-int 4 years ago
GitHub f16ce486 Update v2-staging from main (March 15) (#5123) 4 years ago
GitHub 47db8ce1 [bug-fix] Fix padding for List entries in buffer (#5046) 4 years ago
Christopher Goy 921ba4f0 Update v2-staging from main (March 15) (#5123) 4 years ago
Christopher Goy ebe45056 Merge branch 'main' into release_14_branch-to-main 4 years ago
Ervin Teng 8902c058 Merge branch 'main' into develop-coma2-trainer 4 years ago
GitHub fc5d0a3f [bug-fix] Fix save/restore critic, add test (#5062) 4 years ago
Chris Elion 970f1d40 Merge remote-tracking branch 'origin/v2-staging' into MLA-1634-ObservationSpec 4 years ago
Ervin Teng 1f026c70 Merge branch 'main' into develop-superpush-branch-cleanup 4 years ago
Ervin Teng ce872033 Revert "Merge branch 'main' into develop-superpush-branch-cleanup" 4 years ago
GitHub 8f35bdd3 POCA trainer (#5005) 4 years ago
Andrew Cohen 9e77d7e1 Merge branch 'main' into develop-soccer-groupman 4 years ago
GitHub 62314056 Fix ghost curriculum and make steps private (#5098) 4 years ago
Ervin Teng 54ffbed6 [cherry-pick] Fix ghost curriculum and make steps private (#5098) 4 years ago
Andrew Cohen 9176247c Merge branch 'main' into develop-soccer-groupman-mod 4 years ago
GitHub e81e038b Fix end episode for POCA, add warning for group reward if not POCA (#5113) 4 years ago
GitHub 63169e2c [cherry-pick] Fix group rewards for POCA, add warning for non-POCA trainers (#5120) 4 years ago
Ervin Teng d1c24251 [bug-fix] When agent isn't training, don't clear update buffer (#5205) 4 years ago
Andrew Cohen 18be47e8 Merge branch 'main' into develop-soccer-groupman-mod 4 years ago
Ervin Teng a9ca7b3b Do burn-in for PPO 4 years ago
GitHub ff21216d [bug-fix] When agent isn't training, don't clear update buffer (#5205) 4 years ago
vincentpierre 5d384292 forgot one 4 years ago