ml-agents

Author	SHA1	Message	Commit date
GitHub	4ac79742	Refactor reward signals into separate class (#2144) * Create new class (RewardSignal) that represents a reward signal. * Add value heads for each reward signal in the PPO model. * Make summaries agnostic to the type of reward signals, and log weighted rewards per reward signal. * Move extrinsic and curiosity rewards into this new structure. * Allow defining multiple reward signals in YAML file. Add documentation for this new structure.	5 years ago
GitHub	9c50abcf	GAIL and Pretraining (#2118) Based on the new reward signals architecture, add BC pretrainer and GAIL for PPO. Main changes: - A new GAILRewardSignal and GAILModel for GAIL/VAIL - A BCModule component (not a reward signal) to do pretraining during RL - Documentation for both of these - Change to Demo Loader that lets you load multiple demo files in a folder - Example Demo files for all of our tested sample environments (for future regression testing)	5 years ago
GitHub	be4292fb	Add different types of visual encoder (nature cnn/resnet) Add resnet and nature cnn in addition to default visual encoder	5 years ago
GitHub	d7ebaae1	Return list instead of np array for make_mini_batch() (#2371) Return list instead of np array for make_mini_batch() to reduce time copying data	5 years ago
GitHub	bd7eb286	Update reward signals in parallel with policy (#2362)	5 years ago
GitHub	689765d6	Modification of reward signals and rl_trainer for SAC (#2433) * Adds evaluate_batch to reward signals. Evaluates on minibatch rather than on BrainInfo. * Changes the way reward signal results are reported in rl_trainer so that we get the pure, unprocessed environment reward separate from the reward signals. * Moves end_episode to rl_trainer * Fixed bug with BCModule with RNN	5 years ago
GitHub	6a81a2f4	Add Soft Actor-Critic as trainer option (#2341) * Add Soft Actor-Critic model, trainer, and policy and sac_trainer_config.yaml * Add documentation for SAC and tweak PPO documentation to reference the new pages. * Add tests for SAC, change simple_rl test to run both PPO and SAC.	5 years ago
GitHub	3df585d9	Fix issue where SAC encoder type is always simple (#2548)	5 years ago
GitHub	67d754c5	Fix flake8 import warnings (#2584) We have been ignoring unused imports and star imports via flake8. These are both bad practice and grow over time without automated checking. This commit attempts to fix all existing import errors and add back the corresponding flake8 checks.	5 years ago
GitHub	149ebd67	Fix crash with VAIL + GAIL (#2598)	5 years ago
GitHub	24ba9d58	Develop deprecate broadcasting (#2669) * Feature Deprecation : Online Behavioral Cloning In this PR : - Delete the online_bc_trainer - Delete the tests for online bc - delete the configuration file for online bc training * Deleting the BCTeacherHelper.cs Script TODO : - Remove usages in the scene - Documentation Edits DO NOT MERGE * IMPORTANT : REMOVED ALL IL SCENES - Removed all the IL scenes from the Examples folder * Removed all mentions of online BC training in the Documentation * Made a note in the Migrating.md doc about the removal of the Online BC feature. * Modified the Academy UI to remove the control checkbox and replaced it with a train in the editor checkbox * Removed the Broadcast functionality from the non-Learning brains * Bug fix * Note that the scenes are broken since the BroadcastHub has changed * Modified the LL-API for Python to remove the broadcasting functionality. * All unit tests are running * Modifie...	5 years ago
GitHub	69d1a033	Develop remove past action communication (#2913) * Modifying the .proto files * attempt 1 at refactoring Python * works for ppo hallway * changing the documentation * now works with both sac and ppo both training and inference * Need to fix the tests * TODOs : - Fix the demonstration recorder - Fix the demonstration loader - verify the intrinsic reward signals work - Fix the tests on Python - Fix the C# tests * Regenerating the protos * fix proto typo * protos and modifying the C# demo recorder * modified the demo loader * Demos are loading * IMPORTANT : THESE ARE THE FILES USED FOR CONVERSION FROM OLD TO NEW FORMAT * Modified all the demo files * Fixing all the tests * fixing ci * addressing comments * removing reference to memories in the ll-api	5 years ago
Ervin Teng	a80b47d1	Fix demo loader and remaining tests	5 years ago
GitHub	652488d9	check for numpy float64 (#2948)	5 years ago
GitHub	213cd68d	Split Buffer into processing and update buffers (#2964) This is the first in a series of PRs that intend to move the agent processing logic (add_experiences and process_experiences) out of the trainer and into a separate class. The plan is to do so in steps: - Split the processing buffers (keeping track of agent trajectories and assembling trajectories) and update buffer (complete trajectories to be used for training) within the Trainer (this PR) - Move the processing buffer and add/process experiences into a separate, outside class - Change the data type of the update buffer to be a Trajectory - Place and read Trajectories from queues, add subscription mechanism for both AgentProcessor and Trainers	5 years ago
GitHub	1fa07edb	Remove Standalone Offline BC Training (#2969)	5 years ago
GitHub	58b6c7c2	Rename mlagents.envs to mlagents_envs (#3083)	5 years ago
GitHub	29c91b14	update flake8 plugin version and fix warnings (#3180)	5 years ago
Yuan Gao	0817c44b	Moved the demo files	5 years ago
GitHub	f058b18c	Replace BrainInfos with BatchedStepResult (#3207)	5 years ago
Ervin Teng	164732a9	Move optimizer creation to Trainer, fix some of the reward signals	5 years ago
Ervin Teng	151e3b1c	Move policy to common location, remove epsilon	5 years ago
GitHub	0ff8f9af	Create ML-Agents Package (#3267) Convert the UnitySDK to a Packman Package. - Separate Examples into a sample project. - Move core UnitySDK Code into com.unity.ml-agents. - Create asmdefs for the ml-agents package. - Add package validation tests for win/linux/max. - Update protobuf generation scripts. - Add Barracuda as a package dependency for ML-Agents. (users no longer have to install it themselves).	5 years ago
Ervin Teng	db249ceb	Merge branch 'master' into develop-splitpolicyoptimizer	5 years ago
Ervin Teng	cadf6603	Fix SAC CC and some reward signal tests	5 years ago
Ervin Teng	48b39b80	Fix ghost trainer and all tests	5 years ago
GitHub	e4177de0	[change] Organize trainer files a bit better (#3538)	5 years ago
Ervin Teng	ee27e2cc	Fix tests	5 years ago
Arthur Juliani	3c82bf59	Training runs, but doesn’t actually work	5 years ago
GitHub	adeb6536	Catch dimension mismatches between demos and policy (#3821)	5 years ago
Arthur Juliani	212e2d1d	Merge remote-tracking branch 'origin/master' into develop-add-fire	5 years ago
GitHub	232519e4	[refactor] Move output artifacts to a single results/ folder (#3829)	5 years ago
Arthur Juliani	ca887743	Support tf and pytorch alongside one another	5 years ago
Arthur Juliani	89ad3020	Merge remote-tracking branch 'origin/master' into develop-add-fire # Conflicts: # ml-agents/mlagents/trainers/policy/tf_policy.py	5 years ago
GitHub	e92b4f88	[refactor] Structure configuration files into classes (#3936)	5 years ago
GitHub	a1c63c4b	Release 3 Cherry-pick bug-fixes and doc changes from master (#4102) * [bug-fix] Fix regression in --initialize-from feature (#4086) * Fixed text in GettingStarted page specifying the logdir for tensorboard. Before it was in a directory summaries which no longer existed. Results are now saved to the results dir. (#4085) * [refactor] Remove nonfunctional `output_path` option from TrainerSettings (#4087) * Reverting bug introduced in #4071 (#4101) Co-authored-by: Scott <Scott.m.jordan91@gmail.com> Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>	4 years ago
GitHub	a28e2767	Update add-fire to latest master, including Policy refactor (#4263) * Update Dockerfile * Separate send environment data from reset (#4128) * Fixed a typo on ML-Agents-Overview.md (#4130) Fixed redundant "to" word from the sentence since it is probably a typo in document. * Updated the badge’s link to point to the newest doc version * Replaced all of the doc to release_3_doc * Fix 3DBall and 3DBallHard SAC regressions (#4132) * Move memory validation to settings * Update docs * Add settings test * Update to release_3 in installation.md (#4144) * rename to SideChannelManager +backcompat (#4137) * Remove comment about logo with --help (#4148) * [bugfix] Make FoodCollector heuristic playable (#4147) * Make FoodCollector heuristic playable * Update changelog * script to check for old release links and references (#4153) * Remove package validation suite from Project (#4146) * RayPerceptionSensor: handle empty and invalid tags (#4155...	4 years ago
GitHub	69579611	[refactor] Refactor Actor and Critic classes (#4287)	4 years ago
GitHub	93517833	[feature] Fix TF tests, add --torch CLI option, allow run TF without torch installed (#4305)	4 years ago
GitHub	3bcb029b	[refactor] Remove BrainParameters from Python code (#4138)	4 years ago
Ruo-Ping Dong	95858e25	update saver interface and add tests	4 years ago
GitHub	25dc8c3d	Add Saver Class to handle all save/load/checkpoint/export work (#4323)	4 years ago
Ruo-Ping Dong	27fb4270	brain_name to behavior_name	4 years ago

43 commits (7006b5ff-59cf-4ebd-bc26-d7f40cfc3791)