ml-agents

Author	SHA1	Message	Commit date
Arthur Juliani	6879bae4	Initial optimizer port	5 years ago
Arthur Juliani	7c3bd376	Refactoring policy and optimizer	5 years ago
Arthur Juliani	2e51260a	Resolving a few bugs	5 years ago
Arthur Juliani	947f0d32	Slightly closer to running model	5 years ago
Arthur Juliani	3c82bf59	Training runs, but doesn’t actually work	5 years ago
Arthur Juliani	8c6f4696	Fix a couple additional bugs	5 years ago
Arthur Juliani	4a50444f	Support discrete actions as well	5 years ago
Arthur Juliani	a11a79e4	Continuous and discrete now train	5 years ago
Arthur Juliani	a5b5b109	Mulkti-discrete now working	5 years ago
Arthur Juliani	5f936990	Visual observations now train as well	5 years ago
Arthur Juliani	82688e5c	GRU in-progress and dynamic cnns	5 years ago
Arthur Juliani	29223931	Fix for memories	5 years ago
Arthur Juliani	1736559f	Combine actor and critic classes. Initial export.	5 years ago
Arthur Juliani	ca887743	Support tf and pytorch alongside one another	5 years ago
Arthur Juliani	9835d26c	Prepare model for onnx export	5 years ago
Arthur Juliani	be7e55e1	Use LSTM and fix a few merge errors	5 years ago
Arthur Juliani	b7be7f04	Fix bug in probs calculation	5 years ago
Arthur Juliani	3eef9d78	Optimize np -> tensor operations	5 years ago
Ervin Teng	72180f9b	Experiment with JIT compiler	5 years ago
Arthur Juliani	9724c9ac	Merge master	4 years ago
GitHub	0d80d87a	Fix for discrete actions (#4181)	4 years ago
GitHub	cde8bd29	Convert List[np.ndarray] to np.ndarray before using torch.as_tensor (#4183) Big speedup in visual obs	4 years ago
GitHub	05a11c96	Develop add fire exp framework (#4213) * Experiment branch for comparing torch * Updates and merging ervin changes * improvements on experiment_torch.py * Better printing of results * preliminary gpu experiment * Testing gpu * Prepare to see a lot of commits, because I like my IDE and I am testing on a server and I am using git to sync the two * Prepare to see a lot of commits, because I like my IDE and I am testing on a server and I am using git to sync the two * _ * _ * _ * _ * _ * _ * _ * _ * Attempt at gpu on tf. Does not work * _ * _ * _ * _ * _ * _ * _ * _ * _ * _ * _ * Fixing learn.py	4 years ago
GitHub	a28e2767	Update add-fire to latest master, including Policy refactor (#4263) * Update Dockerfile * Separate send environment data from reset (#4128) * Fixed a typo on ML-Agents-Overview.md (#4130) Fixed redundant "to" word from the sentence since it is probably a typo in document. * Updated the badge’s link to point to the newest doc version * Replaced all of the doc to release_3_doc * Fix 3DBall and 3DBallHard SAC regressions (#4132) * Move memory validation to settings * Update docs * Add settings test * Update to release_3 in installation.md (#4144) * rename to SideChannelManager +backcompat (#4137) * Remove comment about logo with --help (#4148) * [bugfix] Make FoodCollector heuristic playable (#4147) * Make FoodCollector heuristic playable * Update changelog * script to check for old release links and references (#4153) * Remove package validation suite from Project (#4146) * RayPerceptionSensor: handle empty and invalid tags (#4155...	4 years ago
Ruo-Ping Dong	6feec58a	add Saver class (only TF working)	4 years ago
GitHub	3a982317	[add-fire] Add learning rate and beta/epsilon decay to PyTorch (#4318)	4 years ago
GitHub	7ddfd81f	Added Reward Providers for Torch (#4280) * Added Reward Providers for Torch * Use NetworkBody to encode state in the reward providers * Integrating the reward prodiders with ppo and torch * work in progress, integration with PPO. Not training properly Pyramids at the moment * Integration in PPO * Removing duplicate file * Gail and Curiosity working * addressing comments * Enfore float32 for tests * enfore np.float32 in buffer	4 years ago
Ruo-Ping Dong	71fe4df6	fix formatting and test	4 years ago
Ruo-Ping Dong	d3eb6c46	Merge branch 'develop-add-fire' into develop-add-fire-checkpoint	4 years ago
Ervin Teng	eaa59cf4	Use loss masks in PPO.	4 years ago
Ervin Teng	a48a0af4	Proper shape of masks	4 years ago
GitHub	f374f87a	[add-fire] Add LSTM to SAC, LSTM fixes and initializations (#4324)	4 years ago
Ervin Teng	1d4bc99e	Proper mask mean for PPO	4 years ago
Ruo-Ping Dong	59cc1a9f	Merge branch 'develop-add-fire' into develop-add-fire-checkpoint	4 years ago
Ervin Teng	f8b40b9b	Don't flatten when there are multiple continuous actions	4 years ago
GitHub	6de31a03	[add-fire] Fix masked mean for 2d tensors (#4364)	4 years ago
vincentpierre	9f51ab14	Saving the reward providers	4 years ago
vincentpierre	108fac9a	Replace torch.detach().cpu().numpy() with a utils method	4 years ago
vincentpierre	31750e97	Using item() in place of to_numpy()	4 years ago
GitHub	498934f9	Replace torch.detach().cpu().numpy() with a utils method (#4406) * Replace torch.detach().cpu().numpy() with a utils method * Using item() in place of to_numpy() * more use of item() and additional tests	4 years ago
Ruo-Ping Dong	f5dee9d1	jit for continuous control	4 years ago
GitHub	4e93cb6e	[torch] Restructure PyTorch encoders (#4421) * Move linear encoding to NetworkBody * moved encoders to processors (#4420) * fix bad merge * Get it running * Replace mentions of visual_encoders * Remove output_size property * Fix tests * Fix some references * Revert test_simple_rl * Fix networks test * Make curiosity test more accomodating * Rename total_input_size * [Bug fix] Fix bug in GAIL gradient penalty (#4425) (#4426) Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com> * Up number of steps * Rename to visual_processors and vector_processors Co-authored-by: andrewcoh <54679309+andrewcoh@users.noreply.github.com> Co-authored-by: Andrew Cohen <andrew.cohen@unity3d.com> Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>	4 years ago
GitHub	6f534366	Add torch_utils class, auto-detect CUDA availability (#4403) * Add torch_utils * Use torch from torch_utils * Add torch to banned modules in CI * Better import error handling * Fix flake8 errors * Address comments * Move networks to GPU if enabled * Switch to torch_utils * More flake8 problems * Move reward providers to GPU/CPU * Remove anothere set default tensor * Fix banned import in test	4 years ago
Ruo-Ping Dong	fb50b0ec	add wb	4 years ago
Ervin Teng	3e771cbb	Permute visual obs outside of network	4 years ago
Ervin Teng	77c810fb	Fix SAC and make utility method	4 years ago
vincentpierre	181bdec0	-	4 years ago
Andrew Cohen	643c8e58	ppo extended	4 years ago
Andrew Cohen	44c9879e	action models	4 years ago
Ervin Teng	e8431a6d	Proper dimensions for entropy, sum before bonus in PPO	4 years ago
Ervin Teng	be159ad3	Make entropy reporting same as TF	4 years ago
Andrew Cohen	eaecb59e	torch utils to and from buffer	4 years ago
GitHub	e0ef30a5	[bug-fix] Change entropy computation and loss reporting in Torch to match TF (#4538) * Proper dimensions for entropy, sum before bonus in PPO * Make entropy reporting same as TF * Always use separate critic * Revert to shared * Remove unneeded extra line * Change entropy shape in test * Change another entropy shape * Add entropy summing to evaluate_actions * Add notes about torch.abs(policy_loss)	4 years ago
vincentpierre	d3d4eb90	Trainer with attention	4 years ago
vincentpierre	7ef3c9a1	Trainer with attention	4 years ago
GitHub	b853e5ba	Action buffer (#4612) Co-authored-by: Ervin T <ervin@unity3d.com> Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>	4 years ago
GitHub	3c96a3a2	Action Model (#4580) Co-authored-by: Ervin T <ervin@unity3d.com> Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>	4 years ago
GitHub	85a7c0f7	[bug-fix] Add clipping to PyTorch policy, fix initialization (#4649)	4 years ago
Ervin Teng	2be74856	Double policy loss for no reason	4 years ago
Andrew Cohen	3f771e61	add ActionBuffers and utils	4 years ago
Ervin Teng	7a0ebfbd	Pretty broken	4 years ago
Ervin Teng	6c77ac7a	Update SAC, fix PPO batching	4 years ago
Andrew Cohen	bd917c9c	action buffer passes continuous	4 years ago
Andrew Cohen	ad951493	debugging discrete	4 years ago
Andrew Cohen	fcf6471e	2d discrete passes	4 years ago
Andrew Cohen	056630d7	sac continuous and discrete train	4 years ago
vincentpierre	735fcd52	[WIP] Refactor trainers to use list of obs rather than vec and vis obs	4 years ago
Ervin Teng	6846af21	Multi-input network	4 years ago
Ervin Teng	56dcd75a	Get next critic observations into value estimate	4 years ago
vincentpierre	c1587bce	Solving merge conflicts	4 years ago
GitHub	cc6b4564	Multi Directional Walker and Initial Hypernetwork (#4740)	4 years ago
Ervin Teng	25dfd883	Merge branch 'master' into develop-centralizedcritic	4 years ago
GitHub	22658a40	use sensor types to differentiate obs (#4749)	4 years ago
Andrew Cohen	498b1ee6	Merge branch 'develop-action-buffer' into develop-hybrid-actions-singleton	4 years ago
Andrew Cohen	e81e68de	comms agent and fixed hallway	4 years ago
vincentpierre	44ed3258	Merging master	4 years ago
Andrew Cohen	ca5a5194	soccer comms on the cloud	4 years ago
vincentpierre	449712b0	renaming sensor_spec to sensor_specS	4 years ago
Andrew Cohen	c843e3d4	hallway collab exps on cloud	4 years ago
Andrew Cohen	a20287f7	continuous comms	4 years ago
Andrew Cohen	14ea0ad2	comment out comms in ppo optimizer	4 years ago
Andrew Cohen	f57875e0	layer norm	4 years ago
Andrew Cohen	bc77c990	layer norm and weight decay with fixed architecture	4 years ago
Ervin Teng	330fc1d0	Merge branch 'master' into develop-centralizedcritic-mm	4 years ago
Andrew Cohen	96c01a63	custom layer norm	4 years ago
GitHub	14129a08	[MLA-470] Barracuda + TF cleanup (#4837) * remove barracuda conversion, tensorflow cleanup * unused var	4 years ago
Andrew Cohen	1bc2ff96	add weight decay to trainers	4 years ago
Arthur Juliani	0b4b0992	Rename more files	4 years ago
Ervin Teng	aba633b2	Merge branch 'develop-attention-refactor' into develop-centralizedcritic-mm	4 years ago
Ervin Teng	9c3da1b6	New buffer layout, TeamObsUtil, pad dead agents	4 years ago
GitHub	67ad9651	Merge pull request #4825 from Unity-Technologies/sensor-types [WIP] Observation Types	4 years ago
Ervin Teng	6b8b3db3	Try subtract marginalized value	4 years ago
Ervin Teng	457b2630	I think it's running	4 years ago
Ervin Teng	3e481f7d	Fix issue with team_actions	4 years ago
Ervin Teng	0919a32d	Add next action and next team obs	4 years ago
Andrew Cohen	6e1826f8	might be right	4 years ago
vincentpierre	52b011d6	_	4 years ago
vincentpierre	5f9ea5ea	_	4 years ago
Andrew Cohen	feb38012	add lambda return and target network	4 years ago
Andrew Cohen	5741f8f6	no target net	4 years ago
Andrew Cohen	a92baab6	add target network back	4 years ago
Andrew Cohen	a4c336c2	value estimator	4 years ago
vincentpierre	115e944b	adding weight decay for experimentation	4 years ago
Andrew Cohen	d1285626	add target net	4 years ago
Andrew Cohen	bd341f7f	no target, increase lambda	4 years ago
Andrew Cohen	fce842aa	adding zombie to coma2 brnch	4 years ago
Andrew Cohen	7f491ae7	cloud run with coma2 of held out zombie test env	4 years ago
Andrew Cohen	9af22d30	use only value funcs	4 years ago
Andrew Cohen	e3239529	remove target update	4 years ago
Andrew Cohen	2c3147b9	add value clipping	4 years ago
Andrew Cohen	687f411b	try again on cloud	4 years ago
Ervin Teng	a4eaebcb	Add trust region to COMA updates	4 years ago
Ervin Teng	3283b6a1	Remove Q-net for perf	4 years ago
GitHub	64fc7f43	Buffer key enums (#4907)	4 years ago
Ervin Teng	adad5183	Weight decay, regularizaton loss	4 years ago
Andrew Cohen	39592650	remove clipping	4 years ago
Ervin Teng	2be83146	Use same network	4 years ago
Ervin Teng	ac4dc336	Remove reg loss, still stable	4 years ago
Ervin Teng	64b34759	Black format	4 years ago
Ervin Teng	b6f88d6d	Merge branch 'develop-base-teammanager' into develop-agentprocessor-teammanager	4 years ago
Andrew Cohen	6bd396ee	add critic to optimizer, ppo runs	4 years ago
Andrew Cohen	3aec18a1	fix precommit errors	4 years ago
Andrew Cohen	8efdeeb0	make critic a property	4 years ago
Andrew Cohen	c74dca9f	add SharedActorCritic	4 years ago
Ervin Teng	ae7643b8	Proper critic memories for PPO	4 years ago
Ervin Teng	fd3f05b9	Enable GAIL to decay	4 years ago
Ervin Teng	e46a86ad	Merge branch 'master' into develop-superpush-int	4 years ago
GitHub	338af2ec	Move the Critic into the Optimizer (#4939) Co-authored-by: Ervin Teng <ervin@unity3d.com>	4 years ago
GitHub	f16ce486	Update v2-staging from main (March 15) (#5123)	4 years ago
GitHub	fc5d0a3f	[bug-fix] Fix save/restore critic, add test (#5062) * Fix save/restore critic, add test * Rename module for PPO * Use correct policy in test	4 years ago
Ervin Teng	a9ca7b3b	Do burn-in for PPO	4 years ago
vincentpierre	5d384292	forgot one	3 years ago


132 commits (d56bf3a8-32f8-4bd6-a803-65a3604ba7b3)