ml-agents

Author	SHA1	Message	Commit date
GitHub	8f35bdd3	POCA trainer (#5005) Co-authored-by: Ervin Teng <ervin@unity3d.com> Co-authored-by: Ruo-Ping Dong <ruoping.dong@unity3d.com> Co-authored-by: Chris Elion <chris.elion@unity3d.com> Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>	4 years ago
GitHub	62314056	Fix ghost curriculum and make steps private (#5098) * use get step to determine curriculum * add to CHANGELOG * Make step in trainer private (#5099) Co-authored-by: Ervin T <ervin@unity3d.com>	4 years ago
Ervin Teng	54ffbed6	[cherry-pick] Fix ghost curriculum and make steps private (#5098) * use get step to determine curriculum * add to CHANGELOG * Make step in trainer private (#5099) Co-authored-by: Ervin T <ervin@unity3d.com>	4 years ago
Andrew Cohen	9176247c	Merge branch 'main' into develop-soccer-groupman-mod	4 years ago
GitHub	e81e038b	Fix end episode for POCA, add warning for group reward if not POCA (#5113) * Fix end episode for POCA, add warning for group reward if not POCA * Add missing imports	4 years ago
GitHub	e79d8a9d	[bug-fix] Move POCA critic to default device (#5124) * Move critic to default device * Make sure to clone onto default device * Add some debug stuff * Some more debug * Fix issue * Fix bool tensor too	4 years ago
GitHub	63169e2c	[cherry-pick] Fix group rewards for POCA, add warning for non-POCA trainers (#5120) * Fix end episode for POCA, add warning for group reward if not POCA (#5113) * Fix end episode for POCA, add warning for group reward if not POCA * Add missing imports * Use np.any, which is faster	4 years ago
GitHub	e6143a83	[bug-fix] Move POCA critic to default device (#5124) (#5131) * Move critic to default device * Make sure to clone onto default device * Add some debug stuff * Some more debug * Fix issue * Fix bool tensor too	4 years ago
Ervin Teng	d1c24251	[bug-fix] When agent isn't training, don't clear update buffer (#5205) * Don't clear update buffer, but don't append to it either * Update changelog * Address comments * Make experience replay buffer saving more verbose (cherry picked from commit 63e7ad44d96b7663b91f005ca1d88f4f3b11dd2a)	4 years ago
Ervin Teng	c108da4a	[bug-fix] Fix POCA LSTM, pad sequences in the back (#5206) * Pad buffer at the end * Fix padding in optimizer value estimate * Fix additional bugs and POCA * Fix groupmate obs, add tests * Update changelog * Improve tests * Address comments * Fix poca test * Fix buffer test * Increase entropy for Hallway * Add EOF newline * Fix Behavior Name * Address comments (cherry picked from commit 2ce6810846ba9268e4fb5fb082fa54e90414c980)	4 years ago
Andrew Cohen	18be47e8	Merge branch 'main' into develop-soccer-groupman-mod	4 years ago
Ervin Teng	81b74634	Fix additional bugs and POCA	4 years ago
Ervin Teng	c05ec9af	Fix groupmate obs, add tests	4 years ago
Ervin Teng	9fd4a81e	Address comments	4 years ago
GitHub	ff21216d	[bug-fix] When agent isn't training, don't clear update buffer (#5205) * Don't clear update buffer, but don't append to it either * Update changelog * Address comments * Make experience replay buffer saving more verbose	4 years ago
GitHub	c5589b59	[bug-fix] Fix POCA LSTM, pad sequences in the back (#5206) * Pad buffer at the end * Fix padding in optimizer value estimate * Fix additional bugs and POCA * Fix groupmate obs, add tests * Update changelog * Improve tests * Address comments * Fix poca test * Fix buffer test * Increase entropy for Hallway * Add EOF newline * Fix Behavior Name * Address comments	4 years ago
vincentpierre	983982ee	Removing misleading learning rate	3 years ago

17 commits (56528b87-3f23-4082-ba99-869bb337b682)