- Move common functions to trainer.py and model.py from ppo/trainer.py, ppo/policy.py, and ppo/model.py.
- Introduce an RLTrainer class and move most of add_experiences and some common reward signal code there. PPO and SAC will inherit from this, but the BC Trainer largely will not (a rough sketch of the hierarchy follows this list).
- Add methods to Buffer to enable sampling, truncating, and saving/loading (also sketched after this list).
- Add scoping to create encoders in model.py.
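
A rough sketch of the intended class hierarchy is below; the stub Trainer base and the method signatures are placeholders for illustration, not the final API.

```python
class Trainer:
    """Stand-in for the existing base Trainer in trainer.py."""


class RLTrainer(Trainer):
    """Shared base for RL trainers; the BC Trainer would not inherit from this."""

    def add_experiences(self, curr_info, next_info, take_action_outputs):
        # Common bookkeeping shared by PPO and SAC: append observations, actions,
        # rewards, and done flags to the per-agent processing buffers, and run
        # the shared reward signal code on the new experiences.
        ...

    def end_episode(self):
        # Common reward-signal and buffer reset logic.
        ...


class PPOTrainer(RLTrainer):
    """PPO-specific logic (advantage estimation, on-policy updates) stays here."""


class SACTrainer(RLTrainer):
    """SAC-specific logic (replay sampling, entropy tuning) stays here."""
```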
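And a minimal, self-contained sketch of the new Buffer helpers, assuming a dict-of-lists layout; the method names (sample_mini_batch, truncate, save_to_file, load_from_file) are illustrative guesses rather than the exact additions.

```python
import pickle
import random


class Buffer(dict):
    """Stand-in Buffer: maps field names (e.g. 'actions', 'rewards') to
    equal-length lists of experiences."""

    def num_experiences(self) -> int:
        return len(next(iter(self.values()), []))

    def sample_mini_batch(self, batch_size: int) -> dict:
        # Draw the same random indices from every field so entries stay aligned.
        indices = random.sample(range(self.num_experiences()), batch_size)
        return {key: [values[i] for i in indices] for key, values in self.items()}

    def truncate(self, max_length: int) -> None:
        # Drop the oldest entries so no field exceeds max_length.
        for key in self:
            self[key] = self[key][-max_length:]

    def save_to_file(self, file_object) -> None:
        # Persist the buffer, e.g. so an off-policy replay buffer can be resumed.
        pickle.dump(dict(self), file_object)

    def load_from_file(self, file_object) -> None:
        self.clear()
        self.update(pickle.load(file_object))
```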
This is the first in a series of PRs intended to move the agent processing logic (add_experiences and process_experiences) out of the trainer and into a separate class. The plan is to do so in steps:
- Split the processing buffers (which track each agent's in-progress trajectory as it is assembled) from the update buffer (completed trajectories to be used for training) within the Trainer (this PR; see the first sketch after this list)
- Move the processing buffer and add/process experiences into a separate, outside class
- Change the data type of the update buffer to be a Trajectory
- Place and read Trajectories from queues, and add a subscription mechanism for both the AgentProcessor and the Trainers (see the queue sketch below)
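
For the split in this PR, the rough shape inside the trainer could be the following; the attribute and method names are hypothetical.

```python
from collections import defaultdict


class TrainerBufferSketch:
    """How the two buffer roles could be separated inside a trainer."""

    def __init__(self):
        # Processing buffers: one per agent, accumulating that agent's
        # in-progress trajectory as experiences arrive from the environment.
        self.processing_buffers = defaultdict(list)
        # Update buffer: completed trajectories, sampled during training.
        self.update_buffer = []

    def add_experience(self, agent_id, experience, done):
        self.processing_buffers[agent_id].append(experience)
        if done:
            # The trajectory is complete: move it to the update buffer and
            # clear that agent's processing buffer.
            self.update_buffer.append(self.processing_buffers.pop(agent_id))
```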
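The final step could look something like the sketch below, where the AgentProcessor publishes completed Trajectories to a queue that the subscribed Trainer drains before each update; all names here are hypothetical.

```python
from queue import Empty, Queue


class TrajectoryQueue:
    """Hypothetical queue a Trainer subscribes to for a given behavior name."""

    def __init__(self, behavior_id: str):
        self.behavior_id = behavior_id
        self._queue: Queue = Queue()

    def put(self, trajectory) -> None:
        # Called by the AgentProcessor when a trajectory is completed.
        self._queue.put(trajectory)

    def get_all(self) -> list:
        # Called by the subscribed Trainer to drain pending trajectories
        # before its next update.
        trajectories = []
        while True:
            try:
                trajectories.append(self._queue.get_nowait())
            except Empty:
                break
        return trajectories
```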