ml-agents

9 commits

337 branches

128 Plastic labels

Tree: 583dc7ff

Author	SHA1	Comment	Commit date
GitHub	1955af9e	[feature] Add experimental PyTorch support (#4335) * Begin porting work * Add ResNet and distributions * Dynamically construct actor and critic * Initial optimizer port * Refactoring policy and optimizer * Resolving a few bugs * Share more code between tf and torch policies * Slightly closer to running model * Training runs, but doesn’t actually work * Fix a couple additional bugs * Add conditional sigma for distribution * Fix normalization * Support discrete actions as well * Continuous and discrete now train * Multi-discrete now working * Visual observations now train as well * GRU in-progress and dynamic cnns * Fix for memories * Remove unused arg * Combine actor and critic classes. Initial export. * Support tf and pytorch alongside one another * Prepare model for onnx export * Use LSTM and fix a few merge errors * Fix bug in probs calculation * Optimize np -> tensor operations * Time action sample funct...	4 years ago

1 code commit (583dc7ff-0214-4a37-9344-0e30626a3b0c)