When using parallel SubprocessUnityEnvironment instances along
with Academy Done(), a new step might be taken when reset should
have been called, because some environments may be done while
others are not (making "global done" less useful).
This change manages the reset on `global_done` at the level of the
environment worker, and removes the global reset from
TrainerController.
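
A minimal sketch of the per-worker reset described above, under stated assumptions: `FakeEnv` and `worker_step` are hypothetical names used only for illustration and are not the real UnityEnvironment or worker API. Each worker checks its own `global_done` flag and resets itself, so one environment finishing does not force or delay a reset in the others.

```python
class FakeEnv:
    """Hypothetical stand-in for a UnityEnvironment; not the real API."""

    def __init__(self, episode_length):
        self.episode_length = episode_length
        self.steps = 0
        self.global_done = False

    def reset(self):
        self.steps = 0
        self.global_done = False
        return "reset"

    def step(self):
        self.steps += 1
        # The environment reports "global done" once its episode is over.
        self.global_done = self.steps >= self.episode_length
        return "step"


def worker_step(env):
    """Reset this worker's own environment if it is done, otherwise step it."""
    if env.global_done:
        return env.reset()
    return env.step()


# Two workers whose environments finish at different times.
envs = [FakeEnv(2), FakeEnv(3)]
history = [[worker_step(env) for env in envs] for _ in range(4)]
```

In this sketch the first environment resets on the third call and the second on the fourth, without any controller-level global reset.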
* Add GetTotalStepCount to the Academy
This will allow the RecordVideos plugin to record based on the current academy step
* fixup! Add GetTotalStepCount to the Academy
* Add the video recorder to the documentation
* Sanitize demo filenames so that they can't be too long, overflow the header, and corrupt demo files
* Fix issue where 1st demo of each episode is always recorded as 0 action
* Update Learning-Environment-Create-New.md
Section: Final Editor Setup, Step 3. It says:
Drag the Brain RollerBallPlayer from the Project window to the RollerAgent Brain field.
Should say:
Drag the Brain RollerBallBrain from the Project window to the RollerAgent Brain field.
* Develop black format fix (#1998)
* fixed the format
* changed the circleci config
* [Gym] Added no_graphics argument (#1997)
> Added the no_graphics argument to the gym interface. #1413
* [Documentation] SetReward method (#1996)
Added a paragraph in the docs/Learning-Environment-Design-Agents.md document regarding the use of SetReward and how it is different from AddReward
* [Documentation] Added information for the environments the trainer cannot train with the default configurations (#1995)
* Format gym_unity using black
* Added the builder script
* Removed the menu item
* Changed the brainToControl to public
* Added the scene for switching
* Modified according to the comments
* Removed the Builder and BuilderUtils script, made all of the logic into the Startup.cs
* Switched back to the previous way using PreExport method
* Added the return at the EOF.
* Resolved the codacy comments.
* Removed one empty line
* Resolved the second round of comments
When SubprocessUnityEnvironment was added, it also changed the way
environments use the "train_mode" flag. That change was intended to
be part of a separate change set, and it broke the CLI '--slow'
flag. This change reverts it, so that the slow / fast simulation
option works correctly.
As a minor additional change, the remaining tests from the top-level
'tests' folders have been moved into the new test folders.
* update title caps
* Rename Custom-Protos.md to Creating-Custom-Protobuf-Messages.md
* Updated with custom protobuf messages
* Clean up according to our doc guidelines
* Minor text revision
* Create Training-Concurrent-Unity-Instances
* Rename Training-Concurrent-Unity-Instances to Training-Concurrent-Unity-Instances.md
* update to right format for --num-envs
* added link to concurrent unity instances
* Update and rename Training-Concurrent-Unity-Instances.md to Training-Using-Concurrent-Unity-Instances.md
* Added considerations section
* Update Training-Using-Concurrent-Unity-Instances.md
* cleaned up language to match doc
* minor updates
* retroactive migration from 0.6 to 0.7
* Updated from 0.7 to 0.8 migration
* Minor typo
* minor fix
* accidentally duplicated step
* updated with new features list
On Windows the interrupt for subprocesses works in a different
way from OSX/Linux. The result is that child subprocesses and
their pipes may close while the parent process is still running
during a keyboard (ctrl+C) interrupt.
To handle this, this change adds handling for EOFError and
BrokenPipeError exceptions when interacting with subprocess
environments. When parallel runs are started with the "num-runs"
option, it also makes sure that the threads for each run are
joined and that KeyboardInterrupts are handled.
These changes make the "_win_handler" we used to specially
manage interrupts on Windows unnecessary, so it has been
removed.
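
The pipe handling described above can be sketched roughly as follows. This is an illustration only: `WorkerClosedError` and `recv_from_worker` are hypothetical names, not the names used in the change, and both pipe ends live in one process here to keep the example short.

```python
from multiprocessing import Pipe


class WorkerClosedError(Exception):
    """Hypothetical error raised when a worker's pipe has gone away."""


def recv_from_worker(conn):
    """Receive from a worker, turning pipe failures into one known error."""
    try:
        return conn.recv()
    except (EOFError, BrokenPipeError) as exc:
        # The other end closed, for example during a ctrl+C interrupt.
        raise WorkerClosedError("worker pipe closed") from exc


parent_conn, child_conn = Pipe()

# Normal case: the worker end sends a message and the parent receives it.
child_conn.send("hello")
first = recv_from_worker(parent_conn)

# Interrupt case: the worker end closes while the parent is still running.
child_conn.close()
try:
    recv_from_worker(parent_conn)
    closed_cleanly = False
except WorkerClosedError:
    closed_cleanly = True
```

Wrapping the low-level exceptions in one error type lets the caller shut down in a single place instead of handling each pipe failure separately.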
When using the SubprocessUnityEnvironment, parallel writes are
made to UnitySDK.log. This causes file access violation issues
in Windows/C#. This change modifies the access and sharing mode
for our writes to UnitySDK.log to fix the issue.
SubprocessUnityEnvironment sends an environment factory function to
each worker which it can use to create a UnityEnvironment to interact
with. We use Python's standard multiprocessing library, which pickles
all data sent to the subprocess. The built-in pickle library doesn't
pickle function objects on Windows machines (tested with Python 3.6 on
Windows 10 Pro).
This PR adds cloudpickle as a dependency in order to serialize the
environment factory. Other implementations of subprocess environments
do the same:
https://github.com/openai/baselines/blob/master/baselines/common/vec_env/subproc_vec_env.py
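
As a rough illustration of why cloudpickle helps, the sketch below tries to serialize a factory closure with both libraries. `make_env_factory` and `create_env` are hypothetical stand-ins, not the real factory. The sketch does not reproduce the Windows behavior described above. It only shows the general limitation that the built-in pickle serializes functions by reference, so it cannot handle locally defined functions, while cloudpickle serializes them by value. cloudpickle is a third-party package, so the round trip is skipped if it is not installed.

```python
import pickle

try:
    import cloudpickle  # third-party dependency
except ImportError:
    cloudpickle = None


def make_env_factory(worker_id_offset):
    """Hypothetical builder returning a closure like the one sent to workers."""

    def create_env(worker_id):
        # Stand-in for constructing a UnityEnvironment.
        return {"worker_id": worker_id + worker_id_offset}

    return create_env


factory = make_env_factory(10)

# The built-in pickle cannot serialize a locally defined function.
try:
    pickle.dumps(factory)
    stdlib_ok = True
except (pickle.PicklingError, AttributeError):
    stdlib_ok = False

# cloudpickle output can be loaded with the standard pickle module.
if cloudpickle is not None:
    restored = pickle.loads(cloudpickle.dumps(factory))
    round_trip = restored(1)
else:
    round_trip = None
```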
We need to document the meaning of the two new flags added for
multi-environment training. We may also want to add more specific
instructions for people wanting to speed up training in the future.