
Fixed various typos (#2652)

* Add console log section to Bug Report form (#2566)

* Fixed typos
Branch: /develop-gpu-test
Chris Elion, 5 years ago
Commit 7a178f12
7 changed files with 14 additions and 11 deletions
  1. .github/ISSUE_TEMPLATE/bug_report.md (3 changes)
  2. docs/Learning-Environment-Design-Agents.md (2 changes)
  3. docs/Migrating.md (2 changes)
  4. docs/Reward-Signals.md (6 changes)
  5. docs/Training-Generalized-Reinforcement-Learning-Agents.md (8 changes)
  6. docs/Training-Imitation-Learning.md (2 changes)
  7. docs/localized/KR/docs/Training-PPO.md (2 changes)

.github/ISSUE_TEMPLATE/bug_report.md (3 changes)


3. Scroll down to '....'
4. See error
+ **Console logs / stack traces**
+ Please wrap in [triple backticks (```)](https://help.github.com/en/articles/creating-and-highlighting-code-blocks) to make it easier to read.
**Screenshots**
If applicable, add screenshots to help explain your problem.

docs/Learning-Environment-Design-Agents.md (2 changes)


![RenderTexture with Raw Image](images/visual-observation-rawimage.png)
The [GridWorld environment](Learning-Environment-Examples.md#gridworld)
- is an example on how to use a RenderTexure for both debugging and observation. Note
+ is an example on how to use a RenderTexture for both debugging and observation. Note
that in this example, a Camera is rendered to a RenderTexture, which is then used for
observations and debugging. To update the RenderTexture, the Camera must be asked to
render every time a decision is requested within the game code. When using Cameras

docs/Migrating.md (2 changes)


## Migrating from ML-Agents toolkit v0.7 to v0.8
### Important Changes
- * We have split the Python packges into two seperate packages `ml-agents` and `ml-agents-envs`.
+ * We have split the Python packages into two separate packages `ml-agents` and `ml-agents-envs`.
* `--worker-id` option of `learn.py` has been removed, use `--base-port` instead if you'd like to run multiple instances of `learn.py`.
#### Steps to Migrate

docs/Reward-Signals.md (6 changes)


to reaching some goal. These are what we refer to as "extrinsic" rewards, as they are defined
external of the learning algorithm.
- Rewards, however, can be defined outside of the enviroment as well, to encourage the agent to
+ Rewards, however, can be defined outside of the environment as well, to encourage the agent to
behave in certain ways, or to aid the learning of the true extrinsic reward. We refer to these
rewards as "intrinsic" reward signals. The total reward that the agent will learn to maximize can
be a mix of extrinsic and intrinsic reward signals.
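
As a rough illustration of the mixing described above, here is a minimal Python sketch; the `strengths` weights and the helper name are hypothetical, for illustration only, and are not the toolkit's actual API.

```python
# Minimal sketch (hypothetical names, not the toolkit's actual API): the reward
# the agent learns to maximize is a weighted sum of its reward signals.
def total_reward(extrinsic, intrinsic_signals, strengths):
    """extrinsic: float reward coming from the environment.
    intrinsic_signals: dict of signal name -> float reward (e.g. curiosity).
    strengths: dict of signal name -> weight, including "extrinsic"."""
    reward = strengths.get("extrinsic", 1.0) * extrinsic
    for name, value in intrinsic_signals.items():
        reward += strengths.get(name, 1.0) * value
    return reward

# Example: an environment reward plus a small curiosity bonus.
print(total_reward(1.0, {"curiosity": 0.3}, {"extrinsic": 1.0, "curiosity": 0.02}))
```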

The `curiosity` Reward Signal enables the Intrinsic Curiosity Module. This is an implementation
of the approach described in "Curiosity-driven Exploration by Self-supervised Prediction"
by Pathak, et al. It trains two networks:
- * an inverse model, which takes the current and next obersvation of the agent, encodes them, and
+ * an inverse model, which takes the current and next observation of the agent, encodes them, and
- * a forward model, which takes the encoded current obseravation and action, and predicts the
+ * a forward model, which takes the encoded current observation and action, and predicts the
next encoded observation.
The loss of the forward model (the difference between the predicted and actual encoded observations) is used as the intrinsic reward, so the more surprised the model is, the larger the reward will be.
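
A minimal sketch of that last point, assuming a generic encoder and forward model: the intrinsic reward is the forward model's prediction error in feature space. The toy random-projection encoder and linear forward model below are placeholders so the sketch runs; they are not the actual ICM implementation.

```python
import numpy as np

# Sketch of the curiosity idea (not the actual ICM code): the intrinsic reward
# is the forward model's prediction error on the encoded next observation.
def curiosity_reward(encode, forward_model, obs, action, next_obs, scale=0.1):
    phi = encode(obs)                      # encoded current observation
    phi_next = encode(next_obs)            # encoded next observation
    phi_pred = forward_model(phi, action)  # forward model's prediction
    # Larger prediction error ("surprise") -> larger intrinsic reward.
    return scale * 0.5 * np.sum((phi_pred - phi_next) ** 2)

# Toy stand-ins: a fixed random projection as the encoder and an untrained
# linear forward model, purely so the example is self-contained.
rng = np.random.default_rng(0)
W_enc = rng.normal(size=(8, 4))
W_fwd = rng.normal(size=(4 + 2, 4))
encode = lambda obs: obs @ W_enc
forward_model = lambda phi, act: np.concatenate([phi, act]) @ W_fwd

obs, next_obs = rng.normal(size=8), rng.normal(size=8)
action = np.array([1.0, 0.0])
print(curiosity_reward(encode, forward_model, obs, action, next_obs))
```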

docs/Training-Generalized-Reinforcement-Learning-Agents.md (8 changes)


One of the challenges of training and testing agents on the same
environment is that the agents tend to overfit. The result is that the
- agents are unable to generalize to any tweaks or variations in the enviornment.
- This is analgous to a model being trained and tested on an identical dataset
+ agents are unable to generalize to any tweaks or variations in the environment.
+ This is analogous to a model being trained and tested on an identical dataset
- should be trained over multiple variations of the enviornment. Using this approach
+ should be trained over multiple variations of the environment. Using this approach
- to future unseen variations of the enviornment
+ to future unseen variations of the environment
_Example of variations of the 3D Ball environment._
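
A minimal sketch of the idea of training over environment variations, assuming a generic `env` with a `reset(config=...)` call; the parameter names, ranges, and environment interface are hypothetical placeholders, not the toolkit's actual reset-parameter mechanism.

```python
import random

# Hypothetical parameter ranges for illustration only.
PARAM_RANGES = {
    "gravity": (4.0, 105.0),
    "ball_scale": (0.2, 5.0),
}

def sample_variation(ranges):
    """Draw one value uniformly from each parameter's range."""
    return {name: random.uniform(lo, hi) for name, (lo, hi) in ranges.items()}

def train(env, policy, episodes=1000):
    for _ in range(episodes):
        # Resetting to a fresh variation each episode discourages the agent
        # from overfitting to a single fixed environment configuration.
        obs = env.reset(config=sample_variation(PARAM_RANGES))
        done = False
        while not done:
            obs, reward, done = env.step(policy(obs))
```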

docs/Training-Imitation-Learning.md (2 changes)


The ML-Agents toolkit provides several ways to learn from demonstrations.
- * To train using GAIL (Generative Adversarial Imitaiton Learning) you can add the
+ * To train using GAIL (Generative Adversarial Imitation Learning) you can add the
[GAIL reward signal](Reward-Signals.md#gail-reward-signal). GAIL can be
used with or without environment rewards, and works well when there are a limited
number of demonstrations.
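
As a rough sketch of how GAIL turns a learned discriminator into a reward signal, the form below (`-log(1 - D(s, a))`) is one common choice in GAIL implementations; the placeholder `discriminator` function is illustrative and not the toolkit's actual code.

```python
import math

# Sketch (not the toolkit's implementation): GAIL trains a discriminator D(s, a)
# to output values near 1 for demonstration-like behavior and near 0 for the
# agent's own behavior, then rewards the agent for resembling the demonstrations.
def gail_reward(discriminator, state, action, eps=1e-7):
    d = discriminator(state, action)      # estimated P(demonstration | s, a)
    d = min(max(d, eps), 1.0 - eps)       # clamp for numerical stability
    return -math.log(1.0 - d)             # larger when the agent "fools" D

# Toy placeholder discriminator so the sketch runs.
print(gail_reward(lambda s, a: 0.8, state=None, action=None))
```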

docs/localized/KR/docs/Training-PPO.md (2 changes)


### Beta
- `beta` determines the degree of entropy regularization (Entropy Regulazation), which makes the policy more random. This allows the agent to properly explore the action space during training. Increasing this value makes the agent take more random actions. Entropy (measurable via TensorBoard) should slowly decrease in magnitude as reward increases. If entropy drops too quickly, increase `beta`. If entropy drops too slowly, decrease `beta`.
+ `beta` determines the degree of entropy regularization (Entropy Regularization), which makes the policy more random. This allows the agent to properly explore the action space during training. Increasing this value makes the agent take more random actions. Entropy (measurable via TensorBoard) should slowly decrease in magnitude as reward increases. If entropy drops too quickly, increase `beta`. If entropy drops too slowly, decrease `beta`.
Typical range: 1e-4 - 1e-2
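
To make the role of `beta` concrete, here is a minimal sketch of an entropy-regularized policy loss of the kind used by PPO-style trainers; the function and argument names are placeholders, not the trainer's actual code.

```python
import numpy as np

# Sketch (placeholder names, not the trainer's actual code): beta scales an
# entropy bonus added to the policy objective, so a larger beta pushes the
# policy toward more random (higher-entropy) action distributions.
def policy_loss(surrogate_objective, action_probs, beta=5e-3):
    probs = np.clip(action_probs, 1e-10, 1.0)
    entropy = -np.sum(probs * np.log(probs))  # entropy of the action distribution
    # Maximizing (objective + beta * entropy) == minimizing its negation.
    return -(surrogate_objective + beta * entropy)

# A more random policy (higher entropy) yields a lower loss for the same objective.
print(policy_loss(1.0, np.array([0.25, 0.25, 0.25, 0.25])))  # high entropy
print(policy_loss(1.0, np.array([0.97, 0.01, 0.01, 0.01])))  # low entropy
```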
