Merge remote-tracking branch 'origin/development-0.3' into docs-training-brains-etc

7 years ago · 83ee0226
--- a/docs/Background-Machine-Learning.md
+++ b/docs/Background-Machine-Learning.md

 ## Unsupervised Learning

-The goal of unsupervised learning is to group or cluster similar items in a 
+The goal of 
+[unsupervised learning](https://en.wikipedia.org/wiki/Unsupervised_learning) is to group or cluster similar items in a 
 data set. For example, consider the players of a game. We may want to group 
 the players depending on how engaged they are with the game. This would enable
 us to target different groups (e.g. for highly-engaged players we might

 ## Supervised Learning

-In supervised learning, we do not want to just group similar items but directly
+In [supervised learning](https://en.wikipedia.org/wiki/Supervised_learning),
+we do not want to just group similar items but directly
 learn a mapping from each item to the group (or class) that it belongs to.
 Returning to our earlier example of
 clustering players, let's say we now wish to predict which of our players are

 ## Reinforcement Learning

-Reinforcement learning can be viewed as a form of learning for sequential
+[Reinforcement learning](https://en.wikipedia.org/wiki/Reinforcement_learning)
+can be viewed as a form of learning for sequential
 decision making that is commonly associated with controlling robots (but is,
 in fact, much more general). Consider an autonomous firefighting robot that is
 tasked with navigating into an area, finding the fire and neutralizing it. At

 ## Deep Learning

-To be completed.
+[Deep learning](https://en.wikipedia.org/wiki/Deep_learning) is a family of 
+algorithms that can be used to address any of the problems introduced 
+above. More specifically, they can be used to solve both attribute and 
+model selection tasks. Deep learning has gained popularity in recent 
+years due to its outstanding performance on several challenging machine learning 
+tasks. One example is [AlphaGo](https://en.wikipedia.org/wiki/AlphaGo), 
+a [computer Go](https://en.wikipedia.org/wiki/Computer_Go) program that 
+leverages deep learning and was able to beat Lee Sedol (a Go world champion).
-Link to TensorFlow background page.
+A key characteristic of deep learning algorithms is their ability to learn very
+complex functions from large amounts of training data. This makes them a
+natural choice for reinforcement learning tasks when a large amount of data
+can be generated, say through the use of a simulator or engine such as Unity.
+By generating hundreds of thousands of simulations of
+the environment within Unity, we can learn policies for very complex environments
+(a complex environment is one where the number of observations an agent perceives
+and the number of actions it can take are large).
+Many of the algorithms we provide in ML-Agents use some form of deep learning,
+built on top of the open-source library, [TensorFlow](Background-TensorFlow.md).
--- a/docs/Background-TensorFlow.md
+++ b/docs/Background-TensorFlow.md
 # Background: TensorFlow

-**Work In Progress**
+As discussed in our 
+[machine learning background page](Background-Machine-Learning.md), many of the
+algorithms we provide in ML-Agents leverage some form of deep learning.
+More specifically, our implementations are built on top of the open-source 
+library [TensorFlow](https://www.tensorflow.org/). This means that the models
+produced by ML-Agents are (currently) in a format only understood by
+TensorFlow. On this page we provide a brief overview of TensorFlow, as well as
+the TensorFlow-related tools that we leverage within ML-Agents.
-[TensorFlow](https://www.tensorflow.org/) is a deep learning library.
-
-Link to Arthur's content?
-
-A few words about TensorFlow and why/how it is relevant would be nice.
-
-TensorFlow is used for training the machine learning models in ML-Agents. 
-Unless you are implementing new algorithms, the use of TensorFlow
-is mostly abstracted away and behind the scenes. 
+[TensorFlow](https://www.tensorflow.org/) is an open-source library for
+performing computations using data flow graphs, the underlying representation
+of deep learning models. It facilitates training and inference on CPUs and
+GPUs on a desktop, server, or mobile device. Within ML-Agents, when you
+train the behavior of an Agent, the output is a TensorFlow model (.bytes)
+file that you can then embed within an Internal Brain. Unless you implement 
+a new algorithm, the use of TensorFlow is mostly abstracted away and behind 
+the scenes. 

 ## TensorBoard


 ## TensorFlowSharp

-Third-party used in Internal Brain mode.
-
-
-
+One of the drawbacks of TensorFlow is that it does not provide a native
+C# API. This means that the Internal Brain is not natively supported since
+Unity scripts are written in C#. Consequently,
+to enable the Internal Brain, we leverage a third-party 
+library, [TensorFlowSharp](https://github.com/migueldeicaza/TensorFlowSharp), 
+which provides .NET bindings to TensorFlow. Thus, when a Unity environment
+that contains an Internal Brain is built, inference is performed via
+TensorFlowSharp. We provide an additional in-depth overview of how to
+leverage [TensorFlowSharp within Unity](Using-TensorFlow-Sharp-in-Unity.md),
+which will become more relevant once you install and start training
+behaviors within ML-Agents. Given the reliance on TensorFlowSharp, the
+Internal Brain is currently marked as experimental.