
Merge pull request #1003 from dericp/develop-curriculum-learning-rework

Curriculum learning now supports multiple brains.
/develop-generalizationTraining-TrainerController
GitHub 6 years ago
Current commit
322d2bbe
21 changed files, with 536 additions and 204 deletions
  1. .gitignore (5 changes)
  2. docs/Training-Curriculum-Learning.md (124 changes)
  3. python/tests/test_unitytrainers.py (53 changes)
  4. python/unityagents/environment.py (4 changes)
  5. python/unitytrainers/__init__.py (1 change)
  6. python/unitytrainers/curriculum.py (107 changes)
  7. python/unitytrainers/exception.py (5 changes)
  8. python/unitytrainers/trainer.py (8 changes)
  9. python/unitytrainers/trainer_controller.py (89 changes)
  10. python/curricula/wall-jump/BigWallBrain.json (3 changes)
  11. python/tests/test_curriculum.py (93 changes)
  12. python/tests/test_meta_curriculum.py (109 changes)
  13. python/unitytrainers/meta_curriculum.py (105 changes)
  14. python/curricula/push-block/PushBlockBrain.json (12 changes)
  15. python/curricula/wall-jump/SmallWallBrain.json (10 changes)
  16. python/curricula/push.json (12 changes)
  17. /python/curricula/test/TestBrain.json (renamed)
  18. /python/curricula/wall-jump/BigWallBrain.json (renamed)

.gitignore (5 changes)


*.eggs*
*.gitignore.swp
# VSCode hidden files
*.vscode/
.DS_Store
# pytest cache
*.pytest_cache/

docs/Training-Curriculum-Learning.md (124 changes)


## Sample Environment
Imagine a task in which an agent needs to scale a wall to arrive at a goal. The
starting point when training an agent to accomplish this task will be a random
policy. That starting policy will have the agent running in circles, and will
likely never, or very rarely, scale the wall properly to achieve the reward.
If we start with a simpler task, such as moving toward an unobstructed goal,
then the agent can easily learn to accomplish the task. From there, we can
slowly add to the difficulty of the task by increasing the size of the wall,
until the agent can complete the initially near-impossible task of scaling the
wall. We are including just such an environment with the ML-Agents toolkit 0.2,
called __Wall Jump__.
_Demonstration of a curriculum training scenario in which a progressively taller
wall obstructs the path to the goal._
To see this in action, observe the two learning curves below. Each displays the
reward over time for an agent trained using PPO with the same set of training
hyperparameters. The difference is that one agent was trained using the
full-height wall version of the task, and the other agent was trained using the
curriculum version of the task. As you can see, without using curriculum
learning the agent has a lot of difficulty. We think that by using well-crafted
curricula, agents trained using reinforcement learning will be able to
accomplish tasks that would otherwise be much more difficult.
Each Brain in an environment can have a corresponding curriculum. These
curriculums are held in what we call a metacurriculum. A metacurriculum allows
different brains to follow different curriculums within the same environment.
### Specifying a Metacurriculum
We first create a folder inside `python/curricula/` for the environment we want
to use curriculum learning with. For example, if we were creating a
metacurriculum for Wall Jump, we would create the folder
`python/curricula/wall-jump/`. We will place our curriculums inside this folder.
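To make the naming convention concrete, here is a minimal sketch (not part of
the PR) that lists such a folder and derives brain names the same way
`MetaCurriculum` does, by taking the basename of each file without its
extension:

```python
import os

# List the Wall Jump metacurriculum folder; each JSON file names the Brain
# whose curriculum it holds, e.g. 'BigWallBrain.json' -> 'BigWallBrain'.
for curriculum_filename in os.listdir('python/curricula/wall-jump/'):
    brain_name = curriculum_filename.split('.')[0]
    print(brain_name, '->', curriculum_filename)
```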
### Specifying a Curriculum
In order to define a curriculum, the first step is to decide which parameters of
the environment will vary. In the case of the Wall Jump environment, what varies
is the height of the wall. We define this as a `Reset Parameter` in the Academy
object of our scene, and by doing so it becomes adjustable via the Python API.
Rather than adjusting it by hand, we will create a JSON file which
describes the structure of the curriculum. Within it, we can specify which
points in the training process our wall height will change, either based on the
percentage of training steps which have taken place, or what the average reward
the agent has received in the recent past is. Below is an example curriculum for
the BigWallBrain in the Wall Jump environment.
```json
{
    "measure" : "progress",
    "thresholds" : [0.1, 0.3, 0.5],
    "min_lesson_length" : 2,
    "signal_smoothing" : true,
    "parameters" :
    {
        "big_wall_min_height" : [0.0, 4.0, 6.0, 8.0],
        "big_wall_max_height" : [4.0, 7.0, 8.0, 8.0]
    }
}
```
* `measure` - What to measure learning progress, and advancement in lessons, by.
  * `reward` - Uses a measure of received reward.
  * `progress` - Uses the ratio of steps to max_steps.
* `thresholds` (float array) - Points in value of `measure` where the lesson
  should be increased.
* `min_lesson_length` (int) - How many times the progress measure should be
  reported before incrementing the lesson.
* `signal_smoothing` (true/false) - Whether to weight the current progress
  measure by previous values.
* `parameters` (dictionary of key:string, value:float array) - Corresponds to
  Academy reset parameters to control. The length of each array should be one
  greater than the number of thresholds; a quick validation sketch follows
  this list.
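This hedged sketch (not part of the PR; the file path is the Wall Jump example
from above) checks that length constraint for every parameter in a curriculum
file:

```python
import json

# Each parameter array must hold len(thresholds) + 1 values: one per lesson.
with open('python/curricula/wall-jump/BigWallBrain.json') as data_file:
    data = json.load(data_file)

expected = len(data['thresholds']) + 1
for name, values in data['parameters'].items():
    assert len(values) == expected, \
        '{0} has {1} values but needs {2}'.format(name, len(values), expected)
```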
Once our curriculum is defined, we have to use the reset parameters we defined
and modify the environment from the agent's `AgentReset()` function. See
[WallJumpAgent.cs](https://github.com/Unity-Technologies/ml-agents/blob/master/unity-environment/Assets/ML-Agents/Examples/WallJump/Scripts/WallJumpAgent.cs)
for an example. Note that if the Academy's __Max Steps__ is not set to some
positive number the environment will never be reset. The Academy must reset
for the environment to reset.
We will save this file into our metacurriculum folder with the name of its
corresponding Brain. For example, in the Wall Jump environment, there are two
brains---BigWallBrain and SmallWallBrain. If we want to define a curriculum for
the BigWallBrain, we will save `BigWallBrain.json` into
`python/curricula/wall-jump/`.
### Training with a Curriculum
Once we have specified our metacurriculum and curriculums, we can launch
`learn.py` using the `--curriculum` flag to point to the metacurriculum folder
and PPO will train using Curriculum Learning. For example, to train agents in
the Wall Jump environment with curriculum learning, we can run `python learn.py
--curriculum=curricula/wall-jump/ --run-id=wall-jump-curriculum --train`. We can
then keep track of the current lessons and progress via TensorBoard.
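Under the hood, the trainer controller gathers one progress value per brain,
asks the metacurriculum to advance lessons, and resets the environment with
the combined config. The sketch below paraphrases that flow from
`trainer_controller.py` in this PR; the `trainers`, `env`, and
`meta_curriculum` objects are assumed to already be set up:

```python
# Paraphrased sketch of the curriculum bookkeeping in trainer_controller.py.
# Assumes `trainers` (dict of brain name to trainer), `env`, and
# `meta_curriculum` are already constructed.
progresses = {}
for brain_name, curriculum in meta_curriculum.brains_to_curriculums.items():
    if curriculum.measure == 'progress':
        # Fraction of the maximum training steps taken so far.
        progresses[brain_name] = (trainers[brain_name].get_step /
                                  trainers[brain_name].get_max_steps)
    elif curriculum.measure == 'reward':
        progresses[brain_name] = trainers[brain_name].get_last_reward

meta_curriculum.increment_lessons(progresses)
env.reset(config=meta_curriculum.get_config(), train_mode=True)
```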

python/tests/test_unitytrainers.py (53 changes)


memory_size: 8
''')
dummy_curriculum = json.loads('''{
    "measure" : "reward",
    "thresholds" : [10, 20, 50],
    "min_lesson_length" : 3,
    "signal_smoothing" : true,
    "parameters" :
    {
        "param1" : [0.7, 0.5, 0.3, 0.1],
        "param2" : [100, 50, 20, 15],
        "param3" : [0.2, 0.3, 0.7, 0.9]
    }
}''')
bad_curriculum = json.loads('''{
    "measure" : "reward",
    "thresholds" : [10, 20, 50],
    "min_lesson_length" : 3,
    "signal_smoothing" : false,
    "parameters" :
    {
        "param1" : [0.7, 0.5, 0.3, 0.1],
        "param2" : [100, 50, 20],
        "param3" : [0.2, 0.3, 0.7, 0.9]
    }
}''')
@mock.patch('unityagents.UnityEnvironment.executable_launcher')
@mock.patch('unityagents.UnityEnvironment.get_communicator')

batch_size=None, training_length=2)
assert len(b.update_buffer['action']) == 10
assert np.array(b.update_buffer['action']).shape == (10, 2, 2)
def test_curriculum():
    open_name = '%s.open' % __name__
    with mock.patch('json.load') as mock_load:
        with mock.patch(open_name, create=True) as mock_open:
            mock_open.return_value = 0
            mock_load.return_value = bad_curriculum
            with pytest.raises(CurriculumError):
                Curriculum('tests/test_unityagents.py', {"param1": 1, "param2": 1, "param3": 1})
            mock_load.return_value = dummy_curriculum
            with pytest.raises(CurriculumError):
                Curriculum('tests/test_unityagents.py', {"param1": 1, "param2": 1})
            curriculum = Curriculum('tests/test_unityagents.py', {"param1": 1, "param2": 1, "param3": 1})
            assert curriculum.get_lesson_number == 0
            curriculum.set_lesson_number(1)
            assert curriculum.get_lesson_number == 1
            curriculum.increment_lesson(10)
            assert curriculum.get_lesson_number == 1
            curriculum.increment_lesson(30)
            curriculum.increment_lesson(30)
            assert curriculum.get_lesson_number == 1
            assert curriculum.lesson_length == 3
            curriculum.increment_lesson(30)
            assert curriculum.get_config() == {'param1': 0.3, 'param2': 20, 'param3': 0.7}
            assert curriculum.get_config(0) == {"param1": 0.7, "param2": 100, "param3": 0.2}
            assert curriculum.lesson_length == 0
            assert curriculum.get_lesson_number == 2
if __name__ == '__main__':

python/unityagents/environment.py (4 changes)


"""
if config is None:
config = self._resetParameters
elif config != {}:
logger.info("\nAcademy Reset with parameters : \t{0}"
elif config:
logger.info("Academy reset with parameters: {0}"
.format(', '.join([str(x) + ' -> ' + str(config[x]) for x in config])))
for k in config:
if (k in self._resetParameters) and (isinstance(config[k], (int, float))):

python/unitytrainers/__init__.py (1 change)


from .buffer import *
from .curriculum import *
from .meta_curriculum import *
from .models import *
from .trainer_controller import *
from .bc.models import *

python/unitytrainers/curriculum.py (107 changes)


import os
import json
import logging

from .exception import CurriculumError

logger = logging.getLogger('unitytrainers')


class Curriculum(object):

    def __init__(self, location, default_reset_parameters):
        """
        Initializes a Curriculum object.
        :param location: Path to JSON defining curriculum.
        :param default_reset_parameters: Set of reset parameters for environment.
        """
        self.lesson_length = 0
        self.max_lesson_num = 0
        self.measure = None
        self._lesson_num = 0
        # The name of the brain should be the basename of the file without the
        # extension.
        self._brain_name = os.path.basename(location).split('.')[0]
        try:
            with open(location) as data_file:
                self.data = json.load(data_file)
        except IOError:
            raise CurriculumError(
                'The file {0} could not be found.'.format(location))
        except UnicodeDecodeError:
            raise CurriculumError('There was an error decoding {}'.format(location))
        self.smoothing_value = 0
        for key in ['parameters', 'measure', 'thresholds',
                    'min_lesson_length', 'signal_smoothing']:
            if key not in self.data:
                raise CurriculumError("{0} does not contain a "
                                      "{1} field.".format(location, key))
        parameters = self.data['parameters']
        self.measure = self.data['measure']
        self.max_lesson_num = len(self.data['thresholds'])
        for key in parameters:
            if key not in default_reset_parameters:
                raise CurriculumError(
                    'The parameter {0} in Curriculum {1} is not present in '
                    'the Environment'.format(key, location))
            if len(parameters[key]) != self.max_lesson_num + 1:
                raise CurriculumError(
                    'The parameter {0} in Curriculum {1} must have {2} values '
                    'but {3} were found'.format(key, location,
                                                self.max_lesson_num + 1,
                                                len(parameters[key])))

    @property
    def lesson_num(self):
        return self._lesson_num

    @lesson_num.setter
    def lesson_num(self, lesson_num):
        self._lesson_num = max(0, min(lesson_num, self.max_lesson_num))

    def increment_lesson(self, progress):
        """
        Increments the lesson number depending on the progress given.
        :param progress: Measure of progress (either reward or percentage of
               steps completed).
        """
        if self.data is None or progress is None:
            return
        if self.data['signal_smoothing']:
            # Weight the current progress measure by previous values.
            progress = self.smoothing_value * 0.25 + 0.75 * progress
            self.smoothing_value = progress
        self.lesson_length += 1
        if self.lesson_num < self.max_lesson_num:
            if ((progress > self.data['thresholds'][self.lesson_num]) and
                    (self.lesson_length > self.data['min_lesson_length'])):
                self.lesson_length = 0
                self.lesson_num += 1
                config = {}
                parameters = self.data['parameters']
                for key in parameters:
                    config[key] = parameters[key][self.lesson_num]
                logger.info('{0} lesson changed. Now in lesson {1}: {2}'
                            .format(self._brain_name,
                                    self.lesson_num,
                                    ', '.join([str(x) + ' -> ' + str(config[x])
                                               for x in config])))

    def get_config(self, lesson=None):
        """
        Returns reset parameters which correspond to the lesson.
        :param lesson: The lesson to get the config of. If None, the current
               lesson is used.
        """
        if self.data is None:
            return {}
        if lesson is None:
            lesson = self.lesson_num
        lesson = max(0, min(lesson, self.max_lesson_num))
        config = {}
        parameters = self.data['parameters']
        for key in parameters:
            config[key] = parameters[key][lesson]
        return config
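A hedged usage sketch for the class above; the file path and reset parameters
mirror the Wall Jump example, and the progress values are illustrative:

```python
from unitytrainers import Curriculum

# Assumes the BigWallBrain curriculum shown in the docs section above
# (measure 'progress', thresholds [0.1, 0.3, 0.5], min_lesson_length 2).
curriculum = Curriculum('python/curricula/wall-jump/BigWallBrain.json',
                        {'big_wall_min_height': 0.0,
                         'big_wall_max_height': 4.0})

# The lesson advances only once the smoothed progress passes the current
# threshold and the lesson has been reported more than min_lesson_length times.
for _ in range(4):
    curriculum.increment_lesson(0.35)

print(curriculum.lesson_num)    # 1 with the values assumed above
print(curriculum.get_config())  # reset parameters for the current lesson
```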

python/unitytrainers/exception.py (5 changes)


    Any error related to training with a curriculum.
    """
    pass


class MetaCurriculumError(TrainerError):
    """
    Any error related to the configuration of a metacurriculum.
    """
    pass

python/unitytrainers/trainer.py (8 changes)


from unityagents import UnityException, AllBrainInfo

logger = logging.getLogger("unitytrainers")


class UnityTrainerException(UnityException):

        """
        raise UnityTrainerException("The update_model method was not implemented.")

    def write_summary(self, lesson_num):
        """
        :param lesson_num: The lesson the trainer is at.
        """
        if (self.get_step % self.trainer_parameters['summary_freq'] == 0 and self.get_step != 0 and
                self.is_training and self.get_step <= self.get_max_steps):

                stat_mean = float(np.mean(self.stats[key]))
                summary.value.add(tag='Info/{}'.format(key), simple_value=stat_mean)
                self.stats[key] = []
            summary.value.add(tag='Info/Lesson', simple_value=lesson_num)
            self.summary_writer.add_summary(summary, self.get_step)
            self.summary_writer.flush()

python/unitytrainers/trainer_controller.py (89 changes)


# Launches unitytrainers for each External Brain in a Unity Environment
import logging
import re

import numpy as np
import tensorflow as tf

from unityagents import UnityEnvironment, UnityEnvironmentException
from unitytrainers.meta_curriculum import MetaCurriculum
from unitytrainers.exception import MetaCurriculumError
    def __init__(self, env_path, run_id, save_freq, curriculum_folder, fast_simulation, load, train,
                 worker_id, keep_checkpoints, lesson, seed, docker_target_name, trainer_config_path,
                 no_graphics):
        """

        :param curriculum_folder: Folder containing JSON curriculums for the env
        :param fast_simulation: Whether to run the game at training speed
        :param load: Whether to load the model or randomly initialize
        :param train: Whether to train model, or only run inference

        :param no_graphics: Whether to run the Unity simulator in no-graphics mode
        """
        self.trainer_config_path = trainer_config_path
        if env_path is not None:
            env_path = (env_path.strip()
                        .replace('.app', '')

        self.curriculum_folder = curriculum_folder
        self.summaries_dir = './summaries'
        else:
            self.docker_training = True

            if env_path is not None:
                env_path = '/{docker_target_name}/{env_name}'.format(docker_target_name=docker_target_name,
                                                                     env_name=env_path)
            if curriculum_folder is not None:
                self.curriculum_folder = '/{docker_target_name}/{curriculum_folder}'.format(
                    docker_target_name=docker_target_name,
                    curriculum_folder=curriculum_folder)
        self.logger = logging.getLogger("unityagents")
        self.run_id = run_id
        self.save_freq = save_freq

            self.env_name = 'editor_' + self.env.academy_name
        else:
            # Extract out name of environment
            self.env_name = os.path.basename(os.path.normpath(env_path))

        if curriculum_folder is None:
            self.meta_curriculum = None
        else:
            self.meta_curriculum = MetaCurriculum(self.curriculum_folder,
                                                  self.env._resetParameters)

        if self.meta_curriculum is not None and self.curriculum_folder is not None:
            for brain_name in self.meta_curriculum.brains_to_curriculums.keys():
                if brain_name not in self.env.external_brain_names:
                    raise MetaCurriculumError('One of the curriculums '
                                              'defined in ' +
                                              self.curriculum_folder + ' '
                                              'does not have a corresponding '
                                              'Brain. Check that the '
                                              'curriculum file has the same '
                                              'name as the Brain '
                                              'whose curriculum it defines.')
    def _get_progresses(self):
        if self.meta_curriculum is not None:
            brain_names_to_progresses = {}
            for brain_name, curriculum in self.meta_curriculum.brains_to_curriculums.items():
                if curriculum.measure == "progress":
                    progress = (self.trainers[brain_name].get_step /
                                self.trainers[brain_name].get_max_steps)
                    brain_names_to_progresses[brain_name] = progress
                elif curriculum.measure == "reward":
                    progress = self.trainers[brain_name].get_last_reward
                    brain_names_to_progresses[brain_name] = progress
            return brain_names_to_progresses
        else:
            return None

    def _initialize_trainers(self, trainer_config, sess):
        trainer_parameters_dict = {}
        # TODO: This probably doesn't need to be reinitialized.
        self.trainers = {}
        for brain_name in self.env.external_brain_names:
            trainer_parameters = trainer_config['default'].copy()

                        .format(model_path))

    def start_learning(self):
        # TODO: Should be able to start learning at different lesson numbers
        # for each curriculum.
        self.meta_curriculum.set_all_curriculums_to_lesson_num(self.lesson)
        trainer_config = self._load_config()
        self._create_model_path(self.model_path)

            self._initialize_trainers(trainer_config, sess)
            for _, t in self.trainers.items():
                self.logger.info(t)
            init = tf.global_variables_initializer()
            saver = tf.train.Saver(max_to_keep=self.keep_checkpoints)

            else:
                sess.run(init)
            global_step = 0  # This is only for saving the model
            self.meta_curriculum.increment_lessons(self._get_progresses())
            curr_info = self.env.reset(config=self.meta_curriculum.get_config(),
                                       train_mode=self.fast_simulation)
            if self.train_model:
                for brain_name, trainer in self.trainers.items():
                    trainer.write_tensorboard_text('Hyperparameters', trainer.parameters)

                    self.meta_curriculum.increment_lessons(self._get_progresses())
                    curr_info = self.env.reset(config=self.meta_curriculum.get_config(),
                                               train_mode=self.fast_simulation)
                    for brain_name, trainer in self.trainers.items():
                        trainer.end_episode()
                # Decide and take an action

                        # Perform gradient descent with experience buffer
                        trainer.update_model()
                    # Write training statistics to Tensorboard.
                    trainer.write_summary(self.meta_curriculum.brains_to_curriculums[brain_name].lesson_num)
                    if self.train_model and trainer.get_step <= trainer.get_max_steps:
                        trainer.increment_step_and_update_last_reward()
                if self.train_model:

python/curricula/wall-jump/BigWallBrain.json (3 changes)


"parameters" :
{
"big_wall_min_height" : [0.0, 4.0, 6.0, 8.0],
"big_wall_max_height" : [4.0, 7.0, 8.0, 8.0],
"small_wall_height" : [1.5, 2.0, 2.5, 4.0]
"big_wall_max_height" : [4.0, 7.0, 8.0, 8.0]
}
}

python/tests/test_curriculum.py (93 changes)


import pytest
import json
from unittest.mock import patch, mock_open

from unitytrainers.exception import CurriculumError
from unitytrainers import Curriculum

dummy_curriculum_json_str = '''
{
    "measure" : "reward",
    "thresholds" : [10, 20, 50],
    "min_lesson_length" : 3,
    "signal_smoothing" : true,
    "parameters" :
    {
        "param1" : [0.7, 0.5, 0.3, 0.1],
        "param2" : [100, 50, 20, 15],
        "param3" : [0.2, 0.3, 0.7, 0.9]
    }
}
'''

bad_curriculum_json_str = '''
{
    "measure" : "reward",
    "thresholds" : [10, 20, 50],
    "min_lesson_length" : 3,
    "signal_smoothing" : false,
    "parameters" :
    {
        "param1" : [0.7, 0.5, 0.3, 0.1],
        "param2" : [100, 50, 20],
        "param3" : [0.2, 0.3, 0.7, 0.9]
    }
}
'''


@pytest.fixture
def location():
    return 'TestBrain.json'


@pytest.fixture
def default_reset_parameters():
    return {"param1": 1, "param2": 1, "param3": 1}


@patch('builtins.open', new_callable=mock_open, read_data=dummy_curriculum_json_str)
def test_init_curriculum_happy_path(mock_file, location, default_reset_parameters):
    curriculum = Curriculum(location, default_reset_parameters)

    assert curriculum._brain_name == 'TestBrain'
    assert curriculum.lesson_num == 0
    assert curriculum.measure == 'reward'


@patch('builtins.open', new_callable=mock_open, read_data=bad_curriculum_json_str)
def test_init_curriculum_bad_curriculum_raises_error(mock_file, location, default_reset_parameters):
    with pytest.raises(CurriculumError):
        Curriculum(location, default_reset_parameters)


@patch('builtins.open', new_callable=mock_open, read_data=dummy_curriculum_json_str)
def test_increment_lesson(mock_file, location, default_reset_parameters):
    curriculum = Curriculum(location, default_reset_parameters)
    assert curriculum.lesson_num == 0

    curriculum.lesson_num = 1
    assert curriculum.lesson_num == 1

    curriculum.increment_lesson(10)
    assert curriculum.lesson_num == 1

    curriculum.increment_lesson(30)
    curriculum.increment_lesson(30)
    assert curriculum.lesson_num == 1
    assert curriculum.lesson_length == 3

    curriculum.increment_lesson(30)
    assert curriculum.lesson_length == 0
    assert curriculum.lesson_num == 2


@patch('builtins.open', new_callable=mock_open, read_data=dummy_curriculum_json_str)
def test_get_config(mock_file):
    curriculum = Curriculum('TestBrain.json', {"param1": 1, "param2": 1, "param3": 1})
    assert curriculum.get_config() == {"param1": 0.7, "param2": 100, "param3": 0.2}

    curriculum.lesson_num = 2
    assert curriculum.get_config() == {'param1': 0.3, 'param2': 20, 'param3': 0.7}
    assert curriculum.get_config(0) == {"param1": 0.7, "param2": 100, "param3": 0.2}

python/tests/test_meta_curriculum.py (109 changes)


import pytest
from unittest.mock import patch, call, Mock

from unitytrainers.meta_curriculum import MetaCurriculum
from unitytrainers.exception import MetaCurriculumError


class MetaCurriculumTest(MetaCurriculum):
    """This class allows us to test MetaCurriculum objects without calling
    MetaCurriculum's __init__ function.
    """
    def __init__(self, brains_to_curriculums):
        self._brains_to_curriculums = brains_to_curriculums


@pytest.fixture
def default_reset_parameters():
    return {'param1': 1, 'param2': 2, 'param3': 3}


@pytest.fixture
def more_reset_parameters():
    return {'param4': 4, 'param5': 5, 'param6': 6}


@pytest.fixture
def progresses():
    return {'Brain1': 0.2, 'Brain2': 0.3}


@patch('unitytrainers.Curriculum.get_config', return_value={})
@patch('unitytrainers.Curriculum.__init__', return_value=None)
@patch('os.listdir', return_value=['Brain1.json', 'Brain2.json'])
def test_init_meta_curriculum_happy_path(listdir, mock_curriculum_init,
                                         mock_curriculum_get_config,
                                         default_reset_parameters):
    meta_curriculum = MetaCurriculum('test/', default_reset_parameters)

    assert len(meta_curriculum.brains_to_curriculums) == 2
    assert 'Brain1' in meta_curriculum.brains_to_curriculums
    assert 'Brain2' in meta_curriculum.brains_to_curriculums

    calls = [call('test/Brain1.json', default_reset_parameters),
             call('test/Brain2.json', default_reset_parameters)]
    mock_curriculum_init.assert_has_calls(calls)


@patch('os.listdir', side_effect=NotADirectoryError())
def test_init_meta_curriculum_bad_curriculum_folder_raises_error(listdir):
    with pytest.raises(MetaCurriculumError):
        MetaCurriculum('test/', default_reset_parameters)


@patch('unitytrainers.Curriculum')
@patch('unitytrainers.Curriculum')
def test_set_lesson_nums(curriculum_a, curriculum_b):
    meta_curriculum = MetaCurriculumTest({'Brain1': curriculum_a,
                                          'Brain2': curriculum_b})

    meta_curriculum.lesson_nums = {'Brain1': 1, 'Brain2': 3}

    assert curriculum_a.lesson_num == 1
    assert curriculum_b.lesson_num == 3


@patch('unitytrainers.Curriculum')
@patch('unitytrainers.Curriculum')
def test_increment_lessons(curriculum_a, curriculum_b, progresses):
    meta_curriculum = MetaCurriculumTest({'Brain1': curriculum_a,
                                          'Brain2': curriculum_b})

    meta_curriculum.increment_lessons(progresses)

    curriculum_a.increment_lesson.assert_called_with(0.2)
    curriculum_b.increment_lesson.assert_called_with(0.3)


@patch('unitytrainers.Curriculum')
@patch('unitytrainers.Curriculum')
def test_set_all_curriculums_to_lesson_num(curriculum_a, curriculum_b):
    meta_curriculum = MetaCurriculumTest({'Brain1': curriculum_a,
                                          'Brain2': curriculum_b})

    meta_curriculum.set_all_curriculums_to_lesson_num(2)

    assert curriculum_a.lesson_num == 2
    assert curriculum_b.lesson_num == 2


@patch('unitytrainers.Curriculum')
@patch('unitytrainers.Curriculum')
def test_get_config(curriculum_a, curriculum_b, default_reset_parameters,
                    more_reset_parameters):
    curriculum_a.get_config.return_value = default_reset_parameters
    curriculum_b.get_config.return_value = default_reset_parameters
    meta_curriculum = MetaCurriculumTest({'Brain1': curriculum_a,
                                          'Brain2': curriculum_b})

    assert meta_curriculum.get_config() == default_reset_parameters

    curriculum_b.get_config.return_value = more_reset_parameters
    new_reset_parameters = dict(default_reset_parameters)
    new_reset_parameters.update(more_reset_parameters)
    assert meta_curriculum.get_config() == new_reset_parameters

python/unitytrainers/meta_curriculum.py (105 changes)


"""Contains the MetaCurriculum class."""
import os
from unitytrainers.curriculum import Curriculum
from unitytrainers.exception import MetaCurriculumError
import logging
logger = logging.getLogger('unitytrainers')
class MetaCurriculum(object):
"""A MetaCurriculum holds curriculums. Each curriculum is associated to a particular
brain in the environment.
"""
def __init__(self, curriculum_folder, default_reset_parameters):
"""Initializes a MetaCurriculum object.
Args:
curriculum_folder (str): The relative or absolute path of the
folder which holds the curriculums for this environment.
The folder should contain JSON files whose names are the
brains that the curriculums belong to.
default_reset_parameters (dict): The default reset parameters
of the environment.
"""
used_reset_parameters = set()
self._brains_to_curriculums = {}
try:
for curriculum_filename in os.listdir(curriculum_folder):
brain_name = curriculum_filename.split('.')[0]
curriculum_filepath = \
os.path.join(curriculum_folder, curriculum_filename)
curriculum = Curriculum(curriculum_filepath, default_reset_parameters)
# Check if any two curriculums use the same reset params.
if any([(parameter in curriculum.get_config().keys()) for parameter in used_reset_parameters]):
logger.warning('Two or more curriculums will '
'attempt to change the same reset '
'parameter. The result will be '
'non-deterministic.')
used_reset_parameters.update(curriculum.get_config().keys())
self._brains_to_curriculums[brain_name] = curriculum
except NotADirectoryError:
raise MetaCurriculumError(curriculum_folder + ' is not a '
'directory. Refer to the ML-Agents '
'curriculum learning docs.')
@property
def brains_to_curriculums(self):
"""A dict from brain_name to the brain's curriculum."""
return self._brains_to_curriculums
@property
def lesson_nums(self):
"""A dict from brain name to the brain's curriculum's lesson number."""
lesson_nums = {}
for brain_name, curriculum in self.brains_to_curriculums.items():
lesson_nums[brain_name] = curriculum.lesson_num
return lesson_nums
@lesson_nums.setter
def lesson_nums(self, lesson_nums):
for brain_name, lesson in lesson_nums.items():
self.brains_to_curriculums[brain_name].lesson_num = lesson
def increment_lessons(self, progresses):
"""Increments all the lessons of all the curriculums in this MetaCurriculum.
Args:
progresses (dict): A dict of brain name to progress.
"""
for brain_name, progress in progresses.items():
self.brains_to_curriculums[brain_name].increment_lesson(progress)
def set_all_curriculums_to_lesson_num(self, lesson_num):
"""Sets all the curriculums in this meta curriculum to a specified lesson number.
Args:
lesson_num (int): The lesson number which all the curriculums will
be set to.
"""
for _, curriculum in self.brains_to_curriculums.items():
curriculum.lesson_num = lesson_num
def get_config(self):
"""Get the combined configuration of all curriculums in this MetaCurriculum.
Returns:
A dict from parameter to value.
"""
config = {}
for _, curriculum in self.brains_to_curriculums.items():
curr_config = curriculum.get_config()
config.update(curr_config)
return config
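A hedged usage sketch for `MetaCurriculum`; the folder and parameter names
mirror the Wall Jump example, and the progress values are illustrative:

```python
from unitytrainers.meta_curriculum import MetaCurriculum

# Assumes the wall-jump metacurriculum folder described in the docs above.
meta_curriculum = MetaCurriculum('python/curricula/wall-jump/',
                                 {'big_wall_min_height': 0.0,
                                  'big_wall_max_height': 4.0,
                                  'small_wall_height': 1.5})

# One progress value per brain, keyed by brain name (the JSON file basename).
meta_curriculum.increment_lessons({'BigWallBrain': 0.2,
                                   'SmallWallBrain': 0.2})

print(meta_curriculum.lesson_nums)   # per-brain lesson numbers
print(meta_curriculum.get_config())  # combined config for env.reset(config=...)
```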

python/curricula/push-block/PushBlockBrain.json (12 changes)


{
    "measure" : "reward",
    "thresholds" : [0.75, 0.75, 0.75, 0.75, 0.75, 0.75, 0.75, 0.75, 0.75, 0.75, 0.75, 0.75, 0.75, 0.75, 0.75],
    "min_lesson_length" : 2,
    "signal_smoothing" : true,
    "parameters" :
    {
        "goal_size" : [3.5, 3.25, 3.0, 2.75, 2.5, 2.25, 2.0, 1.75, 1.5, 1.25, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
        "block_size": [1.5, 1.4, 1.3, 1.2, 1.1, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
        "x_variation":[1.5, 1.55, 1.6, 1.65, 1.7, 1.75, 1.8, 1.85, 1.9, 1.95, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5]
    }
}

python/curricula/wall-jump/SmallWallBrain.json (10 changes)


{
    "measure" : "progress",
    "thresholds" : [0.1, 0.3, 0.5],
    "min_lesson_length" : 2,
    "signal_smoothing" : true,
    "parameters" :
    {
        "small_wall_height" : [1.5, 2.0, 2.5, 4.0]
    }
}

python/curricula/push.json (12 changes)


{
    "measure" : "reward",
    "thresholds" : [0.75, 0.75, 0.75, 0.75, 0.75, 0.75, 0.75, 0.75, 0.75, 0.75, 0.75, 0.75, 0.75, 0.75, 0.75],
    "min_lesson_length" : 2,
    "signal_smoothing" : true,
    "parameters" :
    {
        "goal_size" : [3.5, 3.25, 3.0, 2.75, 2.5, 2.25, 2.0, 1.75, 1.5, 1.25, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
        "block_size": [1.5, 1.4, 1.3, 1.2, 1.1, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
        "x_variation":[1.5, 1.55, 1.6, 1.65, 1.7, 1.75, 1.8, 1.85, 1.9, 1.95, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5]
    }
}

/python/curricula/test.json → /python/curricula/test/TestBrain.json

/python/curricula/wall.json → /python/curricula/wall-jump/BigWallBrain.json
