
Fix Basic Environment & Discrete States (#356)

* Fix Basic environment to properly reflect the number of states.
* Fix discrete states when using stacked states.
* Add trained model for Basic environment.
/develop-generalizationTraining-TrainerController
GitHub, 7 years ago
Current commit 0277039d
8 changed files with 274 additions and 51 deletions
  1. python/trainer_config.yaml (13 changes)
  2. unity-environment/Assets/ML-Agents/Examples/Basic/Scene.unity (131 changes)
  3. unity-environment/Assets/ML-Agents/Scripts/Agent.cs (36 changes)
  4. unity-environment/Assets/ML-Agents/Scripts/CoreBrainInternal.cs (9 changes)
  5. unity-environment/Assets/ML-Agents/Examples/Basic/TFModels.meta (8 changes)
  6. unity-environment/Assets/ML-Agents/Examples/Basic/TFModels/Basic.bytes (121 changes)
  7. unity-environment/Assets/ML-Agents/Examples/Basic/TFModels/Basic.bytes.meta (7 changes)

python/trainer_config.yaml (13 changes)


GridWorldBrain:
    batch_size: 32
    normalize: false
    num_layers: 1
    hidden_units: 256
    beta: 5.0e-3

    summary_freq: 2000
    time_horizon: 5
BasicBrain:
    batch_size: 32
    normalize: false
    num_layers: 1
    hidden_units: 20
    beta: 5.0e-3
    gamma: 0.9
    buffer_size: 256
    max_steps: 5.0e5
    summary_freq: 2000
    time_horizon: 3
StudentBrain:
    trainer: imitation

unity-environment/Assets/ML-Agents/Examples/Basic/Scene.unity (131 changes)


--- !u!104 &2
RenderSettings:
m_ObjectHideFlags: 0
serializedVersion: 8
serializedVersion: 9
m_Fog: 0
m_FogColor: {r: 0.5, g: 0.5, b: 0.5, a: 1}
m_FogMode: 3

m_CustomReflection: {fileID: 0}
m_Sun: {fileID: 0}
m_IndirectSpecularColor: {r: 0, g: 0, b: 0, a: 1}
m_UseRadianceAmbientProbe: 0
--- !u!157 &3
LightmapSettings:
m_ObjectHideFlags: 0

m_EnableBakedLightmaps: 1
m_EnableRealtimeLightmaps: 1
m_LightmapEditorSettings:
serializedVersion: 9
serializedVersion: 10
m_TextureWidth: 1024
m_TextureHeight: 1024
m_AtlasSize: 1024
m_AO: 0
m_AOMaxDistance: 1
m_CompAOExponent: 1

m_PVRDirectSampleCount: 32
m_PVRSampleCount: 500
m_PVRBounces: 2
m_PVRFiltering: 0
m_PVRFilterTypeDirect: 0
m_PVRFilterTypeIndirect: 0
m_PVRFilterTypeAO: 0
m_PVRFilteringAtrousColorSigma: 1
m_PVRFilteringAtrousNormalSigma: 1
m_PVRFilteringAtrousPositionSigma: 1
m_PVRFilteringAtrousPositionSigmaDirect: 0.5
m_PVRFilteringAtrousPositionSigmaIndirect: 2
m_PVRFilteringAtrousPositionSigmaAO: 1
m_ShowResolutionOverlay: 1
m_LightingDataAsset: {fileID: 0}
m_UseShadowmask: 1
--- !u!196 &4

manualTileSize: 0
tileSize: 256
accuratePlacement: 0
debug:
m_Flags: 0
--- !u!114 &223707724
MonoBehaviour:
m_ObjectHideFlags: 0
m_PrefabParentObject: {fileID: 0}
m_PrefabInternal: {fileID: 0}
m_GameObject: {fileID: 0}
m_Enabled: 1
m_EditorHideFlags: 0
m_Script: {fileID: 11500000, guid: 943466ab374444748a364f9d6c3e2fe2, type: 3}
m_Name: (Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)
m_EditorClassIdentifier:
broadcast: 1
brain: {fileID: 0}
--- !u!1 &282272644
GameObject:
m_ObjectHideFlags: 0

m_Enabled: 1
m_CastShadows: 1
m_ReceiveShadows: 1
m_DynamicOccludee: 1
m_RenderingLayerMask: 4294967295
m_Materials:
- {fileID: 2100000, guid: 260483cdfc6b14e26823a02f23bd8baa, type: 2}
m_StaticBatchInfo:

m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0
m_StitchLightmapSeams: 0
m_SelectedEditorRenderState: 3
m_MinimumChartSize: 4
m_AutoUVMaxDistance: 0.5

observations: []
maxStep: 0
resetOnDone: 1
state: []
stackedStates: []
maxStepReached: 0
CummulativeReward: 0
CumulativeReward: 0
stepCounter: 0
agentStoredAction: []
memory: []

smallGoal: {fileID: 1178588871}
minPosition: -10
maxPosition: 10
--- !u!114 &395380616
--- !u!114 &718270126
MonoBehaviour:
m_ObjectHideFlags: 0
m_PrefabParentObject: {fileID: 0}

m_EditorHideFlags: 0
m_Script: {fileID: 11500000, guid: 943466ab374444748a364f9d6c3e2fe2, type: 3}
m_Name: (Clone)
m_Script: {fileID: 11500000, guid: 35813a1be64e144f887d7d5f15b963fa, type: 3}
m_Name: (Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)
brain: {fileID: 0}
--- !u!114 &577874698
MonoBehaviour:
m_ObjectHideFlags: 0
m_PrefabParentObject: {fileID: 0}
m_PrefabInternal: {fileID: 0}
m_GameObject: {fileID: 0}
m_Enabled: 1
m_EditorHideFlags: 0
m_Script: {fileID: 11500000, guid: 41e9bda8f3cf1492fa74926a530f6f70, type: 3}
m_Name: (Clone)
m_EditorClassIdentifier:
continuousPlayerActions: []
discretePlayerActions:
- key: 97
value: 0
- key: 100
value: 1
defaultAction: -1
brain: {fileID: 846768605}
--- !u!1 &762086410
GameObject:

- component: {fileID: 846768604}
- component: {fileID: 846768605}
m_Layer: 0
m_Name: Brain
m_Name: BasicBrain
m_TagString: Untagged
m_Icon: {fileID: 0}
m_NavMeshLayer: 0

m_Name:
m_EditorClassIdentifier:
brainParameters:
stateSize: 1
stateSize: 20
stackedStates: 1
actionSize: 2
memorySize: 0
cameraResolutions: []

actionSpaceType: 0
stateSpaceType: 0
brainType: 0
brainType: 3
- {fileID: 577874698}
- {fileID: 395380616}
- {fileID: 1503497339}
instanceID: 10208
- {fileID: 968741156}
- {fileID: 223707724}
- {fileID: 718270126}
- {fileID: 1383558892}
instanceID: 12322
--- !u!114 &968741156
MonoBehaviour:
m_ObjectHideFlags: 0
m_PrefabParentObject: {fileID: 0}
m_PrefabInternal: {fileID: 0}
m_GameObject: {fileID: 0}
m_Enabled: 1
m_EditorHideFlags: 0
m_Script: {fileID: 11500000, guid: 41e9bda8f3cf1492fa74926a530f6f70, type: 3}
m_Name: (Clone)(Clone)(Clone)(Clone)(Clone)(Clone)(Clone)
m_EditorClassIdentifier:
broadcast: 1
continuousPlayerActions: []
discretePlayerActions:
- key: 97
value: 0
- key: 100
value: 1
defaultAction: -1
brain: {fileID: 846768605}
--- !u!1 &984725368
GameObject:
m_ObjectHideFlags: 0

m_Enabled: 1
m_CastShadows: 1
m_ReceiveShadows: 1
m_DynamicOccludee: 1
m_RenderingLayerMask: 4294967295
m_Materials:
- {fileID: 2100000, guid: 624b24bbec31f44babfb57ef2dfbc537, type: 2}
m_StaticBatchInfo:

m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0
m_StitchLightmapSeams: 0
m_SelectedEditorRenderState: 3
m_MinimumChartSize: 4
m_AutoUVMaxDistance: 0.5

m_Enabled: 1
m_CastShadows: 1
m_ReceiveShadows: 1
m_DynamicOccludee: 1
m_RenderingLayerMask: 4294967295
m_Materials:
- {fileID: 2100000, guid: 624b24bbec31f44babfb57ef2dfbc537, type: 2}
m_StaticBatchInfo:

m_PreserveUVs: 1
m_IgnoreNormalsForChartDetection: 0
m_ImportantGI: 0
m_StitchLightmapSeams: 0
m_SelectedEditorRenderState: 3
m_MinimumChartSize: 4
m_AutoUVMaxDistance: 0.5

m_Father: {fileID: 0}
m_RootOrder: 5
m_LocalEulerAnglesHint: {x: 0, y: 0, z: 0}
--- !u!114 &1503497339
--- !u!114 &1383558892
MonoBehaviour:
m_ObjectHideFlags: 0
m_PrefabParentObject: {fileID: 0}

m_EditorHideFlags: 0
m_Script: {fileID: 11500000, guid: 35813a1be64e144f887d7d5f15b963fa, type: 3}
m_Name: (Clone)
m_Script: {fileID: 11500000, guid: 8b23992c8eb17439887f5e944bf04a40, type: 3}
m_Name: (Clone)(Clone)(Clone)(Clone)(Clone)(Clone)
broadcast: 1
graphModel: {fileID: 4900000, guid: 07e40c2d0871b4e989b41d1b8519fb93, type: 3}
graphScope:
graphPlaceholders: []
BatchSizePlaceholderName: batch_size
StatePlacholderName: state
RecurrentInPlaceholderName: recurrent_in
RecurrentOutPlaceholderName: recurrent_out
ObservationPlaceholderName: []
ActionPlaceholderName: action
brain: {fileID: 846768605}
--- !u!1 &1574236047
GameObject:

maxSteps: 0
frameToSkip: 0
waitTime: 0.5
isInference: 0
trainingConfiguration:
width: 80
height: 80

targetFrameRate: 60
defaultResetParameters: []
done: 0
maxStepReached: 0
isInference: 0
windowResize: 0
--- !u!4 &1574236049
Transform:
m_ObjectHideFlags: 0

m_TargetEye: 3
m_HDR: 1
m_AllowMSAA: 1
m_AllowDynamicResolution: 0
m_StereoMirrorMode: 0
--- !u!4 &1715640925
Transform:
m_ObjectHideFlags: 0
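The scene changes above carry the first fix: the brain object is renamed from Brain to BasicBrain, stateSize goes from 1 to 20 so it reflects the actual number of discrete states in the Basic environment, and brainType switches from 0 to 3, i.e. (in that version's BrainType ordering) from the Player brain to the Internal TensorFlow brain that runs the bundled model. Because the state space is discrete, the agent still sends a single index per step; stateSize only tells the model how many distinct values that index can take, and the graph one-hot encodes it. As a rough sketch, not code from this commit, the resulting settings correspond to:

    // Hypothetical summary of the BasicBrain settings after this commit.
    // Field and enum names follow the BrainParameters/StateType definitions of that
    // ML-Agents era (treat them as assumptions); the values come from the diff above.
    var basicBrainParameters = new BrainParameters
    {
        stateSize = 20,        // was 1: number of distinct state values, not floats per step
        stackedStates = 1,     // no frame stacking in this example
        actionSize = 2,        // move left / move right
        memorySize = 0,
        cameraResolutions = new resolution[0],
        actionSpaceType = StateType.discrete,   // actionSpaceType: 0
        stateSpaceType = StateType.discrete     // stateSpaceType: 0
    };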

unity-environment/Assets/ML-Agents/Scripts/Agent.cs (36 changes)


}
memory = new float[brain.brainParameters.memorySize];
}
state = new List<float>(brain.brainParameters.stateSize);
if (brain.brainParameters.stateSpaceType == StateType.continuous)
{
stackedStates = new List<float>(brain.brainParameters.stateSize * brain.brainParameters.stackedStates);
stackedStates.AddRange(new float[brain.brainParameters.stateSize * brain.brainParameters.stackedStates]);
}
else
{
stackedStates = new List<float>(brain.brainParameters.stackedStates);
stackedStates.AddRange(new float[brain.brainParameters.stackedStates]);
}
InitializeAgent();
}

*/
public virtual void InitializeAgent()
{
state = new List<float>(brain.brainParameters.stateSize);
stackedStates = new List<float>(brain.brainParameters.stateSize * brain.brainParameters.stackedStates);
stackedStates.AddRange(new float[brain.brainParameters.stateSize * brain.brainParameters.stackedStates]);
}
/// Collect the states of the agent with this method

public List<float> ClearAndCollectState() {
state.Clear();
CollectState();
stackedStates.RemoveRange(0, brain.brainParameters.stateSize);
if (brain.brainParameters.stateSpaceType == StateType.continuous)
{
stackedStates.RemoveRange(0, brain.brainParameters.stateSize);
}
else
{
stackedStates.RemoveRange(0, 1);
}
stackedStates.AddRange(state);
return stackedStates;
}

{
memory = new float[brain.brainParameters.memorySize];
stackedStates.Clear();
stackedStates.AddRange(new float[brain.brainParameters.stateSize * brain.brainParameters.stackedStates]);
if (brain.brainParameters.stateSpaceType == StateType.continuous)
{
stackedStates.AddRange(new float[brain.brainParameters.stateSize *
brain.brainParameters.stackedStates]);
}
else
{
stackedStates.AddRange(new float[brain.brainParameters.stackedStates]);
}
stepCounter = 0;
AgentReset();
CumulativeReward = -reward;
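The hunks above interleave the old and new lines without diff markers, so here is a consolidated sketch of the behaviour after the fix. The point of the change: a continuous state space contributes stateSize floats per stacked frame, while a discrete state space contributes a single index per stacked frame, so the stacked buffer must be sized and rolled differently in the two cases. The helper below is a self-contained illustration with hypothetical names, not code from the commit:

    using System.Collections.Generic;

    // Sketch of the fixed stacked-state bookkeeping in Agent.cs.
    static class StackedStateSketch
    {
        // Length of one stacked frame: stateSize floats for continuous states,
        // a single index for discrete states.
        public static int FrameLength(bool continuousStateSpace, int stateSize)
        {
            return continuousStateSpace ? stateSize : 1;
        }

        // Allocate a zeroed buffer holding stackedStates frames, as the
        // initialization and reset paths now do.
        public static List<float> CreateBuffer(bool continuousStateSpace, int stateSize, int stackedStates)
        {
            return new List<float>(new float[FrameLength(continuousStateSpace, stateSize) * stackedStates]);
        }

        // Drop the oldest frame and append the newly collected state,
        // mirroring ClearAndCollectState() after the fix.
        public static void Push(List<float> buffer, List<float> newState,
                                bool continuousStateSpace, int stateSize)
        {
            buffer.RemoveRange(0, FrameLength(continuousStateSpace, stateSize));
            buffer.AddRange(newState);
        }
    }

For the BasicBrain (discrete, stateSize 20, stackedStates 1) the buffer therefore holds exactly one float, the current position index. Before the fix the buffer was allocated as stateSize * stackedStates entries and RemoveRange always dropped stateSize entries, which is what broke stacking for discrete states.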

unity-environment/Assets/ML-Agents/Scripts/CoreBrainInternal.cs (9 changes)


// Create the state tensor
if (hasState)
{
int stateLength = 1;
if (brain.brainParameters.stateSpaceType == StateType.continuous)
{
stateLength = brain.brainParameters.stateSize;
}
inputState = new float[currentBatchSize, brain.brainParameters.stateSize * brain.brainParameters.stackedStates];
inputState = new float[currentBatchSize, stateLength * brain.brainParameters.stackedStates];
for (int j = 0; j < brain.brainParameters.stateSize * brain.brainParameters.stackedStates; j++)
for (int j = 0; j < stateLength * brain.brainParameters.stackedStates; j++)
{
inputState[i, j] = state_list[j];
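The same distinction shows up here when a batch of agent states is packed into the tensor fed to the graph's state placeholder: the per-frame length is stateSize for continuous spaces but 1 for discrete spaces. A small self-contained sketch (parameter names are illustrative, not the script's own):

    // Sketch of the post-fix state-tensor sizing in CoreBrainInternal.cs.
    static float[,] AllocateStateTensor(int batchSize, bool continuousStateSpace,
                                        int stateSize, int stackedStates)
    {
        int stateLength = continuousStateSpace ? stateSize : 1;
        // Resulting shape is [batchSize, stateLength * stackedStates], e.g.
        //   BasicBrain (discrete, stackedStates = 1)          -> [batchSize, 1]
        //   a continuous brain, stateSize 8, stackedStates 3  -> [batchSize, 24]
        return new float[batchSize, stateLength * stackedStates];
    }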

unity-environment/Assets/ML-Agents/Examples/Basic/TFModels.meta (8 changes)


fileFormatVersion: 2
guid: a1b9b0ef56a7943f8b6eae4a5c2d4c13
folderAsset: yes
DefaultImporter:
externalObjects: {}
userData:
assetBundleName:
assetBundleVariant:

unity-environment/Assets/ML-Agents/Examples/Basic/TFModels/Basic.bytes (121 changes)


[Binary TensorFlow GraphDef (Basic.bytes), shown as raw bytes in the original diff; the trained weight constants are not reproducible here. Recoverable node names: a state placeholder reshaped and one-hot encoded (OneHotEncoding/one_hot with depth/on_value/off_value constants); a hidden layer (dense/kernel, dense/MatMul, dense/Elu); a policy head (dense_1/kernel, dense_2/MatMul, the action_probs Softmax, multinomial/Multinomial sampling, and the action output); and a value head (dense_2/kernel, dense_2/bias, dense_3/MatMul, dense_3/BiasAdd, and the value_estimate output).]
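As the placeholder above notes, the node names describe the network: the state index is one-hot encoded, passed through one ELU hidden layer, and projected both to two action logits (softmaxed into action_probs and sampled into action) and to a scalar value_estimate. Below is a plain C# sketch of that forward pass, with zero placeholder weights standing in for the trained constants and layer sizes inferred from the scene and trainer_config.yaml; it shows the data flow, not the trained behaviour.

    using System;
    using System.Linq;

    // Sketch of the computation encoded in Basic.bytes, reconstructed from the
    // node names; the weight arrays are zero placeholders, not the trained values.
    class BasicPolicySketch
    {
        const int StateSize = 20;  // matches stateSize: 20 in the scene
        const int Hidden = 20;     // matches hidden_units: 20 in trainer_config.yaml
        const int Actions = 2;     // move left / move right

        float[,] denseKernel = new float[StateSize, Hidden];  // dense/kernel
        float[,] dense1Kernel = new float[Hidden, Actions];   // dense_1/kernel (policy head)
        float[,] dense2Kernel = new float[Hidden, 1];          // dense_2/kernel (value head)
        float dense2Bias = 0f;                                  // dense_2/bias
        Random rng = new Random();

        public (int action, float[] probs, float value) Forward(int state)
        {
            // OneHotEncoding/one_hot
            var oneHot = new float[StateSize];
            oneHot[state] = 1f;

            // dense/MatMul + dense/Elu
            var hidden = new float[Hidden];
            for (int h = 0; h < Hidden; h++)
            {
                float sum = 0f;
                for (int s = 0; s < StateSize; s++) sum += oneHot[s] * denseKernel[s, h];
                hidden[h] = sum >= 0f ? sum : (float)Math.Exp(sum) - 1f;  // ELU
            }

            // dense_2/MatMul -> action_probs (softmax) -> multinomial/Multinomial -> action
            var logits = new float[Actions];
            for (int a = 0; a < Actions; a++)
                for (int h = 0; h < Hidden; h++) logits[a] += hidden[h] * dense1Kernel[h, a];
            var expLogits = logits.Select(l => (float)Math.Exp(l)).ToArray();
            float total = expLogits.Sum();
            var probs = expLogits.Select(e => e / total).ToArray();
            int action = Sample(probs);

            // dense_3/MatMul + dense_3/BiasAdd -> value_estimate
            float value = dense2Bias;
            for (int h = 0; h < Hidden; h++) value += hidden[h] * dense2Kernel[h, 0];

            return (action, probs, value);
        }

        int Sample(float[] probs)
        {
            double r = rng.NextDouble(), acc = 0;
            for (int i = 0; i < probs.Length; i++) { acc += probs[i]; if (r <= acc) return i; }
            return probs.Length - 1;
        }
    }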

unity-environment/Assets/ML-Agents/Examples/Basic/TFModels/Basic.bytes.meta (7 changes)


fileFormatVersion: 2
guid: 07e40c2d0871b4e989b41d1b8519fb93
TextScriptImporter:
externalObjects: {}
userData:
assetBundleName:
assetBundleVariant: