using System.Collections.Generic;
using Unity.MLAgents.Actuators;
using Unity.MLAgents.Analytics;
using Unity.MLAgents.Sensors;

namespace Unity.MLAgents.Policies
{
    /// <summary>
    /// The RemotePolicy only works during training.
    /// While your Agents are training, decisions for this policy are made by the Python trainer.
    /// </summary>
    internal class RemotePolicy : IPolicy
    {
        // Episode ID of the agent that most recently requested a decision.
        int m_AgentId;
        string m_FullyQualifiedBehaviorName;
        ActionSpec m_ActionSpec;
        ActionBuffers m_LastActionBuffer;
        // Ensures the training analytics event is sent at most once per policy.
        private bool m_AnalyticsSent = false;

        internal ICommunicator m_Communicator;

        /// <summary>
        /// Creates a RemotePolicy and subscribes its behavior to the communicator, if one is available.
        /// </summary>
        public RemotePolicy(
            ActionSpec actionSpec,
            string fullyQualifiedBehaviorName)
        {
            m_FullyQualifiedBehaviorName = fullyQualifiedBehaviorName;
            m_Communicator = Academy.Instance.Communicator;
            // The communicator may be null, hence the null-conditional call.
            m_Communicator?.SubscribeBrain(m_FullyQualifiedBehaviorName, actionSpec);
            m_ActionSpec = actionSpec;
        }

        /// <inheritdoc />
        public void RequestDecision(AgentInfo info, List<ISensor> sensors)
        {
            // Report analytics once, on the first decision request.
            if (!m_AnalyticsSent)
            {
                m_AnalyticsSent = true;
                TrainingAnalytics.RemotePolicyInitialized(
                    m_FullyQualifiedBehaviorName,
                    sensors,
                    m_ActionSpec
                );
            }
            // Remember which agent asked, so DecideAction can fetch its actions.
            m_AgentId = info.episodeId;
            m_Communicator?.PutObservations(m_FullyQualifiedBehaviorName, info, sensors);
        }

        /// <inheritdoc />
        public ref readonly ActionBuffers DecideAction()
        {
            m_Communicator?.DecideBatch();
            var actions = m_Communicator?.GetActions(m_FullyQualifiedBehaviorName, m_AgentId);
            // Fall back to empty actions when none are available (for example, no communicator).
            m_LastActionBuffer = actions == null ? ActionBuffers.Empty : (ActionBuffers)actions;
            return ref m_LastActionBuffer;
        }

        public void Dispose()
        {
        }
    }
}