rlDiscreteCategoricalActor
Stochastic categorical actor with a discrete action space for reinforcement learning agents
Since R2022a
Description
This object implements a function approximator to be used as a stochastic actor within a reinforcement learning agent with a discrete action space. A discrete categorical actor takes an environment observation as input and returns as output a random action sampled from a categorical (also known as Multinoulli) probability distribution, thereby implementing a parametrized stochastic policy. After you create an rlDiscreteCategoricalActor object, use it to create a suitable agent, such as rlACAgent or rlPGAgent. For more information on creating actors and critics, see Create Policies and Value Functions.
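For orientation, the sketch below builds a minimal actor and passes it to rlPGAgent. The observation and action specifications, layer sizes, and the choice of agent are illustrative assumptions, not values prescribed by this page.

% Illustrative observation and action specifications (assumed values).
obsInfo = rlNumericSpec([4 1]);            % 4-element observation vector
actInfo = rlFiniteSetSpec([-10 0 10]);     % three possible actions

% Network with one output element per possible discrete action.
net = dlnetwork([
    featureInputLayer(prod(obsInfo.Dimension))
    fullyConnectedLayer(16)
    reluLayer
    fullyConnectedLayer(numel(actInfo.Elements))
    ]);

% Create the stochastic actor and use it to create a PG agent.
actor = rlDiscreteCategoricalActor(net,obsInfo,actInfo);
agent = rlPGAgent(actor);   % rlACAgent and rlPPOAgent additionally require a critic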
Creation
Syntax
Description
actor = rlDiscreteCategoricalActor(net,observationInfo,actionInfo) creates a stochastic actor with a discrete action space, using the deep neural network net as underlying approximation model. For this actor, actionInfo must specify a discrete action space. The network input layers are automatically associated with the environment observation channels according to the dimension specifications in observationInfo. The network must have a single output layer with as many elements as the number of possible discrete actions, as specified in actionInfo. This function sets the ObservationInfo and ActionInfo properties of actor to the inputs observationInfo and actionInfo, respectively.
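A minimal sketch of this syntax follows. The specifications and network architecture are assumptions for illustration; the structural requirement stated above is that the single output layer has one element per possible discrete action.

obsInfo = rlNumericSpec([8 1]);
actInfo = rlFiniteSetSpec([1 2 3 4]);      % four possible actions

% Single input layer (associated with obsInfo automatically) and an
% output layer with numel(actInfo.Elements) = 4 elements.
net = dlnetwork([
    featureInputLayer(prod(obsInfo.Dimension))
    fullyConnectedLayer(32)
    reluLayer
    fullyConnectedLayer(numel(actInfo.Elements))
    ]);

actor = rlDiscreteCategoricalActor(net,obsInfo,actInfo);

% Sample an action for a random observation; the result is one of
% the elements of actInfo.
act = getAction(actor,{rand(obsInfo.Dimension)});
act{1}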
actor = rlDiscreteCategoricalActor(net,observationInfo,actionInfo,ObservationInputNames=netObsNames) specifies the names of the network input layers to be associated with the environment observation channels. The function assigns, in sequential order, each environment observation channel specified in observationInfo to the layer specified by the corresponding name in the string array netObsNames. Therefore, the network input layers, ordered as the names in netObsNames, must have the same data type and dimensions as the observation channels, as ordered in observationInfo.
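The sketch below illustrates this syntax with two assumed observation channels and assumed layer names; the key point is that the order of the names in netObsNames matches the order of the channels in observationInfo.

% Two observation channels (assumed dimensions) and three possible actions.
obsInfo = [rlNumericSpec([4 1]) rlNumericSpec([2 1])];
actInfo = rlFiniteSetSpec([-1 0 1]);

% One input layer per observation channel, joined by a concatenation layer.
inPath1 = featureInputLayer(prod(obsInfo(1).Dimension),Name="netObsIn1");
inPath2 = featureInputLayer(prod(obsInfo(2).Dimension),Name="netObsIn2");
common  = [
    concatenationLayer(1,2,Name="cat")
    fullyConnectedLayer(16)
    reluLayer
    fullyConnectedLayer(numel(actInfo.Elements))
    ];

lg = layerGraph(inPath1);
lg = addLayers(lg,inPath2);
lg = addLayers(lg,common);
lg = connectLayers(lg,"netObsIn1","cat/in1");
lg = connectLayers(lg,"netObsIn2","cat/in2");
net = dlnetwork(lg);

% List the layer names in the same order as the channels in obsInfo.
actor = rlDiscreteCategoricalActor(net,obsInfo,actInfo, ...
    ObservationInputNames=["netObsIn1" "netObsIn2"]);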
actor = rlDiscreteCategoricalActor({basisFcn,W0},observationInfo,actionInfo) creates a discrete space stochastic actor using a custom basis function as underlying approximation model. The first input argument is a two-element cell array whose first element is the handle basisFcn to a custom basis function and whose second element is the initial weight matrix W0. This function sets the ObservationInfo and ActionInfo properties of actor to the inputs observationInfo and actionInfo, respectively.
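A sketch of the basis-function syntax is shown below. The basis function is an arbitrary illustrative choice, and the orientation of the initial weight matrix (one row per basis-function output element, one column per possible action) is an assumption; see the W0 argument description for the exact rule.

% Assumed specifications: a 3-element observation and three possible actions.
obsInfo = rlNumericSpec([3 1]);
actInfo = rlFiniteSetSpec([-1 0 1]);

% Custom basis function returning a column vector of features.
basisFcn = @(obs) [obs(1); obs(2); obs(3); obs(1)*obs(3); 1];

% Initial weights: one row per basis element, one column per possible action
% (assumed orientation).
W0 = 0.1*rand(5,numel(actInfo.Elements));

actor = rlDiscreteCategoricalActor({basisFcn,W0},obsInfo,actInfo);

% Sample an action for a random observation.
act = getAction(actor,{rand(obsInfo.Dimension)});
act{1}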
actor = rlDiscreteCategoricalActor(___,UseDevice=useDevice) specifies the device used to perform computational operations on the actor object, and sets the UseDevice property of actor to the useDevice input argument. You can use this syntax with any of the previous input-argument combinations.
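For example, assuming a supported GPU and Parallel Computing Toolbox are available, the actor can be created on the GPU with any of the previous syntaxes (here the network-based one, with the same assumed specifications as above):

obsInfo = rlNumericSpec([4 1]);
actInfo = rlFiniteSetSpec([-10 0 10]);
net = dlnetwork([
    featureInputLayer(prod(obsInfo.Dimension))
    fullyConnectedLayer(16)
    reluLayer
    fullyConnectedLayer(numel(actInfo.Elements))
    ]);

% Perform computations for this actor on the GPU.
actor = rlDiscreteCategoricalActor(net,obsInfo,actInfo,UseDevice="gpu");
actor.UseDevice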
Input Arguments
Properties
Object Functions
rlACAgent | Actor-critic (AC) reinforcement learning agent
rlPGAgent | Policy gradient (PG) reinforcement learning agent
rlPPOAgent | Proximal policy optimization (PPO) reinforcement learning agent
getAction | Obtain action from agent, actor, or policy object given environment observations
evaluate | Evaluate function approximator object given observation (or observation-action) input data
gradient | Evaluate gradient of function approximator object given observation and action input data
accelerate | Option to accelerate computation of gradient for approximator object based on neural network
getLearnableParameters | Obtain learnable parameter values from agent, function approximator, or policy object
setLearnableParameters | Set learnable parameter values of agent, function approximator, or policy object
setModel | Set function approximation model for actor or critic
getModel | Get function approximator model from actor or critic
Examples
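The following sketch, using the same kind of assumed specifications and network as in the creation syntaxes above, shows how evaluate returns the action probabilities for a given observation while getAction draws a random action from that distribution.

obsInfo = rlNumericSpec([4 1]);
actInfo = rlFiniteSetSpec([-10 0 10]);

net = dlnetwork([
    featureInputLayer(prod(obsInfo.Dimension))
    fullyConnectedLayer(16)
    reluLayer
    fullyConnectedLayer(numel(actInfo.Elements))
    ]);
actor = rlDiscreteCategoricalActor(net,obsInfo,actInfo);

obs = {rand(obsInfo.Dimension)};

% Action probabilities for this observation.
prb = evaluate(actor,obs);
prb{1}

% Random actions sampled according to those probabilities.
for k = 1:5
    act = getAction(actor,obs);
    disp(act{1})
end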
Version History
Introduced in R2022a