rlConservativeQLearningOptions
Description
Use an rlConservativeQLearningOptions object to specify conservative Q-learning regularizer options for training a DQN or SAC agent. The options you can specify are the minimum Q-value weight and the number of random actions used for Q-value compensation. These options are mostly useful for training agents offline, specifically to deal with possible differences between the probability distribution of the dataset and the one generated by the environment.
To enable the conservative Q-learning regularizer when training an agent, set the BatchDataRegularizerOptions property of the agent options object to an rlConservativeQLearningOptions object that has your preferred minimum weight and number of sampled actions.
Creation
Description
cqOpts = rlConservativeQLearningOptions returns a default conservative Q-learning regularizer option set.

cqOpts = rlConservativeQLearningOptions(Name=Value) creates the conservative Q-learning regularizer option set cqOpts and sets its properties using one or more name-value arguments.
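
For example, the following is a minimal sketch of creating an option set with name-value arguments. The property names MinQValueWeight and NumSampledActions are assumptions based on the option descriptions above (the minimum weight and the number of sampled actions); check the Properties section of your release for the exact names.

% Create a conservative Q-learning regularizer option set, specifying
% the minimum Q-value weight and the number of sampled actions.
% The property names used here are assumptions, not confirmed by this page.
cqOpts = rlConservativeQLearningOptions( ...
    MinQValueWeight=5, ...
    NumSampledActions=10);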
Properties
Object Functions
Examples
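As a sketch of a typical offline-training workflow, create the regularizer option set and assign it to the BatchDataRegularizerOptions property of the agent options object. The property name MinQValueWeight is an assumption, and the DQN agent options shown use their defaults.

% Create the regularizer option set (property name is assumed).
cqOpts = rlConservativeQLearningOptions(MinQValueWeight=5);

% Enable the regularizer by assigning the option set to the
% BatchDataRegularizerOptions property of a DQN agent options object.
agentOpts = rlDQNAgentOptions;
agentOpts.BatchDataRegularizerOptions = cqOpts;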
Algorithms
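This page does not detail the implementation. As a sketch, the conservative Q-learning regularizer of [1] augments the standard temporal-difference critic loss with a term that penalizes overestimated Q-values for actions that are unlikely under the dataset distribution:

L(θ) = L_TD(θ) + w ( E_{s∼D}[ log Σ_{i=1}^{N} exp Q_θ(s, a_i) ] − E_{(s,a)∼D}[ Q_θ(s, a) ] )

Here D is the training dataset, w corresponds to the minimum Q-value weight, and a_1, …, a_N are the N randomly sampled actions used for Q-value compensation; the log-sum-exp term approximates a maximization over actions. Increasing w makes the learned Q-function more conservative, which helps when the dataset distribution differs from the one the environment would generate.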
References
[1] Kumar, Aviral, Aurick Zhou, George Tucker, and Sergey Levine. "Conservative Q-Learning for Offline Reinforcement Learning." Advances in Neural Information Processing Systems 33 (2020): 1179–1191.
Version History
Introduced in R2023a