public interface QGradientPlannerFactory
A factory for generating QGradientPlanner objects. This class is used by MultipleIntentionsMLIRL to generate a separate differentiable valueFunction for each cluster. That way, after a maximization step, it can query the policy for each cluster in any state without replanning. A single valueFunction instance would instead have to switch its reward function, and therefore replan, for each cluster.
QGradientPlanner generateDifferentiablePlannerForRequest(MLIRLRequest request)
Generates and returns a differentiable planner for the given MLIRLRequest object's domain, reward function, discount factor, and Boltzmann beta parameter.
request - the request defining the problem the valueFunction should solve.
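The per-cluster pattern described above can be illustrated with a minimal, self-contained sketch. It does not use the library's classes: `Request`, `Planner`, `PlannerFactory`, and `ToyPlannerFactory` are hypothetical stand-ins for MLIRLRequest, QGradientPlanner, and QGradientPlannerFactory, and the "planning" step is a closed-form solution of an assumed single-state problem rather than a real planner. Only the method name `generateDifferentiablePlannerForRequest` is taken from the interface.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PerClusterPlannerSketch {

    /** Hypothetical stand-in for MLIRLRequest: reward, discount factor, Boltzmann beta. */
    static final class Request {
        final double[] actionRewards; // assumed toy reward function: one reward per action
        final double gamma;           // discount factor, must be < 1
        final double beta;            // Boltzmann beta parameter

        Request(double[] actionRewards, double gamma, double beta) {
            this.actionRewards = actionRewards.clone();
            this.gamma = gamma;
            this.beta = beta;
        }
    }

    /** Hypothetical stand-in for QGradientPlanner: exposes only a policy query. */
    interface Planner {
        double[] policy(); // Boltzmann action probabilities in the single toy state
    }

    /** Hypothetical stand-in for QGradientPlannerFactory. */
    interface PlannerFactory {
        Planner generateDifferentiablePlannerForRequest(Request request);
    }

    /** Toy factory: plans once per request, then the planner answers from its cache. */
    static final class ToyPlannerFactory implements PlannerFactory {
        int planningRuns = 0; // counts how often planning actually happens

        @Override
        public Planner generateDifferentiablePlannerForRequest(Request r) {
            planningRuns++;
            // Single-state MDP where every action loops back to the same state:
            // V = max_a r(a) / (1 - gamma), Q(a) = r(a) + gamma * V.
            double best = Double.NEGATIVE_INFINITY;
            for (double x : r.actionRewards) {
                best = Math.max(best, x);
            }
            double v = best / (1.0 - r.gamma);
            double maxQ = best + r.gamma * v;

            // Boltzmann policy: pi(a) proportional to exp(beta * Q(a)).
            double[] probs = new double[r.actionRewards.length];
            double sum = 0.0;
            for (int a = 0; a < probs.length; a++) {
                double q = r.actionRewards[a] + r.gamma * v;
                probs[a] = Math.exp(r.beta * (q - maxQ)); // subtract maxQ for stability
                sum += probs[a];
            }
            for (int a = 0; a < probs.length; a++) {
                probs[a] /= sum;
            }
            final double[] cached = probs;
            return () -> cached.clone(); // queries never trigger replanning
        }
    }

    /** Builds one planner per cluster request, mirroring the per-cluster idea. */
    static List<Planner> plannersForClusters(PlannerFactory factory, List<Request> requests) {
        List<Planner> planners = new ArrayList<>();
        for (Request r : requests) {
            planners.add(factory.generateDifferentiablePlannerForRequest(r));
        }
        return planners;
    }

    public static void main(String[] args) {
        ToyPlannerFactory factory = new ToyPlannerFactory();
        List<Request> requests = Arrays.asList(
                new Request(new double[] {1.0, 0.0}, 0.9, 2.0),  // cluster 0 prefers action 0
                new Request(new double[] {0.0, 1.0}, 0.9, 2.0)); // cluster 1 prefers action 1
        List<Planner> planners = plannersForClusters(factory, requests);
        for (int c = 0; c < planners.size(); c++) {
            System.out.println("cluster " + c + ": " + Arrays.toString(planners.get(c).policy()));
        }
        System.out.println("planning runs: " + factory.planningRuns);
    }
}
```

In this sketch each cluster keeps its own planner, so repeated policy queries leave the planning-run counter unchanged. With a single shared planner, every switch between clusters would require swapping the reward function and planning again.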