public static class QGradientPlannerFactory.DifferentiableVIFactory extends java.lang.Object implements QGradientPlannerFactory
A factory for DifferentiableVI planners.

| Modifier and Type | Field and Description |
|---|---|
| protected HashableStateFactory | hashingFactory: The HashableStateFactory used by the valueFunction. |
| protected double | maxDelta: The value function change threshold to stop VI. |
| protected int | maxIterations: The maximum allowed number of VI iterations. |
| protected TerminalFunction | tf: The terminal function that the valueFunction uses. |
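
The maxDelta and maxIterations fields together describe a stopping rule for value iteration: stop once a sweep changes the value function by less than the threshold, or once the iteration cap is reached. The following is a minimal sketch of that rule, not BURLAP's implementation. It uses plain (non-differentiable) value iteration on a hypothetical one-state problem, and the names StoppingRuleSketch and runVI are invented for illustration. The strict "less than" comparison is an assumption.

```java
final class StoppingRuleSketch {

    /**
     * Plain value iteration on a hypothetical one-state MDP whose single
     * action loops back to the same state with reward 1.
     * Returns the number of sweeps performed before stopping.
     */
    static int runVI(double gamma, double maxDelta, int maxIterations) {
        double v = 0.0;  // current value estimate for the single state
        int sweeps = 0;
        for (int i = 0; i < maxIterations; i++) {
            double backedUp = 1.0 + gamma * v;      // Bellman backup
            double delta = Math.abs(backedUp - v);  // largest change in this sweep
            v = backedUp;
            sweeps++;
            if (delta < maxDelta) {
                break;  // change fell below the threshold (assumed strict comparison)
            }
        }
        return sweeps;  // at most maxIterations
    }

    public static void main(String[] args) {
        System.out.println(runVI(0.5, 0.01, 500)); // threshold stops the loop
        System.out.println(runVI(0.5, 0.01, 3));   // iteration cap stops the loop
    }
}
```

With a discount of 0.5 and a threshold of 0.01, the per-sweep change halves each time and drops below the threshold on the eighth sweep. With the cap lowered to 3, the loop ends after three sweeps regardless of the change.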

| Constructor and Description |
|---|
| QGradientPlannerFactory.DifferentiableVIFactory(HashableStateFactory hashingFactory): Initializes the factory with the given HashableStateFactory. |
| QGradientPlannerFactory.DifferentiableVIFactory(HashableStateFactory hashingFactory, TerminalFunction tf, double maxDelta, int maxIterations): Initializes the factory with the given HashableStateFactory, terminal function, value function change threshold, and maximum number of VI iterations. |

| Modifier and Type | Method and Description |
|---|---|
| QGradientPlanner | generateDifferentiablePlannerForRequest(MLIRLRequest request): Returns a QGradientPlanner for an MLIRLRequest object's domain, reward function, discount factor, and Boltzmann beta parameter. |

protected HashableStateFactory hashingFactory
The HashableStateFactory used by the valueFunction.

protected double maxDelta
The value function change threshold to stop VI.

protected int maxIterations
The maximum allowed number of VI iterations.

protected TerminalFunction tf
The terminal function that the valueFunction uses. Defaults to NullTermination.

public QGradientPlannerFactory.DifferentiableVIFactory(HashableStateFactory hashingFactory)
Initializes the factory with the given HashableStateFactory. The terminal function defaults to a NullTermination, the value function change threshold to 0.01, and the maximum number of VI iterations to 500.
Parameters:
hashingFactory - the HashableStateFactory to use for planning.

public QGradientPlannerFactory.DifferentiableVIFactory(HashableStateFactory hashingFactory, TerminalFunction tf, double maxDelta, int maxIterations)
Initializes the factory with the given HashableStateFactory, terminal function, value function change threshold, and maximum number of VI iterations.
Parameters:
hashingFactory - the HashableStateFactory to use for planning.
tf - the terminal function that the generated planners use.
maxDelta - the value function change threshold to stop VI.
maxIterations - the maximum allowed number of VI iterations.

public QGradientPlanner generateDifferentiablePlannerForRequest(MLIRLRequest request)
Description copied from interface: QGradientPlannerFactory
Returns a QGradientPlanner for an MLIRLRequest object's domain, reward function, discount factor, and Boltzmann beta parameter.
Specified by:
generateDifferentiablePlannerForRequest in interface QGradientPlannerFactory
Parameters:
request - the request defining the problem the valueFunction should solve.
Returns:
a QGradientPlanner instance.