public static class QGradientPlannerFactory.DifferentiableVIFactory extends java.lang.Object implements QGradientPlannerFactory
A DifferentiableVI factory.
Modifier and Type | Field and Description |
---|---|
protected HashableStateFactory | hashingFactory: The HashableStateFactory used by the valueFunction. |
protected double | maxDelta: The value function change threshold to stop VI. |
protected int | maxIterations: The maximum allowed number of VI iterations. |
protected TerminalFunction | tf: The terminal function that the valueFunction uses. |
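
The maxDelta and maxIterations fields describe two stopping conditions for value iteration (VI). The following is a minimal, stand-alone sketch of how such a pair of conditions typically interacts; it is not BURLAP code. The class ViStoppingSketch, its Result holder, the array-based MDP encoding, and the strict "less than maxDelta" comparison are all assumptions made for illustration.

```java
/** Hypothetical stand-alone sketch; not BURLAP code. */
final class ViStoppingSketch {

    /** Final value estimates plus the number of sweeps that were run. */
    static final class Result {
        final double[] values;
        final int iterations;

        Result(double[] values, int iterations) {
            this.values = values;
            this.iterations = iterations;
        }
    }

    /**
     * Value iteration on a small deterministic MDP.
     * next[s][a] is the successor of state s under action a;
     * reward[s][a] is the reward for taking a in s.
     */
    static Result run(int[][] next, double[][] reward, double gamma,
                      double maxDelta, int maxIterations) {
        double[] v = new double[next.length];
        int iterations = 0;
        while (iterations < maxIterations) {          // cap on the number of sweeps
            double largestChange = 0.0;
            for (int s = 0; s < next.length; s++) {
                double best = Double.NEGATIVE_INFINITY;
                for (int a = 0; a < next[s].length; a++) {
                    best = Math.max(best, reward[s][a] + gamma * v[next[s][a]]);
                }
                largestChange = Math.max(largestChange, Math.abs(best - v[s]));
                v[s] = best;                          // in-place update
            }
            iterations++;
            // Assumed rule: stop once no state value changed by maxDelta or more.
            if (largestChange < maxDelta) {
                break;
            }
        }
        return new Result(v, iterations);
    }

    public static void main(String[] args) {
        // One state looping to itself with reward 1; values approach 1 / (1 - 0.9) = 10.
        int[][] next = {{0}};
        double[][] reward = {{1.0}};
        Result r = run(next, reward, 0.9, 0.01, 500);
        System.out.println("sweeps=" + r.iterations + ", V(0)=" + r.values[0]);
    }
}
```

A smaller maxDelta gives tighter value estimates at the cost of more sweeps, while maxIterations bounds the run time when convergence is slow.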
Constructor and Description |
---|
DifferentiableVIFactory(HashableStateFactory hashingFactory): Initializes the factory with the given HashableStateFactory. |
DifferentiableVIFactory(HashableStateFactory hashingFactory, TerminalFunction tf, double maxDelta, int maxIterations): Initializes the factory with the given HashableStateFactory, terminal function, value function change threshold, and maximum number of VI iterations. |
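
The relationship between the two constructors can be sketched as follows, using the defaults this page documents for the one-argument form (a NullTermination terminal function, a threshold of 0.01, and 500 iterations). This is an illustration only: every type whose name ends in Sketch is a hypothetical stand-in rather than a BURLAP class, the constructor chaining is an assumed implementation detail, and NullTermination is assumed to treat no state as terminal.

```java
/** Hypothetical stand-in for BURLAP's HashableStateFactory. */
interface HashableStateFactorySketch {}

/** Hypothetical stand-in for BURLAP's TerminalFunction. */
interface TerminalFunctionSketch {
    boolean isTerminal(Object state);
}

/** Stand-in for NullTermination, assumed here to mark no state as terminal. */
final class NullTerminationSketch implements TerminalFunctionSketch {
    @Override
    public boolean isTerminal(Object state) {
        return false;
    }
}

/** Mirrors the documented fields and constructors of DifferentiableVIFactory. */
final class DifferentiableVIFactorySketch {
    protected HashableStateFactorySketch hashingFactory;
    protected TerminalFunctionSketch tf;
    protected double maxDelta;
    protected int maxIterations;

    /** One-argument form: fills in the documented defaults. */
    DifferentiableVIFactorySketch(HashableStateFactorySketch hashingFactory) {
        this(hashingFactory, new NullTerminationSketch(), 0.01, 500);
    }

    /** Four-argument form: every setting is supplied by the caller. */
    DifferentiableVIFactorySketch(HashableStateFactorySketch hashingFactory,
                                  TerminalFunctionSketch tf,
                                  double maxDelta,
                                  int maxIterations) {
        this.hashingFactory = hashingFactory;
        this.tf = tf;
        this.maxDelta = maxDelta;
        this.maxIterations = maxIterations;
    }

    public static void main(String[] args) {
        DifferentiableVIFactorySketch f =
                new DifferentiableVIFactorySketch(new HashableStateFactorySketch() {});
        System.out.println("maxDelta=" + f.maxDelta + ", maxIterations=" + f.maxIterations);
    }
}
```

The four-argument form is the one to reach for when the task has terminal states or when the default threshold and iteration cap do not suit the problem.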
Modifier and Type | Method and Description |
---|---|
DifferentiableQFunction | generateDifferentiablePlannerForRequest(MLIRLRequest request): Returns a DifferentiableQFunction for an MLIRLRequest object's domain, reward function, discount factor, and Boltzmann beta parameter. |
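protected HashableStateFactory hashingFactory
The HashableStateFactory used by the valueFunction.

protected double maxDelta
The value function change threshold to stop VI.

protected int maxIterations
The maximum allowed number of VI iterations.

protected TerminalFunction tf
The terminal function that the valueFunction uses. Defaults to NullTermination.

public DifferentiableVIFactory(HashableStateFactory hashingFactory)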
Initializes the factory with the given HashableStateFactory. The terminal function defaults to a NullTermination, the value function change threshold to 0.01, and the maximum number of VI iterations to 500.
Parameters:
hashingFactory - the HashableStateFactory to use for planning.

public DifferentiableVIFactory(HashableStateFactory hashingFactory, TerminalFunction tf, double maxDelta, int maxIterations)
Parameters:
hashingFactory - the HashableStateFactory to use for planning.
tf - The terminal function that the generated planners use.
maxDelta - The value function change threshold to stop VI.
maxIterations - The maximum allowed number of VI iterations.

public DifferentiableQFunction generateDifferentiablePlannerForRequest(MLIRLRequest request)
Description copied from interface: QGradientPlannerFactory
Returns a DifferentiableQFunction for an MLIRLRequest object's domain, reward function, discount factor, and Boltzmann beta parameter.
Specified by:
generateDifferentiablePlannerForRequest in interface QGradientPlannerFactory
Parameters:
request - the request defining the problem the valueFunction should solve.
Returns:
a DifferentiableQFunction instance.
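
For readers unfamiliar with the Boltzmann beta parameter mentioned above, the sketch below shows the common textbook definition of a Boltzmann (softmax) distribution over Q-values, in which beta acts as an inverse temperature. How the generated planner uses beta internally is not stated on this page, so treat this as background rather than a description of the implementation. The class and method names are hypothetical.

```java
/** Hypothetical stand-alone sketch of a Boltzmann (softmax) policy; not BURLAP code. */
final class BoltzmannPolicySketch {

    /**
     * Returns p[a] = exp(beta * q[a]) / sum over b of exp(beta * q[b]).
     * The largest exponent is subtracted first for numerical stability,
     * which leaves the resulting probabilities unchanged.
     */
    static double[] actionProbabilities(double[] qValues, double beta) {
        double maxExponent = Double.NEGATIVE_INFINITY;
        for (double q : qValues) {
            maxExponent = Math.max(maxExponent, beta * q);
        }
        double[] probabilities = new double[qValues.length];
        double sum = 0.0;
        for (int a = 0; a < qValues.length; a++) {
            probabilities[a] = Math.exp(beta * qValues[a] - maxExponent);
            sum += probabilities[a];
        }
        for (int a = 0; a < qValues.length; a++) {
            probabilities[a] /= sum;
        }
        return probabilities;
    }

    public static void main(String[] args) {
        double[] q = {0.0, 1.0};
        // Larger beta concentrates probability on the higher-valued action.
        for (double beta : new double[] {0.0, 1.0, 10.0}) {
            double[] p = actionProbabilities(q, beta);
            System.out.println("beta=" + beta + " -> P(a1)=" + p[1]);
        }
    }
}
```

With beta equal to 0 every action is equally likely, and as beta grows the distribution approaches a greedy choice of the highest Q-value.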