public static class QComputablePlanner.QComputablePlannerHelper
extends java.lang.Object
Constructor and Description |
---|
QComputablePlanner.QComputablePlannerHelper() |
Modifier and Type | Method and Description |
---|---|
static double | getOptimalValue(QComputablePlanner planner, State s) Returns the optimal state value for a state given a QComputablePlanner. |
static double | getOptimalValue(QComputablePlanner planner, State s, TerminalFunction tf) Returns the optimal state value for a state given a QComputablePlanner. |
static double | getPolicyValue(QComputablePlanner planner, State s, Policy p) Returns the state value under a given policy for a state and QComputablePlanner. |
static double | getPolicyValue(QComputablePlanner planner, State s, Policy p, TerminalFunction tf) Returns the state value under a given policy for a state and QComputablePlanner. |
public QComputablePlanner.QComputablePlannerHelper()
public static double getOptimalValue(QComputablePlanner planner, State s)
Returns the optimal state value for a state given a QComputablePlanner. The optimal value is the maximum Q-value over the actions permissible in the state. If no actions are permissible in the input state, then zero is returned.
planner - the QComputablePlanner capable of producing Q-values.
s - the query State for which the value should be returned.
public static double getOptimalValue(QComputablePlanner planner, State s, TerminalFunction tf)
Returns the optimal state value for a state given a QComputablePlanner. The optimal value is the maximum Q-value over the actions permissible in the state. If no actions are permissible in the input state or the input state is a terminal state, then zero is returned.
planner - the QComputablePlanner capable of producing Q-values.
s - the query State for which the value should be returned.
tf - a terminal function.
public static double getPolicyValue(QComputablePlanner planner, State s, Policy p)
Returns the state value under a given policy for a state and QComputablePlanner. The value is the expected Q-value under the input policy's action distribution. If no actions are permissible in the input state, then zero is returned.
planner - the QComputablePlanner capable of producing Q-values.
s - the query State for which the value should be returned.
p - the policy defining the action distribution.
public static double getPolicyValue(QComputablePlanner planner, State s, Policy p, TerminalFunction tf)
Returns the state value under a given policy for a state and QComputablePlanner. The value is the expected Q-value under the input policy's action distribution. If no actions are permissible in the input state, then zero is returned.
planner - the QComputablePlanner capable of producing Q-values.
s - the query State for which the value should be returned.
p - the policy defining the action distribution.
tf - a terminal function.
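
The two computations described for these methods (the max Q-value for the optimal value, and the expected Q-value under a policy's action distribution) can be illustrated with a small self-contained sketch. It does not use the library's classes: the class name QValueSketch, the String states and action names, and the plain List and Map inputs are all hypothetical stand-ins chosen for illustration, so treat it as a sketch of the arithmetic rather than the helper's actual implementation.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical stand-in for illustration only; these are not the library's types.
final class QValueSketch {

    // Optimal value: the maximum Q-value, or zero when no action is permissible.
    static double optimalValue(List<Double> qValues) {
        if (qValues.isEmpty()) {
            return 0.0;
        }
        double max = Double.NEGATIVE_INFINITY;
        for (double q : qValues) {
            max = Math.max(max, q);
        }
        return max;
    }

    // Terminal-aware variant: zero for terminal states, otherwise the max Q-value.
    static double optimalValue(String state, List<Double> qValues, Predicate<String> isTerminal) {
        return isTerminal.test(state) ? 0.0 : optimalValue(qValues);
    }

    // Policy value: sum over actions of P(action | state) * Q(state, action),
    // or zero when no action is permissible.
    static double policyValue(Map<String, Double> qByAction, Map<String, Double> actionProbs) {
        if (qByAction.isEmpty()) {
            return 0.0;
        }
        double expected = 0.0;
        for (Map.Entry<String, Double> e : actionProbs.entrySet()) {
            // Actions without a Q-value are assumed to contribute zero in this sketch.
            expected += e.getValue() * qByAction.getOrDefault(e.getKey(), 0.0);
        }
        return expected;
    }

    public static void main(String[] args) {
        Map<String, Double> q = Map.of("north", 1.0, "south", 3.0);
        Map<String, Double> pi = Map.of("north", 0.25, "south", 0.75);
        Predicate<String> isGoal = s -> s.equals("goal");

        System.out.println(optimalValue(List.of(1.0, 3.0)));                 // max Q-value
        System.out.println(optimalValue("goal", List.of(1.0, 3.0), isGoal)); // terminal state
        System.out.println(policyValue(q, pi));                              // expected Q-value
    }
}
```

With the sample numbers in main, the policy value is 0.25 * 1.0 + 0.75 * 3.0 = 2.5, which is lower than the optimal value of 3.0 because the policy does not always choose the best action.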