public class PolicyEvaluation extends DynamicProgramming
Use the evaluatePolicy(EnumerablePolicy, State) method to evaluate a policy from some initial seed state. You can reuse this class to evaluate subsequent policies, but doing so will overwrite the value function. To save the value function that was computed for some policy, use the DynamicProgramming.getCopyOfValueFunction() method before evaluating the next one.
Alternatively, you can evaluate a policy with the evaluatePolicy(EnumerablePolicy) method, but the state space must already have been seeded by calling the evaluatePolicy(EnumerablePolicy, State) method or the performReachabilityFrom(State) method at least once; otherwise, a runtime exception will be thrown.
QProvider.Helper
Modifier and Type | Field and Description |
---|---|
protected double | maxEvalDelta: When the maximum change in the value function is smaller than this value, policy evaluation will terminate. |
protected double | maxEvalIterations: When the number of evaluation iterations exceeds this number, policy evaluation will terminate. |
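The two termination fields can be illustrated with a minimal, self-contained sketch of iterative policy evaluation on a tiny tabular problem. This is not BURLAP's implementation: the class `PolicyEvaluationSketch`, its `evaluate` method, and the array-based model (`next`, `reward`, `terminal`) are hypothetical names introduced only for illustration, and the policy is assumed to be deterministic with deterministic transitions.

```java
import java.util.Arrays;

// Hypothetical sketch, not part of BURLAP: shows how a delta threshold and an
// iteration cap can jointly terminate iterative policy evaluation.
class PolicyEvaluationSketch {

    /**
     * Evaluates a fixed deterministic policy on a small tabular problem.
     * next[s] is the state reached from s under the policy, reward[s] is the
     * reward for that transition, and terminal[s] marks states whose value stays 0.
     */
    static double[] evaluate(int[] next, double[] reward, boolean[] terminal,
                             double gamma, double maxEvalDelta, double maxEvalIterations) {
        double[] v = new double[next.length]; // value function initialized to 0
        for (int i = 0; i < maxEvalIterations; i++) {
            double delta = 0.0; // largest change seen in this sweep
            for (int s = 0; s < next.length; s++) {
                if (terminal[s]) {
                    continue;
                }
                // Fixed-policy Bellman update for a deterministic transition
                double updated = reward[s] + gamma * v[next[s]];
                delta = Math.max(delta, Math.abs(updated - v[s]));
                v[s] = updated;
            }
            if (delta < maxEvalDelta) {
                break; // value function changed by less than the threshold
            }
        }
        return v;
    }

    public static void main(String[] args) {
        // Chain 0 -> 1 -> 2, where state 2 is terminal and each step costs -1
        int[] next = {1, 2, 2};
        double[] reward = {-1, -1, 0};
        boolean[] terminal = {false, false, true};
        double[] v = evaluate(next, reward, terminal, 0.9, 1e-6, 100);
        System.out.println(Arrays.toString(v));
    }
}
```

In this sketch the loop stops on whichever condition is met first, so a small iteration cap can return a value function that has not yet converged.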
operator, valueFunction, valueInitializer
actionTypes, debugCode, domain, gamma, hashingFactory, model, usingOptionModel
Constructor and Description |
---|
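PolicyEvaluation(SADomain domain, double gamma, HashableStateFactory hashingFactory, double maxEvalDelta, double maxEvalIterations): Initializes the evaluator with the given domain, discount factor, state hashing factory, and termination criteria. |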
Modifier and Type | Method and Description |
---|---|
void | evaluatePolicy(EnumerablePolicy policy): Computes the value function for the given policy over the states that have been discovered. |
void | evaluatePolicy(EnumerablePolicy policy, State s): Computes the value function for the given policy after finding all reachable states from seed state s. |
boolean | performReachabilityFrom(State si): Finds all reachable states that will be used when computing the value function. |
computeQ, DPPInit, getAllStates, getCopyOfValueFunction, getDefaultValue, getModel, getOperator, getValueFunctionInitialization, hasComputedValueFor, loadValueTable, performBellmanUpdateOn, performBellmanUpdateOn, performFixedPolicyBellmanUpdateOn, performFixedPolicyBellmanUpdateOn, qValue, qValues, resetSolver, setOperator, setValueFunctionInitialization, value, value, writeValueTable
addActionType, applicableActions, getActionTypes, getDebugCode, getDomain, getGamma, getHashingFactory, setActionTypes, setDebugCode, setDomain, setGamma, setHashingFactory, setModel, solverInit, stateHash, toggleDebugPrinting
protected double maxEvalDelta
protected double maxEvalIterations
public PolicyEvaluation(SADomain domain, double gamma, HashableStateFactory hashingFactory, double maxEvalDelta, double maxEvalIterations)
domain
- the domain on which to evaluate a policy
gamma
- the discount factor
hashingFactory
- the HashableStateFactory used to index states and perform state equality checks
maxEvalDelta
- the threshold on the maximum change in the value function; policy evaluation terminates once the change falls below this value
maxEvalIterations
- the maximum number of evaluation iterations to perform before terminating policy evaluation
public void evaluatePolicy(EnumerablePolicy policy, State s)
policy
- the Policy to evaluate
s
- the initial seed state from which to find all reachable states
public void evaluatePolicy(EnumerablePolicy policy)
policy
- the Policy to evaluate
public boolean performReachabilityFrom(State si)
si
- the source state from which all reachable states will be found
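As a rough illustration of what finding all reachable states from a source state can involve, the following self-contained sketch performs a breadth-first search over a successor map. It is not BURLAP's implementation: `ReachabilitySketch`, `reachableFrom`, and the integer state ids are hypothetical, and the sketch returns the discovered set rather than a boolean.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not part of BURLAP: breadth-first reachability over a
// successor map keyed by integer state ids.
class ReachabilitySketch {

    /**
     * Returns every state reachable from source, including source itself.
     * successors maps a state to the states it can reach in one step.
     */
    static Set<Integer> reachableFrom(int source, Map<Integer, List<Integer>> successors) {
        Set<Integer> seen = new LinkedHashSet<>();
        Deque<Integer> frontier = new ArrayDeque<>();
        seen.add(source);
        frontier.add(source);
        while (!frontier.isEmpty()) {
            int s = frontier.poll();
            for (int next : successors.getOrDefault(s, List.of())) {
                // add() returns true only for states not seen before
                if (seen.add(next)) {
                    frontier.add(next);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Map<Integer, List<Integer>> successors = Map.of(
                0, List.of(1, 2),
                1, List.of(2),
                2, List.of(0),
                3, List.of(0)); // state 3 leads into the others but is not reachable from 0
        System.out.println(reachableFrom(0, successors));
    }
}
```

Only the states found this way would then take part in the value function computation, which is why the single-argument evaluatePolicy requires that seeding has already happened.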