public class DynamicProgramming extends MDPSolver implements ValueFunction, QProvider
Nested classes/interfaces inherited from interface QProvider: QProvider.Helper

| Modifier and Type | Field and Description |
|---|---|
| protected DPOperator | operator |
| protected java.util.Map<HashableState,java.lang.Double> | valueFunction: A map for storing the current value function estimate for each state. |
| protected ValueFunction | valueInitializer: The value function initialization to use; defaulted to an initialization of 0 everywhere. |


Fields inherited from class MDPSolver: actionTypes, debugCode, domain, gamma, hashingFactory, model, usingOptionModel

| Constructor and Description |
|---|
| DynamicProgramming() |

| Modifier and Type | Method and Description |
|---|---|
| protected double | computeQ(State s, Action ga): Computes the Q-value. This computation *is* compatible with Option objects. |
| void | DPPInit(SADomain domain, double gamma, HashableStateFactory hashingFactory): Common init method for DynamicProgramming instances. |
| java.util.List<State> | getAllStates(): Returns all states that are stored in this planner's value function. |
| DynamicProgramming | getCopyOfValueFunction() |
| protected double | getDefaultValue(State s): Returns the default V-value to use for the state. |
| SampleModel | getModel(): Returns the model being used by this solver. |
| DPOperator | getOperator(): Returns the dynamic programming operator used. |
| ValueFunction | getValueFunctionInitialization(): Returns the value initialization function used. |
| boolean | hasComputedValueFor(State s): Returns whether a value for the given state has been computed previously. |
| void | loadValueTable(java.lang.String path): Loads the value function table located on disk at the specified path. |
| protected double | performBellmanUpdateOn(HashableState sh): Performs a Bellman value function update on the provided (hashed) state. |
| double | performBellmanUpdateOn(State s): Performs a Bellman value function update on the provided state. |
| protected double | performFixedPolicyBellmanUpdateOn(HashableState sh, EnumerablePolicy p): Performs a fixed-policy Bellman value function update (i.e., policy evaluation) on the provided state. |
| double | performFixedPolicyBellmanUpdateOn(State s, EnumerablePolicy p): Performs a fixed-policy Bellman value function update (i.e., policy evaluation) on the provided state. |
| double | qValue(State s, Action a): Returns the QValue for the given state-action pair. |
| java.util.List<QValue> | qValues(State s): Returns a List of QValue objects for every permissible action for the given input state. |
| void | resetSolver(): Resets all solver results so that a solver can be restarted fresh, as if it had never solved the MDP. |
| void | setOperator(DPOperator operator): Sets the dynamic programming operator to use. |
| void | setValueFunctionInitialization(ValueFunction vfInit): Sets the value function initialization to use. |
| double | value(HashableState sh): Returns the value function evaluation of the given hashed state. |
| double | value(State s): Returns the value function evaluation of the given state. |
| void | writeValueTable(java.lang.String path): Writes the value function table stored in this object to the specified file path. |


Methods inherited from class MDPSolver: addActionType, applicableActions, getActionTypes, getDebugCode, getDomain, getGamma, getHashingFactory, setActionTypes, setDebugCode, setDomain, setGamma, setHashingFactory, setModel, solverInit, stateHash, toggleDebugPrinting

protected java.util.Map<HashableState,java.lang.Double> valueFunction
protected ValueFunction valueInitializer
protected DPOperator operator
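
The interplay between the valueFunction map and the valueInitializer can be illustrated with a small stand-alone sketch. It is not the library's implementation: the class name `ValueTableSketch`, its `set` method, and the use of `ToDoubleFunction` as the initializer type are assumptions made for illustration, and the state type `S` is assumed to implement `equals` and `hashCode` (the role that HashableState plays in this class).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToDoubleFunction;

/** Hypothetical stand-in for a map-backed value table; not the library's code. */
final class ValueTableSketch<S> {
    // Plays the role of the valueFunction field: state -> current value estimate.
    private final Map<S, Double> values = new HashMap<>();
    // Plays the role of valueInitializer: supplies a value for states never updated.
    private final ToDoubleFunction<S> initializer;

    /** Default initialization of 0 everywhere, mirroring the documented default. */
    ValueTableSketch() {
        this(s -> 0.0);
    }

    ValueTableSketch(ToDoubleFunction<S> initializer) {
        this.initializer = initializer;
    }

    /** True only if an estimate has been stored explicitly for this state. */
    boolean hasComputedValueFor(S s) {
        return values.containsKey(s);
    }

    /** Stored estimate if present, otherwise the initializer's value. */
    double value(S s) {
        Double v = values.get(s);
        return v != null ? v : initializer.applyAsDouble(s);
    }

    /** Stores a new estimate (hypothetical helper, used here in place of an update). */
    void set(S s, double v) {
        values.put(s, v);
    }

    public static void main(String[] args) {
        ValueTableSketch<String> table = new ValueTableSketch<>();
        System.out.println(table.value("s0"));               // unseen state: initializer value
        table.set("s0", 1.5);
        System.out.println(table.hasComputedValueFor("s0")); // state now has a stored estimate
    }
}
```

Keeping the initializer separate from the map means unseen states cost no memory until they are first updated.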
public void DPPInit(SADomain domain, double gamma, HashableStateFactory hashingFactory)
Common init method for DynamicProgramming instances. This will automatically call the MDPSolver.solverInit(SADomain, double, HashableStateFactory) method.
domain - the domain in which to plan
gamma - the discount factor
hashingFactory - the state hashing factory
public SampleModel getModel()
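Description copied from interface: MDPSolverInterface
Returns the model being used by this solver.
Specified by: getModel in interface MDPSolverInterface
Overrides: getModel in class MDPSolver
Returns: SampleModel
public void resetSolver()
Description copied from interface: MDPSolverInterface
Resets all solver results so that a solver can be restarted fresh, as if it had never solved the MDP.
Specified by: resetSolver in interface MDPSolverInterface
Overrides: resetSolver in class MDPSolver
public void setValueFunctionInitialization(ValueFunction vfInit)
vfInit - the object that defines how to initialize the value function.
public ValueFunction getValueFunctionInitialization()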
public DPOperator getOperator()
public void setOperator(DPOperator operator)
Sets the dynamic programming operator to use.
BellmanOperator (max)
operator - the dynamic programming operator to use.
public boolean hasComputedValueFor(State s)
s - the state to check
public double value(State s)
Specified by: value in interface ValueFunction
s - the state to evaluate.
public double value(HashableState sh)
sh - the hashed state to evaluate.
public java.util.List<QValue> qValues(State s)
Description copied from interface: QProvider
Returns a List of QValue objects for every permissible action for the given input state.
public double qValue(State s, Action a)
Description copied from interface: QFunction
Returns the QValue for the given state-action pair.
public java.util.List<State> getAllStates()
public DynamicProgramming getCopyOfValueFunction()
public double performBellmanUpdateOn(State s)
s - the state on which to perform the Bellman update.
public double performFixedPolicyBellmanUpdateOn(State s, EnumerablePolicy p)
s - the state on which to perform the Bellman update.
p - the policy that is being evaluated
public void writeValueTable(java.lang.String path)
path - the path to write the value function
public void loadValueTable(java.lang.String path)
Loads the value function table located on disk at the specified path.
Map from HashableState to Double.
path - the path to the saved value function table
protected double performBellmanUpdateOn(HashableState sh)
sh - the hashed state on which to perform the Bellman update.
protected double performFixedPolicyBellmanUpdateOn(HashableState sh, EnumerablePolicy p)
sh - the hashed state on which to perform the Bellman update.
p - the policy that is being evaluated
protected double computeQ(State s, Action ga)
Computes the Q-value. This computation *is* compatible with Option objects.
s - the given state
ga - the given action
protected double getDefaultValue(State s)
s - the input state to get the default V-value for
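
For readers unfamiliar with the updates named by performBellmanUpdateOn, performFixedPolicyBellmanUpdateOn and computeQ, the following is a minimal tabular sketch of the textbook computations, assuming the max operator. Everything in it is hypothetical and independent of the library: `BellmanSketch`, `Outcome`, plain `String` states and action names, and unseen states defaulting to a value of 0. It does not handle Option objects, terminal states, or other DPOperator choices.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical tabular sketch of the two update styles; none of these names are library API. */
final class BellmanSketch {

    /** One possible result of taking an action: probability, reward, next state. */
    static final class Outcome {
        final double prob;
        final double reward;
        final String next;

        Outcome(double prob, double reward, String next) {
            this.prob = prob;
            this.reward = reward;
            this.next = next;
        }
    }

    /** Q(s,a) = sum over outcomes of prob * (reward + gamma * V(next)). */
    static double q(List<Outcome> outcomes, Map<String, Double> v, double gamma) {
        double sum = 0.0;
        for (Outcome o : outcomes) {
            // States without a stored estimate are assumed to have value 0.
            sum += o.prob * (o.reward + gamma * v.getOrDefault(o.next, 0.0));
        }
        return sum;
    }

    /** Bellman (max) update: V(s) = max over actions of Q(s,a). Stores and returns the result. */
    static double bellmanUpdate(String s, Map<String, List<Outcome>> actions,
                                Map<String, Double> v, double gamma) {
        // Assumes at least one action is available in s.
        double best = Double.NEGATIVE_INFINITY;
        for (List<Outcome> outcomes : actions.values()) {
            best = Math.max(best, q(outcomes, v, gamma));
        }
        v.put(s, best);
        return best;
    }

    /** Fixed-policy update (policy evaluation): V(s) = sum over actions of pi(a|s) * Q(s,a). */
    static double fixedPolicyUpdate(String s, Map<String, List<Outcome>> actions,
                                    Map<String, Double> policy,
                                    Map<String, Double> v, double gamma) {
        double sum = 0.0;
        for (Map.Entry<String, List<Outcome>> e : actions.entrySet()) {
            // Actions missing from the policy map are treated as probability 0.
            sum += policy.getOrDefault(e.getKey(), 0.0) * q(e.getValue(), v, gamma);
        }
        v.put(s, sum);
        return sum;
    }
}
```

The only difference between the two updates is how the per-action Q-values are combined: the max update keeps the best one, while the fixed-policy update weights each by the probability that the policy selects it.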