public class DynamicProgramming extends MDPSolver implements ValueFunction, QProvider
Nested classes/interfaces inherited from interface QProvider: QProvider.Helper
Modifier and Type | Field and Description |
---|---|
protected DPOperator | operator |
protected java.util.Map<HashableState,java.lang.Double> | valueFunction: A map for storing the current value function estimate for each state. |
protected ValueFunction | valueInitializer: The value function initialization to use; defaulted to an initialization of 0 everywhere. |
Fields inherited from class MDPSolver: actionTypes, debugCode, domain, gamma, hashingFactory, model, usingOptionModel
Constructor and Description |
---|
DynamicProgramming() |
Modifier and Type | Method and Description |
---|---|
protected double | computeQ(State s, Action ga): Computes the Q-value. This computation *is* compatible with Option objects. |
void | DPPInit(SADomain domain, double gamma, HashableStateFactory hashingFactory): Common init method for DynamicProgramming instances. |
java.util.List<State> | getAllStates(): Returns all states that are stored in this planner's value function. |
DynamicProgramming | getCopyOfValueFunction() |
protected double | getDefaultValue(State s): Returns the default V-value to use for the state. |
SampleModel | getModel(): Returns the model being used by this solver. |
DPOperator | getOperator(): Returns the dynamic programming operator used. |
ValueFunction | getValueFunctionInitialization(): Returns the value initialization function used. |
boolean | hasComputedValueFor(State s): Returns whether a value for the given state has been computed previously. |
void | loadValueTable(java.lang.String path): Loads the value function table located on disk at the specified path. |
protected double | performBellmanUpdateOn(HashableState sh): Performs a Bellman value function update on the provided (hashed) state. |
double | performBellmanUpdateOn(State s): Performs a Bellman value function update on the provided state. |
protected double | performFixedPolicyBellmanUpdateOn(HashableState sh, EnumerablePolicy p): Performs a fixed-policy Bellman value function update (i.e., policy evaluation) on the provided (hashed) state. |
double | performFixedPolicyBellmanUpdateOn(State s, EnumerablePolicy p): Performs a fixed-policy Bellman value function update (i.e., policy evaluation) on the provided state. |
double | qValue(State s, Action a): Returns the QValue for the given state-action pair. |
java.util.List<QValue> | qValues(State s): Returns a List of QValue objects for every permissible action for the given input state. |
void | resetSolver(): Resets all solver results so that a solver can be restarted fresh, as if it had never solved the MDP. |
void | setOperator(DPOperator operator): Sets the dynamic programming operator to use. |
void | setValueFunctionInitialization(ValueFunction vfInit): Sets the value function initialization to use. |
double | value(HashableState sh): Returns the value function evaluation of the given hashed state. |
double | value(State s): Returns the value function evaluation of the given state. |
void | writeValueTable(java.lang.String path): Writes the value function table stored in this object to the specified file path. |
Methods inherited from class MDPSolver: addActionType, applicableActions, getActionTypes, getDebugCode, getDomain, getGamma, getHashingFactory, setActionTypes, setDebugCode, setDomain, setGamma, setHashingFactory, setModel, solverInit, stateHash, toggleDebugPrinting
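The summary above lists computeQ, performBellmanUpdateOn, and a settable operator without showing how they relate. The following is a minimal, self-contained sketch of a Bellman backup on a toy MDP. It is not this class's implementation: the `BellmanUpdateSketch` class, its `Transition` type, the `String` states and actions, and the explicit transition table are all hypothetical stand-ins, and the operator is modeled as a plain function over an array of Q-values, assumed here to default to max.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToDoubleFunction;

// Hypothetical toy solver, NOT the library's implementation: states and
// actions are Strings and the model is an explicit transition table.
class BellmanUpdateSketch {
    static final class Transition {
        final double prob;
        final String next;
        final double reward;
        Transition(double prob, String next, double reward) {
            this.prob = prob;
            this.next = next;
            this.reward = reward;
        }
    }

    final Map<String, Map<String, List<Transition>>> model = new HashMap<>();
    final Map<String, Double> valueFunction = new HashMap<>();
    final double gamma;
    // Stand-in for a pluggable DP operator; assumed to default to max.
    ToDoubleFunction<double[]> operator = BellmanUpdateSketch::max;

    BellmanUpdateSketch(double gamma) {
        this.gamma = gamma;
    }

    void addTransition(String s, String a, double prob, String next, double reward) {
        model.computeIfAbsent(s, k -> new LinkedHashMap<>())
             .computeIfAbsent(a, k -> new ArrayList<>())
             .add(new Transition(prob, next, reward));
    }

    // States without a stored estimate fall back to 0.
    double value(String s) {
        return valueFunction.getOrDefault(s, 0.0);
    }

    // Q(s,a) = sum over outcomes of p * (r + gamma * V(s'))
    double computeQ(String s, String a) {
        double q = 0.0;
        for (Transition t : model.get(s).get(a)) {
            q += t.prob * (t.reward + gamma * value(t.next));
        }
        return q;
    }

    // V(s) <- operator(Q(s,a1), Q(s,a2), ...); stores and returns the new value.
    double performBellmanUpdateOn(String s) {
        Map<String, List<Transition>> actions = model.get(s);
        if (actions == null || actions.isEmpty()) {
            return value(s); // no applicable actions: leave the estimate as is
        }
        double[] qs = new double[actions.size()];
        int i = 0;
        for (String a : actions.keySet()) {
            qs[i++] = computeQ(s, a);
        }
        double v = operator.applyAsDouble(qs);
        valueFunction.put(s, v);
        return v;
    }

    static double max(double[] qs) {
        double best = Double.NEGATIVE_INFINITY;
        for (double q : qs) {
            best = Math.max(best, q);
        }
        return best;
    }

    public static void main(String[] args) {
        BellmanUpdateSketch dp = new BellmanUpdateSketch(0.9);
        dp.addTransition("A", "stay", 1.0, "A", 0.0);
        dp.addTransition("A", "go", 1.0, "goal", 1.0);
        dp.addTransition("A", "risky", 0.5, "goal", 4.0);
        dp.addTransition("A", "risky", 0.5, "A", 0.0);
        System.out.println(dp.performBellmanUpdateOn("A"));
        System.out.println(dp.performBellmanUpdateOn("A"));
    }
}
```

Replacing the `operator` field with a different function over the Q-values changes the backup without touching `computeQ`, which is the kind of separation a settable operator suggests.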
protected java.util.Map<HashableState,java.lang.Double> valueFunction
protected ValueFunction valueInitializer
protected DPOperator operator
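One way the valueFunction map and the valueInitializer can work together is a lookup that falls back to the initializer for states with no stored estimate. The sketch below illustrates that pattern under stated assumptions: `ValueTableSketch` and its `store` and `setInitializer` methods are hypothetical, states are plain `String` keys rather than HashableState objects, and the initializer is a `ToDoubleFunction` rather than a ValueFunction. It is an illustration of the idea, not the class's actual code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToDoubleFunction;

// Hypothetical stand-in, not the library's class: states are plain Strings
// and the initializer is a ToDoubleFunction instead of a ValueFunction.
class ValueTableSketch {
    private final Map<String, Double> valueFunction = new HashMap<>();
    // Assumed default: an initialization of 0 everywhere.
    private ToDoubleFunction<String> valueInitializer = s -> 0.0;

    void setInitializer(ToDoubleFunction<String> init) {
        this.valueInitializer = init;
    }

    // True only when an estimate has been stored for the state.
    boolean hasComputedValueFor(String s) {
        return valueFunction.containsKey(s);
    }

    // Stored estimate if present, otherwise the initializer's value.
    double value(String s) {
        Double v = valueFunction.get(s);
        return v != null ? v : valueInitializer.applyAsDouble(s);
    }

    void store(String s, double v) {
        valueFunction.put(s, v);
    }

    public static void main(String[] args) {
        ValueTableSketch table = new ValueTableSketch();
        System.out.println(table.value("s0"));               // 0.0
        table.store("s0", 2.5);
        System.out.println(table.value("s0"));               // 2.5
        System.out.println(table.hasComputedValueFor("s1")); // false
    }
}
```

Keeping the initializer separate from the map means an optimistic or heuristic initialization can be swapped in without pre-populating every state.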
public void DPPInit(SADomain domain, double gamma, HashableStateFactory hashingFactory)
Common init method for DynamicProgramming instances. This will automatically call the MDPSolver.solverInit(SADomain, double, HashableStateFactory) method.
Parameters:
domain - the domain in which to plan
gamma - the discount factor
hashingFactory - the state hashing factory
public SampleModel getModel()
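Description copied from interface: MDPSolverInterface
Returns the model being used by this solver.
Specified by: getModel in interface MDPSolverInterface
Overrides: getModel in class MDPSolver
Returns: the SampleModel being used by this solver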
public void resetSolver()
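Description copied from interface: MDPSolverInterface
Resets all solver results so that a solver can be restarted fresh, as if it had never solved the MDP.
Specified by: resetSolver in interface MDPSolverInterface
Overrides: resetSolver in class MDPSolver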
public void setValueFunctionInitialization(ValueFunction vfInit)
Parameters:
vfInit - the object that defines how to initialize the value function.
public ValueFunction getValueFunctionInitialization()
public DPOperator getOperator()
public void setOperator(DPOperator operator)
Sets the dynamic programming operator to use. The default operator is BellmanOperator (max).
Parameters:
operator - the dynamic programming operator to use.
public boolean hasComputedValueFor(State s)
Parameters:
s - the state to check
public double value(State s)
Specified by: value in interface ValueFunction
Parameters:
s - the state to evaluate.
public double value(HashableState sh)
Parameters:
sh - the hashed state to evaluate.
public java.util.List<QValue> qValues(State s)
Description copied from interface: QProvider
Returns a List of QValue objects for every permissible action for the given input state.
public double qValue(State s, Action a)
Description copied from interface: QFunction
Returns the QValue for the given state-action pair.
public java.util.List<State> getAllStates()
public DynamicProgramming getCopyOfValueFunction()
public double performBellmanUpdateOn(State s)
Parameters:
s - the state on which to perform the Bellman update.
public double performFixedPolicyBellmanUpdateOn(State s, EnumerablePolicy p)
Parameters:
s - the state on which to perform the Bellman update.
p - the policy that is being evaluated
public void writeValueTable(java.lang.String path)
Parameters:
path - the path to write the value function
public void loadValueTable(java.lang.String path)
Loads the value function table located on disk at the specified path. The table is a Map from HashableState to Double.
Parameters:
path - the path to the saved value function table
protected double performBellmanUpdateOn(HashableState sh)
Parameters:
sh - the hashed state on which to perform the Bellman update.
protected double performFixedPolicyBellmanUpdateOn(HashableState sh, EnumerablePolicy p)
Parameters:
sh - the hashed state on which to perform the Bellman update.
p - the policy that is being evaluated
protected double computeQ(State s, Action ga)
Computes the Q-value. This computation *is* compatible with Option objects.
Parameters:
s - the given state
ga - the given action
protected double getDefaultValue(State s)
Parameters:
s - the input state to get the default V-value for
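The fixed-policy update methods are described above as policy evaluation. The usual form of that backup weights each action's Q-value by the probability the policy assigns to it, instead of taking the best action. The sketch below contrasts the two on hand-picked numbers. `FixedPolicyUpdateSketch` and both of its methods are hypothetical, and the policy is represented as a plain map from action name to selection probability rather than an EnumerablePolicy. Whether the library computes the update in exactly this way is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration: the policy is a map from action name to
// selection probability, and Q-values are supplied directly.
class FixedPolicyUpdateSketch {

    // Policy evaluation backup: V(s) <- sum_a pi(a|s) * Q(s,a)
    static double fixedPolicyBackup(Map<String, Double> actionProbs,
                                    Map<String, Double> qValues) {
        double v = 0.0;
        for (Map.Entry<String, Double> e : actionProbs.entrySet()) {
            v += e.getValue() * qValues.get(e.getKey());
        }
        return v;
    }

    // Greedy (max) backup for comparison: V(s) <- max_a Q(s,a)
    static double greedyBackup(Map<String, Double> qValues) {
        double best = Double.NEGATIVE_INFINITY;
        for (double q : qValues.values()) {
            best = Math.max(best, q);
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, Double> q = new LinkedHashMap<>();
        q.put("stay", 0.0);
        q.put("go", 1.0);
        Map<String, Double> pi = new LinkedHashMap<>();
        pi.put("stay", 0.25);
        pi.put("go", 0.75);
        System.out.println(fixedPolicyBackup(pi, q)); // 0.75
        System.out.println(greedyBackup(q));          // 1.0
    }
}
```

The fixed-policy value can never exceed the greedy one for the same Q-values, because a probability-weighted average is bounded above by the maximum.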