public class VIModelPlanner extends java.lang.Object implements ModelPlanner, QComputablePlanner
| Modifier and Type | Class and Description |
|---|---|
| `static class` | `VIModelPlanner.VIModelPlannerGenerator` |

Nested classes/interfaces inherited from interface ModelPlanner: `ModelPlanner.ModelPlannerGenerator`

Nested classes/interfaces inherited from interface QComputablePlanner: `QComputablePlanner.QComputablePlannerHelper`

| Modifier and Type | Field and Description |
|---|---|
| `protected Domain` | `domain` The model domain. |
| `protected double` | `gamma` The model planning discount factor. |
| `protected StateHashFactory` | `hashingFactory` The hashing factory to use. |
| `protected State` | `initialState` The last initial state of an episode. |
| `protected double` | `maxDelta` The maximum VI delta. |
| `protected int` | `maxIterations` The maximum number of VI iterations. |
| `protected Policy` | `modelPolicy` The greedy policy that results from VI. |
| `protected java.util.Set<StateHashTuple>` | `observedStates` States the agent has observed during learning. |
| `protected RewardFunction` | `rf` The model reward function. |
| `protected TerminalFunction` | `tf` The model termination function. |
| `protected ValueIteration` | `vi` The value iteration planning object. |

| Constructor and Description |
|---|
| `VIModelPlanner(Domain domain, RewardFunction rf, TerminalFunction tf, double gamma, StateHashFactory hashingFactory, double maxDelta, int maxIterations)` Initializes the planner. |
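
The numeric constructor arguments correspond to the usual value iteration controls: `gamma` discounts future reward, `maxDelta` appears to be the threshold on the largest value change in a sweep, and `maxIterations` caps the number of sweeps. The following is a minimal, self-contained sketch of that stopping logic on a toy deterministic chain. It does not use BURLAP; the class `TinyValueIteration`, its methods, and the 4-state chain are hypothetical and illustrate the idea only, not the library's implementation.

```java
import java.util.Arrays;

/** Hypothetical stand-alone sketch; not part of BURLAP. */
class TinyValueIteration {

    /** Number of sweeps performed by the most recent call to plan(). */
    static int sweepsUsed;

    /**
     * Runs in-place value iteration on a deterministic MDP.
     * next[s][a] is the successor of state s under action a,
     * reward[s][a] is the reward for that transition,
     * and terminal[s] marks states whose value stays at 0.
     */
    static double[] plan(int[][] next, double[][] reward, boolean[] terminal,
                         double gamma, double maxDelta, int maxIterations) {
        double[] v = new double[next.length];
        sweepsUsed = 0;
        for (int i = 0; i < maxIterations; i++) {
            double delta = 0.0;
            for (int s = 0; s < next.length; s++) {
                if (terminal[s]) {
                    continue; // terminal states keep value 0
                }
                double best = Double.NEGATIVE_INFINITY;
                for (int a = 0; a < next[s].length; a++) {
                    best = Math.max(best, reward[s][a] + gamma * v[next[s][a]]);
                }
                delta = Math.max(delta, Math.abs(best - v[s]));
                v[s] = best;
            }
            sweepsUsed++;
            if (delta < maxDelta) {
                break; // largest change fell below the threshold
            }
        }
        return v;
    }

    /** Toy chain: action 0 moves left, action 1 moves right, state 3 is terminal, each step costs 1. */
    static double[] planChain(double gamma, double maxDelta, int maxIterations) {
        int[][] next = {{0, 1}, {0, 2}, {1, 3}, {3, 3}};
        double[][] reward = {{-1, -1}, {-1, -1}, {-1, -1}, {0, 0}};
        boolean[] terminal = {false, false, false, true};
        return plan(next, reward, terminal, gamma, maxDelta, maxIterations);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(planChain(0.9, 1e-6, 100)));
        System.out.println("sweeps: " + sweepsUsed);
    }
}
```

With a small `maxDelta` the loop stops as soon as a sweep changes nothing noticeable; with `maxIterations` set to 1 it stops after a single sweep regardless of convergence, which trades plan quality for replanning speed.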

| Modifier and Type | Method and Description |
|---|---|
| `QValue` | `getQ(State s, AbstractGroundedAction a)` Returns the QValue for the given state-action pair. |
| `java.util.List<QValue>` | `getQs(State s)` Returns a List of QValue objects for every permissible action for the given input state. |
| `ValueIteration` | `getValueIterationPlanner()` Returns the value iteration object used for planning whenever the model updates. |
| `void` | `initializePlannerIn(State s)` This method is expected to be called at the beginning of any new learning episode. |
| `void` | `modelChanged(State changedState)` Tells the planner that the model has changed and that it will need to replan accordingly. |
| `Policy` | `modelPlannedPolicy()` Returns a policy encoding the planner's results. |
| `protected void` | `rerunVI()` Reruns VI on the newly updated model. |
| `void` | `resetPlanner()` Resets the planner as if planning had never been called. |

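
The method summary suggests a replan-on-change lifecycle: an episode starts with `initializePlannerIn`, each model update triggers `modelChanged`, and planning is rerun over the states observed so far. The sketch below mimics that flow with integer states and a learned deterministic model. Everything in it (the class `ReplanningSketch`, `recordTransition`, `value`, `replanCount`) is hypothetical, and the internal behavior of the real class is assumed rather than taken from its source.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical replan-on-change sketch over integer states; not BURLAP code. */
class ReplanningSketch {
    // Learned deterministic model: state -> (action -> {nextState, reward}).
    private final Map<Integer, Map<Integer, double[]>> model = new HashMap<>();
    private final Set<Integer> observedStates = new HashSet<>();
    private final Map<Integer, Double> values = new HashMap<>();
    private final double gamma;
    private final double maxDelta;
    private final int maxIterations;
    int initialState = -1; // last state passed to initializePlannerIn
    int replanCount = 0;   // how many times rerunVI has executed

    ReplanningSketch(double gamma, double maxDelta, int maxIterations) {
        this.gamma = gamma;
        this.maxDelta = maxDelta;
        this.maxIterations = maxIterations;
    }

    /** Called at the start of an episode; this sketch only remembers the start state. */
    void initializePlannerIn(int s) {
        initialState = s;
    }

    /** Stores a newly learned transition, then signals that the model changed. */
    void recordTransition(int s, int a, int sPrime, double r) {
        model.computeIfAbsent(s, k -> new HashMap<>()).put(a, new double[] {sPrime, r});
        modelChanged(s);
    }

    /** Marks the state as observed and replans (assumed behavior). */
    void modelChanged(int changedState) {
        observedStates.add(changedState);
        rerunVI();
    }

    /** Discards planning results; the learned model itself is kept. */
    void resetPlanner() {
        values.clear();
    }

    /** Current value estimate; states never planned for default to 0. */
    double value(int s) {
        return values.getOrDefault(s, 0.0);
    }

    private void rerunVI() {
        values.clear();
        replanCount++;
        for (int i = 0; i < maxIterations; i++) {
            double delta = 0.0;
            for (int s : observedStates) {
                Map<Integer, double[]> actions = model.get(s);
                if (actions == null || actions.isEmpty()) {
                    continue; // nothing modelled for this state yet
                }
                double best = Double.NEGATIVE_INFINITY;
                for (double[] outcome : actions.values()) {
                    best = Math.max(best, outcome[1] + gamma * value((int) outcome[0]));
                }
                delta = Math.max(delta, Math.abs(best - value(s)));
                values.put(s, best);
            }
            if (delta < maxDelta) {
                break;
            }
        }
    }

    public static void main(String[] args) {
        ReplanningSketch p = new ReplanningSketch(0.5, 1e-9, 100);
        p.initializePlannerIn(0);
        p.recordTransition(0, 1, 1, 0.0);
        p.recordTransition(1, 1, 2, 1.0);
        System.out.println(p.value(0) + " " + p.value(1) + " replans=" + p.replanCount);
    }
}
```

Replanning from scratch on every change is simple but costs a full VI run per model update, so the `maxDelta` and `maxIterations` settings directly bound the per-step overhead.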
protected ValueIteration vi
protected Policy modelPolicy
protected State initialState
protected Domain domain
protected RewardFunction rf
protected TerminalFunction tf
protected double gamma
protected StateHashFactory hashingFactory
protected double maxDelta
protected int maxIterations
protected java.util.Set<StateHashTuple> observedStates
public VIModelPlanner(Domain domain, RewardFunction rf, TerminalFunction tf, double gamma, StateHashFactory hashingFactory, double maxDelta, int maxIterations)
Parameters:
domain - model domain
rf - model reward function
tf - model termination function
gamma - discount factor
hashingFactory - the hashing factory
maxDelta - max value function delta in VI
maxIterations - max iterations of VI

public void initializePlannerIn(State s)
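Description copied from interface: ModelPlanner
This method is expected to be called at the beginning of any new learning episode.
Specified by: initializePlannerIn in interface ModelPlanner
Parameters:
s - the input state

public void modelChanged(State changedState)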
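Description copied from interface: ModelPlanner
Tells the planner that the model has changed and that it will need to replan accordingly.
Specified by: modelChanged in interface ModelPlanner
Parameters:
changedState - the source state that caused a change in the model.

public Policy modelPlannedPolicy()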
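Description copied from interface: ModelPlanner
Returns a policy encoding the planner's results.
Specified by: modelPlannedPolicy in interface ModelPlanner

public void resetPlanner()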
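Description copied from interface: ModelPlanner
Resets the planner as if planning had never been called.
Specified by: resetPlanner in interface ModelPlanner

public java.util.List<QValue> getQs(State s)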
Description copied from interface: QComputablePlanner
Returns a List of QValue objects for every permissible action for the given input state.
Specified by: getQs in interface QComputablePlanner
Parameters:
s - the state for which Q-values are to be returned.
Returns:
a List of QValue objects for every permissible action for the given input state.

public QValue getQ(State s, AbstractGroundedAction a)
Description copied from interface: QComputablePlanner
Returns the QValue for the given state-action pair.
Specified by: getQ in interface QComputablePlanner
Parameters:
s - the input state
a - the input action
Returns:
the QValue for the given state-action pair.

public ValueIteration getValueIterationPlanner()
protected void rerunVI()
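
The `getQs` and `getQ` methods return Q-values, and `modelPolicy` is described as the greedy policy that results from VI. As a rough illustration of how those pieces relate, the sketch below derives Q-values from a value function with a one-step lookahead, Q(s, a) = r(s, a) + gamma * V(s'), and picks the action with the largest Q-value. It assumes deterministic transitions and integer states; `QFromValuesSketch`, `QEntry`, and `greedyAction` are hypothetical names, not BURLAP types.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of Q-values and greedy action selection; not BURLAP code. */
class QFromValuesSketch {

    /** Stand-in for a Q-value record: an action index and its estimated return. */
    static final class QEntry {
        final int action;
        final double q;

        QEntry(int action, double q) {
            this.action = action;
            this.q = q;
        }
    }

    /** One-step lookahead: Q(s, a) = reward[s][a] + gamma * v[next[s][a]] for every action a. */
    static List<QEntry> getQs(int s, int[][] next, double[][] reward, double[] v, double gamma) {
        List<QEntry> qs = new ArrayList<>();
        for (int a = 0; a < next[s].length; a++) {
            qs.add(new QEntry(a, reward[s][a] + gamma * v[next[s][a]]));
        }
        return qs;
    }

    /** Returns the action with the largest Q-value; ties go to the lowest action index. */
    static int greedyAction(List<QEntry> qs) {
        if (qs.isEmpty()) {
            throw new IllegalArgumentException("no actions available");
        }
        QEntry best = qs.get(0);
        for (QEntry e : qs) {
            if (e.q > best.q) {
                best = e;
            }
        }
        return best.action;
    }

    public static void main(String[] args) {
        int[][] next = {{0, 1}, {0, 2}, {2, 2}};
        double[][] reward = {{0, 0}, {0, 1}, {0, 0}};
        double[] v = {0.5, 1.0, 0.0};
        for (QEntry e : getQs(0, next, reward, v, 0.5)) {
            System.out.println("a=" + e.action + " q=" + e.q);
        }
        System.out.println("greedy action in state 0: " + greedyAction(getQs(0, next, reward, v, 0.5)));
    }
}
```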