public class QMDP extends MDPSolver implements Planner, QFunction
Nested classes inherited from QFunction: QFunction.QFunctionHelper

| Modifier and Type | Field and Description |
|---|---|
| protected QFunction | mdpQSource: The fully observable MDP QFunction source. |
Fields inherited from class MDPSolver: actions, debugCode, domain, gamma, hashingFactory, mapToStateIndex, rf, tf

| Constructor and Description |
|---|
| QMDP(PODomain domain, QFunction mdpQSource): Initializes. |
| QMDP(PODomain domain, RewardFunction rf, TerminalFunction tf, double discount, HashableStateFactory hashingFactory, double maxDelta, int maxIterations): Initializes and creates a ValueIteration planner to solve the underlying MDP. |
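
The maxDelta and maxIterations parameters of the second constructor bound how long the underlying value iteration runs. The sketch below illustrates that kind of stopping rule on a small tabular MDP in plain Java. It is not BURLAP's ValueIteration implementation: the class name ValueIterationSketch, the array-based transition and reward tables, and the assumption that planning stops once the largest value change in a sweep falls below maxDelta are all illustrative choices.

```java
public class ValueIterationSketch {

    /**
     * Runs value iteration on a small tabular MDP (hypothetical representation).
     * transition[s][a][s2] is P(s2 | s, a); reward[s][a] is the immediate reward.
     * Stops when the largest value change in a sweep is below maxDelta,
     * or after maxIterations sweeps, whichever comes first.
     */
    public static double[] solve(double[][][] transition, double[][] reward,
                                 double discount, double maxDelta, int maxIterations) {
        int numStates = reward.length;
        double[] values = new double[numStates]; // initialized to 0.0
        for (int iter = 0; iter < maxIterations; iter++) {
            double delta = 0.0;
            for (int s = 0; s < numStates; s++) {
                // Bellman backup: max over actions of the one-step lookahead value.
                double best = Double.NEGATIVE_INFINITY;
                for (int a = 0; a < reward[s].length; a++) {
                    double q = reward[s][a];
                    for (int s2 = 0; s2 < numStates; s2++) {
                        q += discount * transition[s][a][s2] * values[s2];
                    }
                    best = Math.max(best, q);
                }
                delta = Math.max(delta, Math.abs(best - values[s]));
                values[s] = best; // in-place update within the sweep
            }
            if (delta < maxDelta) {
                break; // value function has changed by less than the threshold
            }
        }
        return values;
    }
}
```

A smaller maxDelta gives a tighter approximation of the MDP value function at the cost of more sweeps, while maxIterations caps the work when convergence is slow.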
| Modifier and Type | Method and Description | 
|---|---|
| void | forceMDPPlanningFromAllStates(): Calls the Planner.planFromState(burlap.oomdp.core.states.State) method on all states defined in the POMDP. |
| QValue | getQ(State s, AbstractGroundedAction a): Returns the QValue for the given state-action pair. |
| java.util.List<QValue> | getQs(State s): Returns a List of QValue objects for every permissible action for the given input state. |
| Policy | planFromState(State initialState) |
| double | qForBelief(EnumerableBeliefState bs, GroundedAction ga): Computes the expected Q-value of the underlying hidden MDP by marginalizing over the states in the belief state. |
| protected double | qForBeliefList(java.util.List<EnumerableBeliefState.StateBelief> beliefs, GroundedAction ga): Computes the expected Q-value of the underlying hidden MDP by marginalizing over the states in the belief state. |
| void | resetSolver(): This method resets all solver results so that a solver can be restarted fresh, as if it had never solved the MDP. |
| double | value(State s): Returns the value function evaluation of the given state. |
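
The qForBelief and qForBeliefList entries above describe marginalizing the hidden-state Q-values over a belief state, that is, computing the sum over states s of b(s) * Q(s, a). A minimal plain-Java sketch of that computation follows. The class BeliefQSketch, the StateActionQ interface, and the use of String keys for states and actions are hypothetical stand-ins for illustration and are not BURLAP types.

```java
import java.util.Map;

public class BeliefQSketch {

    /** Hypothetical stand-in for a fully observable MDP Q-function source. */
    public interface StateActionQ {
        double q(String state, String action);
    }

    /**
     * Expected Q-value of an action under a belief state:
     * the sum over hidden states s of belief(s) * mdpQ.q(s, action).
     * The belief map is assumed to hold probabilities that sum to 1.
     */
    public static double qForBelief(Map<String, Double> belief, String action,
                                    StateActionQ mdpQ) {
        double expected = 0.0;
        for (Map.Entry<String, Double> entry : belief.entrySet()) {
            // Weight each hidden state's Q-value by its belief probability.
            expected += entry.getValue() * mdpQ.q(entry.getKey(), action);
        }
        return expected;
    }
}
```

For example, with a belief of 0.25 on one state whose Q-value is 4 and 0.75 on another whose Q-value is 8, the belief-level Q-value is 0.25 * 4 + 0.75 * 8 = 7.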
Methods inherited from class MDPSolver: addNonDomainReferencedAction, getActions, getAllGroundedActions, getDebugCode, getDomain, getGamma, getHashingFactory, getRf, getRF, getTf, getTF, setActions, setDebugCode, setDomain, setGamma, setHashingFactory, setRf, setTf, solverInit, stateHash, toggleDebugPrinting, translateAction

Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface MDPSolverInterface: addNonDomainReferencedAction, getActions, getDebugCode, getDomain, getGamma, getHashingFactory, getRf, getRF, getTf, getTF, setActions, setDebugCode, setDomain, setGamma, setHashingFactory, setRf, setTf, solverInit, toggleDebugPrinting

public QMDP(PODomain domain, QFunction mdpQSource)
Parameters:
domain - the POMDP domain
mdpQSource - the underlying fully observable MDP QFunction source.

public QMDP(PODomain domain, RewardFunction rf, TerminalFunction tf, double discount, HashableStateFactory hashingFactory, double maxDelta, int maxIterations)
Initializes and creates a ValueIteration planner to solve the underlying MDP. You should call the forceMDPPlanningFromAllStates() method after construction to have the constructed ValueIteration instance perform planning.
Parameters:
domain - the POMDP domain
rf - the POMDP hidden state reward function
tf - the POMDP hidden state terminal function
discount - the discount factor
hashingFactory - the HashableStateFactory for the ValueIteration instance to use.
maxDelta - the maximum value function change threshold that will cause planning to terminate
maxIterations - the maximum number of value iteration iterations.

public void forceMDPPlanningFromAllStates()
Calls the Planner.planFromState(burlap.oomdp.core.states.State) method on all states defined in the POMDP. Calling this method requires that the PODomain provides a StateEnumerator; otherwise an exception will be thrown.

public java.util.List<QValue> getQs(State s)
Description copied from interface: QFunction
Returns a List of QValue objects for every permissible action for the given input state.

public QValue getQ(State s, AbstractGroundedAction a)
Description copied from interface: QFunction
Returns the QValue for the given state-action pair.

public double value(State s)
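Description copied from interface: ValueFunction
Returns the value function evaluation of the given state.
Specified by: value in interface ValueFunction
Parameters:
s - the state to evaluate.

public double qForBelief(EnumerableBeliefState bs, GroundedAction ga)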
Parameters:
bs - the belief state
ga - the action whose Q-value is to be computed

protected double qForBeliefList(java.util.List<EnumerableBeliefState.StateBelief> beliefs, GroundedAction ga)
Parameters:
beliefs - belief state distribution
ga - the action whose Q-value is to be computed

public Policy planFromState(State initialState)
Description copied from interface: Planner
This method will cause the Planner to begin planning from the specified initial State. It will then return an appropriate Policy object that captures the planning results. Note that you can typically use a variety of different Policy objects in conjunction with this Planner to get varying behavior, and you are not required to use the returned Policy.
Specified by: planFromState in interface Planner
Parameters:
initialState - the initial state of the planning problem
Returns:
a Policy that captures the planning results from the input State.

public void resetSolver()
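Description copied from interface: MDPSolverInterface
This method resets all solver results so that a solver can be restarted fresh, as if it had never solved the MDP.
Specified by: resetSolver in interface MDPSolverInterface
Overrides: resetSolver in class MDPSolver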