public class BeliefSparseSampling extends MDPSolver implements Planner, QFunction
A planner for POMDPs that converts the POMDP into a Belief MDP and then uses SparseSampling to solve it. If the full transition dynamics are used (set c in the constructor to -1), then it provides an optimal finite horizon POMDP policy.

| Modifier and Type | Field and Description |
|---|---|
| protected SADomain | beliefMDP. The belief MDP domain to solve. |
| protected RewardFunction | beliefRF. The belief MDP reward function. |
| protected SparseSampling | mdpPlanner. The SparseSampling planning instance used to solve the problem. |
Fields inherited from class MDPSolver: actions, debugCode, domain, gamma, hashingFactory, mapToStateIndex, rf, tf

| Constructor and Description |
|---|
| BeliefSparseSampling(PODomain domain, RewardFunction rf, double discount, HashableStateFactory hashingFactory, int h, int c). Initializes the planner. |
| Modifier and Type | Method and Description |
|---|---|
| SADomain | getBeliefMDP(). Returns the generated Belief MDP that will be solved. |
| QValue | getQ(State s, AbstractGroundedAction a). Returns the QValue for the given state-action pair. |
| java.util.List<QValue> | getQs(State s). Returns a List of QValue objects for every permissible action for the given input state. |
| SparseSampling | getSparseSamplingPlanner(). Returns the SparseSampling planner used to solve the Belief MDP. |
| static void | main(java.lang.String[] args) |
| Policy | planFromState(State initialState). Plans from the specified initial State and returns a Policy that captures the planning results. |
| void | resetSolver(). Resets all solver results so that the solver can be restarted fresh as if it had never solved the MDP. |
| double | value(State s). Returns the value function evaluation of the given state. |
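As a rough illustration of how the members above fit together, here is a minimal sketch. The PODomain, RewardFunction, HashableStateFactory, and initial belief State are assumed to come from your own domain code (their construction is domain-specific and not shown), the numeric settings are placeholders, and BURLAP imports are omitted because package locations vary across library versions.

```java
import java.util.List;

// A hedged usage sketch; the objects passed in are hypothetical and assumed to
// be constructed elsewhere. Only methods documented on this page are used.
public class BeliefSparseSamplingExample {

    public static Policy planWithBSS(PODomain pomdp,
                                     RewardFunction rf,
                                     HashableStateFactory hashingFactory,
                                     State initialBelief) {

        // h = 10 plans ten steps deep in the belief MDP; c = -1 uses the full
        // belief MDP transition dynamics, yielding an optimal finite-horizon policy.
        BeliefSparseSampling bss =
                new BeliefSparseSampling(pomdp, rf, 0.99, hashingFactory, 10, -1);

        // Plan from the initial belief state and capture the result as a Policy.
        Policy policy = bss.planFromState(initialBelief);

        // The planner can also be queried directly for Q-values and state values.
        List<QValue> qs = bss.getQs(initialBelief);
        double v = bss.value(initialBelief);

        return policy;
    }
}
```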
Methods inherited from class MDPSolver: addNonDomainReferencedAction, getActions, getAllGroundedActions, getDebugCode, getDomain, getGamma, getHashingFactory, getRf, getRF, getTf, getTF, setActions, setDebugCode, setDomain, setGamma, setHashingFactory, setRf, setTf, solverInit, stateHash, toggleDebugPrinting, translateAction

Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait

Methods inherited from interface MDPSolverInterface: addNonDomainReferencedAction, getActions, getDebugCode, getDomain, getGamma, getHashingFactory, getRf, getRF, getTf, getTF, setActions, setDebugCode, setDomain, setGamma, setHashingFactory, setRf, setTf, solverInit, toggleDebugPrinting

protected SADomain beliefMDP
The belief MDP domain to solve.

protected RewardFunction beliefRF
The belief MDP reward function.

protected SparseSampling mdpPlanner
The SparseSampling planning instance used to solve the problem.
public BeliefSparseSampling(PODomain domain, RewardFunction rf, double discount, HashableStateFactory hashingFactory, int h, int c)
Initializes the planner.
Parameters:
domain - the POMDP domain
rf - the POMDP reward function
discount - the discount factor
hashingFactory - the Belief MDP HashableStateFactory that SparseSampling will use
h - the height of the SparseSampling tree
c - the number of samples SparseSampling will use; set to -1 to use the full Belief MDP transition dynamics

public SADomain getBeliefMDP()
Returns the generated Belief MDP that will be solved.
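Because the belief MDP and the SparseSampling instance that solves it are generated internally, both can be retrieved after construction for inspection. A small sketch, assuming bss is an already-constructed BeliefSparseSampling instance as in the earlier example:

```java
// Assumes bss is a constructed BeliefSparseSampling instance (hypothetical).
SADomain beliefMDP = bss.getBeliefMDP();                    // the generated belief MDP that will be solved
SparseSampling mdpPlanner = bss.getSparseSamplingPlanner(); // the SparseSampling planner solving it
```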
public SparseSampling getSparseSamplingPlanner()
Returns the SparseSampling planner used to solve the Belief MDP.

public java.util.List<QValue> getQs(State s)
Specified by: getQs in interface QFunction
Returns a List of QValue objects for every permissible action for the given input state.

public QValue getQ(State s, AbstractGroundedAction a)
Specified by: getQ in interface QFunction
Returns the QValue for the given state-action pair.
public Policy planFromState(State initialState)
This method will cause the Planner to begin planning from the specified initial State. It will then return an appropriate Policy object that captures the planning results. Note that typically you can use a variety of different Policy objects in conjunction with this Planner to get varying behavior, and the returned Policy is not required to be used.
Specified by: planFromState in interface Planner
Parameters:
initialState - the initial state of the planning problem
Returns:
a Policy that captures the planning results from the input State
public void resetSolver()
This method resets all solver results so that the solver can be restarted fresh as if it had never solved the MDP.
Specified by: resetSolver in interface MDPSolverInterface
Overrides: resetSolver in class MDPSolver
public double value(State s)
Returns the value function evaluation of the given state.
Specified by: value in interface ValueFunction
Parameters:
s - the state to evaluate

public static void main(java.lang.String[] args)