public class BeliefSparseSampling extends MDPSolver implements Planner, QProvider

A POMDP planning algorithm that converts the POMDP into a belief MDP and uses SparseSampling to solve it. If the full transition dynamics are used (set c in the constructor to -1), then it provides an optimal finite horizon POMDP policy.

Nested classes/interfaces inherited from interface QProvider: QProvider.Helper
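A belief MDP treats probability distributions over the hidden POMDP states as its own (fully observable) states, updating them by Bayes' rule after each action and observation. The following is a minimal, self-contained sketch of that belief update, not BURLAP code; the two-state example and the 0.85 observation accuracy are invented for illustration:

```java
import java.util.Arrays;

public class BeliefUpdateSketch {

    // Bayes-rule belief update: b'(s') is proportional to
    // O(o | s') * sum_s T(s' | s, a) * b(s), then normalized.
    static double[] updateBelief(double[] belief, double[][] transition, double[] obsProb) {
        double[] next = new double[belief.length];
        double norm = 0.0;
        for (int sp = 0; sp < belief.length; sp++) {
            double mass = 0.0;
            for (int s = 0; s < belief.length; s++) {
                mass += transition[s][sp] * belief[s];
            }
            next[sp] = obsProb[sp] * mass;
            norm += next[sp];
        }
        for (int sp = 0; sp < belief.length; sp++) {
            next[sp] /= norm;
        }
        return next;
    }

    public static void main(String[] args) {
        // Two hidden states; the chosen action leaves the state unchanged.
        double[] belief = {0.5, 0.5};
        double[][] identity = {{1, 0}, {0, 1}};
        // An observation that is correct with probability 0.85 and
        // suggests the hidden state is state 0.
        double[] obsLikelihood = {0.85, 0.15};
        belief = updateBelief(belief, identity, obsLikelihood);
        System.out.println(Arrays.toString(belief)); // mass shifts toward state 0
    }
}
```

BeliefSparseSampling performs this conversion internally (see getBeliefMDP() below), so the planner only ever sees belief states.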
Modifier and Type | Field and Description
---|---
protected SADomain | beliefMDP The belief MDP domain to solve.
protected SparseSampling | mdpPlanner The SparseSampling planning instance used to solve the problem.
Fields inherited from class MDPSolver: actionTypes, debugCode, domain, gamma, hashingFactory, model, usingOptionModel
Constructor and Description
---|
BeliefSparseSampling(PODomain domain, double discount, HashableStateFactory hashingFactory, int h, int c) Initializes the planner.
Modifier and Type | Method and Description
---|---
SADomain | getBeliefMDP() Returns the generated belief MDP that will be solved.
SparseSampling | getSparseSamplingPlanner() Returns the SparseSampling planner used to solve the belief MDP.
static void | main(java.lang.String[] args)
Policy | planFromState(State initialState)
double | qValue(State s, Action a) Returns the QValue for the given state-action pair.
java.util.List<QValue> | qValues(State s) Returns a List of QValue objects for every permissible action for the given input state.
void | resetSolver() Resets all solver results so that the solver can be restarted fresh, as if it had never solved the MDP.
double | value(State s) Returns the value function evaluation of the given state.
Methods inherited from class MDPSolver: addActionType, applicableActions, getActionTypes, getDebugCode, getDomain, getGamma, getHashingFactory, getModel, setActionTypes, setDebugCode, setDomain, setGamma, setHashingFactory, setModel, solverInit, stateHash, toggleDebugPrinting

Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface MDPSolverInterface: addActionType, getActionTypes, getDebugCode, getDomain, getGamma, getHashingFactory, getModel, setActionTypes, setDebugCode, setDomain, setGamma, setHashingFactory, setModel, solverInit, toggleDebugPrinting
protected SADomain beliefMDP
The belief MDP domain to solve.

protected SparseSampling mdpPlanner
The SparseSampling planning instance used to solve the problem.

public BeliefSparseSampling(PODomain domain, double discount, HashableStateFactory hashingFactory, int h, int c)
Initializes the planner.
Parameters:
domain - the POMDP domain
discount - the discount factor
hashingFactory - the belief MDP HashableStateFactory that SparseSampling will use
h - the height of the SparseSampling tree
c - the number of samples SparseSampling will use; set to -1 to use the full belief MDP transition dynamics

public SADomain getBeliefMDP()
Returns the generated belief MDP that will be solved.
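The h and c constructor parameters control the sparse-sampling estimator: from each node, c next states are sampled, and the lookahead tree is expanded to depth h. The following is a rough, self-contained sketch of that recursion, not BURLAP code; the two-state toy MDP, its rewards, and its transition probabilities are invented for illustration:

```java
import java.util.Random;

public class SparseSamplingSketch {
    static final double GAMMA = 0.95;
    static final int NUM_ACTIONS = 2;
    static final Random RNG = new Random(0); // fixed seed for reproducibility

    // Toy MDP: 2 states, 2 actions. Action 1 in state 0 usually reaches
    // state 1, which yields reward 1; all other transitions are self-loops.
    static int sampleNext(int s, int a) {
        if (s == 0 && a == 1) {
            return RNG.nextDouble() < 0.9 ? 1 : 0;
        }
        return s;
    }

    static double reward(int s) {
        return s == 1 ? 1.0 : 0.0;
    }

    // Sparse-sampling Q estimate with tree height h and c samples per node:
    // Q_h(s,a) = (1/c) * sum over c samples of [ r(s') + gamma * max_a' Q_{h-1}(s',a') ]
    static double qEstimate(int s, int a, int h, int c) {
        if (h == 0) {
            return 0.0; // leaf of the lookahead tree
        }
        double sum = 0.0;
        for (int i = 0; i < c; i++) {
            int sp = sampleNext(s, a);
            double best = Double.NEGATIVE_INFINITY;
            for (int ap = 0; ap < NUM_ACTIONS; ap++) {
                best = Math.max(best, qEstimate(sp, ap, h - 1, c));
            }
            sum += reward(sp) + GAMMA * best;
        }
        return sum / c;
    }

    public static void main(String[] args) {
        // Moving toward the rewarding state should score higher than looping.
        System.out.println(qEstimate(0, 1, 3, 10) > qEstimate(0, 0, 3, 10));
    }
}
```

Larger c tightens the estimate at each node but the tree grows as (c * |A|)^h, which is why BeliefSparseSampling exposes both knobs; c = -1 in the real class replaces sampling with the full belief MDP transition distribution.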
public SparseSampling getSparseSamplingPlanner()
Returns the SparseSampling planner used to solve the belief MDP.
Returns:
the SparseSampling planner used to solve the belief MDP

public java.util.List<QValue> qValues(State s)
Returns a List of QValue objects for every permissible action for the given input state.
Specified by:
qValues in interface QProvider

public double qValue(State s, Action a)
Returns the QValue for the given state-action pair.
Specified by:
qValue in interface QFunction

public Policy planFromState(State initialState)
This method will cause the Planner to begin planning from the specified initial State. It will then return an appropriate Policy object that captures the planning results. Note that typically you can use a variety of different Policy objects in conjunction with this Planner to get varying behavior, and the returned Policy is not required to be used.
Specified by:
planFromState in interface Planner
Parameters:
initialState - the initial state of the planning problem
Returns:
a Policy that captures the planning results from the input State

public void resetSolver()
This method resets all solver results so that the solver can be restarted fresh, as if it had never solved the MDP.
Specified by:
resetSolver in interface MDPSolverInterface
Overrides:
resetSolver in class MDPSolver
public double value(State s)
Returns the value function evaluation of the given state.
Specified by:
value in interface ValueFunction
Parameters:
s - the state to evaluate

public static void main(java.lang.String[] args)