public class BeliefSparseSampling extends MDPSolver implements Planner, QFunction
A planner for POMDP problems that converts the problem to a Belief MDP and then uses SparseSampling to solve it. If the full transition dynamics are used (set c in the constructor to -1), then it provides an optimal finite horizon POMDP policy.

Nested classes/interfaces inherited from interface QFunction: QFunction.QFunctionHelper
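As orientation, below is a minimal usage sketch relying only on the constructor and planFromState documented on this page. The pomdp, rf, hashingFactory, and initialBelief variables (and the make* helpers) are hypothetical placeholders for your own domain setup, and imports are omitted since package locations vary across BURLAP releases.

```java
// A minimal sketch, assuming an existing POMDP definition. The make* helpers
// below are hypothetical placeholders for your own domain setup code.
PODomain pomdp = makePomdpDomain();                   // hypothetical helper
RewardFunction rf = makeRewardFunction();             // hypothetical helper
HashableStateFactory hashingFactory = makeHashingFactory(); // hypothetical helper
State initialBelief = makeInitialBelief();            // hypothetical helper

// h = 10 limits the SparseSampling tree to 10 levels; c = -1 uses the full
// Belief MDP transition dynamics, giving an optimal finite-horizon policy.
BeliefSparseSampling bss =
        new BeliefSparseSampling(pomdp, rf, 0.99, hashingFactory, 10, -1);

// Plan from the initial belief and recover a policy over belief states.
Policy policy = bss.planFromState(initialBelief);
```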
Modifier and Type | Field and Description |
---|---|
protected SADomain | beliefMDP: The belief MDP domain to solve. |
protected RewardFunction | beliefRF: The belief MDP reward function. |
protected SparseSampling | mdpPlanner: The SparseSampling planning instance used to solve the problem. |
Fields inherited from class MDPSolver: actions, debugCode, domain, gamma, hashingFactory, mapToStateIndex, rf, tf
Constructor and Description |
---|
BeliefSparseSampling(PODomain domain, RewardFunction rf, double discount, HashableStateFactory hashingFactory, int h, int c): Initializes the planner. |
Modifier and Type | Method and Description |
---|---|
SADomain | getBeliefMDP(): Returns the generated Belief MDP that will be solved. |
QValue | getQ(State s, AbstractGroundedAction a): Returns the QValue for the given state-action pair. |
java.util.List<QValue> | getQs(State s): Returns a List of QValue objects for every permissible action for the given input state. |
SparseSampling | getSparseSamplingPlanner(): Returns the SparseSampling planner used to solve the Belief MDP. |
static void | main(java.lang.String[] args) |
Policy | planFromState(State initialState) |
void | resetSolver(): Resets all solver results so that the solver can be restarted fresh as if it had never solved the MDP. |
double | value(State s): Returns the value function evaluation of the given state. |
Methods inherited from class MDPSolver: addNonDomainReferencedAction, getActions, getAllGroundedActions, getDebugCode, getDomain, getGamma, getHashingFactory, getRf, getRF, getTf, getTF, setActions, setDebugCode, setDomain, setGamma, setHashingFactory, setRf, setTf, solverInit, stateHash, toggleDebugPrinting, translateAction
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface MDPSolverInterface: addNonDomainReferencedAction, getActions, getDebugCode, getDomain, getGamma, getHashingFactory, getRf, getRF, getTf, getTF, setActions, setDebugCode, setDomain, setGamma, setHashingFactory, setRf, setTf, solverInit, toggleDebugPrinting
protected SADomain beliefMDP
The belief MDP domain to solve.

protected RewardFunction beliefRF
The belief MDP reward function.

protected SparseSampling mdpPlanner
The SparseSampling planning instance used to solve the problem.

public BeliefSparseSampling(PODomain domain, RewardFunction rf, double discount, HashableStateFactory hashingFactory, int h, int c)
Initializes the planner.
Parameters:
domain - the POMDP domain
rf - the POMDP reward function
discount - the discount factor
hashingFactory - the Belief MDP HashableStateFactory that SparseSampling will use.
h - the height of the SparseSampling tree.
c - the number of samples SparseSampling will use. Set to -1 to use the full Belief MDP transition dynamics.
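The c parameter trades accuracy for computation. A sketch of the two modes, reusing the hypothetical variables from the earlier sketch:

```java
// c = -1: expand the full Belief MDP transition dynamics at each tree node;
// exact, and yields an optimal finite-horizon POMDP policy, but costly.
BeliefSparseSampling exact =
        new BeliefSparseSampling(pomdp, rf, 0.99, hashingFactory, 10, -1);

// c = 20: estimate each node from 20 sampled transitions instead; cheaper,
// at the price of an approximate value estimate.
BeliefSparseSampling sampled =
        new BeliefSparseSampling(pomdp, rf, 0.99, hashingFactory, 10, 20);
```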
public SADomain getBeliefMDP()
Returns the generated Belief MDP that will be solved.

public SparseSampling getSparseSamplingPlanner()
Returns the SparseSampling planner used to solve the Belief MDP.
Returns:
the SparseSampling planner used to solve the Belief MDP

public java.util.List<QValue> getQs(State s)
Description copied from interface: QFunction
Returns a List of QValue objects for every permissible action for the given input state.
Specified by:
getQs in interface QFunction

public QValue getQ(State s, AbstractGroundedAction a)
Description copied from interface: QFunction
Returns the QValue for the given state-action pair.
Specified by:
getQ in interface QFunction
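Because the class implements QFunction, per-action values at a belief state can be inspected directly. A sketch, assuming QValue's public a and q fields as in BURLAP 2, and the hypothetical initialBelief and bss variables from the earlier sketch:

```java
// Enumerate the Q-value of every permissible action at the belief state
// and keep the action with the highest value.
java.util.List<QValue> qs = bss.getQs(initialBelief);
QValue best = qs.get(0);
for (QValue qv : qs) {
    if (qv.q > best.q) {
        best = qv;
    }
}
System.out.println("best action: " + best.a + " Q = " + best.q);

// A single state-action pair can also be queried directly.
QValue one = bss.getQ(initialBelief, best.a);
```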
public Policy planFromState(State initialState)
Description copied from interface: Planner
This method will cause the Planner to begin planning from the specified initial State. It will then return an appropriate Policy object that captures the planning results. Note that typically you can use a variety of different Policy objects in conjunction with this Planner to get varying behavior, and the returned Policy is not required to be used.
Specified by:
planFromState in interface Planner
Parameters:
initialState - the initial state of the planning problem
Returns:
a Policy that captures the planning results from the input State
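Since the returned Policy is not required to be used, one alternative is to act greedily with respect to the planner's Q-function. A sketch assuming BURLAP's GreedyQPolicy (its exact constructor signature may differ across versions):

```java
// Plan, then define a greedy policy directly over the planner's Q-values
// instead of using the Policy object that planFromState returned.
bss.planFromState(initialBelief);
Policy greedy = new GreedyQPolicy(bss);
AbstractGroundedAction a = greedy.getAction(initialBelief);
```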
public void resetSolver()
Description copied from interface: MDPSolverInterface
This method resets all solver results so that a solver can be restarted fresh as if it had never solved the MDP.
Specified by:
resetSolver in interface MDPSolverInterface
Overrides:
resetSolver in class MDPSolver
public double value(State s)
Description copied from interface: ValueFunction
Returns the value function evaluation of the given state.
Specified by:
value in interface ValueFunction
Parameters:
s - the state to evaluate.

public static void main(java.lang.String[] args)