public class WeightedGreedy extends AStar
If a terminal function is provided via the setter method defined for OO-MDPs, the search algorithm will not expand any node whose state is terminal, as if no actions could be executed from that state. Note that terminal states are not necessarily goal states: there may be a fail condition from which the agent cannot act but which is not explicitly represented in the transition dynamics.
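The terminal-state behavior described above can be illustrated with a minimal, self-contained sketch. It does not use the BURLAP API: the class TerminalAwareSearch, the integer state encoding, the successor map, and the unit step cost are all hypothetical stand-ins chosen for illustration. The point is only that a terminal state is treated as having no applicable actions, while a goal test is still applied to it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.function.IntPredicate;

/** Hypothetical illustration only; not part of BURLAP. */
final class TerminalAwareSearch {

    /** Search node: a state, the cost accumulated to reach it, and a back pointer. */
    private static final class Node {
        final int state;
        final int cost;
        final Node parent;

        Node(int state, int cost, Node parent) {
            this.state = state;
            this.cost = cost;
            this.parent = parent;
        }
    }

    /**
     * Uniform-cost search over integer states with unit step cost (an assumption).
     * Returns the state sequence from start to a goal, or an empty list if none is found.
     */
    static List<Integer> plan(Map<Integer, int[]> successors, int start,
                              IntPredicate isGoal, IntPredicate isTerminal) {
        PriorityQueue<Node> open = new PriorityQueue<>((a, b) -> Integer.compare(a.cost, b.cost));
        Map<Integer, Integer> bestCost = new HashMap<>();
        open.add(new Node(start, 0, null));
        bestCost.put(start, 0);

        while (!open.isEmpty()) {
            Node n = open.poll();
            // The goal test comes first: a state may be both a goal and terminal.
            if (isGoal.test(n.state)) {
                List<Integer> path = new ArrayList<>();
                for (Node c = n; c != null; c = c.parent) {
                    path.add(c.state);
                }
                Collections.reverse(path);
                return path;
            }
            // Terminal states are not expanded, as if no action were applicable.
            if (isTerminal.test(n.state)) {
                continue;
            }
            for (int next : successors.getOrDefault(n.state, new int[0])) {
                int cost = n.cost + 1;
                Integer known = bestCost.get(next);
                if (known == null || cost < known) {
                    bestCost.put(next, cost);
                    open.add(new Node(next, cost, n));
                }
            }
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        Map<Integer, int[]> graph = new HashMap<>();
        graph.put(0, new int[] {1, 2});
        graph.put(1, new int[] {3});
        graph.put(2, new int[] {4});
        graph.put(4, new int[] {3});
        // State 1 is a non-goal terminal (fail) state, so the plan must avoid passing through it.
        System.out.println(plan(graph, 0, s -> s == 3, s -> s == 1)); // prints [0, 2, 4, 3]
    }
}
```

In this sketch, marking state 1 as terminal forces the longer route through states 2 and 4. If every route to the goal passes through a terminal state, the search returns an empty plan.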
DeterministicPlanner.PlanningFailedException
Modifier and Type | Field and Description |
---|---|
protected double | costWeight: The cost function weight. |
cumulatedRewardMap, heuristic, lastComputedCumR
gc, internalPolicy
actionTypes, debugCode, domain, gamma, hashingFactory, model, usingOptionModel
Constructor and Description |
---|
WeightedGreedy(SADomain domain, RewardFunction rf, StateConditionTest gc, HashableStateFactory hashingFactory, Heuristic heuristic, double costWeight): Initializes the valueFunction. |
Modifier and Type | Method and Description |
---|---|
double | computeF(PrioritizedSearchNode parentNode, Action generatingAction, HashableState successorState, double r): Returns the f-score for a state given the parent search node, the generating action, and the state that was produced. |
insertIntoOpen, postPlanPrep, prePlanPrep, updateOpen
planFromState
deterministicPlannerInit, encodePlanIntoPolicy, hasCachedPlanForState, planContainsOption, planHasDupilicateStates, querySelectedActionForState, resetSolver
addActionType, applicableActions, getActionTypes, getDebugCode, getDomain, getGamma, getHashingFactory, getModel, setActionTypes, setDebugCode, setDomain, setGamma, setHashingFactory, setModel, solverInit, stateHash, toggleDebugPrinting
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
addActionType, getActionTypes, getDebugCode, getDomain, getGamma, getHashingFactory, getModel, setActionTypes, setDebugCode, setDomain, setGamma, setHashingFactory, setModel, solverInit, toggleDebugPrinting
public WeightedGreedy(SADomain domain, RewardFunction rf, StateConditionTest gc, HashableStateFactory hashingFactory, Heuristic heuristic, double costWeight)
domain - the domain in which to plan
rf - the reward function that represents costs as negative reward
gc - should evaluate to true for goal states; false otherwise
hashingFactory - the state hashing factory to use
heuristic - the planning heuristic. Should return non-positive values.
costWeight - a fraction 0 <= w <= 1. When w = 0, the search is fully greedy. When w = 1, the search is optimal and equivalent to A*.

public double computeF(PrioritizedSearchNode parentNode, Action generatingAction, HashableState successorState, double r)
Description copied from class: BestFirst
Overrides: computeF in class AStar
parentNode - the parent search node (and its priority) from which the next state was generated.
generatingAction - the action that was used to generate the next state.
successorState - the next state that was generated
r - the reward received for the transition
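
The role of costWeight in the f-score can be sketched as follows. This is an assumption inferred from the parameter descriptions above, not a copy of the BURLAP source: it takes f = w * g + h, where g is the cumulative (non-positive) reward along the path to the successor and h is the non-positive heuristic value. The class WeightedFScore and its method f are hypothetical names, and plain doubles stand in for the search node, state, and heuristic objects.

```java
/** Hypothetical illustration of a cost-weighted f-score; not the BURLAP implementation. */
final class WeightedFScore {

    /**
     * Computes f = w * g + h under the assumptions stated in the lead-in.
     *
     * @param parentCumulativeReward sum of rewards on the path to the parent node (non-positive)
     * @param r                      reward received for the transition into the successor
     * @param h                      heuristic value of the successor state (non-positive)
     * @param costWeight             the weight w, expected to satisfy 0 <= w <= 1
     */
    static double f(double parentCumulativeReward, double r, double h, double costWeight) {
        if (costWeight < 0.0 || costWeight > 1.0) {
            throw new IllegalArgumentException("costWeight must be in [0, 1]");
        }
        // g: cumulative reward (negative cost) of the path ending at the successor.
        double cumulativeReward = parentCumulativeReward + r;
        return costWeight * cumulativeReward + h;
    }

    public static void main(String[] args) {
        // w = 0: the path cost is ignored and only the heuristic matters (fully greedy).
        System.out.println(f(-3.0, -1.0, -5.0, 0.0)); // prints -5.0
        // w = 1: the score is g + h, the usual A* form.
        System.out.println(f(-3.0, -1.0, -5.0, 1.0)); // prints -9.0
    }
}
```

Under this assumed form, intermediate weights trade plan quality against the number of expanded nodes: the smaller w is, the more the ordering of the open list is driven by the heuristic alone.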