public class WeightedGreedy extends AStar
If a terminal function is provided via the setter method defined for OO-MDPs, then the search algorithm will not expand any nodes that are terminal states, as if there were no actions that could be executed from that state. Note that terminal states are not necessarily goal states: a terminal state may be a failure condition from which the agent cannot act, even though that condition is not explicitly represented in the transition dynamics.
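The effect of a terminal function on node expansion can be illustrated with a minimal, self-contained sketch. It does not use the library's classes: the class `TerminalAwareSearch`, the method `expansionOrder`, and the integer-labelled graph are hypothetical, and a plain FIFO open list stands in for the prioritized one so that only the terminal-state rule is on display.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.IntPredicate;

public class TerminalAwareSearch {

    /** Hypothetical toy graph: state 4 is the goal, state 1 is a failure state. */
    static Map<Integer, List<Integer>> demoGraph() {
        return Map.of(
                0, List.of(1, 2),
                1, List.of(3),
                2, List.of(4),
                3, List.of(4));
    }

    /**
     * Returns the states that were expanded, in order, before the goal was reached.
     * Terminal states are taken off the open list but never expanded, as if no
     * action were applicable in them.
     */
    static List<Integer> expansionOrder(Map<Integer, List<Integer>> successors,
                                        int start,
                                        IntPredicate isGoal,
                                        IntPredicate isTerminal) {
        List<Integer> expanded = new ArrayList<>();
        Deque<Integer> open = new ArrayDeque<>();
        Set<Integer> seen = new HashSet<>();
        open.add(start);
        seen.add(start);
        while (!open.isEmpty()) {
            int s = open.poll();
            if (isGoal.test(s)) {
                break; // goal found, stop searching
            }
            if (isTerminal.test(s)) {
                continue; // terminal but not a goal: do not generate successors
            }
            expanded.add(s);
            for (int next : successors.getOrDefault(s, List.of())) {
                if (seen.add(next)) {
                    open.add(next);
                }
            }
        }
        return expanded;
    }

    public static void main(String[] args) {
        // With state 1 marked terminal, its successor 3 is never generated.
        System.out.println(expansionOrder(demoGraph(), 0, s -> s == 4, s -> s == 1)); // [0, 2]
        // Without a terminal function, state 1 is expanded like any other state.
        System.out.println(expansionOrder(demoGraph(), 0, s -> s == 4, s -> false)); // [0, 1, 2, 3]
    }
}
```

In this toy graph the failure state 1 is terminal but is not the goal, which is the distinction the note above draws.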
|Modifier and Type||Field and Description|
The cost function weight.
cumulatedRewardMap, heuristic, lastComputedCumR
actionTypes, debugCode, domain, gamma, hashingFactory, model, usingOptionModel
|Constructor and Description|
Initializes the planner.
|Modifier and Type||Method and Description|
Returns the f-score for a state, given the parent search node, the generating action, and the state that was produced.
insertIntoOpen, postPlanPrep, prePlanPrep, updateOpen
deterministicPlannerInit, encodePlanIntoPolicy, hasCachedPlanForState, planContainsOption, planHasDupilicateStates, querySelectedActionForState, resetSolver
addActionType, applicableActions, getActionTypes, getDebugCode, getDomain, getGamma, getHashingFactory, getModel, setActionTypes, setDebugCode, setDomain, setGamma, setHashingFactory, setModel, solverInit, stateHash, toggleDebugPrinting
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
addActionType, getActionTypes, getDebugCode, getDomain, getGamma, getHashingFactory, getModel, setActionTypes, setDebugCode, setDomain, setGamma, setHashingFactory, setModel, solverInit, toggleDebugPrinting
public WeightedGreedy(SADomain domain, RewardFunction rf, StateConditionTest gc, HashableStateFactory hashingFactory, Heuristic heuristic, double costWeight)
domain - the domain in which to plan
rf - the reward function that represents costs as negative reward
gc - should evaluate to true for goal states; false otherwise
hashingFactory - the state hashing factory to use
heuristic - the planning heuristic. Should return non-positive values.
costWeight - a fraction 0 <= w <= 1. When w = 0, the search is fully greedy. When w = 1, the search is optimal and equivalent to A*.
public double computeF(PrioritizedSearchNode parentNode, Action generatingAction, HashableState successorState, double r)
parentNode - the parent search node (and its priority) from which the next state was generated
generatingAction - the action that was used to generate the next state
successorState - the next state that was generated
r - the reward received for the transition
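As a rough illustration of how the cost weight can shape the f-score, the sketch below assumes the score has the form w * (cumulative reward) + heuristic, with costs expressed as negative reward and higher scores preferred. That form is consistent with the costWeight description (w = 0 is purely heuristic-driven, w = 1 is A*), but it is an assumption and is not taken from the library source. The class `WeightedFSketch` and its plain `double` parameters are hypothetical stand-ins for the search node, state, and heuristic objects.

```java
public class WeightedFSketch {

    /**
     * Hypothetical f-score under the assumed form f = w * g + h.
     *
     * @param parentCumR cumulative reward up to the parent node (non-positive)
     * @param r          reward received for the transition (non-positive)
     * @param h          heuristic value of the successor state (non-positive)
     * @param costWeight the weight w in [0, 1] applied to the cumulative reward
     */
    static double computeF(double parentCumR, double r, double h, double costWeight) {
        double cumR = parentCumR + r; // cumulative reward of the successor
        return costWeight * cumR + h;
    }

    public static void main(String[] args) {
        // Successor A: cheap path so far, but estimated to be far from the goal.
        // Successor B: expensive path so far, but estimated to be close to the goal.
        for (double w : new double[] {1.0, 0.5, 0.0}) {
            double fA = computeF(-1.0, -1.0, -5.0, w);
            double fB = computeF(-6.0, -1.0, -2.0, w);
            String preferred = fA > fB ? "A" : "B";
            System.out.println("w=" + w + " fA=" + fA + " fB=" + fB + " -> prefers " + preferred);
        }
    }
}
```

With these made-up numbers, w = 1 prefers A (f of -7 versus -9), while w = 0.5 and w = 0 prefer B: lowering the weight shifts the search toward whatever the heuristic rates as closest to the goal, regardless of the cost already paid.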