PrioritizedSweeping

java.lang.Object
- burlap.behavior.singleagent.planning.OOMDPPlanner
- - burlap.behavior.singleagent.planning.ValueFunctionPlanner
  - - burlap.behavior.singleagent.planning.stochastic.valueiteration.ValueIteration
    - - burlap.behavior.singleagent.planning.stochastic.valueiteration.PrioritizedSweeping

All Implemented Interfaces:

QComputablePlanner, ValueFunction
```
public class PrioritizedSweeping
extends ValueIteration
```
An implementation of Prioritized Sweeping as a DP planning algorithm, as described by Li and Littman [1]. This class performs Bellman updates on states according to their position in a priority queue. The priority of a state is updated with respect to the change in the Bellman error of a state to which it transitions. Because the backward transition dynamics must be stored, this algorithm uses more memory than standard VI. Managing the priority queue takes C*lg(N) time at each step, where C is the number of back pointers per state and N is the number of states in the queue, but if the ordering of the states yields large gains, this cost may be worth it.

1. Li, Lihong, and Michael L. Littman. Prioritized sweeping converges to the optimal value function. Tech. Rep. DCS-TR-631, 2008.
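
The description above can be made concrete with a small tabular sketch. This is not BURLAP's implementation: it assumes a dense transition array `p[s][a][t]`, a reward array `r[s][a]`, and `java.util.PriorityQueue` with lazy removal of stale entries in place of `HashIndexedHeap`. The class name, the `plan` and `demo` methods, and the four-state chain are all hypothetical and exist only for illustration. The priority pushed to a predecessor is its maximum transition probability times the value change of the successor, which is one plausible reading of the text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

/** Tabular prioritized sweeping sketch; an illustration, not BURLAP's code. */
final class PrioritizedSweepingSketch {

    /** Queue entry; an entry whose priority no longer matches is stale. */
    private static final class Entry {
        final int state;
        final double priority;
        Entry(int state, double priority) { this.state = state; this.priority = priority; }
    }

    /** One Bellman backup: max over actions of r + gamma * expected next value. */
    static double backup(double[][][] p, double[][] r, double[] v, double gamma, int s) {
        double best = Double.NEGATIVE_INFINITY;
        for (int a = 0; a < p[s].length; a++) {
            double q = r[s][a];
            for (int t = 0; t < v.length; t++) q += gamma * p[s][a][t] * v[t];
            best = Math.max(best, q);
        }
        return best;
    }

    /** Assumed semantics: stop below maxDelta; maxBackups == -1 means no limit. */
    static double[] plan(double[][][] p, double[][] r, boolean[] terminal,
                         double gamma, double maxDelta, int maxBackups) {
        int n = p.length;
        double[] v = new double[n];
        double[] priority = new double[n];

        // Back pointers: back.get(t) holds {s, max_a p[s][a][t]} per predecessor s.
        List<List<double[]>> back = new ArrayList<>();
        for (int t = 0; t < n; t++) back.add(new ArrayList<>());
        for (int s = 0; s < n; s++) {
            if (terminal[s]) continue;
            for (int t = 0; t < n; t++) {
                double w = 0.0;
                for (double[] row : p[s]) w = Math.max(w, row[t]);
                if (w > 0.0) back.get(t).add(new double[] {s, w});
            }
        }

        // Max-first queue seeded with each state's initial Bellman error.
        PriorityQueue<Entry> queue =
                new PriorityQueue<>((x, y) -> Double.compare(y.priority, x.priority));
        for (int s = 0; s < n; s++) {
            if (terminal[s]) continue;
            priority[s] = Math.abs(backup(p, r, v, gamma, s) - v[s]);
            queue.add(new Entry(s, priority[s]));
        }

        int backups = 0;
        while (!queue.isEmpty() && (maxBackups == -1 || backups < maxBackups)) {
            Entry e = queue.poll();
            if (e.priority != priority[e.state]) continue; // stale entry, skip it
            if (e.priority < maxDelta) break;              // nothing left worth updating
            int s = e.state;
            double old = v[s];
            v[s] = backup(p, r, v, gamma, s);
            backups++;
            priority[s] = 0.0;
            double delta = Math.abs(v[s] - old);
            for (double[] bp : back.get(s)) {
                int pred = (int) bp[0];
                double candidate = bp[1] * delta;
                if (candidate > priority[pred]) {
                    priority[pred] = candidate;
                    queue.add(new Entry(pred, candidate));
                }
            }
        }
        return v;
    }

    /** Hypothetical 4-state chain: action 0 moves left, action 1 moves right. */
    static double[] demo() {
        int n = 4;
        double[][][] p = new double[n][2][n];
        double[][] r = new double[n][2];
        boolean[] terminal = new boolean[n];
        terminal[n - 1] = true;
        for (int s = 0; s < n - 1; s++) {
            p[s][0][Math.max(0, s - 1)] = 1.0;
            p[s][1][s + 1] = 1.0;
        }
        r[n - 2][1] = 1.0; // reward for stepping into the terminal state
        return plan(p, r, terminal, 0.9, 1e-6, -1);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo()));
    }
}
```

In the chain example only the state next to the goal has a nonzero initial Bellman error, so the sweep starts there and works backward through the stored predecessors instead of touching every state on every pass.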

Author:

James MacGlashan

Nested Class Summary

Nested Classes
Modifier and Type	Class and Description
`protected class`	`PrioritizedSweeping.BPTR` A back pointer and its max action probability of transition.
`protected class`	`PrioritizedSweeping.BPTRNode` A node for a state that contains a list of its back pointers, their max probability of transition to this state, and the priority of this node's state.
`protected static class`	`PrioritizedSweeping.BPTRNodeComparator` Comparator for the priority of BPTRNodes

Nested classes/interfaces inherited from class burlap.behavior.singleagent.planning.ValueFunctionPlanner
ValueFunctionPlanner.StaticVFPlanner

Nested classes/interfaces inherited from interface burlap.behavior.singleagent.planning.QComputablePlanner
QComputablePlanner.QComputablePlannerHelper

Field Summary

Fields
Modifier and Type	Field and Description
`protected int`	`maxBackups` The maximum number of Bellman backups permitted
`protected HashIndexedHeap<PrioritizedSweeping.BPTRNode>`	`priorityNodes` The priority queue of states

Fields inherited from class burlap.behavior.singleagent.planning.stochastic.valueiteration.ValueIteration
foundReachableStates, hasRunVI, maxDelta, maxIterations, stopReachabilityFromTerminalStates

Fields inherited from class burlap.behavior.singleagent.planning.ValueFunctionPlanner
transitionDynamics, useCachedTransitions, valueFunction, valueInitializer

Fields inherited from class burlap.behavior.singleagent.planning.OOMDPPlanner
actions, containsParameterizedActions, debugCode, domain, gamma, hashingFactory, mapToStateIndex, rf, tf

Constructor Summary

Constructors
Constructor and Description
`PrioritizedSweeping(Domain domain, RewardFunction rf, TerminalFunction tf, double gamma, StateHashFactory hashingFactory, double maxDelta, int maxBackups)` Initializes the planner.

Method Summary

Methods
Modifier and Type	Method and Description
`protected PrioritizedSweeping.BPTRNode`	`getNodeFor(StateHashTuple sh)` Returns the priority back pointer node for the given hashed state, creating and storing it first if none exists
`boolean`	`performReachabilityFrom(State si)` This method will find all reachable states that will be used by the `ValueIteration.runVI()` method and will cache all the transition dynamics.
`void`	`planFromState(State initialState)` This method will cause the planner to begin planning from the specified initial state
`void`	`runVI()` Runs VI until the specified termination conditions are met.

Methods inherited from class burlap.behavior.singleagent.planning.stochastic.valueiteration.ValueIteration
recomputeReachableStates, resetPlannerResults, toggleReachabiltiyTerminalStatePruning

Methods inherited from class burlap.behavior.singleagent.planning.ValueFunctionPlanner
computeQ, computeQ, getActionsTransitions, getAllStates, getCopyOfValueFunction, getDefaultValue, getQ, getQ, getQs, getValueFunctionInitialization, hasComputedValueFor, initializeOptionsForExpectationComputations, performBellmanUpdateOn, performBellmanUpdateOn, performFixedPolicyBellmanUpdateOn, performFixedPolicyBellmanUpdateOn, setValueFunctionInitialization, toggleUseCachedTransitionDynamics, value, value, VFPInit

Methods inherited from class burlap.behavior.singleagent.planning.OOMDPPlanner
addNonDomainReferencedAction, getActions, getAllGroundedActions, getDebugCode, getDomain, getGamma, getHashingFactory, getRf, getRF, getTf, getTF, plannerInit, setActions, setDebugCode, setDomain, setGamma, setRf, setTf, stateHash, toggleDebugPrinting, translateAction

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - priorityNodes
```
protected HashIndexedHeap<PrioritizedSweeping.BPTRNode> priorityNodes
```
    The priority queue of states
  - maxBackups
```
protected int maxBackups
```
    The maximum number of Bellman backups permitted
- Constructor Detail
  - PrioritizedSweeping
```
public PrioritizedSweeping(Domain domain,
                   RewardFunction rf,
                   TerminalFunction tf,
                   double gamma,
                   StateHashFactory hashingFactory,
                   double maxDelta,
                   int maxBackups)
```
    Initializes the planner.
    
    Parameters:
    domain - the domain in which to plan
    rf - the reward function
    tf - the terminal state function
    gamma - the discount factor
    hashingFactory - the state hashing factory to use
    maxDelta - when the maximum change in the value function is smaller than this value, VI will terminate.
    maxBackups - the maximum number of Bellman backups. If set to -1, then there is no hard limit.
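
The two stopping parameters can be read as a single predicate. The sketch below is an assumption about how they combine (stop when either condition holds); the class and method names are hypothetical and are not part of BURLAP.

```java
/** Hypothetical helper showing one reading of maxDelta and maxBackups. */
final class SweepTermination {

    /** Sentinel taken from the parameter description: -1 means no hard limit. */
    static final int NO_LIMIT = -1;

    /**
     * @param largestChange largest value-function change seen in the latest step
     * @param backupsDone   number of Bellman backups performed so far
     */
    static boolean shouldStop(double largestChange, int backupsDone,
                              double maxDelta, int maxBackups) {
        boolean converged = largestChange < maxDelta;
        boolean outOfBudget = maxBackups != NO_LIMIT && backupsDone >= maxBackups;
        return converged || outOfBudget;
    }
}
```

With `maxBackups` set to -1 only the `maxDelta` test can end planning, so a very small `maxDelta` on a large domain may run for a long time.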
- Method Detail
  - planFromState
```
public void planFromState(State initialState)
```
    Description copied from class: OOMDPPlanner
    
    This method will cause the planner to begin planning from the specified initial state
    
    Overrides:
    
    planFromState in class ValueIteration
    
    Parameters:
    initialState - the initial state of the planning problem
  - runVI
```
public void runVI()
```
    Description copied from class: ValueIteration
    
    Runs VI until the specified termination conditions are met. In general, this method should only be called indirectly through the ValueIteration.planFromState(State) method. ValueIteration.performReachabilityFrom(State) must have been called at least once beforehand, or a runtime exception will be thrown. The ValueIteration.planFromState(State) method automatically calls ValueIteration.performReachabilityFrom(State) first, and then this method, if it has not already been run.
    
    Overrides:
    
    runVI in class ValueIteration
  - performReachabilityFrom
```
public boolean performReachabilityFrom(State si)
```
    Description copied from class: ValueIteration
    
    This method will find all reachable states that will be used by the ValueIteration.runVI() method and will cache all the transition dynamics. This method will not do anything if all reachable states from the input state have been discovered from previous calls to this method.
    
    Overrides:
    
    performReachabilityFrom in class ValueIteration
    
    Parameters:
    si - the source state from which all reachable states will be found
    
    Returns:
    true if a reachability analysis had never been performed from this state; false otherwise.
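
The contract described here (expand everything reachable once, record transitions, report whether any work was needed) can be sketched with a breadth-first search. This is a generic illustration, not BURLAP's code: the `successors` function stands in for the domain's transition model, and recording predecessors is an assumption about what a prioritized sweeping override would cache as back pointers.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical reachability cache that also records predecessor sets. */
final class ReachabilitySketch<S> {
    private final Function<S, Collection<S>> successors; // stand-in transition model
    private final Set<S> known = new HashSet<>();
    private final Map<S, Set<S>> predecessors = new HashMap<>();

    ReachabilitySketch(Function<S, Collection<S>> successors) {
        this.successors = successors;
    }

    /** Returns true if new work was done, false if the source was already expanded. */
    boolean performReachabilityFrom(S source) {
        if (known.contains(source)) return false; // every known state is fully expanded
        Deque<S> open = new ArrayDeque<>();
        known.add(source);
        open.add(source);
        while (!open.isEmpty()) {
            S s = open.poll();
            for (S next : successors.apply(s)) {
                predecessors.computeIfAbsent(next, k -> new HashSet<>()).add(s);
                if (known.add(next)) open.add(next); // enqueue only unseen states
            }
        }
        return true;
    }

    Set<S> reachable() {
        return Collections.unmodifiableSet(known);
    }

    Set<S> predecessorsOf(S s) {
        return predecessors.getOrDefault(s, Collections.emptySet());
    }
}
```

Because every state added to `known` is also expanded, a later call from any already-known state can return immediately, which matches the "will not do anything" behavior described above.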
  - getNodeFor
```
protected PrioritizedSweeping.BPTRNode getNodeFor(StateHashTuple sh)
```
    Returns the priority back pointer node for the given hashed state, creating and storing it first if none exists
    
    Parameters:
    sh - the hashed state for which its node should be returned.
    
    Returns:
    a priority back pointer node for the given hashed state
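
The get-or-create behavior, together with a priority comparator in the spirit of `BPTRNodeComparator`, can be sketched as follows. The types here are hypothetical simplifications: a generic key replaces `StateHashTuple`, a plain `HashMap` replaces `HashIndexedHeap`, and the ascending order of the comparator is an assumption.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical index that hands out exactly one node per key. */
final class NodeIndexSketch<K> {

    /** Simplified node: a key and a mutable priority, without back pointers. */
    static final class Node<K> {
        final K key;
        double priority;
        Node(K key) { this.key = key; }
    }

    /** Orders nodes by ascending priority; reverse it for a max-first queue. */
    static <K> Comparator<Node<K>> byPriority() {
        return Comparator.comparingDouble((Node<K> n) -> n.priority);
    }

    private final Map<K, Node<K>> nodes = new HashMap<>();

    /** Returns the stored node for the key, creating and storing it if absent. */
    Node<K> getNodeFor(K key) {
        return nodes.computeIfAbsent(key, k -> new Node<>(k));
    }

    int size() {
        return nodes.size();
    }
}
```

Returning the same node instance for repeated lookups matters because priorities are updated in place; a fresh node per call would lose earlier updates.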
