PrioritizedSweeping

java.lang.Object
- burlap.behavior.singleagent.MDPSolver
- - burlap.behavior.singleagent.planning.stochastic.DynamicProgramming
  - - burlap.behavior.singleagent.planning.stochastic.valueiteration.ValueIteration
    - - burlap.behavior.singleagent.planning.stochastic.valueiteration.PrioritizedSweeping

All Implemented Interfaces:

MDPSolverInterface, Planner, QFunction, QProvider, ValueFunction
```
public class PrioritizedSweeping
extends ValueIteration
```
An implementation of Prioritized Sweeping as a DP planning algorithm, as described by Li and Littman [1]. This class performs Bellman updates on states according to their position in a priority queue. The priority of a state is updated with respect to the change in the Bellman error of a state to which it transitions. As a result, this algorithm uses more memory than standard VI, because the backward transition dynamics must be stored. The priority queue takes C*lg(N) time to manage at each step, where C is the number of back pointers per state; if the ordering of the states yields large gains, this cost may be worth it.

1. Li, Lihong, and Michael L. Littman. Prioritized sweeping converges to the optimal value function. Tech. Rep. DCS-TR-631, 2008.
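The queue-driven update order described above can be illustrated with a small, self-contained sketch. This is not BURLAP's implementation: the class `PrioritizedSweepingSketch`, the `Outcome` type, the array-based model, and the dense back-pointer table are all assumptions made for brevity. It only borrows the meaning of `gamma`, `maxDelta`, and `maxBackups` (with -1 meaning no limit) from the constructor documented on this page.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

/** Illustrative sketch only; none of these names are part of the BURLAP API. */
class PrioritizedSweepingSketch {

    /** One stochastic outcome of an action: next state, probability, reward. */
    static final class Outcome {
        final int next; final double prob; final double reward;
        Outcome(int next, double prob, double reward) {
            this.next = next; this.prob = prob; this.reward = reward;
        }
    }

    /** Bellman backup for one state; a state with no actions is treated as terminal. */
    static double bellman(Outcome[][] actions, double[] v, double gamma) {
        if (actions.length == 0) return 0.0;
        double best = Double.NEGATIVE_INFINITY;
        for (Outcome[] action : actions) {
            double q = 0.0;
            for (Outcome o : action) q += o.prob * (o.reward + gamma * v[o.next]);
            best = Math.max(best, q);
        }
        return best;
    }

    /** model[s][a] lists the outcomes of action a in state s. */
    static double[] plan(Outcome[][][] model, double gamma, double maxDelta, int maxBackups) {
        int n = model.length;
        double[] v = new double[n];
        // backProb[s][p]: max over actions of P(s | p, a). Dense here for brevity (assumption);
        // this table is the extra memory for backward dynamics that plain VI does not need.
        double[][] backProb = new double[n][n];
        for (int p = 0; p < n; p++)
            for (Outcome[] action : model[p])
                for (Outcome o : action)
                    backProb[o.next][p] = Math.max(backProb[o.next][p], o.prob);

        double[] priority = new double[n];
        // Max-heap of {state, priority}; outdated entries are skipped when polled.
        PriorityQueue<double[]> queue = new PriorityQueue<>((x, y) -> Double.compare(y[1], x[1]));
        for (int s = 0; s < n; s++) {
            priority[s] = Double.MAX_VALUE; // every state gets at least one backup
            queue.add(new double[]{s, priority[s]});
        }
        int backups = 0;
        while (!queue.isEmpty() && (maxBackups == -1 || backups < maxBackups)) {
            double[] entry = queue.poll();
            int s = (int) entry[0];
            if (entry[1] != priority[s]) continue; // stale entry
            if (priority[s] < maxDelta) break;     // largest pending priority is below threshold
            priority[s] = 0.0;
            double old = v[s];
            v[s] = bellman(model[s], v, gamma);
            backups++;
            double delta = Math.abs(v[s] - old);
            // Raise the priority of each predecessor in proportion to how likely it reaches s.
            for (int p = 0; p < n; p++) {
                double candidate = backProb[s][p] * delta;
                if (candidate > priority[p]) {
                    priority[p] = candidate;
                    queue.add(new double[]{p, candidate});
                }
            }
        }
        return v;
    }

    public static void main(String[] args) {
        Outcome[][][] chain = {
            { { new Outcome(1, 1.0, 0.0) } }, // state 0 moves to state 1, reward 0
            { { new Outcome(2, 1.0, 1.0) } }, // state 1 moves to state 2, reward 1
            { }                               // state 2 is terminal
        };
        System.out.println(Arrays.toString(plan(chain, 0.9, 1e-6, -1)));
    }
}
```

The sketch uses `java.util.PriorityQueue` with lazy deletion of stale entries, whereas this class keeps its nodes in a `HashIndexedHeap`; the lazy approach is simply the shortest way to show the idea with the standard library.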

Author:

James MacGlashan

Nested Class Summary

Nested Classes
Modifier and Type	Class and Description
`protected class`	`PrioritizedSweeping.BPTR` A back pointer and its max action probability of transition.
`protected class`	`PrioritizedSweeping.BPTRNode` A node for a state that contains a list of its back pointers, their max probability of transition to this state, and the priority of this node's state.
`protected static class`	`PrioritizedSweeping.BPTRNodeComparator` Comparator for the priority of BPTRNodes

Nested classes/interfaces inherited from interface burlap.behavior.valuefunction.QProvider
QProvider.Helper

Field Summary

Fields
Modifier and Type	Field and Description
`protected int`	`maxBackups` The maximum number of Bellman backups permitted
`protected HashIndexedHeap<PrioritizedSweeping.BPTRNode>`	`priorityNodes` The priority queue of states

Fields inherited from class burlap.behavior.singleagent.planning.stochastic.valueiteration.ValueIteration
foundReachableStates, hasRunVI, maxDelta, maxIterations, stopReachabilityFromTerminalStates

Fields inherited from class burlap.behavior.singleagent.planning.stochastic.DynamicProgramming
operator, valueFunction, valueInitializer

Fields inherited from class burlap.behavior.singleagent.MDPSolver
actionTypes, debugCode, domain, gamma, hashingFactory, model, usingOptionModel

Constructor Summary

Constructors
Constructor and Description
`PrioritizedSweeping(SADomain domain, double gamma, HashableStateFactory hashingFactory, double maxDelta, int maxBackups)` Initializes the planner with the given domain, discount factor, hashing factory, and termination conditions.

Method Summary

All Methods Instance Methods Concrete Methods
Modifier and Type	Method and Description
`protected PrioritizedSweeping.BPTRNode`	`getNodeFor(HashableState sh)` Returns the priority back pointer node for the given hashed state, creating and storing it first if it does not already exist.
`boolean`	`performReachabilityFrom(State si)` This method will find all reachable states that will be used by the `ValueIteration.runVI()` method and will cache all the transition dynamics.
`void`	`runVI()` Runs VI until the specified termination conditions are met.

Methods inherited from class burlap.behavior.singleagent.planning.stochastic.valueiteration.ValueIteration
planFromState, recomputeReachableStates, resetSolver, toggleReachabiltiyTerminalStatePruning

Methods inherited from class burlap.behavior.singleagent.planning.stochastic.DynamicProgramming
computeQ, DPPInit, getAllStates, getCopyOfValueFunction, getDefaultValue, getModel, getOperator, getValueFunctionInitialization, hasComputedValueFor, loadValueTable, performBellmanUpdateOn, performBellmanUpdateOn, performFixedPolicyBellmanUpdateOn, performFixedPolicyBellmanUpdateOn, qValue, qValues, setOperator, setValueFunctionInitialization, value, value, writeValueTable

Methods inherited from class burlap.behavior.singleagent.MDPSolver
addActionType, applicableActions, getActionTypes, getDebugCode, getDomain, getGamma, getHashingFactory, setActionTypes, setDebugCode, setDomain, setGamma, setHashingFactory, setModel, solverInit, stateHash, toggleDebugPrinting

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface burlap.behavior.singleagent.MDPSolverInterface
addActionType, getActionTypes, getDebugCode, getDomain, getGamma, getHashingFactory, getModel, setActionTypes, setDebugCode, setDomain, setGamma, setHashingFactory, setModel, solverInit, toggleDebugPrinting

- Field Detail
  - priorityNodes
```
protected HashIndexedHeap<PrioritizedSweeping.BPTRNode> priorityNodes
```
    The priority queue of states
  - maxBackups
```
protected int maxBackups
```
    The maximum number of Bellman backups permitted
- Constructor Detail
  - PrioritizedSweeping
```
public PrioritizedSweeping(SADomain domain,
                           double gamma,
                           HashableStateFactory hashingFactory,
                           double maxDelta,
                           int maxBackups)
```
    Initializes the planner with the given domain, discount factor, hashing factory, and termination conditions.
    
    Parameters:
    
    domain - the domain in which to plan
    
    gamma - the discount factor
    
    hashingFactory - the state hashing factory to use
    
    maxDelta - when the maximum change in the value function is smaller than this value, VI will terminate.
    
    maxBackups - the maximum number of Bellman backups. If set to -1, then there is no hard limit.
- Method Detail
  - runVI
```
public void runVI()
```
    Description copied from class: ValueIteration
    
    Runs VI until the specified termination conditions are met. In general, this method should only be called indirectly through the ValueIteration.planFromState(State) method. The ValueIteration.performReachabilityFrom(State) must have been performed at least once in the past or a runtime exception will be thrown. The ValueIteration.planFromState(State) method will automatically call the ValueIteration.performReachabilityFrom(State) method first and then this if it hasn't been run.
    
    Overrides:
    
    runVI in class ValueIteration
  - performReachabilityFrom
```
public boolean performReachabilityFrom(State si)
```
    Description copied from class: ValueIteration
    
    This method will find all reachable states that will be used by the ValueIteration.runVI() method and will cache all the transition dynamics. This method will not do anything if all reachable states from the input state have been discovered from previous calls to this method.
    
    Overrides:
    
    performReachabilityFrom in class ValueIteration
    
    Parameters:
    
    si - the source state from which all reachable states will be found
    
    Returns:
    
    true if a reachability analysis had never been performed from this state; false otherwise.
  - getNodeFor
```
protected PrioritizedSweeping.BPTRNode getNodeFor(HashableState sh)
```
    Returns the priority back pointer node for the given hashed state, creating and storing it first if it does not already exist.
    
    Parameters:
    
    sh - the hashed state for which its node should be returned.
    
    Returns:
    
    a priority back pointer node for the given hashed state
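    The "return the existing node, or create and store one" behavior described for getNodeFor is a get-or-create lookup. A minimal sketch of that pattern follows. It is an assumption-laden illustration, not this class's code: `NodeRegistrySketch`, its `Node` type, and the use of a `String` key in place of a `HashableState` are all hypothetical, and the initial priority is a placeholder value.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative get-or-create registry; names here are hypothetical, not BURLAP API. */
class NodeRegistrySketch {

    /** Stand-in for a back pointer node: holds a state key and a priority. */
    static final class Node {
        final String stateKey;
        double priority = 0.0; // placeholder initial priority (assumption)
        Node(String stateKey) { this.stateKey = stateKey; }
    }

    private final Map<String, Node> nodes = new HashMap<>();

    /** Returns the stored node for the key, creating and storing it on first use. */
    Node getNodeFor(String stateKey) {
        return nodes.computeIfAbsent(stateKey, Node::new);
    }

    int size() { return nodes.size(); }
}
```

    Because repeated lookups return the same object, a priority written through one reference is visible to every other holder of that node.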
