BFSMarkovOptionModel

java.lang.Object
- burlap.behavior.singleagent.options.model.BFSMarkovOptionModel

All Implemented Interfaces:

FullModel, SampleModel

Direct Known Subclasses:

BFSNonMarkovOptionModel
```
public class BFSMarkovOptionModel
extends java.lang.Object
implements FullModel
```
A model that computes, and caches, the transition model of a Markov option from a source SampleModel. The source model must be a FullModel if the transitions(State, Action) method is to be used. Note that the transition model for an option is a multi-time model, which means the state transition probabilities factor in the discount factor. That is, P(s' | s, o) = \sum_{k=1}^{\infty} p(s', k | s, o) \gamma^k, where p(s', k | s, o) is the probability that the agent will terminate in state s' after k steps, given that option o was initiated in state s.
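To make the multi-time sum concrete, here is a minimal, self-contained sketch that evaluates it for a single termination state, given the per-step termination probabilities. The class and method names are hypothetical and are not part of BURLAP, and the sketch assumes the step count starts at k = 1 (the option runs for at least one step).

```java
class MultiTimeModelSketch {

    /**
     * Hypothetical helper, not a BURLAP method.
     * stepProbs[i] holds p(s', k | s, o) for k = i + 1, i.e. the probability
     * that the option terminates in s' after exactly k steps.
     * Returns the sum over k of p(s', k | s, o) * gamma^k.
     */
    static double discountedTerminationProb(double[] stepProbs, double gamma) {
        double total = 0.0;
        double discountPower = 1.0;
        for (double p : stepProbs) {
            discountPower *= gamma;      // gamma^k for k = 1, 2, ...
            total += p * discountPower;  // add p(s', k | s, o) * gamma^k
        }
        return total;
    }

    public static void main(String[] args) {
        // Terminates in s' after 1, 2, or 3 steps with probability 0.5, 0.25, 0.25.
        double[] stepProbs = {0.5, 0.25, 0.25};
        System.out.println(discountedTerminationProb(stepProbs, 0.9));
    }
}
```

Because each term is scaled by a power of the discount factor, the resulting values sum to less than 1 over all termination states whenever the discount is below 1, so they are discounted weights rather than a proper probability distribution.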
Computing the transition model can be quite expensive, particularly for stochastic domains, so ideally you should consider a custom implementation of your option model. The computation runs a BFS-like algorithm from the input state, following the option policy to the possible option (or environment) termination states. The BFS expansion stops once the explored trajectories account for a minimum threshold of the probability mass of all possible trajectories under the policy (0.999 by default). You can lower this threshold with the method setMinProb(double) to reduce computation time. When the threshold is lowered, the computed probabilities are normalized by the amount of trajectory probability mass that was actually computed, giving an estimated option transition model.
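The following is a rough, self-contained sketch of a BFS-style expansion with a probability-mass threshold and final normalization, run on a toy chain with integer states. It is not BURLAP's implementation: the class, the `Node` type, the `policyTransitions` map (next-state probabilities when following the option policy), and the stopping check are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

class BfsOptionModelSketch {

    /** A frontier entry: a state reached with some probability after some number of steps. */
    static final class Node {
        final int state;
        final double prob;
        final int steps;
        Node(int state, double prob, int steps) {
            this.state = state;
            this.prob = prob;
            this.steps = steps;
        }
    }

    /**
     * Hypothetical helper. Returns, for each termination state reached, its
     * discounted probability divided by the trajectory mass that was explored.
     */
    static Map<Integer, Double> discountedTerminations(
            int start,
            Map<Integer, Map<Integer, Double>> policyTransitions,
            Set<Integer> terminationStates,
            double gamma,
            double minProb) {

        Map<Integer, Double> discounted = new HashMap<>();
        double massFound = 0.0; // undiscounted mass of trajectories that have terminated

        Deque<Node> frontier = new ArrayDeque<>();
        frontier.add(new Node(start, 1.0, 0));

        // Expand breadth-first until enough trajectory mass has terminated.
        while (!frontier.isEmpty() && massFound < minProb) {
            Node n = frontier.poll();
            Map<Integer, Double> next =
                    policyTransitions.getOrDefault(n.state, Collections.<Integer, Double>emptyMap());
            for (Map.Entry<Integer, Double> e : next.entrySet()) {
                double p = n.prob * e.getValue();
                int k = n.steps + 1;
                if (terminationStates.contains(e.getKey())) {
                    massFound += p;
                    discounted.merge(e.getKey(), p * Math.pow(gamma, k), Double::sum);
                } else {
                    frontier.add(new Node(e.getKey(), p, k));
                }
            }
        }

        // Normalize by the mass actually explored (an estimate when minProb < 1).
        if (massFound > 0.0) {
            final double mass = massFound;
            discounted.replaceAll((state, value) -> value / mass);
        }
        return discounted;
    }
}
```

In this sketch, lowering `minProb` lets the loop exit before longer trajectories are expanded, which is the trade of accuracy for computation time that the threshold is meant to offer.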
If you need a model for non-Markov options (e.g., a MacroAction), use the BFSNonMarkovOptionModel model, which uses slightly more memory during the computation to maintain the full trajectory history.

Author:

James MacGlashan.

Nested Class Summary

Nested Classes
Modifier and Type	Class and Description
`static class`	`BFSMarkovOptionModel.CachedModel`
`static class`	`BFSMarkovOptionModel.OptionScanNode`
- Nested classes/interfaces inherited from interface burlap.mdp.singleagent.model.FullModel
  FullModel.Helper

Field Summary

Fields
Modifier and Type	Field and Description
`protected java.util.Map<Option,BFSMarkovOptionModel.CachedModel>`	`cachedModels`
`protected double`	`discount`
`protected HashableStateFactory`	`hashingFactory`
`protected double`	`minProb`
`protected SampleModel`	`model`
`protected boolean`	`requireMarkov`
`protected java.util.Set<HashableState>`	`srcTerminateStates`

Constructor Summary

Constructors
Constructor and Description
`BFSMarkovOptionModel(SampleModel model, double discount, HashableStateFactory hashingFactory)`

Method Summary

Methods
Modifier and Type	Method and Description
`protected double`	`computeTransitions(State s, Option o, HashedAggregator<HashableState> possibleTerminations, double[] expectedReturn)`
`protected BFSMarkovOptionModel.CachedModel`	`getOrCreateModel(Option o)`
`EnvironmentOutcome`	`sample(State s, Action a)` Samples a transition from the transition distribution and returns it.
`void`	`setMinProb(double minProb)`
`boolean`	`terminal(State s)` Indicates whether a state is a terminal state (i.e., no more action occurs and zero reward received from there on out)
`java.util.List<TransitionProb>`	`transitions(State s, Action a)` Returns the set of possible transitions when `Action` is applied in `State` s.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Detail

model
```
protected SampleModel model
```

discount
```
protected double discount
```

hashingFactory
```
protected HashableStateFactory hashingFactory
```

cachedModels
```
protected java.util.Map<Option,BFSMarkovOptionModel.CachedModel> cachedModels
```

srcTerminateStates
```
protected java.util.Set<HashableState> srcTerminateStates
```

minProb
```
protected double minProb
```

requireMarkov
```
protected boolean requireMarkov
```

Constructor Detail

BFSMarkovOptionModel
```
public BFSMarkovOptionModel(SampleModel model,
                            double discount,
                            HashableStateFactory hashingFactory)
```

Method Detail
- setMinProb
```
public void setMinProb(double minProb)
```
- transitions
```
public java.util.List<TransitionProb> transitions(State s,
                                                  Action a)
```
  Description copied from interface: FullModel
  
  Returns the set of possible transitions when Action is applied in State s. The returned list only needs to include transitions that have non-zero probability of occurring.
  
  Specified by:
  
  transitions in interface FullModel
  
  Parameters:
  
  s - the source State
  
  a - the Action applied in the source state
  
  Returns:
  
  the probability distribution over possible transitions.
- sample
```
public EnvironmentOutcome sample(State s,
                                 Action a)
```
  Description copied from interface: SampleModel
  
  Samples a transition from the transition distribution and returns it.
  
  Specified by:
  
  sample in interface SampleModel
  
  Parameters:
  
  s - the source state
  
  a - the action taken in the source state
  
  Returns:
  
  an EnvironmentOutcome describing the sampled transition
- terminal
```
public boolean terminal(State s)
```
  Description copied from interface: SampleModel
  
  Indicates whether a state is a terminal state (i.e., no more action occurs and zero reward received from there on out)
  
  Specified by:
  
  terminal in interface SampleModel
  
  Parameters:
  
  s - the input state to test
  
  Returns:
  
  true if the state is a terminal state, false if it is not.
- getOrCreateModel
```
protected BFSMarkovOptionModel.CachedModel getOrCreateModel(Option o)
```
- computeTransitions
```
protected double computeTransitions(State s,
                                    Option o,
                                    HashedAggregator<HashableState> possibleTerminations,
                                    double[] expectedReturn)
```
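
The `cachedModels` field and the `getOrCreateModel(Option)` signature suggest a get-or-create cache keyed by option. The sketch below shows that general pattern only; the class name, the generic factory, and the computation counter are hypothetical, and nothing here is taken from BURLAP's source.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class GetOrCreateCacheSketch<K, V> {

    private final Map<K, V> cache = new HashMap<>();
    private final Function<K, V> factory;
    private int computations = 0; // counts how often the factory actually ran

    GetOrCreateCacheSketch(Function<K, V> factory) {
        this.factory = factory;
    }

    /** Returns the cached value for key, computing and storing it on first use. */
    V getOrCreate(K key) {
        return cache.computeIfAbsent(key, k -> {
            computations++;
            return factory.apply(k);
        });
    }

    int computations() {
        return computations;
    }
}
```

With a cache like this, an expensive model computation runs once per key and later lookups reuse the stored result.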
