public abstract class Policy
extends java.lang.Object
Nested Class Summary

Modifier and Type  Class and Description

static class
Policy.ActionProb
Class for storing an action and probability tuple.

static class
Policy.PolicyUndefinedException
RuntimeException to be thrown when a Policy is queried for a state in which the policy is undefined.

static class
Policy.RandomPolicy
A uniform random policy for single-agent domains.
Field Summary

Modifier and Type  Field and Description

protected boolean
annotateOptionDecomposition

protected boolean
evaluateDecomposesOptions

Constructor Summary

Constructor and Description

Policy()
Method Summary

Modifier and Type  Method and Description

EpisodeAnalysis
evaluateBehavior(State s, RewardFunction rf, int numSteps)
This method will return the episode that results from following this policy from state s.

EpisodeAnalysis
evaluateBehavior(State s, RewardFunction rf, TerminalFunction tf)
This method will return the episode that results from following this policy from state s.

EpisodeAnalysis
evaluateBehavior(State s, RewardFunction rf, TerminalFunction tf, int maxSteps)
This method will return the episode that results from following this policy from state s.

void
evaluateMethodsShouldAnnotateOptionDecomposition(boolean toggle)
Sets whether the primitive actions of decomposed options will be annotated with the option that produced them.

void
evaluateMethodsShouldDecomposeOption(boolean toggle)
Sets whether the primitive actions taken during an option's execution will be included as steps in produced EpisodeAnalysis objects.

abstract AbstractGroundedAction 
getAction(State s)
This method will return an action sampled by the policy for the given state.

abstract java.util.List<Policy.ActionProb>
getActionDistributionForState(State s)
This method will return the action probability distribution defined by the policy.

protected java.util.List<Policy.ActionProb> 
getDeterministicPolicy(State s)
A helper method for defining deterministic policies.

double
getProbOfAction(State s, AbstractGroundedAction ga)
Will return the probability of this policy taking action ga in state s.

static double
getProbOfActionGivenDistribution(AbstractGroundedAction ga, java.util.List<Policy.ActionProb> distribution)
Searches the input distribution for the occurrence of the input action and returns its probability.

static double
getProbOfActionGivenDistribution(State s, AbstractGroundedAction ga, java.util.List<Policy.ActionProb> distribution)
Deprecated.

abstract boolean 
isDefinedFor(State s)
Specifies whether this policy is defined for the input state.

abstract boolean 
isStochastic()
Indicates whether the policy is stochastic or deterministic.

protected AbstractGroundedAction 
sampleFromActionDistribution(State s)
This is a helper method for stochastic policies.
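Sampling from an action distribution, as sampleFromActionDistribution does for stochastic policies, is typically implemented as a cumulative-probability walk: draw a uniform random number and return the first action whose cumulative probability exceeds it. A minimal self-contained sketch, with String actions standing in for AbstractGroundedAction (an assumption for illustration):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Sketch of cumulative-probability sampling in the style of
// Policy.sampleFromActionDistribution. Action names are String
// stand-ins for AbstractGroundedAction.
class SamplingSketch {
    static String sample(List<String> actions, List<Double> probs, Random rng) {
        double roll = rng.nextDouble();
        double cumulative = 0.0;
        for (int i = 0; i < actions.size(); i++) {
            cumulative += probs.get(i);
            if (roll < cumulative) {
                return actions.get(i);
            }
        }
        // Guard against floating-point round-off: fall back to the last action.
        return actions.get(actions.size() - 1);
    }
}
```

A deterministic policy is the degenerate case in which the whole probability mass sits on one action, so the walk always returns that action.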

Field Detail

protected boolean evaluateDecomposesOptions

protected boolean annotateOptionDecomposition
Method Detail

public abstract AbstractGroundedAction getAction(State s)
This method will return an action sampled by the policy for the given state.
Parameters:
s - the state for which an action should be returned

public abstract java.util.List<Policy.ActionProb> getActionDistributionForState(State s)
This method will return the action probability distribution defined by the policy.
Parameters:
s - the state for which an action distribution should be returned

public abstract boolean isStochastic()
Indicates whether the policy is stochastic or deterministic.
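A concrete subclass must implement the abstract methods above consistently: the action returned by getAction should be drawn from the distribution returned by getActionDistributionForState. A minimal sketch of a deterministic subclass, using simplified stand-in types since the real BURLAP State, AbstractGroundedAction, and Policy.ActionProb classes are not reproduced here (all names below are illustrative assumptions):

```java
import java.util.Collections;
import java.util.List;

// Simplified stand-ins for BURLAP's State and Policy.ActionProb.
class MiniState {
    final String name;
    MiniState(String name) { this.name = name; }
}

class MiniActionProb {
    final String action; // String stands in for AbstractGroundedAction
    final double p;
    MiniActionProb(String action, double p) { this.action = action; this.p = p; }
}

// A deterministic policy in the style of a Policy subclass: getAction
// returns the single chosen action, and the action distribution places
// probability 1.0 on it, mirroring what getDeterministicPolicy builds.
class AlwaysNorthPolicy {
    public String getAction(MiniState s) {
        return "north";
    }

    public List<MiniActionProb> getActionDistributionForState(MiniState s) {
        return Collections.singletonList(new MiniActionProb(getAction(s), 1.0));
    }

    public boolean isStochastic() { return false; }

    public boolean isDefinedFor(MiniState s) { return true; }
}
```

Because this policy is defined for every state and is deterministic, isDefinedFor always returns true and isStochastic returns false.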
public abstract boolean isDefinedFor(State s)
Specifies whether this policy is defined for the input state.
Parameters:
s - the input state to test for whether this policy is defined
Returns:
true if this policy is defined for State s, false otherwise.

public double getProbOfAction(State s, AbstractGroundedAction ga)
Will return the probability of this policy taking action ga in state s.
Parameters:
s - the state in which the action would be taken
ga - the action being queried

@Deprecated
public static double getProbOfActionGivenDistribution(State s, AbstractGroundedAction ga, java.util.List<Policy.ActionProb> distribution)
Deprecated. See getProbOfActionGivenDistribution(burlap.oomdp.core.AbstractGroundedAction, java.util.List).

public static double getProbOfActionGivenDistribution(AbstractGroundedAction ga, java.util.List<Policy.ActionProb> distribution)
Searches the input distribution for the occurrence of the input action and returns its probability.
Parameters:
ga - the AbstractGroundedAction for which its probability in the specified distribution should be returned.
distribution - the probability distribution over actions.

protected java.util.List<Policy.ActionProb> getDeterministicPolicy(State s)
A helper method for defining deterministic policies.
Parameters:
s - the state for which the action distribution should be returned.

protected AbstractGroundedAction sampleFromActionDistribution(State s)
This is a helper method for stochastic policies.
Parameters:
s - the input state
Returns:
an AbstractGroundedAction to take

public void evaluateMethodsShouldDecomposeOption(boolean toggle)
Sets whether the primitive actions taken during an option's execution will be included as steps in produced EpisodeAnalysis objects.
Parameters:
toggle - whether to decompose options into the primitive actions taken by them or not.

public void evaluateMethodsShouldAnnotateOptionDecomposition(boolean toggle)
Sets whether the primitive actions of decomposed options will be annotated with the option that produced them.
Parameters:
toggle - whether to annotate the primitive actions of options with the calling option's name.

public EpisodeAnalysis evaluateBehavior(State s, RewardFunction rf, TerminalFunction tf)
This method will return the episode that results from following this policy from state s.
Parameters:
s - the state from which to roll out the policy
rf - the reward function used to track rewards accumulated during the episode
tf - the terminal function defining when the policy should stop being followed.

public EpisodeAnalysis evaluateBehavior(State s, RewardFunction rf, TerminalFunction tf, int maxSteps)
This method will return the episode that results from following this policy from state s.
Parameters:
s - the state from which to roll out the policy
rf - the reward function used to track rewards accumulated during the episode
tf - the terminal function defining when the policy should stop being followed.
maxSteps - the maximum number of steps to take before terminating the policy rollout.

public EpisodeAnalysis evaluateBehavior(State s, RewardFunction rf, int numSteps)
This method will return the episode that results from following this policy from state s.
Parameters:
s - the state from which to roll out the policy
rf - the reward function used to track rewards accumulated during the episode
numSteps - the number of steps to take before terminating the policy rollout
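An evaluateBehavior rollout amounts to repeatedly querying the policy for an action, applying it, accumulating reward, and stopping at a terminal state or the step cap. A self-contained sketch of that loop (the integer grid positions, the always-move-right "policy", the -1-per-step reward, and the goal test are all illustrative assumptions, not BURLAP code):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a policy rollout in the style of Policy.evaluateBehavior.
// Integer positions stand in for State; the episode list of visited
// states is a stand-in for an EpisodeAnalysis.
class RolloutSketch {
    static List<Integer> rollout(int start, int goal, int maxSteps) {
        List<Integer> episode = new ArrayList<>();
        int s = start;
        episode.add(s);
        int steps = 0;
        // Terminal function stand-in: stop at the goal or the step cap.
        while (s != goal && steps < maxSteps) {
            s = s + 1; // "policy": always move right one cell
            episode.add(s);
            steps++;
        }
        return episode;
    }

    // Reward function stand-in: -1 per transition taken.
    static double totalReward(List<Integer> episode) {
        return -1.0 * (episode.size() - 1);
    }
}
```

The three evaluateBehavior overloads above differ only in the stopping condition of this loop: a terminal function, a terminal function plus a step cap, or a fixed step count.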