EpsilonGreedy

java.lang.Object
- burlap.behavior.policy.EpsilonGreedy

All Implemented Interfaces:

EnumerablePolicy, Policy, SolverDerivedPolicy
```
public class EpsilonGreedy
extends java.lang.Object
implements SolverDerivedPolicy, EnumerablePolicy
```
This class defines an epsilon-greedy policy over Q-values and requires a QProvider value function to be specified. With probability epsilon the policy returns a random action (uniformly distributed over all possible actions). With probability 1 - epsilon the policy returns the greedy action. If multiple actions tie for the highest Q-value, one of the tied actions is selected at random.
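The selection rule described above can be sketched in plain Java. This is a minimal stand-alone illustration, not BURLAP's implementation: the class `EpsilonGreedySketch`, the method `selectAction`, and the representation of actions as array indices are all hypothetical, and the sketch assumes the Q-values contain no NaN entries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical stand-alone sketch; it does not use any BURLAP types.
class EpsilonGreedySketch {

    /**
     * Returns the index of the selected action, where qValues[i] is the
     * Q-value of action i in the current state (assumed to contain no NaN).
     */
    static int selectAction(double[] qValues, double epsilon, Random rand) {
        if (qValues.length == 0) {
            throw new IllegalArgumentException("at least one action is required");
        }
        // Explore: with probability epsilon, pick uniformly among all actions.
        if (rand.nextDouble() < epsilon) {
            return rand.nextInt(qValues.length);
        }
        // Exploit: collect every action that ties for the highest Q-value.
        double max = Double.NEGATIVE_INFINITY;
        List<Integer> best = new ArrayList<>();
        for (int i = 0; i < qValues.length; i++) {
            if (qValues[i] > max) {
                max = qValues[i];
                best.clear();
                best.add(i);
            } else if (qValues[i] == max) {
                best.add(i);
            }
        }
        // Break ties uniformly at random.
        return best.get(rand.nextInt(best.size()));
    }

    public static void main(String[] args) {
        Random rand = new Random(42L);
        double[] q = {0.1, 0.7, 0.7, -0.3};
        int[] counts = new int[q.length];
        for (int i = 0; i < 10_000; i++) {
            counts[selectAction(q, 0.1, rand)]++;
        }
        // Actions 1 and 2 tie for the maximum, so they should dominate the counts.
        System.out.println(Arrays.toString(counts));
    }
}
```

With epsilon set to 0 the sketch is purely greedy (apart from tie-breaking), and with epsilon set to 1 it is a uniform random policy.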

Author:

James MacGlashan

Field Summary

Fields
Modifier and Type	Field and Description
`protected double`	`epsilon`
`protected QProvider`	`qplanner`
`protected java.util.Random`	`rand`

Constructor Summary

Constructors
Constructor and Description
`EpsilonGreedy(double epsilon)` Initializes with the value of epsilon, where epsilon is the probability of taking a random action.
`EpsilonGreedy(QProvider planner, double epsilon)` Initializes with the QProvider to use and the value of epsilon, where epsilon is the probability of taking a random action.

Method Summary

Modifier and Type	Method and Description
`Action`	`action(State s)` This method will return an action sampled by the policy for the given state.
`double`	`actionProb(State s, Action a)` Returns the probability/probability density that the given action will be taken in the given state.
`boolean`	`definedFor(State s)` Specifies whether this policy is defined for the input state.
`double`	`getEpsilon()` Returns the epsilon value, where epsilon is the probability of taking a random action.
`java.util.List<ActionProb>`	`policyDistribution(State s)` This method will return the action probability distribution defined by the policy.
`void`	`setEpsilon(double epsilon)` Sets the epsilon value, where epsilon is the probability of taking a random action.
`void`	`setSolver(MDPSolverInterface solver)` Sets the valueFunction whose results affect this policy.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - qplanner
```
protected QProvider qplanner
```
  - epsilon
```
protected double epsilon
```
  - rand
```
protected java.util.Random rand
```
- Constructor Detail
  - EpsilonGreedy
```
public EpsilonGreedy(double epsilon)
```
    Initializes with the value of epsilon, where epsilon is the probability of taking a random action.
    
    Parameters:
    
    epsilon - the probability of taking a random action.
  - EpsilonGreedy
```
public EpsilonGreedy(QProvider planner,
                     double epsilon)
```
    Initializes with the QProvider to use and the value of epsilon, where epsilon is the probability of taking a random action.
    
    Parameters:
    
    planner - the QProvider to use
    
    epsilon - the probability of taking a random action.
- Method Detail
  - getEpsilon
```
public double getEpsilon()
```
    Returns the epsilon value, where epsilon is the probability of taking a random action.
    
    Returns:
    
    the epsilon value
  - setEpsilon
```
public void setEpsilon(double epsilon)
```
    Sets the epsilon value, where epsilon is the probability of taking a random action.
    
    Parameters:
    
    epsilon - the probability of taking a random action.
  - setSolver
```
public void setSolver(MDPSolverInterface solver)
```
    Description copied from interface: SolverDerivedPolicy
    
    Sets the valueFunction whose results affect this policy.
    
    Specified by:
    
    setSolver in interface SolverDerivedPolicy
    
    Parameters:
    
    solver - the solver from which this policy is derived
  - action
```
public Action action(State s)
```
    Description copied from interface: Policy
    
    This method will return an action sampled by the policy for the given state. If the defined policy is stochastic, then multiple calls to this method for the same state may return different actions. The sampling should be with respect to the defined action distribution, which for this class is the one returned by policyDistribution.
    
    Specified by:
    
    action in interface Policy
    
    Parameters:
    
    s - the state for which an action should be returned
    
    Returns:
    
    a sample action from the action distribution; null if the policy is undefined for s
  - actionProb
```
public double actionProb(State s,
                         Action a)
```
    Description copied from interface: Policy
    
    Returns the probability/probability density that the given action will be taken in the given state.
    
    Specified by:
    
    actionProb in interface Policy
    
    Parameters:
    
    s - the state of interest
    
    a - the action that may be taken in the state
    
    Returns:
    
    the probability/probability density
  - policyDistribution
```
public java.util.List<ActionProb> policyDistribution(State s)
```
    Description copied from interface: EnumerablePolicy
    
    This method will return the action probability distribution defined by the policy. The action distribution is represented by a list of ActionProb objects, each of which specifies a grounded action and the probability of that grounded action being taken. The returned list does not have to include actions with probability 0.
    
    Specified by:
    
    policyDistribution in interface EnumerablePolicy
    
    Parameters:
    
    s - the state for which an action distribution should be returned
    
    Returns:
    
    a list of possible actions taken by the policy and their probability.
  - definedFor
```
public boolean definedFor(State s)
```
    Description copied from interface: Policy
    
    Specifies whether this policy is defined for the input state.
    
    Specified by:
    
    definedFor in interface Policy
    
    Parameters:
    
    s - the input state to test for whether this policy is defined
    
    Returns:
    
    true if this policy is defined for State s, false otherwise.
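
The class description implies a specific distribution for policyDistribution and actionProb: each of the n actions receives epsilon / n from the random branch, and each of the k actions tied for the highest Q-value receives an additional (1 - epsilon) / k from the greedy branch. The sketch below computes that distribution in plain Java. It is an illustration under those assumptions rather than BURLAP's code: the class `EpsilonGreedyDistributionSketch`, the method `distribution`, and the use of array indices in place of ActionProb objects are hypothetical.

```java
// Hypothetical stand-alone sketch; it does not use any BURLAP types.
class EpsilonGreedyDistributionSketch {

    /**
     * Returns probs where probs[i] is the probability that an epsilon-greedy
     * policy selects action i, given qValues[i] as that action's Q-value.
     */
    static double[] distribution(double[] qValues, double epsilon) {
        int n = qValues.length;
        if (n == 0) {
            throw new IllegalArgumentException("at least one action is required");
        }
        // Find the highest Q-value and count how many actions tie for it.
        double max = Double.NEGATIVE_INFINITY;
        for (double q : qValues) {
            max = Math.max(max, q);
        }
        int numMax = 0;
        for (double q : qValues) {
            if (q == max) {
                numMax++;
            }
        }
        double[] probs = new double[n];
        for (int i = 0; i < n; i++) {
            // Share of the random branch, spread uniformly over all actions.
            probs[i] = epsilon / n;
            // Share of the greedy branch, spread uniformly over tied maxima.
            if (qValues[i] == max) {
                probs[i] += (1.0 - epsilon) / numMax;
            }
        }
        return probs;
    }
}
```

For Q-values {0.1, 0.7, 0.7, -0.3} and epsilon = 0.2, the two tied maxima each receive 0.2 / 4 + 0.8 / 2 = 0.45 and the other two actions each receive 0.05, which sums to 1. The probability of a single action, as returned by a method like actionProb, is the corresponding entry of this distribution.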
