SoftTimeInverseDecayLR

java.lang.Object
- burlap.behavior.learningrate.SoftTimeInverseDecayLR

All Implemented Interfaces:

LearningRate
```
public class SoftTimeInverseDecayLR
extends java.lang.Object
implements LearningRate
```
Implements a learning rate decay schedule where the learning rate at time t is alpha_0 * (n_0 + 1) / (n_0 + t), where alpha_0 is the initial learning rate and n_0 is a parameter. When n_0 is 0, the learning rate decays inversely proportional to the amount of time passed. The larger n_0 is, the slower the decay schedule. By default, the learning rate may decrease to Double.MIN_NORMAL, which is the smallest positive normal value a double can hold, but a larger minimum learning rate may also be set. This class may be specified to use a universal learning rate that is shared regardless of state and action; a different learning rate for each state (or state feature) that is decayed independently of other states; or a learning rate that is independently decayed for each state-action (or state feature-action) pair. However, the state-action decay will ignore any parameterizations of actions.
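
The schedule described above can be illustrated with a small standalone sketch. This is not the BURLAP source: the class name `SoftInverseDecaySketch`, the method signature, and the assumption that the time index starts at t = 1 (so that the first returned value equals alpha_0) are all hypothetical and chosen only for illustration.

```java
// Hypothetical standalone sketch of the decay formula; not the BURLAP implementation.
class SoftInverseDecaySketch {

    // Computes alpha_0 * (n_0 + 1) / (n_0 + t), clamped from below by minimumLR.
    // Assumes t >= 1, so that t = 1 yields exactly alpha_0.
    static double learningRate(double alpha0, double n0, int t, double minimumLR) {
        double r = alpha0 * (n0 + 1.0) / (n0 + t);
        return Math.max(minimumLR, r);
    }

    public static void main(String[] args) {
        double alpha0 = 0.5;
        // Compare a pure inverse-time decay (n_0 = 0) with a softened one (n_0 = 10).
        for (int t = 1; t <= 5; t++) {
            double fast = learningRate(alpha0, 0.0, t, Double.MIN_NORMAL);
            double slow = learningRate(alpha0, 10.0, t, Double.MIN_NORMAL);
            System.out.println("t=" + t + " n0=0: " + fast + " n0=10: " + slow);
        }
    }
}
```

With n_0 = 0 the value halves between t = 1 and t = 2, whereas with n_0 = 10 it only shrinks by a factor of 11/12 over the same step, which is the sense in which a larger n_0 slows the decay.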

Author:

James MacGlashan

Nested Class Summary

Nested Classes
Modifier and Type	Class and Description
`protected class`	`SoftTimeInverseDecayLR.MutableInt` A class for storing a mutable int value object
`protected class`	`SoftTimeInverseDecayLR.StateWiseTimeIndex` A class for storing a time index for a state, or a time index for each action for a given state

Field Summary

Fields
Modifier and Type	Field and Description
`protected double`	`decayConstantShift` The division scale offset
`protected java.util.Map<java.lang.Integer,SoftTimeInverseDecayLR.StateWiseTimeIndex>`	`featureWiseMap` The state feature dependent or state feature-action dependent learning rate time indices
`protected HashableStateFactory`	`hashingFactory` How to hash and perform equality checks of states
`protected double`	`initialLearningRate` The initial learning rate value at time 0
`protected int`	`lastPollTime` The last agent time at which the learning rate was polled
`protected double`	`minimumLR` The minimum learning rate
`protected java.util.Map<HashableState,SoftTimeInverseDecayLR.StateWiseTimeIndex>`	`stateWiseMap` The state dependent or state-action dependent learning rate time indices
`protected int`	`universalTime` The universal number of learning rate polls
`protected boolean`	`useStateActionWise` Whether the learning rate is dependent on state-actions
`protected boolean`	`useStateWise` Whether the learning rate is dependent on the state

Constructor Summary

Constructors
Constructor and Description
`SoftTimeInverseDecayLR(double initialLearningRate, double decayConstantShift)` Initializes with an initial learning rate and decay constant shift for a state independent learning rate.
`SoftTimeInverseDecayLR(double initialLearningRate, double decayConstantShift, double minimumLearningRate)` Initializes with an initial learning rate and decay constant shift (n_0) for a state independent learning rate that will decay to a value no smaller than minimumLearningRate.
`SoftTimeInverseDecayLR(double initialLearningRate, double decayConstantShift, double minimumLearningRate, HashableStateFactory hashingFactory, boolean useSeparateLRPerStateAction)` Initializes with an initial learning rate and decay constant shift (n_0) for a state or state-action (or state feature-action) dependent learning rate that will decay to a value no smaller than minimumLearningRate. If this learning rate function is to be used for state features, rather than states, then the hashing factory can be null.
`SoftTimeInverseDecayLR(double initialLearningRate, double decayConstantShift, HashableStateFactory hashingFactory, boolean useSeparateLRPerStateAction)` Initializes with an initial learning rate and decay constant shift (n_0) for a state or state-action (or state feature-action) dependent learning rate.

Method Summary

All Methods Instance Methods Concrete Methods
Modifier and Type	Method and Description
`protected SoftTimeInverseDecayLR.StateWiseTimeIndex`	`getFeatureWiseTimeIndex(int featureId)` Returns the learning rate data structure for the given state feature.
`protected SoftTimeInverseDecayLR.StateWiseTimeIndex`	`getStateWiseTimeIndex(State s)` Returns the learning rate data structure for the given state.
`protected double`	`learningRate(int time)`
`double`	`peekAtLearningRate(int featureId)` A method for looking at the current learning rate for a state (-action) feature without having it altered.
`double`	`peekAtLearningRate(State s, Action ga)` A method for looking at the current learning rate for a state-action pair without having it altered.
`double`	`pollLearningRate(int agentTime, int featureId)` A method for returning the learning rate for a given state (-action) feature and then decaying the learning rate as defined by this class.
`double`	`pollLearningRate(int agentTime, State s, Action ga)` A method for returning the learning rate for a given state action pair and then decaying the learning rate as defined by this class.
`void`	`resetDecay()` Causes any learning rate decay to reset to where it started.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - initialLearningRate
```
protected double initialLearningRate
```
    The initial learning rate value at time 0
  - decayConstantShift
```
protected double decayConstantShift
```
    The division scale offset
  - minimumLR
```
protected double minimumLR
```
    The minimum learning rate
  - universalTime
```
protected int universalTime
```
    The universal number of learning rate polls
  - stateWiseMap
```
protected java.util.Map<HashableState,SoftTimeInverseDecayLR.StateWiseTimeIndex> stateWiseMap
```
    The state dependent or state-action dependent learning rate time indices
  - featureWiseMap
```
protected java.util.Map<java.lang.Integer,SoftTimeInverseDecayLR.StateWiseTimeIndex> featureWiseMap
```
    The state feature dependent or state feature-action dependent learning rate time indices
  - useStateWise
```
protected boolean useStateWise
```
    Whether the learning rate is dependent on the state
  - useStateActionWise
```
protected boolean useStateActionWise
```
    Whether the learning rate is dependent on state-actions
  - hashingFactory
```
protected HashableStateFactory hashingFactory
```
    How to hash and perform equality checks of states
  - lastPollTime
```
protected int lastPollTime
```
    The last agent time at which the learning rate was polled
- Constructor Detail
  - SoftTimeInverseDecayLR
```
public SoftTimeInverseDecayLR(double initialLearningRate,
                              double decayConstantShift)
```
    Initializes with an initial learning rate and decay constant shift (n_0) for a state independent learning rate. The minimum learning rate that can be returned will be Double.MIN_NORMAL.
    
    Parameters:
    
    initialLearningRate - the initial learning rate
    
    decayConstantShift - the constant added to the inverse time decay schedule (n_0). That is, the learning rate at time t is alpha_0 * (n_0 + 1) / (n_0 + t)
  - SoftTimeInverseDecayLR
```
public SoftTimeInverseDecayLR(double initialLearningRate,
                              double decayConstantShift,
                              double minimumLearningRate)
```
    Initializes with an initial learning rate and decay constant shift (n_0) for a state independent learning rate that will decay to a value no smaller than minimumLearningRate.
    
    Parameters:
    
    initialLearningRate - the initial learning rate
    
    decayConstantShift - the constant added to the inverse time decay schedule (n_0). That is, the learning rate at time t is alpha_0 * (n_0 + 1) / (n_0 + t)
    
    minimumLearningRate - the smallest value to which the learning rate will decay
  - SoftTimeInverseDecayLR
```
public SoftTimeInverseDecayLR(double initialLearningRate,
                              double decayConstantShift,
                              HashableStateFactory hashingFactory,
                              boolean useSeparateLRPerStateAction)
```
    Initializes with an initial learning rate and decay constant shift (n_0) for a state or state-action (or state feature-action) dependent learning rate. The minimum learning rate that can be returned will be Double.MIN_NORMAL. If this learning rate function is to be used for state features, rather than states, then the hashing factory can be null.
    
    Parameters:
    
    initialLearningRate - the initial learning rate for each state or state-action
    
    decayConstantShift - the constant added to the inverse time decay schedule (n_0). That is, the learning rate at time t is alpha_0 * (n_0 + 1) / (n_0 + t)
    
    hashingFactory - how to hash and compare states
    
    useSeparateLRPerStateAction - whether to have an independent learning rate for each state-action pair, rather than just each state
  - SoftTimeInverseDecayLR
```
public SoftTimeInverseDecayLR(double initialLearningRate,
                              double decayConstantShift,
                              double minimumLearningRate,
                              HashableStateFactory hashingFactory,
                              boolean useSeparateLRPerStateAction)
```
    Initializes with an initial learning rate and decay constant shift (n_0) for a state or state-action (or state feature-action) dependent learning rate that will decay to a value no smaller than minimumLearningRate. If this learning rate function is to be used for state features, rather than states, then the hashing factory can be null.
    
    Parameters:
    
    initialLearningRate - the initial learning rate for each state or state-action
    
    decayConstantShift - the constant added to the inverse time decay schedule (n_0). That is, the learning rate at time t is alpha_0 * (n_0 + 1) / (n_0 + t)
    
    minimumLearningRate - the smallest value to which the learning rate will decay
    
    hashingFactory - how to hash and compare states
    
    useSeparateLRPerStateAction - whether to have an independent learning rate for each state-action pair, rather than just each state
- Method Detail
  - peekAtLearningRate
```
public double peekAtLearningRate(State s,
                                 Action ga)
```
    Description copied from interface: LearningRate
    
    A method for looking at the current learning rate for a state-action pair without having it altered.
    
    Specified by:
    
    peekAtLearningRate in interface LearningRate
    
    Parameters:
    
    s - the state for which the learning rate should be returned
    
    ga - the action for which the learning rate should be returned
    
    Returns:
    
    the current learning rate for the given state-action pair
  - pollLearningRate
```
public double pollLearningRate(int agentTime,
                               State s,
                               Action ga)
```
    Description copied from interface: LearningRate
    
    A method for returning the learning rate for a given state action pair and then decaying the learning rate as defined by this class.
    
    Specified by:
    
    pollLearningRate in interface LearningRate
    
    Parameters:
    
    agentTime - the time index of the agent when polling.
    
    s - the state for which the learning rate should be returned
    
    ga - the action for which the learning rate should be returned
    
    Returns:
    
    the current learning rate for the given state-action pair
  - peekAtLearningRate
```
public double peekAtLearningRate(int featureId)
```
    Description copied from interface: LearningRate
    
    A method for looking at the current learning rate for a state (-action) feature without having it altered.
    
    Specified by:
    
    peekAtLearningRate in interface LearningRate
    
    Parameters:
    
    featureId - the state feature for which the learning rate should be returned
    
    Returns:
    
    the current learning rate for the given state feature-action pair
  - pollLearningRate
```
public double pollLearningRate(int agentTime,
                               int featureId)
```
    Description copied from interface: LearningRate
    
    A method for returning the learning rate for a given state (-action) feature and then decaying the learning rate as defined by this class.
    
    Specified by:
    
    pollLearningRate in interface LearningRate
    
    Parameters:
    
    agentTime - the time index of the agent when polling.
    
    featureId - the state feature for which the learning rate should be returned
    
    Returns:
    
    the current learning rate for the given state feature-action pair
  - resetDecay
```
public void resetDecay()
```
    Description copied from interface: LearningRate
    
    Causes any learning rate decay to reset to where it started.
    
    Specified by:
    
    resetDecay in interface LearningRate
  - learningRate
```
protected double learningRate(int time)
```
  - getStateWiseTimeIndex
```
protected SoftTimeInverseDecayLR.StateWiseTimeIndex getStateWiseTimeIndex(State s)
```
    Returns the learning rate data structure for the given state. An entry will be created if it does not already exist.
    
    Parameters:
    
    s - the state to get a learning rate time index for
    
    Returns:
    
    the learning rate data structure for the given state
  - getFeatureWiseTimeIndex
```
protected SoftTimeInverseDecayLR.StateWiseTimeIndex getFeatureWiseTimeIndex(int featureId)
```
    Returns the learning rate data structure for the given state feature. An entry will be created if it does not already exist.
    
    Parameters:
    
    featureId - the state feature id to get a learning rate time index for
    
    Returns:
    
    the learning rate data structure for the given state feature
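
The difference between peeking and polling, and the independence of per-feature decay, can be sketched as follows. This is a hypothetical stand-in rather than the BURLAP class: the name `PerFeatureDecaySketch`, the use of a plain `Map<Integer, Integer>` of time indices, and the starting index of 1 are assumptions. It also omits the `agentTime` argument entirely, so it says nothing about how the real class uses agent time.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in, not the BURLAP class: keeps one time index per feature id.
class PerFeatureDecaySketch {
    private final double alpha0;
    private final double n0;
    private final double minimumLR;
    // Time index per feature id; a missing entry is treated as time 1 (an assumption).
    private final Map<Integer, Integer> timeIndex = new HashMap<>();

    PerFeatureDecaySketch(double alpha0, double n0, double minimumLR) {
        this.alpha0 = alpha0;
        this.n0 = n0;
        this.minimumLR = minimumLR;
    }

    // alpha_0 * (n_0 + 1) / (n_0 + t), never smaller than minimumLR.
    private double rate(int t) {
        return Math.max(minimumLR, alpha0 * (n0 + 1.0) / (n0 + t));
    }

    // Returns the current rate for the feature without advancing its time index.
    double peek(int featureId) {
        return rate(timeIndex.getOrDefault(featureId, 1));
    }

    // Returns the current rate for the feature, then advances only that feature's time index.
    double poll(int featureId) {
        int t = timeIndex.getOrDefault(featureId, 1);
        timeIndex.put(featureId, t + 1);
        return rate(t);
    }

    // Forgets all time indices so every feature starts again from the initial rate.
    void resetDecay() {
        timeIndex.clear();
    }

    public static void main(String[] args) {
        PerFeatureDecaySketch lr = new PerFeatureDecaySketch(1.0, 0.0, Double.MIN_NORMAL);
        System.out.println("poll feature 7: " + lr.poll(7));
        System.out.println("peek feature 7: " + lr.peek(7));
        System.out.println("peek feature 3: " + lr.peek(3));
    }
}
```

In this sketch, polling feature 7 decays only that feature's rate; feature 3 still reports the initial rate until it is polled itself, and `resetDecay()` returns every feature to its starting value.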
