public class SoftTimeInverseDecayLR extends java.lang.Object implements LearningRate
Modifier and Type  Class and Description 

protected class 
SoftTimeInverseDecayLR.MutableInt
A class for storing a mutable int value object

protected class 
SoftTimeInverseDecayLR.StateWiseTimeIndex
A class for storing a time index for a state, or a time index for each action for a given state

Modifier and Type  Field and Description 

protected double 
decayConstantShift
The division scale offset (n_0 in the decay schedule)

protected java.util.Map<java.lang.Integer,SoftTimeInverseDecayLR.StateWiseTimeIndex> 
featureWiseMap
The state-feature-dependent or state-feature/action-dependent learning rate time indices

protected HashableStateFactory 
hashingFactory
How to hash and perform equality checks of states

protected double 
initialLearningRate
The initial learning rate value at time 0

protected int 
lastPollTime
The last agent time at which the learning rate was polled

protected double 
minimumLR
The minimum learning rate

protected java.util.Map<HashableState,SoftTimeInverseDecayLR.StateWiseTimeIndex> 
stateWiseMap
The state-dependent or state-action-dependent learning rate time indices

protected int 
universalTime
The universal number of learning rate polls

protected boolean 
useStateActionWise
Whether the learning rate is dependent on state-action pairs

protected boolean 
useStateWise
Whether the learning rate is dependent on the state

Constructor and Description 

SoftTimeInverseDecayLR(double initialLearningRate,
double decayConstantShift)
Initializes with an initial learning rate and decay constant shift (n_0) for a state-independent learning rate.

SoftTimeInverseDecayLR(double initialLearningRate,
double decayConstantShift,
double minimumLearningRate)
Initializes with an initial learning rate and decay constant shift (n_0) for a state-independent learning rate that will decay to a value no smaller than minimumLearningRate.

SoftTimeInverseDecayLR(double initialLearningRate,
double decayConstantShift,
double minimumLearningRate,
HashableStateFactory hashingFactory,
boolean useSeparateLRPerStateAction)
Initializes with an initial learning rate and decay constant shift (n_0) for a state-dependent or state-action-dependent (or state-feature/action-dependent) learning rate that will decay to a value no smaller than minimumLearningRate.
If this learning rate function is to be used for state features, rather than states,
then the hashing factory can be null.

SoftTimeInverseDecayLR(double initialLearningRate,
double decayConstantShift,
HashableStateFactory hashingFactory,
boolean useSeparateLRPerStateAction)
Initializes with an initial learning rate and decay constant shift (n_0) for a state-dependent or state-action-dependent (or state-feature/action-dependent) learning rate.
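
The decay schedule documented for decayConstantShift, alpha_0 * (n_0 + 1) / (n_0 + t), together with the minimumLearningRate floor, can be illustrated with a minimal standalone sketch. The class and method names below are hypothetical and are not part of this API; the sketch also assumes that the floor is applied as a simple maximum over the decayed value.

```java
/** Hypothetical standalone sketch of the documented schedule; not the library class. */
class SoftDecayFormulaSketch {

    /**
     * Evaluates alpha_0 * (n_0 + 1) / (n_0 + t) and floors the result at minimumLR.
     * Applying the floor with Math.max is an assumption of this sketch.
     */
    static double learningRate(double alpha0, double n0, double minimumLR, int t) {
        double rate = alpha0 * ((n0 + 1.0) / (n0 + t));
        return Math.max(rate, minimumLR);
    }

    public static void main(String[] args) {
        // Print the rate at t = 1, 10, 100 for alpha_0 = 0.1, n_0 = 10, floor 0.01.
        for (int t = 1; t <= 100; t *= 10) {
            System.out.println("t=" + t + " rate=" + learningRate(0.1, 10.0, 0.01, t));
        }
    }
}
```

Under this formula the rate at t = 1 equals alpha_0, and a larger n_0 makes the early decay more gradual, since the denominator grows more slowly relative to the numerator.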

Modifier and Type  Method and Description 

protected SoftTimeInverseDecayLR.StateWiseTimeIndex 
getFeatureWiseTimeIndex(int featureId)
Returns the learning rate data structure for the given state feature.

protected SoftTimeInverseDecayLR.StateWiseTimeIndex 
getStateWiseTimeIndex(State s)
Returns the learning rate data structure for the given state.

protected double 
learningRate(int time)

double 
peekAtLearningRate(int featureId)
Returns the current learning rate for a state (action) feature without altering it.

double 
peekAtLearningRate(State s,
Action ga)
Returns the current learning rate for a state-action pair without altering it.

double 
pollLearningRate(int agentTime,
int featureId)
Returns the learning rate for a given state (action) feature and then decays the learning rate as defined by this class.

double 
pollLearningRate(int agentTime,
State s,
Action ga)
Returns the learning rate for a given state-action pair and then decays the learning rate as defined by this class.

void 
resetDecay()
Causes any learning rate decay to reset to where it started.
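
The difference between peeking, polling, and resetting can be sketched for the state-independent case as follows. This is an illustration under stated assumptions, not the class's actual implementation: the class name is hypothetical, the time index is assumed to start at 1 (where the schedule evaluates to alpha_0), and the index is assumed to advance only when the agent time is later than the last poll time.

```java
/** Hypothetical sketch of peek/poll/reset semantics; not the library class. */
class PollPeekSketch {
    private final double alpha0;
    private final double n0;
    private final double minimumLR;
    private int timeIndex = 1;     // assumed starting index
    private int lastPollTime = -1; // assumed sentinel meaning "never polled"

    PollPeekSketch(double alpha0, double n0, double minimumLR) {
        this.alpha0 = alpha0;
        this.n0 = n0;
        this.minimumLR = minimumLR;
    }

    private double rateAt(int t) {
        return Math.max(alpha0 * ((n0 + 1.0) / (n0 + t)), minimumLR);
    }

    /** Returns the current rate and leaves the decay state untouched. */
    double peek() {
        return rateAt(timeIndex);
    }

    /** Returns the current rate, then advances the decay if agent time has moved on. */
    double poll(int agentTime) {
        double current = rateAt(timeIndex);
        if (agentTime > lastPollTime) {
            timeIndex++;
            lastPollTime = agentTime;
        }
        return current;
    }

    /** Restores the decay state to its starting point. */
    void resetDecay() {
        timeIndex = 1;
        lastPollTime = -1;
    }
}
```

In this sketch, repeated polls at the same agent time return a decayed value only once, which is one plausible reading of the lastPollTime field.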

protected double initialLearningRate
protected double decayConstantShift
protected double minimumLR
protected int universalTime
protected java.util.Map<HashableState,SoftTimeInverseDecayLR.StateWiseTimeIndex> stateWiseMap
protected java.util.Map<java.lang.Integer,SoftTimeInverseDecayLR.StateWiseTimeIndex> featureWiseMap
protected boolean useStateWise
protected boolean useStateActionWise
protected HashableStateFactory hashingFactory
protected int lastPollTime
public SoftTimeInverseDecayLR(double initialLearningRate, double decayConstantShift)
initialLearningRate
 the initial learning rate
decayConstantShift
 the constant added to the inverse time decay schedule (n_0). That is, the learning rate at time t is alpha_0 * (n_0 + 1) / (n_0 + t)
public SoftTimeInverseDecayLR(double initialLearningRate, double decayConstantShift, double minimumLearningRate)
initialLearningRate
 the initial learning rate
decayConstantShift
 the constant added to the inverse time decay schedule (n_0). That is, the learning rate at time t is alpha_0 * (n_0 + 1) / (n_0 + t)
minimumLearningRate
 the smallest value to which the learning rate will decay
public SoftTimeInverseDecayLR(double initialLearningRate, double decayConstantShift, HashableStateFactory hashingFactory, boolean useSeparateLRPerStateAction)
initialLearningRate
 the initial learning rate for each state or state-action pair
decayConstantShift
 the constant added to the inverse time decay schedule (n_0). That is, the learning rate at time t is alpha_0 * (n_0 + 1) / (n_0 + t)
hashingFactory
 how to hash and compare states
useSeparateLRPerStateAction
 whether to have an independent learning rate for each state-action pair, rather than just each state
public SoftTimeInverseDecayLR(double initialLearningRate, double decayConstantShift, double minimumLearningRate, HashableStateFactory hashingFactory, boolean useSeparateLRPerStateAction)
initialLearningRate
 the initial learning rate for each state or state-action pair
decayConstantShift
 the constant added to the inverse time decay schedule (n_0). That is, the learning rate at time t is alpha_0 * (n_0 + 1) / (n_0 + t)
minimumLearningRate
 the smallest value to which the learning rate will decay
hashingFactory
 how to hash and compare states
useSeparateLRPerStateAction
 whether to have an independent learning rate for each state-action pair, rather than just each state
public double peekAtLearningRate(State s, Action ga)
LearningRate
peekAtLearningRate
in interface LearningRate
s
 the state for which the learning rate should be returned
ga
 the action for which the learning rate should be returned
public double pollLearningRate(int agentTime, State s, Action ga)
LearningRate
pollLearningRate
in interface LearningRate
agentTime
 the time index of the agent when polling
s
 the state for which the learning rate should be returned
ga
 the action for which the learning rate should be returned
public double peekAtLearningRate(int featureId)
LearningRate
peekAtLearningRate
in interface LearningRate
featureId
 the state feature for which the learning rate should be returned
public double pollLearningRate(int agentTime, int featureId)
LearningRate
pollLearningRate
in interface LearningRate
agentTime
 the time index of the agent when polling
featureId
 the state feature for which the learning rate should be returned
public void resetDecay()
LearningRate
resetDecay
in interface LearningRate
protected double learningRate(int time)
protected SoftTimeInverseDecayLR.StateWiseTimeIndex getStateWiseTimeIndex(State s)
s
 the state to get a learning rate time index for
protected SoftTimeInverseDecayLR.StateWiseTimeIndex getFeatureWiseTimeIndex(int featureId)
featureId
 the state feature id to get a learning rate time index for
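
The idea behind getStateWiseTimeIndex, a separate time index per state or per state-action pair that is created on first use, might look roughly like the sketch below. All names here are hypothetical stand-ins: string keys replace HashableState and Action, the nested TimeIndex class only loosely mirrors StateWiseTimeIndex, and starting each index at 1 is an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of per-state and per-state-action time indices; not the library class. */
class StateWiseIndexSketch {

    /** Stand-in for StateWiseTimeIndex: one index for the state, plus one per action. */
    static class TimeIndex {
        int stateIndex = 1; // assumed starting index
        final Map<String, Integer> actionIndex = new HashMap<>();
    }

    private final Map<String, TimeIndex> stateWiseMap = new HashMap<>();
    private final boolean useStateActionWise;

    StateWiseIndexSketch(boolean useStateActionWise) {
        this.useStateActionWise = useStateActionWise;
    }

    /** Looks up the entry for a state key, creating it on first access. */
    TimeIndex getStateWiseTimeIndex(String stateKey) {
        return stateWiseMap.computeIfAbsent(stateKey, k -> new TimeIndex());
    }

    /** Returns the current time index for the state (or state-action pair), then advances it. */
    int pollTimeIndex(String stateKey, String actionName) {
        TimeIndex idx = getStateWiseTimeIndex(stateKey);
        if (!useStateActionWise) {
            return idx.stateIndex++;
        }
        int current = idx.actionIndex.getOrDefault(actionName, 1);
        idx.actionIndex.put(actionName, current + 1);
        return current;
    }
}
```

With useStateActionWise set to false, every action taken in a state advances the same index; with it set to true, each action in a state decays independently.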