public class SoftTimeInverseDecayLR extends java.lang.Object implements LearningRate
Modifier and Type | Class and Description |
---|---|
protected class | SoftTimeInverseDecayLR.MutableInt: A class for storing a mutable int value object. |
protected class | SoftTimeInverseDecayLR.StateWiseTimeIndex: A class for storing a time index for a state, or a time index for each action for a given state. |
Modifier and Type | Field and Description |
---|---|
protected double | decayConstantShift: The division scale offset. |
protected java.util.Map<java.lang.Integer,SoftTimeInverseDecayLR.StateWiseTimeIndex> | featureWiseMap: The state feature dependent or state feature-action dependent learning rate time indices. |
protected HashableStateFactory | hashingFactory: How to hash and perform equality checks of states. |
protected double | initialLearningRate: The initial learning rate value at time 0. |
protected int | lastPollTime: The last agent time at which the learning rate was polled. |
protected double | minimumLR: The minimum learning rate. |
protected java.util.Map<HashableState,SoftTimeInverseDecayLR.StateWiseTimeIndex> | stateWiseMap: The state dependent or state-action dependent learning rate time indices. |
protected int | universalTime: The universal number of learning rate polls. |
protected boolean | useStateActionWise: Whether the learning rate is dependent on state-actions. |
protected boolean | useStateWise: Whether the learning rate is dependent on the state. |
Constructor and Description |
---|
SoftTimeInverseDecayLR(double initialLearningRate, double decayConstantShift): Initializes with an initial learning rate and decay constant shift for a state independent learning rate. |
SoftTimeInverseDecayLR(double initialLearningRate, double decayConstantShift, double minimumLearningRate): Initializes with an initial learning rate and decay constant shift (n_0) for a state independent learning rate that will decay to a value no smaller than minimumLearningRate. |
SoftTimeInverseDecayLR(double initialLearningRate, double decayConstantShift, double minimumLearningRate, HashableStateFactory hashingFactory, boolean useSeparateLRPerStateAction): Initializes with an initial learning rate and decay constant shift (n_0) for a state or state-action (or state feature-action) dependent learning rate that will decay to a value no smaller than minimumLearningRate. If this learning rate function is to be used for state features, rather than states, then the hashing factory can be null. |
SoftTimeInverseDecayLR(double initialLearningRate, double decayConstantShift, HashableStateFactory hashingFactory, boolean useSeparateLRPerStateAction): Initializes with an initial learning rate and decay constant shift (n_0) for a state or state-action (or state feature-action) dependent learning rate. |
Modifier and Type | Method and Description |
---|---|
protected SoftTimeInverseDecayLR.StateWiseTimeIndex | getFeatureWiseTimeIndex(int featureId): Returns the learning rate data structure for the given state feature. |
protected SoftTimeInverseDecayLR.StateWiseTimeIndex | getStateWiseTimeIndex(State s): Returns the learning rate data structure for the given state. |
protected double | learningRate(int time) |
double | peekAtLearningRate(int featureId): A method for looking at the current learning rate for a state (-action) feature without having it altered. |
double | peekAtLearningRate(State s, Action ga): A method for looking at the current learning rate for a state-action pair without having it altered. |
double | pollLearningRate(int agentTime, int featureId): A method for returning the learning rate for a given state (-action) feature and then decaying the learning rate as defined by this class. |
double | pollLearningRate(int agentTime, State s, Action ga): A method for returning the learning rate for a given state-action pair and then decaying the learning rate as defined by this class. |
void | resetDecay(): Causes any learning rate decay to reset to where it started. |
protected double initialLearningRate
protected double decayConstantShift
protected double minimumLR
protected int universalTime
protected java.util.Map<HashableState,SoftTimeInverseDecayLR.StateWiseTimeIndex> stateWiseMap
protected java.util.Map<java.lang.Integer,SoftTimeInverseDecayLR.StateWiseTimeIndex> featureWiseMap
protected boolean useStateWise
protected boolean useStateActionWise
protected HashableStateFactory hashingFactory
protected int lastPollTime
public SoftTimeInverseDecayLR(double initialLearningRate, double decayConstantShift)
initialLearningRate - the initial learning rate
decayConstantShift - the constant added to the inverse time decay schedule (n_0). That is, the learning rate at time t is alpha_0 * (n_0 + 1) / (n_0 + t)
public SoftTimeInverseDecayLR(double initialLearningRate, double decayConstantShift, double minimumLearningRate)
initialLearningRate - the initial learning rate
decayConstantShift - the constant added to the inverse time decay schedule (n_0). That is, the learning rate at time t is alpha_0 * (n_0 + 1) / (n_0 + t)
minimumLearningRate - the smallest value to which the learning rate will decay
public SoftTimeInverseDecayLR(double initialLearningRate, double decayConstantShift, HashableStateFactory hashingFactory, boolean useSeparateLRPerStateAction)
initialLearningRate - the initial learning rate for each state or state-action
decayConstantShift - the constant added to the inverse time decay schedule (n_0). That is, the learning rate at time t is alpha_0 * (n_0 + 1) / (n_0 + t)
hashingFactory - how to hash and compare states
useSeparateLRPerStateAction - whether to have an independent learning rate for each state-action pair, rather than just each state
public SoftTimeInverseDecayLR(double initialLearningRate, double decayConstantShift, double minimumLearningRate, HashableStateFactory hashingFactory, boolean useSeparateLRPerStateAction)
initialLearningRate - the initial learning rate for each state or state-action
decayConstantShift - the constant added to the inverse time decay schedule (n_0). That is, the learning rate at time t is alpha_0 * (n_0 + 1) / (n_0 + t)
minimumLearningRate - the smallest value to which the learning rate will decay
hashingFactory - how to hash and compare states
useSeparateLRPerStateAction - whether to have an independent learning rate for each state-action pair, rather than just each state
public double peekAtLearningRate(State s, Action ga)
Specified by: peekAtLearningRate in interface LearningRate
s - the state for which the learning rate should be returned
ga - the action for which the learning rate should be returned
public double pollLearningRate(int agentTime, State s, Action ga)
Specified by: pollLearningRate in interface LearningRate
agentTime - the time index of the agent when polling
s - the state for which the learning rate should be returned
ga - the action for which the learning rate should be returned
public double peekAtLearningRate(int featureId)
Specified by: peekAtLearningRate in interface LearningRate
featureId - the state feature for which the learning rate should be returned
public double pollLearningRate(int agentTime, int featureId)
Specified by: pollLearningRate in interface LearningRate
agentTime - the time index of the agent when polling
featureId - the state feature for which the learning rate should be returned
public void resetDecay()
Specified by: resetDecay in interface LearningRate
protected double learningRate(int time)
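The constructor documentation gives the schedule as alpha_0 * (n_0 + 1) / (n_0 + t), decaying to a value no smaller than the minimum learning rate. The following is a minimal stand-alone sketch of that computation, not the library's source: the class name InverseDecaySketch is hypothetical, and applying the floor with Math.max is an assumption about how the minimum is enforced.

```java
// Hypothetical stand-alone sketch of the documented schedule; not the BURLAP implementation.
final class InverseDecaySketch {

    /**
     * Computes alpha_0 * (n_0 + 1) / (n_0 + t) and floors the result at minimumLR.
     * The floor via Math.max is an assumption based on the constructor description.
     */
    static double learningRate(double initialLearningRate, double decayConstantShift,
                               double minimumLR, int time) {
        double decayed = initialLearningRate * (decayConstantShift + 1.0)
                / (decayConstantShift + time);
        return Math.max(decayed, minimumLR);
    }

    public static void main(String[] args) {
        // Print the first few values of the schedule for alpha_0 = 0.5, n_0 = 9, floor = 0.1.
        for (int t = 1; t <= 5; t++) {
            System.out.printf("t=%d lr=%.4f%n", t, learningRate(0.5, 9.0, 0.1, t));
        }
    }
}
```

Under this formula the rate equals the initial learning rate at t = 1, and a larger decayConstantShift makes the decay slower.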
protected SoftTimeInverseDecayLR.StateWiseTimeIndex getStateWiseTimeIndex(State s)
s - the state to get a learning rate time index for
protected SoftTimeInverseDecayLR.StateWiseTimeIndex getFeatureWiseTimeIndex(int featureId)
featureId - the state feature id to get a learning rate time index for
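To illustrate how peeking differs from polling when each state feature keeps its own time index, here is a small self-contained sketch. It does not use the BURLAP classes. The class FeatureWiseSoftDecaySketch and its nested TimeIndex record are hypothetical, and three behaviors are assumptions rather than documented facts: the time index starts at 1, it advances only when agentTime is later than the last poll time for that feature, and resetDecay simply discards all stored indices.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of feature-wise decay bookkeeping; not the BURLAP implementation.
final class FeatureWiseSoftDecaySketch {

    /** Hypothetical per-feature record: a time index plus the agent time of the last poll. */
    private static final class TimeIndex {
        int timeIndex = 1;     // assumption: start at 1 so the first poll yields the initial rate
        int lastPollTime = -1; // assumption: sentinel meaning "not polled yet"
    }

    private final double initialLearningRate;
    private final double decayConstantShift;
    private final double minimumLR;
    private final Map<Integer, TimeIndex> featureWiseMap = new HashMap<>();

    FeatureWiseSoftDecaySketch(double initialLearningRate, double decayConstantShift,
                               double minimumLR) {
        this.initialLearningRate = initialLearningRate;
        this.decayConstantShift = decayConstantShift;
        this.minimumLR = minimumLR;
    }

    /** alpha_0 * (n_0 + 1) / (n_0 + t), floored at minimumLR (the floor is an assumption). */
    private double learningRate(int time) {
        double decayed = initialLearningRate * (decayConstantShift + 1.0)
                / (decayConstantShift + time);
        return Math.max(decayed, minimumLR);
    }

    /** Looks up the record for a feature, creating it on first use. */
    private TimeIndex getFeatureWiseTimeIndex(int featureId) {
        return featureWiseMap.computeIfAbsent(featureId, id -> new TimeIndex());
    }

    /** Returns the current rate for the feature without advancing its time index. */
    double peekAtLearningRate(int featureId) {
        return learningRate(getFeatureWiseTimeIndex(featureId).timeIndex);
    }

    /** Returns the current rate, then advances the index if the agent time has moved on. */
    double pollLearningRate(int agentTime, int featureId) {
        TimeIndex idx = getFeatureWiseTimeIndex(featureId);
        double current = learningRate(idx.timeIndex);
        if (agentTime > idx.lastPollTime) {
            idx.timeIndex++;
            idx.lastPollTime = agentTime;
        }
        return current;
    }

    /** Discards all stored indices so every feature starts again from the initial rate. */
    void resetDecay() {
        featureWiseMap.clear();
    }
}
```

In this sketch, polling the same feature twice at the same agent time decays its rate only once, and features that have never been polled still report the initial learning rate.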