public class LinearStateDifferentiableRF extends DifferentiableRF
A DifferentiableRF that is linear in state features.
The features of the reward function are produced by a StateToFeatureVectorGenerator.
By default, the reward function is defined as R(s, a, s') = w * f(s'), where w is the weight vector (the parameters)
of this object, * is the dot product operator, and f(s') is the feature vector for state s'. Alternatively, the reward function
may be defined as R(s, a, s') = w * f(s) (that is, using the feature vector for the previous state) by setting the
featuresAreForNextState boolean to false, either through the
LinearStateDifferentiableRF(burlap.behavior.singleagent.vfa.StateToFeatureVectorGenerator, int, boolean) constructor
or through the setFeaturesAreForNextState(boolean) method.

| Modifier and Type | Field and Description |
|---|---|
| protected boolean | featuresAreForNextState: Whether features are based on the next state or previous state. |
| protected StateToFeatureVectorGenerator | fvGen: The state feature vector generator. |

Fields inherited from class DifferentiableRF: dim, parameters

| Constructor and Description |
|---|
| LinearStateDifferentiableRF(StateToFeatureVectorGenerator fvGen, int dim): Initializes. |
| LinearStateDifferentiableRF(StateToFeatureVectorGenerator fvGen, int dim, boolean featuresAreForNextState): Initializes. |
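
The effect of the featuresAreForNextState flag accepted by the three-argument constructor can be illustrated with a minimal, library-free sketch. It assumes the feature vectors f(s) and f(s') have already been computed as plain arrays; the class LinearRewardSketch and its methods are hypothetical stand-ins for illustration and are not part of BURLAP.

```java
// Hypothetical illustration only: this class is not part of BURLAP.
final class LinearRewardSketch {

    // Dot product w * f of two equal-length vectors.
    static double dot(double[] w, double[] f) {
        if (w.length != f.length) {
            throw new IllegalArgumentException(
                    "dimension mismatch: " + w.length + " vs " + f.length);
        }
        double sum = 0.0;
        for (int i = 0; i < w.length; i++) {
            sum += w[i] * f[i];
        }
        return sum;
    }

    // R(s, a, s') = w * f(s') when featuresAreForNextState is true,
    // otherwise R(s, a, s') = w * f(s).
    static double reward(double[] w, double[] fPrev, double[] fNext,
                         boolean featuresAreForNextState) {
        return dot(w, featuresAreForNextState ? fNext : fPrev);
    }

    public static void main(String[] args) {
        double[] w = {0.5, -1.0};     // weight vector (the parameters)
        double[] fPrev = {1.0, 0.0};  // assumed features of the previous state s
        double[] fNext = {0.0, 1.0};  // assumed features of the next state s'
        System.out.println(reward(w, fPrev, fNext, true));   // prints -1.0
        System.out.println(reward(w, fPrev, fNext, false));  // prints 0.5
    }
}
```

The same weights give different rewards for the same transition depending on the flag, which is why the choice should be made before the parameters are learned.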

| Modifier and Type | Method and Description |
|---|---|
| protected DifferentiableRF | copyHelper(): A helper method for making a copy of this reward function. |
| double[] | getGradient(State s, GroundedAction ga, State sp): Returns the gradient of the reward function for the given state transition. |
| double | reward(State s, GroundedAction a, State sprime): Returns the reward received when action a is executed in state s and the agent transitions to state sprime. |
| void | setFeaturesAreForNextState(boolean featuresAreForNextState): Sets whether features for the reward function are generated from the next state or previous state. |

Methods inherited from class DifferentiableRF: copy, getParameterDimension, getParameters, randomizeParameters, setParameter, setParameters, toString

protected boolean featuresAreForNextState

protected StateToFeatureVectorGenerator fvGen

public LinearStateDifferentiableRF(StateToFeatureVectorGenerator fvGen, int dim)
fvGen - the state feature vector generator
dim - the dimensionality of the state features that will be produced

public LinearStateDifferentiableRF(StateToFeatureVectorGenerator fvGen, int dim, boolean featuresAreForNextState)
fvGen - the state feature vector generator
dim - the dimensionality of the state features that will be produced
featuresAreForNextState - If true, then the features will be generated from the next state in the (s, a, s') transition. If false, then the previous state.

public void setFeaturesAreForNextState(boolean featuresAreForNextState)
featuresAreForNextState - If true, then the features will be generated from the next state in the (s, a, s') transition. If false, then the previous state.

protected DifferentiableRF copyHelper()
Description copied from: DifferentiableRF
A helper method for making a copy of this reward function; see the DifferentiableRF.copy() method.
Specified by: copyHelper in class DifferentiableRF

public double[] getGradient(State s, GroundedAction ga, State sp)
Description copied from: DifferentiableRF
Specified by: getGradient in class DifferentiableRF
s - the source state
ga - the action taken in the source state
sp - the resulting state from the action

public double reward(State s, GroundedAction a, State sprime)
Description copied from: RewardFunction
s - the state in which the action was executed
a - the action executed
sprime - the state to which the agent transitioned
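
Because the reward is linear in w, its gradient with respect to w is the feature vector of whichever state the reward uses (f(s') by default, f(s) otherwise). The sketch below illustrates that relationship and checks it against a finite-difference estimate. It is not BURLAP's implementation of getGradient; LinearGradientSketch and its methods are hypothetical names, and the feature vector is assumed to be precomputed.

```java
// Hypothetical illustration only: this class is not part of BURLAP.
final class LinearGradientSketch {

    // Linear reward w * f for an already-computed feature vector f.
    static double reward(double[] w, double[] f) {
        double sum = 0.0;
        for (int i = 0; i < w.length; i++) {
            sum += w[i] * f[i];
        }
        return sum;
    }

    // Analytic gradient of w * f with respect to w: a copy of f.
    static double[] gradient(double[] f) {
        return f.clone();
    }

    // Central finite-difference estimate of the same gradient, for comparison.
    static double[] numericGradient(double[] w, double[] f, double eps) {
        double[] g = new double[w.length];
        for (int i = 0; i < w.length; i++) {
            double[] up = w.clone();
            double[] down = w.clone();
            up[i] += eps;
            down[i] -= eps;
            g[i] = (reward(up, f) - reward(down, f)) / (2.0 * eps);
        }
        return g;
    }

    public static void main(String[] args) {
        double[] w = {0.5, -1.0, 2.0};  // weight vector (the parameters)
        double[] f = {1.0, 3.0, -2.0};  // assumed features of the chosen state
        double[] analytic = gradient(f);
        double[] numeric = numericGradient(w, f, 1e-6);
        for (int i = 0; i < f.length; i++) {
            System.out.println(analytic[i] + " vs " + numeric[i]);
        }
    }
}
```

Note that the gradient does not depend on the current weights, only on the features of the transition.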