public class LinearStateDifferentiableRF extends java.lang.Object implements DifferentiableRF
A linear, state-feature-based DifferentiableRF.
The features of the reward function are produced by a DenseStateFeatures.

By default, the reward function is defined as R(s, a, s') = w * f(s'), where w is the weight vector (the parameters of this object), * is the dot product operator, and f(s') is the feature vector for the next state s'. Alternatively, the reward function may be defined as R(s, a, s') = w * f(s), that is, using the feature vector for the previous state s. To select this form, either pass false for featuresAreForNextState to the LinearStateDifferentiableRF(DenseStateFeatures, int, boolean) constructor or call setFeaturesAreForNextState(boolean) with false.

Nested classes/interfaces inherited from interface ParametricFunction: ParametricFunction.ParametricStateActionFunction, ParametricFunction.ParametricStateFunction
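The reward definition in the class description can be illustrated with a small stand-alone sketch. It does not use the library's types: a state is modeled as a plain double[] and the feature generator as a java.util.function.Function, both hypothetical stand-ins for State and DenseStateFeatures, and the class name LinearRewardSketch is invented for this example.

```java
import java.util.function.Function;

/**
 * Stand-alone sketch of R(s, a, s') = w * f(s') (or w * f(s)).
 * double[] states and the Function-based feature generator are
 * hypothetical stand-ins, not the library's State or DenseStateFeatures.
 */
final class LinearRewardSketch {

    /** Dot product of two equal-length vectors. */
    static double dot(double[] w, double[] f) {
        if (w.length != f.length) {
            throw new IllegalArgumentException("weight and feature dimensions differ");
        }
        double sum = 0.0;
        for (int i = 0; i < w.length; i++) {
            sum += w[i] * f[i];
        }
        return sum;
    }

    /**
     * Reward for a transition from s to sprime. The action does not appear
     * in either formula, so it is left out of this sketch.
     */
    static double reward(double[] w, Function<double[], double[]> features,
                         double[] s, double[] sprime, boolean featuresAreForNextState) {
        // Pick the state whose features feed the dot product.
        double[] f = features.apply(featuresAreForNextState ? sprime : s);
        return dot(w, f);
    }

    public static void main(String[] args) {
        // Assumption: the features are simply the state variables themselves.
        Function<double[], double[]> identity = state -> state.clone();
        double[] w = {0.5, -1.0};
        double[] s = {1.0, 2.0};
        double[] sprime = {4.0, 1.0};
        System.out.println(reward(w, identity, s, sprime, true));   // uses f(s')
        System.out.println(reward(w, identity, s, sprime, false));  // uses f(s)
    }
}
```

The boolean argument plays the role of the featuresAreForNextState flag: the same weights give different rewards depending on which end of the transition supplies the features.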
Modifier and Type | Field and Description |
---|---|
protected int | dim: The dimension of this reward function. |
protected boolean | featuresAreForNextState: Whether features are based on the next state or previous state. |
protected DenseStateFeatures | fvGen: The state feature vector generator. |
protected double[] | parameters: The parameters of this reward function. |
Constructor and Description |
---|
LinearStateDifferentiableRF(DenseStateFeatures fvGen, int dim): Initializes. |
LinearStateDifferentiableRF(DenseStateFeatures fvGen, int dim, boolean featuresAreForNextState): Initializes. |
Modifier and Type | Method and Description |
---|---|
ParametricFunction | copy(): Returns a copy of this ParametricFunction. |
double | getParameter(int i): Returns the value of the ith parameter. |
FunctionGradient | gradient(State s, Action a, State sprime) |
int | numParameters(): Returns the number of parameters defining this function. |
void | resetParameters(): Resets the parameters of this function to default values. |
double | reward(State s, Action a, State sprime): Returns the reward received when action a is executed in state s and the agent transitions to state sprime. |
void | setFeaturesAreForNextState(boolean featuresAreForNextState): Sets whether features for the reward function are generated from the next state or previous state. |
void | setParameter(int i, double p): Sets the value of the ith parameter to the given value. |
java.lang.String | toString() |
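protected boolean featuresAreForNextState
Whether features are based on the next state or previous state.

protected DenseStateFeatures fvGen
The state feature vector generator.

protected double[] parameters
The parameters of this reward function.

protected int dim
The dimension of this reward function.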
public LinearStateDifferentiableRF(DenseStateFeatures fvGen, int dim)
Initializes.
Parameters:
fvGen - the state feature vector generator
dim - the dimensionality of the state features that will be produced

public LinearStateDifferentiableRF(DenseStateFeatures fvGen, int dim, boolean featuresAreForNextState)
Initializes.
Parameters:
fvGen - the state feature vector generator
dim - the dimensionality of the state features that will be produced
featuresAreForNextState - if true, the features are generated from the next state in the (s, a, s') transition; if false, from the previous state

public void setFeaturesAreForNextState(boolean featuresAreForNextState)
Sets whether features for the reward function are generated from the next state or previous state.
Parameters:
featuresAreForNextState - if true, the features are generated from the next state in the (s, a, s') transition; if false, from the previous state

public FunctionGradient gradient(State s, Action a, State sprime)
Specified by: gradient in interface DifferentiableRF
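Since the reward is linear in w, its partial derivative with respect to the ith weight is the ith feature, so a gradient taken with respect to the parameters equals the feature vector of the selected state. The sketch below checks this numerically with a finite difference. It is an illustration under stated assumptions, not the library's implementation: it returns a plain double[] rather than a FunctionGradient, and the class name LinearRewardGradientSketch is hypothetical.

```java
/**
 * Sketch: for R = w * f, the gradient with respect to w is f itself.
 * Plain double[] vectors stand in for the library's FunctionGradient.
 */
final class LinearRewardGradientSketch {

    /** Linear reward: dot product of weights and features. */
    static double reward(double[] w, double[] f) {
        double sum = 0.0;
        for (int i = 0; i < w.length; i++) {
            sum += w[i] * f[i];
        }
        return sum;
    }

    /** Analytic gradient of w * f with respect to w: a copy of f. */
    static double[] gradient(double[] f) {
        return f.clone();
    }

    /** Forward finite-difference estimate of dR/dw_i, used only as a check. */
    static double finiteDifference(double[] w, double[] f, int i, double eps) {
        double[] shifted = w.clone();
        shifted[i] += eps;
        return (reward(shifted, f) - reward(w, f)) / eps;
    }

    public static void main(String[] args) {
        double[] w = {0.5, -1.0, 2.0};
        double[] f = {3.0, 0.0, -2.0};  // features of the selected state
        double[] g = gradient(f);
        for (int i = 0; i < g.length; i++) {
            System.out.println("analytic=" + g[i]
                    + " numeric=" + finiteDifference(w, f, i, 1e-6));
        }
    }
}
```

Note that the gradient does not depend on the current weights, only on which state (s or sprime) supplies the features.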
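public int numParameters()
Description copied from interface: ParametricFunction
Returns the number of parameters defining this function.
Specified by: numParameters in interface ParametricFunction

public double getParameter(int i)
Description copied from interface: ParametricFunction
Returns the value of the ith parameter.
Specified by: getParameter in interface ParametricFunction
Parameters:
i - the parameter index

public void setParameter(int i, double p)
Description copied from interface: ParametricFunction
Sets the value of the ith parameter to the given value.
Specified by: setParameter in interface ParametricFunction
Parameters:
i - the index of the parameter to set
p - the parameter value to which it should be set

public void resetParameters()
Description copied from interface: ParametricFunction
Resets the parameters of this function to default values.
Specified by: resetParameters in interface ParametricFunction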
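The four parameter methods above are enough for a caller to read and adjust the weights one index at a time. The sketch below shows one possible use, a single additive update along a gradient. Everything in it is a stand-in: the nested Parametric interface only mirrors the method names listed here, ArrayBacked is a hypothetical array-backed implementation, and treating the "default values" as zeros is an assumption of this example rather than something this page states.

```java
import java.util.Arrays;

/** Sketch of reading and updating parameters through index-based accessors. */
final class ParameterAccessSketch {

    /** Hypothetical stand-in mirroring four method names from this page. */
    interface Parametric {
        int numParameters();
        double getParameter(int i);
        void setParameter(int i, double p);
        void resetParameters();
    }

    /** Array-backed implementation with one parameter per feature dimension. */
    static final class ArrayBacked implements Parametric {
        private final double[] parameters;

        ArrayBacked(int dim) {
            this.parameters = new double[dim];
        }

        public int numParameters() { return parameters.length; }
        public double getParameter(int i) { return parameters[i]; }
        public void setParameter(int i, double p) { parameters[i] = p; }

        // Assumption: "default values" are taken to be zeros in this sketch.
        public void resetParameters() { Arrays.fill(parameters, 0.0); }
    }

    /** Moves each parameter by learningRate times the matching gradient entry. */
    static void step(Parametric fn, double[] gradient, double learningRate) {
        for (int i = 0; i < fn.numParameters(); i++) {
            fn.setParameter(i, fn.getParameter(i) + learningRate * gradient[i]);
        }
    }

    public static void main(String[] args) {
        Parametric fn = new ArrayBacked(2);
        step(fn, new double[] {2.0, -4.0}, 0.5);
        System.out.println(fn.getParameter(0) + ", " + fn.getParameter(1));
        fn.resetParameters();
        System.out.println(fn.getParameter(0) + ", " + fn.getParameter(1));
    }
}
```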
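public ParametricFunction copy()
Description copied from interface: ParametricFunction
Returns a copy of this ParametricFunction.
Specified by: copy in interface ParametricFunction

public double reward(State s, Action a, State sprime)
Description copied from interface: RewardFunction
Returns the reward received when action a is executed in state s and the agent transitions to state sprime.
Specified by: reward in interface RewardFunction
Parameters:
s - the state in which the action was executed
a - the action executed
sprime - the state to which the agent transitioned

public java.lang.String toString()
Overrides: toString in class java.lang.Object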