public class LinearDiffRFVInit extends java.lang.Object implements DifferentiableVInit, DifferentiableRF
A class that serves as both a DifferentiableRF and a DifferentiableVInit
when the reward function and value function initialization are linear functions over some set of features.
The total parameter dimensionality will be the sum of the reward function feature dimension
and value function initialization feature dimension.
This class is useful when learning both a reward function and the shaping values at the leaf nodes of a finite-horizon value function.
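The linear form described above can be sketched in plain Java. This is a minimal illustration, not the library's implementation: the class name LinearRfVInitSketch is hypothetical, plain double[] arrays stand in for the output of the DenseStateFeatures generators, and the layout of the shared parameter array (reward weights first, then value initialization weights) is an assumption.

```java
/** Hypothetical stand-in for illustration; not the BURLAP class. */
final class LinearRfVInitSketch {
    final double[] parameters;               // length rfDim + vinitDim
    final int rfDim;
    final int vinitDim;
    final boolean rfFeaturesAreForNextState;

    LinearRfVInitSketch(int rfDim, int vinitDim, boolean rfFeaturesAreForNextState) {
        this.rfDim = rfDim;
        this.vinitDim = vinitDim;
        this.rfFeaturesAreForNextState = rfFeaturesAreForNextState;
        this.parameters = new double[rfDim + vinitDim];
    }

    /** Total dimensionality: reward features plus value initialization features. */
    int numParameters() {
        return rfDim + vinitDim;
    }

    /** Linear reward: dot product of the reward weights with the chosen state's features. */
    double reward(double[] sFeatures, double[] spFeatures) {
        double[] f = rfFeaturesAreForNextState ? spFeatures : sFeatures;
        double sum = 0.0;
        for (int i = 0; i < rfDim; i++) {
            sum += parameters[i] * f[i];          // assumed layout: reward weights come first
        }
        return sum;
    }

    /** Linear value initialization: dot product of the value weights with the state's features. */
    double value(double[] vinitFeatures) {
        double sum = 0.0;
        for (int i = 0; i < vinitDim; i++) {
            sum += parameters[rfDim + i] * vinitFeatures[i];  // assumed layout: value weights follow
        }
        return sum;
    }
}
```

With rfDim = 2 and vinitDim = 3, such an object exposes 5 parameters in total, which matches the sum described above.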
Nested classes/interfaces inherited from interface ParametricFunction: ParametricFunction.ParametricStateActionFunction, ParametricFunction.ParametricStateFunction
Modifier and Type | Field | Description |
---|---|---|
protected int | dim | |
protected double[] | parameters | |
protected int | rfDim | The dimensionality of the reward function parameters |
protected boolean | rfFeaturesAreForNextState | Whether the reward function features are evaluated on the next state (true) or the previous state (false) of the transition. |
protected DenseStateFeatures | rfFvGen | The state feature vector generator for the reward function. |
protected int | vinitDim | The dimensionality of the value function initialization parameters |
protected DenseStateFeatures | vinitFvGen | The state feature vector generator for the value function initialization. |
Constructor | Description |
---|---|
LinearDiffRFVInit(DenseStateFeatures rfFvGen, DenseStateFeatures vinitFvGen, int rfDim, int vinitDim) | Initializes a linear reward function over the given feature generator with dimension rfDim, and a linear value function initialization over the given feature generator with dimension vinitDim. |
LinearDiffRFVInit(DenseStateFeatures rfFvGen, DenseStateFeatures vinitFvGen, int rfDim, int vinitDim, boolean rfFeaturesAreForNextState) | Same as above, with an explicit choice of whether the reward function features are evaluated on the next state or the previous state of the transition. |
Modifier and Type | Method | Description |
---|---|---|
ParametricFunction | copy() | Returns a copy of this ParametricFunction. |
double | getParameter(int i) | Returns the value of the ith parameter. |
int | getRfDim() | |
DenseStateFeatures | getRfFvGen() | |
int | getVinitDim() | |
DenseStateFeatures | getVinitFvGen() | |
FunctionGradient | gradient(State s, Action a, State sp) | |
boolean | isRfFeaturesAreForNextState() | Returns whether the reward function state features are evaluated on the next state of the transition (s' of R(s,a,s')) or the previous state of the transition (s of R(s,a,s')). |
int | numParameters() | Returns the number of parameters defining this function. |
void | resetParameters() | Resets the parameters of this function to default values. |
double | reward(State s, Action a, State sprime) | Returns the reward received when action a is executed in state s and the agent transitions to state sprime. |
void | setParameter(int i, double p) | Sets the value of the ith parameter to the given value. |
void | setRfDim(int rfDim) | |
void | setRfFeaturesAreForNextState(boolean rfFeaturesAreForNextState) | |
void | setRfFvGen(DenseStateFeatures rfFvGen) | |
void | setVinitDim(int vinitDim) | |
void | setVinitFvGen(DenseStateFeatures vinitFvGen) | |
double | value(State s) | Returns the value function evaluation of the given state. |
FunctionGradient | valueGradient(State s) | Returns the gradient of this value function. |
protected boolean rfFeaturesAreForNextState
protected DenseStateFeatures rfFvGen
protected DenseStateFeatures vinitFvGen
protected int rfDim
protected int vinitDim
protected double[] parameters
protected int dim
public LinearDiffRFVInit(DenseStateFeatures rfFvGen, DenseStateFeatures vinitFvGen, int rfDim, int vinitDim)
Parameters:
rfFvGen - the reward function feature vector generator
vinitFvGen - the value function initialization feature vector generator
rfDim - the reward function feature/parameter dimensionality
vinitDim - the value function initialization feature/parameter dimensionality
public LinearDiffRFVInit(DenseStateFeatures rfFvGen, DenseStateFeatures vinitFvGen, int rfDim, int vinitDim, boolean rfFeaturesAreForNextState)
Parameters:
rfFvGen - the reward function feature vector generator
vinitFvGen - the value function initialization feature vector generator
rfDim - the reward function feature/parameter dimensionality
vinitDim - the value function initialization feature/parameter dimensionality
rfFeaturesAreForNextState - if true, the reward function features are evaluated on the next state of the transition; if false, on the previous state of the transition.
public boolean isRfFeaturesAreForNextState()
public void setRfFeaturesAreForNextState(boolean rfFeaturesAreForNextState)
public DenseStateFeatures getRfFvGen()
public void setRfFvGen(DenseStateFeatures rfFvGen)
public DenseStateFeatures getVinitFvGen()
public void setVinitFvGen(DenseStateFeatures vinitFvGen)
public int getRfDim()
public void setRfDim(int rfDim)
public int getVinitDim()
public void setVinitDim(int vinitDim)
public FunctionGradient gradient(State s, Action a, State sp)
Specified by: gradient in interface DifferentiableRF
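For a function that is linear in its parameters, the gradient with respect to those parameters is simply the feature vector. The sketch below illustrates this for both gradient and valueGradient over a shared parameter vector. It is not the library's code: the class name LinearGradientSketch is hypothetical, dense double[] arrays stand in for FunctionGradient, and the layout (reward weights first, value initialization weights after) is an assumption.

```java
/** Hypothetical illustration of gradients for a linear reward / value initialization pair. */
final class LinearGradientSketch {

    /**
     * Gradient of the linear reward with respect to all rfDim + vinitDim parameters.
     * Only the reward slice is non-zero; it equals the reward features.
     */
    static double[] rewardGradient(double[] rfFeatures, int rfDim, int vinitDim) {
        double[] g = new double[rfDim + vinitDim];       // zero-initialized by Java
        System.arraycopy(rfFeatures, 0, g, 0, rfDim);    // assumed layout: reward weights first
        return g;
    }

    /**
     * Gradient of the linear value initialization with respect to all parameters.
     * Only the value initialization slice is non-zero.
     */
    static double[] valueGradient(double[] vinitFeatures, int rfDim, int vinitDim) {
        double[] g = new double[rfDim + vinitDim];
        System.arraycopy(vinitFeatures, 0, g, rfDim, vinitDim);  // assumed layout: value weights follow
        return g;
    }
}
```

Under the assumed layout, the two gradients never overlap: each leaves the other function's slice at zero, so they can be summed or applied independently.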
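public double reward(State s, Action a, State sprime)
Description copied from interface: RewardFunction
Returns the reward received when action a is executed in state s and the agent transitions to state sprime.
Specified by: reward in interface RewardFunction
Parameters:
s - the state in which the action was executed
a - the action executed
sprime - the state to which the agent transitioned
public FunctionGradient valueGradient(State s)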
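Description copied from interface: DifferentiableValueFunction
Returns the gradient of this value function.
Specified by: valueGradient in interface DifferentiableValueFunction
Parameters:
s - the state on which the function is to be evaluated
public double value(State s)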
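Description copied from interface: ValueFunction
Returns the value function evaluation of the given state.
Specified by: value in interface ValueFunction
Parameters:
s - the state to evaluate.
public int numParameters()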
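Description copied from interface: ParametricFunction
Returns the number of parameters defining this function.
Specified by: numParameters in interface ParametricFunction
public double getParameter(int i)
Description copied from interface: ParametricFunction
Returns the value of the ith parameter.
Specified by: getParameter in interface ParametricFunction
Parameters:
i - the parameter index
public void setParameter(int i, double p)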
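Description copied from interface: ParametricFunction
Sets the value of the ith parameter to the given value.
Specified by: setParameter in interface ParametricFunction
Parameters:
i - the index of the parameter to set
p - the parameter value to which it should be set
public void resetParameters()
Description copied from interface: ParametricFunction
Resets the parameters of this function to default values.
Specified by: resetParameters in interface ParametricFunction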
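The parameter accessors above are enough to drive a simple learning update from outside the class. The following sketch shows one gradient-ascent step written only in terms of numParameters, getParameter and setParameter. The class ParameterVectorSketch and the method gradientStep are hypothetical, the gradient is passed as a dense double[] rather than a FunctionGradient, and treating zero as the "default value" in resetParameters is an assumption.

```java
import java.util.Arrays;

/** Hypothetical stand-in for the parameter-access methods; not the BURLAP class. */
final class ParameterVectorSketch {
    private final double[] parameters;

    ParameterVectorSketch(int rfDim, int vinitDim) {
        this.parameters = new double[rfDim + vinitDim];
    }

    int numParameters() {
        return parameters.length;
    }

    double getParameter(int i) {
        return parameters[i];
    }

    void setParameter(int i, double p) {
        parameters[i] = p;
    }

    /** Assumption: the default value of every parameter is zero. */
    void resetParameters() {
        Arrays.fill(parameters, 0.0);
    }

    /** One gradient-ascent step: parameter i moves by learningRate * gradient[i]. */
    static void gradientStep(ParameterVectorSketch f, double[] gradient, double learningRate) {
        for (int i = 0; i < f.numParameters(); i++) {
            f.setParameter(i, f.getParameter(i) + learningRate * gradient[i]);
        }
    }
}
```

Because the update touches the function only through its accessors, the same loop would work for any object exposing these three methods.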
public ParametricFunction copy()
Description copied from interface: ParametricFunction
Returns a copy of this ParametricFunction.
Specified by: copy in interface ParametricFunction
Returns: a copy of this ParametricFunction.