public class LinearDiffRFVInit extends DifferentiableRF implements DifferentiableVInit
A class that acts as both a DifferentiableRF and a DifferentiableVInit when the reward function and value function initialization are linear functions over some set of features.
The total parameter dimensionality is the sum of the reward function feature dimension and the value function initialization feature dimension.
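The combined parameter vector described above can be illustrated with a small, self-contained sketch. This is not the BURLAP source: the class name `LinearParamLayoutSketch`, its helper methods, and in particular the ordering of the vector (reward weights first, value function initialization weights after) are assumptions made only for illustration.

```java
// Illustrative sketch only: NOT the BURLAP implementation.
// Assumption: the parameter vector stores the rfDim reward weights first,
// followed by the vinitDim value function initialization weights.
final class LinearParamLayoutSketch {

    // Dot product of weights[offset .. offset + features.length) with features.
    static double dot(double[] weights, int offset, double[] features) {
        double sum = 0.0;
        for (int i = 0; i < features.length; i++) {
            sum += weights[offset + i] * features[i];
        }
        return sum;
    }

    // Linear reward: uses the first rfFeatures.length parameters (assumed layout).
    static double reward(double[] params, double[] rfFeatures) {
        return dot(params, 0, rfFeatures);
    }

    // Linear value initialization: uses the parameters starting at index rfDim.
    static double value(double[] params, int rfDim, double[] vinitFeatures) {
        return dot(params, rfDim, vinitFeatures);
    }

    public static void main(String[] args) {
        int rfDim = 2;
        int vinitDim = 3;
        // Total dimensionality is the sum of both feature dimensions.
        double[] params = new double[rfDim + vinitDim];
        params[0] = 1.0;  params[1] = -2.0;                  // reward weights
        params[2] = 0.5;  params[3] = 0.0;  params[4] = 4.0; // value initialization weights

        double[] rfFeatures = {3.0, 1.0};
        double[] vinitFeatures = {2.0, 7.0, 1.0};

        System.out.println("dim = " + params.length);
        System.out.println("reward = " + reward(params, rfFeatures));
        System.out.println("value = " + value(params, rfDim, vinitFeatures));
    }
}
```

Keeping both weight blocks in one array lets a single optimizer update the reward and the value initialization together, which is presumably why the class reports one total dimensionality.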
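Nested classes inherited from interface DifferentiableVInit: DifferentiableVInit.ParamedDiffVInit
Nested classes inherited from interface ValueFunctionInitialization: ValueFunctionInitialization.ConstantValueFunctionInitialization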
Modifier and Type | Field and Description |
---|---|
protected int | rfDim: The dimensionality of the reward function parameters. |
protected boolean | rfFeaturesAreForNextState: Whether the reward function features are evaluated on the next state (true) or the previous state (false) of the transition. |
protected StateToFeatureVectorGenerator | rfFvGen: The state feature vector generator for the reward function. |
protected int | vinitDim: The dimensionality of the value function initialization parameters. |
protected StateToFeatureVectorGenerator | vinitFvGen: The state feature vector generator for the value function initialization. |
Fields inherited from class DifferentiableRF: dim, parameters
Constructor and Description |
---|
LinearDiffRFVInit(StateToFeatureVectorGenerator rfFvGen, StateToFeatureVectorGenerator vinitFvGen, int rfDim, int vinitDim): Initializes a linear reward function and a linear value function initialization, each defined by its own feature vector generator and feature dimensionality. |
LinearDiffRFVInit(StateToFeatureVectorGenerator rfFvGen, StateToFeatureVectorGenerator vinitFvGen, int rfDim, int vinitDim, boolean rfFeaturesAreForNextState): Initializes a linear reward function and a linear value function initialization, each defined by its own feature vector generator and feature dimensionality, and sets whether the reward function features are evaluated on the next state or the previous state of the transition. |
Modifier and Type | Method and Description |
---|---|
protected DifferentiableRF | copyHelper(): A helper method for making a copy of this reward function. |
double[] | getGradient(State s, GroundedAction ga, State sp): Returns the gradient of the reward function for the given state transition. |
double[] | getQGradient(State s, AbstractGroundedAction ga): Returns the Q-value function gradient. |
int | getRfDim() |
StateToFeatureVectorGenerator | getRfFvGen() |
double[] | getVGradient(State s): Returns the value function gradient. |
int | getVinitDim() |
StateToFeatureVectorGenerator | getVinitFvGen() |
boolean | isRfFeaturesAreForNextState(): Returns whether the reward function state features are evaluated on the next state of the transition (s' of R(s,a,s')) or the previous state of the transition (s of R(s,a,s')). |
double | qValue(State s, AbstractGroundedAction a): Returns the initialization value of the Q-value function for a given state and action pair. |
double | reward(State s, GroundedAction a, State sprime): Returns the reward received when action a is executed in state s and the agent transitions to state sprime. |
void | setRfDim(int rfDim) |
void | setRfFeaturesAreForNextState(boolean rfFeaturesAreForNextState) |
void | setRfFvGen(StateToFeatureVectorGenerator rfFvGen) |
void | setVinitDim(int vinitDim) |
void | setVinitFvGen(StateToFeatureVectorGenerator vinitFvGen) |
double | value(State s): Returns the value function evaluation of the given state. |
Methods inherited from class DifferentiableRF: copy, getParameterDimension, getParameters, randomizeParameters, setParameter, setParameters, toString
protected boolean rfFeaturesAreForNextState
protected StateToFeatureVectorGenerator rfFvGen
protected StateToFeatureVectorGenerator vinitFvGen
protected int rfDim
protected int vinitDim
public LinearDiffRFVInit(StateToFeatureVectorGenerator rfFvGen, StateToFeatureVectorGenerator vinitFvGen, int rfDim, int vinitDim)
Parameters:
rfFvGen - the reward function feature vector generator
vinitFvGen - the value function initialization feature vector generator
rfDim - the reward function feature/parameter dimensionality
vinitDim - the value function initialization feature/parameter dimensionality
public LinearDiffRFVInit(StateToFeatureVectorGenerator rfFvGen, StateToFeatureVectorGenerator vinitFvGen, int rfDim, int vinitDim, boolean rfFeaturesAreForNextState)
Parameters:
rfFvGen - the reward function feature vector generator
vinitFvGen - the value function initialization feature vector generator
rfDim - the reward function feature/parameter dimensionality
vinitDim - the value function initialization feature/parameter dimensionality
rfFeaturesAreForNextState - if true, the reward function features are evaluated on the next state of the transition; if false, on the previous state of the transition.
public boolean isRfFeaturesAreForNextState()
public void setRfFeaturesAreForNextState(boolean rfFeaturesAreForNextState)
public StateToFeatureVectorGenerator getRfFvGen()
public void setRfFvGen(StateToFeatureVectorGenerator rfFvGen)
public StateToFeatureVectorGenerator getVinitFvGen()
public void setVinitFvGen(StateToFeatureVectorGenerator vinitFvGen)
public int getRfDim()
public void setRfDim(int rfDim)
public int getVinitDim()
public void setVinitDim(int vinitDim)
public double[] getGradient(State s, GroundedAction ga, State sp)
Description copied from class: DifferentiableRF
Returns the gradient of the reward function for the given state transition.
getGradient in class DifferentiableRF
Parameters:
s - the source state
ga - the action taken in the source state
sp - the resulting state from the action
protected DifferentiableRF copyHelper()
Description copied from class: DifferentiableRF
A helper method for the DifferentiableRF.copy() method that makes a copy of this reward function.
copyHelper in class DifferentiableRF
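For the getGradient method documented above, the following sketch shows what the gradient of a linear reward looks like over the combined parameter vector, and how the rfFeaturesAreForNextState flag selects the state that supplies the features. It is a sketch under stated assumptions, not the BURLAP code: `LinearRewardGradientSketch` is a hypothetical class, plain Integer values stand in for State objects, a `Function` stands in for StateToFeatureVectorGenerator, and the reward weights are assumed to occupy the first rfDim entries.

```java
import java.util.Arrays;
import java.util.function.Function;

// Illustrative sketch only: NOT the BURLAP implementation.
// Integers stand in for State objects; a Function stands in for
// StateToFeatureVectorGenerator.
final class LinearRewardGradientSketch {

    // Gradient of R = w . phi(state) with respect to the full parameter vector.
    // Assumed layout: reward weights first, value initialization weights last.
    static <S> double[] rewardGradient(Function<S, double[]> rfFeatures, S s, S sPrime,
                                       boolean featuresAreForNextState,
                                       int rfDim, int vinitDim) {
        // Pick which state of the transition supplies the features.
        double[] phi = rfFeatures.apply(featuresAreForNextState ? sPrime : s);
        double[] gradient = new double[rfDim + vinitDim]; // new arrays are zero-filled
        // dR/dw_i = phi_i for the reward weights; the reward does not depend on
        // the value initialization weights, so those entries stay 0.
        System.arraycopy(phi, 0, gradient, 0, rfDim);
        return gradient;
    }

    public static void main(String[] args) {
        // Hypothetical two-dimensional features of an integer "state".
        Function<Integer, double[]> features = x -> new double[] {x, x * x};
        double[] onNext = rewardGradient(features, 2, 3, true, 2, 3);
        double[] onPrev = rewardGradient(features, 2, 3, false, 2, 3);
        System.out.println(Arrays.toString(onNext));
        System.out.println(Arrays.toString(onPrev));
    }
}
```

Because the reward is linear in its weights, the gradient is simply the feature vector of the selected state, padded with zeros for the parameters the reward never reads.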
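public double reward(State s, GroundedAction a, State sprime)
Description copied from interface: RewardFunction
Returns the reward received when action a is executed in state s and the agent transitions to state sprime.
reward in interface RewardFunction
Parameters:
s - the state in which the action was executed
a - the action executed
sprime - the state to which the agent transitioned
public double[] getVGradient(State s)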
Description copied from interface: DifferentiableVInit
Returns the value function gradient.
getVGradient in interface DifferentiableVInit
Parameters:
s - the state on which the value function is to be evaluated
public double[] getQGradient(State s, AbstractGroundedAction ga)
Description copied from interface: DifferentiableVInit
Returns the Q-value function gradient.
getQGradient in interface DifferentiableVInit
Parameters:
s - the state on which the Q-value is to be evaluated.
ga - the action on which the Q-value is to be evaluated.
public double value(State s)
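Description copied from interface: ValueFunction
Returns the value function evaluation of the given state.
value in interface ValueFunction
Parameters:
s - the state to evaluate.
public double qValue(State s, AbstractGroundedAction a)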
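Description copied from interface: ValueFunctionInitialization
Returns the initialization value of the Q-value function for a given state and action pair.
qValue in interface ValueFunctionInitialization
Parameters:
s - the state for which to get the initial value of the Q-value function.
a - the action for which to get the initial value of the Q-value function.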
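As a companion to value and getVGradient, the sketch below shows a linear value function initialization and its gradient over the combined parameter vector. It is illustrative only and not taken from BURLAP: `LinearVInitGradientSketch` is a hypothetical class, the feature vector is passed in directly instead of coming from a StateToFeatureVectorGenerator, and the value initialization weights are assumed to follow the rfDim reward weights. How qValue and getQGradient relate to the state-value methods is not stated on this page, so the sketch covers only the state-value case.

```java
import java.util.Arrays;

// Illustrative sketch only: NOT the BURLAP implementation.
// Assumed layout: rfDim reward weights first, value initialization weights after.
final class LinearVInitGradientSketch {

    // V = sum_i params[rfDim + i] * phi[i]
    static double value(double[] params, int rfDim, double[] phi) {
        double sum = 0.0;
        for (int i = 0; i < phi.length; i++) {
            sum += params[rfDim + i] * phi[i];
        }
        return sum;
    }

    // Gradient of V with respect to the full parameter vector:
    // zeros for the reward weights, the features for the value weights.
    static double[] valueGradient(int rfDim, double[] phi) {
        double[] gradient = new double[rfDim + phi.length]; // zero-filled
        System.arraycopy(phi, 0, gradient, rfDim, phi.length);
        return gradient;
    }

    public static void main(String[] args) {
        int rfDim = 2;
        double[] params = {1.0, -2.0, 0.5, 0.0, 4.0};
        double[] phi = {2.0, 7.0, 1.0}; // hypothetical features of one state

        double[] gradient = valueGradient(rfDim, phi);
        System.out.println("gradient = " + Arrays.toString(gradient));

        // Sanity check: for a linear function, V equals gradient . params.
        double viaGradient = 0.0;
        for (int i = 0; i < params.length; i++) {
            viaGradient += gradient[i] * params[i];
        }
        System.out.println("value = " + value(params, rfDim, phi)
                + ", gradient . params = " + viaGradient);
    }
}
```

The zero block at the front mirrors the zero block at the back of the reward gradient: each function depends only on its own slice of the shared parameter vector.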