public class DiffVFRF extends java.lang.Object implements DifferentiableRF
A differentiable reward function wrapper for use with MLIRL when
the reward function is known, but the value function initialization for leaf nodes is to be learned.
This class takes as input the true reward function and a DifferentiableVInit
object to form the DifferentiableRF object
that MLIRL will use.

Nested classes/interfaces inherited from interface ParametricFunction: ParametricFunction.ParametricStateActionFunction, ParametricFunction.ParametricStateFunction

| Modifier and Type | Field and Description |
|---|---|
| protected DifferentiableVInit | diffVInit |
| protected int | dim |
| protected RewardFunction | objectiveRF |

| Constructor and Description |
|---|
| DiffVFRF(RewardFunction objectiveRF, DifferentiableVInit diffVinit) |

| Modifier and Type | Method and Description |
|---|---|
| ParametricFunction | copy() - Returns a copy of this ParametricFunction. |
| double | getParameter(int i) - Returns the value of the ith parameter. |
| FunctionGradient | gradient(State s, GroundedAction a, State sprime) |
| int | numParameters() - Returns the number of parameters defining this function. |
| void | resetParameters() - Resets the parameters of this function to default values. |
| double | reward(State s, GroundedAction a, State sprime) - Returns the reward received when action a is executed in state s and the agent transitions to state sprime. |
| void | setParameter(int i, double p) - Sets the value of the ith parameter to the given value. |

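
The method summary suggests a delegation pattern: reward queries go to the known reward function, while the parameter accessors expose the parameters of the value function initialization. The following is a minimal, self-contained sketch of that pattern, not the library's implementation. `SimpleRewardFunction`, `SimpleVInit`, `ArrayVInit`, and `VInitBackedRF` are hypothetical stand-ins for the BURLAP types, states and actions are plain strings, and the gradient is a plain `double[]`. Setting `dim` from the wrapped object's parameter count and returning an all-zero gradient (because the reward does not depend on the learned parameters) are assumptions.

```java
import java.util.Arrays;

// Hypothetical stand-in for RewardFunction; states and actions are plain strings here.
interface SimpleRewardFunction {
    double reward(String s, String a, String sprime);
}

// Hypothetical stand-in for the parametric part of DifferentiableVInit.
interface SimpleVInit {
    int numParameters();
    double getParameter(int i);
    void setParameter(int i, double p);
    void resetParameters();
}

// Toy value function initialization that only stores its parameters.
class ArrayVInit implements SimpleVInit {
    private final double[] params;
    ArrayVInit(int n) { params = new double[n]; }
    public int numParameters() { return params.length; }
    public double getParameter(int i) { return params[i]; }
    public void setParameter(int i, double p) { params[i] = p; }
    public void resetParameters() { Arrays.fill(params, 0.0); }
}

// Sketch of the wrapper: a fixed reward function plus learnable VInit parameters.
class VInitBackedRF {
    protected final SimpleRewardFunction objectiveRF;
    protected final SimpleVInit vInit;
    protected final int dim; // assumption: the wrapped object's parameter count

    VInitBackedRF(SimpleRewardFunction objectiveRF, SimpleVInit vInit) {
        this.objectiveRF = objectiveRF;
        this.vInit = vInit;
        this.dim = vInit.numParameters();
    }

    // The reward is known, so it is delegated unchanged.
    double reward(String s, String a, String sprime) {
        return objectiveRF.reward(s, a, sprime);
    }

    // All parameter access is forwarded to the value function initialization.
    int numParameters() { return dim; }
    double getParameter(int i) { return vInit.getParameter(i); }
    void setParameter(int i, double p) { vInit.setParameter(i, p); }
    void resetParameters() { vInit.resetParameters(); }

    // Assumption: the reward does not depend on the parameters, so its gradient is zero.
    double[] gradient(String s, String a, String sprime) {
        return new double[dim];
    }
}

public class DiffVFRFSketch {
    public static void main(String[] args) {
        SimpleRewardFunction goalRF = (s, a, sprime) -> sprime.equals("goal") ? 1.0 : -0.1;
        VInitBackedRF rf = new VInitBackedRF(goalRF, new ArrayVInit(2));
        rf.setParameter(0, 0.5);
        System.out.println(rf.reward("s0", "east", "goal"));                     // 1.0
        System.out.println(rf.getParameter(0));                                  // 0.5
        System.out.println(Arrays.toString(rf.gradient("s0", "east", "goal"))); // [0.0, 0.0]
    }
}
```

With the real classes, the equivalent wiring would presumably be `new DiffVFRF(objectiveRF, diffVinit)`, using the constructor listed above.
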
protected RewardFunction objectiveRF
protected DifferentiableVInit diffVInit
protected int dim
public DiffVFRF(RewardFunction objectiveRF, DifferentiableVInit diffVinit)
public FunctionGradient gradient(State s, GroundedAction a, State sprime)
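Specified by: gradient in interface DifferentiableRF

public int numParameters()
Description copied from interface: ParametricFunction
Returns the number of parameters defining this function.
Specified by: numParameters in interface ParametricFunction

public double getParameter(int i)
Description copied from interface: ParametricFunction
Returns the value of the ith parameter.
Specified by: getParameter in interface ParametricFunction
Parameters:
i - the parameter index

public void setParameter(int i, double p)
Description copied from interface: ParametricFunction
Sets the value of the ith parameter to the given value.
Specified by: setParameter in interface ParametricFunction
Parameters:
i - the index of the parameter to set
p - the parameter value to which it should be set

public void resetParameters()
Description copied from interface: ParametricFunction
Resets the parameters of this function to default values.
Specified by: resetParameters in interface ParametricFunction

public ParametricFunction copy()
Description copied from interface: ParametricFunction
Returns a copy of this ParametricFunction.
Specified by: copy in interface ParametricFunction

public double reward(State s, GroundedAction a, State sprime)
Description copied from interface: RewardFunction
Returns the reward received when action a is executed in state s and the agent transitions to state sprime.
Specified by: reward in interface RewardFunction
Parameters:
s - the state in which the action was executed
a - the action executed
sprime - the state to which the agent transitioned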