public class DiffVFRF
extends DifferentiableRF

A differentiable reward function for use with MLIRL when the true reward function is known, but the value function initialization for leaf nodes is to be learned. This class takes as input the true reward function and a DifferentiableVInit object to form the DifferentiableRF object that MLIRL will use.

**Field Summary**

Modifier and Type | Field and Description
---|---
`protected DifferentiableVInit.ParamedDiffVInit` | `diffVInit`
`protected RewardFunction` | `objectiveRF`

Fields inherited from class DifferentiableRF: `dim`, `parameters`
**Constructor Summary**

Constructor and Description
---
`DiffVFRF(RewardFunction objectiveRF, DifferentiableVInit.ParamedDiffVInit diffVinit)`
**Method Summary**

Modifier and Type | Method and Description
---|---
`protected DifferentiableRF` | `copyHelper()` A helper method for making a copy of this reward function.
`double[]` | `getGradient(State s, GroundedAction ga, State sp)` Returns the gradient of the reward function for the given state transition.
`double` | `reward(State s, GroundedAction a, State sprime)` Returns the reward received when action a is executed in state s and the agent transitions to state sprime.
`void` | `setParameters(double[] parameters)`

Methods inherited from class DifferentiableRF: `copy`, `getParameterDimension`, `getParameters`, `randomizeParameters`, `setParameter`, `toString`
**Field Detail**

`protected RewardFunction objectiveRF`

`protected DifferentiableVInit.ParamedDiffVInit diffVInit`

**Constructor Detail**

`public DiffVFRF(RewardFunction objectiveRF, DifferentiableVInit.ParamedDiffVInit diffVinit)`
**Method Detail**

`public double[] getGradient(State s, GroundedAction ga, State sp)`

Returns the gradient of the reward function for the given state transition.

Specified by: `getGradient` in class `DifferentiableRF`

Parameters:
- `s` - the source state
- `ga` - the action taken in the source state
- `sp` - the resulting state from the action

`protected DifferentiableRF copyHelper()`

A helper method for making a copy of this reward function; the copy's parameters are then set by the `DifferentiableRF.copy()` method.

Specified by: `copyHelper` in class `DifferentiableRF`
`public double reward(State s, GroundedAction a, State sprime)`

Returns the reward received when action a is executed in state s and the agent transitions to state sprime.

Specified by: `reward` in interface `RewardFunction`

Parameters:
- `s` - the state in which the action was executed
- `a` - the action executed
- `sprime` - the state to which the agent transitioned

`public void setParameters(double[] parameters)`

Overrides: `setParameters` in class `DifferentiableRF`
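To illustrate the composition this class performs, here is a simplified, self-contained sketch of the same pattern: the reward is delegated to a fixed objective reward function, while the learnable parameters (and hence the gradient) belong only to the value-function initialization. This is *not* BURLAP code; the `State` type, the linear value initialization, and the gradient behavior below are stubbed assumptions for illustration, with names chosen to mirror the real API.

```java
import java.util.Arrays;

public class DiffVFRFSketch {

    // Stub state: identified by a single feature index (hypothetical, for illustration)
    static final class State { final int feature; State(int f) { this.feature = f; } }

    // The known, fixed reward function (plays the role of objectiveRF)
    interface RewardFunction { double reward(State s, State sprime); }

    // A linear, differentiable value-function initialization (plays the role of
    // DifferentiableVInit.ParamedDiffVInit): V(s) = parameters[s.feature]
    static final class LinearVInit {
        double[] parameters;
        LinearVInit(int dim) { this.parameters = new double[dim]; }
        // Gradient of V(s) with respect to the parameters: a one-hot vector
        double[] gradient(State s) {
            double[] g = new double[parameters.length];
            g[s.feature] = 1.0;
            return g;
        }
    }

    // The DiffVFRF analog: reward is delegated to the fixed objective reward
    // function, which contributes nothing to the gradient; the gradient with
    // respect to the learnable parameters comes from the value initialization.
    static final class DiffVFRF {
        final RewardFunction objectiveRF;
        final LinearVInit diffVInit;
        DiffVFRF(RewardFunction objectiveRF, LinearVInit diffVInit) {
            this.objectiveRF = objectiveRF;
            this.diffVInit = diffVInit;
        }
        double reward(State s, State sprime) { return objectiveRF.reward(s, sprime); }
        double[] getGradient(State s, State sprime) { return diffVInit.gradient(sprime); }
        void setParameters(double[] p) { diffVInit.parameters = p; }
    }

    public static void main(String[] args) {
        // A goal-based objective reward: 1 when reaching feature 2, else 0
        RewardFunction goalReward = (s, sp) -> sp.feature == 2 ? 1.0 : 0.0;
        DiffVFRF rf = new DiffVFRF(goalReward, new LinearVInit(3));
        rf.setParameters(new double[]{0.5, -0.1, 0.0});

        State s = new State(0), sp = new State(2);
        System.out.println(rf.reward(s, sp));                       // fixed objective reward
        System.out.println(Arrays.toString(rf.getGradient(s, sp))); // gradient w.r.t. vinit params
    }
}
```

The key design point mirrored here is the separation of concerns: `setParameters` only touches the value-initialization parameters, so gradient-based MLIRL updates adjust the leaf-node value estimates while the objective reward stays fixed.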