LinearDiffRFVInit

java.lang.Object
- burlap.behavior.singleagent.learnbydemo.mlirl.support.DifferentiableRF
- - burlap.behavior.singleagent.learnbydemo.mlirl.differentiableplanners.diffvinit.LinearDiffRFVInit

All Implemented Interfaces:

DifferentiableVInit, ValueFunction, ValueFunctionInitialization, RewardFunction
```
public class LinearDiffRFVInit
extends DifferentiableRF
implements DifferentiableVInit
```
A class for creating a DifferentiableRF and a DifferentiableVInit when the reward function and value function initialization are linear functions over some set of features. The total parameter dimensionality will be the sum of the reward function feature dimension and value function initialization feature dimension.
This class is useful when learning both a reward function and the shaping values at the leaf nodes of a finite horizon planner.
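As a minimal sketch of the dimensionality statement above, the combined parameters can be pictured as one array holding the reward function weights and the value function initialization weights. The ordering used here (reward weights first) and the names `ParameterLayoutSketch`, `totalDim`, `rewardWeights`, and `vinitWeights` are illustrative assumptions, not part of the BURLAP API.

```java
import java.util.Arrays;

// Hypothetical helper, not a BURLAP class: illustrates one possible parameter layout.
final class ParameterLayoutSketch {

    // Total parameter dimensionality: reward dimension plus value-initialization dimension.
    static int totalDim(int rfDim, int vinitDim) {
        return rfDim + vinitDim;
    }

    // Assumed layout: the first rfDim entries are the reward function weights.
    static double[] rewardWeights(double[] params, int rfDim) {
        return Arrays.copyOfRange(params, 0, rfDim);
    }

    // Assumed layout: the next vinitDim entries are the value-initialization weights.
    static double[] vinitWeights(double[] params, int rfDim, int vinitDim) {
        return Arrays.copyOfRange(params, rfDim, rfDim + vinitDim);
    }
}
```

With `rfDim = 3` and `vinitDim = 2`, the combined array has five entries, and the two helpers return its first three and last two entries respectively.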

Author:

James MacGlashan.

Nested Class Summary
- Nested classes/interfaces inherited from interface burlap.behavior.singleagent.learnbydemo.mlirl.differentiableplanners.diffvinit.DifferentiableVInit
  DifferentiableVInit.ParamedDiffVInit
- Nested classes/interfaces inherited from interface burlap.behavior.singleagent.ValueFunctionInitialization
  ValueFunctionInitialization.ConstantValueFunctionInitialization

Field Summary

Fields
Modifier and Type	Field and Description
`protected int`	`rfDim` The dimensionality of the reward function parameters.
`protected boolean`	`rfFeaturesAreForNextState` Whether the reward function features are based on the next state or the previous state.
`protected StateToFeatureVectorGenerator`	`rfFvGen` The state feature vector generator for the reward function.
`protected int`	`vinitDim` The dimensionality of the value function initialization parameters.
`protected StateToFeatureVectorGenerator`	`vinitFvGen` The state feature vector generator for the value function initialization.

Fields inherited from class burlap.behavior.singleagent.learnbydemo.mlirl.support.DifferentiableRF
dim, parameters

Constructor Summary

Constructors
Constructor and Description
`LinearDiffRFVInit(StateToFeatureVectorGenerator rfFvGen, StateToFeatureVectorGenerator vinitFvGen, int rfDim, int vinitDim)` Initializes a linear reward function and a linear value function initialization, each defined by a feature vector generator and its feature dimensionality.
`LinearDiffRFVInit(StateToFeatureVectorGenerator rfFvGen, StateToFeatureVectorGenerator vinitFvGen, int rfDim, int vinitDim, boolean rfFeaturesAreForNextState)` Initializes a linear reward function and a linear value function initialization, each defined by a feature vector generator and its feature dimensionality, and sets whether the reward function features are evaluated on the next state or the previous state of the transition.

Method Summary

Methods
Modifier and Type	Method and Description
`protected DifferentiableRF`	`copyHelper()` A helper method for making a copy of this reward function.
`double[]`	`getGradient(State s, GroundedAction ga, State sp)` Returns the gradient of the reward function for the given state transition.
`double[]`	`getQGradient(State s, AbstractGroundedAction ga)` Returns the Q-value function gradient.
`int`	`getRfDim()`
`StateToFeatureVectorGenerator`	`getRfFvGen()`
`double[]`	`getVGradient(State s)` Returns the value function gradient.
`int`	`getVinitDim()`
`StateToFeatureVectorGenerator`	`getVinitFvGen()`
`boolean`	`isRfFeaturesAreForNextState()` Returns whether the reward function state features are evaluated on the next state of the transition (s' of R(s,a,s')) or the previous state of the transition (s of R(s,a,s'))
`double`	`qValue(State s, AbstractGroundedAction a)` Returns the initialization value of the Q-value function for a given state and action pair.
`double`	`reward(State s, GroundedAction a, State sprime)` Returns the reward received when action a is executed in state s and the agent transitions to state sprime.
`void`	`setRfDim(int rfDim)`
`void`	`setRfFeaturesAreForNextState(boolean rfFeaturesAreForNextState)`
`void`	`setRfFvGen(StateToFeatureVectorGenerator rfFvGen)`
`void`	`setVinitDim(int vinitDim)`
`void`	`setVinitFvGen(StateToFeatureVectorGenerator vinitFvGen)`
`double`	`value(State s)` Returns the value function evaluation of the given state.

Methods inherited from class burlap.behavior.singleagent.learnbydemo.mlirl.support.DifferentiableRF
copy, getParameterDimension, getParameters, randomizeParameters, setParameter, setParameters, toString

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

- Field Detail
  - rfFeaturesAreForNextState
```
protected boolean rfFeaturesAreForNextState
```
    Whether the reward function features are based on the next state or the previous state of the transition. The default is the next state (true).
  - rfFvGen
```
protected StateToFeatureVectorGenerator rfFvGen
```
    The state feature vector generator for the reward function.
  - vinitFvGen
```
protected StateToFeatureVectorGenerator vinitFvGen
```
    The state feature vector generator for the value function initialization.
  - rfDim
```
protected int rfDim
```
    The dimensionality of the reward function parameters
  - vinitDim
```
protected int vinitDim
```
    The dimensionality of the value function initialization parameters
- Constructor Detail
  - LinearDiffRFVInit
```
public LinearDiffRFVInit(StateToFeatureVectorGenerator rfFvGen,
                 StateToFeatureVectorGenerator vinitFvGen,
                 int rfDim,
                 int vinitDim)
```
    Initializes a linear reward function and a linear value function initialization, each defined by a feature vector generator and its feature dimensionality.
    
    Parameters:
    rfFvGen - the reward function feature vector generator
    vinitFvGen - the value function initialization feature vector generator
    rfDim - the reward function feature/parameter dimensionality
    vinitDim - the value function initialization feature/parameter dimensionality
  - LinearDiffRFVInit
```
public LinearDiffRFVInit(StateToFeatureVectorGenerator rfFvGen,
                 StateToFeatureVectorGenerator vinitFvGen,
                 int rfDim,
                 int vinitDim,
                 boolean rfFeaturesAreForNextState)
```
    Initializes a linear reward function and a linear value function initialization, each defined by a feature vector generator and its feature dimensionality, and sets whether the reward function features are evaluated on the next state or the previous state of the transition.
    
    Parameters:
    rfFvGen - the reward function feature vector generator
    vinitFvGen - the value function initialization feature vector generator
    rfDim - the reward function feature/parameter dimensionality
    vinitDim - the value function initialization feature/parameter dimensionality
    rfFeaturesAreForNextState - if true, the reward function features are evaluated on the next state of the transition; if false, on the previous state of the transition.
- Method Detail
  - isRfFeaturesAreForNextState
```
public boolean isRfFeaturesAreForNextState()
```
    Returns whether the reward function state features are evaluated on the next state of the transition (s' of R(s,a,s')) or the previous state of the transition (s of R(s,a,s'))
    
    Returns:
    True if features are evaluated on the next state; false if they are evaluated on the previous state.
  - setRfFeaturesAreForNextState
```
public void setRfFeaturesAreForNextState(boolean rfFeaturesAreForNextState)
```
  - getRfFvGen
```
public StateToFeatureVectorGenerator getRfFvGen()
```
  - setRfFvGen
```
public void setRfFvGen(StateToFeatureVectorGenerator rfFvGen)
```
  - getVinitFvGen
```
public StateToFeatureVectorGenerator getVinitFvGen()
```
  - setVinitFvGen
```
public void setVinitFvGen(StateToFeatureVectorGenerator vinitFvGen)
```
  - getRfDim
```
public int getRfDim()
```
  - setRfDim
```
public void setRfDim(int rfDim)
```
  - getVinitDim
```
public int getVinitDim()
```
  - setVinitDim
```
public void setVinitDim(int vinitDim)
```
  - getGradient
```
public double[] getGradient(State s,
                   GroundedAction ga,
                   State sp)
```
    Description copied from class: DifferentiableRF
    
    Returns the gradient of the reward function for the given state transition.
    
    Specified by:
    
    getGradient in class DifferentiableRF
    
    Parameters:
    s - the source state
    ga - the action taken in the source state
    sp - the resulting state from the action
    
    Returns:
    the gradient of the reward function for the given transition.
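    For a reward that is linear in its weights, the gradient with respect to those weights is the feature vector itself. The sketch below illustrates this under stated assumptions: the gradient spans the full parameter vector, the reward weights occupy the leading entries, and the value-initialization entries are zero because the reward does not depend on them. The plain `double[]` feature arrays stand in for the output of the reward feature vector generator, and `RewardGradientSketch` is a hypothetical name, not a BURLAP class.

```java
// Hypothetical helper, not a BURLAP class: gradient of a linear reward
// with respect to a combined [reward weights | vinit weights] parameter vector.
final class RewardGradientSketch {

    static double[] gradient(double[] phiS, double[] phiSp, int vinitDim, boolean forNextState) {
        // Pick the features of s' or s, mirroring the rfFeaturesAreForNextState flag.
        double[] phi = forNextState ? phiSp : phiS;

        // New arrays are zero-filled, so the vinit block of the gradient stays 0.
        double[] grad = new double[phi.length + vinitDim];

        // d(w . phi)/dw_i = phi_i for each reward weight (assumed to come first).
        System.arraycopy(phi, 0, grad, 0, phi.length);
        return grad;
    }
}
```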
  - copyHelper
```
protected DifferentiableRF copyHelper()
```
    Description copied from class: DifferentiableRF
    
    A helper method for making a copy of this reward function. The parameters and dimensionality do not have to be copied, because they will be copied in the public DifferentiableRF.copy() method.
    
    Specified by:
    
    copyHelper in class DifferentiableRF
    
    Returns:
    a copy of this reward function.
  - reward
```
public double reward(State s,
            GroundedAction a,
            State sprime)
```
    Description copied from interface: RewardFunction
    
    Returns the reward received when action a is executed in state s and the agent transitions to state sprime.
    
    Specified by:
    
    reward in interface RewardFunction
    
    Parameters:
    s - the state in which the action was executed
    a - the action executed
    sprime - the state to which the agent transitioned
    
    Returns:
    the reward received when action a is executed in state s and the agent transitions to state sprime.
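    Because this class models the reward as a linear function of state features, the returned value can be pictured as a dot product between the reward weights and the features of either s or sprime. The following is a minimal sketch of that computation, assuming the reward weights are the leading entries of the parameter array; `LinearRewardSketch` and its arguments are hypothetical stand-ins rather than BURLAP code.

```java
// Hypothetical helper, not a BURLAP class: evaluates a linear reward R(s, a, s').
final class LinearRewardSketch {

    static double reward(double[] params, double[] phiS, double[] phiSp, boolean forNextState) {
        // Features come from s' when forNextState is true, otherwise from s.
        double[] phi = forNextState ? phiSp : phiS;

        // Dot product over the reward block only; any trailing
        // value-initialization weights in params are ignored.
        double r = 0.0;
        for (int i = 0; i < phi.length; i++) {
            r += params[i] * phi[i];
        }
        return r;
    }
}
```

    Note that the action does not appear in this sketch, since the features are described as state features only.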
  - getVGradient
```
public double[] getVGradient(State s)
```
    Description copied from interface: DifferentiableVInit
    
    Returns the value function gradient.
    
    Specified by:
    
    getVGradient in interface DifferentiableVInit
    
    Parameters:
    s - the state on which the value function is to be evaluated
    
    Returns:
    the value function gradient.
  - getQGradient
```
public double[] getQGradient(State s,
                    AbstractGroundedAction ga)
```
    Description copied from interface: DifferentiableVInit
    
    Returns the Q-value function gradient.
    
    Specified by:
    
    getQGradient in interface DifferentiableVInit
    
    Parameters:
    s - the state on which the Q-value is to be evaluated.
    ga - the action on which the Q-value is to be evaluated.
    
    Returns:
    the Q-value function gradient
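    For a linear value function initialization, the gradient with respect to its weights is the value-initialization feature vector. The sketch below assumes the gradient spans the full parameter vector with the value-initialization block placed after the reward block, and that the Q-value initialization does not depend on the action, so its gradient equals the V gradient. Both points are assumptions for illustration, and `VInitGradientSketch` is a hypothetical name.

```java
// Hypothetical helper, not a BURLAP class: gradients of a linear value
// initialization with respect to a combined [reward | vinit] parameter vector.
final class VInitGradientSketch {

    static double[] vGradient(double[] phiV, int rfDim) {
        // Zero-filled array: the reward block contributes nothing to V_init.
        double[] grad = new double[rfDim + phiV.length];

        // Place the vinit features after the reward block (assumed offset rfDim).
        System.arraycopy(phiV, 0, grad, rfDim, phiV.length);
        return grad;
    }

    // Assumption: the initial Q-value ignores the action, so the gradients match.
    static double[] qGradient(double[] phiV, int rfDim) {
        return vGradient(phiV, rfDim);
    }
}
```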
  - value
```
public double value(State s)
```
    Description copied from interface: ValueFunction
    
    Returns the value function evaluation of the given state. If the value is not stored, then the default value specified by the ValueFunctionInitialization object of this class is returned.
    
    Specified by:
    
    value in interface ValueFunction
    
    Parameters:
    s - the state to evaluate.
    
    Returns:
    the value function evaluation of the given state.
  - qValue
```
public double qValue(State s,
            AbstractGroundedAction a)
```
    Description copied from interface: ValueFunctionInitialization
    
    Returns the initialization value of the Q-value function for a given state and action pair.
    
    Specified by:
    
    qValue in interface ValueFunctionInitialization
    
    Parameters:
    s - the state for which to get the initial value of the Q-value function.
    a - the action for which to get the initial value of the Q-value function.
    
    Returns:
    the initialization value of the Q-value function for a given state and action pair.
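    The initialization values themselves can be sketched in the same way as the reward: a dot product between the value-initialization weights and the state features. This sketch assumes those weights start at offset `rfDim` in the parameter array and that the Q-value initialization simply reuses the state value regardless of the action. `LinearVInitSketch` is a hypothetical name used only for illustration.

```java
// Hypothetical helper, not a BURLAP class: evaluates a linear value initialization.
final class LinearVInitSketch {

    static double value(double[] params, int rfDim, double[] phiV) {
        // Dot product over the vinit block, assumed to start at index rfDim.
        double v = 0.0;
        for (int i = 0; i < phiV.length; i++) {
            v += params[rfDim + i] * phiV[i];
        }
        return v;
    }

    // Assumption: the initial Q-value for any action equals the initial state value.
    static double qValue(double[] params, int rfDim, double[] phiV) {
        return value(params, rfDim, phiV);
    }
}
```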
