public class LinearStateActionDifferentiableRF extends DifferentiableRF
The class takes as input a StateToFeatureVectorGenerator and the set of possible grounded actions that can be applied in the world. The dimensionality of this reward function is |A|*|f|, where A is the set of possible grounded actions and |f| is the state feature vector dimensionality.

The reward function is defined as R(s, a, s') = w(a) * f(s), where w(a) is the set of weights (the parameters) of this reward function associated with action a, * is the dot product operator, and f(s) is the feature vector for state s.

Note that the gradient is a vector of size |A||f|; since the feature vector is replicated for each action, the gradient entries associated with any action other than the one taken in the (s, a, s') query are zero.
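To make the |A|*|f| parameter layout concrete, here is a minimal standalone sketch (hypothetical names, not BURLAP source) of how a reward and its sparse gradient can be computed from a flat parameter vector in which the weights w(a) for action a occupy one contiguous block:

```java
// Standalone sketch of the |A|*|f| parameter layout; not actual BURLAP code.
public class LinearStateActionRFSketch {

    // Hypothetical layout assumption: w(a) occupies indices
    // [actionIndex*|f|, (actionIndex+1)*|f|) of the flat parameter vector.
    static double reward(double[] parameters, int actionIndex, double[] stateFeatures) {
        int offset = actionIndex * stateFeatures.length;
        double sum = 0.0;
        for (int i = 0; i < stateFeatures.length; i++) {
            sum += parameters[offset + i] * stateFeatures[i]; // w(a) . f(s)
        }
        return sum;
    }

    // Gradient w.r.t. all parameters: f(s) in the taken action's block, zero elsewhere.
    static double[] gradient(int numActions, int actionIndex, double[] stateFeatures) {
        double[] grad = new double[numActions * stateFeatures.length];
        int offset = actionIndex * stateFeatures.length;
        for (int i = 0; i < stateFeatures.length; i++) {
            grad[offset + i] = stateFeatures[i];
        }
        return grad;
    }
}
```

The gradient's sparsity follows directly from the layout: only the block belonging to the executed action depends on the parameters being differentiated.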
The set of possible grounded actions must be defined either in the LinearStateActionDifferentiableRF(burlap.behavior.singleagent.vfa.StateToFeatureVectorGenerator, int, burlap.oomdp.singleagent.GroundedAction...) constructor, or added iteratively with the addAction(burlap.oomdp.singleagent.GroundedAction) method.

| Modifier and Type | Field and Description |
|---|---|
| protected java.util.Map<GroundedAction,java.lang.Integer> | actionMap: an ordering of grounded actions |
| protected StateToFeatureVectorGenerator | fvGen: the state feature vector generator to use |
| protected int | numStateFeatures: the number of state features |

Fields inherited from class DifferentiableRF: dim, parameters
| Constructor and Description |
|---|
| LinearStateActionDifferentiableRF(StateToFeatureVectorGenerator stateFeatures, int numStateFeatures, GroundedAction... allPossibleActions): Initializes. |
| Modifier and Type | Method and Description |
|---|---|
| void | addAction(GroundedAction ga): Adds a possible grounded action. |
| protected DifferentiableRF | copyHelper(): A helper method for making a copy of this reward function. |
| protected void | copyInto(double[] source, double[] target, int index): Copies the values of source into target, starting at target index position index. |
| double[] | getGradient(State s, GroundedAction ga, State sp): Returns the gradient of the reward function for the given state transition. |
| double | reward(State s, GroundedAction a, State sprime): Returns the reward received when action a is executed in state s and the agent transitions to state sprime. |

Methods inherited from class DifferentiableRF: copy, getParameterDimension, getParameters, randomizeParameters, setParameter, setParameters, toString
protected java.util.Map<GroundedAction,java.lang.Integer> actionMap
protected StateToFeatureVectorGenerator fvGen
protected int numStateFeatures
public LinearStateActionDifferentiableRF(StateToFeatureVectorGenerator stateFeatures, int numStateFeatures, GroundedAction... allPossibleActions)

Initializes. Possible grounded actions may alternatively be added with the addAction(burlap.oomdp.singleagent.GroundedAction) method.

Parameters:
stateFeatures - the state feature vector generator
numStateFeatures - the dimensionality of the state feature vector
allPossibleActions - the set of possible grounded actions

public void addAction(GroundedAction ga)

Adds a possible grounded action.

Parameters:
ga - the possible grounded action to add to this reward function's definition

protected DifferentiableRF copyHelper()

A helper method for making a copy of this DifferentiableRF, used by the DifferentiableRF.copy() method.

Specified by: copyHelper in class DifferentiableRF
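The copyHelper/copy split is a template-method pattern: the base class's copy() obtains a fresh instance from the subclass's copyHelper() and then duplicates the parameter values itself. A rough standalone sketch of that contract (hypothetical names and fields; not BURLAP source):

```java
// Standalone sketch of the copyHelper/copy contract; hypothetical, not BURLAP source.
abstract class DifferentiableRFSketch {
    double[] parameters;

    // Subclasses return a new instance with everything but the parameters set up.
    protected abstract DifferentiableRFSketch copyHelper();

    // The base class finishes the copy by duplicating the parameter values.
    public DifferentiableRFSketch copy() {
        DifferentiableRFSketch c = this.copyHelper();
        c.parameters = this.parameters.clone();
        return c;
    }
}

class LinearRFSketch extends DifferentiableRFSketch {
    final int numParams;

    LinearRFSketch(int numParams) {
        this.numParams = numParams;
        this.parameters = new double[numParams];
    }

    @Override
    protected DifferentiableRFSketch copyHelper() {
        // Only structural setup here; copy() handles the parameter values.
        return new LinearRFSketch(this.numParams);
    }
}
```

This design lets each subclass worry only about reproducing its structure (feature generator, action ordering) while parameter duplication lives in one place.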
public double reward(State s, GroundedAction a, State sprime)

Returns the reward received when action a is executed in state s and the agent transitions to state sprime.

Specified by: reward in interface RewardFunction

Parameters:
s - the state in which the action was executed
a - the action executed
sprime - the state to which the agent transitioned

public double[] getGradient(State s, GroundedAction ga, State sp)

Returns the gradient of the reward function for the given state transition.

Specified by: getGradient in class DifferentiableRF

Parameters:
s - the source state
ga - the action taken in the source state
sp - the resulting state from the action

protected void copyInto(double[] source, double[] target, int index)

Copies the values of source into target, starting at target index position index.

Parameters:
source - the source values
target - the target array to receive the source values
index - the starting index in the target array into which the source values will be copied
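copyInto behaves like a bounded array copy into an offset, i.e. the same effect as System.arraycopy(source, 0, target, index, source.length). A sketch of the equivalent logic (hypothetical standalone class, not BURLAP source):

```java
// Sketch of copyInto's semantics: write all of source into target starting at index.
public class CopyIntoSketch {
    static void copyInto(double[] source, double[] target, int index) {
        for (int i = 0; i < source.length; i++) {
            target[index + i] = source[i];
        }
    }
}
```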