public class RewardValueProjection extends java.lang.Object implements QProvider

A QProvider/ValueFunction wrapper that provides the immediate reward signals for a source RewardFunction.
It is useful for analyzing a reward function learned through IRL, for example, by passing the learned reward function to a ValueFunctionVisualizerGUI to visualize what was learned. This class returns values based on one of four possible reward projection types (RewardValueProjection.RewardProjectionType):

SOURCESTATE: when the reward function depends only on the source state
DESTINATIONSTATE: when the reward function depends only on the destination state (the state to which the agent transitions)
STATEACTION: when the reward function depends only on the state-action pair
ONESTEP: when the reward function depends on a transition of some sort (e.g., from a source state to a target state)

The default assumption is DESTINATIONSTATE.
When the value(State) of a state is queried, it returns the value of the RewardFunction using the most minimal information. For example, if the projection type is DESTINATIONSTATE, then the value returned is rf.reward(null, null, s), where rf is the input RewardFunction and s is the input State to the value(State) method. If it is SOURCESTATE, then it returns rf.reward(s, null, null). If it is STATEACTION or ONESTEP, then the Domain must have been provided via the RewardValueProjection(RewardFunction, RewardProjectionType, SADomain) constructor so that the actions can be enumerated (and, in the case of ONESTEP, the transitions enumerated) and the maximum reward taken. Similarly, the qValue(State, Action) and qValues(State) methods may need the Domain provided to properly answer the query.
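To make the projection dispatch concrete, here is a minimal, self-contained Java sketch of the SOURCESTATE and DESTINATIONSTATE cases. It is not the BURLAP implementation: the class name, its String-based states, and the inlined RewardFunction interface are illustrative stand-ins, and the domain-dependent STATEACTION/ONESTEP cases are stubbed out.

```java
import java.util.Objects;

public class RewardProjectionSketch {

    /** Mirrors RewardValueProjection.RewardProjectionType. */
    enum ProjectionType { SOURCESTATE, DESTINATIONSTATE, STATEACTION, ONESTEP }

    /** Stand-in for BURLAP's RewardFunction; states and actions are plain strings here. */
    interface RewardFunction {
        double reward(String s, String a, String sPrime);
    }

    private final RewardFunction rf;
    private final ProjectionType type;

    RewardProjectionSketch(RewardFunction rf, ProjectionType type) {
        this.rf = Objects.requireNonNull(rf);
        this.type = type;
    }

    /** Evaluates the reward function with the most minimal information available. */
    double value(String s) {
        switch (this.type) {
            case SOURCESTATE:
                // reward depends only on the source state
                return this.rf.reward(s, null, null);
            case DESTINATIONSTATE:
                // reward depends only on the destination state
                return this.rf.reward(null, null, s);
            default:
                // STATEACTION and ONESTEP need a domain to enumerate
                // actions (and transitions) and take the max reward.
                throw new UnsupportedOperationException("requires a domain");
        }
    }

    public static void main(String[] args) {
        // Reward of 1 for arriving in the "goal" state, 0 otherwise.
        RewardFunction rf = (s, a, sPrime) -> "goal".equals(sPrime) ? 1.0 : 0.0;
        RewardProjectionSketch proj =
                new RewardProjectionSketch(rf, ProjectionType.DESTINATIONSTATE);
        System.out.println(proj.value("goal")); // prints 1.0
        System.out.println(proj.value("pit"));  // prints 0.0
    }
}
```

Under a DESTINATIONSTATE projection, querying a state's value amounts to asking "what reward would I receive for arriving here?", which is exactly what makes the wrapper useful for visualizing an IRL-learned reward function as if it were a value function.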
Modifier and Type | Class and Description
---|---
static class | RewardValueProjection.CustomRewardNoTermModel
static class | RewardValueProjection.RewardProjectionType

Nested classes/interfaces inherited from interface QProvider: QProvider.Helper
Modifier and Type | Field and Description
---|---
protected SADomain | domain
protected SparseSampling | oneStepBellmanPlanner
protected RewardValueProjection.RewardProjectionType | projectionType
protected RewardFunction | rf
Constructor and Description
---
RewardValueProjection(RewardFunction rf): Initializes for the given RewardFunction assuming that it depends only on the destination state.
RewardValueProjection(RewardFunction rf, RewardValueProjection.RewardProjectionType projectionType): Initializes.
RewardValueProjection(RewardFunction rf, RewardValueProjection.RewardProjectionType projectionType, SADomain domain): Initializes.
Modifier and Type | Method and Description
---|---
double | qValue(State s, Action a): Returns the QValue for the given state-action pair.
java.util.List<QValue> | qValues(State s): Returns a List of QValue objects for every permissible action for the given input state.
double | value(State s): Returns the value function evaluation of the given state.
protected RewardFunction rf
protected RewardValueProjection.RewardProjectionType projectionType
protected SparseSampling oneStepBellmanPlanner
protected SADomain domain
public RewardValueProjection(RewardFunction rf)
Initializes for the given RewardFunction assuming that it depends only on the destination state.
Parameters:
rf - the input RewardFunction to project for one step.

public RewardValueProjection(RewardFunction rf, RewardValueProjection.RewardProjectionType projectionType)
Initializes. If the projection type requires a Domain to enumerate the actions and transition dynamics, use the RewardValueProjection(RewardFunction, RewardProjectionType, SADomain) constructor instead.
Parameters:
rf - the input RewardFunction to project for one step.
projectionType - the type of reward projection to use.

public RewardValueProjection(RewardFunction rf, RewardValueProjection.RewardProjectionType projectionType, SADomain domain)
Initializes.
Parameters:
rf - the input RewardFunction to project for one step.
projectionType - the type of reward projection to use.
domain - the Domain in which the RewardFunction is evaluated.

public java.util.List<QValue> qValues(State s)
Specified by: qValues in interface QProvider
Returns a List of QValue objects for every permissible action for the given input state.

public double qValue(State s, Action a)
Specified by: qValue in interface QFunction
Returns the QValue for the given state-action pair.

public double value(State s)
Specified by: value in interface ValueFunction
Returns the value function evaluation of the given state.
Parameters:
s - the state to evaluate.
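The STATEACTION case described above (enumerate the actions and take the maximum reward) can be sketched as follows. Again, this is an illustrative stand-in rather than BURLAP code: actions are plain strings supplied as a list in place of a real SADomain, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class StateActionProjectionSketch {

    /** Stand-in for BURLAP's RewardFunction; states and actions are strings here. */
    interface RewardFunction {
        double reward(String s, String a, String sPrime);
    }

    private final RewardFunction rf;
    private final List<String> actions; // stand-in for the domain's enumerated actions

    StateActionProjectionSketch(RewardFunction rf, List<String> actions) {
        this.rf = rf;
        this.actions = actions;
    }

    /** STATEACTION projection: the Q-value is just the immediate reward r(s, a). */
    double qValue(String s, String a) {
        return this.rf.reward(s, a, null);
    }

    /** value(s) takes the maximum immediate reward over the enumerated actions. */
    double value(String s) {
        return this.actions.stream()
                .mapToDouble(a -> qValue(s, a))
                .max()
                .orElseThrow(IllegalStateException::new);
    }

    public static void main(String[] args) {
        // Reward 2 for "jump", 1 for any other action, regardless of state.
        RewardFunction rf = (s, a, sPrime) -> "jump".equals(a) ? 2.0 : 1.0;
        StateActionProjectionSketch proj =
                new StateActionProjectionSketch(rf, Arrays.asList("walk", "jump"));
        System.out.println(proj.qValue("s0", "walk")); // prints 1.0
        System.out.println(proj.value("s0"));          // prints 2.0
    }
}
```

This is why the two- and three-argument constructors differ: without a domain there is nothing to iterate over in `value`, which is the situation the documentation warns about for STATEACTION and ONESTEP projections.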