public class PotentialShapedRF extends ShapedRewardFunction
This reward function implements potential-based reward shaping [1], which requires specifying a potential function, PotentialFunction, and the discount being used by the MDP. The additive reward is defined as:
d * p(s') - p(s)
where d is the discount factor, s' is the most recent state, s is the previous state, and p(s) is the potential of state s.
1. Ng, Andrew Y., Daishi Harada, and Stuart Russell. "Policy invariance under reward transformations: Theory and application to reward shaping." ICML. 1999.
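For concreteness, here is a minimal standalone sketch of the additive term with made-up numbers; the discount and the two potential values are assumptions chosen purely for illustration, not values taken from the library:

```java
public class ShapingTermExample {
    public static void main(String[] args) {
        double d = 0.99;      // assumed MDP discount factor
        double pS = 2.0;      // assumed potential p(s) of the previous state
        double pSprime = 5.0; // assumed potential p(s') of the most recent state

        // Additive shaping reward: d * p(s') - p(s) = 0.99 * 5.0 - 2.0, approximately 2.95
        double additive = d * pSprime - pS;
        System.out.println(additive);
    }
}
```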
| Modifier and Type | Field and Description |
|---|---|
| protected double | discount - The discount factor of the MDP (required for this shaping to preserve policy optimality). |
| protected PotentialFunction | potentialFunction - The potential function that can be used to return the potential of input states. |
Fields inherited from class ShapedRewardFunction: baseRF

| Constructor and Description |
|---|
| PotentialShapedRF(RewardFunction baseRF, PotentialFunction potentialFunction, double discount) - Initializes the shaping with the objective reward function, the potential function, and the discount of the MDP. |
| Modifier and Type | Method and Description |
|---|---|
| double | additiveReward(State s, GroundedAction a, State sprime) - Returns the reward value to add to the base objective reward function. |

Methods inherited from class ShapedRewardFunction: reward
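Putting the constructor and additiveReward summaries together, the sketch below wraps a task's objective reward function with potential-based shaping. The commented import paths, the distanceToGoal helper, the potentialValue accessor name, and the 0.99 discount are assumptions made for illustration rather than details taken from this page:

```java
// Import paths are assumptions and vary across BURLAP releases, e.g.:
// import burlap.behavior.singleagent.shaping.potential.PotentialFunction;
// import burlap.behavior.singleagent.shaping.potential.PotentialShapedRF;

public class ShapingSetupSketch {

    // Hypothetical domain-specific helper: distance from a state to the goal.
    static double distanceToGoal(State s) {
        return 0.0; // replace with a real measure for your domain
    }

    public static PotentialShapedRF buildShapedRF(RewardFunction baseRF) {
        // Potential function: states closer to the goal get higher potential.
        PotentialFunction goalPotential = new PotentialFunction() {
            @Override
            public double potentialValue(State s) { // accessor name is assumed
                return -distanceToGoal(s);
            }
        };
        // Wrap the objective reward function with the potential-based shaping term.
        return new PotentialShapedRF(baseRF, goalPotential, 0.99);
    }
}
```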
protected PotentialFunction potentialFunction
protected double discount
public PotentialShapedRF(RewardFunction baseRF, PotentialFunction potentialFunction, double discount)
Parameters:
baseRF - the objective task reward function.
potentialFunction - the potential function to use.
discount - the discount factor of the MDP.

public double additiveReward(State s, GroundedAction a, State sprime)
Specified by:
additiveReward in class ShapedRewardFunction
Parameters:
s - the previous state
a - the action taken in the previous state
sprime - the successor state
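Given the formula in the class description and the parameter documentation above, a plausible reading of additiveReward in code is shown below. The interfaces are simplified stand-ins for the documented State, GroundedAction, and PotentialFunction types (the real BURLAP definitions differ), and the potentialValue accessor name is an assumption:

```java
// Simplified stand-ins for the documented types; not the real BURLAP interfaces.
interface State { }
interface GroundedAction { }
interface PotentialFunction {
    double potentialValue(State s); // accessor name assumed for illustration
}

class PotentialShapingTerm {
    private final PotentialFunction potentialFunction;
    private final double discount;

    PotentialShapingTerm(PotentialFunction potentialFunction, double discount) {
        this.potentialFunction = potentialFunction;
        this.discount = discount;
    }

    // The reward value added to the base objective reward: d * p(s') - p(s).
    double additiveReward(State s, GroundedAction a, State sprime) {
        return discount * potentialFunction.potentialValue(sprime)
                - potentialFunction.potentialValue(s);
    }
}
```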