public class PotentialShapedRF extends ShapedRewardFunction
This class implements potential-based reward shaping [1], which preserves the optimal policy of the task. The shaping is defined with respect to a PotentialFunction and the discount being used by the MDP. The additive reward is defined as:

d * p(s') - p(s)

where d is the discount factor, s' is the most recent state, s is the previous state, and p(s) is the potential of state s.
1. Ng, Andrew Y., Daishi Harada, and Stuart Russell. "Policy invariance under reward transformations: Theory and application to reward shaping." ICML. 1999.
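To make the shaping term concrete, the following self-contained sketch computes d * p(s') - p(s) and combines it with an objective reward. The Potential and Reward interfaces below are simplified stand-ins for the PotentialFunction and RewardFunction types documented on this page, and the goal-at-state-10 example is purely illustrative.

```java
// Self-contained sketch of the shaping term d * p(s') - p(s) described above.
// Potential and Reward are simplified stand-ins, not the library's own interfaces.
public class PotentialShapingSketch {

    /** Stand-in for a potential function p(s), here over integer state ids. */
    interface Potential { double value(int state); }

    /** Stand-in for the objective (base) reward function R(s, a, s'). */
    interface Reward { double reward(int s, int a, int sprime); }

    /** The additive shaping term: d * p(s') - p(s). */
    static double additiveReward(Potential p, double discount, int s, int sprime) {
        return discount * p.value(sprime) - p.value(s);
    }

    /** The full shaped reward: objective reward plus the shaping term. */
    static double shapedReward(Reward base, Potential p, double discount,
                               int s, int a, int sprime) {
        return base.reward(s, a, sprime) + additiveReward(p, discount, s, sprime);
    }

    public static void main(String[] args) {
        // Illustrative potential: closer to a goal at state 10 means higher potential.
        Potential p = state -> -Math.abs(10 - state);
        // Sparse objective reward: +1 only when the goal state is reached.
        Reward base = (s, a, sprime) -> sprime == 10 ? 1.0 : 0.0;
        double discount = 0.99;

        // Moving from state 7 to state 8 earns a positive shaping bonus
        // (0.99 * -2 - (-3) = 1.02) even though the objective reward is still 0.
        System.out.println(shapedReward(base, p, discount, 7, 0, 8));
    }
}
```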
Field Summary

Modifier and Type | Field and Description
---|---
protected double | discount: The discount factor of the MDP (required for shaping to preserve policy optimality).
protected PotentialFunction | potentialFunction: The potential function that can be used to return the potential reward from input states.
Fields inherited from class ShapedRewardFunction: baseRF
Constructor Summary

Constructor and Description
---
PotentialShapedRF(RewardFunction baseRF, PotentialFunction potentialFunction, double discount): Initializes the shaping with the objective reward function, the potential function, and the discount of the MDP.
Method Summary

Modifier and Type | Method and Description
---|---
double | additiveReward(State s, Action a, State sprime): Returns the reward value to add to the base objective reward function.
Methods inherited from class ShapedRewardFunction: reward
Field Detail

protected PotentialFunction potentialFunction

protected double discount
Constructor Detail

public PotentialShapedRF(RewardFunction baseRF, PotentialFunction potentialFunction, double discount)
Parameters:
baseRF - the objective task reward function.
potentialFunction - the potential function to use.
discount - the discount factor of the MDP.
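As a usage illustration of this constructor, the hedged snippet below wires an objective reward function and a potential function into a PotentialShapedRF. The myObjectiveRF and myPotential variables, as well as s, a, and sprime, are hypothetical placeholders for user-supplied objects, and the call to the inherited reward(...) method assumes the (State, Action, State) signature shown elsewhere on this page.

```java
// Hypothetical construction sketch; myObjectiveRF and myPotential are placeholders
// for a user-supplied objective RewardFunction and PotentialFunction.
RewardFunction objectiveRF = myObjectiveRF;  // the sparse task reward
PotentialFunction potential = myPotential;   // e.g. a heuristic estimate of state value
double discount = 0.99;                      // must match the MDP's discount factor

PotentialShapedRF shapedRF = new PotentialShapedRF(objectiveRF, potential, discount);

// The inherited reward(s, a, sprime) then returns the objective reward plus
// the shaping term d * p(sprime) - p(s) computed by additiveReward.
double r = shapedRF.reward(s, a, sprime);
```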
Method Detail

public double additiveReward(State s, Action a, State sprime)

Description copied from class: ShapedRewardFunction
Returns the reward value to add to the base objective reward function.

Specified by:
additiveReward in class ShapedRewardFunction
Parameters:
s - the previous state
a - the action taken in the previous state
sprime - the successor state