ApprenticeshipLearning

java.lang.Object
- burlap.behavior.singleagent.learnfromdemo.apprenticeship.ApprenticeshipLearning

```
public class ApprenticeshipLearning
extends java.lang.Object
```
This algorithm takes expert trajectories and returns a policy that models them. It is an implementation of the algorithm described by Abbeel and Ng [1]. Both the projection method and the quadratic programming version are available.

1. Abbeel, Peter and Ng, Andrew. "Apprenticeship Learning via Inverse Reinforcement Learning"

Author:

Stephen Brawner and Mark Ho; modified by James MacGlashan

Nested Class Summary

Nested Classes
Modifier and Type	Class and Description
`static class`	`ApprenticeshipLearning.StationaryRandomDistributionPolicy` This class extends Policy.

Field Summary

Fields
Modifier and Type	Field and Description
`static int`	`DEBUG_CODE_RF_WEIGHTS`
`static int`	`DEBUG_CODE_SCORE`

Method Summary

All Methods Static Methods Concrete Methods
Modifier and Type	Method and Description
`static double[]`	`estimateFeatureExpectation(Episode episode, DenseStateFeatures featureFunctions, java.lang.Double gamma)` Calculates the Feature Expectations given one demonstration, a feature mapping and a discount factor gamma
`static double[]`	`estimateFeatureExpectation(java.util.List<Episode> episodes, DenseStateFeatures featureFunctions, java.lang.Double gamma)` Calculates the Feature Expectations given a list of demonstrations, a feature mapping and a discount factor gamma
`static RewardFunction`	`generateRewardFunction(DenseStateFeatures featureFunctions, burlap.behavior.singleagent.learnfromdemo.apprenticeship.ApprenticeshipLearning.FeatureWeights featureWeights)` Generates an anonymous instance of a reward function derived from a feature mapping and associated feature weights. Computes (w^(i))^T phi from step 4 in section 3 of [1].
`static State`	`getInitialState(java.util.List<Episode> episodes)` Returns the initial state of a randomly chosen episode.
`static Policy`	`getLearnedPolicy(ApprenticeshipLearningRequest request)` Computes a policy that models the expert trajectories included in the request object.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - DEBUG_CODE_SCORE
```
public static final int DEBUG_CODE_SCORE
```
    See Also:
    
    Constant Field Values
  - DEBUG_CODE_RF_WEIGHTS
```
public static final int DEBUG_CODE_RF_WEIGHTS
```
    See Also:
    
    Constant Field Values
- Method Detail
  - estimateFeatureExpectation
```
public static double[] estimateFeatureExpectation(Episode episode,
                                                  DenseStateFeatures featureFunctions,
                                                  java.lang.Double gamma)
```
    Calculates the Feature Expectations given one demonstration, a feature mapping and a discount factor gamma
    
    Parameters:
    
    episode - An Episode object that contains a sequence of state-action pairs
    
    featureFunctions - Feature Mapping which maps states to features
    
    gamma - Discount factor gamma
    
    Returns:
    
    The estimated feature expectations (a double array whose length matches the number of features produced by featureFunctions)
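    
    As a rough illustration of the quantity described above, the following sketch computes the discounted feature sum for one demonstration, that is, the sum over time steps t of gamma^t * phi(s_t). It does not use BURLAP types: the class name `FeatureExpectationSketch`, the method `discountedFeatureSum`, and the representation of an episode as an array of precomputed feature vectors are all assumptions made for the example.

```java
// Hypothetical stand-in, not BURLAP code: each state is represented
// directly by its feature vector phi(s_t).
final class FeatureExpectationSketch {

    /**
     * Discounted sum of per-step feature vectors: sum over t of gamma^t * phi(s_t).
     * featuresPerStep[t] plays the role of phi(s_t) for the t-th state of one episode.
     */
    static double[] discountedFeatureSum(double[][] featuresPerStep, double gamma) {
        if (featuresPerStep.length == 0) {
            return new double[0];
        }
        double[] expectation = new double[featuresPerStep[0].length];
        double discount = 1.0; // gamma^0 for the first state
        for (double[] phi : featuresPerStep) {
            for (int j = 0; j < expectation.length; j++) {
                expectation[j] += discount * phi[j];
            }
            discount *= gamma; // advance to gamma^(t+1)
        }
        return expectation;
    }
}
```

    The returned array has one entry per feature, which corresponds to the length constraint stated in the return description.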
  - estimateFeatureExpectation
```
public static double[] estimateFeatureExpectation(java.util.List<Episode> episodes,
                                                  DenseStateFeatures featureFunctions,
                                                  java.lang.Double gamma)
```
    Calculates the Feature Expectations given a list of demonstrations, a feature mapping and a discount factor gamma
    
    Parameters:
    
    episodes - List of expert demonstrations as Episode objects
    
    featureFunctions - Feature Mapping which maps states to features
    
    gamma - Discount factor for future expected reward
    
    Returns:
    
    The estimated feature expectations (a double array whose length matches the number of features produced by featureFunctions)
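    
    For several demonstrations, the empirical estimate in Abbeel and Ng [1] averages the per-episode discounted feature sums. The sketch below shows that averaging under stated assumptions: it is not the BURLAP implementation, the names `AveragedFeatureExpectationSketch` and `average` are hypothetical, and each episode is modeled as an array of precomputed feature vectors rather than an Episode object.

```java
import java.util.List;

// Hypothetical stand-in, not BURLAP code.
final class AveragedFeatureExpectationSketch {

    /**
     * Mean over episodes of sum_t gamma^t * phi(s_t).
     * Each element of episodes is one demonstration; episode[t] is phi(s_t).
     */
    static double[] average(List<double[][]> episodes, int numFeatures, double gamma) {
        double[] mean = new double[numFeatures];
        if (episodes.isEmpty()) {
            return mean; // all zeros when there is nothing to average
        }
        for (double[][] episode : episodes) {
            double discount = 1.0; // restart the discount for each episode
            for (double[] phi : episode) {
                for (int j = 0; j < numFeatures; j++) {
                    mean[j] += discount * phi[j];
                }
                discount *= gamma;
            }
        }
        for (int j = 0; j < numFeatures; j++) {
            mean[j] /= episodes.size();
        }
        return mean;
    }
}
```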
  - generateRewardFunction
```
public static RewardFunction generateRewardFunction(DenseStateFeatures featureFunctions,
                                                    burlap.behavior.singleagent.learnfromdemo.apprenticeship.ApprenticeshipLearning.FeatureWeights featureWeights)
```
    Generates an anonymous instance of a reward function derived from a feature mapping and associated feature weights. Computes (w^(i))^T phi from step 4 in section 3 of [1].
    
    Parameters:
    
    featureFunctions - The feature mapping of states to features
    
    featureWeights - The weights given to each feature
    
    Returns:
    
    An anonymous instance of RewardFunction
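    
    The reward described here is linear in the features, r(s) = w^T phi(s). A minimal sketch of that idea, without BURLAP's RewardFunction or FeatureWeights types, might look like the following. The class `LinearRewardSketch`, the method `linearReward`, and the use of `ToDoubleFunction<double[]>` over precomputed feature vectors are assumptions for illustration only.

```java
import java.util.function.ToDoubleFunction;

// Hypothetical stand-in, not BURLAP code.
final class LinearRewardSketch {

    /** Returns a function that maps a feature vector phi to the reward w . phi. */
    static ToDoubleFunction<double[]> linearReward(double[] weights) {
        final double[] w = weights.clone(); // defensive copy so later edits do not change the reward
        return phi -> {
            if (phi.length != w.length) {
                throw new IllegalArgumentException("feature and weight lengths differ");
            }
            double reward = 0.0;
            for (int j = 0; j < w.length; j++) {
                reward += w[j] * phi[j];
            }
            return reward;
        };
    }
}
```

    Returning a lambda mirrors the "anonymous instance" wording above: the weights are captured once and the caller only sees a reward function.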
  - getInitialState
```
public static State getInitialState(java.util.List<Episode> episodes)
```
    Returns the initial state of a randomly chosen episode
    
    Parameters:
    
    episodes - the expert demonstrations
    
    Returns:
    
    a random episode's initial state
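    
    The selection described above can be sketched in a few lines. This is an illustration only: `InitialStateSketch` and `randomInitialState` are hypothetical names, an episode is modeled as a plain list of states, and the random source is passed in explicitly so the choice can be made reproducible.

```java
import java.util.List;
import java.util.Random;

// Hypothetical stand-in, not BURLAP code.
final class InitialStateSketch {

    /** Picks one episode uniformly at random and returns its first state. */
    static <S> S randomInitialState(List<List<S>> episodes, Random rng) {
        List<S> chosen = episodes.get(rng.nextInt(episodes.size()));
        return chosen.get(0);
    }
}
```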
  - getLearnedPolicy
```
public static Policy getLearnedPolicy(ApprenticeshipLearningRequest request)
```
    Computes a policy that models the expert trajectories included in the request object.
    
    Parameters:
    
    request - the IRL problem description
    
    Returns:
    
    the computed Policy
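    
    To give a sense of what the projection method computes on each iteration, the sketch below follows the projection update from Abbeel and Ng [1]: the running estimate muBar is moved toward the latest policy's feature expectations by orthogonally projecting the expert's feature expectations onto the line through the two, and the new weights are the difference between the expert's feature expectations and muBar. It is not taken from the BURLAP source; the class and method names are hypothetical, and the planning step that turns weights into a policy is omitted.

```java
// Hypothetical stand-in, not BURLAP code.
final class ProjectionStepSketch {

    /**
     * One projection update. Returns
     * muBarPrev + ((d . (muExpert - muBarPrev)) / (d . d)) * d, where d = muLatest - muBarPrev.
     */
    static double[] project(double[] muBarPrev, double[] muLatest, double[] muExpert) {
        int n = muExpert.length;
        double numerator = 0.0;
        double denominator = 0.0;
        for (int j = 0; j < n; j++) {
            double d = muLatest[j] - muBarPrev[j];
            numerator += d * (muExpert[j] - muBarPrev[j]);
            denominator += d * d;
        }
        if (denominator == 0.0) {
            return muBarPrev.clone(); // latest policy adds no new direction
        }
        double step = numerator / denominator;
        double[] muBar = new double[n];
        for (int j = 0; j < n; j++) {
            muBar[j] = muBarPrev[j] + step * (muLatest[j] - muBarPrev[j]);
        }
        return muBar;
    }

    /** Reward weights for the next iteration: w = muExpert - muBar. */
    static double[] weights(double[] muExpert, double[] muBar) {
        double[] w = new double[muExpert.length];
        for (int j = 0; j < w.length; j++) {
            w[j] = muExpert[j] - muBar[j];
        }
        return w;
    }

    /** Euclidean norm of w, used in the paper as the termination measure t. */
    static double margin(double[] w) {
        double sumSquares = 0.0;
        for (double x : w) {
            sumSquares += x * x;
        }
        return Math.sqrt(sumSquares);
    }
}
```

    In the paper, iteration stops once the margin falls below a chosen threshold; how the request object configures that threshold is not described in this page.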
