ApprenticeshipLearning

java.lang.Object
- burlap.behavior.singleagent.learnbydemo.apprenticeship.ApprenticeshipLearning

```
public class ApprenticeshipLearning
extends java.lang.Object
```
This algorithm takes expert trajectories and returns a policy that models them. It is an implementation of the algorithm described by Abbeel and Ng [1]. Both the projection method and the quadratic programming version are available.

1. Abbeel, Peter and Ng, Andrew. "Apprenticeship Learning via Inverse Reinforcement Learning"

Author:

Stephen Brawner and Mark Ho; modified by James MacGlashan

Nested Class Summary

Nested Classes
Modifier and Type	Class and Description
`static class`	`ApprenticeshipLearning.RandomPolicy` This class extends Policy; it simply creates a randomly generated distribution over actions for all possible states.

Field Summary

Fields
Modifier and Type	Field and Description
`static int`	`debugCodeRFWeights`
`static int`	`debugCodeScore`
`static int`	`FEATURE_EXPECTATION_SAMPLES`

Constructor Summary

Constructors
Constructor and Description
`ApprenticeshipLearning()`

Method Summary

Methods
Modifier and Type	Method and Description
`static double[]`	`estimateFeatureExpectation(EpisodeAnalysis episodeAnalysis, StateToFeatureVectorGenerator featureFunctions, java.lang.Double gamma)` Calculates the Feature Expectations given one demonstration, a feature mapping and a discount factor gamma
`static double[]`	`estimateFeatureExpectation(java.util.List<EpisodeAnalysis> episodes, StateToFeatureVectorGenerator featureFunctions, java.lang.Double gamma)` Calculates the Feature Expectations given a list of demonstrations, a feature mapping and a discount factor gamma
`static RewardFunction`	`generateRewardFunction(StateToFeatureVectorGenerator featureFunctions, burlap.behavior.singleagent.learnbydemo.apprenticeship.ApprenticeshipLearning.FeatureWeights featureWeights)` Generates an anonymous instance of a reward function derived from a FeatureMapping and associated feature weights. Computes (w^(i))^T phi from step 4 in Section 3 of [1].
`static State`	`getInitialState(java.util.List<EpisodeAnalysis> episodes)` Returns the initial state of a randomly chosen episode analysis
`static Policy`	`getLearnedPolicy(ApprenticeshipLearningRequest request)` Computes a policy that models the expert trajectories included in the request object.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - FEATURE_EXPECTATION_SAMPLES
```
public static final int FEATURE_EXPECTATION_SAMPLES
```
    See Also:
    Constant Field Values
  - debugCodeScore
```
public static final int debugCodeScore
```
    See Also:
    Constant Field Values
  - debugCodeRFWeights
```
public static final int debugCodeRFWeights
```
    See Also:
    Constant Field Values
- Constructor Detail
  - ApprenticeshipLearning
```
public ApprenticeshipLearning()
```
- Method Detail
  - estimateFeatureExpectation
```
public static double[] estimateFeatureExpectation(EpisodeAnalysis episodeAnalysis,
                                  StateToFeatureVectorGenerator featureFunctions,
                                  java.lang.Double gamma)
```
    Calculates the Feature Expectations given one demonstration, a feature mapping and a discount factor gamma
    
    Parameters:
    episodeAnalysis - An EpisodeAnalysis object that contains a sequence of state-action pairs
    featureFunctions - Feature Mapping which maps states to features
    gamma - Discount factor gamma
    
    Returns:
    The Feature Expectations generated (double array that matches the length of the featureMapping)
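
    The following is a minimal sketch of the quantity described above, assuming the standard definition from [1]: the discounted sum of feature vectors along one trajectory, sum over t of gamma^t * phi(s_t). It is not the BURLAP implementation. The class name `SingleEpisodeFeatureExpectation`, the use of `double[]` as a state, and `java.util.function.Function` as the feature mapping are illustrative stand-ins for `EpisodeAnalysis` and `StateToFeatureVectorGenerator`.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in types: a state is a plain double[] and the feature
// mapping is a Function, not BURLAP's StateToFeatureVectorGenerator.
class SingleEpisodeFeatureExpectation {

    /** Returns the sum over t of gamma^t * phi(s_t) for one trajectory. */
    static double[] estimate(List<double[]> states,
                             Function<double[], double[]> phi,
                             double gamma) {
        double[] mu = null;
        double discount = 1.0; // gamma^0 for the first state
        for (double[] s : states) {
            double[] f = phi.apply(s);
            if (mu == null) {
                mu = new double[f.length]; // one entry per feature
            }
            for (int i = 0; i < f.length; i++) {
                mu[i] += discount * f[i];
            }
            discount *= gamma; // advance to gamma^(t+1)
        }
        // An empty trajectory yields an empty vector in this sketch.
        return mu == null ? new double[0] : mu;
    }
}
```

    The returned array has one entry per feature, which corresponds to the length noted in the Returns description above.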
  - estimateFeatureExpectation
```
public static double[] estimateFeatureExpectation(java.util.List<EpisodeAnalysis> episodes,
                                  StateToFeatureVectorGenerator featureFunctions,
                                  java.lang.Double gamma)
```
    Calculates the Feature Expectations given a list of demonstrations, a feature mapping and a discount factor gamma
    
    Parameters:
    episodes - List of expert demonstrations as EpisodeAnalysis objects
    featureFunctions - Feature Mapping which maps states to features
    gamma - Discount factor for future expected reward
    
    Returns:
    The Feature Expectations generated (double array that matches the length of the featureMapping)
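
    For a list of demonstrations, [1] estimates the expert's feature expectations as the average of the per-trajectory discounted sums. The sketch below assumes that definition; whether BURLAP's method normalizes in exactly this way is an assumption. `AveragedFeatureExpectation` is a hypothetical name, and each episode is represented as a list of already-computed feature vectors phi(s_t) rather than as an `EpisodeAnalysis`.

```java
import java.util.List;

// Hypothetical stand-in: each episode is a list of precomputed feature
// vectors phi(s_0), phi(s_1), ... rather than a BURLAP EpisodeAnalysis.
class AveragedFeatureExpectation {

    /** Returns (1/m) * sum over episodes of sum over t of gamma^t * phi(s_t). */
    static double[] estimate(List<List<double[]>> episodes, double gamma) {
        double[] mu = new double[0];
        if (episodes.isEmpty()) {
            return mu; // nothing to average
        }
        for (List<double[]> episode : episodes) {
            double discount = 1.0; // restart discounting for every episode
            for (double[] f : episode) {
                if (mu.length == 0) {
                    mu = new double[f.length];
                }
                for (int i = 0; i < f.length; i++) {
                    mu[i] += discount * f[i];
                }
                discount *= gamma;
            }
        }
        for (int i = 0; i < mu.length; i++) {
            mu[i] /= episodes.size(); // average over the m demonstrations
        }
        return mu;
    }
}
```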
  - generateRewardFunction
```
public static RewardFunction generateRewardFunction(StateToFeatureVectorGenerator featureFunctions,
                                    burlap.behavior.singleagent.learnbydemo.apprenticeship.ApprenticeshipLearning.FeatureWeights featureWeights)
```
    Generates an anonymous instance of a reward function derived from a FeatureMapping and associated feature weights. Computes (w^(i))^T phi from step 4 in Section 3 of [1].
    
    Parameters:
    featureFunctions - The feature mapping of states to features
    featureWeights - The weights given to each feature
    
    Returns:
    An anonymous instance of RewardFunction
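
    A reward of the form (w^(i))^T phi(s) is a dot product between the weight vector and the state's feature vector. The sketch below illustrates only that computation; it does not use BURLAP's `RewardFunction` or `FeatureWeights` types. `LinearRewardSketch`, the `double[]` state, and the `ToDoubleFunction` return type are assumptions made for illustration.

```java
import java.util.function.Function;
import java.util.function.ToDoubleFunction;

// Hypothetical stand-in: the reward is returned as a ToDoubleFunction over
// double[] states instead of a BURLAP RewardFunction.
class LinearRewardSketch {

    /** Builds R(s) = w^T phi(s) for a fixed weight vector w. */
    static ToDoubleFunction<double[]> linearReward(Function<double[], double[]> phi,
                                                   double[] weights) {
        final double[] w = weights.clone(); // defensive copy of the weights
        return state -> {
            double[] f = phi.apply(state);
            if (f.length != w.length) {
                throw new IllegalArgumentException("feature/weight length mismatch");
            }
            double reward = 0.0;
            for (int i = 0; i < w.length; i++) {
                reward += w[i] * f[i];
            }
            return reward;
        };
    }
}
```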
  - getInitialState
```
public static State getInitialState(java.util.List<EpisodeAnalysis> episodes)
```
    Returns the initial state of a randomly chosen episode analysis
    
    Parameters:
    episodes - List of expert demonstrations as EpisodeAnalysis objects
    
    Returns:
    a random episode's initial state
  - getLearnedPolicy
```
public static Policy getLearnedPolicy(ApprenticeshipLearningRequest request)
```
    Computes a policy that models the expert trajectories included in the request object.
    
    Parameters:
    request - The ApprenticeshipLearningRequest containing the expert trajectories to model
    
    Returns:
    the computed Policy
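
    As a rough illustration of the projection method from [1], the sketch below implements a single projection update: the running estimate mu-bar is moved along the line toward the newest policy's feature expectations to the point closest to the expert's, and the next reward weights are w = mu_E - mu-bar with margin t = ||w||. This follows the update as given in the paper and is not taken from BURLAP's source; `ProjectionStepSketch` and its method names are hypothetical.

```java
// Hypothetical helper illustrating one iteration of the projection method;
// all vectors are feature expectations of equal length.
class ProjectionStepSketch {

    /** Orthogonally projects muExpert onto the line through muBarPrev and muLatest. */
    static double[] project(double[] muBarPrev, double[] muLatest, double[] muExpert) {
        double num = 0.0;
        double den = 0.0;
        for (int i = 0; i < muExpert.length; i++) {
            double d = muLatest[i] - muBarPrev[i];
            num += d * (muExpert[i] - muBarPrev[i]);
            den += d * d;
        }
        double[] muBar = muBarPrev.clone();
        if (den == 0.0) {
            return muBar; // newest policy adds no new direction
        }
        double scale = num / den;
        for (int i = 0; i < muBar.length; i++) {
            muBar[i] += scale * (muLatest[i] - muBarPrev[i]);
        }
        return muBar;
    }

    /** Next reward weights: w = muExpert - muBar. */
    static double[] weights(double[] muExpert, double[] muBar) {
        double[] w = new double[muExpert.length];
        for (int i = 0; i < w.length; i++) {
            w[i] = muExpert[i] - muBar[i];
        }
        return w;
    }

    /** Euclidean norm, used as the margin t = ||w||. */
    static double norm(double[] v) {
        double sum = 0.0;
        for (double x : v) {
            sum += x * x;
        }
        return Math.sqrt(sum);
    }
}
```

    In [1], the loop terminates once the margin t falls below a chosen tolerance epsilon; how the request object exposes that tolerance is not covered here.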
