public class MLIRLRequest extends IRLRequest

A request object for Maximum-Likelihood Inverse Reinforcement Learning (MLIRL). This request adds a set of optionally specified weights on the expert trajectories, the DifferentiableRF to use, and the Boltzmann beta parameter used for differentiable planning. The larger the beta value, the more deterministic the expert trajectories are assumed to be. If no expert trajectory weights are provided, they will all be assumed to have a weight of 1. Calling the getEpisodeWeights() method when weights have not been specified will create and return a new double array with the value 1.0 everywhere, so changes to the returned array will not change the weights actually used. Instead, modify the weights using the setEpisodeWeights(double[]) method.

Modifier and Type | Field and Description
---|---
`protected double` | `boltzmannBeta` The parameter used in the Boltzmann policy that affects how noisy the expert is assumed to be.
`protected double[]` | `episodeWeights` The weight assigned to each episode.
`protected DifferentiableRF` | `rf` The differentiable reward function model that will be estimated by MLIRL.

Fields inherited from class IRLRequest: domain, expertEpisodes, gamma, planner
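The role of the Boltzmann beta parameter can be illustrated with a short sketch. The class below is a hypothetical stand-in, not part of the BURLAP API: it computes the Boltzmann (softmax) action distribution pi(a|s) proportional to exp(beta * Q(s,a)) that MLIRL-style methods assume the expert follows, showing that small beta models a noisy expert and large beta a near-deterministic one.

```java
// Hypothetical demo of Boltzmann action selection; not a BURLAP class.
public class BoltzmannBetaDemo {

    /** Returns Boltzmann (softmax) action probabilities for the given Q-values. */
    static double[] boltzmann(double[] qValues, double beta) {
        double max = Double.NEGATIVE_INFINITY;
        for (double q : qValues) max = Math.max(max, q); // subtract max for numerical stability
        double[] p = new double[qValues.length];
        double z = 0.;
        for (int i = 0; i < qValues.length; i++) {
            p[i] = Math.exp(beta * (qValues[i] - max));
            z += p[i];
        }
        for (int i = 0; i < p.length; i++) p[i] /= z;
        return p;
    }

    public static void main(String[] args) {
        double[] q = {1.0, 0.0}; // action 0 has the higher Q-value
        // Small beta: near-uniform choice (noisy expert), roughly 0.52 for action 0.
        System.out.println(boltzmann(q, 0.1)[0]);
        // Large beta: near-deterministic choice, close to 1.0 for action 0.
        System.out.println(boltzmann(q, 10.)[0]);
    }
}
```

With beta near zero every action is roughly equally likely, so expert demonstrations carry little information; as beta grows, the assumed expert almost always picks the highest-value action.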
Constructor and Description |
---|
MLIRLRequest(Domain domain,
java.util.List<EpisodeAnalysis> expertEpisodes,
DifferentiableRF rf,
StateHashFactory hashingFactory)
Initializes without any expert trajectory weights (which will be assumed to have a value 1) and requests
a default
QGradientPlanner instance to be created using
the StateHashFactory provided. |
MLIRLRequest(Domain domain,
OOMDPPlanner planner,
java.util.List<EpisodeAnalysis> expertEpisodes,
DifferentiableRF rf)
Initializes the request without any expert trajectory weights (which will be assumed to have a value 1).
|
Modifier and Type | Method and Description
---|---
`double` | `getBoltzmannBeta()`
`double[]` | `getEpisodeWeights()` Returns the expert episode weights.
`DifferentiableRF` | `getRf()`
`boolean` | `isValid()` Returns true if this request object has valid data members set; false otherwise.
`void` | `setBoltzmannBeta(double boltzmannBeta)`
`void` | `setEpisodeWeights(double[] episodeWeights)`
`void` | `setPlanner(OOMDPPlanner p)`
`void` | `setRf(DifferentiableRF rf)`

Methods inherited from class IRLRequest: getDomain, getExpertEpisodes, getGamma, getPlanner, setDomain, setExpertEpisodes, setGamma
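The interaction between getEpisodeWeights() and setEpisodeWeights(double[]) described above can be sketched with a minimal stand-in class (hypothetical, not the BURLAP class itself): when no weights have been set, the getter returns a freshly created array of 1.0s, so mutating that array has no effect on the request; only the setter changes the stored weights.

```java
import java.util.Arrays;

// Hypothetical stand-in illustrating the documented episode-weight contract.
public class EpisodeWeightsDemo {
    private double[] episodeWeights;   // null until explicitly set
    private final int numEpisodes;

    EpisodeWeightsDemo(int numEpisodes) { this.numEpisodes = numEpisodes; }

    double[] getEpisodeWeights() {
        if (this.episodeWeights == null) {
            double[] w = new double[this.numEpisodes]; // a new array on every call
            Arrays.fill(w, 1.0);
            return w;
        }
        return this.episodeWeights;
    }

    void setEpisodeWeights(double[] w) { this.episodeWeights = w; }

    public static void main(String[] args) {
        EpisodeWeightsDemo req = new EpisodeWeightsDemo(3);
        double[] w = req.getEpisodeWeights();
        w[0] = 5.0;                                   // mutating the returned array...
        System.out.println(req.getEpisodeWeights()[0]); // ...has no effect: prints 1.0
        req.setEpisodeWeights(new double[]{2.0, 1.0, 1.0}); // the supported way
        System.out.println(req.getEpisodeWeights()[0]);     // prints 2.0
    }
}
```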
protected double[] episodeWeights
protected double boltzmannBeta
protected DifferentiableRF rf
public MLIRLRequest(Domain domain, OOMDPPlanner planner, java.util.List<EpisodeAnalysis> expertEpisodes, DifferentiableRF rf)

Initializes the request without any expert trajectory weights (which will be assumed to have a value of 1). If the provided planner does not implement the QGradientPlanner interface, an exception will be thrown.

Parameters:
domain - the domain in which trajectories are provided.
planner - a planner that implements the QGradientPlanner interface.
expertEpisodes - the expert episodes/trajectories to use for training.
rf - the DifferentiableRF model to use.

public MLIRLRequest(Domain domain, java.util.List<EpisodeAnalysis> expertEpisodes, DifferentiableRF rf, StateHashFactory hashingFactory)
Initializes without any expert trajectory weights (which will be assumed to have a value of 1) and requests a default QGradientPlanner instance to be created using the StateHashFactory provided. The QGradientPlanner instance will be a DifferentiableVI that plans either until the maximum change in the value function is no greater than 0.01 or until 500 iterations have been performed. A default gamma (discount) value of 0.99 will be used for the planner, and no terminal states will be used.

Parameters:
domain - the domain in which trajectories are provided.
expertEpisodes - the expert episodes/trajectories to use for training.
rf - the DifferentiableRF model to use.
hashingFactory - the state hashing factory to use for the created planner.

public boolean isValid()
Returns true if this request object has valid data members set; false otherwise.

Overrides:
isValid in class IRLRequest

public void setPlanner(OOMDPPlanner p)

Overrides:
setPlanner in class IRLRequest
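The planner requirement described for the constructor can be sketched as a simple runtime type check. The class and interfaces below are hypothetical stand-ins, not the BURLAP types: the sketch assumes the check is a plain instanceof test that throws when a non-gradient planner is supplied.

```java
// Hypothetical sketch of the QGradientPlanner requirement; not BURLAP code.
public class PlannerCheckDemo {
    public interface OOMDPPlanner { }
    public interface QGradientPlanner extends OOMDPPlanner { }

    /** Accepts only planners that support Q-value gradients. */
    public static void setPlanner(OOMDPPlanner p) {
        if (p != null && !(p instanceof QGradientPlanner)) {
            throw new IllegalArgumentException(
                "MLIRLRequest requires a planner that implements QGradientPlanner");
        }
        // ...store the planner for later use...
    }

    public static void main(String[] args) {
        setPlanner(new QGradientPlanner() { });   // accepted: gradient-capable planner
        try {
            setPlanner(new OOMDPPlanner() { });   // rejected: no gradient support
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The check matters because MLIRL's likelihood gradient is computed from Q-value gradients, which an ordinary planner does not expose.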
public double[] getEpisodeWeights()
public double getBoltzmannBeta()
public DifferentiableRF getRf()
public void setEpisodeWeights(double[] episodeWeights)
public void setBoltzmannBeta(double boltzmannBeta)
public void setRf(DifferentiableRF rf)