| Class | Description |
|---|---|
| LSPI | Implements the optimized version of least-squares policy iteration [1], which runs in time quadratic in the number of state features. |
| SARSCollector | Collects SARSData (state-action-reward-state tuples) that algorithms such as LSPI can then use for learning. |
| SARSCollector.UniformRandomSARSCollector | Collects SARS data from source states generated by a StateGenerator, choosing actions uniformly at random. |
| SARSData | Wrapper for a List of state-action-reward-state (SARSData.SARS) tuples. |
| SARSData.SARS | A state-action-reward-state tuple. |
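
To make the relationship between the tuple, the list wrapper, and the uniform random collector concrete, here is a minimal self-contained sketch. It does not use this library's classes or signatures: `SarsSketch`, `Sars`, `SarsDataset`, `collectUniformRandom`, and the toy chain environment are all hypothetical names chosen for illustration, and states and actions are simplified to integers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class SarsSketch {

    /** One state-action-reward-state tuple (hypothetical stand-in for SARSData.SARS). */
    static final class Sars {
        final int s;      // source state
        final int a;      // action taken in s
        final double r;   // reward received
        final int sp;     // resulting state
        Sars(int s, int a, double r, int sp) {
            this.s = s; this.a = a; this.r = r; this.sp = sp;
        }
    }

    /** Thin wrapper around a List of tuples (hypothetical stand-in for SARSData). */
    static final class SarsDataset {
        private final List<Sars> data = new ArrayList<>();
        void add(int s, int a, double r, int sp) { data.add(new Sars(s, a, r, sp)); }
        Sars get(int i) { return data.get(i); }
        int size() { return data.size(); }
    }

    /** Toy chain environment (an assumption): action 0 moves left, any other moves right. */
    static int step(int s, int a, int nStates) {
        int next = (a == 0) ? s - 1 : s + 1;
        return Math.max(0, Math.min(nStates - 1, next));
    }

    /** Reward of 1 for landing in the last state of the chain, 0 otherwise. */
    static double reward(int sp, int nStates) {
        return sp == nStates - 1 ? 1.0 : 0.0;
    }

    /** Draws a random source state, then picks an action uniformly at random. */
    static SarsDataset collectUniformRandom(int nStates, int nActions, int nSamples, long seed) {
        Random rng = new Random(seed);
        SarsDataset out = new SarsDataset();
        for (int i = 0; i < nSamples; i++) {
            int s = rng.nextInt(nStates);   // stands in for a state generator
            int a = rng.nextInt(nActions);  // uniform over the action indices
            int sp = step(s, a, nStates);
            out.add(s, a, reward(sp, nStates), sp);
        }
        return out;
    }

    public static void main(String[] args) {
        SarsDataset d = collectUniformRandom(5, 2, 100, 42L);
        System.out.println("collected " + d.size() + " tuples");
    }
}
```

A batch learner such as LSPI would then iterate over the collected tuples; that step is omitted here because it depends on the feature representation.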