burlap.behavior.singleagent.learning.lspi

Class Summary
Class	Description
LSPI	This class implements the optimized version of least-squares policy iteration [1] (runs in time quadratic in the number of state features).
SARSCollector	This object is used to collect `SARSData` (state-action-reward-state tuples) that can then be used by algorithms like LSPI for learning.
SARSCollector.UniformRandomSARSCollector	Collects SARS data from source states generated by a `StateGenerator` by choosing actions uniformly at random.
SARSData	Class that provides a wrapper for a List of state-action-reward-state (`SARSData.SARS`) tuples.
SARSData.SARS	State-action-reward-state tuple.
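
The data-collection idea summarized in the table (sample a source state, pick an action uniformly at random, record the resulting state-action-reward-state tuple) can be sketched in plain Java. This is a minimal illustration only and does not use the BURLAP API: `SarsSketch`, `Sars`, `Chain`, and `collect` are hypothetical names, the `Supplier<Integer>` merely stands in for the role a `StateGenerator` plays, and the five-state chain environment is an assumption made so the example is self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.Supplier;

public class SarsSketch {

    /** One state-action-reward-state tuple (hypothetical stand-in for SARSData.SARS). */
    static final class Sars {
        final int s;      // source state
        final int a;      // action taken in s
        final double r;   // reward received
        final int sp;     // resulting state
        Sars(int s, int a, double r, int sp) {
            this.s = s; this.a = a; this.r = r; this.sp = sp;
        }
    }

    /** Assumed toy environment: a chain of states 0..n-1; action 0 moves left, 1 moves right. */
    static final class Chain {
        final int n;
        Chain(int n) { this.n = n; }
        int next(int s, int a) {
            int sp = s + (a == 0 ? -1 : 1);
            return Math.max(0, Math.min(n - 1, sp)); // clamp at both ends of the chain
        }
        double reward(int sp) {
            return sp == n - 1 ? 1.0 : 0.0;          // reward only for reaching the last state
        }
    }

    /** Collects numSamples tuples, choosing each action uniformly at random. */
    static List<Sars> collect(Chain env, Supplier<Integer> stateGen,
                              int numActions, int numSamples, Random rng) {
        List<Sars> data = new ArrayList<>(numSamples);
        for (int i = 0; i < numSamples; i++) {
            int s = stateGen.get();           // source state from the generator
            int a = rng.nextInt(numActions);  // uniform random action
            int sp = env.next(s, a);
            data.add(new Sars(s, a, env.reward(sp), sp));
        }
        return data;
    }

    public static void main(String[] args) {
        Random rng = new Random(0);
        Chain env = new Chain(5);
        List<Sars> data = collect(env, () -> rng.nextInt(env.n), 2, 100, rng);
        System.out.println("collected " + data.size() + " tuples");
    }
}
```

A batch learner such as LSPI would then consume the resulting list as a fixed dataset; the actual BURLAP classes operate on domain states and actions rather than the integers used here.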
