SGQWActionHistory

java.lang.Object
- burlap.oomdp.stochasticgames.Agent
  - burlap.behavior.stochasticgame.agents.naiveq.SGNaiveQLAgent
    - burlap.behavior.stochasticgame.agents.naiveq.history.SGQWActionHistory

All Implemented Interfaces:

QComputablePlanner
```
public class SGQWActionHistory
extends SGNaiveQLAgent
```
A tabular Q-learning [1] algorithm for stochastic games that augments states with the actions each agent took in the n previous time steps. If the constructor is not passed the maximum number of players and an ActionIdMap to use, then when the first game starts, the action map is initialized to a ParameterNaiveActionIdMap and the number of players is set to the number of players in the world that this agent has joined. If the world contains parameterized actions, this default may be a problem; in that case, use the SGQWActionHistory(SGDomain, double, double, StateHashFactory, int, int, ActionIdMap) constructor and supply an ActionIdMap that resolves action parameterization.
1. Watkins, Christopher JCH, and Peter Dayan. "Q-learning." Machine learning 8.3-4 (1992): 279-292.
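
The state augmentation described above can be illustrated with a minimal, self-contained sketch. It does not use BURLAP types: the class `ActionHistorySketch`, its methods, the string key format, and the use of `-1` for an unset action are all hypothetical choices made for illustration, not the library's representation. The sketch keeps the last n joint actions and appends one (time index, player, action id) entry per player and per remembered step to a base state key.

```java
import java.util.LinkedList;

// Hypothetical illustration only; none of these names are part of BURLAP.
class ActionHistorySketch {
    private final int historySize;   // n: how many previous steps to remember
    private final int numPlayers;    // number of players in the game
    // Most recent joint action first; each entry holds one action id per player.
    private final LinkedList<int[]> history = new LinkedList<>();

    ActionHistorySketch(int historySize, int numPlayers) {
        this.historySize = historySize;
        this.numPlayers = numPlayers;
    }

    // Records the joint action just taken and drops the oldest one if needed.
    void record(int[] jointAction) {
        history.addFirst(jointAction.clone());
        if (history.size() > historySize) {
            history.removeLast();
        }
    }

    // Forgets all recorded actions, e.g. when a new game starts.
    void clear() {
        history.clear();
    }

    // Appends one entry per (steps back, player) pair to the base state key.
    // A time index of 1 means the previous step. The value -1 (an assumption
    // of this sketch) marks steps that have not happened yet in this game.
    String augmentedKey(String baseState) {
        StringBuilder sb = new StringBuilder(baseState);
        for (int h = 1; h <= historySize; h++) {
            for (int p = 0; p < numPlayers; p++) {
                int actionId = h <= history.size() ? history.get(h - 1)[p] : -1;
                sb.append("|h").append(h).append(":p").append(p).append('=').append(actionId);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        ActionHistorySketch sketch = new ActionHistorySketch(2, 2);
        System.out.println(sketch.augmentedKey("s"));
        sketch.record(new int[] {0, 1});
        System.out.println(sketch.augmentedKey("s"));
    }
}
```

Because the augmented key differs whenever the recent joint actions differ, a tabular learner that indexes its Q-values by this key learns separate values for the same underlying state under different recent histories.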

Author:

James MacGlashan

Nested Class Summary
- Nested classes/interfaces inherited from interface burlap.behavior.singleagent.planning.QComputablePlanner
  QComputablePlanner.QComputablePlannerHelper

Field Summary

Fields
Modifier and Type	Field and Description
`protected ActionIdMap`	`actionMap` a map from actions to int values which can be used to fill in an action history attribute value
`static java.lang.String`	`ATTHAID` A constant for the name of the attribute used to define which action an agent took
`static java.lang.String`	`ATTHNUM` A constant for the name of the history time index attribute.
`static java.lang.String`	`ATTHPN` A constant for the name of the attribute used to define which agent in the world this history object represents
`protected ObjectClass`	`classHistory` The object class that will be used to represent a history component.
`static java.lang.String`	`CLASSHISTORY` A constant for the name of the history object class.
`protected java.util.LinkedList<JointAction>`	`history` the joint action history
`protected int`	`historySize` The size of action history to store.

Fields inherited from class burlap.behavior.stochasticgame.agents.naiveq.SGNaiveQLAgent
discount, hashFactory, learningRate, policy, qInit, qMap, stateRepresentations, storedMapAbstraction, totalNumberOfSteps

Fields inherited from class burlap.oomdp.stochasticgames.Agent
agentType, domain, internalRewardFunction, world, worldAgentName

Constructor Summary

Constructors
Constructor and Description
`SGQWActionHistory(SGDomain d, double discount, double learningRate, StateHashFactory hashFactory, int historySize)` Initializes the learning algorithm using 0.1 epsilon greedy learning strategy/policy
`SGQWActionHistory(SGDomain d, double discount, double learningRate, StateHashFactory hashFactory, int historySize, int maxPlayers, ActionIdMap actionMap)` Initializes the learning algorithm using 0.1 epsilon greedy learning strategy/policy

Method Summary

Methods
Modifier and Type	Method and Description
`void`	`gameStarting()` This method is called by the world when a new game is starting.
`protected State`	`getHistoryAugmentedState(State s)` Takes an input state and returns an augmented state with the history of actions each agent previously took.
`protected ObjectInstance`	`getHistoryLessObjectInstanceForAgent(java.lang.String aname, int h)` Returns a history object instance for a given agent in which the action that was taken is unset because the episode has not yet lasted h steps.
`protected ObjectInstance`	`getHistoryObjectInstanceForAgent(GroundedSingleAction gsa, int h)` Returns a history object instance for the corresponding action and how far back in history it occurred
`protected void`	`initializeActionMapAndAugmentedDomain()` Initializes the action map to be an instance of `ParameterNaiveActionIdMap` and then initializes the history augmented domain using the max players as the number of players in the world which this agent has joined.
`protected void`	`initializeHistoryAugmentedDomain(int maxPlayers)` Initializes the history augmented domain/state representation the agent will use
`void`	`observeOutcome(State s, JointAction jointAction, java.util.Map<java.lang.String,java.lang.Double> jointReward, State sprime, boolean isTerminal)` This method is called by the world when every agent in the world has taken their action.
`protected StateHashTuple`	`stateHash(State s)` First abstracts state s, and then returns the `StateHashTuple` object for the abstracted state.

Methods inherited from class burlap.behavior.stochasticgame.agents.naiveq.SGNaiveQLAgent
gameTerminated, getAction, getMaxQValue, getQ, getQs, setLearningRate, setQValueInitializer, setStoredMapAbstraction, setStrategy, translateAction

Methods inherited from class burlap.oomdp.stochasticgames.Agent
getAgentName, getAgentType, getInternalRewardFunction, init, joinWorld, setInternalRewardFunction

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - history
```
protected java.util.LinkedList<JointAction> history
```
    the joint action history
  - historySize
```
protected int historySize
```
    The size of action history to store.
  - actionMap
```
protected ActionIdMap actionMap
```
    a map from actions to int values which can be used to fill in an action history attribute value
  - classHistory
```
protected ObjectClass classHistory
```
    The object class that will be used to represent a history component. A history component consists of a player identifier, the action that player took, and how long ago that action was taken. An object instance of this class will be created for each player in the world and for each of the n time steps that this learning algorithm is told to keep.
  - ATTHNUM
```
public static final java.lang.String ATTHNUM
```
    A constant for the name of the history time index attribute. For instance, a history object representing the action of an agent in the previous time step will have a value of 1 for this attribute.
    
    See Also:
    Constant Field Values
  - ATTHPN
```
public static final java.lang.String ATTHPN
```
    A constant for the name of the attribute used to define which agent in the world this history object represents
    
    See Also:
    Constant Field Values
  - ATTHAID
```
public static final java.lang.String ATTHAID
```
    A constant for the name of the attribute used to define which action an agent took
    
    See Also:
    Constant Field Values
  - CLASSHISTORY
```
public static java.lang.String CLASSHISTORY
```
    A constant for the name of the history object class.
- Constructor Detail
  - SGQWActionHistory
```
public SGQWActionHistory(SGDomain d,
                 double discount,
                 double learningRate,
                 StateHashFactory hashFactory,
                 int historySize,
                 int maxPlayers,
                 ActionIdMap actionMap)
```
    Initializes the learning algorithm using 0.1 epsilon greedy learning strategy/policy
    
    Parameters:
    d - the domain in which the agent will act
    discount - the discount factor
    learningRate - the learning rate
    hashFactory - the state hashing factory to use
    historySize - the number of previous steps to remember and with which to augment the state space
    maxPlayers - the maximum number of players that will be in the game
    actionMap - a mapping from actions to integer identifiers for them
  - SGQWActionHistory
```
public SGQWActionHistory(SGDomain d,
                 double discount,
                 double learningRate,
                 StateHashFactory hashFactory,
                 int historySize)
```
    Initializes the learning algorithm using 0.1 epsilon greedy learning strategy/policy
    
    Parameters:
    d - the domain in which the agent will act
    discount - the discount factor
    learningRate - the learning rate
    hashFactory - the state hashing factory to use
    historySize - the number of previous steps to remember and with which to augment the state space
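
The reason to pass an explicit action map when actions are parameterized can be sketched as follows. This is a hedged illustration, not BURLAP code: it assumes that a "parameter-naive" map assigns integer ids by action name alone, and the class `NameOnlyActionIdSketch` and its method `idFor` are hypothetical names. Under that assumption, two groundings of the same action with different parameters receive the same id, so the recorded history cannot tell them apart.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not the BURLAP ActionIdMap interface.
class NameOnlyActionIdSketch {
    // Assigns ids in order of first appearance of each action name.
    private final Map<String, Integer> ids = new LinkedHashMap<>();

    // Returns the id for an action. The parameters are deliberately ignored,
    // which is the assumed "parameter-naive" behavior being illustrated.
    int idFor(String actionName, String... params) {
        Integer id = ids.get(actionName);
        if (id == null) {
            id = ids.size();
            ids.put(actionName, id);
        }
        return id;
    }

    public static void main(String[] args) {
        NameOnlyActionIdSketch map = new NameOnlyActionIdSketch();
        // Both groundings of "move" collapse to the same id.
        System.out.println(map.idFor("move", "north"));
        System.out.println(map.idFor("move", "south"));
        System.out.println(map.idFor("wait"));
    }
}
```

A map that instead keys on the action name together with its parameters would keep such groundings distinct, which is the role of the actionMap argument in the seven-argument constructor.
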
- Method Detail
  - initializeHistoryAugmentedDomain
```
protected void initializeHistoryAugmentedDomain(int maxPlayers)
```
    Initializes the history augmented domain/state representation the agent will use
    
    Parameters:
    maxPlayers - the maximum number of players in the game
  - gameStarting
```
public void gameStarting()
```
    Description copied from class: Agent
    
    This method is called by the world when a new game is starting.
    
    Overrides:
    
    gameStarting in class SGNaiveQLAgent
  - initializeActionMapAndAugmentedDomain
```
protected void initializeActionMapAndAugmentedDomain()
```
    Initializes the action map to be an instance of ParameterNaiveActionIdMap and then initializes the history augmented domain using the max players as the number of players in the world which this agent has joined.
  - observeOutcome
```
public void observeOutcome(State s,
                  JointAction jointAction,
                  java.util.Map<java.lang.String,java.lang.Double> jointReward,
                  State sprime,
                  boolean isTerminal)
```
    Description copied from class: Agent
    
    This method is called by the world when every agent in the world has taken their action. It conveys the result of the joint action.
    
    Overrides:
    
    observeOutcome in class SGNaiveQLAgent
    
    Parameters:
    s - the state in which the last action of each agent was taken
    jointAction - the joint action of all agents in the world
    jointReward - the joint reward of all agents in the world
    sprime - the next state to which the agent transitioned
    isTerminal - whether the new state is a terminal state
  - getHistoryAugmentedState
```
protected State getHistoryAugmentedState(State s)
```
    Takes an input state and returns an augmented state with the history of actions each agent previously took.
    
    Parameters:
    s - the input state to augment
    
    Returns:
    an augmented state with the history of actions each agent previously took.
  - getHistoryObjectInstanceForAgent
```
protected ObjectInstance getHistoryObjectInstanceForAgent(GroundedSingleAction gsa,
                                              int h)
```
    Returns a history object instance for the corresponding action and how far back in history it occurred
    
    Parameters:
    gsa - the action that was taken (which includes which agent took it)
    h - how far back in history the action was taken.
    
    Returns:
    a history object instance for the corresponding action and how far back in history it occurred
  - getHistoryLessObjectInstanceForAgent
```
protected ObjectInstance getHistoryLessObjectInstanceForAgent(java.lang.String aname,
                                                  int h)
```
    Returns a history object instance for a given agent in which the action that was taken is unset because the episode has not yet lasted h steps.
    
    Parameters:
    aname - the name of agent for which the history object should be returned
    h - how many steps back this object instance represents
    
    Returns:
    a history object instance
  - stateHash
```
protected StateHashTuple stateHash(State s)
```
    Description copied from class: SGNaiveQLAgent
    
    First abstracts state s, and then returns the StateHashTuple object for the abstracted state.
    
    Overrides:
    
    stateHash in class SGNaiveQLAgent
    
    Parameters:
    s - the state for which the state hash should be returned.
    
    Returns:
    the hashed state.
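
For orientation, the tabular Q-learning update from [1] that an agent of this kind applies when an outcome is observed can be sketched over hashed state keys. This is a minimal sketch under stated assumptions, not the class's implementation: `TabularQSketch` and its methods are hypothetical names, states are represented by plain string keys standing in for hashed (and, here, history-augmented) states, and treating the future value as 0 on a terminal transition is the usual convention rather than something this page states.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names are part of BURLAP.
class TabularQSketch {
    private final Map<String, double[]> q = new HashMap<>();
    private final int numActions;
    private final double discount;
    private final double learningRate;
    private final double qInit;

    TabularQSketch(int numActions, double discount, double learningRate, double qInit) {
        this.numActions = numActions;
        this.discount = discount;
        this.learningRate = learningRate;
        this.qInit = qInit;
    }

    // Returns the Q-value row for a state key, creating it with qInit on first use.
    double[] qValues(String stateKey) {
        return q.computeIfAbsent(stateKey, k -> {
            double[] row = new double[numActions];
            Arrays.fill(row, qInit);
            return row;
        });
    }

    // Returns the largest Q-value stored for the given state key.
    double maxQ(String stateKey) {
        double best = Double.NEGATIVE_INFINITY;
        for (double v : qValues(stateKey)) {
            best = Math.max(best, v);
        }
        return best;
    }

    // Q(s,a) <- Q(s,a) + learningRate * (reward + discount * max_a' Q(s',a') - Q(s,a)),
    // with the future term taken as 0 when s' is terminal (assumed convention).
    void update(String s, int action, double reward, String sPrime, boolean isTerminal) {
        double future = isTerminal ? 0.0 : maxQ(sPrime);
        double[] row = qValues(s);
        row[action] += learningRate * (reward + discount * future - row[action]);
    }

    public static void main(String[] args) {
        TabularQSketch learner = new TabularQSketch(2, 0.9, 0.5, 0.0);
        learner.update("a", 0, 1.0, "b", false);
        System.out.println(learner.qValues("a")[0]);
    }
}
```

In a history-augmented learner, the keys passed as `s` and `sPrime` would be derived from the augmented states, so the same update rule applies unchanged.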
