EquilibriumPlayingSGAgent

java.lang.Object
- burlap.oomdp.stochasticgames.SGAgent
- - burlap.behavior.stochasticgames.agents.twoplayer.singlestage.equilibriumplayer.EquilibriumPlayingSGAgent

```
public class EquilibriumPlayingSGAgent
extends SGAgent
```
This agent plays an equilibrium solution for two-player games based on the immediate joint rewards received for the given state, treating that state as a single-stage game. By default, the solution concept is MaxMax, which assumes that the other agent will choose actions that maximize this agent's reward. A different solution concept can be used by providing a different BimatrixEquilibriumSolver object in the constructor.
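To make the single-stage idea concrete, the following is a minimal, self-contained sketch of building a bimatrix from immediate joint rewards and then applying a MaxMax-style rule to it. It does not use BURLAP classes and is not the library's implementation: `MaxMaxSketch`, `JointReward`, `buildBimatrix`, and `maxMaxRowStrategy` are hypothetical names, and breaking ties by taking the first maximum is an assumption.

```java
import java.util.Arrays;

public class MaxMaxSketch {

    /** Hypothetical immediate joint reward: returns {my reward, opponent reward}. */
    interface JointReward {
        double[] rewards(int myAction, int opponentAction);
    }

    /** Builds {rowPayoff, colPayoff} by querying the reward for every action pair. */
    static double[][][] buildBimatrix(JointReward r, int nMine, int nOpp) {
        double[][] rowPayoff = new double[nMine][nOpp];
        double[][] colPayoff = new double[nMine][nOpp];
        for (int i = 0; i < nMine; i++) {
            for (int j = 0; j < nOpp; j++) {
                double[] jr = r.rewards(i, j);
                rowPayoff[i][j] = jr[0];
                colPayoff[i][j] = jr[1];
            }
        }
        return new double[][][]{rowPayoff, colPayoff};
    }

    /**
     * MaxMax-style rule for the row player: put all probability on the row
     * that contains the largest own payoff, i.e. assume the opponent helps.
     * Ties go to the first maximum found (an assumption of this sketch).
     */
    static double[] maxMaxRowStrategy(double[][] rowPayoff) {
        int best = 0;
        double bestVal = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < rowPayoff.length; i++) {
            for (double v : rowPayoff[i]) {
                if (v > bestVal) {
                    bestVal = v;
                    best = i;
                }
            }
        }
        double[] strategy = new double[rowPayoff.length];
        strategy[best] = 1.0;
        return strategy;
    }

    public static void main(String[] args) {
        // Toy payoffs, invented for illustration only.
        double[][] mine = {{1, 0}, {0, 3}};
        double[][] theirs = {{2, 0}, {0, 1}};
        JointReward r = (i, j) -> new double[]{mine[i][j], theirs[i][j]};

        double[][][] bimatrix = buildBimatrix(r, 2, 2);
        double[] strategy = maxMaxRowStrategy(bimatrix[0]);
        System.out.println(Arrays.toString(strategy)); // prints [0.0, 1.0]
    }
}
```

The resulting strategy is a probability distribution over the agent's own actions, which is the form a sampling step would consume.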

Author:

James MacGlashan

Nested Class Summary

Nested Classes
Modifier and Type	Class and Description
`protected class`	`EquilibriumPlayingSGAgent.BimatrixTuple` A Bimatrix tuple.

Field Summary

Fields
Modifier and Type	Field and Description
`protected java.util.Random`	`rand` Random generator for selecting actions according to the solved solution.
`protected BimatrixEquilibriumSolver`	`solver` The solution concept to be solved for the immediate rewards.

Fields inherited from class burlap.oomdp.stochasticgames.SGAgent
agentType, domain, internalRewardFunction, world, worldAgentName

Constructor Summary

Constructors
Constructor and Description
`EquilibriumPlayingSGAgent()` Initializes with the `MaxMax` solution concept.
`EquilibriumPlayingSGAgent(BimatrixEquilibriumSolver solver)` Initializes with strategies formed using the solution concept generated by the given solver.

Method Summary

Methods
Modifier and Type	Method and Description
`protected EquilibriumPlayingSGAgent.BimatrixTuple`	`constructBimatrix(State s, java.util.List<GroundedSGAgentAction> myActions)` Constructs a bimatrix game from the possible joint rewards of the given state.
`void`	`gameStarting()` This method is called by the world when a new game is starting.
`void`	`gameTerminated()` This method is called by the world when a game has ended.
`GroundedSGAgentAction`	`getAction(State s)` This method is called by the world when it needs the agent to choose an action.
`protected SGAgent`	`getOpponent()` Returns the `SGAgent` object in the world for the opponent.
`void`	`observeOutcome(State s, JointAction jointAction, java.util.Map<java.lang.String,java.lang.Double> jointReward, State sprime, boolean isTerminal)` This method is called by the world when every agent in the world has taken their action.
`protected int`	`sampleStrategy(double[] strategy)` Samples an action from a strategy, where a strategy is defined as a probability distribution over actions.

Methods inherited from class burlap.oomdp.stochasticgames.SGAgent
getAgentName, getAgentType, getInternalRewardFunction, init, joinWorld, setInternalRewardFunction

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - solver
```
protected BimatrixEquilibriumSolver solver
```
    The solution concept to be solved for the immediate rewards.
  - rand
```
protected java.util.Random rand
```
    Random generator for selecting actions according to the solved solution.
- Constructor Detail
  - EquilibriumPlayingSGAgent
```
public EquilibriumPlayingSGAgent()
```
    Initializes with the MaxMax solution concept.
  - EquilibriumPlayingSGAgent
```
public EquilibriumPlayingSGAgent(BimatrixEquilibriumSolver solver)
```
    Initializes with strategies formed using the solution concept generated by the given solver.
    
    Parameters:
    solver - the solver to use for a given solution concept.
- Method Detail
  - gameStarting
```
public void gameStarting()
```
    Description copied from class: SGAgent
    
    This method is called by the world when a new game is starting.
    
    Specified by:
    
    gameStarting in class SGAgent
  - getAction
```
public GroundedSGAgentAction getAction(State s)
```
    Description copied from class: SGAgent
    
    This method is called by the world when it needs the agent to choose an action.
    
    Specified by:
    
    getAction in class SGAgent
    
    Parameters:
    s - the current state of the world
    
    Returns:
    the action this agent wishes to take
  - observeOutcome
```
public void observeOutcome(State s,
                  JointAction jointAction,
                  java.util.Map<java.lang.String,java.lang.Double> jointReward,
                  State sprime,
                  boolean isTerminal)
```
    Description copied from class: SGAgent
    
    This method is called by the world when every agent in the world has taken their action. It conveys the result of the joint action.
    
    Specified by:
    
    observeOutcome in class SGAgent
    
    Parameters:
    s - the state in which the last action of each agent was taken
    jointAction - the joint action of all agents in the world
    jointReward - the joint reward of all agents in the world
    sprime - the next state to which the agent transitioned
    isTerminal - whether the new state is a terminal state
  - gameTerminated
```
public void gameTerminated()
```
    Description copied from class: SGAgent
    
    This method is called by the world when a game has ended.
    
    Specified by:
    
    gameTerminated in class SGAgent
  - constructBimatrix
```
protected EquilibriumPlayingSGAgent.BimatrixTuple constructBimatrix(State s,
                                                        java.util.List<GroundedSGAgentAction> myActions)
```
    Constructs a bimatrix game from the possible joint rewards of the given state. The other agent and its action set are determined by retrieving the corresponding agent object from the world; the joint action model is obtained similarly. If this agent has an internal reward function, it is used; otherwise the world reward function is used.
    
    Parameters:
    s - the state from which the joint rewards are based
    myActions - the set of GroundedSGAgentActions the agent can take in s.
    
    Returns:
    an EquilibriumPlayingSGAgent.BimatrixTuple for the joint reward function.
  - sampleStrategy
```
protected int sampleStrategy(double[] strategy)
```
    Samples an action from a strategy, where a strategy is defined as a probability distribution over actions.
    
    Parameters:
    strategy - a double array where strategy[i] is the probability of action i being selected
    
    Returns:
    the index of the sampled action
  - getOpponent
```
protected SGAgent getOpponent()
```
    Returns the SGAgent object in the world for the opponent.
    
    Returns:
    the SGAgent object in the world for the opponent.
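
The sampling step described for sampleStrategy, drawing an index i with probability strategy[i], can be sketched with a cumulative-sum walk. This is a standalone illustration under assumptions, not the BURLAP source: the class name `StrategySamplingSketch` is hypothetical, and the `Random` is passed as a parameter here rather than read from the `rand` field.

```java
import java.util.Random;

public class StrategySamplingSketch {

    /** Returns index i with probability strategy[i], using cumulative sums. */
    static int sampleStrategy(double[] strategy, Random rand) {
        double roll = rand.nextDouble(); // uniform in [0, 1)
        double cumulative = 0.0;
        for (int i = 0; i < strategy.length; i++) {
            cumulative += strategy[i];
            if (roll < cumulative) {
                return i;
            }
        }
        // Guard against rounding error when the probabilities sum to slightly less than 1.
        return strategy.length - 1;
    }

    public static void main(String[] args) {
        Random rand = new Random(42);
        double[] strategy = {0.2, 0.5, 0.3}; // illustrative mixed strategy
        int[] counts = new int[strategy.length];
        for (int n = 0; n < 10000; n++) {
            counts[sampleStrategy(strategy, rand)]++;
        }
        // Empirical frequencies should be close to the strategy's probabilities.
        for (int i = 0; i < counts.length; i++) {
            System.out.println("action " + i + ": " + counts[i] / 10000.0);
        }
    }
}
```

An action with probability 0 is never selected by this walk, and a pure strategy always returns its single supported index.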
