EquilibriumPlayingSGAgent

java.lang.Object
- burlap.mdp.stochasticgames.agent.SGAgentBase
- - burlap.behavior.stochasticgames.agents.twoplayer.singlestage.equilibriumplayer.EquilibriumPlayingSGAgent

All Implemented Interfaces:

SGAgent
```
public class EquilibriumPlayingSGAgent
extends SGAgentBase
```
This agent plays an equilibrium solution for two-player games based on the immediate joint rewards for the given state, treating each state as if it were a single-stage game. By default, the solution concept is MaxMax, which assumes the other agent will choose actions that maximize this agent's reward. A different solution concept can be used by providing a different BimatrixEquilibriumSolver object to the constructor.
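
To make the MaxMax idea concrete, the following is a minimal standalone sketch. It does not use BURLAP classes, and it is not taken from the library's own solver. The class name `MaxMaxSketch`, the method `rowStrategy`, and the tie-breaking rule (first maximum wins) are all assumptions made for illustration. Given this agent's payoff matrix, it picks the row containing the largest entry, which is the choice that is optimal if the opponent cooperates fully.

```java
// Hypothetical illustration of a MaxMax-style choice; not BURLAP's implementation.
class MaxMaxSketch {

    // rowPayoff[i][j] is this agent's reward when it plays action i
    // and the opponent plays action j.
    // Returns a deterministic strategy: probability 1 on the row that
    // contains the largest payoff (ties broken by first occurrence, an assumption).
    static double[] rowStrategy(double[][] rowPayoff) {
        int bestRow = 0;
        double bestValue = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < rowPayoff.length; i++) {
            for (int j = 0; j < rowPayoff[i].length; j++) {
                if (rowPayoff[i][j] > bestValue) {
                    bestValue = rowPayoff[i][j];
                    bestRow = i;
                }
            }
        }
        double[] strategy = new double[rowPayoff.length]; // initialized to 0.0
        strategy[bestRow] = 1.0;
        return strategy;
    }
}
```

For a payoff matrix with rows `{1, 2}` and `{5, 0}`, the sketch places all probability on the second row, because 5 is the best reward this agent could receive if the opponent cooperates.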

Author:

James MacGlashan

Nested Class Summary

Nested Classes
Modifier and Type	Class and Description
`protected class`	`EquilibriumPlayingSGAgent.BimatrixTuple` A Bimatrix tuple.

Field Summary

Fields
Modifier and Type	Field and Description
`protected int`	`agentNum`
`protected java.util.Random`	`rand` Random generator for selecting actions according to the solved strategy.
`protected BimatrixEquilibriumSolver`	`solver` The solution concept to be solved for the immediate rewards.

Fields inherited from class burlap.mdp.stochasticgames.agent.SGAgentBase
agentType, domain, internalRewardFunction, world, worldAgentName

Constructor Summary

Constructors
Constructor and Description
`EquilibriumPlayingSGAgent()` Initializes with the `MaxMax` solution concept.
`EquilibriumPlayingSGAgent(BimatrixEquilibriumSolver solver)` Initializes with strategies formed using the solution concept generated by the given solver.

Method Summary

All Methods Instance Methods Concrete Methods
Modifier and Type	Method and Description
`Action`	`action(State s)` This method is called by the world when it needs the agent to choose an action.
`protected EquilibriumPlayingSGAgent.BimatrixTuple`	`constructBimatrix(State s, java.util.List<Action> myActions)` Constructs a bimatrix game from the possible joint rewards of the given state.
`void`	`gameStarting(World w, int agentNum)` This method is called by the world when a new game is starting.
`void`	`gameTerminated()` This method is called by the world when a game has ended.
`protected SGAgent`	`getOpponent()` Returns the `SGAgent` object in the world for the opponent.
`void`	`observeOutcome(State s, JointAction jointAction, double[] jointReward, State sprime, boolean isTerminal)` This method is called by the world when every agent in the world has taken their action.
`protected int`	`opponentNum()`
`protected int`	`sampleStrategy(double[] strategy)` Samples an action from a strategy, where a strategy is defined as a probability distribution over actions.

Methods inherited from class burlap.mdp.stochasticgames.agent.SGAgentBase
agentName, agentType, getInternalRewardFunction, init, init, setAgentDetails, setInternalRewardFunction

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - solver
```
protected BimatrixEquilibriumSolver solver
```
    The solution concept to be solved for the immediate rewards.
  - rand
```
protected java.util.Random rand
```
    Random generator for selecting actions according to the solved strategy.
  - agentNum
```
protected int agentNum
```
- Constructor Detail
  - EquilibriumPlayingSGAgent
```
public EquilibriumPlayingSGAgent()
```
    Initializes with the MaxMax solution concept.
  - EquilibriumPlayingSGAgent
```
public EquilibriumPlayingSGAgent(BimatrixEquilibriumSolver solver)
```
    Initializes with strategies formed using the solution concept generated by the given solver.
    
    Parameters:
    
    solver - the solver to use for a given solution concept.
- Method Detail
  - gameStarting
```
public void gameStarting(World w,
                         int agentNum)
```
    Description copied from interface: SGAgent
    
    This method is called by the world when a new game is starting.
    
    Parameters:
    
    w - the world in which the game is starting
    
    agentNum - the agent number of the agent in the world
  - action
```
public Action action(State s)
```
    Description copied from interface: SGAgent
    
    This method is called by the world when it needs the agent to choose an action.
    
    Parameters:
    
    s - the current state of the world
    
    Returns:
    
    the action this agent wishes to take
  - observeOutcome
```
public void observeOutcome(State s,
                           JointAction jointAction,
                           double[] jointReward,
                           State sprime,
                           boolean isTerminal)
```
    Description copied from interface: SGAgent
    
    This method is called by the world when every agent in the world has taken their action. It conveys the result of the joint action.
    
    Parameters:
    
    s - the state in which the last action of each agent was taken
    
    jointAction - the joint action of all agents in the world
    
    jointReward - the joint reward of all agents in the world
    
    sprime - the next state to which the agent transitioned
    
    isTerminal - whether the new state is a terminal state
  - gameTerminated
```
public void gameTerminated()
```
    Description copied from interface: SGAgent
    
    This method is called by the world when a game has ended.
  - constructBimatrix
```
protected EquilibriumPlayingSGAgent.BimatrixTuple constructBimatrix(State s,
                                                                    java.util.List<Action> myActions)
```
    Constructs a bimatrix game from the possible joint rewards of the given state. The other agent and its action set are determined by retrieving the corresponding agent object from the world; the joint action model is likewise retrieved from the world. If this agent has an internal reward function, that function is used; otherwise the world's reward function is used.
    
    Parameters:
    
    s - the state on which the joint rewards are based
    
    myActions - the set of Actions the agent can take in s.
    
    Returns:
    
    an EquilibriumPlayingSGAgent.BimatrixTuple for the joint reward function.
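    
    As a rough illustration of what a bimatrix game holds, the standalone sketch below fills two payoff matrices from a joint reward lookup. It does not use BURLAP types. The class `BimatrixSketch`, its fields, and the `jointReward` function (which stands in for evaluating the reward function on each joint action) are hypothetical names chosen for this example, not the fields of `BimatrixTuple`.

```java
import java.util.function.BiFunction;

// Hypothetical illustration of a two-player payoff pair; not BURLAP's BimatrixTuple.
class BimatrixSketch {

    final double[][] rowPayoff; // reward to this agent for each joint action
    final double[][] colPayoff; // reward to the opponent for each joint action

    // jointReward.apply(i, j) is assumed to return {myReward, opponentReward}
    // when this agent plays its action i and the opponent plays its action j.
    BimatrixSketch(int numMyActions, int numOpponentActions,
                   BiFunction<Integer, Integer, double[]> jointReward) {
        rowPayoff = new double[numMyActions][numOpponentActions];
        colPayoff = new double[numMyActions][numOpponentActions];
        for (int i = 0; i < numMyActions; i++) {
            for (int j = 0; j < numOpponentActions; j++) {
                double[] r = jointReward.apply(i, j);
                rowPayoff[i][j] = r[0];
                colPayoff[i][j] = r[1];
            }
        }
    }
}
```

    Each cell `(i, j)` pairs one of this agent's actions with one of the opponent's, so a solver can read both players' rewards for every joint action from the two matrices.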
  - sampleStrategy
```
protected int sampleStrategy(double[] strategy)
```
    Samples an action from a strategy, where a strategy is defined as a probability distribution over actions.
    
    Parameters:
    
    strategy - a double array where strategy[i] is the probability of action i being selected
    
    Returns:
    
    the index of the sampled action
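    
    One common way to sample an index from such a probability array is to draw a uniform number and walk the cumulative sum. The standalone sketch below shows that technique under the assumption that the entries are non-negative and sum to 1. The class `StrategySamplingSketch` and its `sample` method are hypothetical and are not the library's code.

```java
import java.util.Random;

// Hypothetical illustration of sampling from a mixed strategy; not BURLAP's code.
class StrategySamplingSketch {

    // Returns index i with probability strategy[i].
    static int sample(double[] strategy, Random rand) {
        double roll = rand.nextDouble(); // uniform in [0, 1)
        double cumulative = 0.0;
        for (int i = 0; i < strategy.length; i++) {
            cumulative += strategy[i];
            if (roll < cumulative) {
                return i;
            }
        }
        // Floating-point round-off can leave the total slightly below 1;
        // in that case fall back to the last action with positive probability.
        for (int i = strategy.length - 1; i >= 0; i--) {
            if (strategy[i] > 0.0) {
                return i;
            }
        }
        throw new IllegalArgumentException("strategy has no positive probability");
    }
}
```

    With a deterministic strategy such as `{0.0, 1.0, 0.0}`, the sketch always returns index 1, since no draw in [0, 1) falls below the first cumulative value of 0 and every draw falls below the second cumulative value of 1.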
  - getOpponent
```
protected SGAgent getOpponent()
```
    Returns the SGAgent object in the world for the opponent.
    
    Returns:
    
    the SGAgent object in the world for the opponent.
  - opponentNum
```
protected int opponentNum()
```
