MultiAgentDPPlanningAgent

java.lang.Object
- burlap.mdp.stochasticgames.agent.SGAgentBase
- - burlap.behavior.stochasticgames.agents.madp.MultiAgentDPPlanningAgent

All Implemented Interfaces:

SGAgent
```
public class MultiAgentDPPlanningAgent
extends SGAgentBase
```
An agent that uses a MADynamicProgramming planning algorithm to compute the value of each state and then follows a policy derived from a joint policy that is itself derived from that estimated value function. At each step, the agent first calls MADynamicProgramming.planFromState(State) and then follows the policy. Ideally, the planning object should only perform planning for a state if it has not already planned for it. The joint policy underlying the policy the agent follows must be an instance of MAQSourcePolicy. Furthermore, when the policy is set, the underlying joint policy will automatically be set to use this agent's planning object as the value function source, and its set of agents will automatically be set to the agents involved in this agent's world. The PolicyFromJointPolicy will also be told that this agent is its target.
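The plan-then-act behaviour described above can be illustrated with a minimal, self-contained sketch. It does not use BURLAP: `CachingPlanner`, `PlanningAgent`, the `String` states, and the placeholder value and policy rules are all hypothetical stand-ins, chosen only to show a planner that skips states it has already planned for.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: every type here is a hypothetical stand-in,
// not a BURLAP class.
class LazyPlanningSketch {

    /** Stand-in planner that remembers which states it has already planned for. */
    static class CachingPlanner {
        private final Map<String, Double> values = new HashMap<>();
        int planCalls = 0; // number of times planning actually ran

        /** Plans for the state only if no value has been computed for it yet. */
        void planFromState(String state) {
            if (values.containsKey(state)) {
                return; // already planned: skip the expensive work
            }
            planCalls++;
            // Placeholder "value": a real planner would run dynamic programming here.
            values.put(state, (double) state.length());
        }

        double value(String state) {
            return values.get(state);
        }
    }

    /** Stand-in agent: plan for the current state first, then follow the policy. */
    static class PlanningAgent {
        private final CachingPlanner planner;

        PlanningAgent(CachingPlanner planner) {
            this.planner = planner;
        }

        String action(String state) {
            planner.planFromState(state);
            // Placeholder policy that reads the planner's value estimate.
            return planner.value(state) > 2.0 ? "advance" : "wait";
        }
    }

    public static void main(String[] args) {
        CachingPlanner planner = new CachingPlanner();
        PlanningAgent agent = new PlanningAgent(planner);
        agent.action("s0");
        agent.action("s0");   // second visit reuses the cached plan
        agent.action("goal");
        System.out.println("planning runs: " + planner.planCalls);
    }
}
```

In this sketch the cache check lives in the planner rather than in the agent, which mirrors the note above that the planning object, not the agent, should avoid replanning.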

Author:

James MacGlashan

Field Summary

Fields
Modifier and Type	Field and Description
`protected int`	`agentNum`
`protected MADynamicProgramming`	`planner` The planner this agent will use to estimate the value function and thereby determine its policy.
`protected PolicyFromJointPolicy`	`policy` The policy this agent will follow, derived from a joint policy that is in turn derived from the planner's value function estimate.
`protected boolean`	`setAgentDefinitions` Whether the agent definitions for this planner have been set yet.

Fields inherited from class burlap.mdp.stochasticgames.agent.SGAgentBase
agentType, domain, internalRewardFunction, world, worldAgentName

Constructor Summary

Constructors
Constructor and Description
`MultiAgentDPPlanningAgent(SGDomain domain, MADynamicProgramming planner, PolicyFromJointPolicy policy, java.lang.String agentName, SGAgentType agentType)` Initializes.

Method Summary

All Methods Instance Methods Concrete Methods
Modifier and Type	Method and Description
`Action`	`action(State s)` This method is called by the world when it needs the agent to choose an action.
`void`	`gameStarting(World w, int agentNum)` This method is called by the world when a new game is starting.
`void`	`gameTerminated()` This method is called by the world when a game has ended.
`void`	`observeOutcome(State s, JointAction jointAction, double[] jointReward, State sprime, boolean isTerminal)` This method is called by the world when every agent in the world has taken their action.
`void`	`setPolicy(PolicyFromJointPolicy policy)` Sets the policy, derived from this agent's planner, that the agent will follow.

Methods inherited from class burlap.mdp.stochasticgames.agent.SGAgentBase
agentName, agentType, getInternalRewardFunction, init, init, setAgentDetails, setInternalRewardFunction

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - planner
```
protected MADynamicProgramming planner
```
    The planner this agent will use to estimate the value function and thereby determine its policy.
  - policy
```
protected PolicyFromJointPolicy policy
```
    The policy this agent will follow, derived from a joint policy that is in turn derived from the planner's value function estimate.
  - setAgentDefinitions
```
protected boolean setAgentDefinitions
```
    Whether the agent definitions for this planner have been set yet.
  - agentNum
```
protected int agentNum
```
- Constructor Detail
  - MultiAgentDPPlanningAgent
```
public MultiAgentDPPlanningAgent(SGDomain domain,
                                 MADynamicProgramming planner,
                                 PolicyFromJointPolicy policy,
                                 java.lang.String agentName,
                                 SGAgentType agentType)
```
    Initializes the agent. The underlying joint policy of the policy must be an instance of MAQSourcePolicy, or a runtime exception will be thrown. The joint policy will automatically be set to use the provided planner as the value function source.
    
    Parameters:
    
    domain - the domain in which the agent will act
    
    planner - the planner the agent should use for determining its policy
    
    policy - the policy that will use the planner's value function as a source
    
    agentName - the name of the agent
    
    agentType - the SGAgentType for the agent defining its action space
- Method Detail
  - setPolicy
```
public void setPolicy(PolicyFromJointPolicy policy)
```
    Sets the policy, derived from this agent's planner, that the agent will follow. The underlying joint policy of the policy must be an instance of MAQSourcePolicy, or a runtime exception will be thrown. The joint policy will automatically be set to use this agent's planner as the value function source.
    
    Parameters:
    
    policy - the policy that will use the planner's value function as a source
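    The type check and source wiring described for setPolicy can be sketched as follows. This is not BURLAP's implementation: `JointPolicy`, `QSourceJointPolicy`, `FixedJointPolicy`, `Agent`, and the `qSource` field are hypothetical stand-ins used only to show the reject-or-wire pattern.

```java
// Illustrative sketch only: every type here is a hypothetical stand-in,
// not a BURLAP class.
class SetPolicySketch {

    /** Stand-in for any joint policy. */
    interface JointPolicy {}

    /** Stand-in for a joint policy that reads its values from a Q-value source. */
    static class QSourceJointPolicy implements JointPolicy {
        Object qSource; // value function source, assigned by the agent
    }

    /** Stand-in for a joint policy that has no Q-value source. */
    static class FixedJointPolicy implements JointPolicy {}

    /** Stand-in agent that owns a planner and accepts only Q-source policies. */
    static class Agent {
        private final Object planner;
        JointPolicy jointPolicy;

        Agent(Object planner) {
            this.planner = planner;
        }

        void setPolicy(JointPolicy candidate) {
            // Reject policies that cannot take the planner as their value source.
            if (!(candidate instanceof QSourceJointPolicy)) {
                throw new RuntimeException("joint policy must read its values from a Q-source");
            }
            // Wire the agent's own planner in as the value function source.
            ((QSourceJointPolicy) candidate).qSource = planner;
            this.jointPolicy = candidate;
        }
    }

    public static void main(String[] args) {
        Object planner = new Object();
        Agent agent = new Agent(planner);

        QSourceJointPolicy accepted = new QSourceJointPolicy();
        agent.setPolicy(accepted);
        System.out.println("source wired: " + (accepted.qSource == planner));

        try {
            agent.setPolicy(new FixedJointPolicy());
        } catch (RuntimeException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

    Failing fast when the policy is set, rather than when the first action is requested, surfaces a misconfigured policy before any game starts.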
  - gameStarting
```
public void gameStarting(World w,
                         int agentNum)
```
    Description copied from interface: SGAgent
    
    This method is called by the world when a new game is starting.
    
    Parameters:
    
    w - the world in which the game is starting
    
    agentNum - the agent number of the agent in the world
  - action
```
public Action action(State s)
```
    Description copied from interface: SGAgent
    
    This method is called by the world when it needs the agent to choose an action.
    
    Parameters:
    
    s - the current state of the world
    
    Returns:
    
    the action this agent wishes to take
  - observeOutcome
```
public void observeOutcome(State s,
                           JointAction jointAction,
                           double[] jointReward,
                           State sprime,
                           boolean isTerminal)
```
    Description copied from interface: SGAgent
    
    This method is called by the world when every agent in the world has taken their action. It conveys the result of the joint action.
    
    Parameters:
    
    s - the state in which the last action of each agent was taken
    
    jointAction - the joint action of all agents in the world
    
    jointReward - the joint reward of all agents in the world
    
    sprime - the next state to which the agent transitioned
    
    isTerminal - whether the new state is a terminal state
  - gameTerminated
```
public void gameTerminated()
```
    Description copied from interface: SGAgent
    
    This method is called by the world when a game has ended.
