public class MultiAgentDPPlanningAgent extends SGAgentBase
An agent that uses a MADynamicProgramming planning algorithm to compute the value of each state and then follows a policy derived from a joint policy that is itself derived from that estimated value function. At each step, MADynamicProgramming.planFromState(State) is called first and the agent then follows the policy. Ideally, the planning object should only perform planning for a state if it has not already planned for it. The joint policy underlying the policy the agent follows must be an instance of MAQSourcePolicy. Furthermore, when the policy is set, the underlying joint policy is automatically set to use this agent's planning object as the value function source, and its set of agents is automatically set to those involved in this agent's world. The PolicyFromJointPolicy is also told that this agent is its target.

Modifier and Type | Field and Description |
---|---|
protected int | agentNum |
protected MADynamicProgramming | planner: The planner this agent will use to estimate the value function and thereby determine its policy. |
protected PolicyFromJointPolicy | policy: The policy, derived from a joint policy that is derived from the planner's value function estimate, that this agent will follow. |
protected boolean | setAgentDefinitions: Whether the agent definitions for this planner have been set yet. |

Fields inherited from SGAgentBase: agentType, domain, internalRewardFunction, world, worldAgentName

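
The plan-then-act step described in the class overview can be illustrated with a minimal sketch. It assumes a planner that remembers which states it has already planned for, so repeat visits do no extra work. CachingPlanner and Agent are hypothetical stand-ins with String states; they are not the library's MADynamicProgramming or MultiAgentDPPlanningAgent classes, and the policy lookup is a placeholder.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-ins for illustration only; String replaces State and Action.
final class PlanThenActSketch {

    /** Stand-in planner that performs planning for each state at most once. */
    static final class CachingPlanner {
        private final Set<String> planned = new HashSet<>();
        int planCalls = 0; // how many times real planning work was done

        void planFromState(String state) {
            // Set.add returns false when the state was already planned for
            if (planned.add(state)) {
                planCalls++;
            }
        }
    }

    /** Stand-in agent: plan for the current state first, then consult the policy. */
    static final class Agent {
        private final CachingPlanner planner;

        Agent(CachingPlanner planner) {
            this.planner = planner;
        }

        String action(String state) {
            planner.planFromState(state);      // step 1: ensure value estimates exist
            return "actionFor(" + state + ")"; // step 2: placeholder policy lookup
        }
    }

    public static void main(String[] args) {
        CachingPlanner planner = new CachingPlanner();
        Agent agent = new Agent(planner);
        for (String s : new String[] {"s0", "s1", "s0"}) {
            System.out.println(agent.action(s));
        }
        System.out.println("planning runs: " + planner.planCalls);
    }
}
```

Keeping the already-planned check inside the planner means the agent can call planFromState unconditionally at every step.
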
Constructor and Description |
---|
MultiAgentDPPlanningAgent(SGDomain domain, MADynamicProgramming planner, PolicyFromJointPolicy policy, java.lang.String agentName, SGAgentType agentType): Initializes. |
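
The requirement that the joint policy be an MAQSourcePolicy, and the automatic wiring that happens when a policy is set, might be enforced roughly as follows. This is a sketch under stated assumptions: every type here is a hypothetical stand-in, and the field names qSource and target are invented for illustration rather than taken from the library.

```java
// Hypothetical stand-ins; none of these are the library's classes or method names.
final class PolicyWiringSketch {

    interface JointPolicy { }

    /** Stand-in for a joint policy that reads its values from a Q-source. */
    static final class QSourceJointPolicy implements JointPolicy {
        Object qSource; // assumed field: where value estimates come from
    }

    /** Stand-in for a joint policy that cannot accept a Q-source. */
    static final class OtherJointPolicy implements JointPolicy { }

    /** Stand-in for the single-agent policy that wraps a joint policy. */
    static final class DerivedPolicy {
        final JointPolicy joint;
        Object target; // assumed field: the agent this policy acts for

        DerivedPolicy(JointPolicy joint) {
            this.joint = joint;
        }
    }

    static final class Agent {
        final Object planner = new Object(); // placeholder for the planning object
        DerivedPolicy policy;

        void setPolicy(DerivedPolicy p) {
            // Reject joint policies that cannot take a value function source
            if (!(p.joint instanceof QSourceJointPolicy)) {
                throw new RuntimeException("joint policy must be a Q-source policy");
            }
            ((QSourceJointPolicy) p.joint).qSource = planner; // planner supplies values
            p.target = this;                                  // this agent is the target
            this.policy = p;
        }
    }

    public static void main(String[] args) {
        Agent agent = new Agent();
        QSourceJointPolicy joint = new QSourceJointPolicy();
        agent.setPolicy(new DerivedPolicy(joint));
        System.out.println("wired to planner: " + (joint.qSource == agent.planner));
        try {
            agent.setPolicy(new DerivedPolicy(new OtherJointPolicy()));
        } catch (RuntimeException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```
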
Modifier and Type | Method and Description |
---|---|
Action | action(State s): This method is called by the world when it needs the agent to choose an action. |
void | gameStarting(World w, int agentNum): This method is called by the world when a new game is starting. |
void | gameTerminated(): This method is called by the world when a game has ended. |
void | observeOutcome(State s, JointAction jointAction, double[] jointReward, State sprime, boolean isTerminal): This method is called by the world when every agent in the world has taken their action. |
void | setPolicy(PolicyFromJointPolicy policy): Sets the policy, derived from this agent's planner, to follow. |

Methods inherited from SGAgentBase: agentName, agentType, getInternalRewardFunction, init, init, setAgentDetails, setInternalRewardFunction

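
The four world callbacks in the method summary can be read as a lifecycle. The sketch below assumes the world calls them in the order their descriptions suggest: gameStarting once, then action followed by observeOutcome at each step, then gameTerminated once. Callbacks, RecordingAgent, and playGame are hypothetical stand-ins; String replaces State, Action, and JointAction, and no World object is modelled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins that mirror the callback names from the method summary.
final class CallbackOrderSketch {

    interface Callbacks {
        void gameStarting(int agentNum);
        String action(String s);
        void observeOutcome(String s, String jointAction, double[] jointReward,
                            String sprime, boolean isTerminal);
        void gameTerminated();
    }

    /** Records each callback so the call order can be inspected. */
    static final class RecordingAgent implements Callbacks {
        final List<String> log = new ArrayList<>();

        public void gameStarting(int agentNum) { log.add("gameStarting"); }

        public String action(String s) { log.add("action"); return "noop"; }

        public void observeOutcome(String s, String jointAction, double[] jointReward,
                                   String sprime, boolean isTerminal) {
            log.add("observeOutcome");
        }

        public void gameTerminated() { log.add("gameTerminated"); }
    }

    /** Assumed driver loop: one single-agent game over a fixed state sequence. */
    static void playGame(Callbacks agent, String[] states) {
        agent.gameStarting(0);
        for (int i = 0; i + 1 < states.length; i++) {
            String chosen = agent.action(states[i]);
            boolean terminal = (i + 2 == states.length); // last transition ends the game
            agent.observeOutcome(states[i], chosen, new double[] {0.0},
                                 states[i + 1], terminal);
        }
        agent.gameTerminated();
    }

    public static void main(String[] args) {
        RecordingAgent agent = new RecordingAgent();
        playGame(agent, new String[] {"s0", "s1", "s2"});
        System.out.println(agent.log);
    }
}
```
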
protected MADynamicProgramming planner
protected PolicyFromJointPolicy policy
protected boolean setAgentDefinitions
protected int agentNum
public MultiAgentDPPlanningAgent(SGDomain domain, MADynamicProgramming planner, PolicyFromJointPolicy policy, java.lang.String agentName, SGAgentType agentType)
Initializes. The joint policy underlying the policy must be an instance of MAQSourcePolicy or a runtime exception will be thrown. The joint policy will automatically be set to use the provided planner as the value function source.
Parameters:
domain - the domain in which the agent will act
planner - the planner the agent should use for determining its policy
policy - the policy that will use the planner's value function as a source
agentName - the name of the agent
agentType - the SGAgentType for the agent, defining its action space

public void setPolicy(PolicyFromJointPolicy policy)
Sets the policy to follow. The joint policy underlying the policy must be an instance of MAQSourcePolicy or a runtime exception will be thrown. The joint policy will automatically be set to use this agent's planner as the value function source.
Parameters:
policy - the policy that will use the planner's value function as a source

public void gameStarting(World w, int agentNum)
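Description copied from SGAgent: This method is called by the world when a new game is starting.
Parameters:
w - the world in which the game is starting
agentNum - the agent number of the agent in the world

public Action action(State s)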
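Description copied from SGAgent: This method is called by the world when it needs the agent to choose an action.
Parameters:
s - the current state of the world

public void observeOutcome(State s, JointAction jointAction, double[] jointReward, State sprime, boolean isTerminal)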
Description copied from SGAgent: This method is called by the world when every agent in the world has taken their action.
Parameters:
s - the state in which the last action of each agent was taken
jointAction - the joint action of all agents in the world
jointReward - the joint reward of all agents in the world
sprime - the next state to which the agent transitioned
isTerminal - whether the new state is a terminal state

public void gameTerminated()
Description copied from SGAgent: This method is called by the world when a game has ended.