burlap.behavior.stochasticgames.madynamicprogramming.policies

Class Summary
Class	Description
ECorrelatedQJointPolicy	A joint policy that computes the correlated equilibrium using the Q-values of the agents as input and then follows that equilibrium policy, except that with probability epsilon it returns a random action instead.
EGreedyJointPolicy	An epsilon greedy joint policy, in which the joint action with the highest Q-value for a given target agent is returned a 1-epsilon fraction of the time, and a random joint action an epsilon fraction of the time.
EGreedyMaxWellfare	An epsilon greedy joint policy, in which the joint action with the highest aggregate Q-value, summed over all agents, is returned a 1-epsilon fraction of the time, and a random joint action an epsilon fraction of the time.
EMinMaxPolicy	Class for following a minmax joint policy.
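
The epsilon greedy selection described for EGreedyJointPolicy and EGreedyMaxWellfare can be illustrated with a minimal standalone sketch. It does not use the BURLAP API: the class EpsilonGreedyJointSketch, its methods, and the q[agent][jointAction] array layout are all hypothetical names chosen for illustration, and ties are broken by lowest index, which may differ from what the library does.

```java
import java.util.Random;

// Hypothetical illustration only; not part of BURLAP.
public class EpsilonGreedyJointSketch {

    /**
     * Returns the index of the joint action to play: a uniformly random
     * index with probability epsilon, otherwise the index with the highest
     * score (lowest index wins ties).
     */
    public static int select(double[] scores, double epsilon, Random rng) {
        if (rng.nextDouble() < epsilon) {
            return rng.nextInt(scores.length);
        }
        int best = 0;
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] > scores[best]) {
                best = i;
            }
        }
        return best;
    }

    /** Scores each joint action by the Q-value of a single target agent. */
    public static double[] targetAgentScores(double[][] q, int agent) {
        return q[agent].clone();
    }

    /** Scores each joint action by the sum of all agents' Q-values. */
    public static double[] welfareScores(double[][] q) {
        double[] sum = new double[q[0].length];
        for (double[] agentQ : q) {
            for (int j = 0; j < agentQ.length; j++) {
                sum[j] += agentQ[j];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // Assumed layout: q[agent][jointAction], two agents, three joint actions.
        double[][] q = {
            {1.0, 5.0, 2.0},
            {4.0, 0.0, 4.5}
        };
        Random rng = new Random(0);
        // With epsilon = 0 the choice is purely greedy.
        System.out.println(select(targetAgentScores(q, 0), 0.0, rng)); // prints 1
        System.out.println(select(welfareScores(q), 0.0, rng));        // prints 2
    }
}
```

In this example the two scoring rules disagree: agent 0 alone prefers joint action 1 (Q-value 5.0), while the summed Q-values (5.0, 5.0, 6.5) favor joint action 2.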
