Class | Description |
---|---|
ECorrelatedQJointPolicy | A joint policy that computes a correlated equilibrium from the agents' Q-values and then either follows that equilibrium policy or, with probability epsilon, returns a random action. |
EGreedyJointPolicy | An epsilon-greedy joint policy: the joint action with the highest Q-value for a given target agent is returned with probability 1 - epsilon, and a random joint action is returned with probability epsilon. |
EGreedyMaxWellfare | An epsilon-greedy joint policy: the joint action with the highest aggregate Q-value across the agents is returned with probability 1 - epsilon, and a random joint action is returned with probability epsilon. |
EMinMaxPolicy | Class for following a minmax joint policy. |
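
The difference between the target-agent and aggregate selection rules described above can be illustrated with a small stand-alone sketch. It does not use the library's classes: `EpsilonGreedyJointSketch`, its methods, and the `q[agent][jointAction]` array layout are all hypothetical, and "aggregate" is assumed to mean the sum of the agents' Q-values.

```java
import java.util.Random;

// Hypothetical stand-alone sketch; none of these names come from the library.
final class EpsilonGreedyJointSketch {

    // Joint action index with the highest Q-value for one target agent.
    // Assumed layout: q[agent][jointAction].
    static int greedyForAgent(double[][] q, int targetAgent) {
        int best = 0;
        for (int a = 1; a < q[targetAgent].length; a++) {
            if (q[targetAgent][a] > q[targetAgent][best]) {
                best = a;
            }
        }
        return best;
    }

    // Joint action index with the highest Q-value summed over all agents
    // (assumption: "aggregate" means a plain sum).
    static int greedyMaxWelfare(double[][] q) {
        int best = 0;
        double bestSum = Double.NEGATIVE_INFINITY;
        for (int a = 0; a < q[0].length; a++) {
            double sum = 0.0;
            for (double[] agentQ : q) {
                sum += agentQ[a];
            }
            if (sum > bestSum) {
                bestSum = sum;
                best = a;
            }
        }
        return best;
    }

    // With probability epsilon return a uniformly random joint action,
    // otherwise return the supplied greedy choice.
    static int epsilonGreedy(int greedyAction, int numActions, double epsilon, Random rng) {
        if (rng.nextDouble() < epsilon) {
            return rng.nextInt(numActions);
        }
        return greedyAction;
    }

    public static void main(String[] args) {
        // Two agents, three joint actions.
        double[][] q = { {5.0, 1.0, 3.0}, {0.0, 2.0, 3.5} };
        Random rng = new Random(42);

        int forAgent0 = greedyForAgent(q, 0); // 0: agent 0 prefers joint action 0
        int welfare = greedyMaxWelfare(q);    // 2: sums are 5.0, 3.0, 6.5

        System.out.println("target-agent choice: " + epsilonGreedy(forAgent0, 3, 0.1, rng));
        System.out.println("max-welfare choice:  " + epsilonGreedy(welfare, 3, 0.1, rng));
    }
}
```

In this example the two rules disagree: joint action 0 is best for agent 0 alone, while joint action 2 maximizes the summed Q-values. Setting epsilon to 0 makes either policy purely greedy.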