Multiplayer Interface

CommonRLInterface provides a basic interface for multiplayer games.

Sequential games

Sequential games should implement the optional function players to return a range of player ids, and the optional function player to indicate whose turn it is. Players need not take turns in the order returned by players. Only the current player's action should be passed to act!, but act! should return rewards for all players. observe returns the observation for the current player only.
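As a rough illustration of the sequential conventions above, the following sketch defines a small turn-based game. The CountingGame type, its fields, the TARGET constant, and the RL alias are hypothetical names made up for this example; the sketch assumes the CommonRLInterface package is installed and only extends functions named on this page plus the basic reset!, actions, and terminated functions.

```julia
using CommonRLInterface
const RL = CommonRLInterface   # hypothetical shorthand alias

# Hypothetical two-player game: players alternately add 1 or 2 to a shared
# counter, and whoever brings it to TARGET or beyond wins.
mutable struct CountingGame <: RL.AbstractEnv
    total::Int     # current value of the shared counter
    current::Int   # id of the player whose turn it is
end
CountingGame() = CountingGame(0, 1)

const TARGET = 5

# Basic environment functions
RL.reset!(env::CountingGame) = (env.total = 0; env.current = 1; nothing)
RL.actions(env::CountingGame) = (1, 2)
RL.terminated(env::CountingGame) = env.total >= TARGET

# Observation for the current player only
RL.observe(env::CountingGame) = env.total

# Multiplayer functions: static player ids and the player to move
RL.players(env::CountingGame) = 1:2
RL.player(env::CountingGame) = env.current

# Takes the current player's action, returns rewards for all players
function RL.act!(env::CountingGame, a)
    mover = env.current
    env.total += a
    rewards = zeros(2)
    if env.total >= TARGET
        rewards[mover] = 1.0
        rewards[3 - mover] = -1.0
    end
    env.current = 3 - mover   # hand the turn to the other player
    return rewards
end
```

A driver loop for such an environment would typically call player(env) to decide which policy to query, pass that single action to act!, and index the returned reward vector by player id.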

Simultaneous games/Multi-agent (PO)MDPs

Environments in which all players act at once should implement the optional functions all_act! and all_observe. all_act! takes a collection of actions, one for each player, and returns rewards for all players; all_observe returns an observation for each player.
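The simultaneous case might look like the sketch below, a one-shot matching-pennies game. The MatchingPennies type, its last field, and the RL alias are hypothetical, and returning observations as a vector indexed by player id is an assumption of this example rather than something the interface text specifies. It assumes the CommonRLInterface package is installed.

```julia
using CommonRLInterface
const RL = CommonRLInterface   # hypothetical shorthand alias

# Hypothetical one-shot game: player 1 wins if the two coins match,
# player 2 wins if they differ.
mutable struct MatchingPennies <: RL.AbstractEnv
    last::Vector{Int}   # most recent joint action; empty before play
end
MatchingPennies() = MatchingPennies(Int[])

RL.reset!(env::MatchingPennies) = (empty!(env.last); nothing)
RL.actions(env::MatchingPennies) = (1, 2)   # 1 = heads, 2 = tails
RL.terminated(env::MatchingPennies) = !isempty(env.last)
RL.players(env::MatchingPennies) = 1:2

# Takes one action per player and returns one reward per player
function RL.all_act!(env::MatchingPennies, as::AbstractVector)
    env.last = collect(Int, as)
    r = as[1] == as[2] ? 1.0 : -1.0
    return [r, -r]
end

# One observation per player; here every player sees the last joint action
RL.all_observe(env::MatchingPennies) = [copy(env.last) for _ in RL.players(env)]
```

In this sketch act! and observe are left unimplemented, which matches the "in addition to or instead of" wording used for all_act! and all_observe.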

Indicating reward properties

The UtilityStyle trait can be used to declare properties that the rewards satisfy, for example that all players receive identical rewards or that the game is zero-sum.
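A minimal sketch of declaring and consuming the trait follows. PennyGame, TeamGame, the describe helper, and the RL alias are hypothetical names for illustration, and the sketch assumes that ZeroSum and IdenticalUtility are concrete subtypes of UtilityStyle that can be dispatched on. It also assumes the CommonRLInterface package is installed.

```julia
using CommonRLInterface
const RL = CommonRLInterface   # hypothetical shorthand alias

struct PennyGame <: RL.AbstractEnv end   # hypothetical zero-sum environment
struct TeamGame <: RL.AbstractEnv end    # hypothetical cooperative environment

# Declare the reward structure of each environment
RL.UtilityStyle(::PennyGame) = RL.ZeroSum()
RL.UtilityStyle(::TeamGame) = RL.IdenticalUtility()

# Hypothetical helper showing how a solver could dispatch on the trait
describe(env) = describe(RL.UtilityStyle(env))
describe(::RL.ZeroSum) = "rewards sum to zero"
describe(::RL.IdenticalUtility) = "all players receive the same reward"
describe(::RL.UtilityStyle) = "no special structure assumed"
```

Dispatching on the trait value rather than on the environment type lets one algorithm implementation serve every environment that declares the same reward structure.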

CommonRLInterface.players (Function)
players(env::AbstractEnv)

Return an ordered iterable collection of integer indices for all players, starting with one.

This function is a static property of the environment; the value it returns should not change based on the state.

Example

players(::MyEnv) = 1:2
CommonRLInterface.all_act! (Function)
all_act!(env::AbstractEnv, actions::AbstractVector)

Take actions for all players, advance the environment env forward, and return rewards for all players.

Environments that support simultaneous actions by all players should implement this in addition to or instead of act!.

CommonRLInterface.all_observe (Function)
all_observe(env::AbstractEnv)

Return observations from the environment for all players.

Environments that support simultaneous actions by all players should implement this in addition to or instead of observe.

CommonRLInterface.UtilityStyle (Type)
UtilityStyle(env)

Trait that allows an environment to declare certain properties about the relative utility for the players.

Possible return values are:

  • ZeroSum()
  • ConstantSum()
  • GeneralSum()
  • IdenticalUtility()

See the docstrings for each for more details.
