class ple.PLE(game, fps=30, frame_skip=1, num_steps=1, reward_values={}, force_fps=True, display_screen=False, add_noop_action=True, NOOP=K_F15, state_preprocessor=None, rng=24)

Main wrapper that interacts with games. Provides a similar interface to Arcade Learning Environment.



game: Class from ple.games.base

The game the PLE environment manipulates and maintains.

fps: int (default: 30)

The desired frames per second we want to run our game at. Typical settings are 30 and 60 fps.

frame_skip: int (default: 1)

The number of frames to skip between observations while repeating an action.

num_steps: int (default: 1)

The number of times we repeat an action.

reward_values: dict

This contains the rewards we wish to give our agent based on different events in game. The current defaults are as follows:

rewards = {
    "positive": 1.0,
    "negative": -1.0,
    "tick": 0.0,
    "loss": -5.0,
    "win": 5.0
}
The tick reward is given to the agent at each game step. You can selectively adjust the rewards by passing a dictionary containing only the keys you want to change. E.g. to adjust the negative reward and the tick reward we would pass in the following:

rewards = {
    "negative": -2.0,
    "tick": -0.01
}
Keep in mind that the tick reward is applied at each frame. If the game is running at 60 fps the agent will receive a reward of 60*tick each second.
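As a sketch of how such overrides combine with the defaults (a plain dict merge, mirroring the behaviour described above):

```python
# Default reward values, as listed above.
defaults = {
    "positive": 1.0,
    "negative": -1.0,
    "tick": 0.0,
    "loss": -5.0,
    "win": 5.0,
}

# Only the keys we pass are overridden; every other key keeps its default.
overrides = {"negative": -2.0, "tick": -0.01}
reward_values = {**defaults, **overrides}

# At 60 fps the tick penalty accumulates to 60 * -0.01 = -0.6 per second.
per_second_tick_reward = 60 * reward_values["tick"]
```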

force_fps: bool (default: True)

If False, PLE delays between game.step() calls so the game runs at the specified fps. If True, PLE instead passes a fixed elapsed-time delta so that the game steps by an amount of time consistent with the specified fps. This is usually set to True, as it allows the game to run as fast as possible, which speeds up training.

display_screen: bool (default: False)

Whether to draw updates to the screen. Disabling this speeds up interaction. It can be toggled to True during testing phases so you can observe the agent's progress.

add_noop_action: bool (default: True)

This inserts the NOOP action specified as a valid move the agent can make.

NOOP: pygame.constants (default: K_F15)

The key we want our agent to send that represents a NOOP. This is currently set to F15.

state_preprocessor: python function (default: None)

Python function which takes a dict representing game state and returns a numpy array.
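A minimal sketch of such a preprocessor. The state keys player_y, ball_x, and ball_y are hypothetical; the actual keys depend on the game in question:

```python
import numpy as np

def state_preprocessor(state):
    """Flatten a game-state dict into a fixed-order numpy array."""
    # Sorting the keys gives a stable feature ordering across calls.
    return np.array([state[k] for k in sorted(state)], dtype=np.float32)

# Hypothetical state dict, for illustration only.
example = {"player_y": 24.0, "ball_x": 10.0, "ball_y": 12.0}
vec = state_preprocessor(example)  # order: ball_x, ball_y, player_y
```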

rng: numpy.random.RandomState, int, array_like or None. (default: 24)

Random number generator used by PLE and the games.


act(action)

Perform an action on the game. Frames are run in lockstep with actions: if act is not called the game will not run.


action: int

The index of the action we wish to perform. The index usually corresponds to the index item returned by getActionSet().



Returns the reward that the agent has accumulated while performing the action.


game_over()

Returns True if the game has reached a terminal state and False otherwise.

This state is game dependent.
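Putting act and game_over together, a minimal random-agent loop might look like this. This is a sketch assuming PLE and one of its bundled games (Pong here) are installed; the import guard lets the sketch run even without them:

```python
import random

# Guard the import so the sketch degrades gracefully when PLE is absent.
try:
    from ple import PLE
    from ple.games.pong import Pong
    HAVE_PLE = True
except ImportError:
    HAVE_PLE = False

if HAVE_PLE:
    env = PLE(Pong(), fps=30, display_screen=False)
    env.init()                    # must be called explicitly before acting
    actions = env.getActionSet()  # includes the NOOP action by default
    total_reward = 0.0
    while not env.game_over():
        # The game only advances when act() is called.
        total_reward += env.act(random.choice(actions))
```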


getActionSet()

Gets the actions the game supports. Optionally inserts the NOOP action if PLE has add_noop_action set to True.


list of pygame.constants

The agent can simply select the index of the action to perform.


getFrameNumber()

Gets the current number of frames the agent has seen since PLE was initialized.


getGameState()

Gets a non-visual state representation of the game.

This can include items such as player position, velocity, ball location and velocity etc.


dict or None

Returns a dict of game information. This depends greatly on the game in question and must be referenced against each game. If no state is available or supported, None is returned.


getGameStateDims()

Gets the game's non-visual state dimensions.


tuple of int or None

Returns a tuple of the state vector's shape, or None if the game does not support it.


getScreenDims()

Gets the game's screen dimensions.


tuple of int

Returns a tuple of the following format (screen_width, screen_height).


getScreenGrayscale()

Gets the current game screen in grayscale format. Converts from RGB using relative luminance.


numpy uint8 array

Returns a numpy array with the shape (width, height).
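The RGB-to-grayscale conversion can be sketched with the standard Rec. 709 relative-luminance weights. This is only a sketch; the exact coefficients PLE uses may differ:

```python
import numpy as np

# Rec. 709 relative-luminance weights for the R, G, and B channels.
WEIGHTS = np.array([0.2126, 0.7152, 0.0722])

def to_grayscale(rgb):
    """Convert a (width, height, 3) uint8 RGB screen to (width, height) grayscale."""
    return np.dot(rgb.astype(np.float64), WEIGHTS).astype(np.uint8)

# A tiny 1x2 "screen": one white pixel and one black pixel.
screen = np.array([[[255, 255, 255], [0, 0, 0]]], dtype=np.uint8)
gray = to_grayscale(screen)  # white maps near 255, black to 0
```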


getScreenRGB()

Gets the current game screen in RGB format.


numpy uint8 array

Returns a numpy array with the shape (width, height, 3).


init()

Initializes the pygame environment and sets up the display and game clock.

This method should be explicitly called.


lives()

Gets the number of lives the agent has left. Not all games have the concept of lives.


reset_game()

Performs a reset of the game to a clean initial state.


saveScreen(filename)

Saves the current screen to a .png file.


filename : string

The path, including filename, where we want the image saved.


score()

Gets the score the agent currently has in the game.