Pong simulates 2D table tennis. The agent controls an in-game paddle which is used to hit the ball back to the other side.

The agent controls the left paddle while the CPU controls the right paddle.

Valid Actions

Up and down control the direction of the paddle. A small velocity is added to the paddle so that it moves smoothly.
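The velocity-based movement described above can be sketched as follows. This is only an illustration, not the game's actual code: the Paddle class, the accel and damping values, and the clamping to the screen are all assumptions made for the example. It also assumes screen coordinates in which y grows downward, so "up" is a negative direction.

```python
from dataclasses import dataclass


@dataclass
class Paddle:
    """Hypothetical paddle model; names and constants are illustrative."""
    y: float              # vertical centre of the paddle, in pixels
    height: float         # paddle height, in pixels
    screen_height: float  # height of the playing field, in pixels
    velocity: float = 0.0

    def step(self, direction: int, accel: float = 1.0, damping: float = 0.9) -> None:
        """Advance one frame. direction is -1 (up), +1 (down) or 0 (no action)."""
        # The action nudges the velocity instead of moving the paddle directly,
        # and damping lets the paddle coast to a stop when no action is given.
        self.velocity = (self.velocity + direction * accel) * damping
        self.y += self.velocity

        # Keep the paddle fully on screen (assumed behaviour).
        half = self.height / 2
        if self.y < half:
            self.y = half
            self.velocity = 0.0
        elif self.y > self.screen_height - half:
            self.y = self.screen_height - half
            self.velocity = 0.0


paddle = Paddle(y=24.0, height=8.0, screen_height=48.0)
paddle.step(-1)  # press "up" for one frame
paddle.step(0)   # release: the paddle keeps drifting upward, more slowly
```

Because the action changes velocity rather than position, a single key press produces a short glide instead of a one-pixel jump.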

Terminal states (game_over)

The game is over if either the agent or the CPU reaches the number of points set by MAX_SCORE.


Rewards

The agent receives a reward of +1 each time the ball gets past the opponent's paddle, and a reward of -1 each time the ball gets past its own paddle.
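The scoring and terminal rules above can be summarised in a few lines. This is a minimal sketch under stated assumptions, not the library's implementation: the ScoreKeeper class and its method names are hypothetical, and only the +1 / -1 rewards and the MAX_SCORE threshold come from the description.

```python
class ScoreKeeper:
    """Hypothetical helper that mirrors the reward and game_over rules."""

    def __init__(self, max_score: int = 11) -> None:
        self.max_score = max_score
        self.agent_score = 0
        self.cpu_score = 0

    def ball_out(self, behind_agent: bool) -> int:
        """Record a point and return the agent's reward for it."""
        if behind_agent:
            # The ball passed the agent's paddle: the CPU scores.
            self.cpu_score += 1
            return -1
        # The ball passed the CPU's paddle: the agent scores.
        self.agent_score += 1
        return 1

    def game_over(self) -> bool:
        """True once either side has reached max_score points."""
        return (self.agent_score >= self.max_score
                or self.cpu_score >= self.max_score)


keeper = ScoreKeeper(max_score=11)
reward = keeper.ball_out(behind_agent=False)  # agent scores
```

Note that the terminal condition depends on each side's own point total, not on the sum of rewards the agent has collected.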

class ple.games.pong.Pong(width=64, height=48, MAX_SCORE=11)

Loosely based on code from marti1125’s pong game.


width : int

Screen width.

height : int

Screen height. Recommended to be the same as the width.

MAX_SCORE : int (default: 11)

The number of points the agent or the CPU must score to trigger a terminal state.


getGameState()

Gets a non-visual state representation of the game. Returns a dict containing:



  • player y position.
  • player's velocity.
  • cpu y position.
  • ball x position.
  • ball y position.
  • ball x velocity.
  • ball y velocity.

See code for structure.
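Many learning algorithms expect a fixed-order vector rather than a dict, so a small conversion step is often useful. The sketch below shows one way to do it. The key names in STATE_KEYS are assumptions chosen to match the seven entries listed above, and the sample values are made up; check the game's source for the real structure before relying on them.

```python
# Assumed key names, one per entry in the list above (not confirmed here).
STATE_KEYS = (
    "player_y",
    "player_velocity",
    "cpu_y",
    "ball_x",
    "ball_y",
    "ball_velocity_x",
    "ball_velocity_y",
)


def state_to_vector(state: dict) -> list:
    """Flatten a state dict into a list of floats in a fixed key order."""
    missing = [key for key in STATE_KEYS if key not in state]
    if missing:
        raise KeyError(f"state is missing keys: {missing}")
    return [float(state[key]) for key in STATE_KEYS]


# Made-up sample values, for illustration only.
example_state = {
    "player_y": 24.0,
    "player_velocity": -0.9,
    "cpu_y": 20.0,
    "ball_x": 32.0,
    "ball_y": 24.0,
    "ball_velocity_x": 3.0,
    "ball_velocity_y": -1.5,
}

vector = state_to_vector(example_state)
```

With the library installed, the dict would come from the game's state method instead of a hand-written sample. Fixing the key order in one place keeps the vector layout stable between runs.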