
Mohamed Haseeb

Software and machine learning engineer, interested in applying machine learning techniques to build innovative solutions.

Self-learning Pac-Man player using reinforcement learning


This project was carried out together with Maco Morik.

The goal of the project was to build a Pac-Man player that learns to play by exploration. The player was implemented using approximate Q-learning: a neural network (7 inputs, 1 hidden layer of 50 neurons, and 1 output) approximates the Q function. The inputs were an action (up/left/down/right) alongside 6 features extracted from the game state; a minimal sketch of such a network follows the list below. The features used are:

  • Distance to the closest food pill
  • Number of remaining food pills
  • Distance to the closest non-scared ghost
  • Distance to the closest scared ghost
  • Remaining scared time of the closest scared ghost
  • Ghost presence in a neighboring location
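
As a rough illustration, here is a minimal sketch of such a Q-function approximator. The layer sizes match the numbers above, but the rest (the QNetwork name, the tanh activation, NumPy, the initialization) are my own assumptions, not the project's actual code.

```python
import numpy as np

class QNetwork:
    """Sketch of a 7-50-1 MLP approximating Q(state, action).
    Activation and initialization are assumptions, not the project's code."""

    def __init__(self, n_inputs=7, n_hidden=50, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, (n_hidden, n_inputs))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.1, n_hidden)
        self.b2 = 0.0

    def forward(self, x):
        """x: length-7 vector = [encoded action] + the 6 features above."""
        h = np.tanh(self.W1 @ x + self.b1)   # hidden layer of 50 neurons
        return float(self.W2 @ h + self.b2)  # scalar Q-value estimate
```

Greedy play then evaluates the network once per legal action and picks the highest-scoring one; during training, an exploration scheme such as epsilon-greedy would occasionally pick a random action instead.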

The plot below shows the game score (averaged over 30 game episodes) during training. After 400 games, the player seems to have converged to some playing strategy (policy).

(Figure: avg_game_score, the game score averaged over 30 episodes during training)
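
For context, approximate Q-learning trains the network toward the one-step TD target r + gamma * max_a' Q(s', a'). Continuing from the QNetwork sketch above, here is one hedged update step; the learning rate, discount factor, and manual backpropagation are illustrative assumptions, not the project's actual training setup.

```python
def q_learning_step(net, x, reward, next_xs, alpha=0.01, gamma=0.9):
    """One approximate Q-learning update for the QNetwork sketch above.

    x       -- input vector for the (state, action) just taken
    next_xs -- input vectors for each legal action in the next state
               (empty list if the next state is terminal)
    """
    # TD target: r + gamma * max_a' Q(s', a')
    q_next = max((net.forward(nx) for nx in next_xs), default=0.0)
    target = reward + gamma * q_next

    # Forward pass, keeping intermediates for backpropagation.
    h = np.tanh(net.W1 @ x + net.b1)
    q = float(net.W2 @ h + net.b2)

    # Gradient-descent step on the squared TD error (Q(s,a) - target)^2.
    err = q - target
    dh = err * net.W2 * (1.0 - h ** 2)   # backprop through tanh
    net.W2 -= alpha * err * h
    net.b2 -= alpha * err
    net.W1 -= alpha * np.outer(dh, x)
    net.b1 -= alpha * dh
```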

When tested in 50 games, the player achieved a win rate of 97% against random ghosts and 53% against intelligent ghosts that chase Pac-Man (a player whose Q function was approximated with a linear regression model achieved win rates of 92% and 53% in the same two scenarios). The neural network improves over the linear model, but not significantly. This is most likely because the game state is approximated with a handful of features that do not capture all aspects of the state (e.g. there is no feature indicating the ghosts' direction of movement). Deep neural networks automatically extract "good" feature representations from raw inputs, so using them to approximate the Q function directly from the raw game state should yield a more powerful agent (provided they are trained correctly).

