A statistical analysis of chess games on lichess.org

FraserParlane/chess-analysis

Chess moves—a statistical analysis

Which are the most frequently used tiles on a chess board? Which tile is black most likely to move a pawn onto? What tile are both players most likely to lose a piece on? This repo provides an analysis of >1 million chess moves to answer these questions and more.

[Figure: heat plot]

Data source

Lichess is a free and open-source chess server with an API that provides access to all games played on the site. I use the Berserk Python wrapper to download the top X games from the top Y players on Lichess, resulting in Z total chess moves.
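The download step might look roughly like the sketch below. It is not the repository's process.py: the function names `read_token` and `fetch_top_player_games`, the `perf_type="blitz"` default, and the choice to yield games lazily are all assumptions made for illustration. It relies only on Berserk's `TokenSession`, `Client`, `users.get_leaderboard`, and `games.export_by_player`, and reads the token from the lichess.token file described under "To use".

```python
from pathlib import Path


def read_token(path="lichess.token"):
    """Return the API token stored in a plain-text file, stripped of whitespace."""
    return Path(path).read_text(encoding="utf-8").strip()


def fetch_top_player_games(token, n_players, n_games, perf_type="blitz"):
    """Yield (username, game) pairs for the top players of one leaderboard.

    Hypothetical helper: perf_type="blitz" is an assumption, since the
    README does not say which leaderboard the author used.
    """
    # Imported here so read_token() works even without Berserk installed.
    import berserk  # third-party: pip install berserk

    session = berserk.TokenSession(token)
    client = berserk.Client(session=session)

    # Leaderboard entries are dicts that include a "username" key.
    for player in client.users.get_leaderboard(perf_type, count=n_players):
        username = player["username"]
        # max= caps the number of games downloaded for this player.
        for game in client.games.export_by_player(username, max=n_games):
            yield username, game
```

Calling `fetch_top_player_games(read_token(), n_players, n_games)` with your own counts streams games one at a time, so a large download does not have to be held in memory at once.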

To use

  1. Create a Lichess account and create a token. Save this token as lichess.token within the repository.
  2. process.py fetches and cleans the data from Lichess. Set the number of top players and the number of games per player at the bottom of the script. The script saves a dataframe as plays.feather in which each row is one chess move, with the following columns:

     | column name | data type | description |
     | --- | --- | --- |
     | white | bool | Whether the move was made by white |
     | piece | str | 'K', 'Q', 'B', 'N', 'R', or 'P' |
     | posx | int | x position [0, 8] |
     | posy | int | y position [0, 8] |
     | kill | bool | Whether the move captured a piece |
     | check | bool | Whether the move resulted in check |
     | mate | bool | Whether the move resulted in mate |

  3. plot.py provides plotting helpers for creating heat plots that show how often each position on the board is moved to.
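To make the row format and the heat-plot counting concrete, here is a minimal pure-Python sketch. It is not taken from process.py or plot.py. The function names `san_to_row` and `heat_grid` are hypothetical, and it makes several assumptions: moves arrive in standard algebraic notation, posx and posy are zero-based file and rank indices (0 to 7), castling is recorded as a king move to its destination square, and a mating move also counts as check.

```python
import re


def san_to_row(san, white):
    """Convert one SAN move (e.g. "Nxf7+") into a dict shaped like a plays.feather row."""
    core = san.rstrip("+#!?")  # drop check/mate markers and annotations
    if core in ("O-O", "O-O-O"):
        # Assumption: castling is stored as the king's destination square.
        piece = "K"
        posx = 6 if core == "O-O" else 2
        posy = 0 if white else 7
    else:
        squares = re.findall(r"[a-h][1-8]", core)
        if not squares:
            raise ValueError(f"cannot parse move: {san!r}")
        file_char, rank_char = squares[-1]  # destination is the last square named
        # Pawn moves carry no piece letter in SAN.
        piece = core[0] if core[0] in "KQBNR" else "P"
        posx = ord(file_char) - ord("a")  # assumption: zero-based, a -> 0
        posy = int(rank_char) - 1         # assumption: zero-based, rank 1 -> 0
    return {
        "white": white,
        "piece": piece,
        "posx": posx,
        "posy": posy,
        "kill": "x" in core,
        "check": "+" in san or "#" in san,  # assumption: mate implies check
        "mate": "#" in san,
    }


def heat_grid(rows):
    """Count how many moves land on each square; grid[posy][posx] is the tally."""
    grid = [[0] * 8 for _ in range(8)]
    for row in rows:
        grid[row["posy"]][row["posx"]] += 1
    return grid
```

Filtering the rows before calling `heat_grid`, for example keeping only black pawn moves or only rows where `kill` is true, gives the per-piece and per-colour counts that the questions at the top of this README ask about.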

Examples

[Figures: two example heat plots]
