shrishtiroy/Quarter-Sense

NBA Play-by-Play Summarization with T5-Gemma

Overview

This repository builds a custom dataset of NBA play-by-play (pbp) logs and finetunes the T5-Gemma small model to generate concise, engaging summaries of each quarter of an NBA game. The pipeline covers scraping, preprocessing, and supervised fine-tuning.

Project structure

  • NBA_pbp_scraper.ipynb
    Scrapes official NBA play-by-play logs and stores actions (shots, fouls, turnovers, rebounds, etc.) in structured form.
  • pbp_preprocessing.ipynb
    Cleans and formats the scraped data, splits games by quarter, and pairs each quarter's pbp text with a ground-truth summary. The result is published as a Kaggle dataset.
  • finetuning_nba.ipynb
    Finetunes google/t5-gemma-small using Hugging Face training utilities, with experiment tracking via Weights & Biases.
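
The pairing step described for pbp_preprocessing.ipynb can be sketched roughly as below. This is a minimal illustration, not the notebook's actual code: the function name `build_pairs`, the record keys, and the game ID `G1` are hypothetical, and the real notebook may structure its data differently.

```python
from collections import defaultdict

def build_pairs(plays, summaries):
    """Group play lines by (game_id, quarter) and attach the matching summary.

    plays:     iterable of (game_id, quarter, line) tuples (hypothetical layout)
    summaries: dict mapping (game_id, quarter) to a ground-truth summary string
    """
    by_quarter = defaultdict(list)
    for game_id, quarter, line in plays:
        by_quarter[(game_id, quarter)].append(line)

    pairs = []
    for key in sorted(by_quarter):
        # Quarters without a ground-truth summary are skipped in this sketch.
        if key in summaries:
            pairs.append({
                "game_id": key[0],
                "quarter": key[1],
                "input": " ".join(by_quarter[key]),
                "output": summaries[key],
            })
    return pairs

# Illustrative data only; "G1" is a made-up game ID.
plays = [
    ("G1", 1, "12:00 Start of Period"),
    ("G1", 1, "11:41 Q.Post REBOUND"),
    ("G1", 2, "12:00 Start of Period"),
]
summaries = {("G1", 1): "The first quarter was tightly contested."}
pairs = build_pairs(plays, summaries)
```

Each resulting record holds one quarter's flattened pbp text as the input and its summary as the output, which matches the per-quarter granularity of the dataset.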

Dataset

  • Source: NBA Quarter Play Summaries (custom-built)
  • Granularity: Per quarter
  • Link: https://www.kaggle.com/datasets/shrishtiroy6/nba-quarter-play-summaries
  • Format example:
    • Input: Time TIMBERWOLVES Score Lead Warriors 12:00 Start of Period (9:16 PM) 12:00 Possession: Timberwolves 30-23 +7 11:44 MISS N.Alexander-Walker 25' 3PT Jump Shot 11:41 Q.Post REBOUND 11:34 30-25 +5 J.ButlerIII Driving Layup 11:34 N.Reid S.FOUL (P1, T1) (S.Wright) 11:34 30-26 +4 J.ButlerIII Free Throw 1 of 1
    • Output: The first quarter was tightly contested, with the Warriors leading thanks to strong three-point shooting from Curry and Hield.
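
Because the input is a single flattened string, it can help to split it back into individual events when inspecting the data. The sketch below assumes that every event starts with a game-clock timestamp such as `12:00` and that wall-clock times are always followed by AM or PM. Both assumptions come from the sample row above, not from the preprocessing notebook, and the name `split_events` is hypothetical.

```python
import re

# Game-clock timestamps such as "12:00" or "1:05". Wall-clock times such as
# "9:16 PM" are excluded by the negative lookahead. This pattern is an
# assumption based on the sample input, not the notebook's own parser.
CLOCK = re.compile(r"\b(\d{1,2}:\d{2})\b(?!\s*[AP]M)")

def split_events(pbp_text):
    """Split a flattened quarter log into (clock, description) pairs."""
    matches = list(CLOCK.finditer(pbp_text))
    events = []
    for i, match in enumerate(matches):
        # An event's description runs until the next timestamp (or end of text).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(pbp_text)
        events.append((match.group(1), pbp_text[match.end():end].strip()))
    return events

SAMPLE = (
    "Time TIMBERWOLVES Score Lead Warriors 12:00 Start of Period (9:16 PM) "
    "12:00 Possession: Timberwolves 30-23 +7 11:44 MISS N.Alexander-Walker 25' "
    "3PT Jump Shot 11:41 Q.Post REBOUND 11:34 30-25 +5 J.ButlerIII Driving Layup "
    "11:34 N.Reid S.FOUL (P1, T1) (S.Wright) 11:34 30-26 +4 J.ButlerIII Free Throw 1 of 1"
)
events = split_events(SAMPLE)
for clock, description in events:
    print(clock, "|", description)
```

The header text before the first timestamp ("Time TIMBERWOLVES Score Lead Warriors") is dropped, and score fields such as `30-25 +5` stay attached to the event that follows the same timestamp.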

Model training

  • Base model: google/t5-gemma-small (60M params)
  • Uses LoRA finetuning, with 4-bit quantization of the base model weights to reduce memory use
  • Typical settings used:
    • Per-device batch size: 1 (with gradient accumulation)
    • Optimizer: adamw_torch_8bit
    • Mixed precision: bf16
    • Logging: Weights & Biases
    • GPU: Tesla P100
  • Development goal: Overfit small examples to validate training loop, then scale to full dataset and evaluate with ROUGE/BLEU.
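
To make the ROUGE evaluation mentioned above concrete, here is a simplified, dependency-free sketch of ROUGE-1 F1 (unigram overlap). It uses whitespace tokenization and lowercasing only, so its scores will differ from those of an established ROUGE package, which is what a real evaluation would more likely use. The function name `rouge1_f1` is hypothetical.

```python
from collections import Counter

def rouge1_f1(reference, candidate):
    """Simplified ROUGE-1 F1: clipped unigram overlap between two strings."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

reference = "The first quarter was tightly contested"
score_same = rouge1_f1(reference, reference)
score_half = rouge1_f1("a b c d", "a b x y")
print(score_same, score_half)
```

When overfitting a handful of examples to validate the training loop, a score near 1.0 on those same examples is the expected sign that the model can memorize its targets.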

About

A finetuned T5-Gemma small model that summarizes NBA quarters in a commentary style from play-by-play data.
