This project is a personal statistical analysis tool for financial assets.
I’m a Computer Engineering student with a strong interest in investing and finance, and I wanted a reliable and transparent way to measure assets statistically, beyond just looking at prices or past returns.
The goal is to:
- understand how different assets behave (risk, skewness, drawdowns, persistence)
- compare assets consistently
- use these statistics to make better portfolio and strategy decisions
This is not a trading bot.
It’s a diagnostic and analysis tool.
Given historical price data for an asset, it computes and visualizes:
- CAGR (annualized growth)
- Mean return (daily & annualized)
- Volatility (annualized)
- Downside volatility (annualized)
- Skewness of returns
- Autocorrelation (lag 1)
- Sharpe ratio
- Return percentiles (tails & median)
- Max and average drawdown
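As a rough illustration of how several of the metrics above can be computed, here is a minimal sketch using only the Python standard library. It is not the code in metrics.py: the function names, the 252-trading-day convention, the zero target return for downside volatility, and the population (biased) skewness estimator are all assumptions made for this example.

```python
import math
import statistics

TRADING_DAYS = 252  # assumed number of trading days per year


def daily_returns(prices):
    """Simple period-over-period returns from a list of closing prices."""
    return [p1 / p0 - 1.0 for p0, p1 in zip(prices, prices[1:])]


def cagr(prices, periods_per_year=TRADING_DAYS):
    """Compound annual growth rate over the whole price series."""
    years = (len(prices) - 1) / periods_per_year
    return (prices[-1] / prices[0]) ** (1.0 / years) - 1.0


def annualized_volatility(returns, periods_per_year=TRADING_DAYS):
    """Sample standard deviation of returns, scaled by sqrt(periods)."""
    return statistics.stdev(returns) * math.sqrt(periods_per_year)


def downside_volatility(returns, periods_per_year=TRADING_DAYS):
    """Root mean square of negative returns only (assumed target = 0)."""
    squared = [min(r, 0.0) ** 2 for r in returns]
    return math.sqrt(sum(squared) / len(returns) * periods_per_year)


def sharpe_ratio(returns, risk_free_annual=0.0, periods_per_year=TRADING_DAYS):
    """Annualized excess mean return divided by annualized volatility."""
    mean_annual = statistics.mean(returns) * periods_per_year
    vol = annualized_volatility(returns, periods_per_year)
    return (mean_annual - risk_free_annual) / vol


def skewness(returns):
    """Population skewness: third central moment divided by std cubed."""
    mu = statistics.mean(returns)
    sigma = statistics.pstdev(returns)
    return sum((r - mu) ** 3 for r in returns) / (len(returns) * sigma ** 3)


def autocorrelation_lag1(returns):
    """Lag-1 autocorrelation of the return series."""
    mu = statistics.mean(returns)
    num = sum((a - mu) * (b - mu) for a, b in zip(returns, returns[1:]))
    den = sum((r - mu) ** 2 for r in returns)
    return num / den


def drawdowns(prices):
    """Drawdown at each step: price relative to its running peak, minus 1."""
    peak = prices[0]
    out = []
    for p in prices:
        peak = max(peak, p)
        out.append(p / peak - 1.0)
    return out


# Toy price series, invented for the example.
prices = [100.0, 110.0, 99.0, 108.9, 120.0]
rets = daily_returns(prices)
dd = drawdowns(prices)
max_drawdown = min(dd)            # most negative drawdown
avg_drawdown = sum(dd) / len(dd)  # one possible definition of "average"
```

Drawdowns are expressed here as negative fractions, so the maximum drawdown is the minimum of the series. Other conventions (positive percentages, averaging only over periods spent below the peak) are equally common.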
It automatically generates and saves:
- Price chart
- Equity curve
- Drawdown plot
- Return distribution (with skew)
- Rolling downside volatility
All plots are saved as PNG files, grouped per asset and time window.
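The rolling downside volatility plot needs one value per window rather than a single number. The sketch below shows one way such a series could be built. The 63-day default window, the zero target return, and the function name are assumptions for illustration, not settings taken from visualization.py.

```python
import math


def rolling_downside_volatility(returns, window=63, periods_per_year=252):
    """Annualized downside volatility over each trailing window.

    Only negative returns contribute (assumed target return = 0).
    Returns one value per full window, so the result has
    len(returns) - window + 1 entries.
    """
    out = []
    for end in range(window, len(returns) + 1):
        chunk = returns[end - window:end]
        mean_sq = sum(min(r, 0.0) ** 2 for r in chunk) / window
        out.append(math.sqrt(mean_sq * periods_per_year))
    return out


# Toy returns, invented for the example; tiny window for readability.
series = rolling_downside_volatility(
    [0.01, -0.02, 0.03, -0.04], window=2, periods_per_year=1
)
```

The resulting list can then be passed to whichever plotting library is in use and saved as a PNG.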
stats-lab-for-assets/
├── data/ # CSV price datasets
├── images/ # Generated plots
├── notebooks/ # EDA notebooks
├── src/
│ ├── data_process.py # Download and preprocess data
│ ├── metrics.py # Statistical analysis (Asset class)
│ ├── visualization.py # Plots
│ └── main.py # Entry point
├── environment.yml # Conda environment
└── README.md # Documentation
You only need Docker and Docker Compose installed. Build the image with:
docker compose build
Historical price data is downloaded using Yahoo Finance.
To fetch and store data for an asset, run:
python3 src/data_process.py GLD
This will:
- download data
- clean it
- save a CSV file into the data/ folder
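The clean-and-save steps above might look roughly like the sketch below. It skips the download itself and works on rows already in memory. The file name pattern is inferred from the usage example in this README, while the `Date`/`Close` column names and all function names are hypothetical and may not match data_process.py.

```python
import csv
import math
from pathlib import Path


def csv_path(ticker, start, end, data_dir="data"):
    """Build a path like data/GLD_2000-01-01_to_2020-01-01.csv (assumed pattern)."""
    return Path(data_dir) / f"{ticker}_{start}_to_{end}.csv"


def clean_rows(rows):
    """Drop rows with a missing, non-numeric, or NaN close, then sort by date.

    `rows` is an iterable of (ISO date string, close) pairs.
    """
    cleaned = []
    for date, close in rows:
        try:
            value = float(close)
        except (TypeError, ValueError):
            continue  # skip gaps such as "" or None
        if math.isnan(value):
            continue
        cleaned.append((date, value))
    return sorted(cleaned)  # ISO dates sort correctly as strings


def save_csv(rows, path):
    """Write (date, close) rows to a CSV file, creating the folder if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Date", "Close"])  # hypothetical column names
        writer.writerows(rows)


# Toy rows, invented for the example.
raw = [("2020-01-03", "2.5"), ("2020-01-02", ""), ("2020-01-01", "1")]
rows = clean_rows(raw)
target = csv_path("GLD", "2000-01-01", "2020-01-01")
```

Calling `save_csv(rows, target)` would then write the cleaned rows to the data/ folder.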
Note: you can change the date range of the downloaded data in the get_data() function.
This project has two main parts:
- metrics.py → computes all statistics for an asset
- visualization.py → generates and saves plots as PNG files
Usage (requires the CSV file to be present in the data/ folder):
docker compose run stats-lab GLD GLD_2000-01-01_to_2020-01-01
Replace the second argument with the name of your own CSV file, which reflects the date range you downloaded.
Important to note:
- pass the CSV file name without the .csv extension
- plots are saved in the images/ folder
- a summary is printed in the terminal
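To make the two command-line arguments concrete, here is a hedged sketch of how an entry point could turn them into an input path and an output folder. The argument names, the `resolve_paths` helper, and the choice of one images/ subfolder per CSV name are assumptions for illustration. The actual main.py may be organized differently.

```python
import argparse
from pathlib import Path


def resolve_paths(argv, data_dir="data", images_dir="images"):
    """Parse `<ticker> <csv_name>` and derive the CSV path and plot folder."""
    parser = argparse.ArgumentParser(prog="stats-lab")
    parser.add_argument("ticker", help="asset symbol, e.g. GLD")
    parser.add_argument("csv_name", help="CSV file name without the .csv extension")
    args = parser.parse_args(argv)

    csv_file = Path(data_dir) / f"{args.csv_name}.csv"
    # Assumed layout: one subfolder per asset and time window.
    out_dir = Path(images_dir) / args.csv_name
    return args.ticker, csv_file, out_dir


ticker, csv_file, out_dir = resolve_paths(["GLD", "GLD_2000-01-01_to_2020-01-01"])
```

Because the extension is appended inside the helper, passing a name that already ends in .csv would produce a path ending in .csv.csv, which is why the extension is left off on the command line.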
With S&P500 data from 2000 to 2025: