This project is a complete ELT (Extract, Load, Transform) pipeline that fetches stock data from Yahoo Finance, stores it in an S3 bucket, and then loads it into an Redshift Serverless table. Finally, AWS Quicksight is used to create BI Dashboards on this data. The pipeline uses AWS Lambda, AWS Glue, and AWS Redshift Serverless.
- An AWS Lambda function calls the yfinance API to load data for the past 30 days of specific stocks.
- The data is combined into one CSV file and uploaded to an S3 bucket.
- File:
src/lambda-functions/load_data_s3.py
- The Glue job loads the stock data from the S3 bucket into the Amazon Redshift table.
- File:
src/glue/s3_to_redshift_job.py
-
The raw stock data is stored in a Redshift table.
-
Several views are defined on top of this base table for specific use-cases.
-
Intraday Percentage Difference View: Calculates the average intraday percentage change for each stock, measuring how much a stock moves within a single trading session.
-
Daily Volatility View: Calculates the daily price range percentage for each stock, measuring intraday volatility by computing how much the price fluctuates within a single day.
-
Daily Gainers and Losers View: Identifies whether a stock is a gainer, loser, or neutral for the day by comparing today's closing price with the previous day's closing price
-
Table/View DDLs:
-
src/redshift-sql/tables/stock_data_raw.sql
-
src/redshift-sql/views/stock_intraday_diff.sql
-
src/redshift-sql/views/stock_daily_volatility.sql
-
src/redshift-sql/views/stock_daily_gainers_losers.sql
- The data is visualized in a dashboard on AWS QuickSight.
- Deploy the Lambda function to fetch and upload stock data to S3.
- Configure the Glue job to load data from S3 to Redshift.
- Create the necessary Redshift tables and views.
- Visualize the data using AWS QuickSight.
- Trigger the Lambda function to fetch and upload stock data CSV to S3.
- Run the Glue job to load data from S3 to Redshift.
- Configure Redshift table as Data Source in AWS Quicksight via VPC.