This directory contains the Python FastAPI backend application for the GenEd project. It handles data processing, provides API endpoints, and interacts with the database.
- Python (3.8+ recommended; check requirements.txt)
- pip
- virtualenv (recommended)
- Navigate to the backend directory:

      cd path/to/GenEd-CMUQ/backend

- Create and activate a virtual environment:

      python -m venv venv
      source venv/bin/activate  # On Windows use `venv\Scripts\activate`

- Install dependencies:

      pip install -r requirements.txt
Below is the Entity Relationship Diagram (ERD) for the database:
- Database File: The application uses a SQLite database. By default, it expects the database file to be at backend/database/gened_db.sqlite. This path can be overridden by setting the DATABASE_URL environment variable.
- Models: Database table structures are defined using SQLAlchemy ORM in backend/database/models.py.
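The default-plus-override behaviour for the database location can be sketched as follows. This is a minimal illustration, not the project's actual connection code: the helper name `resolve_database_url` and the `sqlite:///` URL form are assumptions, and only the default path and the DATABASE_URL variable name come from this README.

```python
import os

# Default path named in this README; the sqlite:/// prefix is an assumed URL form.
DEFAULT_DB_PATH = "backend/database/gened_db.sqlite"


def resolve_database_url(env=None):
    """Return DATABASE_URL if it is set and non-empty, else the default SQLite URL.

    `env` is any mapping of environment variables; it defaults to os.environ.
    """
    env = os.environ if env is None else env
    return env.get("DATABASE_URL") or f"sqlite:///{DEFAULT_DB_PATH}"
```

Passing the environment as a parameter keeps the helper easy to test without modifying the real process environment.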
Data for courses, enrollment, and degree audits is extracted from source files (JSON and Excel) using specialized extractor classes located in backend/scripts/:
- course_extractor.py: Processes course information JSON files.
- enrollment_extractor.py: Processes enrollment data from Excel files.
- audit_extractor.py: Processes degree audit requirement JSON files.
These scripts parse the raw data and transform it into a structure suitable for database insertion.
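As a rough picture of what "parse and transform" means here, the sketch below flattens a JSON document into row dictionaries ready for insertion. It does not reproduce the real extractor classes: the function name `extract_course_rows` and the JSON keys (`courses`, `code`, `name`, `units`) are hypothetical and chosen only for illustration.

```python
import json
from typing import Dict, List


def extract_course_rows(raw_json: str) -> List[Dict[str, object]]:
    """Turn a (hypothetical) course JSON document into flat row dicts."""
    data = json.loads(raw_json)
    rows = []
    for course in data.get("courses", []):
        rows.append(
            {
                # Normalise whitespace so codes compare reliably as keys.
                "course_code": course["code"].strip(),
                "name": course.get("name", "").strip(),
                # Source files may store numbers as strings; coerce to float.
                "units": float(course.get("units", 0)),
            }
        )
    return rows
```

Keeping extraction separate from insertion, as the scripts above do, means the parsing step can be checked against sample files without touching the database.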
- The logic for extracting data from degree audit JSON files (backend/scripts/audit_extractor.py) is based on the work done in the cs-cmuq/courses-data-analysis repository.
- Specifically, it adapts the logic found in the audits.ipynb notebook: https://github.com/cs-cmuq/courses-data-analysis/blob/master/audits.ipynb
Once extracted, the data needs to be loaded into the database. There are two primary methods for this:
- Maintainers can access a dedicated data upload interface via a specific URL path. This path is configured in the frontend environment using the REACT_APP_UPLOAD_PATH variable.
- On this page, maintainers can upload the source data files (Course ZIPs, Audit ZIPs, Enrollment Excel, Department CSV).
- The frontend then sends these files to the backend API (specifically the /upload/init-db/ endpoint), which triggers the appropriate extractor scripts (backend/scripts/*_extractor.py) and saves the resulting data to the database.
- Alternatively, data can be loaded directly by executing the database modules from the command line. This is typically done after resetting the database.
- Steps:
  - Ensure your virtual environment is activated (source venv/bin/activate).
  - Navigate to the project's root directory (GenEd-CMUQ).
  - Optionally, reset the database (clears all existing data):

        python -m backend.database.reset_db

  - Load the data by running the load_data module. This script will likely look for data files in predefined locations (e.g., within the data/ directory):

        python -m backend.database.load_data

  - This executes the main logic within backend/database/load_data.py, which uses the extractor classes to parse data and populate the database tables.
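The reset-then-load sequence can be illustrated with a small self-contained sketch. It uses the standard-library `sqlite3` module and an in-memory database instead of the project's SQLAlchemy models, and the `course` table with its two columns is hypothetical; the real reset_db and load_data modules may be structured quite differently.

```python
import sqlite3


def reset_db(conn: sqlite3.Connection) -> None:
    """Drop and recreate the (hypothetical) course table, clearing all data."""
    conn.execute("DROP TABLE IF EXISTS course")
    conn.execute("CREATE TABLE course (course_code TEXT PRIMARY KEY, name TEXT)")
    conn.commit()


def load_data(conn: sqlite3.Connection, rows) -> int:
    """Insert extracted rows and return the resulting row count."""
    conn.executemany(
        "INSERT INTO course (course_code, name) VALUES (:course_code, :name)",
        rows,
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM course").fetchone()[0]
```

Running the reset before the load makes the operation repeatable: loading the same files twice into a freshly reset database gives the same result instead of primary-key conflicts.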
Start the FastAPI development server using Uvicorn:

    uvicorn backend.app.main:app --reload --host 0.0.0.0 --port 8000

- --reload: Enables auto-reloading when code changes.
- --host 0.0.0.0: Makes the server accessible on your local network.
- --port 8000: Specifies the port number.

The API will be accessible at http://127.0.0.1:8000 (or http://<your-local-ip>:8000).
Once the server is running, interactive API documentation is available:
- Swagger UI: http://127.0.0.1:8000/api/docs
- ReDoc: http://127.0.0.1:8000/api/redoc
Backend tests are located within the backend/tests/ directory and are run using pytest from the project root directory (CountsFor/).

    # Navigate to the project root directory (e.g., CountsFor/)
    # Ensure your backend virtual environment is activated if needed
    # Run all backend tests
    python -m pytest backend/tests

For more detailed information on the test structure and how to run specific tests (e.g., targeting specific files or functions), see the tests/README.md file within this backend directory.
The backend follows a layered architecture to ensure clean separation of concerns:

    backend/
    │
    ├── app/                 # Contains the FastAPI application setup, routers, and schemas.
    │   ├── routers/         # API route definitions (endpoints).
    │   ├── schemas.py       # Pydantic models for data validation and serialization.
    │   └── main.py          # FastAPI app entry point and middleware configuration.
    │
    ├── database/            # Handles database connection, models, and initialization.
    │   ├── models.py        # SQLAlchemy ORM models defining database tables.
    │   └── db.py            # Database connection setup and session management.
    │
    ├── repository/          # Data Access Layer: Interacts directly with the database.
    │   └── ...              # Example: courses.py for course-related queries.
    │
    ├── services/            # Business Logic Layer: Processes data between API and repository.
    │   └── ...              # Example: courses.py for course-related business logic.
    │
    ├── scripts/             # Utility scripts for data extraction, population, etc.
    │   └── ...
    │
    ├── tests/               # Contains automated tests for the backend.
    │   ├── database/
    │   ├── routers/
    │   ├── services/
    │   └── README.md        # Detailed guide for running backend tests.
    │
    ├── docs/                # Documentation files, including images.
    │   └── images/          # Contains images like the ERD.
    │
    ├── requirements.txt     # Python package dependencies.
    └── README.md            # This file.

The backend is designed with three layers:
- API Layer (routers/)
  - Exposes REST API endpoints using FastAPI.
  - Calls the service layer for business logic.
  - Ensures validation using Pydantic schemas.
- Service Layer (services/)
  - Implements business logic (e.g., structuring responses, processing data).
  - Calls the repository layer for data access.
  - Ensures consistency and formatting before returning responses.
- Repository Layer (repository/)
  - Directly interacts with the database using SQLAlchemy.
  - Contains raw queries and fetches data without processing it.
  - Called by the service layer to retrieve structured data.
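
The call direction between the three layers can be sketched in plain Python. This is an illustration under stated assumptions, not code from the repository: all function names are hypothetical, an in-memory list stands in for the SQLAlchemy-backed database, and a plain function stands in for a FastAPI route.

```python
from typing import Dict, List, Optional

# Stand-in for database rows; the real repository layer queries SQLAlchemy models.
_FAKE_TABLE: List[Dict[str, str]] = [
    {"course_code": "15-112", "name": "fundamentals of programming"},
]


def repo_get_course(course_code: str) -> Optional[Dict[str, str]]:
    """Repository layer: fetch the raw row without processing it."""
    for row in _FAKE_TABLE:
        if row["course_code"] == course_code:
            return row
    return None


def service_get_course(course_code: str) -> Optional[Dict[str, str]]:
    """Service layer: call the repository, then format the result."""
    row = repo_get_course(course_code)
    if row is None:
        return None
    return {"course_code": row["course_code"], "name": row["name"].title()}


def route_get_course(course_code: str) -> Dict[str, object]:
    """API layer: in the real app this would be a FastAPI route handler."""
    course = service_get_course(course_code)
    if course is None:
        return {"status": 404, "detail": "Course not found"}
    return {"status": 200, "data": course}
```

Each function only calls the layer directly beneath it, so the storage backend can be swapped or mocked in tests without changing the route or service code.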
