FECDataConnect is a project aimed at extracting data from the Federal Election Commission (FEC) and integrating it into a MariaDB database. This ETL pipeline ensures that the data remains fresh and accessible for further analysis.

AaronNHorvitz/FECDataConnect

FECDataConnect

An ETL pipeline for extracting data from the Federal Election Commission (FEC) and integrating it into a MariaDB database.

Overview

FECDataConnect fetches data from the FEC, transforms it, and loads it into a structured database so that election-related data is easy to access and analyze.

Features

  • Data Extraction: Automated scraping of FEC data.
  • Transformation: Pre-processing and cleaning of raw FEC data to ensure database readiness.
  • Loading: Streamlined insertion of the transformed data into a MariaDB database.
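As a rough illustration of the transformation step, the sketch below parses pipe-delimited text (the delimiter FEC uses in its bulk data files) into dictionaries with typed values. The column layout, the `MMDDYYYY` date assumption, and the `clean_row` and `transform` names are hypothetical and are not taken from this repository's code.

```python
import csv
import io
from datetime import datetime

# Hypothetical column layout, for illustration only. Real FEC bulk files
# have many more fields and publish their headers separately.
COLUMNS = ["committee_id", "contributor_name", "date", "amount"]


def clean_row(raw):
    """Strip whitespace, map empty strings to None, and parse date and amount."""
    row = {key: (value.strip() or None) for key, value in zip(COLUMNS, raw)}
    if row["date"]:
        # Assumption: dates arrive as MMDDYYYY strings.
        row["date"] = datetime.strptime(row["date"], "%m%d%Y").date()
    if row["amount"]:
        row["amount"] = float(row["amount"])
    return row


def transform(text):
    """Parse pipe-delimited text into a list of cleaned row dictionaries."""
    reader = csv.reader(io.StringIO(text), delimiter="|", quoting=csv.QUOTE_NONE)
    return [clean_row(fields) for fields in reader if fields]
```

Mapping empty fields to `None` lets them be stored as `NULL` in the database rather than as empty strings.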

Installation & Setup

  1. Clone the Repository:
    git clone https://github.com/AaronNHorvitz/FECDataConnect.git
  2. Install Required Libraries:
    pip install -r requirements.txt
  3. Database Configuration: [Instructions for setting up your MariaDB database, configuring user privileges, etc.]
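The database configuration step is still a placeholder in this README. One common way to keep credentials out of source control is to read them from environment variables, as in the minimal sketch below. The `FEC_DB_*` variable names, the default database name, and the `load_db_config` function are assumptions made for illustration, not settings this project defines.

```python
import os


def load_db_config(env=None):
    """Build connection settings from environment variables (hypothetical names)."""
    env = os.environ if env is None else env
    return {
        "host": env.get("FEC_DB_HOST", "localhost"),
        "port": int(env.get("FEC_DB_PORT", "3306")),  # 3306 is MariaDB's default port
        "user": env["FEC_DB_USER"],          # required: raises KeyError if unset
        "password": env["FEC_DB_PASSWORD"],  # required: raises KeyError if unset
        "database": env.get("FEC_DB_NAME", "fec"),
    }
```

A dictionary like this could then be passed as keyword arguments to the connect function of whichever MariaDB driver the project uses.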

Usage

Run the main script to start the ETL process:

python main.py
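This README does not show what `main.py` contains. A main script for an ETL pipeline typically chains the three stages in order, which might look roughly like the sketch below. The `run_pipeline` function and the stand-in stages are hypothetical and only illustrate the control flow.

```python
import logging

logger = logging.getLogger("fec_etl")


def run_pipeline(extract, transform, load):
    """Run the three ETL stages in order and return the number of rows loaded."""
    raw = extract()
    logger.info("extracted %d raw records", len(raw))
    rows = transform(raw)
    logger.info("transformed %d rows", len(rows))
    loaded = load(rows)
    logger.info("loaded %d rows", loaded)
    return loaded


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    # Stand-in stages so the sketch runs without network or database access.
    sink = []
    count = run_pipeline(
        extract=lambda: [" a ", "", " b "],
        transform=lambda raw: [r.strip() for r in raw if r.strip()],
        load=lambda rows: sink.extend(rows) or len(rows),
    )
    print(count, sink)
```

Passing the stages in as callables keeps each one testable on its own and lets the real extract and load steps be swapped for stubs.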

Contributing

Contributions are welcome!

  1. Fork the repository.
  2. Create your feature branch (`git checkout -b feature/AmazingFeature`).
  3. Commit your changes (`git commit -m 'Add some AmazingFeature'`).
  4. Push the branch (`git push origin feature/AmazingFeature`).
  5. Open a pull request.

For major changes, please open an issue first to discuss what you'd like to change.

License

This project is licensed under the MIT License. For more details, see the LICENSE file in the repository.

Contact

Project File Structure

FECDataConnect/
│
├── data/
│   ├── raw/                 # For storing raw scraped data
│   ├── processed/           # For data that's been cleaned/transformed
│   └── archive/             # For archival purposes (optional)
│
├── src/
│   ├── etl/
│   │   ├── extract.py       # Code to extract data
│   │   ├── transform.py     # Code to transform data
│   │   └── load.py          # Code to load data into MariaDB
│   │
│   ├── utils/               # Helper scripts, utilities, etc.
│   └── config.py            # Configuration variables/settings
│
├── logs/                    # Directory for logs (if you're logging events/errors)
│
├── tests/                   # For unit tests
│
├── .gitignore               # Specifies intentionally untracked files to ignore
├── LICENSE
├── README.md
└── requirements.txt         # Lists all project dependencies
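For the load step that `load.py` is described as handling, a batched insert inside a single transaction is a common pattern. The sketch below uses Python's built-in `sqlite3` module as a stand-in so that it runs without a MariaDB server. The `contributions` table, its columns, and the `load_rows` function are hypothetical, and both the placeholder syntax and the transaction handling can differ between database drivers.

```python
import sqlite3


def load_rows(conn, rows):
    """Insert rows in one batch inside a transaction; return the number inserted."""
    # With sqlite3, using the connection as a context manager commits on
    # success and rolls back if an exception is raised.
    with conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS contributions ("
            "committee_id TEXT, contributor_name TEXT, amount REAL)"
        )
        cur = conn.executemany(
            "INSERT INTO contributions VALUES (?, ?, ?)", rows
        )
    return cur.rowcount


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # in-memory stand-in for MariaDB
    print(load_rows(conn, [("C001", "SMITH, JOHN", 250.0)]))
```

Parameterized statements, rather than string formatting, keep values from the source files from being interpreted as SQL.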
