API Tools

A CLI tool that scrapes UTD data, parses it, and uploads it to the Nebula API database for community use.

Part of Project Nebula.

Design

- The `grade-data` directory contains .csv files of UTD grade data.

Files are named by year and semester, with a suffix of S, U, or F denoting Spring, Summer, and Fall semesters, respectively.
For example, `22F.csv` corresponds to the Fall 2022 semester, and `18U.csv` corresponds to the Summer 2018 semester.
This grade data is collected independently from the scrapers, and is used during the parsing process.
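
As an illustration of the naming convention above, here is a minimal Go sketch that decodes a grade file name into a year and semester. The `Semester` type and `ParseGradeFileName` function are hypothetical names invented for this example (they are not taken from the repository), and the sketch assumes that two-digit years always refer to the 2000s.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// Semester is a hypothetical type for this sketch; it is not part of the repository.
type Semester struct {
	Year int    // four-digit year, e.g. 2022
	Term string // "Spring", "Summer", or "Fall"
}

// termNames maps the one-letter suffix to the semester it denotes.
var termNames = map[byte]string{'S': "Spring", 'U': "Summer", 'F': "Fall"}

// ParseGradeFileName decodes a name such as "22F.csv" or "grade-data/18U.csv".
// Assumption: a two-digit year YY always means 20YY.
func ParseGradeFileName(name string) (Semester, error) {
	base := filepath.Base(name)
	if !strings.HasSuffix(base, ".csv") {
		return Semester{}, fmt.Errorf("%q is not a .csv file", name)
	}
	stem := strings.TrimSuffix(base, ".csv")
	if len(stem) != 3 {
		return Semester{}, fmt.Errorf("%q does not match the YY[S|U|F].csv pattern", name)
	}
	// The first two characters must both be decimal digits.
	for i := 0; i < 2; i++ {
		if stem[i] < '0' || stem[i] > '9' {
			return Semester{}, fmt.Errorf("%q has a non-numeric year", name)
		}
	}
	term, ok := termNames[stem[2]]
	if !ok {
		return Semester{}, fmt.Errorf("%q has an unknown semester suffix %q", name, stem[2:])
	}
	year := 2000 + int(stem[0]-'0')*10 + int(stem[1]-'0')
	return Semester{Year: year, Term: term}, nil
}

func main() {
	for _, name := range []string{"22F.csv", "18U.csv"} {
		sem, err := ParseGradeFileName(name)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %s %d\n", name, sem.Term, sem.Year)
	}
}
```

Returning an error for anything that does not match the pattern keeps stray files in the directory from being silently misread as a semester.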

- The `scrapers` directory contains the scrapers for various UTD data sources. This is where the data pipeline begins.

The scrapers are concerned solely with data collection, not necessarily validation or processing of said data. Those responsibilities are left to the parsing stage.

- The `parser` directory contains the files and methods that parse the scraped data. This is the 'middle man' of the data pipeline.

The parsing stage is responsible for 'making sense' of the scraped data; this consists of reading, validating, and merging the various data sources.
The input data is considered immutable by the parsing stage. This means the parsers should never modify the data being fed into them.
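
The immutability rule above can be sketched in Go as follows. This is only an illustration under stated assumptions: `ScrapedSection`, `ParsedSection`, and `ParseSections` are hypothetical names, the record shapes are invented, and the sample values are made up. The point is that the parser builds new values and copies slices rather than editing its inputs.

```go
package main

import (
	"fmt"
	"strings"
)

// ScrapedSection and ParsedSection are hypothetical record shapes for this sketch.
type ScrapedSection struct {
	Course      string
	Instructors []string
}

type ParsedSection struct {
	Course      string
	Instructors []string
	Grades      []int // grade distribution; empty when none is known
}

// ParseSections validates scraped sections and merges in grade data keyed by
// course. It allocates fresh slices for its output, so neither the scraped
// sections nor the grade map are modified.
func ParseSections(scraped []ScrapedSection, grades map[string][]int) ([]ParsedSection, error) {
	parsed := make([]ParsedSection, 0, len(scraped))
	for i, s := range scraped {
		course := strings.ToUpper(strings.TrimSpace(s.Course))
		if course == "" {
			return nil, fmt.Errorf("section %d has an empty course", i)
		}
		// Normalize into a new slice instead of editing s.Instructors in place.
		instructors := make([]string, len(s.Instructors))
		for j, name := range s.Instructors {
			instructors[j] = strings.TrimSpace(name)
		}
		parsed = append(parsed, ParsedSection{
			Course:      course,
			Instructors: instructors,
			// Copy the grade slice so the output does not alias the input map.
			Grades: append([]int(nil), grades[course]...),
		})
	}
	return parsed, nil
}

func main() {
	// Made-up sample data, for illustration only.
	scraped := []ScrapedSection{{Course: " cs1337 ", Instructors: []string{" Jane Doe "}}}
	grades := map[string][]int{"CS1337": {10, 5, 2}}

	parsed, err := ParseSections(scraped, grades)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("parsed:  %+v\n", parsed[0])
	fmt.Printf("scraped: %+v\n", scraped[0]) // still holds the original, untrimmed values
}
```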

- The `uploader` directory contains the uploader that sends the parsed data to the Nebula API MongoDB database. This is the final stage of the data pipeline.

The uploader(s) are concerned solely with pushing parsed data to the database. Data, at this point, is assumed to be valid and ready for use.
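
To show how the three stages described above fit together, here is a rough Go sketch of the pipeline as three function types run in order. The names `Scraper`, `Parser`, `Uploader`, `RunPipeline`, and `ErrNoRecords` are assumptions made for this example, and records are simplified to strings; the repository's actual types and entry points may look quite different.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stage signatures for this sketch only.
type (
	Scraper  func() ([]string, error)             // collects raw records, no validation
	Parser   func(raw []string) ([]string, error) // validates and merges, returns new data
	Uploader func(parsed []string) error          // pushes data it assumes is already valid
)

// ErrNoRecords is a hypothetical sentinel used by the example parser in main.
var ErrNoRecords = errors.New("no records to parse")

// RunPipeline executes the stages in order and stops at the first failure,
// so the uploader never sees data that failed parsing.
func RunPipeline(scrape Scraper, parse Parser, upload Uploader) error {
	raw, err := scrape()
	if err != nil {
		return fmt.Errorf("scrape: %w", err)
	}
	parsed, err := parse(raw)
	if err != nil {
		return fmt.Errorf("parse: %w", err)
	}
	if err := upload(parsed); err != nil {
		return fmt.Errorf("upload: %w", err)
	}
	return nil
}

func main() {
	// Stand-in stages; a real uploader would write to the database instead.
	scrape := func() ([]string, error) { return []string{"raw-a", "raw-b"}, nil }
	parse := func(raw []string) ([]string, error) {
		if len(raw) == 0 {
			return nil, ErrNoRecords
		}
		out := make([]string, len(raw))
		for i, r := range raw {
			out[i] = "parsed:" + r
		}
		return out, nil
	}
	upload := func(parsed []string) error {
		fmt.Println("uploading", len(parsed), "records")
		return nil
	}

	if err := RunPipeline(scrape, parse, upload); err != nil {
		fmt.Println("pipeline failed:", err)
	}
}
```

Keeping validation in the parser and stopping on its first error is what lets the uploader treat its input as ready for use.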

Contributing

Prerequisites

Golang 1.23 (or higher)

Development

Documentation for the project will be created soon, but for more information please visit our Discord.

To build the project, clone the repository and then either:

- Run `make` in the root (top-level) directory (for systems with `make` installed, e.g. most Linux distros and macOS)
- Run `build.bat` on Windows systems (unless you want to deal with getting `make` to work on Windows :P)

The build process will output an executable file named `api-tools`; this executable is the CLI and can be run in your terminal!

Deployment

[TBD]

Questions or Feedback

If you have any questions about this project, reach out to the Project Nebula maintainers at [email protected] or open an issue or discussion on this repository.

