Welcome to the official repository for the paper KARRIEREWEGE: A Large-Scale Career Path Prediction Dataset. This repository contains datasets, models, and code for career path prediction research.
KARRIEREWEGE is a large-scale dataset designed to support career path prediction tasks. It provides rich information on career trajectories, enabling research in career forecasting, job market analysis, and related fields.
The datasets and models are hosted on Hugging Face: Karrierewege Collection.
To run the code, ensure you have the necessary dependencies installed. You can set up the environment using:
pip install -r requirements.txt
To reproduce the results of the linear transformation approach for all datasets, run:
bash src/pipeline.sh
This will process the datasets and write the results to the output/ folder.
If you prefer to use the precomputed matrices from the linear transformation (stored in the output/ folder), run:
python src/test.py --test_config "test_config_of_choice.json"
Replace test_config_of_choice.json with the appropriate configuration file for the dataset you want to test.
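To test several datasets in one go, the command above can be wrapped in a small script. The following is a minimal sketch, not part of the repository: the function name build_test_command is hypothetical, and the only interface it relies on is the --test_config flag of src/test.py shown above. It checks that each configuration file exists and parses as JSON before launching the test, but it does not validate the file's keys, since the expected schema is not described here.

```python
import json
import subprocess
import sys
from pathlib import Path


def build_test_command(config_path):
    """Return the argument list for `python src/test.py --test_config <config_path>`.

    Hypothetical helper: it only verifies that the file exists and is valid JSON.
    """
    path = Path(config_path)
    if not path.is_file():
        raise FileNotFoundError(f"Config file not found: {path}")
    # Fail early on malformed JSON; no assumptions are made about the keys.
    with path.open(encoding="utf-8") as handle:
        json.load(handle)
    # sys.executable reuses the interpreter of the active environment.
    return [sys.executable, "src/test.py", "--test_config", str(path)]


if __name__ == "__main__":
    # Each command-line argument is treated as one test configuration file.
    for config in sys.argv[1:]:
        subprocess.run(build_test_command(config), check=True)
```

Saved under a name of your choice (for example run_tests.py, an assumed name) in the repository root, it can be called with one or more configuration files as arguments. Because check=True is set, the script stops at the first test run that fails.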
For questions or collaborations, feel free to reach out or open an issue in this repository.