Job offer analysis

Available README.md language versions: PL, EN

PL

Job offer analysis

Caution

Import your own conclusions!

Table of contents

Introduction
User Manual
Development plans
Additional information

Introduction

The project analyzes and visualizes data on job offers in the IT industry. In the current version:

Only offers from the JustJoinIT portal are supported
Data can be retrieved either from JSON files (attached or downloaded from Kaggle) or from a prepared list of links to job offers

Important

The JSON files cover the period from 2021-10 to 2023-09
The data is available in a Kaggle repository
Place the files in data/raw/downloaded_offers

User Manual

Download the repository:
git clone https://github.com/TwojeKonto/analyzing_job_offers.git
Go to the project directory:
cd analyzing_job_offers
Install the required libraries:
pip install -r requirements.txt
Select a data source:
- Download the dataset from Kaggle and place it in the data/raw/downloaded_offers folder. Then run the etl.py script, which will load and process the JSON files into the database
- Create a links.txt file in the data directory and paste in links to offers from JustJoinIT. Then run main.py in the app directory, which will download and process the offers into the database
Launch Jupyter Notebook:
In a terminal, go to the notebooks directory and start Jupyter Notebook with the jupyter notebook command. You can also run the jupyter_notebook_launcher.bat file manually. Your default browser will open with the Jupyter interface. There you will find, among other things, the job_analysis.ipynb file, which you can open to review and visualize the data. Remember to run the notebook from the first cell to the last (all cells)

Tip

If your file with links contains more than just job offers, use the sort_raw_offers_file() function of the Utilities class in app/utilities.py to keep only the job offers

Tip

A Dockerfile is also provided for running the application in a Docker container. Build the image with the command docker build -t analyzing_job_offers .

Development plans

Support for multiple job sites and data sources (Pracuj.pl, LinkedIn, No Fluff Jobs)
Schema design, optimization, and support for other databases (PostgreSQL, MongoDB, Apache Cassandra)
An AI agent that will:
- talk with the user to draw conclusions from the analyzed job offers
- make it easier to filter key findings from the offers
New visualization features and interactive charts
A GUI for operating the application
Expanded trend analysis, including machine learning (predicting the hottest technologies)

Additional information

The project is being developed as a way to learn and acquire new skills. Along the way I had to learn about:

Python programming
web scraping
ETL (Extract, Transform, Load)
databases (SQLite)
data analysis and visualization
testing
containerization

Tested on Windows 11

EN

Job analysis

Caution

Import your own conclusions!

Introduction

The project analyzes and visualizes data on job offers in the IT industry. In the current version:

Only listings from JustJoinIT are supported
Data can be retrieved either from JSON files (attached or downloaded from Kaggle) or from a prepared list of links to job listings

Important

The JSON files cover the period from 2021-10 to 2023-09.
The data is available in a Kaggle repository.
Place the files in data/raw/downloaded_offers.
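
Before running anything else, it can help to confirm that the JSON files landed in the right place and parse cleanly. The snippet below is a minimal sketch, not part of the project: the function name summarize_json_files is hypothetical, and it assumes each file holds a JSON array of offers (any other top-level value is counted as a single record).

```python
import json
from pathlib import Path


def summarize_json_files(folder):
    """Return {file name: number of top-level records} for each .json file in folder."""
    summary = {}
    for path in sorted(Path(folder).glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            data = json.load(fh)
        # Assumption: a file is a JSON array of offers; anything else counts as one record.
        summary[path.name] = len(data) if isinstance(data, list) else 1
    return summary


if __name__ == "__main__":
    # The folder name comes from the note above; run this from the project root.
    for name, count in summarize_json_files("data/raw/downloaded_offers").items():
        print(f"{name}: {count} offers")
```

An empty result means no .json files were found in the folder, which usually points to a wrong location or a dataset that is still archived.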

User Manual

Download the repository:
git clone https://github.com/TwojeKonto/analyzing_job_offers.git
Go to the project directory:
cd analyzing_job_offers
Install the required libraries:
pip install -r requirements.txt
Select a data source:
- Download the dataset from Kaggle and place it in the data/raw/downloaded_offers folder. Then run the etl.py script, which will load and process the JSON files into the database
- Create a links.txt file in the data folder and paste in links to listings from JustJoinIT. Then run main.py in the app directory, which will download and process the listings into the database
Launch Jupyter Notebook:
In a terminal, navigate to the notebooks directory and start Jupyter Notebook with the jupyter notebook command. You can also run the jupyter_notebook_launcher.bat file manually. Your default browser will open with the Jupyter interface. There you will find, among other things, the job_analysis.ipynb file, which you can open to review and visualize the data. Keep in mind that you need to run the notebook from the first cell to the last (all cells)
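
The Kaggle route in the data-source step ends with JSON files being turned into database rows. The project's own etl.py is not reproduced here. The following is only a minimal sketch of that kind of load step, assuming each file is a JSON array of offer objects with id and title keys and that the target is SQLite (the database named under Additional information). The function name load_offers, the offers table, and its columns are hypothetical.

```python
import json
import sqlite3
from pathlib import Path


def load_offers(json_dir, db_path):
    """Load every .json file in json_dir into an 'offers' table; return rows inserted."""
    conn = sqlite3.connect(db_path)
    try:
        # Hypothetical schema: offer id, title, and the full offer as raw JSON text.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS offers ("
            "id TEXT PRIMARY KEY, title TEXT, raw TEXT)"
        )
        before = conn.total_changes
        for path in sorted(Path(json_dir).glob("*.json")):
            with path.open(encoding="utf-8") as fh:
                offers = json.load(fh)
            rows = [
                (str(o["id"]), o.get("title"), json.dumps(o, ensure_ascii=False))
                for o in offers
                if isinstance(o, dict) and "id" in o  # skip records without an id
            ]
            # INSERT OR IGNORE keeps the first copy of an offer seen in several files.
            conn.executemany("INSERT OR IGNORE INTO offers VALUES (?, ?, ?)", rows)
        conn.commit()
        return conn.total_changes - before
    finally:
        conn.close()
```

A call such as load_offers("data/raw/downloaded_offers", "offers.db") would fill the table, where offers.db is a placeholder path. Running it a second time inserts nothing, because existing ids are ignored.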

Tip

If your file with links contains more than just job listings, use the sort_raw_offers_file() function of the Utilities class in app/utilities.py to keep only the job listings
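
The signature and behavior of sort_raw_offers_file() are not documented here, so the snippet below is not that function. It is a minimal standalone sketch of the same idea: keep only lines that look like JustJoinIT offer links. The host justjoin.it and the path prefixes /offers/ and /job-offer/ are assumptions about how offer URLs look, and the name filter_offer_links is hypothetical.

```python
from urllib.parse import urlparse

# Assumption: offer pages live under one of these path prefixes on justjoin.it.
OFFER_PATH_PREFIXES = ("/offers/", "/job-offer/")


def filter_offer_links(lines):
    """Return unique offer URLs from lines, preserving their original order."""
    seen = set()
    result = []
    for line in lines:
        url = line.strip()
        if not url:
            continue  # skip blank lines
        parsed = urlparse(url)
        host = parsed.netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        is_offer = host == "justjoin.it" and parsed.path.startswith(OFFER_PATH_PREFIXES)
        if is_offer and url not in seen:
            seen.add(url)
            result.append(url)
    return result


if __name__ == "__main__":
    # data/links.txt is the file described in the data-source step.
    with open("data/links.txt", encoding="utf-8") as fh:
        for link in filter_offer_links(fh):
            print(link)
```

If offer URLs follow a different pattern, adjust OFFER_PATH_PREFIXES rather than the loop.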

Tip

A Dockerfile is also provided for running the application in a Docker container. Build the image with the command docker build -t analyzing_job_offers .

Development plans

Support for multiple job sites and data sources (Pracuj.pl, LinkedIn, No Fluff Jobs)
Schema design, optimization, and support for other databases (PostgreSQL, MongoDB, Apache Cassandra)
An AI agent that will:
- talk with users to draw conclusions from the analyzed job offers
- make it easier to filter key findings from the offers
New visualization features and interactive charts
A GUI for operating the application
Expanded trend analysis, including machine learning (predicting the hottest technologies)

Additional information

The project is being developed as a way to learn and acquire new skills. Along the way I had to learn about:

Python programming
Web Scraping
ETL (Extract, Transform, Load)
databases (SQLite)
data analysis and visualization
testing
containerization

Tested on Windows 11


czubi1928/analyzing_job_offers
