Data Science Homework 1: Web Scraping
Instructions
Dear all,
Please follow the instructions below to complete Homework 1:
1. Git Branch and Folder Structure
- Each student must create a new branch named:
  [UPID]_hw1_2025_1
  Example:
  123456_hw1_2025_1
- Using this branch, create a folder named exactly as your branch in the folder hw1:
  123456_hw1_2025_1/
- Inside your folder, include the following:
  - requirements.txt
  - Your scraping code (a Jupyter notebook). The file name should follow the format 123456_hw1_2025_1.ipynb
  - Your resulting CSV file
- Save everything under the main homework1/ directory in the repo.
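As a quick self-check before pushing, the naming rules above can be sketched in a few lines of Python. This is only an illustration: `submission_paths`, `missing_items`, and the `homework_dir` argument are hypothetical names, and `homework_dir` should point at whichever homework directory your clone of the repo actually contains.

```python
from pathlib import Path


def submission_paths(upid: str, homework_dir: Path) -> dict[str, Path]:
    """Build the expected folder and file paths for one student's submission."""
    name = f"{upid}_hw1_2025_1"  # the branch and the folder share this name
    folder = homework_dir / name
    return {
        "folder": folder,
        "requirements": folder / "requirements.txt",
        "notebook": folder / f"{name}.ipynb",
    }


def missing_items(paths: dict[str, Path]) -> list[str]:
    """Return the labels of expected paths that do not exist yet."""
    return [label for label, path in paths.items() if not path.exists()]
```

Calling `missing_items(submission_paths("123456", Path("path/to/homework_dir")))` returns an empty list once the folder, `requirements.txt`, and the notebook are in place. The CSV file is not checked because the instructions do not fix its name.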
2. Task Description
Scrape all Data Science job offers from the Bumeran platform that match the following filters (using code, not by hand!):
3. Suggested Scraping Strategy (Two Stages)
Stage 1: Extract Job Posting Links
- Scrape all the job listing URLs based on the filters above.
- Navigate across all pages if necessary.
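One possible shape for Stage 1 is sketched below, using only the Python standard library. Several parts are assumptions rather than facts about Bumeran: the `"/empleos/"` marker used to recognize job links, the `fetch_page` callable (which you would implement yourself to return the HTML of results page `n`), and the stopping rule of "no new links on a page". If the listing is rendered by JavaScript, `fetch_page` may need a browser automation tool instead of a plain HTTP request.

```python
from html.parser import HTMLParser
from typing import Callable
from urllib.parse import urljoin


class _LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def extract_job_links(html: str, base_url: str, marker: str = "/empleos/") -> list[str]:
    """Return absolute, de-duplicated URLs whose href contains `marker`.

    The default marker is an assumed URL pattern; adjust it after
    inspecting the real listing page.
    """
    parser = _LinkParser()
    parser.feed(html)
    links: list[str] = []
    for href in parser.hrefs:
        if marker in href:
            url = urljoin(base_url, href)
            if url not in links:
                links.append(url)
    return links


def collect_all_links(
    fetch_page: Callable[[int], str], base_url: str, max_pages: int = 100
) -> list[str]:
    """Walk results pages 1, 2, ... until a page adds no new job links."""
    seen: list[str] = []
    for page in range(1, max_pages + 1):
        found = extract_job_links(fetch_page(page), base_url)
        new = [url for url in found if url not in seen]
        if not new:
            break  # empty or repeated page: assume the results are exhausted
        seen.extend(new)
    return seen
```

Passing the fetcher in as a function keeps the pagination logic independent of how pages are downloaded, so it can be tried on saved HTML before running against the live site.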
Stage 2: Scrape Job Details
- For each job URL collected in Stage 1, extract the following:
- Job Title
- Description (up to the "Benefits" section)
- District
- Work Mode (e.g., on-site, remote, hybrid)
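The "up to the Benefits section" rule for the description can be handled with a small text helper once the raw description text has been extracted. The sketch below assumes the section starts on its own line with a heading word; `truncate_description` and its `marker` parameter are hypothetical, and on the real site the heading may be written differently (for example in Spanish), so pass whatever heading you actually observe.

```python
import re


def truncate_description(text: str, marker: str = "Benefits") -> str:
    """Return the text before the first line that starts with `marker`.

    Matching is case-insensitive and only triggers at the start of a line,
    so the word appearing mid-sentence does not cut the description.
    If the marker is not found, the full text is returned.
    """
    pattern = re.compile(
        rf"^[ \t]*{re.escape(marker)}\b", re.IGNORECASE | re.MULTILINE
    )
    match = pattern.search(text)
    kept = text[: match.start()] if match else text
    return kept.strip()
```

For example, `truncate_description("Role summary.\nBenefits:\nHealth plan")` keeps only `"Role summary."`.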
4. Output
- Your final output must be a CSV file with the following columns:
Job Title | Description | District | Work Mode
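A minimal sketch of writing that file with the standard `csv` module follows. The `write_offers` name and the `path` argument are hypothetical; the column names are taken from the list above. `csv.DictWriter` quotes fields that contain commas or line breaks, which matters for multi-line descriptions.

```python
import csv

COLUMNS = ["Job Title", "Description", "District", "Work Mode"]


def write_offers(rows: list[dict[str, str]], path: str) -> None:
    """Write one CSV row per job offer, with the required header row first."""
    # newline="" lets the csv module control line endings itself
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)
```

Each row is a dictionary keyed by the four column names. A key outside `COLUMNS` raises a `ValueError`, which helps catch misspelled column names early.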
5. Short Explanation Video
- Create a 3-minute video explaining your work.
- Your video should include:
  - A short explanation of your environment setup.
  - A walk-through of your code and any specific functions/classes you used.
  - A sample run showing the output.
- Upload your video link to the following Google Sheet.
Deadline: April 2, 11:59 p.m. NO EXTENSIONS!
Let us know if you have any questions. Good luck!