chester0104/ZillowWebscraper

ZillowWebscraper

A web scraper for Zillow that collects useful information from the site. For example, it can help find the best agent by rating, number of reviews, and data on the houses they sold, including the percent change from listing price to sold price.

Zillow Web Scraper

A professional PyQt5-based desktop application for extracting real estate data from Zillow with advanced anti-bot detection bypass capabilities.

Features

  • Town Data Scraper: Extract sold property data from entire towns and neighborhoods
  • Agent Data Scraper: Analyze real estate agent performance and sold property history
  • Modern GUI: Dark/Light mode toggle with intuitive navigation
  • Anti-Bot Bypass: Advanced techniques to bypass Zillow's bot detection
  • Data Export: Automatic CSV export with comprehensive property details

Installation

# Clone the repository
git clone https://github.com/chester0104/ZillowWebscraper.git
cd ZillowWebscraper

# Install required packages
pip install -r requirements.txt

Requirements

  • Python 3.7+
  • PyQt5
  • Selenium
  • undetected-chromedriver
  • pandas
  • BeautifulSoup4
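
Before launching the application, it can help to confirm that the packages listed above are importable. The following is a minimal sketch, not part of the repository: `missing_packages` and the `REQUIRED` mapping are hypothetical names, and the mapping assumes the usual import names for each package (`selenium`, `undetected_chromedriver`, `bs4`).

```python
import importlib.util
import sys

# Display name -> import name. The import names are the commonly used ones
# for these packages; the mapping itself is an assumption for illustration.
REQUIRED = {
    "PyQt5": "PyQt5",
    "Selenium": "selenium",
    "undetected-chromedriver": "undetected_chromedriver",
    "pandas": "pandas",
    "BeautifulSoup4": "bs4",
}


def missing_packages(required=REQUIRED):
    """Return the display names of packages that cannot be imported."""
    return [
        name
        for name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


def python_version_ok(minimum=(3, 7)):
    """Check the interpreter against the README's stated minimum version."""
    return sys.version_info >= minimum
```

If `missing_packages()` returns a non-empty list, re-running `pip install -r requirements.txt` is the first thing to try.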

Usage

GUI Application

python main.py

The application will launch with a welcome screen offering two scraping options:

  1. Town Scraper: Enter a Zillow town URL (e.g., https://www.zillow.com/los-angeles-ca/sold/)
  2. Agent Scraper: Enter a Zillow agent profile URL (e.g., https://www.zillow.com/profile/agent-name/)
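
To illustrate how the two URL types differ, here is a rough sketch that classifies a URL as a town or agent URL. It is based only on the two example URLs above; `classify_zillow_url` is a hypothetical helper, not a function from this repository, and real Zillow URLs may take other forms that this check would reject.

```python
from urllib.parse import urlparse

# Hosts accepted by this sketch (an assumption based on the examples above).
ZILLOW_HOSTS = ("zillow.com", "www.zillow.com")


def classify_zillow_url(url: str) -> str:
    """Return "town", "agent", or "invalid" for a candidate URL."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return "invalid"
    if parsed.netloc.lower() not in ZILLOW_HOSTS:
        return "invalid"

    # Split the path into non-empty segments, e.g. ["los-angeles-ca", "sold"].
    segments = [part for part in parsed.path.split("/") if part]

    # Agent profile example: /profile/agent-name/
    if len(segments) >= 2 and segments[0] == "profile":
        return "agent"
    # Town example: /los-angeles-ca/sold/
    if len(segments) >= 2 and segments[-1] == "sold":
        return "town"
    return "invalid"
```

A check like this could run before a browser session starts, so that a mistyped URL fails fast instead of producing an empty result.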

Command Line Usage

Town Scraper:

python town_scraper.py

Agent Scraper:

python agent_scraper.py

Data Extracted

Town Scraper

  • Property address
  • House type (Single Family, Condo, etc.)
  • Sold price
  • Listing price
  • Price change percentage
  • Property URL
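
The price change percentage can be read as the relative difference between the sold price and the listing price. The sketch below shows one plausible way to compute it; `price_change_percent` is a hypothetical helper, and the scraper's actual formula and rounding may differ.

```python
from typing import Optional


def price_change_percent(listing_price: float, sold_price: float) -> Optional[float]:
    """Percent change from listing price to sold price, rounded to 2 places.

    Returns None when the listing price is missing or not positive,
    because the change is undefined in that case.
    """
    if listing_price is None or sold_price is None or listing_price <= 0:
        return None
    change = (sold_price - listing_price) / listing_price * 100
    return round(change, 2)
```

A positive value means the property sold above its listing price, and a negative value means it sold below.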

Agent Scraper

  • Agent name, rating, and review count
  • Sold property addresses
  • Sale dates
  • Listing and closing prices
  • Price per square foot
  • Average metrics across all sales
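
As an illustration of the average metrics, the following sketch aggregates a list of sales into a few summary figures. The `Sale` record and `summarize_sales` function are hypothetical, and the specific metrics shown are assumptions; the application may compute a different set.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class Sale:
    """One sold property (hypothetical record layout)."""
    address: str
    listing_price: float
    closing_price: float
    square_feet: float


def summarize_sales(sales: List[Sale]) -> Dict[str, float]:
    """Average closing price, price per square foot, and price change."""
    # Skip records that would cause a division by zero.
    usable = [s for s in sales if s.listing_price > 0 and s.square_feet > 0]
    if not usable:
        return {}
    return {
        "avg_closing_price": mean(s.closing_price for s in usable),
        "avg_price_per_sqft": mean(s.closing_price / s.square_feet for s in usable),
        "avg_price_change_pct": mean(
            (s.closing_price - s.listing_price) / s.listing_price * 100
            for s in usable
        ),
    }
```

Averaging the per-sale percentages, rather than comparing total listing and total closing prices, keeps one expensive property from dominating the result.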

Advanced Features

  • Stealth Mode: Uses undetected-chromedriver and custom anti-detection measures
  • Human-like Behavior: Simulates natural scrolling and mouse movements
  • Persistent Sessions: Maintains browser profiles for better success rates
  • CAPTCHA Handling: Manual verification support with automatic detection
  • Partial Data Recovery: Saves progress if scraping is interrupted (Ctrl+C)
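
Partial data recovery can be sketched as a loop that catches `KeyboardInterrupt` and writes whatever rows were collected so far. This is an illustration only: `scrape_with_recovery` and the `scrape_one` callback are hypothetical names, and the sketch uses the standard library `csv` module rather than pandas to stay self-contained.

```python
import csv
from typing import Callable, Dict, Iterable, List


def scrape_with_recovery(
    urls: Iterable[str],
    scrape_one: Callable[[str], Dict[str, str]],
    out_path: str,
) -> List[Dict[str, str]]:
    """Scrape each URL and save collected rows even if interrupted."""
    rows: List[Dict[str, str]] = []
    try:
        for url in urls:
            rows.append(scrape_one(url))
    except KeyboardInterrupt:
        # Ctrl+C stops the loop; rows gathered so far are still saved below.
        print(f"Interrupted after {len(rows)} rows; saving partial data.")

    if rows:
        with open(out_path, "w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
    return rows
```

The sketch assumes every row has the same keys as the first one; rows with differing keys would need a fixed `fieldnames` list instead.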

Project Structure

ZillowWebscraper/
├── src/
│   ├── main.py                 # Main application entry point
│   ├── town_scraper.py         # Town scraping logic
│   ├── agent_scraper.py        # Agent scraping logic
│   ├── anti_bot_bypass.py      # Advanced anti-detection module
│   └── gui_interface.py        # PyQt5 GUI components
├── images/
│   ├── zillow_icon.png         # Application icon
│   └── zillow_logo.png         # Application logo
└── README.md

Important Notes

  • Rate Limiting: Use responsibly to avoid overwhelming Zillow's servers
  • Browser Session: The browser remains open after scraping for manual inspection
  • Manual Verification: Some scraping sessions may require manual CAPTCHA completion
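
One simple way to keep request rates modest is to pause for a randomized interval between page loads. The sketch below is an assumption about how this could be done, not the repository's implementation; `polite_delay` and its default bounds are hypothetical.

```python
import random
import time
from typing import Callable


def polite_delay(
    min_seconds: float = 2.0,
    max_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Sleep for a random duration in [min_seconds, max_seconds].

    The sleep function is injectable so the helper can be tested
    without actually waiting.
    """
    if min_seconds < 0 or max_seconds < min_seconds:
        raise ValueError("expected 0 <= min_seconds <= max_seconds")
    delay = random.uniform(min_seconds, max_seconds)
    sleep(delay)
    return delay
```

Calling `polite_delay()` between page requests spaces them out; longer bounds are the safer choice when scraping many pages in one session.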

Troubleshooting

  • Chrome Driver Issues: The application auto-detects your Chrome version
  • Verification Required: If prompted, complete the CAPTCHA in the browser window
  • No Data Found: Verify the URL format matches Zillow's structure

Users are responsible for ensuring their use complies with applicable laws and website terms of service.
