This repository contains a web scraper that extracts the average prices of domestic and imported fruits from the Jiaxing Fruit Market website, then processes the pricing data and saves it to CSV.
Features:
- Scrapes both domestic and imported fruit data
- Automatic pagination handling
- Built-in request retry mechanism
- Data saved in CSV format
- Supports custom date range scraping
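The retry mechanism listed above is not described in detail in this README. As a rough illustration only, a retry wrapper with exponential backoff could look like the sketch below. `fetch_with_retry` and its parameters are hypothetical names chosen for this example, not the scraper's actual API.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def fetch_with_retry(
    fetch: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fetch` until it succeeds or `max_attempts` is used up.

    Hypothetical helper: the real scraper may retry differently.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Wait 1x, 2x, 4x ... the base delay before trying again
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting; in normal use the default `time.sleep` applies.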
- Set up the environment

      # Clone the repository
      git clone https://github.com/your_username/jiaxing_market_fruit_info_scraper.git
      cd jiaxing_market_fruit_info_scraper

      # Install dependencies
      pip install -r requirements.txt

- Run the scraper

      # Default: scrapes data from 2018-01-01 to 2018-01-05
      python main.py
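To illustrate what a date range means in practice, the sketch below expands a start and end date into the individual days a scraper would need to visit. `iter_dates` is a hypothetical helper, and treating the end date as inclusive is an assumption; how main.py actually accepts a custom range is not documented here.

```python
from datetime import date, timedelta
from typing import Iterator


def iter_dates(start: date, end: date) -> Iterator[date]:
    """Yield every day from `start` to `end`, both inclusive (assumed)."""
    if end < start:
        raise ValueError("end date must not be earlier than start date")
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


# Expand the default range mentioned above into ISO-formatted days
days = [d.isoformat() for d in iter_dates(date(2018, 1, 1), date(2018, 1, 5))]
print(days)
```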
The scraper generates two CSV files:
- domestic_fruit_info.csv: contains domestic fruit data
- imported_fruit_info.csv: contains imported fruit data
Both files have the same columns (if available):
- id
- stageId
- structId
- price
- totalSalesVolume
- totalTurnover
- category
- kind
- placeOfOrigin
- city
- specification
- date
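As one possible way to use these columns, the sketch below computes the mean `price` per `category` with the standard-library `csv` module. It assumes the file has a header row with the column names listed above and that `price` holds plain numbers; `average_price_by_category` is a hypothetical helper, not part of the scraper.

```python
import csv
from collections import defaultdict
from typing import Iterable


def average_price_by_category(lines: Iterable[str]) -> dict[str, float]:
    """Return the mean of the `price` column for each `category` value.

    `lines` can be an open text file or any iterable of CSV lines.
    Rows with a missing or non-numeric price are skipped (assumption).
    """
    totals: dict[str, float] = defaultdict(float)
    counts: dict[str, int] = defaultdict(int)
    for row in csv.DictReader(lines):
        try:
            price = float(row.get("price") or "")
        except ValueError:
            continue  # empty or malformed price
        category = row.get("category") or "unknown"
        totals[category] += price
        counts[category] += 1
    return {name: totals[name] / counts[name] for name in totals}


# Example usage against a generated file (encoding is an assumption):
# with open("domestic_fruit_info.csv", newline="", encoding="utf-8-sig") as handle:
#     print(average_price_by_category(handle))
```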
If any errors occur during execution, check the log.log file for detailed error messages.
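When the log grows long, filtering it for error entries can help. The sketch below assumes the log uses loguru's default line layout (`time | LEVEL | location - message`); the project may configure a different format, in which case the split logic would need adjusting. `error_lines` is a hypothetical helper.

```python
from typing import Iterable


def error_lines(
    lines: Iterable[str],
    levels: tuple[str, ...] = ("ERROR", "CRITICAL"),
) -> list[str]:
    """Keep only log lines whose level field matches one of `levels`.

    Assumes the layout "time | LEVEL | location - message".
    Continuation lines such as tracebacks are not matched and are dropped.
    """
    matches = []
    for line in lines:
        parts = line.split("|", 2)
        if len(parts) == 3 and parts[1].strip() in levels:
            matches.append(line.rstrip("\n"))
    return matches


# Example usage:
# with open("log.log", encoding="utf-8") as handle:
#     for entry in error_lines(handle):
#         print(entry)
```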
Requirements:
- Python 3.10
- pandas==2.2.1
- DataRecorder==3.4.12
- DrissionPage==4.1.0.12
- loguru==0.7.3
- tqdm==4.66.2
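If the scraper fails on import, a quick check of the installed versions against the pins above may help. The sketch below uses the standard-library `importlib.metadata`; `find_mismatches` is a hypothetical helper, and the `PINNED` mapping simply restates the list above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

# Pinned versions copied from the requirements list above
PINNED = {
    "pandas": "2.2.1",
    "DataRecorder": "3.4.12",
    "DrissionPage": "4.1.0.12",
    "loguru": "0.7.3",
    "tqdm": "4.66.2",
}


def find_mismatches(pinned: dict[str, str]) -> dict[str, tuple[str, Optional[str]]]:
    """Map each package whose installed version differs from its pin
    to (expected, installed); `installed` is None if it is missing."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


for name, (expected, installed) in find_mismatches(PINNED).items():
    print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```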
Contributions are welcome! Please follow these steps:
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.