# Gate.io P2P Scraper

Scrapes Gate.io P2P buy listings with Puppeteer and saves merchant data to JSON/CSV. Ships with auto-scroll, robust logging, environment-based config, and an optional filtered view around a target merchant.

## Features
- Headless scraping with Puppeteer (uses bundled Chromium by default)
- Configurable target URL (USDT-KES by default)
- Auto-scroll to load dynamic content (see the sketch after this list)
- One-off or continuous runs (configurable interval)
- Outputs in JSON and CSV
- Logs for activity, warnings, and errors
- Optional filtered output for the merchant adjacent to a target (default: "coinftw")
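
The auto-scroll step is what forces lazily loaded listings onto the page before extraction: keep scrolling to the bottom until the page height stops growing. A minimal sketch of that pattern, not the actual implementation; the function name and tuning parameters are illustrative:

```js
// Illustrative auto-scroll: scroll to the bottom repeatedly until the
// document stops growing or a maximum number of passes is reached.
async function autoScroll(page, { maxPasses = 20, pauseMs = 500 } = {}) {
  let lastHeight = await page.evaluate(() => document.body.scrollHeight);
  for (let pass = 0; pass < maxPasses; pass++) {
    await page.evaluate(() => window.scrollTo(0, document.body.scrollHeight));
    await new Promise((resolve) => setTimeout(resolve, pauseMs)); // let lazy rows render
    const height = await page.evaluate(() => document.body.scrollHeight);
    if (height === lastHeight) break; // page stopped growing: all rows loaded
    lastHeight = height;
  }
}
```

## Requirements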
- Node.js 18+ and npm
- Internet access
- Optional: Google Chrome/Chromium (if you prefer a system browser instead of bundled Chromium)
## Setup

Install dependencies:

```bash
npm install
```

## Configuration

Copy `.env.example` to `.env` and adjust as needed:
```env
TARGET_URL=https://www.gate.io/p2p/buy/USDT-KES
SCRAPE_INTERVAL_MS=60000
TARGET_MERCHANT=coinftw
# CHROME_EXECUTABLE=/usr/bin/google-chrome-stable
```

- URL: change `TARGET_URL` to the P2P page you want.
- Browser binary: set `CHROME_EXECUTABLE` to use a system Chrome/Chromium; otherwise the bundled Chromium is used (see the launch sketch after this list).
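
To illustrate how the browser-binary choice plays out at launch time, here is a minimal sketch assuming CommonJS, `dotenv` for `.env` loading, and standard Puppeteer launch options; the real launch code lives in `scraper/scraper.js` and may differ:

```js
require('dotenv').config(); // loads TARGET_URL, SCRAPE_INTERVAL_MS, etc. from .env

const puppeteer = require('puppeteer');

async function launchBrowser() {
  return puppeteer.launch({
    headless: true,
    // If CHROME_EXECUTABLE is set, use that system binary;
    // otherwise fall back to Puppeteer's bundled Chromium.
    executablePath: process.env.CHROME_EXECUTABLE || undefined,
    args: ['--no-sandbox', '--disable-dev-shm-usage'], // common flags for container use
  });
}
```

Leaving `executablePath` as `undefined` is what lets the same code work both locally against a system Chrome and inside Docker against the bundled Chromium.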
For example:

```bash
export CHROME_EXECUTABLE=/usr/bin/google-chrome-stable
```

Note: the filtered output looks for the merchant name `coinftw` by default (see `scraper/filterMerchant.js`).
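
The exact adjacency logic lives in `scraper/filterMerchant.js`. As an illustration only, picking out the rows immediately next to the target merchant in page order could look like this; the function name and merchant object shape are assumptions:

```js
// Illustrative filter: given merchants in page order, return the entries
// adjacent to the target (the rows just before and after it).
function filterAdjacent(merchants, target = process.env.TARGET_MERCHANT || 'coinftw') {
  const i = merchants.findIndex(
    (m) => m.name && m.name.toLowerCase() === target.toLowerCase()
  );
  if (i === -1) return []; // target not on this page
  return merchants.filter((_, j) => j === i - 1 || j === i + 1);
}
```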
## Usage

Run a single scrape:

```bash
npm run scrape:once
```

Run continuously (default interval from `.env`):
```bash
npm start
```
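
Continuous mode pairs an interval timer with the SIGINT/SIGTERM handling noted under the design notes below. A minimal sketch of that shape, assuming a `scrapeOnce()` function that runs one full scrape-and-save cycle (the name and module path are illustrative):

```js
const { scrapeOnce } = require('./scraper/scraper'); // assumed export

const INTERVAL_MS = Number(process.env.SCRAPE_INTERVAL_MS) || 60000;

// Run one cycle per interval; log failures without killing the loop.
const timer = setInterval(() => {
  scrapeOnce().catch((err) => console.error('scrape failed:', err));
}, INTERVAL_MS);

// Graceful shutdown: clear the interval so Node can exit cleanly.
for (const signal of ['SIGINT', 'SIGTERM']) {
  process.on(signal, () => {
    clearInterval(timer);
    console.log(`${signal} received, stopping scraper`);
  });
}
```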
## Docker

Build the image:

```bash
docker build -t gateio-p2p-scraper:latest .
```

Run with your `.env` and persist outputs to the host:
```bash
docker run --rm \
  --env-file .env \
  -v "$(pwd)/data:/app/data" \
  -v "$(pwd)/logs:/app/logs" \
  gateio-p2p-scraper:latest
```

Or use docker-compose:

```bash
docker compose up --build
```

Notes:

- Inside Docker, leave `CHROME_EXECUTABLE` unset so Puppeteer uses its bundled Chromium.
- Data will appear in `./data` and logs in `./logs` on your host.
## Outputs

- `data/gateio_p2p_merchants.json`
- `data/gateio_p2p_merchants.csv`
- `data/filtered_merchants.json` (adjacent to the target merchant)
- `data/filtered_merchants.csv` (adjacent to the target merchant)
## Logs

- `logs/activity.log` — high-level steps
- `logs/errors.log` — errors and stack traces
- `logs/warnings.log` — non-fatal warnings
## Troubleshooting

- Browser not found at executablePath
  - Clear `CHROME_EXECUTABLE` or point it to a valid path; by default the bundled Chromium is used.
- Timeouts waiting for selectors or no data extracted
  - The site may have changed. Update the selectors in `scraper/extract.js` and the `waitForSelector` calls in `scraper/scraper.js` (see the fallback-selector sketch after this list).
- Permission errors writing files
  - Ensure the process can write to the `data/` and `logs/` folders.
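
When updating selectors, the fallback pattern the design notes describe looks roughly like this; the selector strings below are placeholders, not the real ones in `scraper/extract.js`:

```js
// Try selectors in priority order and return the first that matches, so a
// markup change on the site degrades gracefully instead of failing outright.
const ROW_SELECTORS = ['.p2p-list .row', '[data-testid="p2p-row"]', 'table tbody tr']; // placeholders

async function findRows(page) {
  for (const selector of ROW_SELECTORS) {
    const rows = await page.$$(selector);
    if (rows.length > 0) return { selector, rows };
  }
  throw new Error('No known row selector matched; the site layout may have changed');
}
```

Trying selectors in priority order means a site redesign surfaces as a clear error instead of silently returning zero rows.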
## Development

- Lint: `npm run lint`
- Format: `npm run format`
- Tests (none added yet): `npm test`

## Project Structure

- `src/index.js` — CLI/entrypoint for the enhanced app lifecycle
- `app.js` — main application class orchestrating services
- `config/` — environment configs and validation schemas
- `scraper/` — scraping pipeline, selectors, database, metrics, logging, shutdown, etc.
- `scripts/healthcheck.js` — fast readiness/self-check used locally and in Docker HEALTHCHECK (sketched below)
- `data/`, `logs/` — runtime outputs (mounted in Docker)
- `Dockerfile`, `.dockerignore`, `docker-compose.yml` — containerization
- `.editorconfig`, `.prettierrc`, `.nvmrc` — editor/formatting/runtime defaults
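
A healthcheck of this kind typically just verifies that fresh output exists and exits 0 or 1 for Docker's HEALTHCHECK. A minimal sketch; the checks in `scripts/healthcheck.js` may be more thorough, and the staleness window here is illustrative:

```js
// Exit 0 if the JSON output exists and was updated recently, else exit 1.
const fs = require('fs');

const OUTPUT = 'data/gateio_p2p_merchants.json';
const MAX_AGE_MS = 5 * 60 * 1000; // treat output older than 5 minutes as stale (illustrative)

try {
  const age = Date.now() - fs.statSync(OUTPUT).mtimeMs;
  process.exit(age <= MAX_AGE_MS ? 0 : 1);
} catch {
  process.exit(1); // file missing: not healthy
}
```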
## Design Notes

- Extraction selectors are centralized in `scraper/extract.js` and include fallbacks; if the site changes, update them there.
- Config and intervals are environment-driven via `.env` to avoid code changes.
- CSV writing escapes values and ensures directories exist (see the sketch after this list).
- Graceful shutdown on SIGINT/SIGTERM stops intervals cleanly (the run-loop sketch under Usage shows the shape).
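
The CSV behavior noted above, escaping plus directory creation, amounts to something like the following sketch; the function names and row shape are illustrative:

```js
const fs = require('fs');
const path = require('path');

// Quote a field and double any embedded quotes, per RFC 4180.
function csvEscape(value) {
  const s = String(value ?? '');
  return /[",\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
}

function writeCsv(filePath, rows, headers) {
  fs.mkdirSync(path.dirname(filePath), { recursive: true }); // ensure data/ exists
  const lines = [
    headers.map(csvEscape).join(','),
    ...rows.map((row) => headers.map((h) => csvEscape(row[h])).join(',')),
  ];
  fs.writeFileSync(filePath, lines.join('\n') + '\n');
}
```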
## License

ISC