Add Glassdoor platform adapter for job scraping #39

Copilot · 2025-11-25T23:28:28Z

*This pull request was created as a result of the following prompt from Copilot chat.*
> ## Objective
> Develop Glassdoor platform adapter for job scraping, similar to the LinkedIn adapter, to expand job source coverage and provide users with more opportunities.
> 
> ## Related Issue
> Closes #22
> 
> ## Requirements
> - Mirror LinkedIn adapter's class structure (extend BaseJobScraper)
> - Scrape jobs with title, company, location, compensation, etc.
> - Export to unified job pipeline JSON format
> - Add platform selection logic in orchestrator
> - Unit tests, usage docs
> 
> ## Technical Specifications
> 
> ### Directory Structure
> ```
> /services/platform-adapters/
> ├── glassdoor/
> │   ├── src/
> │   │   ├── index.ts                  # Main export
> │   │   ├── GlassdoorJobScraper.ts    # Core scraper class
> │   │   ├── types.ts                  # Glassdoor-specific types
> │   │   └── utils/
> │   │       ├── urlBuilder.ts         # Build Glassdoor search URLs
> │   │       ├── parser.ts             # Parse Glassdoor HTML
> │   │       └── ratingParser.ts       # Parse company ratings
> │   ├── tests/
> │   │   ├── scraper.test.ts           # Unit tests
> │   │   ├── parser.test.ts            # Parser tests
> │   │   └── fixtures/                 # Mock HTML
> │   │       ├── search-results.html
> │   │       └── job-detail.html
> │   ├── package.json
> │   ├── tsconfig.json
> │   ├── Dockerfile
> │   └── README.md
> ├── base/
> │   └── BaseJobScraper.ts             # Shared base class
> └── types/
>     └── job.ts                        # Unified Job interface
> ```
> 
> ### Glassdoor-Specific Job Interface
> ```typescript
> interface GlassdoorJob extends Job {
>   // Glassdoor-specific fields
>   companyRating?: number;           // 1-5 stars
>   salaryEstimate?: {
>     min: number;
>     max: number;
>     currency: string;
>   };
>   reviewCount?: number;
>   easyApply: boolean;
>   companySize?: string;
>   industry?: string;
>   headquarters?: string;
> }
> 
> // Glassdoor search filters
> interface GlassdoorSearchFilters extends SearchFilters {
>   companyRating?: number;           // Minimum rating (1-5)
>   companySize?: 'small' | 'medium' | 'large' | 'enterprise';
>   industry?: string;
>   easyApplyOnly?: boolean;
> }
> ```
> 
> ### GlassdoorJobScraper Class
> ```typescript
> import { BaseJobScraper } from '../base/BaseJobScraper';
> import { ElementHandle, Page } from 'playwright';
> 
> export class GlassdoorJobScraper extends BaseJobScraper {
>   readonly platform = 'glassdoor';
>   readonly baseUrl = 'https://www.glassdoor.com';
>   
>   constructor(config: ScraperConfig) {
>     super(config);
>   }
>   
>   // Build Glassdoor search URL from filters
>   protected buildSearchUrl(filters: GlassdoorSearchFilters): string {
>     const params = new URLSearchParams();
>     params.set('keyword', filters.keywords);
>     if (filters.location) params.set('locT', 'C');
>     if (filters.location) params.set('locKeyword', filters.location);
>     if (filters.salary?.min) params.set('minSalary', filters.salary.min.toString());
>     if (filters.datePosted) params.set('fromAge', this.mapDatePosted(filters.datePosted));
>     
>     return `${this.baseUrl}/Job/jobs.htm?${params.toString()}`;
>   }
>   
>   // Parse job cards from search results
>   protected async parseJobList(page: Page): Promise<GlassdoorJob[]> {
>     const jobCards = await page.$$('[data-test="jobListing"]');
>     const jobs: GlassdoorJob[] = [];
>     
>     for (const card of jobCards) {
>       const job = await this.parseJobCard(card);
>       jobs.push(job);
>     }
>     
>     return jobs;
>   }
>   
>   // Parse individual job card
>   private async parseJobCard(card: ElementHandle): Promise<GlassdoorJob> {
>     return {
>       id: await card.getAttribute('data-id') || '',
>       title: await card.$eval('[data-test="job-title"]', el => el.textContent?.trim() || ''),
>       company: await card.$eval('[data-test="employer-name"]', el => el.textContent?.trim() || ''),
>       location: await card.$eval('[data-test="emp-location"]', el => el.textContent?.trim() || ''),
>       salary: await this.parseSalary(card),
>       companyRating: await this.parseRating(card),
>       url: await this.parseJobUrl(card),
>       postingDate: await this.parseDate(card),
>       tags: [],
>       source: 'glassdoor',
>       scrapedAt: new Date().toISOString(),
>       easyApply: await this.hasEasyApply(card)
>     };
>   }
>   
>   // Get detailed job info
>   async getJobDetails(jobUrl: string): Promise<GlassdoorJob> {
>     const page = await this.browser.newPage();
>     await page.goto(jobUrl);
>     
>     return {
>       ...await this.parseBasicInfo(page),
>       description: await page.$eval('[data-test="job-description"]', el => el.innerHTML),
>       companySize: await this.parseCompanySize(page),
>       industry: await this.parseIndustry(page),
>       headquarters: await this.parseHeadquarters(page)
>     };
>   }
>   
>   // Handle pagination
>   protected async hasNextPage(page: Page): Promise<boolean> {
>     return await page.isVisible('[data-test="pagination-next"]');
>   }
>   
>   protected async goToNextPage(page: Page): Promise<void> {
>     await page.click('[data-test="pagination-next"]');
>     await page.waitForLoadState('networkidle');
>   }
> }
> ```
> 
> ### URL Builder Utility
> ```typescript
> // utils/urlBuilder.ts
> export function buildGlassdoorSearchUrl(filters: GlassdoorSearchFilters): string {
>   const baseUrl = 'https://www.glassdoor.com/Job/jobs.htm';
>   const params: Record<string, string> = {};
>   
>   if (filters.keywords) {
>     params.sc = 'keywords=' + encodeURIComponent(filters.keywords);
>   }
>   
>   if (filters.location) {
>     params.locT = 'C';
>     params.locKeyword = filters.location;
>   }
>   
>   if (filters.remote) {
>     params.remoteWorkType = '1';
>   }
>   
>   if (filters.salary?.min) {
>     params.minSalary = filters.salary.min.toString();
>   }
>   
>   if (filters.datePosted) {
>     const dateMap = { '24h': '1', 'week': '7', 'month': '30' };
>     params.fromAge = dateMap[filters.datePosted];
>   }
>   
>   if (filters.easyApplyOnly) {
>     params.applicationType = '1';
>   }
>   
>   const queryString = Object.entries(params)
>     .map(([k, v]) => `${k}=${encodeURIComponent(v)}`)
>     .join('&');
>     
>   return `${baseUrl}?${queryString}`;
> }
> ```
> 
> ### Usage Example
> ```typescript
> import { GlassdoorJobScraper } from './GlassdoorJobScraper';
> 
> const scraper = new GlassdoorJobScraper({
>   headless: true,
>   throttleMs: 3000,
>   maxResults: 50
> });
> 
> const jobs = await scraper.search({
>   keywords: 'software engineer',
>   location: 'New York, NY',
>   remote: true,
>   companyRating: 4,      // 4+ stars only
>   easyApplyOnly: true,
>   datePosted: 'week'
> });
> 
> console.log(`Found ${jobs.length} jobs on Glassdoor`);
> 
> // Get detailed info for top job
> const details = await scraper.getJobDetails(jobs[0].url);
> console.log(`Company size: ${details.companySize}`);
> console.log(`Industry: ${details.industry}`);
> 
> await scraper.exportToJson(jobs, './output/glassdoor-jobs.json');
> ```
> 
> ### Platform Orchestrator Integration
> ```typescript
> // In /services/agent-orchestrator/src/JobSearchOrchestrator.ts
> import { LinkedInJobScraper } from '../platform-adapters/linkedin';
> import { GlassdoorJobScraper } from '../platform-adapters/glassdoor';
> import { Job, SearchFilters } from '../platform-adapters/types/job';
> 
> class JobSearchOrchestrator {
>   private scrapers: Map<string, BaseJobScraper> = new Map();
>   
>   constructor() {
>     this.scrapers.set('linkedin', new LinkedInJobScraper({ headless: true }));
>     this.scrapers.set('glassdoor', new GlassdoorJobScraper({ headless: true }));
>   }
>   
>   async searchAllPlatforms(filters: SearchFilters): Promise<Job[]> {
>     const results = await Promise.all(
>       Array.from(this.scrapers.values()).map(s => s.search(filters))
>     );
>     
>     // Deduplicate by job title + company
>     return this.deduplicateJobs(results.flat());
>   }
>   
>   async searchPlatform(platform: string, filters: SearchFilters): Promise<Job[]> {
>     const scraper = this.scrapers.get(platform);
>     if (!scraper) throw new Error(`Unknown platform: ${platform}`);
>     return scraper.search(filters);
>   }
> }
> ```
> 
> ## Acceptance Criteria
> - [ ] Extends BaseJobScraper from LinkedIn adapter
> - [ ] Scrapes 30+ jobs per search query
> - [ ] Parses company ratings and salary estimates
> - [ ] Handles Glassdoor's anti-bot measures (throttling, user-agent)
> - [ ] Easy Apply filter support
> - [ ] Pagination support for large result sets
> - [ ] Unit tests with mock HTML fixtures (>80% coverage)
> - [ ] Integration with JobSearchOrchestrator
> - [ ] README with setup and usage documentation
> - [ ] Dockerfile for containerized deployment

Co-authored-by: groupthinking <[email protected]>

chatgpt-codex-connector

💡 Codex Review

AJOB4AGENT/services/agent-orchestrator/src/adapters/platform-manager.ts

Lines 80 to 83 in 038528a

    
      // Phase 1: JobSpy platforms (4 core platforms)
      const jobSpyPlatforms = params.platforms.filter(p =>
        ['indeed', 'linkedin', 'glassdoor', 'ziprecruiter'].includes(p)
      );

Glassdoor direct adapter never used in multi-platform search

In searchAllPlatforms the Glassdoor platform is always grouped with the JobSpy MCP platforms and sent to jobSpyAdapter (lines 80‑83), and the method has no branch for the new 'glassdoor-direct' option. As a result, multi-platform searches ignore USE_GLASSDOOR_DIRECT=true and drop any 'glassdoor-direct' entry from params.platforms, returning either JobSpy data or no Glassdoor results at all even though a dedicated GlassdoorAdapter was added elsewhere. Consider routing Glassdoor through the dedicated adapter when requested so the new scraper can actually run during parallel searches.
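
One way to picture the routing this comment asks for is a small pure function that decides, per request, which platforms go to JobSpy and whether the dedicated adapter should run. This is only a sketch: `routePlatforms`, `RoutedPlatforms`, and the boolean flag argument are hypothetical names, not code from `platform-manager.ts`, and the assumption is that `USE_GLASSDOOR_DIRECT=true` should redirect a plain `'glassdoor'` request to the dedicated adapter.

```typescript
// Hypothetical helper: names and return shape are assumptions, not the PR's code.
type RoutedPlatforms = {
  jobSpy: string[];          // platforms forwarded to the JobSpy adapter
  glassdoorDirect: boolean;  // whether the dedicated Glassdoor adapter should run
};

const JOBSPY_PLATFORMS = ['indeed', 'linkedin', 'glassdoor', 'ziprecruiter'];

function routePlatforms(platforms: string[], useGlassdoorDirect: boolean): RoutedPlatforms {
  // The dedicated adapter runs when it is requested explicitly, or when the
  // flag is on and the caller asked for Glassdoor at all.
  const wantsDirect =
    platforms.includes('glassdoor-direct') ||
    (useGlassdoorDirect && platforms.includes('glassdoor'));

  // Keep Glassdoor out of the JobSpy batch when the dedicated adapter handles it,
  // so the same source is not queried twice.
  const jobSpy = platforms.filter(
    p => JOBSPY_PLATFORMS.includes(p) && !(wantsDirect && p === 'glassdoor')
  );

  return { jobSpy, glassdoorDirect: wantsDirect };
}
```

With a split like this, `searchAllPlatforms` could run the JobSpy batch and the dedicated adapter in parallel and merge the results, rather than silently dropping `'glassdoor-direct'`.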

Copilot

Pull request overview

This PR implements a comprehensive Glassdoor platform adapter for job scraping, extending the AJOB4AGENT system with a new job source. The implementation follows the established LinkedIn adapter pattern with a Playwright-based scraper, utility functions, comprehensive tests, and integration with the agent orchestrator.

Key Changes

- New Glassdoor scraper with Playwright-based automation, pagination support, and rate limiting with jitter
- Comprehensive utility functions for URL building, HTML parsing, and rating/review count parsing
- 88+ unit tests with HTML fixtures achieving 80%+ coverage targets
- Agent orchestrator integration with dual routing (dedicated adapter + JobSpy fallback)

Reviewed changes

Copilot reviewed 23 out of 25 changed files in this pull request and generated 14 comments.

| File | Description |
| --- | --- |
| `services/platform-adapters/types/job.ts` | Shared type definitions for unified Job interface and search filters |
| `services/platform-adapters/glassdoor/src/types.ts` | Glassdoor-specific types extending base Job interface with rating, salary, and Easy Apply fields |
| `services/platform-adapters/glassdoor/src/GlassdoorJobScraper.ts` | Core scraper class with modal dismissal, throttling, and pagination logic |
| `services/platform-adapters/glassdoor/src/utils/*.ts` | URL builder, HTML parser, and rating parser utilities |
| `services/platform-adapters/glassdoor/src/base/*` | Base scraper class and duplicated type definitions |
| `services/platform-adapters/glassdoor/tests/*.test.ts` | Comprehensive unit tests for scraper, parsers, and utilities |
| `services/platform-adapters/glassdoor/tests/fixtures/*.html` | Mock HTML fixtures for testing |
| `services/platform-adapters/glassdoor/package.json` | Package configuration with dependencies and test scripts |
| `services/platform-adapters/glassdoor/Dockerfile` | Multi-stage Docker build with Playwright support |
| `services/platform-adapters/glassdoor/README.md` | Comprehensive documentation with API reference and examples |
| `services/agent-orchestrator/src/adapters/glassdoor-adapter.ts` | HTTP-based adapter for agent orchestrator (simplified regex parsing) |
| `services/agent-orchestrator/src/adapters/platform-manager.ts` | Integration with dual routing for glassdoor/glassdoor-direct platforms |
| `.gitignore` | Excludes TypeScript build artifacts from platform-adapters |

services/platform-adapters/glassdoor/src/utils/urlBuilder.ts

Copilot · 2025-11-26T20:44:58Z

services/platform-adapters/glassdoor/src/GlassdoorJobScraper.ts

+  async search(filters: GlassdoorSearchFilters): Promise<GlassdoorJob[]> {
+    await this.initialize();
+
+    const page = await this.newPage();
+    const allJobs: GlassdoorJob[] = [];
+
+    try {
+      const searchUrl = this.buildSearchUrl(filters);
+      console.log(`🔍 Searching Glassdoor: ${searchUrl}`);
+
+      await this.throttle();
+      await page.goto(searchUrl, { 
+        waitUntil: 'networkidle',
+        timeout: this.config.timeout 
+      });
+
+      // Handle potential modal or overlay
+      await this.dismissModals(page);
+
+      let pageNum = 1;
+      const maxResults = this.config.maxResults || 50;
+
+      while (allJobs.length < maxResults) {
+        console.log(`📄 Processing page ${pageNum}...`);
+
+        // Handle modals that might appear mid-session
+        await this.dismissModals(page);
+
+        const jobs = await this.parseJobList(page);
+        allJobs.push(...jobs);
+
+        console.log(`✅ Total jobs collected: ${allJobs.length}`);
+
+        if (allJobs.length >= maxResults) {
+          break;
+        }
+
+        const hasMore = await this.hasNextPage(page);
+        if (!hasMore) {
+          console.log('📍 No more pages available');
+          break;
+        }
+
+        await this.throttle();
+        await this.goToNextPage(page);
+        pageNum++;
+      }
+
+      return allJobs.slice(0, maxResults);
+
+    } finally {
+      await page.close();
+    }
+  }


The search() method is fully overridden in the subclass and doesn't call super.search(), which duplicates pagination logic from the base class. This violates the DRY principle and makes maintenance harder since changes to search logic would need to be duplicated. Consider either calling super.search() and only customizing the modal dismissal, or refactoring the base class to allow hooks for platform-specific behavior.
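
The hook-based option mentioned above could look roughly like the following template-method sketch. `PagedScraper` and `beforeParse` are hypothetical names chosen for illustration; the real `BaseJobScraper` is not shown in this thread, so its actual shape may differ. The idea is that the pagination loop lives once in the base class and a subclass only overrides the hook (for Glassdoor, modal dismissal).

```typescript
// Sketch only: PagedScraper and its hook names are hypothetical, not the PR's BaseJobScraper.
abstract class PagedScraper<TPage, TJob> {
  constructor(protected readonly maxResults: number) {}

  // The pagination loop is written once here instead of in every adapter.
  async search(page: TPage): Promise<TJob[]> {
    const all: TJob[] = [];
    while (all.length < this.maxResults) {
      await this.beforeParse(page); // platform-specific hook
      all.push(...(await this.parseJobList(page)));
      if (all.length >= this.maxResults || !(await this.hasNextPage(page))) break;
      await this.goToNextPage(page);
    }
    return all.slice(0, this.maxResults);
  }

  // Default hook does nothing; a subclass overrides it, e.g. to dismiss modals.
  protected async beforeParse(_page: TPage): Promise<void> {}

  protected abstract parseJobList(page: TPage): Promise<TJob[]>;
  protected abstract hasNextPage(page: TPage): Promise<boolean>;
  protected abstract goToNextPage(page: TPage): Promise<void>;
}
```

A Glassdoor subclass would then implement the three abstract methods and put its `dismissModals(page)` call inside `beforeParse`, without re-implementing the loop.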

Copilot · 2025-11-26T20:44:58Z

services/platform-adapters/glassdoor/src/GlassdoorJobScraper.ts

+  private async parseBasicInfoFromDetailPage(page: Page): Promise<GlassdoorJob> {
+    const title = await page.$eval('[data-test="job-title"], .JobDetails_jobTitle__bFQf_, h1', 
+      el => el.textContent?.trim() || ''
+    ).catch(() => 'Unknown Title');
+
+    const company = await page.$eval('[data-test="employer-name"], .EmployerProfile_employerName__Xemli', 
+      el => el.textContent?.trim() || ''
+    ).catch(() => 'Unknown Company');
+
+    const location = await page.$eval('[data-test="emp-location"], .JobDetails_location__4j4Qv', 
+      el => el.textContent?.trim() || ''
+    ).catch(() => '');
+
+    const salary = await page.$eval('[data-test="detailSalary"], .JobDetails_salaryEstimate__cPQyl', 
+      el => el.textContent?.trim() || ''
+    ).catch(() => undefined);
+
+    const ratingText = await page.$eval('[data-test="rating"], .EmployerProfile_ratingValue__2BBWA', 
+      el => el.textContent?.trim() || ''
+    ).catch(() => '');
+
+    const reviewCountText = await page.$eval('[data-test="review-count"]', 
+      el => el.textContent?.trim() || ''
+    ).catch(() => '');
+
+    const easyApply = await page.$('[data-test="easy-apply"], .JobDetails_easyApply__YZw6j')
+      .then(el => el !== null)
+      .catch(() => false);
+
+    // Extract job ID from URL
+    const url = page.url();
+    const idMatch = url.match(/jl=(\d+)/) || url.match(/-JV_(\d+)/);
+    const id = idMatch ? idMatch[1] : `gd-${Date.now()}`;
+
+    return {
+      id,
+      title,
+      company,
+      location,
+      salary: salary || undefined,
+      companyRating: ratingText ? parseRating(ratingText) : undefined,
+      reviewCount: reviewCountText ? parseReviewCount(reviewCountText) : undefined,
+      easyApply,
+      salaryEstimate: salary ? parseSalaryEstimate(salary) : undefined,
+      url,
+      tags: [],
+      source: 'glassdoor',
+      scrapedAt: new Date().toISOString()
+    };
+  }


The parseBasicInfoFromDetailPage method returns a GlassdoorJob object with potentially empty/default values when elements are not found (lines 157-203). However, the method name suggests it only parses "basic info", not a complete job object. This could lead to confusion about what data is guaranteed to be present. Consider renaming to parseJobFromDetailPage or returning a partial type to clarify expectations.
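
The "partial type" option can be sketched as follows. `JobFields` is a trimmed-down stand-in for the PR's `GlassdoorJob`, and `DetailPageInfo` and `missingFields` are hypothetical names. The point is only to show how the return type can separate fields the detail page always yields from fields that depend on a selector matching.

```typescript
// Trimmed-down stand-in for the PR's GlassdoorJob; only the fields used here.
interface JobFields {
  id: string;
  title: string;
  company: string;
  location: string;
  url: string;
  easyApply: boolean;
  companyRating?: number;
}

// Always-available fields stay required; selector-dependent fields become optional.
type DetailPageInfo = Pick<JobFields, 'id' | 'url' | 'easyApply'> &
  Partial<Omit<JobFields, 'id' | 'url' | 'easyApply'>>;

// Lists which core fields the parser could not fill in.
function missingFields(info: DetailPageInfo): string[] {
  const core: Array<keyof JobFields> = ['title', 'company', 'location'];
  return core.filter(key => info[key] === undefined || info[key] === '');
}
```

A caller can then decide explicitly what to do with an incomplete result instead of receiving placeholder strings such as `'Unknown Title'`.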

Copilot · 2025-11-26T20:44:59Z

services/agent-orchestrator/src/adapters/glassdoor-adapter.ts

+      const errorMessage = error instanceof Error ? error.message : 'Unknown error';
+      console.error('❌ Glassdoor search failed:', errorMessage);


The catch block receives an `unknown` error and narrows it with an `instanceof Error` check. The pattern on line 97, `error instanceof Error ? error.message : 'Unknown error'`, is correct, though it could be shortened with optional chaining: `(error as Error)?.message || 'Unknown error'`. More importantly, errors should be logged with full stack traces for debugging, not just the message.

Suggested change

-      const errorMessage = error instanceof Error ? error.message : 'Unknown error';
-      console.error('❌ Glassdoor search failed:', errorMessage);
+      const errorMessage = (error as Error)?.message || 'Unknown error';
+      console.error('❌ Glassdoor search failed:', errorMessage, error);

Copilot · 2025-11-26T20:44:59Z

services/agent-orchestrator/src/adapters/glassdoor-adapter.ts

+  private async rateLimitInterceptor(config: any) {
+    const now = Date.now();
+    const timeSinceLastRequest = now - this.lastRequestTime;
+
+    if (timeSinceLastRequest < this.minRequestInterval) {
+      const delay = this.minRequestInterval - timeSinceLastRequest;
+      console.log(`⏳ Glassdoor rate limiting: waiting ${delay}ms`);
+      await new Promise(resolve => setTimeout(resolve, delay));
+    }
+
+    if (this.requestCount >= this.maxRequestsPerWindow) {
+      const windowReset = this.lastRequestTime + this.rateLimitWindow;
+      if (now < windowReset) {
+        const delay = windowReset - now;
+        console.log(`⚠️  Glassdoor rate limit reached: waiting ${Math.ceil(delay / 1000)}s`);
+        await new Promise(resolve => setTimeout(resolve, delay));
+        this.requestCount = 0;
+      }
+    }
+
+    this.lastRequestTime = Date.now();
+    this.requestCount++;
+
+    return config;


The rate limiting logic has a potential race condition. The requestCount is checked and incremented in the interceptor (lines 59-70), but if multiple requests are made concurrently, they could all pass the check before any increments the counter. Consider using an atomic operation or mutex to ensure thread-safety, or document that this adapter should not be used concurrently.

@copilot @claude Provide a fix

Fixed in commit 2dca7cc. Implemented a mutex pattern using Promise chaining for thread-safe rate limiting:

- Added `acquireRateLimitLock()` method that ensures only one request processes rate limiting at a time
- Uses Promise chaining to serialize access: each request waits for the previous to complete before checking/updating counters
- Added `windowStartTime` for proper rate limit window tracking
- Updated `getRateLimitStatus()` to use the correct window timing

The adapter is now safe for concurrent access.
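
For readers unfamiliar with the technique, a promise-chain lock can be sketched in a few lines. This is not the code from commit 2dca7cc; `SerialLock` and `run` are illustrative names, and the sketch only shows the general idea of serializing async sections by chaining each one onto the previous one.

```typescript
// Sketch of a promise-chain lock; class and method names are illustrative assumptions.
class SerialLock {
  // Resolves when the most recently queued task has settled.
  private tail: Promise<void> = Promise.resolve();

  // Runs `task` only after every previously queued task has settled.
  run<T>(task: () => Promise<T>): Promise<T> {
    const result = this.tail.then(task);
    // Swallow the outcome so one failed task does not block later ones.
    this.tail = result.then(() => undefined, () => undefined);
    return result;
  }
}
```

Wrapping the counter check and increment in `lock.run(...)` means two concurrent requests can no longer both read `requestCount` before either one updates it, which is the interleaving the review comment describes.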

Copilot · 2025-11-26T20:45:00Z

services/platform-adapters/glassdoor/src/GlassdoorJobScraper.ts

+      console.log('⚠️ No job listings found on page');
+      return [];
+    }
+
+    // Get page HTML and parse job cards
+    const html = await page.content();
+    const rawJobs = parseJobListHtml(html);
+
+    // Transform raw job cards to GlassdoorJob format
+    const jobs: GlassdoorJob[] = rawJobs.map(raw => {
+      const job = transformRawJobCard(raw);
+
+      // Parse salary estimate if available
+      if (raw.salary) {
+        job.salaryEstimate = parseSalaryEstimate(raw.salary);
+      }
+
+      return job;
+    });
+
+    console.log(`📋 Parsed ${jobs.length} jobs from current page`);
+    return jobs;
+  }
+
+  /**
+   * Check if there's a next page available
+   */
+  protected async hasNextPage(page: Page): Promise<boolean> {
+    try {
+      const html = await page.content();
+      return hasNextPageButton(html);
+    } catch {
+      return false;
+    }
+  }
+
+  /**
+   * Navigate to the next page
+   */
+  protected async goToNextPage(page: Page): Promise<void> {
+    const nextButton = await page.$('[data-test="pagination-next"], .nextButton, [aria-label="Next"]');
+
+    if (nextButton) {
+      await nextButton.click();
+      await page.waitForLoadState('networkidle');
+    }
+  }
+
+  /**
+   * Get detailed job information from a job detail page
+   */
+  async getJobDetails(jobUrl: string): Promise<GlassdoorJob> {
+    await this.initialize();
+
+    const page = await this.newPage();
+
+    try {
+      await this.throttle();
+      await page.goto(jobUrl, {
+        waitUntil: 'networkidle',
+        timeout: this.config.timeout
+      });
+
+      // Wait for job description to load
+      await page.waitForSelector('[data-test="job-description"], .JobDetails_jobDescription__uW_fK, #JobDescriptionContainer', {
+        timeout: 10000
+      }).catch(() => {
+        console.log('⚠️ Job description selector not found, continuing...');
+      });
+
+      // Parse the page content
+      const html = await page.content();
+      const detail = parseJobDetailHtml(html);
+
+      // Get basic info from the page
+      const basicInfo = await this.parseBasicInfoFromDetailPage(page);
+
+      return {
+        ...basicInfo,
+        descriptionHtml: detail.description,
+        description: this.stripHtml(detail.description),
+        companySize: detail.companySize,
+        industry: detail.industry,
+        headquarters: detail.headquarters,
+        benefits: detail.benefits,
+        skills: detail.skills,
+        employmentType: detail.employmentType,
+        url: jobUrl,
+        source: 'glassdoor',
+        scrapedAt: new Date().toISOString()
+      };
+
+    } finally {
+      await page.close();
+    }
+  }
+
+  /**
+   * Parse basic job info from the detail page
+   */
+  private async parseBasicInfoFromDetailPage(page: Page): Promise<GlassdoorJob> {
+    const title = await page.$eval('[data-test="job-title"], .JobDetails_jobTitle__bFQf_, h1', 
+      el => el.textContent?.trim() || ''
+    ).catch(() => 'Unknown Title');
+
+    const company = await page.$eval('[data-test="employer-name"], .EmployerProfile_employerName__Xemli', 
+      el => el.textContent?.trim() || ''
+    ).catch(() => 'Unknown Company');
+
+    const location = await page.$eval('[data-test="emp-location"], .JobDetails_location__4j4Qv', 
+      el => el.textContent?.trim() || ''
+    ).catch(() => '');
+
+    const salary = await page.$eval('[data-test="detailSalary"], .JobDetails_salaryEstimate__cPQyl', 
+      el => el.textContent?.trim() || ''
+    ).catch(() => undefined);
+
+    const ratingText = await page.$eval('[data-test="rating"], .EmployerProfile_ratingValue__2BBWA', 
+      el => el.textContent?.trim() || ''
+    ).catch(() => '');
+
+    const reviewCountText = await page.$eval('[data-test="review-count"]', 
+      el => el.textContent?.trim() || ''
+    ).catch(() => '');
+
+    const easyApply = await page.$('[data-test="easy-apply"], .JobDetails_easyApply__YZw6j')
+      .then(el => el !== null)
+      .catch(() => false);
+
+    // Extract job ID from URL
+    const url = page.url();
+    const idMatch = url.match(/jl=(\d+)/) || url.match(/-JV_(\d+)/);
+    const id = idMatch ? idMatch[1] : `gd-${Date.now()}`;
+
+    return {
+      id,
+      title,
+      company,
+      location,
+      salary: salary || undefined,
+      companyRating: ratingText ? parseRating(ratingText) : undefined,
+      reviewCount: reviewCountText ? parseReviewCount(reviewCountText) : undefined,
+      easyApply,
+      salaryEstimate: salary ? parseSalaryEstimate(salary) : undefined,
+      url,
+      tags: [],
+      source: 'glassdoor',
+      scrapedAt: new Date().toISOString()
+    };
+  }
+
+  /**
+   * Strip HTML tags from a string
+   */
+  private stripHtml(html: string): string {
+    return html
+      .replace(/<[^>]*>/g, ' ')
+      .replace(/\s+/g, ' ')
+      .trim();
+  }
+
+  /**
+   * Handle Glassdoor's anti-bot measures
+   * Override the throttle method with more aggressive rate limiting
+   */
+  protected async throttle(): Promise<void> {
+    const now = Date.now();
+    const elapsed = now - this.lastRequestTime;
+
+    // Use a longer throttle time for Glassdoor (default 3 seconds)
+    const throttleMs = this.config.throttleMs || 3000;
+
+    // Add some randomness to avoid detection patterns
+    const jitter = Math.floor(Math.random() * 1000);
+    const totalDelay = throttleMs + jitter;
+
+    if (elapsed < totalDelay) {
+      const delay = totalDelay - elapsed;
+      console.log(`⏳ Throttling: waiting ${delay}ms`);


The console.log statements should be replaced with proper logging using a logger instance. Using console.log directly in production code makes it difficult to control log levels, format output consistently, or redirect logs to different destinations. Consider using a logging library or creating a logger interface.
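
As a rough illustration of the logger interface suggested above, here is a minimal sketch. The names `Logger`, `LogLevel`, and `createLogger` are hypothetical and do not come from this PR; a real project might use an existing logging library instead.

```typescript
// Hypothetical logger abstraction; none of these names exist in the PR.
type LogLevel = 'debug' | 'info' | 'warn' | 'error';

interface Logger {
  log(level: LogLevel, message: string): void;
}

// Numeric ordering used to compare levels.
const LEVEL_ORDER: Record<LogLevel, number> = { debug: 0, info: 1, warn: 2, error: 3 };

// Forwards messages at or above minLevel to the given sink (console, file, test buffer...).
function createLogger(minLevel: LogLevel, sink: (line: string) => void): Logger {
  return {
    log(level, message) {
      if (LEVEL_ORDER[level] >= LEVEL_ORDER[minLevel]) {
        sink(`[${level.toUpperCase()}] ${message}`);
      }
    },
  };
}

// Usage: collect output in memory instead of writing to the console.
const lines: string[] = [];
const logger = createLogger('info', line => { lines.push(line); });
logger.log('debug', 'Throttling: waiting 1200ms'); // below minLevel, dropped
logger.log('info', 'Search started');              // kept
```

Injecting the sink keeps the scraper code independent of where logs end up, which is the control the comment asks for.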

Copilot · 2025-11-26T20:45:01Z

services/platform-adapters/glassdoor/src/GlassdoorJobScraper.ts

+      await new Promise(resolve => setTimeout(resolve, delay));
+    }
+
+    this.lastRequestTime = Date.now();


The lastRequestTime property is declared as protected in the base class, but the throttle method in GlassdoorJobScraper (line 223) attempts to access it. While this is technically correct for protected members, the overridden throttle() method never calls super.throttle(), which means the base class's lastRequestTime won't be updated. This could cause issues if other methods in the base class rely on lastRequestTime being current.

Suggested change

-    this.lastRequestTime = Date.now();
+    this.lastRequestTime = Date.now();
+    await super.throttle();
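
An alternative to the suggestion above, sketched here under assumptions: rather than overriding `throttle()` and deciding whether to call `super.throttle()`, the subclass overrides only the delay computation, so the base class remains the single place that waits and updates `lastRequestTime`. The classes `BaseThrottler` and `JitteredThrottler`, the `requiredGapMs()` hook, and the injected clock and sleep functions are all hypothetical and are not part of this PR.

```typescript
// Hypothetical names throughout; this is not the PR's BaseJobScraper.
type Clock = () => number;
type Sleep = (ms: number) => Promise<void>;

class BaseThrottler {
  protected lastRequestTime = 0;

  constructor(
    protected readonly now: Clock,
    protected readonly sleep: Sleep,
    protected readonly throttleMs: number,
  ) {}

  // Subclasses override this to change the spacing between requests.
  protected requiredGapMs(): number {
    return this.throttleMs;
  }

  // Waits until the required gap has passed, then records the request time once.
  async throttle(): Promise<void> {
    const elapsed = this.now() - this.lastRequestTime;
    const gap = this.requiredGapMs();
    if (elapsed < gap) {
      await this.sleep(gap - elapsed);
    }
    this.lastRequestTime = this.now();
  }
}

class JitteredThrottler extends BaseThrottler {
  constructor(
    now: Clock,
    sleep: Sleep,
    throttleMs: number,
    private readonly random: () => number = Math.random,
  ) {
    super(now, sleep, throttleMs);
  }

  // Base gap plus up to one second of jitter, mirroring the Glassdoor override.
  protected requiredGapMs(): number {
    return this.throttleMs + Math.floor(this.random() * 1000);
  }
}

// Usage with a fake clock and a fixed random value so the example runs instantly.
let fakeTime = 5000;
const waits: number[] = [];
const throttler = new JitteredThrottler(
  () => fakeTime,
  async ms => { waits.push(ms); fakeTime += ms; },
  3000,
  () => 0.5,
);
const throttleDone = throttler.throttle().then(() => throttler.throttle());
```

With this shape there is no second wait and no duplicate timestamp update, which are the risks when an override both sets `lastRequestTime` and then delegates to the base method.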

services/agent-orchestrator/src/adapters/glassdoor-adapter.ts

services/agent-orchestrator/src/adapters/platform-manager.ts

services/platform-adapters/base/BaseJobScraper.ts

vercel · 2025-11-26T20:45:21Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Review	Updated (UTC)
ajob-4-agent	Ready	Preview, Comment	Dec 20, 2025 2:11am

groupthinking · 2025-11-26T20:49:10Z

@copilot apply changes based on the comments in this thread

services/platform-adapters/glassdoor/src/GlassdoorJobScraper.ts

services/agent-orchestrator/src/adapters/platform-manager.ts

Copilot · 2025-11-26T20:54:43Z

@copilot apply changes based on the comments in this thread

Applied all code review changes in commit d54b047:

Updated User-Agent to Chrome/124, made configurable via SCRAPER_USER_AGENT env var
Removed duplicate playwright from dependencies (kept only in peerDependencies)
Renamed parseBasicInfoFromDetailPage → parseJobFromDetailPage with clarified docs
Refactored throttle() to properly call super.throttle()
Refactored search() to use new beforeParseJobList() hook instead of duplicating base class logic
Added thread-safety documentation to GlassdoorAdapter
Improved error logging to include full stack traces

All 88 tests still passing.

vercel · 2025-11-27T03:56:14Z

services/agent-orchestrator/src/adapters/glassdoor-adapter.ts

+    const searchParams: Record<string, string> = {};
+
+    if (params.searchTerm) {
+      searchParams.sc = 'keyword=' + encodeURIComponent(params.searchTerm);


The search parameter construction double-encodes the search term, causing malformed query parameters that Glassdoor won't understand.

View Details

📝 Patch Details

diff --git a/services/agent-orchestrator/src/adapters/glassdoor-adapter.ts b/services/agent-orchestrator/src/adapters/glassdoor-adapter.ts
index 45e1593..5e70728 100644
--- a/services/agent-orchestrator/src/adapters/glassdoor-adapter.ts
+++ b/services/agent-orchestrator/src/adapters/glassdoor-adapter.ts
@@ -166,7 +166,7 @@ export class GlassdoorAdapter {
     const searchParams: Record<string, string> = {};
 
     if (params.searchTerm) {
-      searchParams.sc = 'keyword=' + encodeURIComponent(params.searchTerm);
+      searchParams.sc = 'keyword=' + params.searchTerm;
     }
 
     if (params.location) {
diff --git a/services/platform-adapters/glassdoor/src/utils/urlBuilder.ts b/services/platform-adapters/glassdoor/src/utils/urlBuilder.ts
index 83a9553..6776cb7 100644
--- a/services/platform-adapters/glassdoor/src/utils/urlBuilder.ts
+++ b/services/platform-adapters/glassdoor/src/utils/urlBuilder.ts
@@ -55,7 +55,7 @@ export function buildGlassdoorSearchUrl(filters: GlassdoorSearchFilters): string
 
   // Keywords/search term
   if (filters.keywords) {
-    params.sc = 'keyword=' + encodeURIComponent(filters.keywords);
+    params.sc = 'keyword=' + filters.keywords;
   }
 
   // Location handling

Analysis

Double-encoding of search parameters causes malformed Glassdoor API requests

What fails: Search parameter construction in GlassdoorAdapter.buildSearchParams() and buildGlassdoorSearchUrl() double-encodes the search term, resulting in malformed query parameters that Glassdoor cannot parse.

How to reproduce:

// In services/platform-adapters/glassdoor/src/utils/urlBuilder.ts
const filters = { keywords: 'software engineer' };
const url = buildGlassdoorSearchUrl(filters);
// Result: ?sc=keyword%3Dsoftware%2520engineer

What happens: When axios/URL API receives a pre-encoded parameter value like sc='keyword=software%20engineer', it automatically URL-encodes all parameter values again, resulting in:

Original space: " " → encoded to %20 → re-encoded to %2520

Original = sign: "=" → encoded to %3D

Final URL: ?sc=keyword%3Dsoftware%2520engineer

Glassdoor receives: sc parameter value = "keyword=software%20engineer" (with literal %20 instead of space)

Expected behavior: axios should handle all URL encoding automatically. Parameters passed to axios should NOT be pre-encoded. Let axios/URL API do the single encoding pass:

With fix: ?sc=keyword%3Dsoftware+engineer

Glassdoor receives: sc parameter value = "keyword=software engineer" (properly decoded)

Fix applied: Removed encodeURIComponent() calls in both:

services/agent-orchestrator/src/adapters/glassdoor-adapter.ts line 169

services/platform-adapters/glassdoor/src/utils/urlBuilder.ts line 58

Changed from: searchParams.sc = 'keyword=' + encodeURIComponent(params.searchTerm); Changed to: searchParams.sc = 'keyword=' + params.searchTerm;

This allows axios to perform a single, correct URL encoding rather than double-encoding the parameter values.

Verification: All existing tests pass, URL now correctly generates sc=keyword%3Dsoftware+engineer instead of sc=keyword%3Dsoftware%2520engineer
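
The double-encoding described in this analysis can be reproduced without the adapter or axios. The snippet below is a standalone sketch using only the built-in `encodeURIComponent` and `URLSearchParams`; it does not call any code from this PR. It also shows why the single-encoded result may appear with either `%20` or `+` for the space, depending on the serializer.

```typescript
const term = 'software engineer';

// Bug pattern: the value is encoded once by hand, then again by the serializer.
const preEncoded = 'keyword=' + encodeURIComponent(term);
const doubleEncoded = `sc=${encodeURIComponent(preEncoded)}`;

// Fix pattern: pass the raw value and let the serializer encode it exactly once.
const singleEncoded = `sc=${encodeURIComponent('keyword=' + term)}`;

// Form-style serializers (URLSearchParams here) write the space as "+" instead of "%20".
const formEncoded = new URLSearchParams({ sc: 'keyword=' + term }).toString();

console.log(doubleEncoded); // sc=keyword%3Dsoftware%2520engineer
console.log(singleEncoded); // sc=keyword%3Dsoftware%20engineer
console.log(formEncoded);   // sc=keyword%3Dsoftware+engineer
```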

vercel · 2025-11-27T03:56:14Z

services/platform-adapters/glassdoor/src/utils/urlBuilder.ts

+    params.sc = 'keyword=' + encodeURIComponent(filters.keywords);
+  }
+
+  // Location handling
+  if (filters.location) {
+    params.locT = 'C'; // City type
+    params.locKeyword = filters.location;
+  }
+
+  // Remote work filter
+  if (filters.remote) {
+    params.remoteWorkType = '1';
+  }
+
+  // Salary filter
+  if (filters.salary?.min) {
+    params.minSalary = filters.salary.min.toString();
+  }
+  if (filters.salary?.max) {
+    params.maxSalary = filters.salary.max.toString();
+  }
+
+  // Date posted filter
+  const fromAge = mapDatePosted(filters.datePosted);
+  if (fromAge) {
+    params.fromAge = fromAge;
+  }
+
+  // Easy Apply filter
+  if (filters.easyApplyOnly) {
+    params.applicationType = '1';
+  }
+
+  // Company rating filter (minimum stars)
+  if (
+    filters.companyRating !== undefined &&
+    filters.companyRating !== null &&
+    filters.companyRating >= 1 &&
+    filters.companyRating <= 5
+  ) {
+    params.minRating = filters.companyRating.toString();
+  }
+
+  // Company size filter
+  const employeeCount = mapCompanySize(filters.companySize);
+  if (employeeCount) {
+    params.employerSizes = employeeCount;
+  }
+
+  // Job type filter
+  const jobType = mapJobType(filters.jobType);
+  if (jobType) {
+    params.jobType = jobType;
+  }
+
+  // Industry filter
+  if (filters.industry) {
+    params.industry = filters.industry;
+  }
+
+  // Company name filter
+  if (filters.companyName) {
+    params.employer = filters.companyName;
+  }
+
+  // Build query string
+  const queryString = Object.entries(params)
+    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
+    .join('&');


The URL builder double-encodes the search term by first encoding it, then encoding it again when building the final query string.

View Details

📝 Patch Details

diff --git a/services/agent-orchestrator/src/adapters/glassdoor-adapter.ts b/services/agent-orchestrator/src/adapters/glassdoor-adapter.ts
index 45e1593..5e70728 100644
--- a/services/agent-orchestrator/src/adapters/glassdoor-adapter.ts
+++ b/services/agent-orchestrator/src/adapters/glassdoor-adapter.ts
@@ -166,7 +166,7 @@ export class GlassdoorAdapter {
     const searchParams: Record<string, string> = {};
 
     if (params.searchTerm) {
-      searchParams.sc = 'keyword=' + encodeURIComponent(params.searchTerm);
+      searchParams.sc = 'keyword=' + params.searchTerm;
     }
 
     if (params.location) {
diff --git a/services/platform-adapters/glassdoor/src/utils/urlBuilder.ts b/services/platform-adapters/glassdoor/src/utils/urlBuilder.ts
index 83a9553..6776cb7 100644
--- a/services/platform-adapters/glassdoor/src/utils/urlBuilder.ts
+++ b/services/platform-adapters/glassdoor/src/utils/urlBuilder.ts
@@ -55,7 +55,7 @@ export function buildGlassdoorSearchUrl(filters: GlassdoorSearchFilters): string
 
   // Keywords/search term
   if (filters.keywords) {
-    params.sc = 'keyword=' + encodeURIComponent(filters.keywords);
+    params.sc = 'keyword=' + filters.keywords;
   }
 
   // Location handling

Analysis

Double-encoding of search terms in Glassdoor URL builder

What fails: The URL builder pre-encodes the search term keyword using encodeURIComponent(), then encodes it again when building the query string. This results in %20 (space character) being double-encoded as %2520, which malforms the search parameters sent to Glassdoor.

How to reproduce:

import { buildGlassdoorSearchUrl } from './services/platform-adapters/glassdoor/src/utils/urlBuilder';

const filters = { keywords: 'python developer' };
const url = buildGlassdoorSearchUrl(filters);
console.log(url);
// Actual: https://www.glassdoor.com/Job/jobs.htm?sc=keyword%3Dpython%2520developer
// Expected: https://www.glassdoor.com/Job/jobs.htm?sc=keyword%3Dpython%20developer

What happens vs. expected behavior:

Current (broken): sc=keyword%3Dpython%2520developer - the space becomes %2520 (double-encoded)

When decoded once: keyword=python%20developer (still has encoded space)

Expected: sc=keyword%3Dpython%20developer - space should be %20 only (single-encoded)

Root cause: At line 58 of services/platform-adapters/glassdoor/src/utils/urlBuilder.ts, the code pre-encodes the search term:

params.sc = 'keyword=' + encodeURIComponent(filters.keywords);

Then at lines 124-126, it encodes all parameter values again:

const queryString = Object.entries(params)
  .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
  .join('&');

This causes double-encoding of the entire keyword= parameter value. Since axios automatically applies encodeURIComponent to param values, the same issue exists in services/agent-orchestrator/src/adapters/glassdoor-adapter.ts at line 169.

Fix: Remove the pre-encoding at the parameter construction stage and let the query string builder handle all encoding:

In services/platform-adapters/glassdoor/src/utils/urlBuilder.ts (line 58):

params.sc = 'keyword=' + filters.keywords; // Remove encodeURIComponent

In services/agent-orchestrator/src/adapters/glassdoor-adapter.ts (line 169):

searchParams.sc = 'keyword=' + params.searchTerm; // Remove encodeURIComponent

Both locations use the same pattern and require the same fix.
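
The fix amounts to keeping raw values in the params object and encoding at exactly one point. A minimal sketch of that idea follows; `buildQueryString` is a hypothetical helper written for illustration, not the PR's `buildGlassdoorSearchUrl`, and the parameter values are made up.

```typescript
// Hypothetical helper: the only place where encoding happens.
function buildQueryString(params: Record<string, string>): string {
  return Object.entries(params)
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join('&');
}

// Callers store raw, unencoded values.
const params: Record<string, string> = {
  sc: 'keyword=' + 'python developer',
  locKeyword: 'San Francisco, CA',
};

const query = buildQueryString(params);
console.log(query);
// sc=keyword%3Dpython%20developer&locKeyword=San%20Francisco%2C%20CA

// Round trip: decoding once recovers the raw value, so nothing was encoded twice.
const roundTripped = new URLSearchParams(query).get('sc');
console.log(roundTripped); // keyword=python developer
```

A round-trip check like the last two lines is a cheap regression test for this class of bug: if a value comes back still containing `%20`, it was encoded more than once.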

groupthinking · 2025-12-20T02:05:46Z

@copilot Fix this repo

Copilot · 2025-12-20T02:07:21Z

@copilot Fix this repo

I see build errors in both services. The main issues are:

Glassdoor adapter:

Missing dependencies: playwright, cheerio, @types/node
TypeScript configuration needs adjustment (lib settings for DOM/ES2019+)

Agent orchestrator:

Missing type definitions: @types/node, @types/jest
Missing dependencies: axios, cheerio
Similar TypeScript lib configuration issues

Would you like me to:

Fix all missing dependencies and type definitions?
Update TypeScript configurations to resolve lib/target issues?
Address any specific functionality issues?

Or should I focus on a specific area?

vercel · 2025-12-20T02:19:08Z

services/platform-adapters/base/BaseJobScraper.ts

  /**
-   * Export jobs to JSON file
+   * Search for jobs with the given filters
   */
-  async exportToJson(jobs: Job[], filePath: string): Promise<void> {
-    const fs = await import('fs/promises');
-    const path = await import('path');
+  async search(filters: F): Promise<T[]> {
+    await this.initialize();
+
+    const page = await this.newPage();
+    const allJobs: T[] = [];
+
+    try {
+      const searchUrl = this.buildSearchUrl(filters);
+      await this.throttle();
+      await page.goto(searchUrl, { 
+        waitUntil: 'networkidle',
+        timeout: this.config.timeout 
+      });

-    // Ensure directory exists
-    const dir = path.dirname(filePath);
-    await fs.mkdir(dir, { recursive: true });
+      let pageNum = 1;
+      const maxResults = this.config.maxResults || 50;

-    // Write jobs to file
+      while (allJobs.length < maxResults) {
+        const jobs = await this.parseJobList(page);
+        allJobs.push(...jobs);
+
+        if (allJobs.length >= maxResults) {
+          break;
+        }
+
+        const hasMore = await this.hasNextPage(page);
+        if (!hasMore) {
+          break;
+        }
+
+        await this.throttle();
+        await this.goToNextPage(page);
+        pageNum++;
+      }
+
+      return allJobs.slice(0, maxResults);
+
+    } finally {
+      await page.close();
+    }
+  }


The shared BaseJobScraper class is missing the beforeParseJobList() hook and its integration into the search() method, making it architecturally incomplete compared to the improved version in the Glassdoor module.

View Details

📝 Patch Details

diff --git a/services/platform-adapters/base/BaseJobScraper.ts b/services/platform-adapters/base/BaseJobScraper.ts
index 4f7c267..9c4ddb8 100644
--- a/services/platform-adapters/base/BaseJobScraper.ts
+++ b/services/platform-adapters/base/BaseJobScraper.ts
@@ -90,6 +90,14 @@ export abstract class BaseJobScraper<T extends Job = Job, F extends SearchFilter
     this.lastRequestTime = Date.now();
   }
 
+  /**
+   * Hook called before parsing job list on each page.
+   * Subclasses can override to add platform-specific behavior (e.g., dismissing modals).
+   */
+  protected async beforeParseJobList(_page: Page): Promise<void> {
+    // Default implementation does nothing
+  }
+
   /**
    * Search for jobs with the given filters
    */
@@ -107,10 +115,16 @@ export abstract class BaseJobScraper<T extends Job = Job, F extends SearchFilter
         timeout: this.config.timeout
       });
 
+      // Allow subclasses to perform platform-specific setup
+      await this.beforeParseJobList(page);
+
       let pageNum = 1;
       const maxResults = this.config.maxResults || 50;
 
       while (allJobs.length < maxResults) {
+        // Allow subclasses to handle modals or other interruptions before parsing
+        await this.beforeParseJobList(page);
+
         const jobs = await this.parseJobList(page);
         allJobs.push(...jobs);

Analysis

Missing beforeParseJobList() hook in root BaseJobScraper class

What fails: The root BaseJobScraper class in services/platform-adapters/base/BaseJobScraper.ts is missing the beforeParseJobList() extension hook and its integration into the search() method, causing architectural inconsistency with the improved Glassdoor adapter implementation.

How to reproduce: Compare the two files:

Root base: services/platform-adapters/base/BaseJobScraper.ts

Glassdoor base: services/platform-adapters/glassdoor/src/base/BaseJobScraper.ts

The Glassdoor version includes (lines 93-136 in complete form):

A protected beforeParseJobList(_page: Page): Promise<void> hook method with default no-op implementation

A call to await this.beforeParseJobList(page) after the initial page navigation

A call to await this.beforeParseJobList(page) before each iteration of the job parsing loop

The root base class was missing all three elements.

Result before fix: Subclasses using the root base class cannot override beforeParseJobList() to handle platform-specific setup (like dismissing modals) without reimplementing the entire search() method.

Expected behavior: Both base classes should be architecturally identical, providing the same extension points. The beforeParseJobList() hook allows subclasses to add platform-specific behavior (demonstrated in GlassdoorJobScraper.beforeParseJobList() which dismisses modals) without needing to override the entire search() method.

Fix applied: Added the missing hook method and integrated it into the search() method at two points:

After initial page load (allows platform-specific setup before first parse)

Before each parsing loop iteration (allows cleanup between pages, e.g., dismissing modals)

This provides a consistent, extensible base class for all platform adapters and matches the architectural pattern established in the Glassdoor implementation.
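
The hook pattern described in this analysis can be sketched in isolation. The example below replaces Playwright's `Page` with a hypothetical `FakePage` object and uses simplified, made-up class names (`SketchScraper`, `ModalAwareScraper`), so it illustrates the extension point only and is not the PR's actual `BaseJobScraper`.

```typescript
// Hypothetical stand-in for a browser page; not Playwright's Page type.
interface FakePage {
  modalOpen: boolean;
  jobs: string[];
}

abstract class SketchScraper {
  // Default hook does nothing; subclasses may override it.
  protected async beforeParseJobList(_page: FakePage): Promise<void> {}

  protected abstract parseJobList(page: FakePage): Promise<string[]>;

  // The base class owns the loop and calls the hook before every parse.
  async search(pages: FakePage[]): Promise<string[]> {
    const allJobs: string[] = [];
    for (const page of pages) {
      await this.beforeParseJobList(page);
      allJobs.push(...(await this.parseJobList(page)));
    }
    return allJobs;
  }
}

class ModalAwareScraper extends SketchScraper {
  // Platform-specific step: dismiss the modal that would hide the listings.
  protected async beforeParseJobList(page: FakePage): Promise<void> {
    page.modalOpen = false;
  }

  // Listings are only visible once the modal is gone.
  protected async parseJobList(page: FakePage): Promise<string[]> {
    return page.modalOpen ? [] : page.jobs;
  }
}

// Usage: both pages start with a modal open; the hook clears it before parsing.
const pages: FakePage[] = [
  { modalOpen: true, jobs: ['Backend Engineer'] },
  { modalOpen: true, jobs: ['Data Analyst', 'SRE'] },
];
const found = new ModalAwareScraper().search(pages);
```

The subclass never touches `search()`, which is the point of the hook: pagination and result limits stay in one place while each platform supplies only its own pre-parse step.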

Initial plan

3227f2d

Copilot AI assigned Copilot and groupthinking Nov 25, 2025

Copilot started work on behalf of groupthinking November 25, 2025 23:28 View session

Copilot AI and others added 3 commits November 25, 2025 23:44

Add Glassdoor platform adapter with scraper, parsers, and tests

e9c8aee

Co-authored-by: groupthinking <[email protected]>

Add Glassdoor adapter integration to agent-orchestrator

78a633a

Co-authored-by: groupthinking <[email protected]>

Fix hardcoded currency in single-value salary parsing

038528a

Co-authored-by: groupthinking <[email protected]>

Copilot AI changed the title ~~[WIP] Add Glassdoor platform adapter for job scraping~~ Add Glassdoor platform adapter for job scraping Nov 25, 2025

Copilot AI requested a review from groupthinking November 25, 2025 23:52

Copilot finished work on behalf of groupthinking November 25, 2025 23:52

groupthinking approved these changes Nov 26, 2025

groupthinking marked this pull request as ready for review November 26, 2025 20:39

Copilot AI review requested due to automatic review settings November 26, 2025 20:39

Copilot started reviewing on behalf of groupthinking November 26, 2025 20:40 View session

Copilot finished reviewing on behalf of groupthinking November 26, 2025 20:42

chatgpt-codex-connector bot reviewed Nov 26, 2025

Copilot AI reviewed Nov 26, 2025

Update services/agent-orchestrator/src/adapters/platform-manager.ts

ab647c0

Co-authored-by: Copilot <[email protected]>

Update services/platform-adapters/base/BaseJobScraper.ts

449cc62

Co-authored-by: Copilot <[email protected]>

vercel bot deployed to Preview November 26, 2025 20:46 View deployment

Update services/platform-adapters/glassdoor/src/utils/urlBuilder.ts

7defd34

Co-authored-by: Copilot <[email protected]>

vercel bot deployed to Preview November 26, 2025 20:49 View deployment

Copilot started work on behalf of groupthinking November 26, 2025 20:49 View session

Address code review feedback: update User-Agent, fix method naming, r…

d54b047

…efactor search() to use hooks, add thread-safety docs Co-authored-by: groupthinking <[email protected]>

vercel bot reviewed Nov 26, 2025

vercel bot deployed to Preview November 26, 2025 20:55 View deployment

Fix race condition in throttle by using local variable instead of mod…

63c50b1

…ifying shared config Co-authored-by: groupthinking <[email protected]>

vercel bot deployed to Preview November 26, 2025 20:58 View deployment

Copilot AI requested a review from groupthinking November 26, 2025 20:58

Copilot finished work on behalf of groupthinking November 26, 2025 20:58

groupthinking and others added 2 commits November 26, 2025 21:28

Update types.ts

8e5f248

Co-authored-by: Copilot <[email protected]>

Update BaseJobScraper.ts

c6d2e31

Co-authored-by: Copilot <[email protected]>

vercel bot deployed to Preview November 27, 2025 03:30 View deployment

Copilot started work on behalf of groupthinking November 27, 2025 03:32 View session

Refactor glassdoor-adapter to use cheerio for robust HTML parsing ins…

530e95c

…tead of fragile regex patterns Co-authored-by: groupthinking <[email protected]>

vercel bot deployed to Preview November 27, 2025 03:41 View deployment

Copilot finished work on behalf of groupthinking November 27, 2025 03:42

Copilot started work on behalf of groupthinking November 27, 2025 03:42 View session

Copilot AI and others added 2 commits November 27, 2025 03:47

Fix race condition in rate limiting with mutex pattern for thread-saf…

2dca7cc

…e concurrent access Co-authored-by: groupthinking <[email protected]>

Update GlassdoorJobScraper.ts

06e1b71

Co-authored-by: vercel[bot] <35613825+vercel[bot]@users.noreply.github.com>

vercel bot deployed to Preview November 27, 2025 03:48 View deployment

Copilot finished work on behalf of groupthinking November 27, 2025 03:48

groupthinking previously approved these changes Nov 27, 2025

vercel bot reviewed Nov 27, 2025

Copilot started work on behalf of groupthinking December 20, 2025 02:06 View session

Fix TypeScript config and dependencies: add DOM lib, ES2020 target, a…

66649ef

…nd playwright dev dependency Co-authored-by: groupthinking <[email protected]>

Copilot AI requested a review from groupthinking December 20, 2025 02:11

Copilot finished work on behalf of groupthinking December 20, 2025 02:11

vercel bot deployed to Preview December 20, 2025 02:11 View deployment

vercel bot reviewed Dec 20, 2025

groupthinking approved these changes Dec 22, 2025

	// Phase 1: JobSpy platforms (4 core platforms)
	const jobSpyPlatforms = params.platforms.filter(p =>
	['indeed', 'linkedin', 'glassdoor', 'ziprecruiter'].includes(p)
	);

		const errorMessage = error instanceof Error ? error.message : 'Unknown error';
		console.error('❌ Glassdoor search failed:', errorMessage);
