fix(reward): add missing detail page patterns for 4 plugins #17

Merged
angosr merged 1 commit into AffineFoundation:main from bittoby:fix/add-missing-detail-page-reward-patterns
Apr 10, 2026

@bittoby bittoby commented Mar 27, 2026

Problem

DETAIL_PAGE_PATTERNS in core/reward.py only covers the 4 original plugins (CoinGecko, Stooq, Taostats, Weather). The 4 newer plugins — ArXiv (#9), OpenLibrary (#2), Open-Meteo (#5), and HackerNews — have no detail page patterns at all.

This means:

  • is_detail_page() always returns False for these plugins
  • The DETAIL_PAGE_VISIT reward signal (+0.03) never fires
  • _extract_asset_from_url() returns None, so no asset confirmation tracking occurs
  • _normalize_url() strips query params that identify resources on HackerNews (?id=) and Open-Meteo (?latitude=&longitude=), causing false "repeated URL" penalties

Impact: Agents browsing detail pages on 4 out of 8 active plugins receive no exploration reward for drilling into content. This silently degrades RL training quality for half the plugin portfolio.

Summary

  • Added detail page regex patterns to DETAIL_PAGE_PATTERNS for all 4 missing plugins
  • Added asset ID extraction to _extract_asset_from_url() for the same plugins
  • Added query param preservation in _normalize_url() for HackerNews and Open-Meteo

What changed

| Plugin | Detail page pattern | Asset ID example |
|---|---|---|
| ArXiv | /abs/2603.16870 | 2603.16870 |
| OpenLibrary | /works/OL103123W/Dune | ol103123w |
| Open-Meteo | /en/docs?latitude=35.68&longitude=139.65 | 35.68,139.65 |
| HackerNews | /item?id=12345 | 12345 |

Each pattern was verified against the actual plugin URL parsing logic (arxiv.py, openlibrary.py, openmeteo.py, hackernews.py) and live website URL structures.
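
For readers without the diff at hand, the table above could be realized roughly as follows. This is a minimal sketch, not the merged code: the dictionary name `NEW_DETAIL_PAGE_PATTERNS`, the host keys, and the exact regexes are assumptions chosen to match the example URLs, and the real entries in core/reward.py may be shaped differently.

```python
import re
from urllib.parse import urlparse

# Hypothetical patterns (assumption): each regex is matched against
# "path?query" of the visited URL, keyed by host name.
NEW_DETAIL_PAGE_PATTERNS = {
    "arxiv.org": re.compile(r"^/abs/\d{4}\.\d{4,5}(v\d+)?/?$"),
    "openlibrary.org": re.compile(r"^/works/OL\d+W(/[^?]*)?$", re.IGNORECASE),
    "open-meteo.com": re.compile(
        r"^/en/docs\?"
        r"(?=.*\blatitude=-?\d+(\.\d+)?)"
        r"(?=.*\blongitude=-?\d+(\.\d+)?)"
    ),
    "news.ycombinator.com": re.compile(r"^/item\?(.*&)?id=\d+(&|$)"),
}


def is_detail_page(url: str) -> bool:
    """Return True if the URL looks like a detail page for a known host."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    pattern = NEW_DETAIL_PAGE_PATTERNS.get(host)
    if pattern is None:
        # Unknown host: no pattern, so the visit never counts as a detail page.
        return False
    target = parsed.path + (f"?{parsed.query}" if parsed.query else "")
    return pattern.search(target) is not None
```

The `pattern is None` branch is the situation described under Problem: with no entry for a host, every URL on that host is classified as a non-detail page.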

Test plan

  • 30 existing reward tests pass, zero regressions
  • Manual verification: is_detail_page() returns True for detail URLs, False for list/search/homepage URLs across all 4 plugins
  • _extract_asset_from_url() correctly extracts asset IDs from all new URL formats
  • _normalize_url() preserves query params that identify resources (HN id, Open-Meteo latitude/longitude)

bittoby commented Mar 31, 2026

@angosr, please review this PR. Thanks.

bittoby commented Apr 7, 2026

Hi @angosr, are there any updates, please? It has been a long time since I submitted this PR.

@angosr angosr left a comment

Review: PR #17 — APPROVE

Significance Gate: PASS

Fixes a real bug: 4 of 8 active plugins had no detail page patterns, meaning agents received zero exploration reward for visiting detail pages on ArXiv, OpenLibrary, Open-Meteo, and HackerNews. This silently degrades RL training quality.

Independent verification (not relying on PR's own test claims)

1. Regex patterns — 10/10 correct (tested independently)

| URL | Expected result |
|---|---|
| arxiv.org/abs/2603.16870 | detail=True |
| arxiv.org/abs/2604.00005 (5-digit) | detail=True |
| arxiv.org/abs/2603.16870v2 (version suffix) | detail=True |
| arxiv.org/pdf/2603.16870 (PDF, not abs) | detail=False |
| arxiv.org/list/cs.AI/new (listing) | detail=False |
| openlibrary.org/works/OL103123W/Dune | detail=True |
| openlibrary.org/search?q=dune | detail=False |
| open-meteo.com/en/docs?latitude=-33.87&longitude=151.21 (negative coords) | detail=True |
| open-meteo.com/en/docs (no params) | detail=False |
| news.ycombinator.com/item?id=12345&goto=news (extra params) | detail=True |

2. URL normalization — preserves essential params, strips noise

  • HN ?id=12345&p=2 → ?id=12345
  • OpenMeteo ?latitude=35.68&longitude=139.65&hourly=temp → ?latitude=35.68&longitude=139.65
  • OpenMeteo missing longitude → falls through to default (strips all params) ✅
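
The normalization behaviour listed above can be sketched as below. The function name `normalize_url` (without the leading underscore), the `IDENTIFYING_PARAMS` table, and the trailing-slash handling are assumptions made for illustration; the actual _normalize_url() in core/reward.py may normalize other parts of the URL as well.

```python
from urllib.parse import parse_qs, urlencode, urlparse, urlunparse

# Hypothetical table (assumption): host -> query params that identify a resource.
IDENTIFYING_PARAMS = {
    "news.ycombinator.com": ("id",),
    "open-meteo.com": ("latitude", "longitude"),
}


def normalize_url(url: str) -> str:
    """Drop the query string unless it carries every identifying param."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    query = parse_qs(parsed.query)
    keep = IDENTIFYING_PARAMS.get(host, ())
    kept = ""
    if keep and all(name in query for name in keep):
        # Keep only the identifying params, in a fixed order.
        kept = urlencode([(name, query[name][0]) for name in keep])
    # Otherwise fall through to the default: strip all params.
    return urlunparse((parsed.scheme, host, parsed.path.rstrip("/"), "", kept, ""))
```

Under this sketch two different HackerNews items normalize to different strings, so visiting the second one would no longer look like a repeated URL.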

3. Asset extraction — correct for all edge cases

  • ArXiv v2 suffix → extracts 2603.16870 (strips version) ✅
  • OpenMeteo negative coords → extracts -33.87,151.21
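
A rough sketch of asset extraction that reproduces these edge cases is shown here. The function name, the per-host branching, and the regexes are assumptions; only the example inputs and outputs come from this PR.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_asset_from_url(url: str) -> Optional[str]:
    """Return an asset ID for a detail URL, or None (hypothetical sketch)."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    query = parse_qs(parsed.query)

    if host == "arxiv.org":
        # Capture the paper ID and discard an optional version suffix (v2, v3, ...).
        m = re.match(r"^/abs/(\d{4}\.\d{4,5})(?:v\d+)?/?$", parsed.path)
        return m.group(1) if m else None
    if host == "openlibrary.org":
        # Work IDs are lowercased, as in the "ol103123w" example.
        m = re.match(r"^/works/(OL\d+W)(?:/|$)", parsed.path, re.IGNORECASE)
        return m.group(1).lower() if m else None
    if host == "open-meteo.com":
        lat, lon = query.get("latitude"), query.get("longitude")
        if parsed.path.startswith("/en/docs") and lat and lon:
            return f"{lat[0]},{lon[0]}"
        return None
    if host == "news.ycombinator.com":
        ids = query.get("id")
        if parsed.path == "/item" and ids and ids[0].isdigit():
            return ids[0]
        return None
    return None
```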

4. Existing tests: 30/30 passed, zero regressions

Minor note (non-blocking)

No new automated tests added for the 4 new plugins' patterns. The existing 30 tests only cover the original 4 plugins. Recommend adding parametrized tests in a follow-up, but the patterns are simple regex additions following the established pattern and verified above.
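
One possible shape for the recommended follow-up is sketched here, using the standard library's `unittest.subTest` in place of a pytest parametrization so the example has no third-party dependency. The `is_detail_page` defined in the block is a stand-in covering only ArXiv so that the sketch runs on its own; a real test would import the function from core/reward.py (the import path is an assumption), and the test class name is hypothetical.

```python
import re
import unittest

# Stand-in (assumption) so the sketch is self-contained; replace with the
# real is_detail_page from core/reward.py in an actual test module.
_ARXIV_ABS = re.compile(r"^https?://arxiv\.org/abs/\d{4}\.\d{4,5}(v\d+)?/?$")


def is_detail_page(url: str) -> bool:
    return _ARXIV_ABS.match(url) is not None


# (url, expected) pairs taken from the ArXiv rows of the review table.
ARXIV_CASES = [
    ("https://arxiv.org/abs/2603.16870", True),
    ("https://arxiv.org/abs/2604.00005", True),
    ("https://arxiv.org/abs/2603.16870v2", True),
    ("https://arxiv.org/pdf/2603.16870", False),
    ("https://arxiv.org/list/cs.AI/new", False),
]


class DetailPagePatternTests(unittest.TestCase):
    def test_arxiv_urls(self):
        for url, expected in ARXIV_CASES:
            # subTest reports each URL separately instead of stopping at the first failure.
            with self.subTest(url=url):
                self.assertEqual(is_detail_page(url), expected)
```

Saved as a test module, this can be run with `python -m unittest`; the same table-driven layout extends to the OpenLibrary, Open-Meteo, and HackerNews rows by adding one case list per plugin.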

bittoby commented Apr 9, 2026

@angosr Thanks for approving. Would you merge this PR?

@angosr angosr merged commit 18ce60d into AffineFoundation:main Apr 10, 2026
@bittoby bittoby deleted the fix/add-missing-detail-page-reward-patterns branch April 10, 2026 03:22