
feature: implement crawler detail page#24

Merged
gignac-cha merged 4 commits into main from feature/crawler-detail-page
Apr 20, 2026

Conversation

@gignac-cha
Owner

@gignac-cha gignac-cha commented Apr 20, 2026

Summary

  • /crawlers/:id route + CrawlerDetailPage for viewing and editing (name, url_pattern, code, input/output schemas)
  • useGetCrawler + useUpdateCrawler hooks with skipToken-guarded detail query
  • UpdateCrawlerInput as web/data discriminated union matching worker contract
  • CrawlersPage cards keyboard-activatable; delete button stopPropagation + aria-label
  • CodeEditorPanel opt-in showDefaultTemplate (off on detail page)
  • Inline JSON schema validation on blur + DOM-warning-free SchemaHelper
  • 26 new tests (119 total passing)
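The web/data discriminated union matches the interfaces quoted later in the review thread; the payload builder below is a hypothetical sketch of how a save handler might branch on `crawler.type` (the `trim()` follows review finding 4; `buildUpdatePayload` and `FormState` are illustrative names, not the actual code):

```typescript
// Union mirrors the interfaces shown in the review thread; buildUpdatePayload
// is a hypothetical sketch of the save-time branching, not the actual code.
interface UpdateCrawlerWebInput {
  id: string;
  type: 'web';
  name: string;
  url_pattern: string;
  code: string;
  output_schema?: Record<string, unknown>;
}

interface UpdateCrawlerDataInput {
  id: string;
  type: 'data';
  name: string;
  code: string;
  input_schema: Record<string, unknown>;
  output_schema?: Record<string, unknown>;
}

type UpdateCrawlerInput = UpdateCrawlerWebInput | UpdateCrawlerDataInput;

interface FormState {
  name: string;
  url_pattern: string;
  code: string;
  input_schema: Record<string, unknown>;
  output_schema?: Record<string, unknown>;
}

function buildUpdatePayload(
  crawler: { id: string; type: 'web' | 'data' },
  form: FormState,
): UpdateCrawlerInput {
  if (crawler.type === 'web') {
    // Trim so accidental whitespace never becomes part of the stored regex.
    return {
      id: crawler.id,
      type: 'web',
      name: form.name,
      url_pattern: form.url_pattern.trim(),
      code: form.code,
      output_schema: form.output_schema,
    };
  }
  return {
    id: crawler.id,
    type: 'data',
    name: form.name,
    code: form.code,
    input_schema: form.input_schema,
    output_schema: form.output_schema,
  };
}
```

The union makes it a type error to send `url_pattern` for a data crawler or omit `input_schema` for one, matching the worker contract at compile time.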

Linear Issue

Closes TES-27

Changes

| Area | Description |
| --- | --- |
| Application.tsx | Register /crawlers/:id ProtectedRoute |
| use-crawler-manager.ts | useGetCrawler (skipToken), useUpdateCrawler (web/data union), exact invalidation |
| CrawlerDetailPage.tsx | New — form with dirty-tracking, Save/Revert, inline schema validation, local field validation |
| CrawlersPage.tsx | Cards clickable (role/tabIndex/onKeyDown), delete event.stopPropagation |
| CodeEditorPanel.tsx | showDefaultTemplate?: boolean opt-in prop (default true; off on detail page) |
| Tests | 17 hook + 9 page tests: GET/PUT auth headers, type=data, revert, save error, invalid JSON |
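The skipToken guard pairs with a dedicated disabled key so an undefined id can never collide with a real (or empty-string) id. A minimal sketch, assuming the key shapes behind the `crawlerDetailKey` / `CRAWLER_DETAIL_DISABLED_KEY` names from the walkthrough:

```typescript
// Sketch of the detail-key scheme; the exact shapes are assumptions based on
// the crawlerDetailKey / CRAWLER_DETAIL_DISABLED_KEY names in the walkthrough.
const CRAWLER_DETAIL_DISABLED_KEY = ['crawlers', 'detail', 'disabled'] as const;

function crawlerDetailKey(id: string) {
  return ['crawlers', 'detail', 'id', id] as const;
}

// With a naive ['crawlers', 'detail', id ?? ''] key, a missing id would share
// a cache slot with a crawler whose id is the empty string; the sentinel key
// keeps the disabled query in its own slot.
function detailQueryKey(id: string | undefined) {
  return id === undefined ? CRAWLER_DETAIL_DISABLED_KEY : crawlerDetailKey(id);
}
```

In the hook, `id === undefined` would also switch the `queryFn` to react-query's `skipToken` so no request fires at all.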

Review Fixes Applied

Pre-PR internal review (10 subagent passes) addressed:

| # | Fix |
| --- | --- |
| 1 | Form overwrite guard — first-load-only seed, explicit post-save resync |
| 2 | UpdateCrawlerInput discriminated union; client validation for empty name/code/url_pattern |
| 3 | skipToken + dedicated disabled queryKey (prevents empty-string collision) |
| 4 | CodeEditorPanel.showDefaultTemplate opt-in prevents stale Monaco template display |
| 5 | Inline JSON schema validation on blur + aria-invalid + red border |
| 6 | Missing test coverage added (auth headers, type=data, revert, PUT error) |
| 7 | "does not fetch when id is undefined" test: settle delay added to prevent a race false-positive |
| A | SchemaHelper shouldForwardProp filter (DOM warning eliminated) |
| B | Revert/PUT-fail tests strengthened with explicit assertions |
| C | useUpdateCrawler invalidation uses exact: true (no wasted detail refetch) |
| D | input_schema textarea onBlur gated for web (readOnly type) |
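Fix 5's object-only schema check can be sketched as a small parser that rejects primitives, arrays, and null. `tryParseSchema` is named later in the review context; this body is an assumption about its behavior, not the actual implementation:

```typescript
// Hypothetical sketch of the blur-time schema check: only JSON objects pass;
// primitives, arrays, and null are rejected with an inline error message.
type SchemaParseResult =
  | { ok: true; value: Record<string, unknown> }
  | { ok: false; error: string };

function tryParseSchema(text: string): SchemaParseResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(text);
  } catch {
    return { ok: false, error: 'Invalid JSON' };
  }
  if (parsed === null || typeof parsed !== 'object' || Array.isArray(parsed)) {
    return { ok: false, error: 'Schema must be a JSON object' };
  }
  return { ok: true, value: parsed as Record<string, unknown> };
}
```

On failure the page would store the error per field, set `aria-invalid`, and disable Save until the text parses to an object again.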

Test plan

  • pnpm --filter @audio-underview/web test — 119 tests pass
  • Navigate /crawlers, click a card — lands on /crawlers/:id
  • Edit name/url_pattern/code/schemas, Save — server persists, toast confirms
  • Invalid JSON in schema — inline error, Save disabled, no PUT fired
  • Clear name/code/url_pattern → Save disabled
  • Revert after edit → pristine restored, Save and Revert disabled
  • Keyboard: Tab to card → Enter/Space navigates

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • New Features

    • Added a crawler detail page: view, edit, and save/revert individual crawlers
    • Added API integration hooks for fetching and updating a single crawler
  • Improvements

    • Added an option to show the default template in the code editor
    • Improved card navigation and keyboard accessibility (focus/Enter/Space support)
    • Delete button clicks no longer interfere with card navigation
  • Tests

    • Added/updated tests for the detail page, hooks, and runner, with strengthened assertions

gignac-cha and others added 2 commits April 20, 2026 00:37
- AuthenticationContext/use-authentication tests: toBeNull → toBeUndefined for user state
- use-crawler-code-runner tests: toBeNull → toBeUndefined for result/error
- crawler-code-runner-function: rename normalize tests since field is now omitted when undefined
- extensions.ts: use object destructuring for fixture parameter (vitest v4 requirement)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds view and edit for individual crawlers, following the schedulers detail
pattern but adapted to the crawler API's full-replacement PUT semantics.

- /crawlers/:id route and CrawlerDetailPage with name, url_pattern, code, and schemas
- useGetCrawler and useUpdateCrawler hooks with skipToken-guarded detail query
- UpdateCrawlerInput as web/data discriminated union matching worker contract
- CrawlersPage cards become keyboard-activatable with delete stopPropagation
- CodeEditorPanel gains opt-in showDefaultTemplate (off on detail page)
- Inline JSON schema validation on blur and DOM-warning-free SchemaHelper
- 26 new tests: hook auth headers, type=data rendering, revert, save error toast

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Apr 20, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: dad413a9-7bbc-4b92-abcb-2deb49a6cf8c

📥 Commits

Reviewing files that changed from the base of the PR and between 649869e and 9dc5064.

📒 Files selected for processing (3)
  • applications/web/sources/hooks/use-crawler-manager.test.tsx
  • applications/web/sources/pages/CrawlerDetailPage.test.tsx
  • applications/web/sources/pages/CrawlerDetailPage.tsx

Walkthrough

Adds a new crawler detail feature: a protected GET /crawlers/:id route and the CrawlerDetailPage component, hooks for fetching and updating a single crawler (useGetCrawler, useUpdateCrawler), the related request helpers and detail-level cache keys, plus UI and test changes.

Changes

Cohort / File(s) Summary
Routing and detail page
applications/web/sources/Application.tsx, applications/web/sources/pages/CrawlerDetailPage.tsx, applications/web/sources/pages/CrawlerDetailPage.test.tsx
Adds the protected /crawlers/:id route and introduces CrawlerDetailPage. The detail page fetches by id and implements form state management (dirty/pristine), JSON schema validation, save/revert, and loading/error flows; extensive browser tests are added.
Crawler hooks and request helpers
applications/web/sources/hooks/use-crawler-manager.ts, applications/web/sources/hooks/use-crawler-manager.test.tsx
Adds useGetCrawler(id) (conditional single fetch) and useUpdateCrawler() (PUT + cache invalidation/update). Introduces getCrawlerRequest/updateCrawlerRequest, detail query keys (crawlerDetailKey, CRAWLER_DETAIL_DISABLED_KEY), and the update input types. Related unit tests added/extended.
Component changes
applications/web/sources/components/crawlers/CodeEditorPanel.tsx, applications/web/sources/pages/CrawlersPage.tsx
Adds an optional showDefaultTemplate prop to CodeEditorPanel, changing empty-input handling. Converts CrawlerCard to a button role (focus, keyboard navigation, click navigation), stops delete-button click propagation, and adds accessibility attributes.
Test assertion consistency
applications/web/sources/contexts/AuthenticationContext.test.tsx, applications/web/sources/hooks/use-authentication.test.tsx, applications/web/sources/hooks/use-crawler-code-runner.test.tsx, functions/crawler-code-runner-function/tests/index.test.ts
Changes "absent"-value expectations in tests from null to undefined. The code-runner function tests now expect the result field to be omitted when user code returns undefined.
Test infrastructure/fixtures
applications/web/sources/tests/extensions.ts
Changes the worker fixture callback parameter to an empty destructuring ({}) for Vitest v4 compatibility and adds the related ESLint exception.

Sequence Diagram

sequenceDiagram
    participant User as User
    participant Page as CrawlerDetailPage
    participant Hook as useGetCrawler / useUpdateCrawler
    participant Server as API Server
    participant Cache as Query Cache

    User->>Page: Navigate to /crawlers/:id
    activate Page
    Page->>Hook: useGetCrawler(id)
    activate Hook
    Hook->>Server: GET /crawlers/:id
    Server-->>Hook: CrawlerRow
    Hook->>Cache: Populate detail cache
    Hook-->>Page: crawler data
    deactivate Hook
    Page->>User: Render form with data
    deactivate Page

    User->>Page: Edit fields
    Page->>Page: Track dirty & validate JSON
    Page-->>User: Show inline validation

    User->>Page: Click Save
    Page->>Hook: useUpdateCrawler -> updateCrawler(payload)
    activate Hook
    Hook->>Server: PUT /crawlers/:id
    Server-->>Hook: Updated CrawlerRow
    Hook->>Cache: Invalidate list, update detail cache
    Hook-->>Page: updated crawler
    deactivate Hook
    Page->>User: Update form, show success toast

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

🐰 A new crawler I've brought today,
Following the route, I load its id,
And inspect its schemas the strict way.
On save the cache begins to dance,
On revert the form turns clear again. ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped — CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The PR title clearly summarizes the main content of the change set: the new crawler detail page implementation is the primary change, and the title reflects it accurately. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


- CrawlerDetailPage: replace useEffect form seeding with render-phase state
  adjustment (react-hooks/set-state-in-effect rule)
- extensions.ts: disable no-empty-pattern for vitest v4 fixture (runtime
  requires object destructure pattern that the rule forbids)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@gignac-cha
Owner Author

@coderabbitai review

Review Context

This PR adds the /crawlers/:id route and CrawlerDetailPage. The crawler API requires full-replacement PUT semantics (worker validateCrawlerBody + handleUpdateCrawler), so unlike the scheduler's per-field PATCH pattern, it is implemented as a dirty-tracking form with Save/Revert UX. useGetCrawler handles the id-undefined state safely via skipToken plus a dedicated disabled queryKey.

Areas needing careful review

  1. Form state vs react-query cache synchronization — form and pristine are kept as separate state rather than derived. A seededForCrawlerID guard seeds only once per crawler id (render-phase state adjustment pattern). Subsequent external refetches are ignored to protect user edits. Downside: if the same crawler is edited in another tab, the UI silently stays stale (judged out of scope — multi-tab conflict needs a separate design).
  2. Discriminated union (UpdateCrawlerInput) — web requires url_pattern, data requires input_schema. handleSave branches the payload on crawler.type. If a type-mutation feature is added later, a form-owned type will be needed instead of crawler.type.
  3. Invalidation with exact: true — in useUpdateCrawler.onSuccess, removes the waste where the prefix match also invalidated the detail query. The value written via setQueryData is identical to the server response, so no extra refetch is needed.
  4. Inline JSON schema validation — tryParseSchema on blur accepts only object types (rejects primitives/arrays). Because input_schema is readOnly for type === 'web', its onBlur is disabled. The SchemaHelper shouldForwardProp filter prevents a DOM warning.
  5. CodeEditorPanel showDefaultTemplate prop — CrawlerNewPage keeps the default true to provide the new-crawler template; DetailPage sets it to false so that clearing the code while editing leaves an empty state (avoids DEFAULT_CODE confusion).
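Point 1's first-load-only seeding can be reduced to a pure decision step. A React-free sketch (the real component uses render-phase state adjustment; only the `seededForCrawlerID` name comes from the description, the rest is illustrative):

```typescript
// React-free sketch of the seed guard: seed the form exactly once per
// crawler id, then ignore later refetches so in-progress edits survive.
interface DetailFormState {
  seededForCrawlerID?: string;
  form?: { name: string; code: string };
}

function seedStep(
  state: DetailFormState,
  crawler: { id: string; name: string; code: string } | undefined,
): DetailFormState {
  if (crawler !== undefined && state.seededForCrawlerID !== crawler.id) {
    return {
      seededForCrawlerID: crawler.id,
      form: { name: crawler.name, code: crawler.code },
    };
  }
  return state; // already seeded for this id (or still loading): keep edits
}
```

A refetch returning new server data for the same id leaves the state untouched, which is exactly the silent-stale trade-off acknowledged above.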

Known deferred (out of scope for TES-27)

  • Multi-tab concurrent-edit conflict banner — requires tracking updated_at divergence; awaiting a product decision
  • Client-side length caps (name 255, url_pattern 2048, code 1MB) / regex safety pre-validation — the server rejects with a clear 400 error
  • URL-path correctness test via a capturedURL assertion — the current MSW handler match is sufficient signal
  • Card-as-div → <Link> + nested <button> migration — affects both Crawlers and Schedulers; a cross-feature refactor for a separate PR

Internal review summary

Applied after 10 subagent passes: form overwrite guard (render-phase adjustment), discriminated union, skipToken, inline validation, prop-forward filter, exact invalidation, readOnly onBlur gate, strengthened tests (119 tests pass, lint clean).
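The exact-invalidation point hinges on react-query's key matching: by default, invalidateQueries treats the given key as a prefix, so invalidating ['crawlers'] would also hit ['crawlers', 'detail', id]. A simplified model of the two modes (an approximation of react-query's semantics, not its implementation):

```typescript
// Simplified model of query-key matching: exact mode requires full equality,
// while the default mode treats the filter key as a prefix of the stored key.
function matchesQueryKey(
  stored: readonly unknown[],
  filter: readonly unknown[],
  exact: boolean,
): boolean {
  if (exact && stored.length !== filter.length) {
    return false;
  }
  return filter.every((part, index) => stored[index] === part);
}
```

With exact: true the list invalidation no longer reaches the detail key, so the value already written via setQueryData is not immediately refetched.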

@coderabbitai

coderabbitai Bot commented Apr 20, 2026

@gignac-cha Thanks for the detailed review context! I'll review the PR now.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
applications/web/sources/pages/CrawlersPage.tsx (1)

386-412: ⚠️ Potential issue | 🟡 Minor

Nested interactive elements (accessibility) — recommend resolving in the future Link refactor

A native <button> (DeleteButton) is nested inside CrawlerCard, which has role="button". From a WAI-ARIA perspective, nesting interactive elements inside other interactive elements is discouraged, and some screen readers may focus or announce the inner button awkwardly. Per the PR description, the "Card → Link refactor" is deferred out of this PR's scope, so this is left as a note only.

One additional confirmation:

  • Guarding onKeyDown with event.target !== event.currentTarget so that Enter/Space over the delete button does not trigger card navigation is appropriate. For the Space key, the native <button>'s default behavior already fires a click, so the current implementation behaves as intended.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@applications/web/sources/pages/CrawlersPage.tsx` around lines 386 - 412, The
CrawlerCard (role="button") contains an interactive DeleteButton which can
trigger the card's navigation; to avoid nested-interaction issues without a full
Link refactor, update the DeleteButton handlers to stop the card from receiving
events: inside the DeleteButton onClick wrapper (the call to handleDelete) call
event.stopPropagation() before invoking handleDelete, and add an onKeyDown on
DeleteButton that calls event.stopPropagation() (and prevents default for Space
if needed) so keyboard events on the delete button don't bubble to the
CrawlerCard onKeyDown; reference CrawlerCard, onKeyDown, DeleteButton, and
handleDelete to locate and change the handlers.
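The guard the comment describes, letting Enter/Space activate the card only when the event originates on the card itself, boils down to a single predicate (a sketch with a simplified event shape; the handler names come from the comment):

```typescript
// Sketch of the card's onKeyDown guard: navigate only when the key event's
// target IS the card, so keys pressed on the nested delete button bubble
// up without triggering card navigation.
interface SimpleKeyEvent<T> {
  key: string;
  target: T;
  currentTarget: T;
}

function shouldNavigateFromKeyDown<T>(event: SimpleKeyEvent<T>): boolean {
  if (event.target !== event.currentTarget) {
    return false; // key pressed on a nested interactive element
  }
  return event.key === 'Enter' || event.key === ' ';
}
```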
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@applications/web/sources/hooks/use-crawler-manager.test.tsx`:
- Around line 249-268: The test relies on a fixed 50ms setTimeout to wait for
microtasks; replace that with a deterministic assertion using Vitest's waitFor
(or assert immediately) so the test isn't flaky. Specifically, remove the `await
new Promise((resolve) => setTimeout(resolve, 50));` and either immediately
assert `expect(result.current.isLoading).toBe(false)` and
`expect(requestCount).toBe(0)` after `renderHook(() => useGetCrawler(undefined),
{ wrapper: createWrapper() })`, or use `await vi.waitFor(() =>
expect(result.current.isLoading).toBe(false))` and then assert
`expect(requestCount).toBe(0)`; keep references to `useGetCrawler`,
`result.current`, `requestCount`, `worker.use` and `MANAGER_URL` to locate the
test.
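vi.waitFor's advantage over a fixed sleep is that it polls until the assertion passes or a deadline expires, so the test settles as soon as the state is ready and only fails when it genuinely never becomes ready. A minimal stand-alone equivalent (an illustration of the polling idea, not Vitest's implementation):

```typescript
// Minimal polling helper in the spirit of vi.waitFor: retry the assertion
// until it stops throwing, or rethrow its last error at the deadline.
async function waitForAssertion(
  assertion: () => void,
  { timeout = 1000, interval = 20 }: { timeout?: number; interval?: number } = {},
): Promise<void> {
  const deadline = Date.now() + timeout;
  for (;;) {
    try {
      assertion();
      return; // assertion passed: settle immediately, no fixed delay
    } catch (error) {
      if (Date.now() >= deadline) {
        throw error;
      }
      await new Promise((resolve) => setTimeout(resolve, interval));
    }
  }
}
```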

In `@applications/web/sources/hooks/use-crawler-manager.ts`:
- Around line 25-43: The issue is that validateCrawlerBody unconditionally
normalizes input_schema to { body: 'string' } causing handleUpdateCrawler to
overwrite existing schemas for web crawlers; fix by making input_schema optional
on UpdateCrawlerWebInput (add input_schema?: Record<string, unknown>), change
validateCrawlerBody to only apply the default normalization for creation (POST)
or when input_schema is explicitly absent on POST, and update
handleUpdateCrawler to only include input_schema in the DB update if the request
body actually contains input_schema; reference UpdateCrawlerWebInput,
UpdateCrawlerInput, validateCrawlerBody, and handleUpdateCrawler when making
these changes.

In `@applications/web/sources/pages/CrawlerDetailPage.test.tsx`:
- Around line 242-266: Add a recovery path to the test 'shows inline error and
disables save on invalid JSON schema' by, after filling the invalid JSON and
asserting the error and disabled save, replacing the textarea
(page.getByLabelText('Output schema')) with a valid JSON string (e.g. '{}'),
assert the inline error message is gone and the save button
(page.getByRole('button', { name: /Save/ })) becomes enabled, click save, then
assert putCount increments (ensuring the worker.put handler is invoked) to
verify the PUT is performed; update assertions around saveButton and putCount
accordingly.

In `@applications/web/sources/pages/CrawlerDetailPage.tsx`:
- Around line 486-489: The spinner is currently visual-only; update the
CrawlerDetailPage loading render (the isLoading branch that returns
LoadingContainer and Spinner) to expose loading to screen readers by adding an
accessible status element: keep Spinner visually rendered (optionally
aria-hidden="true") and include a text node with role="status" and
aria-live="polite" (or aria-busy on a wrapper) containing a short message like
"Loading..." that is visually hidden but announced by assistive tech; modify the
LoadingContainer/Spinner rendering in CrawlerDetailPage.tsx to include this
accessible status element so screen readers receive the loading state.
- Around line 459-462: The response handling currently unconditionally
overwrites the user's in-progress edits by calling setForm(next) /
setPristine(next) after updateCrawler(payload); instead, only apply the
server-derived state if the current form still matches the submission payload
(or if a matching request/version token is present). Modify the submit flow
around updateCrawler/deriveFormState to compare the live form state (the form
variable/state used when calling updateCrawler) against payload (or use a
requestId/version on payload and response) and only call setForm(next) and
setPristine(next) when they correspond to the same submission; otherwise ignore
the response-derived overwrite to preserve the user's ongoing edits.
- Around line 550-579: When a schema field is edited the existing validation
error should be cleared or revalidated immediately so Save isn't stuck disabled;
update the onChange handlers for SchemaArea (both input_schema and
output_schema) to also clear or re-run validation for the respective
schemaErrors key (e.g. call setSchemaErrors({...schemaErrors, input_schema:
undefined}) or invoke validateSchemaField('input_schema', newValue) and update
schemaErrors accordingly) in addition to updating form via setForm, so hasError
and aria-invalid reflect the live state and the Save button becomes enabled as
soon as the JSON is corrected (before blur).
- Around line 434-455: The payload for web crawlers currently sends the raw
form.url_pattern which can include leading/trailing whitespace even though
validation used form.url_pattern.trim(); update the submit logic in
CrawlerDetailPage (where payload is built for crawler.type === 'web') to send a
trimmed URL pattern (e.g., compute const trimmedUrlPattern =
form.url_pattern.trim() and use that in the payload's url_pattern) so the stored
regex has no accidental whitespace.

---

Outside diff comments:
In `@applications/web/sources/pages/CrawlersPage.tsx`:
- Around line 386-412: The CrawlerCard (role="button") contains an interactive
DeleteButton which can trigger the card's navigation; to avoid
nested-interaction issues without a full Link refactor, update the DeleteButton
handlers to stop the card from receiving events: inside the DeleteButton onClick
wrapper (the call to handleDelete) call event.stopPropagation() before invoking
handleDelete, and add an onKeyDown on DeleteButton that calls
event.stopPropagation() (and prevents default for Space if needed) so keyboard
events on the delete button don't bubble to the CrawlerCard onKeyDown; reference
CrawlerCard, onKeyDown, DeleteButton, and handleDelete to locate and change the
handlers.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 64e1c618-2e54-48a1-9a48-a6a0ae91362b

📥 Commits

Reviewing files that changed from the base of the PR and between 1701ace and 649869e.

📒 Files selected for processing (12)
  • applications/web/sources/Application.tsx
  • applications/web/sources/components/crawlers/CodeEditorPanel.tsx
  • applications/web/sources/contexts/AuthenticationContext.test.tsx
  • applications/web/sources/hooks/use-authentication.test.tsx
  • applications/web/sources/hooks/use-crawler-code-runner.test.tsx
  • applications/web/sources/hooks/use-crawler-manager.test.tsx
  • applications/web/sources/hooks/use-crawler-manager.ts
  • applications/web/sources/pages/CrawlerDetailPage.test.tsx
  • applications/web/sources/pages/CrawlerDetailPage.tsx
  • applications/web/sources/pages/CrawlersPage.tsx
  • applications/web/sources/tests/extensions.ts
  • functions/crawler-code-runner-function/tests/index.test.ts

Comment thread applications/web/sources/hooks/use-crawler-manager.test.tsx
Comment on lines +25 to +43
interface UpdateCrawlerWebInput {
id: string;
type: 'web';
name: string;
url_pattern: string;
code: string;
output_schema?: Record<string, unknown>;
}

interface UpdateCrawlerDataInput {
id: string;
type: 'data';
name: string;
code: string;
input_schema: Record<string, unknown>;
output_schema?: Record<string, unknown>;
}

type UpdateCrawlerInput = UpdateCrawlerWebInput | UpdateCrawlerDataInput;

@coderabbitai coderabbitai Bot Apr 20, 2026


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# input_schema 필수 여부 및 PUT 바디 검증 로직 확인
rg -nP -C5 '\bvalidateCrawlerBody\b'
rg -nP -C3 '\binput_schema\b' --type=ts -g '!**/node_modules/**' -g 'workers/**'
ast-grep --pattern $'function updateCrawler($$$) { $$$ }'

Repository: gignac-cha/audio-underview

Length of output: 30698


🏁 Script executed:

# Check usages of UpdateCrawlerInput and related code
cd applications/web && rg -nP 'UpdateCrawler(Web|Data)?Input' --type=ts

# Find the actual PUT request construction
rg -nP -C8 'handleUpdateCrawler|UpdateCrawlerInput' applications/web/sources --type=ts --type=tsx

# Check what the server returns for web crawlers
rg -nP -C3 'type.*web.*input_schema' workers/crawler-manager-worker --type=ts

Repository: gignac-cha/audio-underview

Length of output: 688


🏁 Script executed:

# Find the actual implementation of updateCrawlerRequest and how it constructs the request
rg -nP -A20 'async function updateCrawlerRequest' applications/web/sources/hooks/use-crawler-manager.ts

# Check CrawlerDetailPage to see how it constructs the update payload
rg -nP -B5 -A10 'updateCrawlerRequest|UpdateCrawlerInput' applications/web/sources --type=ts

# Look at the actual PUT request body construction
rg -nP -B3 -A5 'method.*PUT' applications/web/sources --type=ts

Repository: gignac-cha/audio-underview

Length of output: 8681


🏁 Script executed:

# Check how the client constructs the update payload - where is it called from?
rg -nP -B10 'updateCrawler\(' applications/web/sources --type=ts | grep -A10 'CrawlerDetailPage\|crawler\.'

# Find CrawlerDetailPage to see how it constructs the update
fd -type f -name '*CrawlerDetailPage*' applications/web/sources

# Check the actual update logic - what fields are sent
rg -nP -B5 -A15 'type.*web.*name.*url_pattern' applications/web/sources --type=ts

Repository: gignac-cha/audio-underview

Length of output: 2027


🏁 Script executed:

# Examine the exact validation and normalization logic in the worker
cat -n workers/crawler-manager-worker/sources/index.ts | sed -n '65,145p'

# Check if input_schema normalization happens in both create and update paths
rg -nP -B5 -A10 'body.input_schema.*body.*string' workers/crawler-manager-worker/sources/index.ts

# Verify if validateCrawlerBody is truly used for both POST and PUT
rg -nP 'validateCrawlerBody' workers/crawler-manager-worker/sources/index.ts | head -20

Repository: gignac-cha/audio-underview

Length of output: 4282


🏁 Script executed:

# Check what handleUpdateCrawler does after validation - does it use normalized body?
cat -n workers/crawler-manager-worker/sources/index.ts | sed -n '245,280p'

# Check if there's any logic that preserves existing input_schema on PUT
rg -nP -B2 -A8 'handleUpdateCrawler' workers/crawler-manager-worker/sources/index.ts | head -50

Repository: gignac-cha/audio-underview

Length of output: 2217


Data loss from automatic input_schema normalization on web-crawler PUT updates

Omitting the input_schema field from UpdateCrawlerWebInput is an intentional design, but the server's validation logic causes a problem. The validateCrawlerBody function automatically normalizes input_schema to { body: 'string' } for web crawlers (index.ts:141-143). Since this normalized value is used directly in the database update in handleUpdateCrawler (index.ts:267), the existing value is overwritten even on PUT requests where the client did not send input_schema.

Result: every time a web crawler is updated, input_schema is reset to the default, losing any dynamically configured schema information.

Suggested fixes:

  • Handle validateCrawlerBody differently for POST (create) and PUT (update), or
  • On PUT requests, normalize only explicitly provided fields, or
  • Add input_schema as an optional field on UpdateCrawlerWebInput and manage it explicitly
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@applications/web/sources/hooks/use-crawler-manager.ts` around lines 25 - 43,
The issue is that validateCrawlerBody unconditionally normalizes input_schema to
{ body: 'string' } causing handleUpdateCrawler to overwrite existing schemas for
web crawlers; fix by making input_schema optional on UpdateCrawlerWebInput (add
input_schema?: Record<string, unknown>), change validateCrawlerBody to only
apply the default normalization for creation (POST) or when input_schema is
explicitly absent on POST, and update handleUpdateCrawler to only include
input_schema in the DB update if the request body actually contains
input_schema; reference UpdateCrawlerWebInput, UpdateCrawlerInput,
validateCrawlerBody, and handleUpdateCrawler when making these changes.

Owner Author

Thanks for the review. After evaluating the suggestion, I've concluded this claim is inaccurate and am not accepting it.

Worker behavior evidence (workers/crawler-manager-worker/sources/index.ts:141-143):

if (crawlerType === 'web') {
  body.input_schema = { body: 'string' };
}

validateCrawlerBody is a shared POST/PUT path and unconditionally overwrites body.input_schema with { body: 'string' } for the web type. handleUpdateCrawler (lines 262-269) then updates the DB with the normalized body.

In other words:

  • An existing web crawler's DB input_schema is already { body: 'string' } (normalized at create time)
  • Whether or not the client sends input_schema, the server overwrites it → no data loss

Reasons UpdateCrawlerWebInput intentionally excludes the input_schema field:

  • Enforce at the type level that the client never sends a payload that would be ignored
  • Document the worker's normalization behavior through the types

Adding an optional input_schema? would instead send the false signal that "web may also send it", so the current strict type is safer.


@gignac-cha, thank you for the detailed explanation. Your point is correct.

Because validateCrawlerBody unconditionally overwrites input_schema to { body: 'string' } for web types on the shared POST/PUT path, I incorrectly concluded that data loss would occur when the client omits the field. I acknowledge that the current design — intentionally excluding input_schema from UpdateCrawlerWebInput to document the server-side normalization at the type level — is the safer approach. I withdraw the review comment.


✏️ Learnings added
Learnt from: gignac-cha
Repo: gignac-cha/audio-underview PR: 24
File: applications/web/sources/hooks/use-crawler-manager.ts:25-43
Timestamp: 2026-04-20T11:50:26.268Z
Learning: In `applications/web/sources/hooks/use-crawler-manager.ts`, `UpdateCrawlerWebInput` intentionally excludes the `input_schema` field. The worker's `validateCrawlerBody` function unconditionally overwrites `input_schema` to `{ body: 'string' }` for web-type crawlers on both POST and PUT paths, so web crawler DB records always have this fixed value. Excluding `input_schema` from the type prevents clients from sending an ignored payload and documents the server-side normalization behavior at the type level.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: gignac-cha
Repo: gignac-cha/audio-underview PR: 13
File: applications/web/sources/hooks/use-crawler-code-runner.ts:31-31
Timestamp: 2026-02-13T20:48:39.602Z
Learning: In TypeScript/JavaScript files across the repository, follow the acronym casing rule: if an acronym (ID, URL, JSON, UUID, etc.) is used as a standalone parameter or variable name, use lowercase (e.g., url, id, json). If the acronym is part of a compound name with a prefix, use uppercase for the acronym part (e.g., baseURL, userID, parseJSON). This pattern is documented in CLAUDE.md and should be applied consistently across TS files.

Comment thread applications/web/sources/pages/CrawlerDetailPage.test.tsx
Comment thread applications/web/sources/pages/CrawlerDetailPage.tsx Outdated
Comment thread applications/web/sources/pages/CrawlerDetailPage.tsx
Comment thread applications/web/sources/pages/CrawlerDetailPage.tsx
Comment thread applications/web/sources/pages/CrawlerDetailPage.tsx
- #1: replace setTimeout(50ms) with vi.waitFor for deterministic settle
- #3: add invalid→valid JSON recovery path test (fill valid JSON → Save
  enabled → PUT fires)
- #4: trim url_pattern before payload so leading/trailing whitespace is not
  persisted as part of the regex
- #5: preserve in-progress edits via submittedForm ref comparison — if the
  user typed during the PUT round-trip, setForm keeps their draft instead of
  overwriting with the server response
- #6: LoadingContainer gets role=status + aria-live=polite + aria-label;
  Spinner marked aria-hidden=true
- #7: handleSchemaChange clears the inline schema error immediately when a
  previously invalid field becomes valid JSON (no longer waits for blur)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
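The finding-5 fix above can be modeled as a pure choice between the server-derived state and the user's draft (a sketch; the submittedForm name comes from the commit message, everything else is illustrative):

```typescript
// Sketch of the post-save resolution: apply the server-derived form only
// when the live form still equals what was submitted; if the user typed
// during the PUT round-trip, keep their draft instead of overwriting it.
function resolvePostSaveForm<T>(
  liveForm: T,
  submittedForm: T,
  serverDerivedForm: T,
  equals: (a: T, b: T) => boolean,
): T {
  return equals(liveForm, submittedForm) ? serverDerivedForm : liveForm;
}

// Structural comparison good enough for plain form objects.
const sameJSON = <T>(a: T, b: T) => JSON.stringify(a) === JSON.stringify(b);
```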
@gignac-cha
Owner Author

gignac-cha commented Apr 20, 2026

@coderabbitai review

Previous review findings addressed in fixup commit 9dc5064:

  • Finding 1 — setTimeout(50ms) → vi.waitFor
  • Finding 3 — added the invalid→valid JSON recovery test
  • Finding 4 — payload now sends the trimmed url_pattern
  • Finding 5 — submittedForm reference comparison so the PUT response does not overwrite the user's in-progress edits
  • Finding 6 — LoadingContainer role=status + aria-live, Spinner aria-hidden
  • Finding 7 — handleSchemaChange: fixing to valid JSON clears the inline error immediately, without waiting for blur

Finding 2 (the optional input_schema suggestion) was rejected based on the worker's normalization behavior (validateCrawlerBody lines 141-143) — a detailed reply is in the inline thread.

119 tests pass, lint clean.

@coderabbitai

coderabbitai Bot commented Apr 20, 2026

@gignac-cha Thanks for the thorough fixup work! I've confirmed the changes.

I'll proceed with a full review of the fixup commit.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.


@gignac-cha gignac-cha merged commit 77ed94d into main Apr 20, 2026
7 checks passed
