mrviduus · Jun 24, 2026
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -2,6 +2,10 @@
 
 ## [Unreleased]
 
+### Learning Tutor Agent — web UI (Smart session) (AI-Agent-2) — web (2026-06-24)
+
+The frontend for the Tutor agent: a **"Smart session"** on the Vocabulary page that surfaces the tutor's *reasoning*, not just a card stack. `POST /me/tutor/session` → a **plan view** showing the overall `rationale`, each item's `word` + exercise-type badge + difficulty + per-item `why`, and a closing `readingNudge` (the thesis) — visible, intentional reasoning is the point. Then a **study phase** reusing the existing `FlashCard` (word → flip → Got it / Missed it, `responseTimeMs` measured), and a **HITL feedback loop**: `POST /me/tutor/session/{id}/feedback` re-plans the remainder ("your tutor adjusted your plan") until an empty plan ends it → summary (studied / accuracy / nudge). New `TutorSessionPage` at `/:lang/vocabulary/tutor`, `useTutorSession` state machine (`planning→plan→study→…→summary` + empty/error/signIn), `TutorPlanView`, `tutor.ts` client. **Backend DTO enrichment (anti-join):** to render cards the UI used to re-fetch the user's vocab and join by id — which silently dropped planned cards for users with >100 words (server caps `getWords` at 100 + a non-existent `'recent'` sort). Fixed at the source: `TutorEndpoints` now **loads each plan item's card from the DB scoped to the caller** (`Id IN ids AND UserId == userId` — a second anti-hallucination/isolation re-check) and enriches `TutorPlanItemDto` with `translation`/`definition`/`sentence`/`bookTitle`/`hint`/`distractors`, so the client renders straight from the plan with **no join, nothing dropped**. Also a **re-plan turn cap** (`MaxTurns=6` server-side + an 8-round client backstop) so a persistently-missed card can't loop forever. UI hardening from the adversarial pass: `AbortController`/mounted-guard (no setState-after-unmount), feedback-failure retry re-submits the **same** session (doesn't nuke progress), and untrusted LLM strings (`why`/`rationale`/`nudge`) are line-clamped with an unknown-`exerciseType` fallback (no raw i18n-key leak). `tsc` clean; **584 web + 35 backend Tutor + AiEvals** green; `vite build` green; browser-checked (entry→plan→study→re-plan→summary + empty/unknown-type/long-text/unmount, **0 console errors**). **Deferred**: mobile Tutor UI, SSE plan streaming, generated MC exercises beyond the existing card, admin replay link. Completes AI-Agent-2 (backend shipped earlier).
+
 ### Learning Tutor Agent — plans what to study next over real SRS state (AI-Agent-2) — backend (2026-06-24)
 
 The third and largest agent: a **Tutor** that reasons over the learner's actual vocabulary state and **plans what to study next**, rather than running a fixed review queue. `TutorAgent` runs on the existing `AgentLoop` runtime and calls four thin `ITool`s — `get_due_vocabulary` (due/near-due SRS cards), `get_weak_vocabulary` (lowest-accuracy / earliest-stage words), `get_reading_context` (what they're actually reading — keeps practice tied to reading, the product thesis), and `get_example_sentence` (a real in-context sentence: the learner's saved sentence, else a **spoiler-gated, owner-isolated RAG** pull from their own book) — then emits an **ordered study plan** (`{wordId, word, stage, exerciseType, difficulty, why}` + an overall `rationale` + a `readingNudge`), exercise type/difficulty **recalibrated from the real SRS stage** (recognition→recall→context-cloze). **Server-held `tutor_session`** (new entity/table, jsonb `PlanJson`, status, turn count) persists the plan between turns; **HITL**: `POST /me/tutor/session` starts/resumes and `POST /me/tutor/session/{id}/feedback` re-plans on the learner's results — re-fetching state (so SRS updates are seen), deterministically **dropping cards just answered correctly**, ignoring feedback for ids not in the prior plan, and preserving the session length. **Two hard guarantees, QA-verified**: (1) **anti-hallucination** — every scheduled `wordId` must come from a `get_due`/`get_weak` tool result (harvested ok-only from the transcript), word+stage **re-projected** from the real row, invented ids dropped, empty transcript → empty plan (the model can't fabricate or rename a card); (2) **cross-user isolation** — the example-sentence tool resolves the card with `Id == wordId && UserId == userId` and the RAG path filters on `user_id AND user_book_id`, so no other user's `user_chapter_chunk` content is reachable. All inbound book text (example sentences from user uploads, reading titles) is run through `ExternalTextSanitizer` + length-capped before entering the prompt (prompt-injection boundary). Telemetry: each turn persists an `agent_run` (agent=`tutor`, `tool_calls_count`); route `tutor.agent → gpt-4.1-mini`. **Eval**: `TutorEvalRunner` (deterministic structural rubric over synthetic learner states — due-coverage, weak-targeting, difficulty-appropriateness, no-hallucination, thesis-alignment; a golden where weak ∉ due makes weak-targeting discriminating), admin-runnable `POST /admin/ai-quality/tutor/eval`. EF migration `AddTutorSession` (reversible). `dotnet build` green, `dotnet format` clean; 968 unit + 72 AiEvals tests green. **Deferred**: SSE streaming, the tutor UI surface (frontend/mobile slice), generated free-text exercises beyond MC reuse, longitudinal pedagogical-efficacy A/B (offline evals validate planner mechanics, not learning outcomes). Completes the 3-agent roadmap (`docs/04-dev/agents-roadmap.md`); Agent 1 (Enrichment) + Agent 3 (Librarian) already shipped.

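Reviewer note: the anti-hallucination guarantee described in the backend changelog entry can be pictured as a filter over the model's proposed plan. The sketch below is illustrative only. The real backend is C# and is not part of this diff, so `RealCard`, `ProposedItem`, and `groundPlan` are hypothetical names, and the sample words are invented.

```typescript
// Hypothetical shapes for illustration; not the backend's real types.
interface RealCard { wordId: string; word: string; stage: number }
interface ProposedItem { wordId: string; word: string; stage: number; why: string }

/**
 * Keep only items whose wordId was returned by a tool call, and overwrite
 * word/stage with the values from the real row, so the model can neither
 * invent a card nor rename an existing one.
 */
function groundPlan(proposed: ProposedItem[], harvested: RealCard[]): ProposedItem[] {
  const real = new Map<string, RealCard>()
  for (const card of harvested) real.set(card.wordId, card)

  const grounded: ProposedItem[] = []
  for (const item of proposed) {
    const row = real.get(item.wordId)
    if (!row) continue // invented id: drop it
    grounded.push({ ...item, word: row.word, stage: row.stage }) // re-project from the real row
  }
  return grounded
}

// Sample data (invented): one real card, one misspelled proposal, one fabricated id.
const harvested: RealCard[] = [{ wordId: 'w1', word: 'ephemeral', stage: 2 }]
const proposed: ProposedItem[] = [
  { wordId: 'w1', word: 'efemeral', stage: 5, why: 'due today' },
  { wordId: 'w9', word: 'made-up', stage: 1, why: 'not from any tool result' },
]
const grounded = groundPlan(proposed, harvested)
```

With an empty `harvested` list the result is always empty, which matches the "empty transcript → empty plan" rule in the entry.
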
diff --git a/apps/web/src/App.tsx b/apps/web/src/App.tsx
@@ -34,6 +34,7 @@ const UserBookDetailPage = lazy(() => import('./pages/UserBookDetailPage').then(
 const StatsPage = lazy(() => import('./pages/StatsPage').then(m => ({ default: m.StatsPage })))
 const VocabularyPage = lazy(() => import('./pages/VocabularyPage').then(m => ({ default: m.VocabularyPage })))
 const VocabularyReviewPage = lazy(() => import('./pages/VocabularyReviewPage').then(m => ({ default: m.VocabularyReviewPage })))
+const TutorSessionPage = lazy(() => import('./pages/TutorSessionPage').then(m => ({ default: m.TutorSessionPage })))
 const HighlightsPage = lazy(() => import('./pages/HighlightsPage').then(m => ({ default: m.HighlightsPage })))
 const HighlightReviewPage = lazy(() => import('./pages/HighlightReviewPage').then(m => ({ default: m.HighlightReviewPage })))
 import { Header } from './components/Header'
@@ -104,6 +105,7 @@ function LanguageRoutes() {
           <Route path="/stats" element={<StatsPage />} />
           <Route path="/vocabulary" element={<VocabularyPage />} />
           <Route path="/vocabulary/review" element={<VocabularyReviewPage />} />
+          <Route path="/vocabulary/tutor" element={<TutorSessionPage />} />
           <Route path="/highlights" element={<HighlightsPage />} />
           <Route path="/highlights/review" element={<HighlightReviewPage />} />
           <Route path="/library/my/:id" element={<UserBookDetailPage />} />
@@ -157,6 +159,7 @@ function AppRoutes() {
       <Route path="/stats" element={<LegacyRedirect />} />
       <Route path="/vocabulary" element={<LegacyRedirect />} />
       <Route path="/vocabulary/review" element={<LegacyRedirect />} />
+      <Route path="/vocabulary/tutor" element={<LegacyRedirect />} />
       <Route path="/highlights" element={<LegacyRedirect />} />
       <Route path="/highlights/review" element={<LegacyRedirect />} />
       <Route path="/:lang/*" element={<LanguageRoutes />} />

diff --git a/apps/web/src/api/__tests__/tutor.test.ts b/apps/web/src/api/__tests__/tutor.test.ts
@@ -0,0 +1,73 @@
+import { describe, it, expect, vi, beforeEach, afterEach } from 'vitest'
+import { startTutorSession, sendTutorFeedback } from '../tutor'
+
+function mockOk(body: unknown) {
+  return vi.fn().mockResolvedValue({
+    ok: true,
+    status: 200,
+    text: async () => JSON.stringify(body),
+  })
+}
+
+const SAMPLE = {
+  sessionId: 's1',
+  plan: [{ wordId: 'w1', word: 'foo', stage: 1, exerciseType: 'recognition', difficulty: 'easy', why: 'because' }],
+  rationale: 'plan',
+  readingNudge: 'read more',
+  runId: 'r1',
+}
+
+describe('tutor api', () => {
+  beforeEach(() => vi.restoreAllMocks())
+  afterEach(() => vi.unstubAllGlobals())
+
+  it('startTutorSession POSTs maxItems and returns parsed response', async () => {
+    const fetchMock = mockOk(SAMPLE)
+    vi.stubGlobal('fetch', fetchMock)
+
+    const res = await startTutorSession(7)
+
+    expect(res.sessionId).toBe('s1')
+    expect(res.plan).toHaveLength(1)
+    const [url, opts] = fetchMock.mock.calls[0]
+    expect(String(url)).toContain('/me/tutor/session')
+    expect(opts.method).toBe('POST')
+    expect(opts.credentials).toBe('include')
+    expect(JSON.parse(opts.body)).toEqual({ maxItems: 7 })
+  })
+
+  it('startTutorSession sends empty body when maxItems omitted', async () => {
+    const fetchMock = mockOk(SAMPLE)
+    vi.stubGlobal('fetch', fetchMock)
+
+    await startTutorSession()
+
+    const [, opts] = fetchMock.mock.calls[0]
+    expect(JSON.parse(opts.body)).toEqual({})
+  })
+
+  it('sendTutorFeedback POSTs results to the session feedback URL', async () => {
+    const fetchMock = mockOk({ ...SAMPLE, plan: [] })
+    vi.stubGlobal('fetch', fetchMock)
+
+    const results = [{ wordId: 'w1', correct: true, responseTimeMs: 1234 }]
+    const res = await sendTutorFeedback('s1', results)
+
+    expect(res.plan).toHaveLength(0)
+    const [url, opts] = fetchMock.mock.calls[0]
+    expect(String(url)).toContain('/me/tutor/session/s1/feedback')
+    expect(opts.method).toBe('POST')
+    expect(JSON.parse(opts.body)).toEqual({ results })
+  })
+
+  it('rejects on a non-ok response', async () => {
+    const fetchMock = vi.fn().mockResolvedValue({
+      ok: false,
+      status: 503,
+      text: async () => JSON.stringify({ error: 'no tutor' }),
+    })
+    vi.stubGlobal('fetch', fetchMock)
+
+    await expect(startTutorSession()).rejects.toThrow('no tutor')
+  })
+})
diff --git a/apps/web/src/api/tutor.ts b/apps/web/src/api/tutor.ts
@@ -0,0 +1,72 @@
+import { authFetch } from './client'
+
+// Learning Tutor agent (AI-Agent-2). The tutor PLANS what to study next over the learner's real SRS +
+// reading state and hands off to the existing vocabulary-review flow. JSON (SSE deferred). The plan is held
+// server-side in a session so the HITL re-plan turn survives across requests.
+
+// --- Types (mirror Contracts/Agents/TutorDtos.cs, camelCase via the API) ---
+
+/**
+ * One planned study item. The backend now ENRICHES each item with the full card payload (translation,
+ * definition, sentence, bookTitle, hint, distractors), so the UI renders cards straight from the plan —
+ * no separate vocab fetch + join. References a REAL vocab card by `wordId`, with per-item `why` reasoning.
+ */
+export interface TutorPlanItem {
+  wordId: string
+  word: string
+  stage: number
+  exerciseType: string // recognition | recall | context
+  difficulty: string // label string
+  why: string // per-item reasoning
+  translation?: string | null
+  definition?: string | null
+  sentence?: string | null
+  bookTitle?: string | null
+  hint?: string | null
+  distractors: string[] // [] when none, never null
+}
+
+/** The tutor's response: the persisted session, the ordered plan, and the surfaced reasoning. */
+export interface TutorSessionResponse {
+  sessionId: string
+  plan: TutorPlanItem[]
+  rationale: string // overall session reasoning
+  readingNudge: string // ties back to reading (the thesis)
+  runId: string
+}
+
+/** One learner result fed back to the tutor for re-planning. */
+export interface TutorFeedbackResult {
+  wordId: string
+  correct: boolean
+  responseTimeMs: number
+}
+
+// --- API Functions ---
+
+/** Plan a new tutor session over the learner's current state. `maxItems` is optional (server-capped). */
+export async function startTutorSession(maxItems?: number, signal?: AbortSignal): Promise<TutorSessionResponse> {
+  return authFetch<TutorSessionResponse>('/me/tutor/session', {
+    method: 'POST',
+    headers: { 'Content-Type': 'application/json' },
+    body: JSON.stringify(maxItems != null ? { maxItems } : {}),
+    signal,
+  })
+}
+
+/**
+ * Submit the learner's results for the current session and get the re-planned remainder. An empty `plan` in
+ * the response means the session is complete.
+ */
+export async function sendTutorFeedback(
+  sessionId: string,
+  results: TutorFeedbackResult[],
+  signal?: AbortSignal,
+): Promise<TutorSessionResponse> {
+  return authFetch<TutorSessionResponse>(`/me/tutor/session/${sessionId}/feedback`, {
+    method: 'POST',
+    headers: { 'Content-Type': 'application/json' },
+    body: JSON.stringify({ results }),
+    signal,
+  })
+}
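
Reviewer note: the client above returns an empty `plan` when the session is complete, and the changelog mentions an 8-round client backstop on top of the server's `MaxTurns=6`. The transition below is a rough sketch of how a consumer might combine the two. The `useTutorSession` hook is not shown in this part of the diff, so `LoopState`, `onPlanReceived`, `MAX_CLIENT_ROUNDS`, and the exact counting of rounds are assumptions.

```typescript
// Simplified phases; the real hook also has empty/error/signIn states.
type Phase = 'planning' | 'plan' | 'study' | 'summary'

interface PlanItem { wordId: string; word: string }

interface LoopState {
  phase: Phase
  round: number      // how many plans have been shown so far
  plan: PlanItem[]
  adjusted: boolean  // true when the plan is a re-plan after feedback
}

// Assumed value, taken from the "8-round client backstop" in the changelog.
const MAX_CLIENT_ROUNDS = 8

const initialState: LoopState = { phase: 'planning', round: 0, plan: [], adjusted: false }

/** Apply a plan returned by the start or feedback endpoint to the loop state. */
function onPlanReceived(state: LoopState, plan: PlanItem[]): LoopState {
  const round = state.round + 1
  // An empty plan ends the session; the round cap stops a card that is missed forever.
  if (plan.length === 0 || round > MAX_CLIENT_ROUNDS) {
    return { phase: 'summary', round: state.round, plan: [], adjusted: false }
  }
  return { phase: 'plan', round, plan, adjusted: round > 1 }
}

// Simulate a learner who misses the same card every round.
const stubborn: PlanItem[] = [{ wordId: 'w1', word: 'foo' }]
let state = initialState
let plansShown = 0
while (state.phase !== 'summary') {
  state = onPlanReceived(state, stubborn)
  if (state.phase === 'plan') plansShown++
}
```

Keeping the transition pure makes the cap easy to unit-test without mocking `fetch`.
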
diff --git a/apps/web/src/components/vocabulary/TutorPlanView.tsx b/apps/web/src/components/vocabulary/TutorPlanView.tsx
@@ -0,0 +1,55 @@
+import type { TutorPlanItem } from '../../api/tutor'
+import { exerciseLabel, exerciseBadgeClass } from './tutorLabels'
+
+interface Props {
+  rationale: string
+  plan: TutorPlanItem[]
+  readingNudge: string
+  adjusted: boolean
+  t: (key: string) => string
+  onStart: () => void
+}
+
+// The showcase: surfaces the tutor's reasoning as a deliberate "here's your plan and why" view — the visible
+// reasoning is the point, not debug text.
+export function TutorPlanView({ rationale, plan, readingNudge, adjusted, t, onStart }: Props) {
+  return (
+    <div className="tutor-plan">
+      <div className="tutor-plan__rationale">
+        <span className="tutor-plan__rationale-label">
+          {adjusted ? t('tutor.plan.adjustedLabel') : t('tutor.plan.rationaleLabel')}
+        </span>
+        <p className="tutor-plan__rationale-text tutor-clamp tutor-clamp--4">{rationale}</p>
+      </div>
+
+      <ol className="tutor-plan__list">
+        {plan.map((item, i) => (
+          <li key={item.wordId} className="tutor-plan__item">
+            <span className="tutor-plan__item-index">{i + 1}</span>
+            <div className="tutor-plan__item-body">
+              <div className="tutor-plan__item-head">
+                <span className="tutor-plan__item-word">{item.word}</span>
+                <span className={exerciseBadgeClass(item.exerciseType)}>
+                  {exerciseLabel(item.exerciseType, t)}
+                </span>
+                <span className="tutor-plan__item-difficulty tutor-clamp tutor-clamp--1">{item.difficulty}</span>
+              </div>
+              <p className="tutor-plan__item-why tutor-clamp tutor-clamp--3">{item.why}</p>
+            </div>
+          </li>
+        ))}
+      </ol>
+
+      {readingNudge && (
+        <div className="tutor-plan__nudge">
+          <span className="tutor-plan__nudge-icon" aria-hidden>📖</span>
+          <p className="tutor-plan__nudge-text tutor-clamp tutor-clamp--3">{readingNudge}</p>
+        </div>
+      )}
+
+      <button className="tutor-plan__start" onClick={onStart}>
+        {t('tutor.plan.start')}
+      </button>
+    </div>
+  )
+}
diff --git a/apps/web/src/components/vocabulary/tutorLabels.ts b/apps/web/src/components/vocabulary/tutorLabels.ts
@@ -0,0 +1,21 @@
+// Display helpers that tolerate untrusted LLM strings. `exerciseType` and `difficulty` come straight from
+// the model — guard them so an unexpected value doesn't render a raw i18n key (`tutor.exercise.foo`) or
+// blow up the layout.
+
+const KNOWN_EXERCISE_TYPES = new Set(['recognition', 'recall', 'context'])
+
+/**
+ * Label for an exercise type. For a known type we use the i18n key; for anything unexpected from the model
+ * we fall back to the raw value (or a generic label) rather than leaking `tutor.exercise.<garbage>`.
+ */
+export function exerciseLabel(exerciseType: string, t: (key: string) => string): string {
+  if (KNOWN_EXERCISE_TYPES.has(exerciseType)) return t(`tutor.exercise.${exerciseType}`)
+  const raw = exerciseType?.trim()
+  return raw ? raw : t('tutor.exercise.generic')
+}
+
+/** Known types map to a styled badge variant; unknown types get a neutral default. */
+export function exerciseBadgeClass(exerciseType: string): string {
+  const variant = KNOWN_EXERCISE_TYPES.has(exerciseType) ? exerciseType : 'default'
+  return `tutor-badge tutor-badge--${variant}`
+}
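
Reviewer note: the fallback behaviour of `exerciseLabel` can be checked with a stub translator. The helper is repeated here in condensed form so the snippet runs on its own; the `dict` entries and `stubT` are assumptions and do not reflect the app's real i18n resources.

```typescript
const KNOWN = new Set(['recognition', 'recall', 'context'])

// Condensed copy of the helper from tutorLabels.ts.
function exerciseLabel(exerciseType: string, t: (key: string) => string): string {
  if (KNOWN.has(exerciseType)) return t(`tutor.exercise.${exerciseType}`)
  const raw = exerciseType?.trim()
  return raw ? raw : t('tutor.exercise.generic')
}

// Hypothetical translations; a missing key echoes the key back, as many i18n libraries do.
const dict: Record<string, string> = {
  'tutor.exercise.recall': 'Recall',
  'tutor.exercise.generic': 'Exercise',
}
const stubT = (key: string): string => dict[key] ?? key

const labels = [
  exerciseLabel('recall', stubT),   // known type: translated label
  exerciseLabel('matching', stubT), // unknown type: raw value, no i18n key leaks
  exerciseLabel('   ', stubT),      // blank value: generic label
]
```
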