[pull] master from GaijinEntertainment:master#980

Merged
pull[bot] merged 5 commits into forksnd:master from GaijinEntertainment:master
May 11, 2026

@pull pull Bot commented May 11, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

borisbat and others added 5 commits May 10, 2026 11:22
…P tool

Closes the false-positive blind spot in query_log. Today a row with
match_count > 0 looks like a hit even if all returned cards just shared
BM25 tokens with the question. The new `useful` column captures the
caller's "none of these addressed my question" judgment. Default-positive
bias means we never write useful=true; only the negative signal carries
information.

## Schema (utils/mouse/index.das)

- New v4 migration adds `useful` (Option<int>) to query_log via add_column.
- v2 migration converted from `create_table(type<QueryLog>)` to raw SQL
  pinned to the v2-era schema. `create_table` regenerates DDL from the
  current struct, so as soon as v4 added a column, the v2 fresh-DB path
  would create that column too, and v4's add_column would then hit a
  duplicate column. Pinning v2's DDL decouples migration history from
  struct shape.
- `log_query` now returns int64 (the inserted query_id).
- New `mark_no_match(db, id)` helper.
- New `LogMode` enum (All / Misses / Bad / Review). `recent_queries` now
  takes a mode parameter — small breaking change to the bool signature
  (3 callsites updated: 2 tests + cmd_log).
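
The migration-pinning point above can be illustrated with a minimal Python/sqlite3 sketch. This is not the daslang code from `utils/mouse/index.das`: the `question` column, the function names, and the `CURRENT_COLUMNS` list (standing in for the QueryLog struct) are assumptions made for illustration. Only `query_log`, `match_count`, and `useful` come from the commit message.

```python
import sqlite3

# Hypothetical stand-in for the *current* QueryLog struct (already has `useful`).
CURRENT_COLUMNS = [
    "id INTEGER PRIMARY KEY",
    "question TEXT",          # assumed column name
    "match_count INTEGER",
    "useful INTEGER",
]

# DDL frozen at what the table looked like when the v2 migration was written.
V2_PINNED_DDL = (
    "CREATE TABLE query_log "
    "(id INTEGER PRIMARY KEY, question TEXT, match_count INTEGER)"
)

def v2_from_struct(db):
    # Regenerates DDL from the current struct, so `useful` sneaks into v2.
    db.execute(f"CREATE TABLE query_log ({', '.join(CURRENT_COLUMNS)})")

def v2_pinned(db):
    db.execute(V2_PINNED_DDL)

def v4_add_useful(db):
    db.execute("ALTER TABLE query_log ADD COLUMN useful INTEGER")

def migrate_fresh(v2_step):
    """Run v2 then v4 on an empty database and return the column names."""
    db = sqlite3.connect(":memory:")
    v2_step(db)
    v4_add_useful(db)
    return [row[1] for row in db.execute("PRAGMA table_info(query_log)")]

def struct_derived_fails():
    """True if the struct-derived v2 makes the v4 step fail on a fresh DB."""
    try:
        migrate_fresh(v2_from_struct)
    except sqlite3.OperationalError:  # duplicate column name
        return True
    return False
```

With the pinned DDL, a fresh database ends up with exactly one `useful` column; with the struct-derived DDL, the v4 step fails.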

## Per-ask path (hot path)

- `mouse__ask` response now begins with `query_id: N`. Same in CLI.
- New MCP tool `mouse__bad(queryId)` and CLI `mouse bad <id>` mark a hit
  as no-match (UPDATE useful = 0). Idempotent. Distinct error exit codes
  (1 = usage, 2 = no such id) so shell callers can branch.
- `tool_ask` ends with a hint reminding the agent: if NONE of the cards
  address the question, mouse__bad before moving on.
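
A rough Python/sqlite3 sketch of the per-ask flow described above, under stated assumptions: the table layout, the `question` column, and the `cmd_bad` argument handling are hypothetical, and the return value of `mark_no_match` here is simply SQLite's affected-row count. The exit codes (1 = usage, 2 = no such id) follow the commit message.

```python
import sqlite3

def open_db():
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE query_log ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "question TEXT, match_count INTEGER, useful INTEGER)"
    )
    return db

def log_query(db, question, match_count):
    """Insert a row and return its id; `useful` starts out NULL."""
    cur = db.execute(
        "INSERT INTO query_log (question, match_count) VALUES (?, ?)",
        (question, match_count),
    )
    return cur.lastrowid

def mark_no_match(db, query_id):
    """Set useful = 0; returns 0 when no row has that id."""
    cur = db.execute(
        "UPDATE query_log SET useful = 0 WHERE id = ?", (query_id,)
    )
    return cur.rowcount

def cmd_bad(db, argv):
    """CLI-style wrapper: 0 = ok, 1 = usage error, 2 = no such id."""
    if len(argv) != 1 or not argv[0].isdigit():
        return 1
    if mark_no_match(db, int(argv[0])) == 0:
        return 2
    return 0
```

Re-marking the same id leaves `useful` at 0, which is the idempotency the bullet above describes.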

## Wrap-up path (safety net)

- New MCP tool `mouse__log` exposes the existing `recent_queries`
  symmetrically to the CLI. New `--bad` and `--review` flags on `mouse
  log` alongside `--misses`. `--review` shows match_count > 0 AND
  useful IS NULL — the queue for retrospective rating.
- `skills/task_wrap_up.md` Section 1 extended with the `--review` step:
  scan unrated hits, mark unhelpful ones via `mouse__bad`, mouse__add
  the real answer if this session has it.
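
The four log modes can be pictured as four WHERE clauses. The sketch below is Python/sqlite3, not the actual `recent_queries` implementation. The Review predicate is taken from the text above and Bad follows from `UPDATE useful = 0`; the Misses predicate (`match_count = 0`), the `limit` default, and the ordering are assumptions.

```python
import sqlite3
from enum import Enum

class LogMode(Enum):
    ALL = "all"
    MISSES = "misses"
    BAD = "bad"
    REVIEW = "review"

# One fixed predicate per mode; no caller input is interpolated into the SQL.
WHERE = {
    LogMode.ALL: "1 = 1",
    LogMode.MISSES: "match_count = 0",                      # assumed
    LogMode.BAD: "useful = 0",
    LogMode.REVIEW: "match_count > 0 AND useful IS NULL",   # from the commit
}

def recent_queries(db, mode, limit=20):
    """Return ids of the most recent rows matching the given mode."""
    sql = (
        f"SELECT id FROM query_log WHERE {WHERE[mode]} "
        "ORDER BY id DESC LIMIT ?"
    )
    return [row[0] for row in db.execute(sql, (limit,))]
```

Under these predicates, a row that was a hit and has not been rated shows up only in All and Review, which is what makes Review usable as a rating queue.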

## Skill / OVERVIEW updates

- `skills/mouse.md`: new trigger-table row + Asking-section bullet
  for the per-ask mark-no-match rule. Frames why we don't mark positive.
- `utils/mouse/OVERVIEW.md`: Operations table gains `bad` + `log`
  entries; query_log schema documented.

## Tests (utils/mouse/tests/test_index.das)

40/40 pass (was 34/34). Six new tests:
- log_query returns increasing ids
- mark_no_match sets useful=0 + starts NULL
  (`useful` is NULL on a freshly logged row and becomes 0 once marked)
- mark_no_match on nonexistent id returns 0
- mark_no_match idempotent (re-mark is no-op)
- recent_queries filters Bad
- recent_queries Review excludes match_count=0 rows

## Opportunistic lint cleanups (lint surfaced from generic instantiation)

- `daslib/linq.das:847` `take`: ternary → min(total, length(arr)) [PERF015]
- `modules/dasSQLITE/daslib/sqlite_linq.das:38` `_first_opt`:
  `length(arr) > 0` → `!empty(arr)` [PERF017]

Note: sqlite_linq.das has 41 other pre-existing lint warnings unrelated
to this PR (PERF013/PERF017/STYLE015/STYLE016) scattered through 4800+
lines. Deliberately left for a separate focused cleanup PR.

Verified end-to-end via CLI smoke (ask → log → bad → log --bad/--review)
and JSON-RPC tools/list (returns mouse__ask, mouse__add, mouse__get,
mouse__rebuild, mouse__bad, mouse__log).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three review comments accepted (one with split verdict):

1. cmd_log empty-state was misleading under filter modes — saying
   "(no queries logged yet)" while a populated log just had no
   rows matching --misses/--bad/--review. Now mode-aware: keep
   the original message only for LogMode.All; print "(no rows
   matching the selected filter)" otherwise.

2. tool_log displayed the raw caller-provided `mode` string
   verbatim — so `mode:""` (omitted) rendered as "mode=" and
   unknown values rendered as "mode=foo" while silently
   returning All. Display now derived from the parsed enum
   via log_mode_display(), so the surfaced mode always matches
   what we filtered by. Did NOT add a defensive error-on-unknown
   guard: the JSON schema enum is the right place to enforce the
   contract, and runtime validation would duplicate it.

3. skills/mouse.md trigger-table row showed `mouse__bad(query_id)`
   while the MCP arg is `queryId` (camelCase, matching `rawQuery`
   convention). Reader copying the doc would hit a parameter-name
   error. One-word fix.
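
Items 1 and 2 above share one idea: derive user-facing text from the parsed mode, never from the raw input. A small Python sketch, assuming lower-case display names and a hypothetical `parse_log_mode` helper (only `log_mode_display`, `LogMode`, and the two empty-state messages come from the commit message):

```python
from enum import Enum

class LogMode(Enum):
    ALL = "all"
    MISSES = "misses"
    BAD = "bad"
    REVIEW = "review"

def parse_log_mode(raw):
    """Empty or unknown strings silently fall back to ALL."""
    try:
        return LogMode(raw)
    except ValueError:
        return LogMode.ALL

def log_mode_display(mode):
    # Display text comes from the enum, so it always matches the filter used.
    return mode.value

def header(raw):
    return f"mode={log_mode_display(parse_log_mode(raw))}"

def empty_message(mode):
    """Mode-aware empty state: only ALL may claim the log itself is empty."""
    if mode is LogMode.ALL:
        return "(no queries logged yet)"
    return "(no rows matching the selected filter)"
```

With this shape, `header("")` and `header("foo")` both report `mode=all`, matching the rows actually returned.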

40/40 mouse tests still pass; lint clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…int)

Round-2 Copilot review caught a real bug: my get_int_arg / get_int64_arg
checked `v.value is _number` only, but daslib/json parses JSON integer
literals into `_longint` (int64) — not _number (double). So MCP callers
sending `queryId: 1` (JSON integer) got "missing or invalid 'queryId'"
while `queryId: 1.0` (JSON float) worked. Same latent bug on
tool_ask's `k` and tool_log's `limit`.

The canonical fix turned out to be deletion. daslib/json_boost.das:85-140
already defines operator?? overloads for every numeric type, bool, and
string — each backed by null_coalescing (lines 59-69) which handles BOTH
_number AND _longint. The skill at skills/json.md:105 shows the pattern:
`let first_score = js?["user"]?["scores"]?[0] ?? 0`.

So instead of fixing my four helpers (get_string_arg, get_int_arg,
get_bool_arg, get_int64_arg), I dropped them entirely (-40 lines) and
replaced 18 callsites with direct `args?[key] ?? default` form. The
explicit type annotation on each let-binding (`let k : int = ...`) pins
the ?? overload selection.

Repro before: queryId:1 → "missing or invalid 'queryId'"
Repro after:  queryId:1 → "marked id=1 as no-match. ..."
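
The same class of bug can be reproduced in Python as an analogy, since Python's `json` module also parses `1` and `1.0` into different types (`int` and `float`). The helper names below are hypothetical and this is not the daslang code; it only shows why a check against a single numeric variant rejects integer literals.

```python
import json

def get_int_arg_buggy(args, key):
    """Accepts only the float variant, mirroring the `is _number` check."""
    v = args.get(key)
    return int(v) if isinstance(v, float) else None

def get_int_arg(args, key, default):
    """Accepts both numeric variants and falls back to a default otherwise."""
    v = args.get(key)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(v, bool) or not isinstance(v, (int, float)):
        return default
    return int(v)

as_integer = json.loads('{"queryId": 1}')
as_float = json.loads('{"queryId": 1.0}')
```

Here `get_int_arg_buggy(as_integer, "queryId")` yields `None` while the float payload succeeds; the second helper handles both, which is the role the commit assigns to the `??` overloads.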

Also addressed in this commit:

- resolve_log_mode comment said "First-set-wins" but the code is
  fixed-priority misses>bad>review regardless of CLI flag order.
  Comment updated to match reality.

- tests/json/safe.das: filled the small gap in ?? operator coverage.
  Existing tests cover `?? int` (line 14) and `?? string`/`?? bool`,
  but `?? int64` directly off a JSON integer literal (the path
  tool_bad uses) wasn't exercised. One-line addition asserting
  `js?.a ?? -1l == 1l` on a `{ "a": 1 }` payload. 4/4 tests still pass.
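
For the `resolve_log_mode` comment fix, the difference between "first-set-wins" and fixed priority is easiest to see in code. A Python sketch with assumed signatures and string return values (the priority order misses > bad > review is from the commit message):

```python
def resolve_log_mode(misses=False, bad=False, review=False):
    """Fixed priority: misses > bad > review, else all."""
    if misses:
        return "misses"
    if bad:
        return "bad"
    if review:
        return "review"
    return "all"

def mode_from_argv(argv):
    # Flag order on the command line is never consulted, only presence.
    return resolve_log_mode(
        misses="--misses" in argv,
        bad="--bad" in argv,
        review="--review" in argv,
    )
```

So `--review --misses` and `--misses --review` resolve to the same mode, which a "first-set-wins" comment would have contradicted.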

40/40 mouse tests still pass; lint clean; format clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot caught that `mouse__bad <id>` reads as a CLI-style positional
invocation, but mouse__bad is an MCP tool taking the named argument
`queryId`. Reworded to "call mouse__bad with the row's id (CLI:
mouse bad <id>)" — descriptive rather than syntax-literal, so the
example can't be copy-pasted into an invalid invocation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
feat(mouse): bad_mouse — mark no-match asks; useful flag + log MCP tool
@pull pull Bot locked and limited conversation to collaborators May 11, 2026
@pull pull Bot added the ⤵️ pull label May 11, 2026
@pull pull Bot merged commit 899425c into forksnd:master May 11, 2026
@pull pull Bot had a problem deploying to github-pages May 11, 2026 02:58 Error