Symptom
cargo test --features ffi-backend --test kpm_regression test_full_pipeline_pose fails on the current feat/freak-visual-database branch with:
thread 'test_full_pipeline_pose' panicked at crates\core\tests\kpm_regression.rs:164:13:
full_pipeline: pose[0][2] differs by 6.134186e-2
(actual=6.406289e-2, expected=2.721035e-3, tol=1.0e-2)
Reproduces with both --features ffi-backend alone and --features dual-mode (which transitively enables ffi-backend). All other kpm_regression tests (4 of 5) pass.
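For readers who want to sanity-check the numbers in the panic message, here is a minimal sketch of an element-wise tolerance check. It assumes the pose is a `[[f64; 4]; 3]` and that the test reports the first element whose absolute difference exceeds `tol`. `first_mismatch` is a hypothetical helper, not the actual code in kpm_regression.rs.

```rust
/// Returns (row, col, |actual - expected|) for the first element whose
/// absolute difference exceeds `tol`, or None if every element is within it.
/// Hypothetical helper; the real assertion in kpm_regression.rs may differ.
fn first_mismatch(
    actual: &[[f64; 4]; 3],
    expected: &[[f64; 4]; 3],
    tol: f64,
) -> Option<(usize, usize, f64)> {
    for r in 0..3 {
        for c in 0..4 {
            let diff = (actual[r][c] - expected[r][c]).abs();
            if diff > tol {
                return Some((r, c, diff));
            }
        }
    }
    None
}
```

With actual=6.406289e-2 and expected=2.721035e-3 at [0][2], the absolute difference is about 6.134e-2, roughly six times the 1.0e-2 tolerance. That matches the reported diff.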
Reproduction
git checkout feat/freak-visual-database # at 4fea4e2 (M9 #152 merge)
cargo test -p webarkitlib-rs --features ffi-backend --test kpm_regression test_full_pipeline_pose
Expected: PASS. Actual: FAILED at the same pose[0][2] element, same diff value, deterministic.
Not caused by M9-2 (#141)
Discovered while validating M9-2 (RustFreakMatcher + DualFreakMatcher). Verified pre-existing by stashing the M9-2 work and re-running on clean feat/freak-visual-database@4fea4e2 — the failure persists with identical diff value (6.134186e-2). M9-2 doesn't modify the C++ pipeline state this test exercises.
CI gap (verified, not speculated)
Reading .github/workflows/ci.yml:
| CI step | Command | Runs kpm_regression with C++ backend? |
| --- | --- | --- |
| build-and-test | cargo test --workspace | ❌ Default features only; no ffi-backend |
| build-and-test | cargo test --workspace --features simd | ❌ Same; no ffi-backend |
| kpm-build | cargo test -p webarkitlib-rs --lib --features dual-mode | ❌ --lib only; skips tests/*.rs files |
| native-example | cargo run --example simple --features log-helpers | ❌ Not a test |
| benchmarks | (benchmarks) | ❌ Not a test |
No CI step runs cargo test --features ffi-backend --test kpm_regression (or the dual-mode equivalent). That's why this slipped through every M9 PR.
Suggested fix
Part 1 — close the CI gap
Add a step to .github/workflows/ci.yml:
  - name: Run kpm_regression integration tests with ffi-backend
    run: cargo test -p webarkitlib-rs --features ffi-backend --test kpm_regression
That alone would have caught this regression at the PR that introduced it.
Part 2 — restore green
The test compares the full 3×4 pose element-wise against a hard-coded baseline. The hardest-failing element is pose[0][2] (the [0][2] entry of the rotation matrix) with a 6.1e-2 deviation. Two paths:
- (A) Regenerate the baseline against the current feat/freak-visual-database C++ pipeline state and update the expected matrix in kpm_regression.rs. The C++ pipeline IS the production baseline going into M9-3, so the test should track its actual output. Effort: ~30 minutes (capture actual via --nocapture, paste into test, document in module docstring). Risk: low.
- (B) Find and fix the actual regression. Bisect over feat/freak-visual-database's M9 merges (b17db5d → 23e4e46 → e32cd93 → 4fea4e2) to identify which PR shifted the pose. Almost certainly one of the BHC/auto-adjust changes; could be a latent bug or just BHC tree-topology variance bleeding into the pose. Effort: a few hours of bisect + investigation.
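For option (A), the capture-and-paste step could be made less error-prone with a small formatter that prints the pose as a ready-to-paste array literal. The sketch below assumes the pose is available as a `[[f64; 4]; 3]`. `pose_as_rust_literal` is a hypothetical helper name, not something that exists in the crate.

```rust
/// Renders a 3x4 pose as a Rust array literal, one row per line.
/// `{:e}` without a precision prints the shortest exponent-form
/// representation that round-trips for f64.
/// Hypothetical helper; the name and the pose type are assumptions.
fn pose_as_rust_literal(pose: &[[f64; 4]; 3]) -> String {
    let rows: Vec<String> = pose
        .iter()
        .map(|row| {
            let cells: Vec<String> = row.iter().map(|v| format!("{:e}", v)).collect();
            format!("    [{}],", cells.join(", "))
        })
        .collect();
    format!("[\n{}\n]", rows.join("\n"))
}
```

Calling `println!("{}", pose_as_rust_literal(&pose))` inside the test and running it with output capture disabled would print a block that can be copied into the expected matrix.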
I'd lean (A) first (gets CI green so subsequent work isn't blocked), then (B) as a follow-up if anyone wants to chase the root cause. The corner-reprojection metric used by M9 #152 / #153 already absorbs the BHC variance for homography parity; we may decide the pose test should adopt a similar tolerance-aware metric (Frobenius rotation distance + Euclidean translation) — but that's option (B) territory.
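As a rough sketch of what a tolerance-aware pose comparison might look like, the code below computes the Frobenius distance between the rotation blocks and the Euclidean distance between the translation columns. It assumes a row-major `[[f64; 4]; 3]` pose laid out as [R | t], with the translation in column 3. The function names and the two separate tolerances are illustrative assumptions, not an existing API in the crate.

```rust
type Pose = [[f64; 4]; 3];

/// Frobenius norm of the difference between the 3x3 rotation blocks
/// (columns 0..3 of each row).
fn rotation_frobenius_distance(a: &Pose, b: &Pose) -> f64 {
    let mut sum = 0.0;
    for r in 0..3 {
        for c in 0..3 {
            let d = a[r][c] - b[r][c];
            sum += d * d;
        }
    }
    sum.sqrt()
}

/// Euclidean distance between the translation vectors (column 3).
fn translation_distance(a: &Pose, b: &Pose) -> f64 {
    (0..3)
        .map(|r| (a[r][3] - b[r][3]).powi(2))
        .sum::<f64>()
        .sqrt()
}

/// True when both distances are within their (hypothetical) tolerances.
fn poses_match(a: &Pose, b: &Pose, rot_tol: f64, trans_tol: f64) -> bool {
    rotation_frobenius_distance(a, b) <= rot_tol
        && translation_distance(a, b) <= trans_tol
}
```

Splitting rotation and translation lets each use a tolerance in its own units: rotation entries are dimensionless, while translation is in scene units. Suitable values would have to be chosen empirically against the observed BHC variance.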
Acceptance
- cargo test -p webarkitlib-rs --features ffi-backend --test kpm_regression passes on feat/freak-visual-database.
- … feat/freak-visual-database and main.
Refs
Discovered during the validation step of #141 (M9-2). Filed separately because it's a pre-existing issue with its own scope.
Supersedes #154 (closed, re-filed here with tighter scope and verified CI-gap evidence).