What changed
The public 2-m16 bundle now points at the March 19 winning deployment candidate:
- Session: combined_20260319_081200_pruned_rest_events_0_1_2
- Run: 20260319_075520
This update replaces the March 18 public bundle with the current winning-model snapshot used by the deployment config and replay tooling.
Architecture changes now reflected on-site
- The public model still uses active-finger decoding, so committed OPEN/CLOSE predictions always carry a real finger.
- A new finger-applicability head now models whether a finger label is meaningful on each window, instead of relying on the active-finger head to solve REST detection by itself.
- The deployment threshold is now tuned and published as threshold_applicability = 0.4.
- The public site now surfaces applicability false-positive / false-negative rates and the deployment pair invariant, not just offline action and finger accuracy.
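To make the interaction between the three heads concrete, here is a minimal sketch of one way an applicability gate could combine them. Only threshold_applicability = 0.4 comes from the published config. The gating rule itself, the label names in ACTIONS and FINGERS, and the decode_window helper are illustrative assumptions, not the deployed implementation.

```python
from dataclasses import dataclass
from typing import Sequence

THRESHOLD_APPLICABILITY = 0.4  # published deployment threshold
ACTIONS = ("REST", "OPEN", "CLOSE")  # assumed action label order
FINGERS = ("thumb", "index", "middle", "ring", "pinky")  # hypothetical finger labels


@dataclass(frozen=True)
class Decision:
    action: str
    finger: str  # "NONE" whenever the action is REST


def _argmax(values: Sequence[float]) -> int:
    return max(range(len(values)), key=values.__getitem__)


def decode_window(
    action_probs: Sequence[float],
    finger_probs: Sequence[float],
    p_applicable: float,
    threshold: float = THRESHOLD_APPLICABILITY,
) -> Decision:
    """Assumed gating rule: a finger is only emitted when the action head
    picks a non-REST action AND the applicability head clears the threshold.
    Every other case collapses to REST + NONE."""
    action = ACTIONS[_argmax(action_probs)]
    if action == "REST" or p_applicable < threshold:
        return Decision("REST", "NONE")
    # The active-finger head has no NONE class, so a committed
    # OPEN/CLOSE always carries a real finger.
    return Decision(action, FINGERS[_argmax(finger_probs)])
```

Under this assumed rule the two forbidden pairs (non-REST with NONE, REST with an active finger) cannot be produced, because the finger is chosen in the same branch that commits the action.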
Updated headline metrics
- Test action accuracy: 89.79%
- Test finger accuracy on non-REST windows: 87.01%
- Primary holdout joint accuracy: 84.66%
- Primary holdout joint accuracy on non-REST windows: 82.55%
- Primary holdout REST TPR / precision: 98.37% / 80.11%
- Action ECE / finger ECE on non-REST: 2.32% / 2.73%
- Holdout applicability FP on true REST: 18.57%
- Holdout applicability FN on true non-REST: 2.26%
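As a reading aid for the two applicability rates, the sketch below shows how they could be computed from per-window ground truth and applicability scores. It assumes that "FP on true REST" means the share of true-REST windows scored at or above the threshold, and "FN on true non-REST" means the share of true non-REST windows scored below it. The function name and input layout are hypothetical.

```python
from typing import Sequence, Tuple


def applicability_error_rates(
    true_actions: Sequence[str],
    p_applicable: Sequence[float],
    threshold: float = 0.4,  # published threshold_applicability
) -> Tuple[float, float]:
    """Return (fp_rate_on_true_rest, fn_rate_on_true_non_rest)."""
    fp = n_rest = fn = n_active = 0
    for action, p in zip(true_actions, p_applicable):
        predicted_applicable = p >= threshold
        if action == "REST":
            n_rest += 1
            fp += predicted_applicable  # finger flagged on a REST window
        else:
            n_active += 1
            fn += not predicted_applicable  # finger suppressed on an active window
    # An empty class yields NaN rather than a misleading 0.
    fp_rate = fp / n_rest if n_rest else float("nan")
    fn_rate = fn / n_active if n_active else float("nan")
    return fp_rate, fn_rate
```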
Most importantly, the published deployment invariants are now clean:
- Committed non-REST + NONE rate: 0.00%
- Committed REST + active-finger rate: 0.00%
- Sent non-REST + NONE rate: 0.00%
- Sent REST + active-finger rate: 0.00%
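The pair invariant can be checked mechanically over any list of committed or sent decisions. The sketch below is an illustration only: the (action, finger) tuple layout, the "NONE" sentinel, and the choice of all decisions as the denominator are assumptions, and pair_invariant_rates is a hypothetical helper rather than part of the replay tooling.

```python
from typing import Dict, Iterable, Tuple


def pair_invariant_rates(decisions: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Fraction of decisions that violate each half of the pair invariant."""
    pairs = list(decisions)
    total = len(pairs)
    if total == 0:
        return {"non_rest_none": 0.0, "rest_active_finger": 0.0}
    # Non-REST action committed without a finger.
    non_rest_none = sum(1 for a, f in pairs if a != "REST" and f == "NONE")
    # REST committed together with a real finger.
    rest_active = sum(1 for a, f in pairs if a == "REST" and f != "NONE")
    return {
        "non_rest_none": non_rest_none / total,
        "rest_active_finger": rest_active / total,
    }
```

A clean bundle would report 0.0 for both keys, matching the four 0.00% rates listed above.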
Updated replay ladder
- Cleaned deployment replay: 86.64% committed joint accuracy, 93.32% would-send precision, 0.12% false REST actuation
- Legacy combined replay: 82.98% committed joint accuracy, 89.62% would-send precision, 1.71% false REST actuation
- March 17 realism replay: 71.96% committed joint accuracy, 62.50% would-send precision, 0.09% false REST actuation
The realism replay remains conservative because applicability recall is still weak on that session, but the zero-tolerance pair invariant holds across all published replay bundles.
Site-wide changes
- The featured run card, results overview, and run detail page now reference the March 19 winning deployment candidate.
- The public figure set was refreshed with the new confusion matrices, calibration figures, and report HTML.
- Applicability diagnostics are now shown directly in the results UI and in the public metrics bundle.
- The older topomap block is no longer part of the featured 2-m16 bundle because regenerated topomap assets were not part of this March 19 winning-model snapshot.