What changed
We reran tuning for the 2-M16 session captured on 2026-02-16, producing a new model run on 2026-02-26. The new run uses updated loss weighting and evaluation settings (see below), and the report now exposes additional diagnostics beyond accuracy.
- Training weights: `loss_action_weight=2.0`, `rest_weight=3.0`, uniform finger weights.
- Split/eval settings: `split_mode=group_trial`, `test_size=0.2`, `seed=42`, thresholds `0.75` for action and finger, no smoothing/hysteresis.
- Report now includes expanded metrics (F1 variants, REST TPR/FPR/precision, and overall finger accuracy).
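
As a rough illustration of the split settings above, here is a minimal sketch of a trial-grouped split. It assumes that `group_trial` means all windows from one trial land on the same side of the split. The function name `group_trial_split` and the `trial_ids` input format are hypothetical and are not taken from the tuning code.

```python
import random
from collections import defaultdict


def group_trial_split(trial_ids, test_size=0.2, seed=42):
    """Return (train_idx, test_idx) so that no trial appears on both sides.

    trial_ids: one trial identifier per window (hypothetical input format).
    test_size: fraction of trials (not windows) held out for testing.
    """
    # Collect the window indices that belong to each trial.
    windows_by_trial = defaultdict(list)
    for idx, trial in enumerate(trial_ids):
        windows_by_trial[trial].append(idx)

    # Shuffle the unique trials deterministically, then hold out a fraction.
    trials = sorted(windows_by_trial)
    random.Random(seed).shuffle(trials)
    n_test = max(1, round(len(trials) * test_size))

    test_idx = sorted(i for t in trials[:n_test] for i in windows_by_trial[t])
    train_idx = sorted(i for t in trials[n_test:] for i in windows_by_trial[t])
    return train_idx, test_idx


# Toy data: 10 hypothetical trials with 10 windows each.
trial_ids = [w // 10 for w in range(100)]
train_idx, test_idx = group_trial_split(trial_ids, test_size=0.2, seed=42)
```

Because whole trials are held out, the share of test windows only approximates `test_size` when trials differ in length.
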
Before vs after (2-M16)
| Metric | Before (2026-02-24) | After (2026-02-26) | Δ |
|---|---|---|---|
| Test action accuracy | 89.71% | 85.88% | -3.82 pp |
| Test finger accuracy on non-REST windows | 90.38% | 87.24% | -3.14 pp |
| Train action accuracy | 91.19% | 88.01% | -3.19 pp |
| Train finger accuracy | 89.41% | 88.30% | -1.11 pp |
| Train avg loss | 0.5020 | 0.8820 | +0.3801 |
| Test windows | 2,040 | 2,026 | -14 |
| Test non-REST windows | 1,871 | 1,857 | -14 |
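
The Δ column can be cross-checked from the rounded values printed in the table. The sketch below uses `Decimal` to avoid floating-point noise; the row labels are shortened for illustration.

```python
from decimal import Decimal

# (metric, before, after) exactly as printed in the table, i.e. already rounded.
rows = [
    ("Test action accuracy", "89.71", "85.88"),
    ("Test finger accuracy (non-REST)", "90.38", "87.24"),
    ("Train action accuracy", "91.19", "88.01"),
    ("Train finger accuracy", "89.41", "88.30"),
    ("Train avg loss", "0.5020", "0.8820"),
]


def delta(before, after):
    """Exact difference between two decimal strings."""
    return Decimal(after) - Decimal(before)


deltas = {name: delta(before, after) for name, before, after in rows}
for name, d in deltas.items():
    print(f"{name}: {d:+}")
```

Recomputed this way, three deltas differ from the table by one unit in the last digit (-3.83 vs -3.82, -3.18 vs -3.19, +0.3800 vs +0.3801). That would be consistent with the report computing Δ from unrounded values, though this is an assumption.
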
Expanded diagnostics (new report)
- Action F1 (macro/weighted): 0.867 / 0.860.
- Finger F1 (non-REST macro/weighted): 0.730 / 0.873.
- Finger accuracy (overall): 79.96%.
- Finger F1 (overall macro/weighted): 0.696 / 0.767.
- REST: TPR 80.47%, FPR 0.05%, Precision 99.27%, F1 0.889.
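
For readers less familiar with these diagnostics, the sketch below shows how the REST figures follow from a binary confusion matrix and how macro and weighted F1 differ. The confusion counts are back-calculated from the reported percentages and window counts (2,026 test windows minus 1,857 non-REST leaves 169 REST windows). They are an assumption, not values read from the report, and the helper names are hypothetical.

```python
def binary_metrics(tp, fn, fp, tn):
    """One-vs-rest metrics for a single class (here: REST)."""
    tpr = tp / (tp + fn)            # recall / true positive rate
    fpr = fp / (fp + tn)            # false positive rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * tpr / (precision + tpr)
    return {"tpr": tpr, "fpr": fpr, "precision": precision, "f1": f1}


def macro_and_weighted_f1(per_class_f1, support):
    """Macro averages classes equally; weighted averages by class support."""
    macro = sum(per_class_f1) / len(per_class_f1)
    weighted = sum(f * n for f, n in zip(per_class_f1, support)) / sum(support)
    return macro, weighted


# Assumed counts, inferred from the reported TPR/FPR/precision (not from the report).
rest = binary_metrics(tp=136, fn=33, fp=1, tn=1856)
print({name: round(value, 4) for name, value in rest.items()})
```

A macro F1 well below the weighted F1, as in the finger metrics above, usually indicates that one or more low-support classes score poorly.
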
Notes
- The tuning refresh lowered test action accuracy by 3.82 pp and test finger accuracy on non-REST windows by 3.14 pp relative to the 2026-02-24 run, so this configuration is currently a step back in headline accuracy.
- We will keep iterating on weighting and calibration to regain accuracy while preserving the REST behavior seen in this run (FPR 0.05%, precision 99.27%) and stability.