2026-02-27

1-M16 tuning refresh (2026-02-27)

We published a newly tuned 1-M16 model, compared it against the 2026-02-21 version, and expanded the report's diagnostics.

Historical note: archived update posts preserve the figures published at that time. For the current verified run bundles, use the results page.

What changed

We reran tuning for the 1-M16 session captured on 2026-02-16, producing a new model run on 2026-02-27. The new run uses updated loss weighting and evaluation settings (see below), and the report now exposes additional diagnostics beyond accuracy.

  • Training weights: loss_action_weight=2.0, rest_weight=3.0, uniform finger weights.
  • Split/eval settings: split_mode=group_trial, test_size=0.2, seed=42, a threshold of 0.75 for both action and finger, and no smoothing or hysteresis.
  • Report now includes expanded metrics (F1 variants, REST TPR/FPR/precision, and overall finger accuracy).
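To make the split settings concrete, here is a minimal sketch of a trial-grouped split. It assumes split_mode=group_trial means that all windows from one trial stay on the same side of the split, and that test_size is applied to trials rather than to windows. The EvalConfig class, the group_trial_split helper, and the toy trial IDs are hypothetical illustrations, not the project's actual code.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalConfig:
    # Values copied from the settings listed above; the class itself is hypothetical.
    split_mode: str = "group_trial"
    test_size: float = 0.2
    seed: int = 42
    action_threshold: float = 0.75
    finger_threshold: float = 0.75


def group_trial_split(trial_ids, test_size=0.2, seed=42):
    """Split window indices so every window of a trial lands on one side.

    Assumption: test_size is the fraction of trials (not windows) held out.
    """
    unique_trials = sorted(set(trial_ids))
    rng = random.Random(seed)  # seeded so the split is reproducible
    rng.shuffle(unique_trials)
    n_test = max(1, round(len(unique_trials) * test_size))
    test_trials = set(unique_trials[:n_test])
    train_idx = [i for i, t in enumerate(trial_ids) if t not in test_trials]
    test_idx = [i for i, t in enumerate(trial_ids) if t in test_trials]
    return train_idx, test_idx


cfg = EvalConfig()
# Toy data: 10 trials with 5 windows each (made up for illustration only).
trial_ids = [f"trial_{i // 5}" for i in range(50)]
train_idx, test_idx = group_trial_split(trial_ids, cfg.test_size, cfg.seed)
```

Under these assumptions, the number of test windows depends on which trials are drawn, so it need not be exactly 20% of all windows.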

Before vs after (1-M16)

Metric                                    Before (2026-02-21)   After (2026-02-27)   Δ
Test action accuracy                      77.38%                79.02%               +1.64 pp
Test finger accuracy on non-REST windows  86.27%                82.92%               -3.34 pp
Train action accuracy                     83.36%                81.07%               -2.29 pp
Train finger accuracy                     85.21%                84.92%               -0.29 pp
Train avg loss                            0.7421                1.1787               +0.4366
Test windows                              2,648                 2,684                +36
Test non-REST windows                     2,534                 2,565                +31
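As a quick arithmetic check, the Δ column can be recomputed from the Before and After values. The sketch below uses only the rounded figures from the table. Recomputing from rounded values gives -3.35 pp for test finger accuracy, while the table lists -3.34 pp; we assume the published delta was computed from unrounded values. The implied REST window counts are derived as test windows minus non-REST windows.

```python
from decimal import Decimal

# (metric, before, after) copied from the table; strings keep the decimals exact.
rows = [
    ("Test action accuracy", "77.38", "79.02"),
    ("Test finger accuracy (non-REST)", "86.27", "82.92"),
    ("Train action accuracy", "83.36", "81.07"),
    ("Train finger accuracy", "85.21", "84.92"),
    ("Train avg loss", "0.7421", "1.1787"),
    ("Test windows", "2648", "2684"),
    ("Test non-REST windows", "2534", "2565"),
]

deltas = {name: Decimal(after) - Decimal(before) for name, before, after in rows}
for name, delta in deltas.items():
    print(f"{name}: {delta:+}")

# Implied REST windows in the test split: all test windows minus non-REST windows.
rest_before = 2648 - 2534
rest_after = 2684 - 2565
print(f"REST test windows: before={rest_before}, after={rest_after}")
```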

Key takeaway: The 2026-02-27 tuning run improved test action accuracy by 1.64 pp but reduced test finger accuracy on non-REST windows by 3.34 pp, so the overall tradeoff is mixed and needs further iteration.

Expanded diagnostics (new report)

  • Action F1 (macro/weighted): 0.790 / 0.790.
  • Finger F1 (non-REST macro/weighted): 0.818 / 0.829.
  • Finger accuracy (overall): 79.25%.
  • Finger F1 (overall macro/weighted): 0.669 / 0.773.
  • REST: TPR 65.55%, FPR 0.04%, Precision 98.73%, F1 0.788.
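For readers less familiar with the REST metrics, the sketch below shows how TPR, FPR, precision, and F1 follow from a binary REST vs non-REST confusion matrix. The After column of the comparison table lists 2,684 test windows, of which 2,565 are non-REST, leaving 119 REST windows. The individual counts (tp=78, fp=1) are not taken from the report; they are inferred values chosen because they reproduce the published percentages, and the rest_metrics helper is hypothetical.

```python
def rest_metrics(tp, fp, fn, tn):
    """Binary metrics with REST as the positive class."""
    tpr = tp / (tp + fn)        # share of REST windows detected as REST
    fpr = fp / (fp + tn)        # share of non-REST windows flagged as REST
    precision = tp / (tp + fp)  # share of REST predictions that are correct
    f1 = 2 * precision * tpr / (precision + tpr)
    return tpr, fpr, precision, f1


# Inferred counts (assumption): 78 + 41 = 119 REST, 1 + 2564 = 2,565 non-REST.
tpr, fpr, precision, f1 = rest_metrics(tp=78, fp=1, fn=41, tn=2564)
print(f"TPR {tpr:.2%}, FPR {fpr:.2%}, Precision {precision:.2%}, F1 {f1:.3f}")
```

Read this way, the REST numbers describe a conservative detector: almost every REST prediction is correct, but roughly a third of true REST windows are missed.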

Notes

  • The new weighting configuration improves action accuracy but may be hurting finger separation on active (non-REST) windows.
  • We will continue tuning weights and calibration to balance action/REST separation with finger accuracy on non-REST windows.

Links