Selection logic
Why this model won
The current report separates two model roles: April 3 remains the stronger historical offline checkpoint, while March 19 stays public because its deployment replay performs better on precision, event hits, and latency, and carries lower rest-period actuation risk.
| Metric | March 19 | April 3 | Selection interpretation |
|---|---|---|---|
| Holdout action accuracy | 89.79% | 91.83% · offline winner | April 3 improves fixed-split action-head accuracy, so it remains the stronger offline checkpoint for action-state separation. |
| Holdout joint accuracy | 84.66% | 86.66% · offline winner | April 3 also improves paired action-plus-finger correctness on the holdout, which is useful for model-development tracking before actuation risk is considered. |
| Event-level joint accuracy | 87.60% | 93.39% · offline winner | April 3 stays ahead after windows are majority-voted within each event, showing that the offline gain is not just single-window variance. |
| Holdout rest TPR | 98.37% · safety winner | 94.79% | March 19 preserves more true rest windows at the action head, reducing the chance that idle periods enter downstream actuation gates. |
| Pseudo-live would-send precision | 95.37% · current path | 80.06% · April 24 audit | The current March 19 bundle is substantially more reliable among gated command windows. The April 3 number is retained as historical rollback context, not a same-actuation-path comparison. |
| Pseudo-live send coverage | 31.56% · throughput | 36.49% · April 24 audit | This is the share of true movement windows that pass the send gate, so it measures command throughput rather than classification accuracy. |
| False rest actuation | 0.25% · current path | 6.74% · April 24 audit | The current path produced 6 movement commands over 2,404 true REST windows. That rest-period failure mode remains the main safety constraint for the public robot-hand claim. |
| Pseudo-live committed joint | 86.42% · current path | 86.04% · April 24 audit | After smoothing, applicability checks, stability, and per-finger hold logic are applied, March 19 keeps the deployment score near its held-out decoding ceiling. |
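
The gate-level rows in the table can be read as simple ratios over per-window records. The sketch below is illustrative only: the `Window` record, its field names, the label strings, and the toy data are assumptions made for this example, and the definitions are inferred from the row descriptions rather than taken from the evaluation code. The only figures drawn from the report are the 6 movement commands over 2,404 true REST windows.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    # Hypothetical per-window record; field names are assumptions.
    event_id: int     # groups windows that belong to the same event
    true_label: str   # "REST" or a movement label
    pred_label: str   # decoder output for this window
    sent: bool        # True if the window passed the send gate


def false_rest_actuation(windows):
    """Share of true REST windows that still produced a movement command."""
    rest = [w for w in windows if w.true_label == "REST"]
    return sum(w.sent for w in rest) / len(rest)


def send_precision(windows):
    """Among gated (sent) windows, share whose prediction matches the truth."""
    sent = [w for w in windows if w.sent]
    return sum(w.pred_label == w.true_label for w in sent) / len(sent)


def send_coverage(windows):
    """Share of true movement windows that pass the send gate."""
    moves = [w for w in windows if w.true_label != "REST"]
    return sum(w.sent for w in moves) / len(moves)


def event_level_accuracy(windows):
    """Majority-vote the predictions within each event, then score per event."""
    votes, truth = {}, {}
    for w in windows:
        votes.setdefault(w.event_id, Counter())[w.pred_label] += 1
        truth[w.event_id] = w.true_label
    hits = sum(votes[e].most_common(1)[0][0] == truth[e] for e in votes)
    return hits / len(votes)


# Toy data (invented for illustration, not taken from the report).
windows = [
    Window(0, "REST", "REST", False),
    Window(0, "REST", "REST", False),
    Window(0, "REST", "INDEX", True),    # a false rest actuation
    Window(1, "INDEX", "INDEX", True),
    Window(1, "INDEX", "INDEX", False),
    Window(1, "INDEX", "THUMB", False),
    Window(2, "THUMB", "THUMB", True),
    Window(2, "THUMB", "INDEX", False),
    Window(2, "THUMB", "THUMB", False),
    Window(3, "INDEX", "THUMB", False),
    Window(3, "INDEX", "THUMB", False),
    Window(3, "INDEX", "INDEX", False),
]

# Reported counts for the current path: 6 commands over 2,404 REST windows.
print(f"{6 / 2404:.2%}")  # prints 0.25%
```

Keeping precision and coverage as separate functions mirrors the table's distinction: a stricter gate can raise would-send precision while lowering send coverage, so the two figures should be read together rather than merged into a single score.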