Overview
After the 2-M16 recording on April 28, 2026, we found two separate issues:
- The event file for the new session had an invalid first mark that became REST-like data.
- Even after repairing those events, the live-control model still failed to recognize the intended movements from that sitting.
The corrected event sequence is now the intended protocol:
- thumb close
- thumb open
- index close
- index open
- middle close
- middle open
- ring close
- ring open
- pinky close
- pinky open
That repair was necessary, but it did not explain the live-control failure by itself.
Key Finding
The repaired session was replayed through the live-inference path as if it were streaming. On the 291 corrected movement windows, the deployed March 19 model chose REST as the raw top action for every window.
| Raw action chosen by the model | Corrected movement windows |
|---|---|
| REST | 291 of 291 |
| OPEN | 0 of 291 |
| CLOSE | 0 of 291 |
This means the failure happened before the robot-hand actuation logic. Cooldowns, stability checks, and actuation thresholds could not recover the movement because the model's action head was already classifying the signal as REST.
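The check above can be sketched as a raw-action tally over the replay windows, counting the model's top action before any actuation logic runs. This is a minimal illustration, not the project's actual replay harness; the `model.predict` interface and score dictionary are assumptions.

```python
from collections import Counter

def tally_raw_actions(model, windows):
    """Replay movement windows through the inference path and count the
    model's raw top action per window, before cooldowns, stability
    checks, or actuation thresholds are applied."""
    counts = Counter()
    for window in windows:
        # assumed interface: predict returns per-action scores,
        # e.g. {"REST": 0.91, "OPEN": 0.05, "CLOSE": 0.04}
        scores = model.predict(window)
        counts[max(scores, key=scores.get)] += 1
    return counts
```

A result like the one observed, 291 of 291 windows resolving to REST, shows the signal never survived the action head, regardless of downstream thresholds.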
Same-Sitting Replay Results
The full corrected sitting contains 2,226 replay windows:
| Window source | Count | Share |
|---|---|---|
| REST / non-event time | 1,935 | 86.93% |
| OPEN events | 145 | 6.51% |
| CLOSE events | 146 | 6.56% |
Several model variants were tested. None should replace the current deployment model.
| Replay candidate | Training / role | Movement recall | Movement precision | False REST actuation | Reviewer takeaway |
|---|---|---|---|---|---|
| March 19 deployment | Current public baseline | 0.00% | 0.00% | 0.155% | Safe, but missed every corrected movement event. |
| April 4 archived | Earlier conservative reference | 0.00% | 0.00% | 0.052% | Same failure pattern as the deployed model. |
| March + April adaptation | Old corpus plus April 28 session | 0.00% | n/a | 0.000% | Offline metrics improved, but live-style movement recall did not. |
| April-only aggressive | April 28 session, lower REST weight | 17.18% | 15.72% | 12.92% | Movement appeared, but false actuation was far too high. |
| April-only conservative | April 28 session, moderate REST weight | 11.34% | 22.45% | 5.37% | Still unsafe and still missed most movement. |
The important result is not simply that one model underperformed. Conservative models stayed at REST, while session-only tuning produced too much false actuation. That combination points to a data-quality and robustness problem rather than a simple threshold adjustment.
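For reference, the three replay metrics in the table can be computed as below. The exact definitions used by the evaluation pipeline are not stated in this report, so these are assumed definitions: recall over labeled movement windows, precision over non-REST emissions, and false REST actuation as non-REST emissions during labeled REST windows.

```python
def replay_metrics(predictions, labels):
    """Assumed metric definitions for pseudo-live replay:
    - movement recall: share of labeled movement windows where the
      model emitted a non-REST action
    - movement precision: share of non-REST emissions that landed on
      labeled movement windows
    - false REST actuation: share of labeled REST windows where the
      model nonetheless emitted a non-REST action
    `labels` contain "MOVE" or "REST"; `predictions` are raw top actions."""
    move_hits = sum(1 for p, y in zip(predictions, labels)
                    if y == "MOVE" and p != "REST")
    move_total = sum(1 for y in labels if y == "MOVE")
    nonrest = sum(1 for p in predictions if p != "REST")
    rest_false = sum(1 for p, y in zip(predictions, labels)
                     if y == "REST" and p != "REST")
    rest_total = sum(1 for y in labels if y == "REST")
    recall = move_hits / move_total if move_total else 0.0
    precision = move_hits / nonrest if nonrest else 0.0
    false_rest = rest_false / rest_total if rest_total else 0.0
    return recall, precision, false_rest
```

Under these definitions, a model that never leaves REST scores 0% recall and 0% false actuation at the same time, which is exactly the "safe but useless" pattern of the two conservative models.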
Signal Quality Evidence
The latest live input distribution was compared to the offline reference distribution used by the deployed model. The report now flags the run as `shifted_low_amplitude`.
| Muse channel | Live RMS ratio vs reference | Interpretation |
|---|---|---|
| TP9 | 0.460 | Much quieter than expected |
| AF7 | 0.894 | Slightly quieter |
| AF8 | 1.016 | In range |
| TP10 | 0.718 | Quieter than expected |
Two channels were materially quieter than expected, with TP9 especially low. This is consistent with the model preferring REST during intended movement. The likely causes are headset contact, placement, or session-to-session amplitude shift.
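The per-channel comparison can be sketched as an RMS ratio against the reference distribution. The 0.8 cutoff below is an illustrative placeholder, not the report's actual flag threshold.

```python
import numpy as np

CHANNELS = ["TP9", "AF7", "AF8", "TP10"]
LOW_RATIO = 0.8  # assumed cutoff; the real threshold may differ

def rms_ratios(live, reference):
    """Per-channel RMS of the live run divided by the offline reference
    RMS, for arrays shaped (channels, samples). Ratios well below 1.0
    mean the live signal is quieter than the distribution the deployed
    model was trained against."""
    live_rms = np.sqrt(np.mean(np.square(live), axis=1))
    ref_rms = np.sqrt(np.mean(np.square(reference), axis=1))
    ratios = dict(zip(CHANNELS, live_rms / ref_rms))
    flagged = [ch for ch, r in ratios.items() if r < LOW_RATIO]
    return ratios, flagged
```

With the ratios in the table above, TP9 (0.460) and TP10 (0.718) would both be flagged, matching the `shifted_low_amplitude` verdict.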
We also checked whether the channels had been silently reordered. Replaying under all channel permutations did not recover usable movement recall, so the evidence does not support channel ordering as the main cause.
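That permutation check amounts to a small brute-force search: with four channels there are only 24 orderings to try. The sketch below assumes a `model.predict` that returns a single action label for a `(channels, samples)` window; both are illustrative, not the project's real interface.

```python
from itertools import permutations

import numpy as np

def best_permutation_recall(model, windows, labels, n_channels=4):
    """Replay labeled windows under every channel permutation and keep
    the best movement recall found. If no ordering recovers recall,
    a hidden channel swap is unlikely to be the cause."""
    best_recall, best_perm = 0.0, tuple(range(n_channels))
    for perm in permutations(range(n_channels)):
        hits = total = 0
        for window, y in zip(windows, labels):
            if y != "MOVE":
                continue
            total += 1
            if model.predict(window[list(perm), :]) != "REST":
                hits += 1
        recall = hits / total if total else 0.0
        if recall > best_recall:
            best_recall, best_perm = recall, perm
    return best_recall, best_perm
```

An exhaustive search that still returns near-zero recall, as happened here, rules the hypothesis out more convincingly than spot-checking a few likely swaps.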
Engineering Fixes
Several safeguards were added during this investigation:
- Step 1 now rejects invalid open/close marks when no finger is selected, instead of allowing them to become REST-like events.
- The UI launch path now gives clearer warnings around keyboard capture and actuation configuration.
- Live distribution analysis now catches channel-local low-amplitude problems, not only broad signal collapse.
- Live preflight output now includes per-channel RMS ratios for faster diagnosis.
- Pseudo-live replay and archived-model evaluation now handle model-local temperature files more reliably.
- Dataset merging now tolerates expected scalar metadata differences.
These changes make future failures easier to detect and prevent, but they do not make the April 28 low-count session sufficient for a new deployment model.
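The first safeguard, rejecting open/close marks with no finger selected, can be sketched as a simple guard at mark-recording time. Names and the exception choice here are hypothetical; the point is that the invalid mark fails loudly instead of degrading into a REST-like event.

```python
FINGERS = {"thumb", "index", "middle", "ring", "pinky"}
ACTIONS = {"open", "close"}

def validate_mark(finger, action):
    """Reject an open/close mark unless a finger is actually selected.
    Previously such marks were silently accepted and later surfaced as
    REST-like events in the session's event file."""
    if action in ACTIONS and finger not in FINGERS:
        raise ValueError(f"mark '{action}' recorded with no finger selected")
    return finger, action
```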
Decision
No newly trained April 28 model is being promoted.
The current March 19 deployment model remains the conservative baseline. This investigation adds a stricter next requirement: before live robot-hand control, the system needs a short same-sitting calibration and preflight check that confirms both signal quality and movement recall.
The next model should only be accepted if it passes both conditions:
- It sends meaningful non-REST actuation during same-sitting pseudo-live replay.
- It keeps false REST actuation low during rest-by-exclusion replay.
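The two acceptance conditions can be expressed as a single promotion gate. The numeric thresholds below are illustrative placeholders, not agreed acceptance numbers, which the report does not specify.

```python
def accept_candidate(movement_recall, false_rest_actuation,
                     min_recall=0.5, max_false_rest=0.01):
    """Two-condition promotion gate: the candidate must show meaningful
    non-REST actuation on same-sitting pseudo-live replay AND keep
    false REST actuation low on rest-by-exclusion replay. Thresholds
    are placeholders for illustration."""
    return (movement_recall >= min_recall
            and false_rest_actuation <= max_false_rest)
```

Under any reasonable settings, both failure modes seen in this investigation are rejected: the conservative models fail the recall condition, and the April-only aggressive variant fails the false-actuation condition.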
This is a useful negative result. It separates an event-recording bug from a deeper live-signal robustness problem, and it gives a concrete path for the next training session.