Featured run

Subject 2-M16 — Deployment Run 2026-03-19

Current source: combined_20260319_081200_pruned_rest_events_0_1_2 · 20260319_075520

Action accuracy: 89.79%

Finger accuracy on non-rest windows: 85.96%

Event-level joint accuracy: 87.60%

How metrics are labeled

Saved test split

89.79% / 85.96%

Action and finger accuracy

Action accuracy is reported on all test windows; finger accuracy is reported on non-rest windows.

2,301 test windows; 1,994 non-rest
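
The two scopes above can be sketched in a few lines. This is a minimal illustration, not the project's evaluation code: the function names, the `"REST"` action label, and the finger names in the toy data are assumptions made for the example.

```python
REST = "REST"  # assumed label for rest windows


def action_accuracy(true_actions, pred_actions):
    """Fraction of ALL windows whose predicted action matches the label."""
    if not true_actions:
        raise ValueError("no windows to score")
    correct = sum(t == p for t, p in zip(true_actions, pred_actions))
    return correct / len(true_actions)


def finger_accuracy_non_rest(true_actions, true_fingers, pred_fingers):
    """Finger accuracy restricted to windows whose true action is not REST."""
    pairs = [
        (tf, pf)
        for ta, tf, pf in zip(true_actions, true_fingers, pred_fingers)
        if ta != REST
    ]
    if not pairs:
        raise ValueError("no non-rest windows to score")
    return sum(tf == pf for tf, pf in pairs) / len(pairs)


# Toy data (hypothetical): four windows, one of them rest.
true_actions = ["REST", "OPEN", "CLOSE", "OPEN"]
pred_actions = ["REST", "OPEN", "OPEN", "OPEN"]
true_fingers = ["NONE", "index", "thumb", "index"]
pred_fingers = ["index", "index", "thumb", "middle"]

action_acc = action_accuracy(true_actions, pred_actions)
finger_acc = finger_accuracy_non_rest(true_actions, true_fingers, pred_fingers)
print(f"action accuracy: {action_acc:.2%}")
print(f"finger accuracy (non-rest): {finger_acc:.2%}")
```

Because the denominators differ (all windows versus non-rest windows only), the two percentages are not directly comparable window for window.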

Primary holdout

84.66%

Joint accuracy with rest and applicability diagnostics

Adds rest and applicability diagnostics beyond the saved test split.

Deployment pair invariant passed; 2.26% applicability FN on true non-rest

Event-level holdout

92.56% / 87.60%

Action and joint accuracy by event

Majority vote over all held-out windows belonging to the same session event.

121 total events; 118 non-rest events
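
Event-level voting can be illustrated as follows. This is a sketch under stated assumptions: predictions are treated as `(action, finger)` tuples and voted on jointly, ties fall to the first label seen, and all names and toy values are hypothetical rather than taken from the published pipeline.

```python
from collections import Counter, defaultdict


def majority_vote_by_event(event_ids, window_preds):
    """Collapse per-window predictions into one prediction per event."""
    votes = defaultdict(Counter)
    for event_id, pred in zip(event_ids, window_preds):
        votes[event_id][pred] += 1
    # most_common(1) returns the most frequent prediction for the event.
    return {e: counts.most_common(1)[0][0] for e, counts in votes.items()}


def event_accuracies(event_truth, event_preds):
    """Return (action accuracy, joint accuracy) over events."""
    n = len(event_truth)
    if n == 0:
        raise ValueError("no events to score")
    action_hits = sum(event_preds[e][0] == truth[0] for e, truth in event_truth.items())
    joint_hits = sum(event_preds[e] == truth for e, truth in event_truth.items())
    return action_hits / n, joint_hits / n


# Toy data (hypothetical): two events with three held-out windows each.
event_ids = ["e1", "e1", "e1", "e2", "e2", "e2"]
window_preds = [
    ("OPEN", "index"), ("OPEN", "index"), ("CLOSE", "index"),
    ("CLOSE", "thumb"), ("CLOSE", "middle"), ("CLOSE", "middle"),
]
event_truth = {"e1": ("OPEN", "index"), "e2": ("CLOSE", "thumb")}

voted = majority_vote_by_event(event_ids, window_preds)
action_acc, joint_acc = event_accuracies(event_truth, voted)
print(voted, action_acc, joint_acc)
```

In the toy data both events get the right action, but the second event votes for the wrong finger, so joint accuracy is lower than action accuracy, mirroring the gap between the two reported figures.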

Chronological replay

84.30%

Core full-session replay

Replay across the two core movement sessions on a longer chronological trace.

95.99% REST TPR; 18.07% applicability FP on true REST

Pseudo-live replay

86.42%

Current per-finger decision path

Replay through the current decision path, including precision, event hit rate, and false REST actuation.

0.25% false REST actuation; 95.37% would-send precision; 31.56% throughput coverage

Harder replay session

71.96%

March 17 realism session

A harder pseudo-live check that exposes where applicability recall is still weak.

52.98% applicability FN on true non-rest

Selection context

The March 19 checkpoint now ships with per-finger actuation defaults.

The trained checkpoint is still 20260319_075520. The public deployment bundle has been updated around it: same-finger holds prevent command chatter, other fingers can still actuate immediately, and the public metrics now report throughput separately from accuracy.

95.37%

Would-send precision

Among non-rest windows that passed the current per-finger actuation gate, 95.37% carried the correct movement command.

0.25%

False REST actuation

Only 6 of 2,404 true REST replay windows produced a movement command under the current defaults.

91.11%

Event hit rate

The replay hit 543 of 596 movement events at least once, which is a better responsiveness summary than window-level send coverage.
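
The two count-based figures above can be reproduced directly from the stated counts, and would-send precision can be sketched on toy data. The helper names and the `(true, predicted, sent)` window layout are assumptions for illustration; the underlying counts for the 95.37% precision figure are not published here, so that part uses made-up windows.

```python
def rate_percent(numerator, denominator):
    """Percentage rounded to two decimals, as shown on the cards."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return round(100.0 * numerator / denominator, 2)


def would_send_precision(windows):
    """Among non-rest windows that passed the gate, fraction with the correct command.

    Each window is a (true_command, predicted_command, sent) tuple (assumed layout).
    """
    sent_non_rest = [(t, p) for t, p, sent in windows if sent and t != "REST"]
    if not sent_non_rest:
        raise ValueError("no non-rest windows passed the gate")
    return sum(t == p for t, p in sent_non_rest) / len(sent_non_rest)


# Counts stated on this page.
false_rest_pct = rate_percent(6, 2404)   # true REST windows that produced a movement command
event_hit_pct = rate_percent(543, 596)   # movement events hit at least once
print(false_rest_pct, event_hit_pct)

# Toy windows (hypothetical) for the precision definition.
toy_windows = [
    ("OPEN", "OPEN", True),
    ("CLOSE", "OPEN", True),
    ("OPEN", "OPEN", False),   # suppressed by the gate, so not counted
    ("REST", "OPEN", True),    # true rest, excluded from this metric
    ("CLOSE", "CLOSE", True),
]
toy_precision = would_send_precision(toy_windows)
print(f"{toy_precision:.2%}")
```

Precision only scores windows that were actually sent, which is why it is reported separately from throughput coverage.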

2,595 configs

Postprocess ablation

The March 16, 2026 website update documents a 2,595-config ablation over thresholds, smoothing, hysteresis, adjacency, and finger-mode settings.

How the featured run was chosen

  • The public bundle still uses the March 19 checkpoint because it combines strong held-out decoding with the safest validated deployment behavior.
  • The May 2026 replay uses the corrected per-finger hold semantics: a finger holds its last command briefly, but other fingers are not globally blocked.
  • The old 10.57% would-send recall was a global-gate throughput number. It should not be read as a 10.57% accuracy ceiling.
  • A future replacement should beat the current bundle on precision, event hit rate, latency, and false REST actuation, not only on offline holdout accuracy.
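
The per-finger hold semantics described in the list above can be sketched as a small gate. This is an illustrative reading, not the shipped decision path: the class name, the hold length of three windows, and the finger names are all hypothetical.

```python
class PerFingerHoldGate:
    """Suppress rapid command changes per finger without blocking other fingers."""

    def __init__(self, hold_windows=3):
        # hold_windows is a hypothetical default, not the deployed value.
        self.hold_windows = hold_windows
        self._last_sent = {}  # finger -> window index of its last sent command

    def step(self, window_index, finger, command):
        """Return (finger, command) if it should be sent, else None."""
        last = self._last_sent.get(finger)
        if last is not None and window_index - last < self.hold_windows:
            # This finger is still holding its last command; drop the new one.
            return None
        self._last_sent[finger] = window_index
        return (finger, command)


gate = PerFingerHoldGate(hold_windows=3)
first = gate.step(0, "index", "OPEN")       # sent
chatter = gate.step(1, "index", "CLOSE")    # suppressed: index is holding
other = gate.step(1, "thumb", "OPEN")       # sent: thumb has its own hold state
after_hold = gate.step(3, "index", "CLOSE") # sent: index hold has expired
print(first, chatter, other, after_hold)
```

A global gate would keep a single timestamp shared by all fingers, so the thumb command at window 1 would also be dropped. Keeping one timestamp per finger is what lets other fingers actuate immediately.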

Training snapshot

Architecture

CNNLSTMFingerActionNet

The March 19 checkpoint combines action decoding, active-finger decoding, and a dedicated finger-applicability head.

Optimization

60 epochs · batch 64 · lr 0.001 · seed 43

These values come from the restored March 19 deployment metrics bundle.

Split policy

group_trial · test_size 0.2 · calibration_size 0.1

The holdout bundle stays tied to a fixed split while calibration is separated from the main train/test partition.
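
A grouped split of this kind can be sketched as below. It assumes that `group_trial` means whole trials are assigned to exactly one partition, that both sizes are fractions of the total number of trials, and that the training seed is reused for the split; none of these details are confirmed by the bundle, and the function name is hypothetical.

```python
import random


def group_split(groups, test_size=0.2, calibration_size=0.1, seed=43):
    """Assign window indices to train/calibration/test so no group spans two parts."""
    unique = sorted(set(groups))
    rng = random.Random(seed)
    rng.shuffle(unique)

    n_test = max(1, round(test_size * len(unique)))
    n_cal = max(1, round(calibration_size * len(unique)))
    test_groups = set(unique[:n_test])
    cal_groups = set(unique[n_test:n_test + n_cal])

    split = {"train": [], "calibration": [], "test": []}
    for index, group in enumerate(groups):
        if group in test_groups:
            split["test"].append(index)
        elif group in cal_groups:
            split["calibration"].append(index)
        else:
            split["train"].append(index)
    return split


# Toy data (hypothetical): 20 trials with 5 windows each.
groups = [f"trial_{i // 5}" for i in range(100)]
split = group_split(groups)
print({name: len(indices) for name, indices in split.items()})
```

Splitting by trial rather than by window keeps overlapping or adjacent windows from the same trial out of both train and test, which would otherwise inflate holdout accuracy.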

Input + preprocessing

64 x 4 windows · center_detrend

Per-window centering and detrending are frozen into the reference run's preprocessing and normalizer config.
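
One plausible reading of `center_detrend` is sketched below: each channel of a window has its mean and its least-squares linear trend removed. The interpretation of 64 x 4 as 64 samples by 4 channels, and the exact detrending method, are assumptions; the frozen preprocessing config may differ.

```python
def center_detrend(window):
    """Remove the per-channel mean and linear trend from one window.

    window: list of samples, each a list of channel values (n_samples x n_channels).
    """
    n = len(window)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    n_channels = len(window[0])
    t_mean = (n - 1) / 2
    t_var = sum((t - t_mean) ** 2 for t in range(n))

    out = [[0.0] * n_channels for _ in range(n)]
    for c in range(n_channels):
        column = [row[c] for row in window]
        mean = sum(column) / n
        # Least-squares slope of the channel against the sample index.
        slope = sum((t - t_mean) * (x - mean) for t, x in enumerate(column)) / t_var
        for t, x in enumerate(column):
            out[t][c] = x - mean - slope * (t - t_mean)
    return out


# Toy window (hypothetical): 64 samples x 4 channels of pure offsets and ramps.
toy_window = [[2.0 * t + 5.0, -1.0 * t + 3.0, 7.0, 0.5 * t] for t in range(64)]
cleaned = center_detrend(toy_window)
max_residual = max(abs(v) for row in cleaned for v in row)
print(max_residual)
```

Because the toy channels contain only an offset and a ramp, everything is removed and the residual is zero up to floating-point error; real EEG windows would retain their oscillatory content.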

Interactive latent space

Curated PCA views

Each point is one EEG window projected from the learned latent representation into a three-component PCA view. PCA is shown here because it preserves a linear comparison across the full dataset, the training split, and the held-out test split.

The train/test pair stays colored by true finger so the geometry is directly comparable across splits. Correctness and deployment-gating views are available on the visualization page.

PCA · Full dataset

Full-dataset PCA colored by true finger

Each point is one EEG window, projected into three principal components and colored by the labeled finger.

Well-separated regions suggest that the learned representation organizes windows by finger, while overlapping regions mark windows that are similar or harder to tell apart.

Train/test comparison at a glance

Full dataset: 12,447 windows. Train split: 10,146 windows. Held-out test split: 2,301 windows. In the pair below, check whether the held-out geometry still resembles the training structure under the same color coding.

Split comparison

Same projection family, same label colors, different split membership.

PCA · Train split

Train-split PCA colored by true finger

This restricts the PCA view to the 10,146 training windows while keeping the same true-finger coloring used in the full-dataset view.

Keeping train and test on the same coloring makes split-to-split geometry easier to compare directly.

PCA · Test split

Test-split PCA colored by true finger

This shows the 2,301 held-out test windows only, using the same true-finger coloring so the class structure can be compared against the training split.

If the held-out view preserves similar neighborhoods rather than collapsing, the representation is carrying structure beyond the fitting set.

Figures

These figures show error structure and confidence behavior for the featured run.

Action confusion matrix for 2-m16

Action confusion matrix

REST, OPEN, and CLOSE confusion for the featured bundle.

Finger confusion matrix for 2-m16

Finger confusion matrix (non-rest)

Finger-level confusion after rest windows are removed from the task.

Calibration figure for 2-m16

Evaluation summary

Confusion matrices and reliability diagrams for the featured bundle.

Uncertainty scatter for 2-m16

Confidence and uncertainty scatter

Confidence spread for the featured bundle.

Topomaps and signal evidence

These figures add signal-distribution context for the decoder metrics above and help explain where the model separates movement states.

Action alpha rest-delta topomap

Rest-relative alpha maps for REST, OPEN, and CLOSE in the March 19 winning session. OPEN and CLOSE both show the dominant TP10 decrease and smaller TP9 increase that characterize the current 2-M16 action story.

Finger alpha rest-delta topomap with NONE reference

Finger-level rest-delta maps, including NONE as the explicit rest reference. The strongest variation remains concentrated on TP10 and TP9, which helps explain why lateral Muse 2 channels carry most of the finger-separation load.

Published runs

The table keeps the earlier 1-M16 bundle visible as a methods baseline while separating it from the current 2-M16 deployment-facing stack.

1-m16-500 · March 5, 2026 · Historical baseline

Action accuracy: 83.94%

Finger accuracy: 80.61%

Test windows: 2,652

Further Reading

Next steps

  • The current public evidence spans saved test splits, holdout audits, event-level summaries, and offline replay; the next expansion is synchronized live-session validation with prediction logs, hardware traces, and video review.
  • The public corpus currently includes two subject bundles. Broader subject coverage and repeated same-subject sessions are the path toward stronger robustness claims.
  • Pseudo-live replay already exercises the saved Step 7 decision path offline. We have also seen promising preliminary live-inference results under the same preflight, logging, and actuation-gating rules; the next milestone is continuous closed-loop validation.