Selection logic
Why this model won
- The March 19 checkpoint replaced the March 18 deployment candidate once the cleaned training corpus, the explicit finger-applicability head, and the refreshed replay bundle were all better aligned with one another than in the previous public snapshot.
- Selection favored the combination of strong holdout metrics, stronger replay behavior on the cleaned deployment corpus, and zero committed or sent invalid action-finger pairs across the published holdout and replay bundles.
- The model was chosen because it behaved coherently across saved split metrics, chronological replay, and pseudo-live replay, not because it won on one leaderboard number.
- The harder March 17 realism replay is still conservative on applicability recall, but it remains part of the public selection story because it shows where the deployment stack is still weak.
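
The selection rule described above can be sketched as a small check. This is an illustrative sketch only, not the project's actual selection code: the `EvalResult` type, the function names, the single aggregate `score` per evaluation, and every number below are hypothetical. It assumes a candidate wins only if it has zero invalid action-finger pairs everywhere and does not regress on any shared evaluation, rather than winning on one number.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class EvalResult:
    """Hypothetical summary of one evaluation (holdout, replay, ...)."""
    score: float        # assumed aggregate quality score, higher is better
    invalid_pairs: int  # committed or sent invalid action-finger pairs


def passes_validity_gate(results: Dict[str, EvalResult]) -> bool:
    """Hard gate: no invalid action-finger pairs in any evaluation."""
    return all(r.invalid_pairs == 0 for r in results.values())


def wins_coherently(
    candidate: Dict[str, EvalResult],
    baseline: Dict[str, EvalResult],
    tolerance: float = 0.0,
) -> bool:
    """True if the candidate clears the gate, does not regress on any
    shared evaluation (within `tolerance`), and improves on at least one."""
    if not passes_validity_gate(candidate):
        return False
    shared = candidate.keys() & baseline.keys()
    if not shared:
        return False
    no_regression = all(
        candidate[k].score >= baseline[k].score - tolerance for k in shared
    )
    some_gain = any(candidate[k].score > baseline[k].score for k in shared)
    return no_regression and some_gain


# Made-up numbers, for illustration only.
baseline = {
    "holdout": EvalResult(0.80, 0),
    "chronological_replay": EvalResult(0.70, 0),
    "pseudo_live_replay": EvalResult(0.65, 0),
}
candidate = {
    "holdout": EvalResult(0.82, 0),
    "chronological_replay": EvalResult(0.75, 0),
    "pseudo_live_replay": EvalResult(0.70, 0),
}
# Best holdout score, but worse on both replays: not a coherent win.
leaderboard_only = {
    "holdout": EvalResult(0.90, 0),
    "chronological_replay": EvalResult(0.60, 0),
    "pseudo_live_replay": EvalResult(0.55, 0),
}

print(wins_coherently(candidate, baseline))
print(wins_coherently(leaderboard_only, baseline))
```

Under these assumptions the first candidate is accepted and the leaderboard-only candidate is rejected, which mirrors the point that coherence across evaluations mattered more than a single top score. A diagnostic run such as the March 17 realism replay would stay in the published results but could be left out of the dictionaries passed to the check.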