Commit 5c05aff
Fix MAB-vs-VW bin regression: branch override aggressiveness on reward shape
Full-scale validation of three documented benchmarks against the demo
image showed that MAB-vs-VW had regressed from the documented bin A
(2.67× lower regret than VW, mean ratio 0.374) to bin B (0.70×, i.e.
higher regret than VW, mean ratio 1.438) at 10 seeds × 2000 rounds ×
9 cells. Outbreak pandemic also drifted: 1.20 mean deaths vs the
documented 0.5.
Root cause:
The Thompson/UCB1 algorithm-choice override in helpers.rs nudged the
chosen option's graph weight to `max + 1e-3`, which after
renormalisation barely shifted the selection distribution. Legacy
weighted-bucket dynamics dominated, and those have asymmetric updates
on binary rewards: `delta = clipped * learning_rate` so `reward=0`
gives `delta=0` (no decrement for failed arms). Result: Thompson's
Beta posterior correctly identified the best arm, but the actual
selection kept exploring inferior arms at 25-30% probability long
after the posterior was sharp.
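The two effects described above can be reproduced in isolation. The following is a minimal sketch, not the code in helpers.rs: the function names, the example weights, the learning rate, and the [0, 1] clipping range are all assumptions chosen for illustration.

```rust
/// Normalise raw graph weights into selection probabilities.
fn normalise(weights: &[f64]) -> Vec<f64> {
    let total: f64 = weights.iter().sum();
    weights.iter().map(|w| w / total).collect()
}

/// Soft nudge as described in the commit message: lift the chosen
/// option just above the current maximum weight.
fn soft_nudge(weights: &mut [f64], chosen: usize) {
    let max = weights.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    weights[chosen] = max + 1e-3;
}

/// Weighted-bucket update: delta = clipped * learning_rate.
/// Clipping to [0, 1] is an assumption made for this sketch.
fn bucket_delta(reward: f64, learning_rate: f64) -> f64 {
    reward.clamp(0.0, 1.0) * learning_rate
}

fn main() {
    // Hypothetical weights; arm 1 stands in for the posterior's argmax.
    let mut weights = vec![1.0, 0.9, 0.8];
    soft_nudge(&mut weights, 1);
    let probs = normalise(&weights);
    // The chosen arm barely leads, and arm 2 keeps over a quarter of the mass.
    println!("selection probabilities: {:?}", probs);

    // A failed pull (reward = 0) produces no decrement at all.
    println!("delta on failure: {}", bucket_delta(0.0, 0.1));
    println!("delta on success: {}", bucket_delta(1.0, 0.1));
}
```

With these example weights the nudged arm leads the previous maximum by well under 0.1 percentage points after renormalisation, which is the "barely shifted" behaviour described above.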
Fix:
Branch the override on reward shape using `warmup_state.current_algorithm()`
as discriminator (Thompson ⇔ Binary characterization per the
`pick_algorithm` mapping):
- Binary: hard greedy commit on the algorithm's argmax, with
min_exploration as uniform floor. Textbook Thompson Sampling.
- Continuous: keep the legacy soft nudge so weighted-bucket dynamics
smooth around UCB's optimistic argmax. Asymmetric cost of premature
commitment in continuous domains (outbreak: greedy → 3.8× more
deaths) makes hard greedy wrong there.
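The branch described above might look roughly like the sketch below. It is not the code in helpers.rs: the `Algorithm` enum is a hypothetical stand-in for whatever `warmup_state.current_algorithm()` returns, `apply_override` is an invented name, and spreading `min_exploration` evenly across all arms is only one possible reading of "uniform floor".

```rust
/// Hypothetical stand-in for the value returned by
/// `warmup_state.current_algorithm()`.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Algorithm {
    Thompson, // mapped to binary rewards
    Ucb1,     // mapped to continuous rewards
}

/// Apply the algorithm-choice override to a non-empty weight slice.
fn apply_override(
    weights: &mut [f64],
    argmax: usize,
    algorithm: Algorithm,
    min_exploration: f64,
) {
    match algorithm {
        Algorithm::Thompson => {
            // Binary rewards: hard greedy commit. Every arm gets an equal
            // share of the exploration mass; the argmax gets the rest.
            let floor = min_exploration / weights.len() as f64;
            for w in weights.iter_mut() {
                *w = floor;
            }
            weights[argmax] += 1.0 - min_exploration;
        }
        Algorithm::Ucb1 => {
            // Continuous rewards: keep the legacy soft nudge so the
            // weighted-bucket dynamics still shape the distribution.
            let max = weights.iter().copied().fold(f64::NEG_INFINITY, f64::max);
            weights[argmax] = max + 1e-3;
        }
    }
}

fn main() {
    let mut binary = vec![0.5, 0.3, 0.2];
    apply_override(&mut binary, 1, Algorithm::Thompson, 0.06);
    println!("binary: {:?}", binary);

    let mut continuous = vec![0.5, 0.3, 0.2];
    apply_override(&mut continuous, 1, Algorithm::Ucb1, 0.06);
    println!("continuous: {:?}", continuous);
}
```

In the binary branch the weights already sum to 1, so no renormalisation step can dilute the commit. In the continuous branch the other weights are left untouched.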
Validation at full documented scale:
- Vaccine: 4.36× ratio (docs 4.4×) — unchanged ✓
- Outbreak: 2/4 pass, 0.40 deaths (docs 0.5), $25.4B (docs $26.3B) ✓
- MAB: bin A restored (was bin B), ratio 1.19-1.24 across two reruns
MAB headline number (2.67× lower regret) still does not reproduce —
holds at 0.81× / 0.84× across two runs. Filed in known-issues.md with
investigation targets. Bin classification (A) matches docs.
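The "N× lower regret" figures in this message appear to be the reciprocal of the mean regret ratio (assumed here to be MAB regret divided by VW regret). That relationship is inferred from the numbers quoted above rather than taken from the benchmark docs, and the function name below is hypothetical.

```rust
/// Convert a mean regret ratio (assumed MAB / VW) into an
/// "N× lower regret" factor by taking the reciprocal.
fn lower_regret_factor(mean_ratio: f64) -> f64 {
    1.0 / mean_ratio
}

fn main() {
    // Ratios quoted in this commit message.
    for ratio in [0.374, 1.438, 1.19, 1.24] {
        println!(
            "ratio {:.3} -> {:.2}x lower regret",
            ratio,
            lower_regret_factor(ratio)
        );
    }
}
```

Under this reading the four quoted ratios map to 2.67, 0.70, 0.84 and 0.81, matching the factors given in the message. Any factor below 1 means MAB's regret is higher than VW's.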
Sibling touches:
- scripts/smoke-test.sh: fixed lycan path from $ROOT/target to
$ROOT/Lycan/target (broken since the Lycan merge).
- Three example .lyc files (calculator, demo_edge_of_chaos,
demo_takeaway_chaos_replay) re-emitted by lycan compile during
demo runs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent a33c9c5 commit 5c05aff
8 files changed
Lines changed: 160 additions & 8 deletions
File tree
- Lycan
- examples
- src/server
- docs
- examples/lycan-internals
- scripts