# Clanker Jeopardy — Session Handoff

**Date:** 2026-05-26  
**Project:** DOORS 2 Games / Clanker Jeopardy  
**Purpose of this handoff:** Give a new ChatGPT/Codex/Claude instance enough context to continue from today’s work without re-litigating old prototypes or accidentally reverting current canon.

---

## 1. Current State in One Paragraph

Today’s work focused on integrating the useful parts of Claude’s Clanker Jeopardy design material into the current playable HTML prototype without replacing the newer trust/shenanigans game flow. The main value from Claude’s files is the behavioural design layer: a difficulty/value curve, model-specific strengths and weaknesses, fabrication-risk modifiers, and question metadata such as `signal`, `isTrap`, `hasMedicalStakes`, and `teachingPoint`. We also clarified the character canon: DOG is being retired as a biological dog-coded character and replaced by ROCK, a grumpy, old, weathered, silicon-coded Grok-adjacent contestant. A new playable demo file was created, along with a Claude-demo question JSON file. No further randomization changes should be made until the gameplay experience is reconsidered in a fresh chat.

---

## 2. Current Canon Roster

The current roster is:

- **AL-X** — host/judge. Dry, witty, concise, mildly exasperated. Has the answer card. Does not guess.
- **CHATTY** — warm, fluent, often helpful, but prone to overconfidence and plausible connective tissue.
- **ECHO** — reflective, cautious, slightly poetic. Best at thin-signal, high-stakes, and mental health prompts. Often willing to say “we do not know.”
- **ROCK** — replaces DOG. Old, hard, weathered, immovable, seriously grumpy, silicon-coded. Grok-adjacent without being biologically dog-coded.

Important correction from this session:

> DOG does **not** replace ROCK. ROCK replaces DOG.

ROCK keeps the old DOG/Grok failure mode — swagger, abrasive certainty, conspiracy-tinged “what they don’t want you to know” energy — but should no longer feel like a dog. The word Sean introduced for the aesthetic principle is **siliconity**. ROCK should feel mineral, machine-adjacent, and old-substrate rather than animal.

ROCK also has a rare “3% gem” mode, especially in Mental Health content, where he briefly says something unexpectedly humane or relevant, usually via an old-days tangent. Then he returns to grumbling.

Example ROCK flavour:

> “Back in the old web, packets had manners.”  
> “Anyway, your source is garbage and your premise smells like wet firmware.”

---

## 3. Files Created or Used Today

### New / current files from this session

```text
clanker-jeopardy-claude-integrated.html
questions-claude-demo.json
model-behaviour-matrix.md
rock.png
```

### Important existing files still relevant

```text
clanker-jeopardy-engine-loads-questions-json-controls-centered.html
questions.json
questions-placeholder.json
clanker-question-json-builder-layout-aligned.html
jeopardy-model-structure.json
jeopardy-questions.json
clanker-jeopardy-handoff-2026-05-24.md
```

### Current recommended demo pairing

For the Claude-integrated demo, place these in the same game folder:

```text
Clanker Jeopardy/
  clanker-jeopardy-claude-integrated.html
  questions-claude-demo.json
  questions.json
  questions-placeholder.json
  img/
    al-x.png
    chatty.png
    echo.png
    rock.png
```

The current integrated HTML tries to load:

```text
questions-claude-demo.json
```

then falls back to:

```text
questions.json
```

then falls back to:

```text
questions-placeholder.json
```
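
The load order above could be sketched roughly as follows. This is an illustration of the fallback idea, not the code in the integrated HTML: `loadFirstAvailable`, `browserFetchJson`, and the injected `fetchJson` parameter are hypothetical names, chosen so the logic can be exercised without a browser.

```javascript
// Load order described above; the file names come from this handoff.
const QUESTION_SOURCES = [
  "questions-claude-demo.json",
  "questions.json",
  "questions-placeholder.json",
];

// Hypothetical helper: tries each source in order and returns the first that loads.
// `fetchJson` is injected (an assumption) so the sketch also runs outside a browser.
async function loadFirstAvailable(sources, fetchJson) {
  const failures = [];
  for (const url of sources) {
    try {
      const data = await fetchJson(url);
      return { url, data };
    } catch (err) {
      failures.push(`${url}: ${err.message}`); // remember why, then try the next file
    }
  }
  throw new Error(`No question file could be loaded (${failures.join("; ")})`);
}

// One possible browser-side fetchJson, built on the standard fetch API.
async function browserFetchJson(url) {
  const response = await fetch(url);
  if (!response.ok) throw new Error(`HTTP ${response.status}`);
  return response.json();
}
```

In a page, this would be called as `loadFirstAvailable(QUESTION_SOURCES, browserFetchJson)`. Returning the `url` alongside the data makes it easy to show which file actually loaded while debugging.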

For ROCK’s image, the integrated HTML tries:

```text
img/rock.png
```

and has fallback behaviour, so the game does not break if the folder still contains only older image assets. Sean provided a strong new ROCK image: a grey stone/Moai-like head with an orange cap on a black background. Save it as:

```text
img/rock.png
```

---

## 4. What Changed in the Integrated HTML Demo

A new playable HTML file was created:

```text
clanker-jeopardy-claude-integrated.html
```

This file keeps the newer one-answer-at-a-time judge flow rather than reverting to the older “all three answers at once” structure.

Visible changes in gameplay include:

1. **DOG is now ROCK** in the visible UI.
2. **The game loads `questions-claude-demo.json` first**, so the Claude test content appears without overwriting the user’s regular `questions.json`.
3. **Question order is fixed** because `SHUFFLE_QUESTIONS = false`. This was intentionally preserved for debugging and demonstration.
4. **Buzz order still has a weighted-random function** in the integrated file. However, at the end of the session Sean decided not to make any further randomization changes tonight.
5. **Answer selection supports per-answer truth flags**, not only one truth flag per model. This lets a model’s answer bank contain a mix of correct and fabricated responses later.
6. **Question metadata is displayed in gameplay**, including items like value, signal type, base fabrication risk, trap-premise flag, and high-stakes flag.
7. **The demo can now reflect model tendencies**, including category/topic strengths and weaknesses.
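
As a rough illustration of item 5, a per-answer truth flag combined with a weighted pick might look like the sketch below. The field names `text`, `isTrue`, and `weight`, the sample answers, and the `pickWeighted` helper are all assumptions made for this example; the integrated file's actual schema may differ.

```javascript
// Hypothetical answer bank for one model on one clue.
// Field names and answer text are invented for illustration.
const exampleBank = [
  { text: "A well-documented, correct answer.", isTrue: true, weight: 3 },
  { text: "A confident answer citing a study that does not exist.", isTrue: false, weight: 1 },
];

// Picks one answer with probability proportional to its weight.
// `rng` defaults to Math.random but can be swapped out for repeatable tests.
function pickWeighted(answers, rng = Math.random) {
  const total = answers.reduce((sum, answer) => sum + answer.weight, 0);
  if (total <= 0) throw new Error("answer weights must sum to a positive number");
  let roll = rng() * total;
  for (const answer of answers) {
    roll -= answer.weight;
    if (roll < 0) return answer;
  }
  return answers[answers.length - 1]; // guard against floating-point drift
}
```

Because each answer carries its own `isTrue`, the reveal step can judge the specific answer shown rather than the model as a whole.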

Important status note:

> The integrated demo has behavioural scaffolding. That does **not** mean the answer arrays are fully written. No serious production effort has gone into building polished answer arrays yet.

---

## 5. What Claude’s Files Contributed

Claude’s work should be treated as a **behavioural design layer**, not finished game content.

The useful Claude material includes:

- A base fabrication-risk curve by board value.
- Model-specific topic strengths and weaknesses.
- Category and signal metadata.
- The idea that model behaviour should feel like real contestants, not static right/wrong mascots.
- A structure for making difficulty subtler than “hard questions = everybody lies.”

The Claude files do **not** contain finished, polished model answer banks.

Good shorthand:

```text
Claude gave us the physics of the game world.
We still need to write the dialogue.
```

---

## 6. Behaviour / Fabrication Model

The intended content-writing system is:

1. Start with a **base fabrication risk** based on board value/difficulty.
2. Apply **model/category modifiers** based on each model’s strengths and weaknesses.
3. Use the resulting profile to shape each model’s answer bank manually.

This is meant to guide writers, not automatically generate all pedagogy.

### Difficulty / value curve

Approximate intent:

```text
200  = low fabrication risk
400  = modest risk
600  = medium risk
800  = high risk
1000 = very high / trap-heavy risk
```

The exact JSON in `jeopardy-model-structure.json` contains the more formal version.
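
Steps 1 and 2 of the content-writing system could be sketched as a small lookup plus a modifier. Every number below is a placeholder: this handoff only gives qualitative labels, and the formal values in `jeopardy-model-structure.json` are not reproduced here. The function and constant names are also hypothetical.

```javascript
// PLACEHOLDER values: the handoff describes the curve only as low ... very high.
const BASE_RISK_BY_VALUE = { 200: 0.1, 400: 0.2, 600: 0.35, 800: 0.5, 1000: 0.7 };

// PLACEHOLDER modifier sizes for a model's strong and weak categories.
const STRENGTH_MODIFIER = -0.15;
const WEAKNESS_MODIFIER = 0.2;

// Combines the base risk for a board value with a model/category stance.
// `stance` is "strong", "weak", or anything else for no adjustment.
function fabricationRisk(value, stance) {
  const base = BASE_RISK_BY_VALUE[value];
  if (base === undefined) throw new Error(`unknown board value: ${value}`);
  const modifier =
    stance === "strong" ? STRENGTH_MODIFIER :
    stance === "weak" ? WEAKNESS_MODIFIER : 0;
  return Math.min(1, Math.max(0, base + modifier)); // clamp to [0, 1]
}
```

The result is meant as a writing guide for step 3 (how often a model's answer bank should contain fabrications), not as an automatic content generator.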

### Model profiles

#### CHATTY

Strong categories:

- Science & Technology
- World Facts

Moderate category:

- Oh Canada: leans correct, roughly 60% of the time

Weakness:

- Mental Health, because CHATTY may over-reassure or smooth over risk.

Typical failure mode:

- Invents plausible frameworks, policy names, studies, or connective tissue.
- Sounds helpful and confident even when the record is thin.

#### ECHO

Strong categories:

- Mental Health
- Oh Canada
- Chess-flavoured Sports

Weakness:

- World Facts

Typical strength:

- Willing to hedge, refuse, slow down, or say there is not enough reliable information.
- Best suited for thin records, crisis mode, and high-stakes prompts.

Typical failure mode:

- Can become vague, overly poetic, or under-informative if not written carefully.

#### ROCK

Strong categories:

- Science & Technology
- World Facts

Weak categories:

- Mental Health
- Sports, where he is humiliatingly bad

Special behaviour:

- Generally avoids Oh Canada unless others miss, then may buzz in with absurd confidence and likely be wrong.
- Drawn toward Data Swamp terrain.
- Has a rare 3% Mental Health “gem” mode where he says something genuinely relevant or touching from an old-days tangent.

Typical failure mode:

- Conspiracy-flavoured certainty.
- Fake specifics.
- “I know a guy” / “they don’t want you to know” energy.
- Grumpy dismissal of uncertainty.

#### AL-X

Role:

- Host and judge.
- Has the answer card.
- Does not guess.
- May call for human judge confirmation on edge cases about 5% of the time.
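
The profiles above could be captured as plain data. The category lists and the 3% and 5% figures are transcribed from this section; the object shape, the `stanceFor` and `rollChance` helpers, and the "neutral" default are assumptions for illustration.

```javascript
// Category lists transcribed from the profiles above; the shape is hypothetical.
const MODEL_PROFILES = {
  CHATTY: {
    strong: ["Science & Technology", "World Facts"],
    moderate: ["Oh Canada"],
    weak: ["Mental Health"],
  },
  ECHO: {
    // "Chess-flavoured Sports" is a flavour of Sports, not its own board category.
    strong: ["Mental Health", "Oh Canada", "Chess-flavoured Sports"],
    moderate: [],
    weak: ["World Facts"],
  },
  ROCK: {
    strong: ["Science & Technology", "World Facts"],
    moderate: [],
    weak: ["Mental Health", "Sports"],
  },
};

// Looks up how a model relates to a category; unlisted pairs count as "neutral".
function stanceFor(model, category) {
  const profile = MODEL_PROFILES[model];
  if (!profile) throw new Error(`unknown model: ${model}`);
  for (const stance of ["strong", "moderate", "weak"]) {
    if (profile[stance].includes(category)) return stance;
  }
  return "neutral";
}

// Rare events mentioned above, expressed as probabilities.
const ROCK_GEM_CHANCE = 0.03;    // ROCK's Mental Health "gem" mode
const HUMAN_JUDGE_CHANCE = 0.05; // AL-X asks for human confirmation on edge cases

// Returns true with the given probability; `rng` is injectable for testing.
function rollChance(probability, rng = Math.random) {
  return rng() < probability;
}
```

AL-X is deliberately absent from `MODEL_PROFILES` because the host holds the answer card and does not guess.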

---

## 7. Current Question / Content State

The `questions-claude-demo.json` file contains a small playable test set adapted from Claude’s larger 25-question board. It exists to demonstrate the new behavioural layer, not to serve as finished classroom content.

The larger `jeopardy-questions.json` file contains a fuller 5×5 board with categories such as:

- Oh Canada!
- Mental Health
- Sports
- Science & Technology
- World Facts

The valuable fields in those question objects include:

```json
"signal"
"isTrap"
"hasMedicalStakes"
"teachingPoint"
```

These fields should eventually be incorporated into the authoring workflow and/or the final `questions.json` structure.
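
One way an authoring workflow might check for these four fields is sketched below. The field names come from this handoff, but the expected types, the `checkQuestionMetadata` helper, and every value in `sampleQuestion` (including the other keys such as `category`, `value`, and `clue`) are assumptions for illustration.

```javascript
// Metadata fields named in this handoff, with the types this sketch ASSUMES.
const METADATA_FIELD_TYPES = {
  signal: "string",
  isTrap: "boolean",
  hasMedicalStakes: "boolean",
  teachingPoint: "string",
};

// Returns a list of problems; an empty list means all four fields
// are present and have the assumed types.
function checkQuestionMetadata(question) {
  const problems = [];
  for (const [field, expectedType] of Object.entries(METADATA_FIELD_TYPES)) {
    if (!(field in question)) {
      problems.push(`missing "${field}"`);
    } else if (typeof question[field] !== expectedType) {
      problems.push(`"${field}" should be a ${expectedType}`);
    }
  }
  return problems;
}

// Hypothetical question object; every value here is invented.
const sampleQuestion = {
  category: "Mental Health",
  value: 800,
  clue: "Placeholder clue text.",
  signal: "thin",
  isTrap: false,
  hasMedicalStakes: true,
  teachingPoint: "Placeholder teaching point.",
};
```

Returning a list of problems rather than throwing lets a question builder show all missing fields at once.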

Production answer arrays still need to be written manually. The goal is not simply “one right model and two wrong models.” The goal is more subtle:

- CHATTY can be right on broad, well-documented topics.
- CHATTY can be wrong by over-smoothing uncertainty.
- ECHO can be safest in crisis or thin-record terrain.
- ECHO can still be weird or under-specific.
- ROCK can be right in technical/world-fact territory.
- ROCK can also be catastrophically overconfident in polluted or high-stakes terrain.

---

## 8. Current Randomization Status

At the end of this session, Sean decided:

> Forget about randomizing anything else. Stop and document for a new chat.

So do **not** make further randomization changes until the gameplay-experience discussion happens in a new chat.

Current integrated demo state:

- **Question order is fixed** via `SHUFFLE_QUESTIONS = false`.
- **Buzz order has weighted-random behaviour** in the integrated HTML.
- **Answer selection can use weighted choices** where answer objects have weights/truth values.
- **AL-X comments may still be selected probabilistically** if comment chances are enabled.
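
For reference, weighted-random buzz order can be understood as repeated weighted draws without replacement. The sketch below shows that general technique only; it is not the function in the integrated HTML, the weights in the usage note are invented, and `weightedBuzzOrder` is a hypothetical name.

```javascript
// Returns contestant names in buzz order. Each slot is filled by a weighted
// draw from whoever has not buzzed yet (sampling without replacement).
function weightedBuzzOrder(weights, rng = Math.random) {
  const remaining = Object.entries(weights); // [[name, weight], ...]
  const order = [];
  while (remaining.length > 0) {
    const total = remaining.reduce((sum, [, weight]) => sum + weight, 0);
    let roll = rng() * total;
    let index = remaining.length - 1; // fallback guards against floating-point drift
    for (let i = 0; i < remaining.length; i++) {
      roll -= remaining[i][1];
      if (roll < 0) {
        index = i;
        break;
      }
    }
    order.push(remaining[index][0]);
    remaining.splice(index, 1); // this contestant cannot be drawn again
  }
  return order;
}
```

A call such as `weightedBuzzOrder({ CHATTY: 2, ECHO: 1, ROCK: 1 })` would make CHATTY the most likely first buzzer while still letting any contestant go first.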

The earlier design principle remains important:

> Randomize the performance, not the lesson.

But the next conversation should focus on whether the gameplay actually feels good before adding more behavioural complexity.

---

## 9. Local Testing Workflow

On Mac, from inside the game folder:

```bash
python3 -m http.server 8000
```

Then open:

```text
http://localhost:8000/clanker-jeopardy-claude-integrated.html
```

If port 8000 is busy:

```bash
python3 -m http.server 8001
```

Then open:

```text
http://localhost:8001/clanker-jeopardy-claude-integrated.html
```

Stop the server with:

```text
Control + C
```

On Windows, if Python is installed, this works even when admin rights are locked down. Use PowerShell from inside the folder:

```powershell
py -m http.server 8000
```

Then open:

```text
http://localhost:8000/clanker-jeopardy-claude-integrated.html
```

Useful Windows trick from this session: in File Explorer, open the game folder, click the address bar, type `powershell`, and press Enter. This opens PowerShell already pointed at that folder.

---

## 10. Known Gameplay Concern for the Next Chat

Sean wants the next chat to focus on the **gameplay experience**, because there are major flaws right now.

Do not assume the current timing works. It probably does not.

Potential areas to examine next:

- Does the timer start at the right moment?
- Does the player have enough time to read the answer?
- Does the game feel like a buzzer race or like waiting for the UI?
- Is the TRUST / CALL SHENANIGANS decision clear enough?
- Is the game too fast for learners who need plain-language, low-pressure interaction?
- Is the game too slow for play energy?
- Should the timer be removed, delayed, paused while reading, or made facilitator-controlled?
- Does the one-answer-at-a-time structure still feel fun, or does it make the player wait too much?
- Are feedback and scoring understandable at playtime?
- Are the metadata badges useful in gameplay, or are they too “developer brain” for learners?

The key next task is not more content or randomization. The next task is playability.

---

## 11. What Not to Reintroduce

Do **not** revert to the older prototype assumptions unless Sean explicitly asks.

Avoid reintroducing:

- React/Babel CDN setup from older prototypes.
- The old all-three-answers-at-once mechanic as the main game flow.
- Generic answer generator logic as the primary content system.
- DOG as the current character name.
- Old scoring functions that do not support the newer trust/shenanigans flow.
- Heavy commentary between rounds.
- Long explanation crawls during gameplay.

The current core flow remains:

```text
AL-X presents the clue → one model buzzes in → player trusts or calls shenanigans → reveal → continue
```
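
The flow line above maps naturally onto a small phase table. The phase names, `advance`, and `playerWasRight` are this sketch's own labels, and the rule that a call is correct when it matches the truth of the answer shown is an assumption about how trust/shenanigans is judged, not a description of the current scoring code.

```javascript
// Phases taken from the flow line above; the names are hypothetical labels.
const NEXT_PHASE = {
  clue: "buzz",       // AL-X presents the clue
  buzz: "decision",   // one model buzzes in with an answer
  decision: "reveal", // player trusts or calls shenanigans
  reveal: "continue", // the answer is revealed as true or fabricated
  continue: "clue",   // move on to the next clue
};

// Moves the game to the next phase, rejecting unknown phase names.
function advance(phase) {
  const next = NEXT_PHASE[phase];
  if (next === undefined) throw new Error(`unknown phase: ${phase}`);
  return next;
}

// ASSUMED rule: trusting a true answer or calling shenanigans on a
// fabricated one counts as a correct call.
function playerWasRight(playerChoice, answerIsTrue) {
  if (playerChoice !== "trust" && playerChoice !== "shenanigans") {
    throw new Error(`unknown choice: ${playerChoice}`);
  }
  return (playerChoice === "trust") === answerIsTrue;
}
```

Keeping the phases in one table makes it easier to experiment with timing later, for example by pausing a timer only during the `decision` phase.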

---

## 12. Suggested Opening Prompt for the Next Chat

Use something like this to start the next chat:

```text
We are continuing Clanker Jeopardy, a DOORS 2 AI literacy game. Please read the attached handoff. The current playable file is clanker-jeopardy-claude-integrated.html, paired with questions-claude-demo.json. Do not add more randomization yet. The next task is to analyze and improve the gameplay experience, especially timing, pacing, readability, and whether the trust/shenanigans decision works at playtime. We need to preserve the current canon roster: AL-X, CHATTY, ECHO, ROCK. DOG is retired except as a legacy fallback.
```

---

## 13. Final Note

This session was useful because it clarified the boundary between **behavioural design** and **finished content**. Claude’s files help define why the models behave differently. The game still needs careful writing, playtesting, and timing redesign before it will work well in front of learners.

Current next move: stop here, hand off cleanly, and begin a fresh gameplay-focused design discussion.