Can Twitch Plays Pokemon be built on the blockchain?

https://monad-plays-pokemon.up.railway.app/

I spent 1999 sinking hundreds of hours into Pokémon Red, trading turns with my twin brother. I missed the 2014 'Twitch Plays Pokémon' phenomenon, where 120k concurrent users controlled a Game Boy via chat commands, but the technical chaos of it stuck with me. Chat was so clogged that Twitch had to migrate the stream to infrastructure reserved for massive esports competitions. Input lag of 20-40 seconds turned a simple ledge into a challenge.

Curse you Retnaburn.

Since then, I spent a decade in the high-frequency trading (HFT) world before joining Category Labs (building the client for the Monad blockchain). In mid-December, I asked myself two questions:

Could Twitch Plays Pokémon work on the blockchain? The challenge is latency. A standard Ethereum 12-second block time would be unplayable. Monad features 400 ms block times, and I suspected that would be sufficiently fast.
Could I build a proof of concept with assistance from Claude Code? I have next to no experience with front-end development, so I was extremely confident that I could not do it without AI help. This was a good opportunity to further evaluate Claude Code’s capabilities.

The Architecture

Just like the original Twitch Plays Pokémon, I would need a back end for a global game state and a front end to view and interact with that state. My goal was to incorporate the blockchain in some meaningful way, since it makes transactions and state globally readable and auditable. This comes at the cost of gas fees, plus the delay of block production and global consensus (agreement on the next block).

How much would each action cost? A Twitch login is free.
- This one was easy. I would deploy the smart contract on Monad testnet. Testnet tokens have no monetary value, and integrations exist for automatic gas sponsorship.
After an input, how long would the user have to wait before seeing the game state reflect the input?
- For a decentralized blockchain, the tricky part is reaching consensus quickly and safely (see these blog posts). Monad offers speculative real-time data, but I was curious about measuring the “real-world” latency.

<aside> 📈

Trade-to-tick?

In the world of HFT, we obsessed over tick-to-trade times. Every nanosecond from the market data (tick) coming over the wire to firing an order (trade) or cancel was measured, because this was controllable and lower (all else equal) meant you were winning.

When writing a blockchain application for casual users, winning means minimizing the time from trade (user action) to tick (UI response).

</aside>

How much of the game state should live “on-chain”?
- Modifying state on a decentralized blockchain is expensive. Each of the N nodes of the network has to spend IOPS to maintain consistent state. Consequently, the SSTORE opcode in the Ethereum Virtual Machine is one of the most expensive (Monad is EVM-compatible). A full cartridge save of a Game Boy emulator is 32 KB, which could be represented by 1024 32-byte storage slots on an EVM-compatible chain like Monad. For Pokémon Red, the vast majority of important state fits in one 8 KB bank (player name, Pokémon team, inventory, map coordinates), but that’s still a lot of costly state modification and bookkeeping complexity. Could we do better?
- I chose to ignore on-chain game state entirely. Instead, every action or vote by a player would simply call a contract function with an enum argument and emit a log. Bob’s vote would show up in the next proposed block as a Vote transaction with an enum action: 3 (representing RIGHT) in the logs. The off-chain indexer could (cheaply) listen for votes, perform appropriate aggregation and input the resulting action into the emulator. In this case, the indexer is acting as the authoritative game server: reading the blockchain, updating the emulator state and streaming the video to clients.
  
  https://testnet.monadvision.com/tx/0x03bf0cf8bb7c949946a1da279a87a1eae2da3991fb2cfe2c9e63bbe3451a2c14?tab=Logs
- Notably, although no game state is being written to the globally-readable blockchain, anyone can verify that the actions performed in the client correspond to the history of transactions. Of course, replaying the same actions wouldn’t necessarily result in the same game state, but at least each vote’s result could be corroborated with on-chain evidence.
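
To put rough numbers on the storage question above, here is a back-of-the-envelope sketch in Python. The slot arithmetic follows directly from the 32-byte EVM word size. The gas figure is an assumption for illustration only: it is Ethereum mainnet's nominal cost for writing a fresh storage slot (before cold-access surcharges), and Monad's actual pricing may differ. The function names are hypothetical.

```python
# Back-of-the-envelope cost of mirroring a Game Boy save in EVM storage.
SLOT_BYTES = 32           # one EVM storage slot holds a 32-byte word
SAVE_BYTES = 32 * 1024    # full cartridge save (32 KB), per the text
BANK_BYTES = 8 * 1024     # the single 8 KB bank holding most important state

# ASSUMPTION: Ethereum mainnet's nominal cost of a zero -> nonzero SSTORE,
# excluding cold-access surcharges. Monad's gas schedule may differ.
SSTORE_NEW_SLOT_GAS = 20_000


def slots_needed(num_bytes: int) -> int:
    """Number of 32-byte slots needed to hold num_bytes, rounded up."""
    return -(-num_bytes // SLOT_BYTES)


def initial_write_gas(num_bytes: int) -> int:
    """Gas to populate every slot once, under the assumed per-slot cost."""
    return slots_needed(num_bytes) * SSTORE_NEW_SLOT_GAS


if __name__ == "__main__":
    for label, size in [("full save", SAVE_BYTES), ("one bank", BANK_BYTES)]:
        print(f"{label}: {slots_needed(size)} slots, "
              f"~{initial_write_gas(size):,} gas to populate")
```

Even the single-bank version means hundreds of slot writes before any gameplay happens, which is the motivation for keeping the contract as a pure event sink.
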
What move validation should take place “on-chain”?
- In this excellent guide to building 2048 on Monad, there’s a section dedicated to gameplay validation. Validation of user input is critical when game logic lives on-chain, but the smart contract in this case simply serves as a sink for user votes, which the indexer makes sense of. A nonsensical command like B in an open area is still valid Game Boy input (despite being a no-op). Fun fact: mashing B (or any other combination of buttons) while attempting to catch a Pokémon has no effect.
Anarchy or democracy?
- When Twitch Plays Pokémon first shipped, there was only one mode: anarchy. All inputs were buffered and would eventually execute. This made certain puzzles in the game, like the Celadon City Game Corner (which required a precise series of inputs, like RIGHT → DOWN → RIGHT), almost impossible to solve with a 25-second delay and hundreds of simultaneous pilots.
  
  https://en.wikipedia.org/wiki/Infinite_monkey_theorem
- The original stream was stuck at this stage for almost 24 hours until democracy mode was rolled out, in which users voted within a 30-second window on a series of actions to take (e.g. up2right3). I’m a fan of democracy and consensus building, but a 30-second window is just too long, especially when I’m the only one playtesting the game. I set the window to 5 blocks (each block on Monad is typically 400 ms, so that’s a 2-second window). The emulator executes the winning vote, breaking ties randomly with the seed set to the final blockhash of the window.
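
The democracy window can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the indexer's actual code: only `RIGHT = 3` comes from the logs mentioned earlier, the rest of the `Action` numbering is made up, windows are assumed to align to block numbers divisible by 5, and `window_index` and `pick_winner` are hypothetical names.

```python
import random
from collections import Counter
from enum import IntEnum
from typing import List, Optional


class Action(IntEnum):
    # ASSUMPTION: only RIGHT = 3 is taken from the text; the rest is illustrative.
    UP = 0
    DOWN = 1
    LEFT = 2
    RIGHT = 3
    A = 4
    B = 5
    START = 6
    SELECT = 7


WINDOW_BLOCKS = 5      # voting window length in blocks, from the text
BLOCK_TIME_MS = 400    # typical Monad block time, from the text
WINDOW_MS = WINDOW_BLOCKS * BLOCK_TIME_MS  # 2000 ms


def window_index(block_number: int) -> int:
    """Map a block number to its voting window (assumes aligned windows)."""
    return block_number // WINDOW_BLOCKS


def pick_winner(votes: List[Action], final_block_hash: str) -> Optional[Action]:
    """Return the most-voted action, or None if nobody voted.

    Ties are broken by an RNG seeded with the window's final block hash,
    so anyone replaying the same votes and hash gets the same winner.
    """
    if not votes:
        return None
    counts = Counter(votes)
    top = max(counts.values())
    # Sort so the outcome does not depend on the order votes arrived in.
    tied = sorted(action for action, n in counts.items() if n == top)
    if len(tied) == 1:
        return tied[0]
    rng = random.Random(int(final_block_hash, 16))
    return rng.choice(tied)


if __name__ == "__main__":
    votes = [Action.RIGHT, Action.UP, Action.RIGHT, Action.B]
    print(pick_winner(votes, "0xabc123").name)  # RIGHT has a clear majority
```

Seeding from the block hash keeps the tie-break auditable: the randomness comes from data that is already on-chain rather than from the indexer's own entropy.
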

The minimal POC would use the Monad blockchain as a record of auditable votes from users while the indexer would do the heavy lifting of aggregating votes appropriately and sequencing actions as emulator input. That emulator state would be visually represented somehow on the front end.

Implementation

I would consider myself an average user of LLM coding agents, certainly not a power user. To reiterate, I know next to nothing about modern web development apart from the (critical) understanding that LLM coding agents are quite powerful in that domain.

I fed Gemini Pro (it had just become available and I had heard good things) a bullet list of elements that I wanted:
- A lightweight smart contract developed with Foundry to emit Vote transactions.
- An intelligent indexer that could listen to those events (with optimistic Monad websockets) and sequence appropriate inputs to the game.
- A front end that could render the Game Boy emulator and handle user inputs.

Gemini Pro produced a more complete spec in markdown that I copied into a blank repo as CLAUDE.md. Then Jean-Claude (Opus 4.5) took the wheel.

We’ve exported a smaller version as a Slack Emoji