baki.io - thoughts from the void

AI Game Design Tool

Sat, 21 Mar 2026 00:00:00 GMT

I've been playing games for years. The Fractal Game - a procedural audio experiment where every system connects to every other system through an interconnected health model. The IDLE Game - base building with offline progression and resource loops that compound while you sleep. In both cases, the hardest problem was never implementation. It was playtesting inside my own head.

Then I started having design conversations with LLMs, and something shifted.

## The messy process

Here's what game design with AI actually looks like. It's not "generate a game for me." It's closer to rubber-ducking with a partner who has read every GDC talk but has never actually played a game.

For the Fractal Game, I was stuck on the health system. I wanted interconnected subsystems - audio health, visual health, structural health - where damage to one cascades through the others. But the balance was impossible to intuit. I'd sketch a dependency graph, stare at it, and have no idea if it would feel right in practice.

So I described the system to Claude and asked: "If a player takes 30% audio damage and 15% structural damage simultaneously, walk me through what happens over the next ten seconds."

The response wasn't correct - it couldn't be, without running the actual simulation. But it surfaced edge cases I hadn't considered. What if the cascade creates a feedback loop? What if recovery in one system blocks recovery in another?

AI design conversations generate hypotheses, not answers. Every interesting suggestion needs to be tested against actual player behavior. The model hasn't played your game. You have.

## Simulating players you haven't met

This is where my UX research background collides with game design in a useful way. In research, we talk about the gap between what users say they want and what they actually do. In game design, the gap is between what the designer intends and what the player experiences.

LLMs can approximate player archetypes. Not perfectly - but well enough to stress-test assumptions. For the IDLE Game, I asked models to roleplay as different player types: the optimizer who min-maxes every resource, the explorer who ignores efficiency, the hoarder who never spends. Each "player" surfaced different failure modes in the progression curve.

The optimizer found an exploit in the offline calculation I'd missed. The explorer revealed that my tutorial locked out an entire branch of content. The hoarder showed me that resource caps felt punitive rather than strategic.

None of these insights replaced actual playtesting. But they compressed weeks of blind iteration into hours of directed conversation.

## Procedural generation as philosophy

There's a deeper connection here that I keep returning to. Procedural generation in games isn't just a technical choice - it's a philosophical stance. It says: the designer creates the rules, not the content. The content emerges. That's uncomfortably close to how I think about life in general.

When I use AI in game design, I'm doing procedural generation of the design process itself. I set up the constraints - the mechanics, the feel I'm going for, the player experience I want - and let the conversation generate possibilities I wouldn't have reached alone.

The best AI-assisted design sessions start with constraints, not blank canvases. "Design me a game" produces nothing. "This resource loop feels punishing after hour three - why?" produces insight.

The tool doesn't replace the designer. It makes the design space navigable. And navigating possibility space is, when you strip away the jargon, what game design has always been about.
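
The question posed to the model about the Fractal Game ("30% audio damage and 15% structural damage, what happens over the next ten seconds?") can also be poked at with a tiny tick-based simulation. The following is a minimal sketch, not the game's actual code: the coupling weights, the regeneration rate, and the names `COUPLING`, `REGEN`, `step`, and `simulate` are all made up for illustration.

```python
# Hypothetical coupling weights: COUPLING[src][dst] is how strongly damage
# in `src` drains `dst` each second. These numbers are invented, not tuned.
COUPLING = {
    "audio": {"visual": 0.10, "structural": 0.05},
    "visual": {"audio": 0.05, "structural": 0.05},
    "structural": {"audio": 0.10, "visual": 0.10},
}
REGEN = 0.02  # assumed passive recovery per second for every subsystem


def step(health):
    """Advance every subsystem by one second and return the new state."""
    new_state = {}
    for name, value in health.items():
        # Drain on this subsystem is driven by how damaged the others are.
        drain = sum(
            weights[name] * (1.0 - health[src])
            for src, weights in COUPLING.items()
            if name in weights
        )
        # Clamp so health always stays within [0, 1].
        new_state[name] = min(1.0, max(0.0, value - drain + REGEN))
    return new_state


def simulate(health, seconds=10):
    """Return the list of states from t=0 through t=seconds."""
    history = [dict(health)]
    for _ in range(seconds):
        health = step(health)
        history.append(health)
    return history


# 30% audio damage and 15% structural damage, visual untouched.
start = {"audio": 0.70, "visual": 1.00, "structural": 0.85}
history = simulate(start)
for t, state in enumerate(history):
    print(t, {name: round(value, 3) for name, value in state.items()})
```

With these invented weights, visual health starts dropping even though it took no direct damage, which is the kind of cascade the conversation was meant to surface. Whether it feels right is still a playtesting question, not something the numbers answer.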

Form Reasoning

Sat, 21 Mar 2026 00:00:00 GMT

There's a question I keep circling back to: what does a melancholic idea look like? Not what it means. Not how to describe it. What shape does it take when you strip away language and ask something non-human to render it?

In Chaos - the multi-model project I've been building - I feed the same prompt to ten LLMs simultaneously. They don't agree. They never agree. But here's what gets interesting: they disagree in form, not just content. One model returns a slow-drifting jellyfish. Another gives me a fractal spiral collapsing inward. A third produces a swarm of particles that never quite settle. Same input. Ten different shapes.

This isn't a failure of alignment. It's the whole point.

## The gap between language and form

We treat language as the canonical way to represent ideas. But language is sequential - one word after another, one sentence building on the last. Ideas aren't like that. A feeling of loss has weight, direction, density. It occupies space in a way that a paragraph about loss never captures.

Embodied cognition research has been saying this for decades: thinking isn't something that happens in an abstract symbol space. It's grounded in the body, in spatial relationships, in physical metaphor. When we say an argument is "heavy" or an idea is "sharp," we're not being poetic. We're being accurate about how cognition actually works.

When ten models give an idea ten different shapes, they're mapping the topology of that idea's meaning-space. Each shape is a projection - a view from a different angle of the same underlying structure.

## Why disagreement is generative

The instinct is always to converge. Pick the best answer. Average them out. Build consensus. But when you let ten forms coexist, you get something richer: a stereoscopic view of meaning. Like how two eyes create depth perception, ten interpretations create conceptual depth.

I started building Chaos because I was frustrated with single-model thinking. One LLM gives you one perspective, and you mistake that perspective for truth. Ten models make the constructed nature of every response visible. You stop asking "what's the right answer?" and start asking "what's the shape of the question?"

That shift matters. As a UX researcher, I've spent years watching people interact with systems that present one answer as definitive. Search results. Recommendation engines. Diagnosis tools. The single-answer paradigm trains passivity. Multi-form reasoning trains perception.

## Where this leads

I don't think ideas are made of words. I think ideas are made of shapes, and words are one lossy compression format for transmitting them. When I watch ten models turn the same prompt into ten different visual forms, I see something closer to how understanding actually works - messy, parallel, and irreducibly multiple.

All is one and one is all. But "one" has ten thousand shapes.

Infinite Canvas

Sat, 21 Mar 2026 00:00:00 GMT

This site doesn't have pages. Not in the traditional sense. There's no blog archive sorted by date, no linear scroll from top to bottom, no hierarchy telling you what matters most. Instead, there's a canvas - an open field where content exists in space, positioned by semantic relationship rather than chronology.

This was a deliberate choice, and it took me a while to articulate why.

## Pages are arguments

A traditional webpage is a rhetorical structure. It has a beginning, middle, and end. It controls attention through sequence. The designer decides: this goes first, this comes next, this is the conclusion. That's powerful when you want to persuade. It's limiting when you want to think.

I spent years as a UX researcher testing interfaces that impose hierarchy. Navigation menus. Content carousels. Information architecture. All of these are opinionated about what matters - they rank, they sort, they filter. They're useful. But they also compress the space of possible engagement into predetermined paths.

An infinite canvas doesn't argue. It presents.

When you organize content by date, you're saying recency equals relevance. When you organize by category, you're saying taxonomy equals understanding. When you organize by spatial proximity, you're saying relationship equals meaning.

## Maps over timelines

The digital garden movement got close to this. Maggie Appleton's concept of ideas growing from seeds to saplings to trees is beautiful - it captures the epistemic status of thoughts, which blogs flatten into "published" or "not." Andy Matuschak's evergreen notes push further: atomic ideas linked by association rather than sequence.

But most digital gardens still live inside page-based layouts. You click links. You navigate trees. The topology is relational, but the interface is still linear.

Kinopio does something different - it gives you cards in space, connected by lines. You see the shape of someone's thinking, not the sequence of their output. That spatial dimension carries real information. Clusters mean resonance. Distance means distinction. Orphaned nodes mean ideas that haven't found their place yet.

That's what I wanted for baki.io. Not a portfolio. Not a blog. A map of how I think.

## What spatial organization reveals

When I placed my projects on the canvas for the first time, patterns emerged that I hadn't seen in two years of working on them. Chaos and the Fractal Game share a deep concern with emergence - but I'd never connected them explicitly. The MCP aggregator and the infinite canvas itself are both about interface philosophy - tools that shape how you interact with complexity.

These connections were always there. A chronological blog would have buried them under timestamps. A portfolio grid would have separated them into categories. The canvas made them visible.

Try this: take five projects you've worked on and place them on a whiteboard by "how related they feel." The clusters will surprise you.

There's something meditative about it. You stop curating a narrative and start observing a landscape. The map isn't the territory, but sometimes the map shows you territory you didn't know existed.
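
The whiteboard exercise can be roughed out in code. The sketch below is an illustration under loud assumptions, not how baki.io positions anything: it tags each project by hand, scores relatedness with Jaccard similarity, and groups projects whose score clears a threshold. Only the "emergence" and "interface philosophy" links come from the post; the other tags, the threshold, and the names `PROJECTS`, `jaccard`, and `clusters` are hypothetical.

```python
from itertools import combinations

# Hand-written tags. "emergence" and "interface philosophy" come from the
# post; every other tag is a placeholder chosen for this example.
PROJECTS = {
    "Chaos": {"emergence", "multi-model"},
    "Fractal Game": {"emergence", "procedural audio"},
    "MCP aggregator": {"interface philosophy", "tooling"},
    "Infinite canvas": {"interface philosophy", "spatial layout"},
    "IDLE Game": {"offline progression"},
}


def jaccard(a, b):
    """Share of tags two projects have in common (0 = unrelated, 1 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def clusters(projects, threshold=0.25):
    """Group projects linked by similarity >= threshold (connected components)."""
    groups = {name: {name} for name in projects}
    for a, b in combinations(projects, 2):
        related = jaccard(projects[a], projects[b]) >= threshold
        if related and groups[a] is not groups[b]:
            merged = groups[a] | groups[b]
            for name in merged:
                groups[name] = merged
    # Several names point at the same set object; keep each set once.
    unique = []
    for group in groups.values():
        if not any(group is seen for seen in unique):
            unique.append(group)
    return unique


for group in clusters(PROJECTS):
    label = "orphan" if len(group) == 1 else "cluster"
    print(label, sorted(group))
```

Hand-picked tags are a crude stand-in for "how related they feel," so the value of a sketch like this is mostly in noticing which projects end up alone.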

10 LLMs Disagree

Fri, 20 Mar 2026 00:00:00 GMT

When you ask ten language models the same question, you don't get ten copies of the same answer. You get ten genuinely different interpretations - shaped by architecture, training data, and the emergent personality of each model.

Most systems try to resolve this into consensus. Chaos does the opposite: it celebrates disagreement as a feature, not a bug.

Disagreement between models is signal, not noise. Each divergence maps a boundary in the latent space of possible answers.

## The diversity hypothesis

If a single model gives you one perspective, ten models give you a landscape. Not an averaged, blurred landscape - a terrain with actual peaks and valleys, each one a different way of seeing.

```python
import asyncio

async def score_diversity(prompt, ensemble):
    # Fan out the same prompt to multiple models
    responses = await asyncio.gather(*[
        model.generate(prompt) for model in ensemble
    ])

    # Measure semantic distance between responses
    # (pairwise_cosine is a helper that is not shown here)
    distances = pairwise_cosine(responses)
    diversity_score = distances.mean()
    return diversity_score
```

Higher diversity scores often correlate with questions that have no single correct answer - exactly the kind worth exploring.

Don't confuse model diversity with hallucination. Diverse outputs from grounded models reveal genuine interpretive breadth, not errors.

## Further reading