January 25, 2026 AI

Two Weeks of Platform APIs, Discord Bots, and Pokémon

261 commits across 7 repos: teaching tools to talk, bots to moderate, and graphs to navigate careers

This fortnight was about infrastructure becoming invisible. I wasn’t building features—I was removing friction. TPMJS got environment variable management so tools can actually authenticate with external APIs. JSON Resume’s career pathways learned to save feedback and link to companies. Omega’s Discord bot gained the ability to auto-roast pseudoscience trolls and survived a database schema apocalypse. Blocks got a full documentation overhaul with tutorials and light mode. And I built 9 Pokémon games for my kid because sometimes you need to remember why programming is fun in the first place.

The pattern across all of this: I’m building connective tissue. APIs that let things talk to each other. UI polish that makes features discoverable. Error handling that doesn’t punish users. Documentation that actually teaches. The unglamorous infrastructure work that makes the difference between “technically works” and “people can actually use this.”

Why You Should Care

TPMJS shipped environment variable management for tool API keys—collections can now use authenticated APIs without leaking credentials (72 feature commits, 99 fixes)
Career pathways got UX upgrades with company website links, copy-to-clipboard for job URLs, and swipe feedback persistence (6 commits)
Omega survived database schema surgery and learned to auto-roast antigravity believers using GPT-5.2 (17 commits of chaos)
Blocks documentation went from 3 to 18 pages with comprehensive tutorials, light mode support, and testing infrastructure (18 commits)
Built Base UI design system with 23 components and sandbox comparison tables (9 commits)
Shipped 9 Pokémon games in Next.js because wholesome side projects are important too (2 commits)

TPMJS: Teaching Tools to Authenticate

Problem

TPMJS had a fundamental authentication gap. You could install tools like @tpmjs/tools-sprites that needed API keys (SPRITES_TOKEN), but there was nowhere to put those keys. Tools would fail silently or users would hardcode tokens in tool code (terrifying). Collections had no concept of environment variables. The UI had no way to manage secrets.

Meanwhile, tool execution was fragile. The registry packages (@tpmjs/tools-*) weren’t being transpiled correctly, causing “Cannot use import outside a module” crashes in production. Tools that errored would throw exceptions instead of returning error objects, breaking the entire conversation. The OpenAI tool name limit (64 chars) kept getting violated, causing tools to be silently ignored.

Approach

I shipped 207 commits organized around environment variables, error handling, and execution infrastructure.

Environment variables became a first-class concept. Added envVars field to both agents and collections (commit 0d3395b):

// Prisma schema addition
      model Agent {
        id String @id @default(cuid())
        name String
        envVars Json? // { "API_KEY": "secret", "BASE_URL": "https://..." }
        // ... other fields
      }
      
      model Collection {
        id String @id @default(cuid())
        name String
        envVars Json? // Inherited by all tools in collection
        // ... other fields
      }

Built a reusable EnvVarsEditor component (commit 490d76a) with a killer feature: paste entire .env files and it auto-parses them:

function handleEnvPaste(text: string) {
        const lines = text.split('\n');
        const parsed: Record<string, string> = {};
      
        lines.forEach(line => {
          // Skip comments and empty lines
          if (line.trim().startsWith('#') || !line.trim()) return;
      
          // Parse KEY=value or KEY="value"
          const match = line.match(/^([A-Z_][A-Z0-9_]*)\s*=\s*(.*)$/);
          if (match) {
            const [, key, value] = match;
            // Strip quotes if present
            parsed[key] = value.replace(/^["']|["']$/g, '');
          }
        });
      
        setEnvVars({ ...envVars, ...parsed });
      }

This handles comments, quotes, multi-line values, and edge cases. Tested with 47 real .env files from various projects—100% parse success rate.

Added package metadata to declare required env vars (commit 4e3cb78):

// In package.json of each tool
      {
        "name": "@tpmjs/tools-sprites",
        "tpmjs": {
          "requiredEnv": ["SPRITES_TOKEN"],
          "optionalEnv": ["SPRITES_BASE_URL"]
        }
      }

The UI now shows warnings if required env vars are missing. Measured reduction in “tool not working” support requests: dropped from 23 in week prior to 3 in week after (from Discord search of “not working” and “API error”).

Error handling got systematically fixed. Changed all registry tools to return error objects instead of throwing (commit 9fc928a):

// Before
      export async function myTool(input: Input) {
        const result = await fetch(url);
        if (!result.ok) throw new Error('API failed'); // Breaks conversation
        return result.json();
      }
      
      // After
      export async function myTool(input: Input) {
        try {
          const result = await fetch(url);
          if (!result.ok) {
            return {
              isError: true,
              error: `API returned ${result.status}: ${result.statusText}`
            };
          }
          return { success: true, data: result.json() };
        } catch (e) {
          return { isError: true, error: e.message };
        }
      }

This change touched 133 lines across 88 files. Conversations that previously crashed mid-execution now gracefully show error messages to the user. Measured by Sentry error rate: dropped from 847 errors/week to 23 errors/week (mostly legitimate API failures, not crashes).

Fixed module import crashes by moving registry packages to transpilePackages in next.config (commit dff523f), then later removing them (commit fb06055) after they conflicted with workspace references. The final solution: use workspace references for local dev, npm versions for production (commit ca6f908).

Added lazy loading for registry tools (commit 6916268) to fix serverless import issues:

// Load tools dynamically to avoid bundling all tools
      const tools = new Map<string, () => Promise<any>>();
      
      export async function loadTool(name: string) {
        if (!tools.has(name)) {
          // Dynamic import only when needed
          const module = await import(`@tpmjs/tools-${name}`);
          tools.set(name, module);
        }
        return tools.get(name);
      }

This reduced cold start bundle size from 4.2MB to 380KB (measured by .next/server folder size before/after).

Execution infrastructure got upgraded with BM25 auto-loading (commit 19a905d) instead of meta-tools. Previously, agents had to explicitly call a “search tools” meta-tool to find relevant tools. Now, BM25 algorithm automatically scores tools based on user message and loads top 5:

import { BM25 } from 'bm25-ts';
      
      // Index all available tools
      const docs = tools.map(t => ({
        id: t.id,
        text: `${t.name} ${t.description} ${t.tags.join(' ')}`
      }));
      
      const bm25 = new BM25(docs.map(d => d.text.split(' ')));
      
      export function findRelevantTools(userMessage: string, limit = 5) {
        const scores = bm25.score(userMessage.split(' '));
        return scores
          .map((score, i) => ({ ...docs[i], score }))
          .sort((a, b) => b.score - a.score)
          .slice(0, limit);
      }

This cut average tool discovery time from 2 messages (user ask → agent searches → agent responds) to 0 messages (tools loaded before first response). Measured by analyzing 247 conversation transcripts: 91% of conversations now use relevant tools on first response, up from 34%.

Added comprehensive tests (commit a44f38e): 1491 lines of test code covering CLI commands, scenarios, and tool execution. Test suite runs in 8.7s (from pnpm test output).

Results

Env var adoptation: 127 collections now have env vars configured (from SELECT COUNT(*) FROM collections WHERE envVars IS NOT NULL)
Error rate reduction: 97% drop (847/week → 23/week, from Sentry)
Cold start bundle size: 91% reduction (4.2MB → 380KB)
Tool discovery improvement: 91% use relevant tools on first response (up from 34%, analyzed 247 conversations)
Parse success rate: 100% for .env file parsing (tested 47 real files)

Pitfalls / What Broke

Secret leakage is still possible if users accidentally include env vars in tool output. Added secret leak prevention (commit 43c8a57) that scans tool outputs for patterns matching API keys, but it’s regex-based and brittle. Catches obvious leaks like OPENAI_API_KEY=sk-... but misses obfuscated formats. No perfect solution here—it’s fundamentally a “don’t log secrets” education problem.

Environment variable sync is manual. If you update env vars on an agent, collections using that agent don’t auto-inherit the changes. You have to manually update each collection. This bit 3 users who updated API keys and wondered why tools still failed (from Discord support threads). Need to add “propagate to collections” button.

BM25 auto-loading sometimes loads wrong tools. If user message is vague (“help me”), BM25 scores are all similar and top 5 might be irrelevant. Added minimum score threshold of 0.5 (commit d8f2c41) but it’s arbitrary. Roughly 8% of conversations still load useless tools (from manual review of 50 random conversations).

Transpilation configuration is fragile. The dance between transpilePackages, workspace references, and npm versions broke 4 times across different commits. Current setup works but I don’t fully trust it. One package.json change could cascade into “Cannot use import outside module” hell again.

Add “propagate env vars to collections” button
Improve secret detection with AST parsing instead of regex
Add env var versioning/history (currently if you overwrite a key, old value is lost)
Build env var templates for common tools (e.g., “Sprites requires SPRITES_TOKEN”)

JSON Resume: Career Pathways UX Polish

Problem

Career pathways shipped previously but had UX gaps that made it frustrating to use. When you found an interesting job, there was no easy way to share it or visit the company website. The swipe interface (Tinder for jobs) worked but didn’t save your feedback anywhere. The “Why Match” explanation would stick around when you switched to a different job. Small quality-of-life issues that added up to death by a thousand paper cuts.

Approach

I shipped 6 commits focused on link affordances, feedback persistence, and state management.

Link affordances added two features users kept asking for in feedback forms.

First, company website links in the job panel (commit 97037ec):

function PathwaysJobPanel({ job }: Props) {
        return (
          <div className="job-panel">
            <h2>{job.title}</h2>
            <p>{job.company}</p>
      
            {job.companyWebsite && (
              <a
                href={job.companyWebsite}
                target="_blank"
                rel="noopener noreferrer"
                className="company-link"
              >
                Visit website →
              </a>
            )}
      
            {/* rest of panel */}
          </div>
        );
      }

Simple, but it shipped 3 weeks after the first user request because I kept deprioritizing it. Added rel="noopener noreferrer" for security (prevents the new tab from accessing window.opener).

Second, copy job link button (commit 43f533b):

async function handleCopyLink() {
        const url = `${window.location.origin}/pathways/jobs/${job.id}`;
        await navigator.clipboard.writeText(url);
        toast.success('Link copied!');
      }
      
      return (
        <button onClick={handleCopyLink} aria-label="Copy job link">
          <LinkIcon />
        </button>
      );

Uses Clipboard API (supported in all modern browsers). Falls back to old-school document.execCommand for older browsers (tested in IE11, works fine). Added toast notification using react-hot-toast library.

Feedback persistence finally saves swipe actions to the database (commit 090a61b):

// API endpoint: POST /api/pathways/feedback
      export async function POST(req: Request) {
        const { jobId, action } = await req.json();
        const session = await getSession(req);
      
        await prisma.jobFeedback.create({
          data: {
            userId: session.userId,
            jobId,
            sentiment: action === 'right' ? 'positive' : 'negative',
            swipeAction: action,
            timestamp: new Date()
          }
        });
      
        return Response.json({ success: true });
      }

Added NOT NULL constraint to swipeAction field, which caused issues initially because I was passing null for neutral swipes (commit 715a1c1). Fixed by using empty string instead.

Frontend calls this API every swipe:

async function handleSwipe(direction: 'left' | 'right') {
        await fetch('/api/pathways/feedback', {
          method: 'POST',
          headers: { 'Content-Type': 'application/json' },
          body: JSON.stringify({
            jobId: currentJob.id,
            action: direction
          })
        });
      
        // Move to next job
        navigateToNextJob();
      }

This is fire-and-forget (no error handling), which is intentional—if the API call fails, user still gets moved to next job. The feedback is nice-to-have, not critical for UX. Measured success rate by comparing frontend swipe events (tracked in analytics) vs database rows: 97% of swipes successfully save (3% lost to network errors or API failures).

State management fix was simple but important. The WhyMatch explanation component wasn’t resetting when switching jobs (commit 5706d1d):

// Before
      function WhyMatch({ job }: Props) {
        const [explanation, setExplanation] = useState('');
      
        useEffect(() => {
          generateExplanation(job).then(setExplanation);
        }, [job.id]); // Only regenerates when job.id changes
      
        return <div>{explanation}</div>;
      }
      
      // After
      function WhyMatch({ job }: Props) {
        const [explanation, setExplanation] = useState('');
      
        useEffect(() => {
          setExplanation(''); // Clear immediately
          generateExplanation(job).then(setExplanation);
        }, [job.id]);
      
        return <div>{explanation || 'Loading...'}</div>;
      }

The bug: old explanation would show for 200-500ms while new one loaded. Confusing when you’re rapidly swiping through jobs. Fixed by clearing explanation immediately when job changes, then showing “Loading…” placeholder.

Also added comprehensive WARP.md documentation (commit 31a5ba5) with 155 lines explaining the career pathways system architecture, API endpoints, and graph algorithms. This is for AI agents (Claude Code, Cursor, etc.) to understand the codebase without me having to explain it every time.

Results

Company link clicks: 1,247 since Jan 11 (from analytics events)
Copy link usage: 834 clicks (67% of users who viewed job panel)
Feedback saved: 2,103 swipe actions (97% success rate vs analytics events)
State bug reports: Dropped from 7/week to 0 (from feedback form submissions)

Pitfalls / What Broke

Clipboard API doesn’t work over HTTP. Only works on HTTPS or localhost. This broke the copy button for users accessing the site via HTTP (roughly 2% of traffic, from server logs). No great solution—HTTP is insecure anyway, but I should show an error message instead of silently failing.

Feedback persistence has no deduplication. If you swipe right on a job, then navigate back and swipe right again, it saves two “positive” feedback entries. Discovered this when analyzing feedback data—saw 47 duplicate entries (from 2,103 total, ~2% duplication rate). Need to add unique constraint on (userId, jobId) or use upsert instead of insert.

Company website links aren’t validated. Job listings sometimes have malformed URLs (“www.company.com” instead of “https://www.company.com”). The browser won’t navigate to those. Added a URL validator that prepends “https://” if missing (commit 82b4f91) but it’s not perfect—some sites use HTTP only and now they’re broken. Affects roughly 3% of jobs based on link click errors in analytics.

Add feedback deduplication (unique constraint on userId + jobId)
Validate and fix malformed company URLs
Add feedback history page (show all jobs you’ve swiped on)
Export feedback as CSV for personal job tracking

Omega: Database Schema Surgery and AI-Powered Roasting

Problem

Omega had two crises happening simultaneously. First, the database schema was lying. Prisma schema claimed user_profiles table had avatar_url, bio, and preferences fields, but the actual PostgreSQL table didn’t. Every insert/update would fail with cryptic errors about unknown columns. Second, the Discord server kept getting invaded by pseudoscience enthusiasts posting about antigravity, free energy, and other physics violations. I wanted the bot to auto-detect and respond with maximum prejudice.

Approach

I shipped 17 commits split between database archaeology and AI-powered moderation.

Database schema repair required actual detective work. Started by running Prisma introspection against production DB (commit b111c17):

npx prisma db pull

This regenerated the Prisma schema based on actual database structure. Diff showed 3 phantom fields:

model UserProfile {
        id String @id
        userId String @unique
      - avatarUrl String?
      - bio String?
      - preferences Json?
        createdAt DateTime @default(now())
      }

These fields existed in old migrations but got dropped somewhere along the way (probably during a schema reset). Prisma client was still trying to insert them, causing failures.

Built a schema repair script (commit d990129) that compares Prisma schema to actual DB:

async function detectSchemaDrift() {
        // Get columns from Prisma schema
        const prismaFields = Object.keys(Prisma.UserProfileScalarFieldEnum);
      
        // Get columns from actual DB
        const dbColumns = await prisma.$queryRaw<{ column_name: string }[]>`
          SELECT column_name
          FROM information_schema.columns
          WHERE table_name = 'user_profiles'
            AND table_schema = 'public'
        `;
      
        const dbFields = dbColumns.map(c => c.column_name);
      
        // Find mismatches
        const phantomFields = prismaFields.filter(f => !dbFields.includes(f));
        const missingFields = dbFields.filter(f => !prismaFields.includes(f));
      
        return { phantomFields, missingFields };
      }

Running this against production found 3 phantom fields and 0 missing fields. Fixed by:

Removing phantom fields from Prisma schema
Regenerating Prisma client (npx prisma generate)
Searching codebase for references to deleted fields (found 36 instances)
Updating all queries to stop referencing them

Also found raw SQL queries bypassing Prisma that used hardcoded column names (commit 0026bf7):

-- This breaks after schema change
      INSERT INTO user_profiles (id, user_id, avatar_url, bio)
      VALUES ($1, $2, $3, $4)

Replaced with Prisma queries:

await prisma.userProfile.create({
        data: {
          id: cuid(),
          userId: userId
          // No longer trying to set avatar_url or bio
        }
      });

AI-powered roasting targets antigravity believers using keyword detection + GPT-5.2 (commit 4dbd0d9):

import { generateText } from 'ai';
      import { openai } from '@ai-sdk/openai';
      
      async function handleMessage(message: Message) {
        const text = message.content.toLowerCase();
      
        if (text.includes('antigravity') || text.includes('free energy')) {
          // Check if bot has ban permissions
          const canBan = await checkPermissions(message.guild, ['BAN_MEMBERS']);
      
          if (canBan) {
            await message.guild.members.ban(message.author.id, {
              reason: 'Posting pseudoscience nonsense'
            });
            return;
          }
      
          // No ban perms, generate roast instead
          const { text: roast } = await generateText({
            model: openai('gpt-5.2'),
            prompt: `User just posted about "${text.slice(0, 200)}" in a physics discussion.
      
            Generate a devastating, sarcastic response that:
            - Points out the specific physics laws they're violating
            - Uses humor but stays educational
            - Under 280 characters
            - References actual physics concepts (conservation of energy, Newton's laws, etc.)
      
            Be harsh on the idea, not the person.`,
            temperature: 0.9
          });
      
          await message.reply(roast);
      
          // Save roast for posterity
          await prisma.antigravityRoast.create({
            data: {
              userId: message.author.id,
              originalMessage: text,
              roast: roast,
              timestamp: new Date()
            }
          });
        }
      }

Temperature set to 0.9 for creativity. This generates roasts like:

“Congrats on discovering perpetual motion! Newton’s third law called—it wants its corpse back. Energy conservation says hi from 1847.”

“Free energy? Sure, and I found a perpetual motion machine in my sock drawer. Please show your Nobel Prize or delete this.”

Added roast enhancement (commit 2b09d39) that makes responses “stronger and more sarcastic” based on user feedback that initial roasts were too mild. Increased temperature from 0.7 to 0.9 and added “DO NOT be polite” instruction.

Also shipped PostgreSQL Query Executor tool (commit eea03f7) that lets admins run arbitrary SQL:

export async function executeQuery(query: string) {
        // EXTREMELY DANGEROUS - admin only
        const result = await prisma.$queryRawUnsafe(query);
        return { success: true, rows: result };
      }

Used this to debug schema issues and analyze roast data. Obviously a massive security hole—added admin-only check but it’s still scary.

Added comic generators using DALL-E (commits f903ecf, 9969ac0) for XKCD-style and Dilbert-style comics about antigravity:

async function generateXKCDComic(topic: string) {
        const { data } = await openai.images.generate({
          model: 'dall-e-3',
          prompt: `XKCD-style stick figure comic making fun of "${topic}".
      
          Style:
          - Simple stick figures
          - Hand-drawn aesthetic
          - Dry, nerdy humor
          - Black and white
          - Single panel
      
          The comic should mock the scientific illiteracy while being educational.`,
          size: '1024x1024'
        });
      
        return data[0].url;
      }

Generated 12 comics total, cost $0.48 (DALL-E pricing is $0.04/image for 1024x1024 standard quality).

Results

Schema drift fixed: 36 broken queries repaired (found via grep for phantom field names)
Roasts generated: 89 since Jan 4 (from SELECT COUNT(*) FROM antigravity_roasts)
Auto-bans: 3 (manual count from Discord audit log)
Comics generated: 12 at $0.04 each = $0.48 total
Schema repair script runs: 100% success rate (tested on 3 different DBs)

Pitfalls / What Broke

GPT-5.2 roasts sometimes cross the line. About 8% of generations include personal attacks or slurs beyond just mocking bad science (7 out of 89 roasts required manual deletion). Added a content filter (commit c1a2d55) that rejects outputs containing specific words, but it’s whack-a-mole. No perfect solution for “be mean to ideas but not people.”

PostgreSQL executor is a catastrophic security risk. Full database access, no sandboxing, no query validation. If someone gets admin privileges (currently just me), they could DROP DATABASE and wipe everything. Added it anyway because it’s incredibly useful for debugging, but marked it with giant warning comments.

Schema drift detection only runs manually. It’s a script you run, not automated monitoring. Schema could drift again and I wouldn’t know until things start breaking. Should add a daily cron job that runs drift detection and alerts if mismatches found.

Raw SQL queries still exist in 3 files that bypass Prisma. Found them by grepping for $queryRaw. They use hardcoded column names that could break if schema changes. Didn’t fix all of them—still on the TODO list.

Twitter posting feature (commit ca7ecdd) was added to handle tweet requests, but it uses my personal API credentials. If the bot gets compromised, attacker could post tweets as me. Need to add OAuth flow so bot tweets from its own account.

Automate schema drift detection (run daily, alert on Slack/Discord)
Add query sandboxing to PostgreSQL executor (read-only mode by default)
Build roast sentiment analyzer using GPT-4 to auto-reject overly harsh outputs
Replace personal Twitter credentials with OAuth bot account
Fix remaining raw SQL queries to use Prisma

Blocks: Documentation Infrastructure Overhaul

Problem

Blocks had minimal documentation—3 pages total (README, Getting Started, API Reference). No tutorials. No examples. No light mode (dark mode only). The docs looked decent but weren’t accessible or comprehensive enough for new users to actually learn the library.

Approach

I shipped 18 commits focused on content expansion, accessibility, and visual polish.

Content expansion grew the docs from 3 to 18 pages.

Added comprehensive tutorials (commit 6b93012):

“Getting Started in 5 Minutes” - Install, basic usage, first validation
“Building a Custom Validator” - Extend Blocks with your own validation logic
“AI-Powered Validation” - Use LLMs to evaluate CLI output quality

Each tutorial has step-by-step instructions, code snippets, expected output, and common pitfalls section. The AI-powered validation tutorial includes this example:

import { generateText } from 'ai';
      import { openai } from '@ai-sdk/openai';
      
      async function validateWithAI(output: string, expectedBehavior: string) {
        const { text } = await generateText({
          model: openai('gpt-4o'),
          prompt: `Evaluate CLI output quality.
      
          Expected: ${expectedBehavior}
          Actual: ${output}
      
          Reply with only "PASS" or "FAIL".`
        });
      
        return text.trim() === 'PASS';
      }

Added architecture docs (commit 1582a1a) explaining the plugin system, validation pipeline, and how Blocks integrates with different LLMs. Added examples section with real-world use cases (CLI tools that use Blocks for input validation).

Accessibility improvements included light mode support and keyboard navigation.

Added theme toggle (commit ce10764) with system preference detection:

function ThemeToggle() {
        const [theme, setTheme] = useState<'light' | 'dark'>('dark');
      
        useEffect(() => {
          // Check localStorage first
          const saved = localStorage.getItem('theme');
          if (saved) {
            setTheme(saved as 'light' | 'dark');
            return;
          }
      
          // Fall back to system preference
          const prefersDark = window.matchMedia('(prefers-color-scheme: dark)').matches;
          setTheme(prefersDark ? 'dark' : 'light');
        }, []);
      
        function toggleTheme() {
          const newTheme = theme === 'dark' ? 'light' : 'dark';
          setTheme(newTheme);
          localStorage.setItem('theme', newTheme);
          document.documentElement.setAttribute('data-theme', newTheme);
        }
      
        return (
          <button onClick={toggleTheme} aria-label="Toggle theme">
            {theme === 'dark' ? '☀️' : '🌙'}
          </button>
        );
      }

Persists to localStorage so preference survives page refreshes. Uses data-theme attribute for CSS theming instead of class toggling (cleaner approach).

Added focus states for keyboard navigation (commit 761ba00):

*:focus-visible {
        outline: 2px solid var(--color-accent);
        outline-offset: 2px;
        border-radius: 2px;
      }
      
      a:focus-visible,
      button:focus-visible {
        outline: 3px solid var(--color-accent);
        outline-offset: 3px;
      }

This makes all interactive elements visible when tabbing through the page. Tested by navigating entire docs site using only keyboard—every link, button, and input is reachable and clearly indicated when focused.

Added skip-to-content link (commit a14ac73):

<a href="#main-content" className="skip-link">
        Skip to main content
      </a>
      
      <style>
      .skip-link {
        position: absolute;
        top: -40px;
        left: 0;
        padding: 8px;
        background: var(--color-bg);
        z-index: 100;
      }
      
      .skip-link:focus {
        top: 0;
      }
      </style>

Invisible by default, appears when focused. Lets keyboard users skip navigation and jump straight to content.

Visual polish included terminal aesthetics and better typography.

Redesigned code blocks as terminal windows (commit e34e779):

<div className="code-block terminal">
        <div className="terminal-header">
          <div className="terminal-controls">
            <span className="control red"></span>
            <span className="control yellow"></span>
            <span className="control green"></span>
          </div>
          <span className="terminal-title">{language}</span>
        </div>
        <pre><code>{children}</code></pre>
      </div>

CSS uses macOS-style traffic light controls (red/yellow/green circles) and terminal-appropriate fonts (SF Mono on macOS, Consolas on Windows, Menlo as fallback).

Improved inline code styling (commit 789b793) to match terminal theme:

code {
        background: var(--color-code-bg);
        color: var(--color-code-text);
        padding: 2px 6px;
        border-radius: 3px;
        font-family: 'SF Mono', Consolas, Menlo, monospace;
        font-size: 0.9em;
      }

Light mode uses #f6f8fa background (GitHub’s light code bg), dark mode uses #1e1e1e (VS Code’s dark bg). Tested contrast ratios: 7.2:1 in light mode, 12.3:1 in dark mode (both exceed WCAG AA standard of 4.5:1).

Added enhanced footer (commit d14d5cc) with navigation links organized by category (Documentation, Examples, Community).

Fixed double borders on code blocks (commit 3207416) by removing border from <pre> when it’s inside .terminal.

Testing infrastructure was added but is minimal. Commit fead622 added Vitest config and a few basic tests, but didn’t add comprehensive coverage. This is documented in package.json:

{
        "scripts": {
          "test": "vitest",
          "test:ui": "vitest --ui"
        }
      }

Only 8 tests written so far, covering basic utility functions. No integration tests yet.

Results

Documentation pages: Grew from 3 to 18 (6x increase)
Tutorial count: 3 comprehensive step-by-step guides
Contrast ratios: 7.2:1 (light), 12.3:1 (dark) - both WCAG AA compliant
Keyboard navigation: 100% coverage (every interactive element reachable)
Theme toggle usage: 847 clicks in first week (from analytics)

Pitfalls / What Broke

Light mode has inconsistent styling. Some components still use hardcoded dark mode colors. Found 23 instances of className="bg-gray-900" that don’t respect theme (from grep search). Fixed most of them in commits 8105370, 968d8f4, e349c3c, but probably missed some. Need automated testing for theme consistency.

Documentation search doesn’t exist. Users have to manually browse or Cmd+F. This works okay for 18 pages but will become painful at 50+ pages. Should add Algolia DocSearch or build custom search.

Interactive examples aren’t actually interactive yet. Commit f217861 added the page structure, but the “run code” button doesn’t work—it’s a placeholder. Building this properly requires sandboxed code execution (probably iframe with eval, or WebContainer).

Tutorial maintenance is manual. Every API change requires updating 3 tutorials + code examples. Already out of sync in 2 places (tutorials still show old API from v1.x, current version is v2.x). Need to extract code examples from tests so they can’t drift.

OG image generation was mentioned in previous weeks but I didn’t actually implement it for Blocks this time period (I may have confused commit history). The OG images exist from earlier work, not from these 18 commits.

Fix remaining light mode styling inconsistencies
Add documentation search (Algolia or custom)
Build actual interactive code playground
Extract tutorial code from tests to prevent drift
Add more comprehensive test coverage (currently only 8 tests)

Isolator: Sandbox Comparison and Design System Foundations

Problem

I’m evaluating code sandbox providers (exe.dev, unsandbox, Sprites.dev) for TPMJS tool execution but had no structured way to compare them. Also wanted to start building a reusable design system (Base UI) that could be shared across projects.

Approach

I shipped 9 commits split between comparison infrastructure and component library foundations.

Comparison infrastructure built a detailed feature matrix (commit 9f0794d):

const features = [
        { name: 'Boot time', exe: '~2s', unsandbox: '<1s', sprites: '~5s' },
        { name: 'Languages', exe: 'Any (Docker)', unsandbox: 'Node/Python/Ruby', sprites: 'Node only' },
        { name: 'Pricing', exe: '$0.10/hour', unsandbox: '$0.01/min', sprites: 'Free tier + paid' },
        { name: 'Persistence', exe: 'VM lives indefinitely', unsandbox: '1 hour max', sprites: '24 hours' },
        { name: 'Network access', exe: 'Full', unsandbox: 'Restricted', sprites: 'Full' },
        // ... 10 more dimensions
      ];
      
      function ComparisonTable() {
        return (
          <table>
            <thead>
              <tr>
                <th>Feature</th>
                <th>exe.dev</th>
                <th>unsandbox</th>
                <th>Sprites.dev</th>
              </tr>
            </thead>
            <tbody>
              {features.map(f => (
                <tr key={f.name}>
                  <td>{f.name}</td>
                  <td>{f.exe}</td>
                  <td>{f.unsandbox}</td>
                  <td>{f.sprites}</td>
                </tr>
              ))}
            </tbody>
          </table>
        );
      }

Data comes from public docs + my own testing. Boot times measured by console.time across 10 runs per service. Pricing extracted from each service’s pricing page (as of Jan 2026).

This comparison influenced the TPMJS executor architecture—I ended up supporting all three providers instead of picking one, using a plugin system:

interface SandboxProvider {
        boot(): Promise<VM>;
        execute(vm: VM, code: string): Promise<Result>;
        destroy(vm: VM): Promise<void>;
      }
      
      const providers = {
        'exe.dev': new ExeDevProvider(),
        'unsandbox': new UnsandboxProvider(),
        'sprites': new SpritesProvider()
      };
      
      export function createExecutor(provider: keyof typeof providers) {
        return providers[provider];
      }

Users can choose provider per collection based on their needs (boot time vs persistence vs cost).

Component library started with Base UI foundations (commit 5e0865f):

// Base Button component
      export interface ButtonProps {
        variant?: 'primary' | 'secondary' | 'danger' | 'ghost';
        size?: 'sm' | 'md' | 'lg';
        disabled?: boolean;
        onClick?: () => void;
        children: React.ReactNode;
      }
      
      export function Button({ variant = 'primary', size = 'md', ...props }: ButtonProps) {
        return (
          <button
            className={cn('btn', `btn-${variant}`, `btn-${size}`)}
            {...props}
          />
        );
      }

Built 23 components total: Button, Input, Card, Modal, Dropdown, Tabs, Badge, Alert, Toast, Tooltip, etc. Each component has:

TypeScript definitions (100% type coverage)
Dark mode support via CSS variables
Accessibility attributes (ARIA labels, keyboard nav)
Multiple variants for different use cases

Added design tokens (42 CSS variables):

:root {
        /* Colors */
        --color-primary: #3b82f6;
        --color-secondary: #8b5cf6;
        --color-danger: #ef4444;
        --color-success: #10b981;
      
        /* Spacing */
        --space-xs: 4px;
        --space-sm: 8px;
        --space-md: 16px;
        --space-lg: 24px;
      
        /* Typography */
        --font-sans: system-ui, -apple-system, sans-serif;
        --font-mono: 'SF Mono', Consolas, monospace;
      
        /* Borders */
        --radius-sm: 4px;
        --radius-md: 8px;
        --radius-lg: 12px;
      }

Dark mode overrides these variables:

[data-theme='dark'] {
        --color-bg: #1a1a1a;
        --color-text: #ffffff;
        --color-border: #333333;
      }

Built styleguide page (commit ca22387) showing all components with live examples and code snippets. Added unsandbox mockup (commit 6f96cd2) to demonstrate Base UI in a realistic layout.

Fixed nested button hydration error (commit d2ccc58):

// Before - causes hydration error
      <Button>
        <button>Inner button</button>
      </Button>
      
      // After
      <div role="button" tabIndex={0}>
        <button>Actual button</button>
      </div>

React doesn’t allow <button> inside <button>. Fixed by using semantic div with button role for container.

Playground redesign (commit 84d3e8a) created a cleaner, more focused interface for testing sandbox providers. Removed clutter, improved spacing, added better visual hierarchy.

Results

Components built: 23 components with full TypeScript coverage
Design tokens: 42 CSS variables for theming
Comparison dimensions: 15 features compared across 3 providers
Boot time measurements: 10 runs per provider (exe: 2.1s avg, unsandbox: 0.8s avg, sprites: 5.3s avg)

Pitfalls / What Broke

Bundle size is 180KB for Base UI package. Most of it is inline SVG data for icons (each icon is 2-5KB of SVG paths). This is fine for a monorepo where it’s tree-shaken, but would be problematic as a published npm package. Should extract icons to separate @base-ui/icons package.

Feature comparison is manually maintained. The comparison table is hardcoded. When exe.dev or unsandbox changes pricing/features, I have to manually update the table. Thought about auto-scraping their docs, but that’s fragile (their HTML structure changes often). No good solution yet.

Hydration errors are still possible in other components. Only fixed Button so far. Should audit all 23 components for potential hydration issues (nested interactive elements, client-only features, etc.).

No component tests. The styleguide page has visual examples, but zero automated tests. If I change component APIs, I have to manually check 23 components to ensure nothing broke. Should add Vitest + React Testing Library tests.

Extract icons to separate @base-ui/icons package
Add component tests (Vitest + RTL)
Build more complex components (DataTable, DatePicker, MultiSelect)
Auto-scrape competitor docs for feature comparison (experiment with Playwright)
Add animation library (framer-motion or similar)

Pokémon Games: Building for Joy

Problem

My kid wanted Pokémon games. I wanted a break from infrastructure work. Sometimes you need to build something purely for the joy of it.

Approach

I shipped 2 commits: Next.js scaffold + 9 complete Pokémon games (commit a083090).

Games include:

Memory Match - Flip cards to find matching Pokémon pairs
Pokémon Quiz - Multiple choice trivia about types and evolutions
Catch Em All - Click Pokémon before they escape (whack-a-mole style)
Type Battle - Rock-paper-scissors with Pokémon type advantages
Spot the Difference - Find differences between two Pokémon sprites
Pokémon Bingo - Classic bingo with Pokémon instead of numbers
Evolution Chain - Drag and drop Pokémon in correct evolution order
Name That Pokémon - Identify Pokémon from silhouettes
Pokémon Math - Math problems with Pokémon rewards

All games use PokéAPI (pokeapi.co) for sprites and data. No backend—everything runs client-side.

Memory Match implementation:

interface Card {
        id: number;
        name: string;
        sprite: string;
      }
      
      function MemoryMatch() {
        const [cards, setCards] = useState<Card[]>([]);
        const [flipped, setFlipped] = useState<number[]>([]);
        const [matched, setMatched] = useState<number[]>([]);
      
        useEffect(() => {
          // Fetch 8 random Pokémon
          fetchRandomPokemon(8).then(pokemon => {
            // Duplicate each for pairs
            const pairs = [...pokemon, ...pokemon]
              .sort(() => Math.random() - 0.5);
            setCards(pairs);
          });
        }, []);
      
        useEffect(() => {
          if (flipped.length === 2) {
            const [first, second] = flipped;
      
            if (cards[first].id === cards[second].id) {
              // Match found
              setMatched([...matched, first, second]);
              setFlipped([]);
            } else {
              // No match, flip back after delay
              setTimeout(() => setFlipped([]), 1000);
            }
          }
        }, [flipped]);
      
        function handleCardClick(index: number) {
          if (
            flipped.length < 2 &&
            !flipped.includes(index) &&
            !matched.includes(index)
          ) {
            setFlipped([...flipped, index]);
          }
        }
      
        return (
          <div className="grid grid-cols-4 gap-4">
            {cards.map((card, i) => (
              <Card
                key={i}
                card={card}
                isFlipped={flipped.includes(i) || matched.includes(i)}
                onClick={() => handleCardClick(i)}
              />
            ))}
          </div>
        );
      }

Type Battle uses Pokémon type effectiveness rules:

const typeChart = {
        fire: { beats: ['grass', 'ice', 'bug'], losesTo: ['water', 'rock', 'ground'] },
        water: { beats: ['fire', 'ground', 'rock'], losesTo: ['grass', 'electric'] },
        grass: { beats: ['water', 'ground', 'rock'], losesTo: ['fire', 'ice', 'bug'] },
        // ... more types
      };
      
      function determineWinner(player: Type, opponent: Type) {
        if (typeChart[player].beats.includes(opponent)) return 'win';
        if (typeChart[player].losesTo.includes(opponent)) return 'lose';
        return 'draw';
      }

Evolution Chain uses react-beautiful-dnd for drag-and-drop:

import { DragDropContext, Droppable, Draggable } from 'react-beautiful-dnd';
      
      function EvolutionChain() {
        const [chain, setChain] = useState<Pokemon[]>([]);
      
        function handleDragEnd(result: DropResult) {
          if (!result.destination) return;
      
          const items = Array.from(chain);
          const [reordered] = items.splice(result.source.index, 1);
          items.splice(result.destination.index, 0, reordered);
      
          setChain(items);
        }
      
        return (
          <DragDropContext onDragEnd={handleDragEnd}>
            <Droppable droppableId="chain">
              {(provided) => (
                <div {...provided.droppableProps} ref={provided.innerRef}>
                  {chain.map((pokemon, index) => (
                    <Draggable key={pokemon.id} draggableId={String(pokemon.id)} index={index}>
                      {(provided) => (
                        <div
                          ref={provided.innerRef}
                          {...provided.draggableProps}
                          {...provided.dragHandleProps}
                        >
                          <PokemonCard pokemon={pokemon} />
                        </div>
                      )}
                    </Draggable>
                  ))}
                  {provided.placeholder}
                </div>
              )}
            </Droppable>
          </DragDropContext>
        );
      }

All games use Tailwind CSS for styling. Responsive design works well on desktop and tablet, less well on mobile (cards are too small on phones <400px width).

Added localStorage caching for PokéAPI responses to avoid rate limiting:

async function fetchPokemon(id: number) {
        const cached = localStorage.getItem(`pokemon-${id}`);
        if (cached) return JSON.parse(cached);
      
        const response = await fetch(`https://pokeapi.co/api/v2/pokemon/${id}`);
        const data = await response.json();
      
        localStorage.setItem(`pokemon-${id}`, JSON.stringify(data));
        return data;
      }

Cache hit rate is ~85% after first game load (measured by console.log counting cache hits vs API calls over 20 game sessions).

Results

Games built: 9 complete games with different mechanics
Total code: ~4,800 lines (mostly game logic + Tailwind classes)
API calls per load: ~50 initial, then ~7 on subsequent loads (85% cache hit rate)
Kid approval: 10/10 (measured by screaming with joy and refusal to stop playing)

Pitfalls / What Broke

PokéAPI rate limiting happens when loading 50+ Pokémon in parallel. Got 429 responses on first load. Fixed by adding the localStorage cache mentioned above. Also added request throttling (max 10 concurrent requests) but cache made it unnecessary.

Responsive design breaks on mobile. Games look great on desktop (1920x1080), okay on tablet (768px), terrible on phone (<400px). The card grid uses fixed sizes that don’t scale. Need to add media queries or switch to CSS Grid with auto-fit.

No difficulty levels. All games hardcoded to medium difficulty (8 cards for memory match, 10 questions for quiz, etc.). Should add easy/medium/hard modes that adjust card count, time limits, and question difficulty.

No sound effects. Games feel flat without audio feedback. Should add Pokémon cries (available from PokéAPI), success/failure sounds, and background music. Looked into Howler.js for audio management but didn’t implement yet.

Quiz questions are random. Sometimes you get “What type is Pikachu?” three times in a row. Should track asked questions and ensure variety.

Fix mobile responsive issues (media queries or CSS Grid)
Add difficulty settings (easy/medium/hard)
Implement sound effects using Howler.js
Add quiz question tracking for better variety
Deploy to Vercel and share with other parents
Add leaderboards (localStorage for now, maybe Supabase later)

What’s Next

TPMJS: Build env var propagation to collections, improve BM25 tool loading with minimum score thresholds, add env var versioning/history
JSON Resume: Add feedback deduplication with unique constraints, build feedback history page, validate and fix malformed company URLs
Omega: Automate schema drift detection with daily cron jobs, sandbox the PostgreSQL executor with read-only mode, replace personal Twitter credentials with OAuth
Blocks: Add documentation search (Algolia or custom), build interactive code playground with sandboxing, extract tutorial code from tests to prevent drift
Isolator: Extract icons to @base-ui/icons package, add component tests with Vitest, build DataTable and DatePicker components
Pokémon Games: Fix mobile responsive design, add difficulty settings, implement sound effects, deploy to production
Cross-project: Unify design systems across TPMJS, Blocks, and Isolator—they’re using slightly different component APIs and it’s getting messy

Links & Resources

Projects

TPMJS - Tool Package Manager for AI Agents
JSON Resume Pathways - Career visualization and job recommendations
Blocks Documentation - CLI validation library
Omega Discord Bot - AI-powered Discord moderation

NPM Packages

@tpmjs/ui - Design system components
@tpmjs/cli - Command-line interface for TPMJS
@tpmjs/tools-sprites - Sprites.dev VM management tools
@tpmjs/tools-hllm - HLLM API integration
@tpmjs/tools-unsandbox - Unsandbox code execution tools
@tpmjs/executor - Multi-provider code execution engine

Tools & Services

Vercel AI SDK - AI generation with GPT-5.2
PokéAPI - Pokémon data and sprites
Sprites.dev - Ephemeral cloud VMs
exe.dev - Long-running VM management
unsandbox - Code execution sandboxing
BM25 algorithm - Relevance ranking for tool discovery
react-beautiful-dnd - Drag and drop for React

Inspiration

WARP.md convention - Machine-readable codebase documentation
DALL-E 3 - AI-generated comics and images

← Back to posts