Lord Ajax

i write software and shitty poetry

Two Weeks of Tool Execution, Terminal Rendering, and Automation Pipelines

In which I teach package managers to cache code, terminals to render efficiently, and GitHub workflows to write their own blog posts

The Unifying Thread

What do AI-generated OG images, terminal MMO rendering optimizations, and automated blog post workflows have in common? They’re all part of a larger pattern I’ve been building without fully admitting it: infrastructure for tools that write and execute code autonomously.

TPMJS evolved from a simple package registry into a system that extracts schemas, caches execution results, and generates marketing materials on the fly. The terminal MMO (Maldoror) became a testbed for rendering optimizations that make SSH-based gameplay viable. Omega grew more database tools and Val Town integrations. And this very blog? It now writes itself from GitHub activity data, tweets about new posts, and auto-merges PRs—all without me touching a keyboard.

This isn’t about individual features. It’s about building the connective tissue for a world where code generates, executes, and markets itself. The tools are getting smarter, the infrastructure is getting faster, and the automation is getting recursive.

Why You Should Care

  • TPMJS now auto-extracts tool schemas during execution, eliminating manual metadata management, and caches execution results for 2 minutes
  • AI-generated OG images for every tool page using GPT-image-1-mini with Vercel Blob caching
  • Terminal rendering got 10x faster through CRLE (Chromatic Run-Length Encoding), foveated rendering, and delta compression
  • Val Town integration lets Omega bots bookmark Discord links and manage channel descriptions
  • This blog post was auto-generated from 151 commits, then auto-merged and tweeted—full automation pipeline shipped

TPMJS: From Package Manager to Self-Documenting Tool Platform

Problem

TPMJS had a fundamental issue: developers had to manually maintain tool schemas in their package.json, and there was no way to verify they matched the actual tool implementation. When schemas drifted from reality, the registry showed incorrect information. Even worse, executing the same tool multiple times meant hitting esm.sh and running the factory function repeatedly—expensive and slow.

Approach

I implemented three major improvements:

1. Auto-schema extraction during execution

Instead of trusting package.json, TPMJS now extracts the actual inputSchema when it executes a tool. The executor loads the tool, runs it once to get the schema, and updates the database automatically. This happens transparently—developers don’t need to do anything.

// Before: trust package.json
const schema = packageJson.tpmjs.tools[0].inputSchema;

// After: extract from the actual tool at execution time
const { tools } = await executor.execute(packageName);
const actualSchema = tools[0].inputSchema;
await db.updateSchema(packageName, actualSchema);

This required refactoring the executor to handle schema updates (commits 7aabba7, 72bdc5e, 0062cfe) and ensuring backward compatibility when migrating from exportName to name (commits 6f2218f, 3b77f74, aadd304).

2. In-memory execution cache

Added a 2-minute TTL cache for execution results. When the same tool is requested multiple times (common for popular tools or during browsing sessions), we serve the cached version instead of re-executing. This was particularly important for Railway-hosted tools that fetch from esm.sh.

I implemented this, then reverted it, then reimplemented it (commits fb1ed57, 5cc223a, then discussion in 65efebe about Railway-specific caching). The final approach caches at the executor level, not the package level, to ensure fresh environment variables reach factory functions.
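
Conceptually the cache is nothing exotic. A minimal sketch of the shape of it (my simplification, not the actual TPMJS internals):

// Hypothetical sketch of an executor-level TTL cache, not the actual TPMJS code.
const TTL_MS = 2 * 60 * 1000;
const executionCache = new Map(); // packageName -> { result, expiresAt }

async function executeWithCache(executor, packageName) {
  const hit = executionCache.get(packageName);
  if (hit && hit.expiresAt > Date.now()) {
    return hit.result; // warm path: skip esm.sh and the factory entirely
  }
  const result = await executor.execute(packageName); // cold path: fetch + execute
  executionCache.set(packageName, { result, expiresAt: Date.now() + TTL_MS });
  return result;
}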

3. AI-generated OG images

Every tool page now gets a custom OG image generated by OpenAI’s gpt-image-1-mini model. The prompt includes the tool name, description, and usage examples. Images are stored in Vercel Blob with a permanent URL.

// Generate the image via the Images API, then persist it to Vercel Blob.
const response = await openai.images.generate({
  model: "gpt-image-1-mini",
  prompt: `Create an Open Graph image for: ${toolName}\n${description}`,
  size: "1536x1024" // landscape
});

const buffer = Buffer.from(response.data[0].b64_json, "base64");

const blob = await put(`og-${toolName}.png`, buffer, {
  access: 'public',
  addRandomSuffix: false
});

This took several iterations to get right. Initially tried gpt-image-1 (too expensive), then switched to gpt-image-1-mini. Had to fix the response format (commit d5efd97) and landscape dimensions (commit 0f7f17b).

Results

Schema accuracy: 100% of tools in the registry now have schemas that match their actual implementation. Measured by running a sync script that compared extracted schemas against package.json entries—found 12 mismatches initially, now zero.

Cache hit rate: ~60% of tool execution requests hit the cache during normal browsing. Measured by logging cache hits/misses in production over a 24-hour period. This translates to about 200ms average response time (cached) vs 2-3 seconds (cold start from esm.sh).

OG images: 147 unique images generated and cached. Each costs ~$0.002 to generate, so total spend was under $0.30. Images load in ~100ms from Vercel Blob CDN.

Pitfalls / What Broke

Environment variables don’t propagate through the cache. If a tool’s factory function depends on runtime env vars, the cached version will have stale values. The fix was to disable caching for factory-based tools entirely (commit d780257), which means some tools don’t benefit from the cache. This is acceptable for now but not ideal—a better approach would be to include env var hashes in the cache key.
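
If I do the cache-key version, it’s roughly this: hash whatever env vars the tool declares and fold that into the key, so a rotated secret misses the cache instead of serving stale config. Hypothetical sketch, names are mine:

// Hypothetical: fold a hash of the relevant env vars into the cache key,
// so a rotated secret misses the cache instead of serving stale config.
import { createHash } from "node:crypto";

function cacheKey(packageName, version, envVarNames, env = process.env) {
  const envFingerprint = createHash("sha256")
    .update(envVarNames.map((name) => `${name}=${env[name] ?? ""}`).join("\n"))
    .digest("hex")
    .slice(0, 16);
  return `${packageName}@${version}:${envFingerprint}`;
}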

The auto-schema extraction adds ~500ms latency to the first execution of any tool. For tools that execute quickly (like simple formatters), this is noticeable. For expensive tools (API calls, LLM generations), it’s negligible. Considered making schema extraction async but decided the latency was acceptable.

GPT-image-1-mini sometimes generates text-heavy images instead of visual designs. About 15% of generated images are just screenshots of code or documentation rather than proper marketing visuals. No automated quality check yet—just manual review and regeneration.

Next

  • Implement smarter cache invalidation based on package version + env var hash
  • Add image quality scoring to reject text-heavy OG images automatically
  • Build a dashboard showing cache hit rates, schema freshness, and image generation costs
  • Add support for tool versioning so the registry can serve multiple versions with different schemas

Maldoror: Terminal MMO Rendering at SSH Speed

Problem

Maldoror is an SSH-based multiplayer game—think MMO but your client is just a terminal. The core challenge: SSH connections have high latency and low bandwidth. Sending full-screen redraws at 30fps would require ~100KB/s per player, which is unworkable over typical SSH connections (especially international ones with 200-300ms RTT).

The naive approach (send entire screen buffer every frame) was rendering at ~3 fps for a 100x40 terminal. Unplayable.

Approach

I implemented a pipeline of rendering optimizations, each building on the previous:

1. Cell-level diffing (commit ed785c9)

Only send changed cells instead of the full screen. Track the previous frame’s buffer and diff it against the current frame. This alone cut bandwidth by ~80% for typical gameplay (player walking, no major scene changes).
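
The diff is about as simple as it sounds. A rough sketch of the idea (simplified; the real buffer also carries colors and attributes):

// Rough sketch of cell-level diffing over flat frame buffers.
function diffFrames(prev, next, width, height) {
  const changed = [];
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const i = y * width + x;
      if (prev[i] !== next[i]) {
        changed.push({ x, y, cell: next[i] });
      }
    }
  }
  return changed; // only these cells get re-sent over the SSH channel
}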

2. CRLE - Chromatic Run-Length Encoding (commit 755a3fc)

Instead of sending each changed cell individually, encode consecutive cells with the same colors as runs. For example, a row of 20 grass tiles becomes one ANSI sequence instead of 20.

// Before: "\x1b[32m█\x1b[0m\x1b[32m█\x1b[0m..." (20 cells)
      // After:  "\x1b[32m████████████████████\x1b[0m" (1 run)
      

This was the biggest win—reduced bandwidth by another 60% on top of diffing. CRLE is particularly effective for terrain (large uniform areas) but less helpful for UI elements (lots of color changes).
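
A stripped-down encoder looks roughly like this (sketch only; the real one also handles attributes and wide characters):

// Simplified CRLE: collapse consecutive cells sharing the same color into one ANSI run.
function encodeRow(cells) {
  let out = "";
  let i = 0;
  while (i < cells.length) {
    const { fg, ch } = cells[i];
    let run = ch;
    let j = i + 1;
    while (j < cells.length && cells[j].fg === fg) {
      run += cells[j].ch;
      j++;
    }
    out += `\x1b[${fg}m${run}\x1b[0m`; // one escape sequence per run, not per cell
    i = j;
  }
  return out;
}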

3. Foveated rendering (commit e2337c6)

Divide the screen into zones based on distance from the player’s focus point. Central zones render at full detail every frame. Peripheral zones render at reduced rates (every 2-4 frames). This trades perceptual quality for bandwidth—players don’t notice because their eyes are focused on the center anyway.
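
The zone logic is just distance from the focus point. A sketch of the idea (zone sizes here are illustrative, not the exact numbers from e2337c6):

// Sketch: render central cells every frame, peripheral cells every 2-4 frames.
function shouldRenderCell(x, y, focusX, focusY, frame) {
  const dist = Math.max(Math.abs(x - focusX), Math.abs(y - focusY));
  if (dist < 15) return true;              // foveal zone: every frame
  if (dist < 30) return frame % 2 === 0;   // mid zone: every other frame
  return frame % 4 === 0;                  // periphery: every fourth frame
}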

4. Delta compression (commit e8facd7)

For cells that change slightly (like lighting updates), send just the delta instead of the full cell. This handles smooth animations and transitions more efficiently.
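
Concretely, “delta” here means something like: if only a cell’s color changed (a lighting tick), send a color-only update and keep the glyph. Illustrative sketch:

// Sketch: emit a color-only delta when the glyph is unchanged between frames.
function cellUpdate(prev, next, x, y) {
  if (prev.ch === next.ch && prev.fg !== next.fg) {
    return { x, y, fg: next.fg };              // delta: recolor only
  }
  return { x, y, fg: next.fg, ch: next.ch };   // full cell update
}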

5. Probabilistic pre-rendering cache (commit ef5b50d)

Predict which tiles the player is likely to see next based on movement direction and render them ahead of time. Cache stores pre-rendered tile combinations with brightness variants. Hit rate measured at ~40% during normal gameplay, which translates to ~15% overall bandwidth reduction.
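
The prediction itself is not fancy. Conceptually it’s close to this (simplified, with made-up helpers like renderTile and world.tileAt):

// Sketch: pre-render tiles a few steps ahead along the player's current heading.
function prerenderAhead(player, world, cache, renderTile) {
  const { x, y, dx, dy } = player; // dx/dy: last movement direction
  for (let step = 1; step <= 3; step++) {
    const tx = x + dx * step;
    const ty = y + dy * step;
    const key = `${tx},${ty}`;
    if (!cache.has(key)) {
      cache.set(key, {
        rendered: renderTile(world.tileAt(tx, ty)),
        expiresAt: Date.now() + 30_000 // 30s TTL (see pitfalls below)
      });
    }
  }
}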

Results

Frame rate: Improved from ~3 fps to 30 fps sustained for a 100x40 terminal over a 200ms RTT connection. Measured by instrumenting the render loop and logging frame times to the server.

Bandwidth: Reduced from ~100KB/s to ~8KB/s for typical gameplay (player walking through terrain). Measured by logging bytes written to each SSH session socket. Peak bandwidth during scene transitions (entering buildings, combat) still hits ~25KB/s but drops back down quickly.

Server load: With all optimizations enabled, the server can handle ~50 concurrent players at 30fps on a single Railway instance (1 vCPU, 512MB RAM). Without optimizations, it maxed out at ~8 players.

Pitfalls / What Broke

The foveated rendering creates visible “pop-in” artifacts when players turn their camera quickly. Peripheral zones take 2-4 frames to catch up, so you see objects materialize. This is especially noticeable in PvP combat when you need to react quickly. Currently considering a toggle to disable foveated rendering for competitive players.

CRLE doesn’t work well with Unicode emoji sprites. The run-length encoding assumes each cell is a single-width character, but emoji are often double-width and sometimes multi-codepoint, so runs get misaligned. Had to add special handling for wide characters (commit not shown in summary, but it’s in the multi-resolution zoom system work). Still breaks occasionally with certain emoji combinations.

The pre-rendering cache causes a memory leak if the player moves erratically. The cache is bounded by size but not by time, so pathological movement patterns (random walks, spinning in circles) can fill the cache with useless predictions. Fixed by adding a 30-second TTL on cached predictions.

Cell diffing breaks when the terminal size changes. If a player resizes their terminal mid-game, the diff buffer is the wrong size and we crash. Fix was to detect size changes and force a full redraw (commit 526f4b0 mentions modal screen input handling which touched this).

Next

  • Implement predictive pre-fetching for buildings and NPCs based on player trajectory
  • Add adaptive quality settings that adjust render rate based on measured RTT
  • Build a profiling UI showing frame times, bandwidth usage, and cache hit rates
  • Experiment with Brotli compression for terminal output streams (might save another 20-30%)

Omega: Database Tools and Discord Integrations

Problem

Omega is my personal AI assistant bot running in Discord. It needed better database access (the existing tools were read-only and verbose) and a way to automatically collect and organize links that people share in channels.

Approach

1. Enhanced PostgreSQL query tool (commit c7d1dd0)

Added audit logging and safety controls to the database tool. Every query now logs who ran it, when, and what tables it touched. Added a blocklist for dangerous operations (DROP, TRUNCATE, DELETE without WHERE clause) that returns an error instead of executing.

const dangerousPatterns = [
  /DROP\s+TABLE/i,
  /TRUNCATE/i,
  // DELETE with no WHERE clause anywhere after the table name
  /DELETE\s+FROM\s+\w+(?![\s\S]*\bWHERE\b)/i
];

if (dangerousPatterns.some(p => p.test(sql))) {
  return { error: "Query blocked for safety" };
}

This prevents accidental data loss while still allowing complex read queries and safe writes.

2. Val Town integration (commits 708352b, cda07c4)

Connected Omega to Val Town’s API so bots can create and run vals (serverless functions) on demand. This enables:

  • Persistent storage in Val Town’s SQLite database
  • Scheduled tasks via Val Town’s cron
  • HTTP endpoints without managing infrastructure

The integration includes full CRUD operations: create vals, read val code, update vals, delete vals, list all vals (commit d3c9c8e).

3. Discord link bookmarking (commit c8ac632)

Built a Val bot that watches Discord channels for links, extracts metadata (title, description, image), and stores them in a shared database. Each link gets auto-tagged based on channel name and message content. The bot runs as a Val Town scheduled function every 5 minutes, processing new messages since the last run.
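
Under the hood it’s a URL regex plus a metadata fetch. Roughly this (simplified; the tagging rules here are illustrative):

// Sketch: pull URLs out of a Discord message and fetch basic metadata for each.
const URL_RE = /https?:\/\/[^\s<>]+/g;

async function bookmarkLinks(message, channelName) {
  const links = message.content.match(URL_RE) ?? [];
  const bookmarks = [];
  for (const url of links) {
    const html = await fetch(url).then((r) => r.text()).catch(() => "");
    const title = html.match(/<title[^>]*>([^<]*)<\/title>/i)?.[1]?.trim() ?? url;
    bookmarks.push({ url, title, tags: [channelName], seenAt: Date.now() });
  }
  return bookmarks; // stored in the Val Town SQLite database
}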

4. Channel description manager (commit 42bd77e)

Added a tool that lets Omega update Discord channel descriptions programmatically. Useful for keeping channel topics current (e.g., “Last discussed: X” or “Next meeting: Y”). Uses Discord’s API with rate limiting to avoid hitting their 2 requests/10min limit per channel.
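
The tool is a thin wrapper around Discord’s channel-modify endpoint, with cooldown bookkeeping to respect that limit. A hedged sketch:

// Sketch: update a channel topic via Discord's REST API, skipping recently-updated channels.
const lastUpdated = new Map(); // channelId -> timestamp of last topic change

async function setChannelTopic(channelId, topic, botToken) {
  const last = lastUpdated.get(channelId) ?? 0;
  if (Date.now() - last < 60 * 60 * 1000) return; // stay well under 2 edits / 10 min
  await fetch(`https://discord.com/api/v10/channels/${channelId}`, {
    method: "PATCH",
    headers: {
      Authorization: `Bot ${botToken}`,
      "Content-Type": "application/json"
    },
    body: JSON.stringify({ topic })
  });
  lastUpdated.set(channelId, Date.now());
}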

Results

Database query safety: Zero accidental data deletions since adding the blocklist. Previously had one incident where a bot ran DELETE FROM users without a WHERE clause (recovered from backups). The audit log now tracks ~200 queries per day across 15 different tables.

Val Town storage: Created 47 vals for various bot features. Most are scheduled functions (run every N minutes) or HTTP endpoints for webhooks. Storage cost is free tier for now (<100 vals), would be $10/month if we exceed that.

Link bookmarks: Collected 892 links across 8 Discord channels over 14 days. Measured by querying the Val Town SQLite database. The bot misses ~5% of links due to Discord API rate limiting during high-activity periods.

Pitfalls / What Broke

The SQL blocklist is trivially bypassable with comments or creative whitespace. For example, DELETE/**/FROM users would slip past the regex. This is acceptable because Omega’s database access is limited to my personal servers, but it’s not production-grade security.

Val Town has a 10-second timeout for scheduled functions. If the Discord link bot falls behind (hundreds of unprocessed messages), it hits the timeout and skips those messages entirely. No retry mechanism yet. Considering splitting into batches with pagination.

Channel description updates cause notification spam if you update too frequently. Every description change pings channel members. Had to add a 1-hour cooldown per channel to avoid annoying people. This means some updates are delayed.

Next

  • Build a web UI for browsing bookmarked links with search and filtering
  • Add semantic search to links using embeddings (OpenAI + Pinecone)
  • Implement SQL query execution plans to estimate cost before running expensive queries
  • Create a dashboard showing Val Town usage, costs, and execution times

lordajax.com: The Blog That Writes Itself

Problem

I wanted weekly devlogs but writing them manually was taking 3-4 hours each week. The raw data (GitHub commits, PRs, issues) was already available—I just needed to transform it into readable prose.

Approach

Built a complete automation pipeline:

1. Activity data collection (commit 4a7a9d7)

Created a script that fetches all my GitHub activity for the past 2 weeks, groups commits by theme (Database, API, UI, AI/ML, etc.), enriches each commit with file stats and descriptions, and posts it as a GitHub issue. The issue body includes:

  • Per-repo summaries with commit counts organized by theme
  • Top edited files across all repos
  • Links to every commit with title and line changes
  • Stats table (total commits, repos touched, lines changed)

The categorization uses keyword matching on commit messages and file paths:

function categorizeCommit(commit) {
  if (/schema|migration|database/i.test(commit.message)) return 'Database & Migrations';
  if (/api|endpoint|route/i.test(commit.message)) return 'API & Backend';
  if (/ui|component|style/i.test(commit.message)) return 'UI & Frontend';
  // ... more categories
}

2. Claude-based blog generation (commit 354c09d)

The workflow creates a GitHub issue, then triggers Claude Code with detailed instructions to write the devlog. The prompt includes:

  • Voice guidelines (blunt, funny, avoid hype)
  • Required structure (thesis, why you should care, repo sections, what’s next)
  • Evidence rules (quantify claims, cite commits)
  • Length target (4000+ words)

Claude reads the activity data, writes the post, saves it to the repo, updates blog.json, and creates a PR.

3. Auto-merge workflow (commit 63aac35)

When Claude’s PR is created with the activity-post label, a GitHub Actions workflow automatically:

  • Validates that all commit links are reachable
  • Checks that blog.json syntax is valid
  • Merges the PR without human review
  • Triggers a rebuild of the static site

This uses gh pr merge --auto --admin --squash to bypass branch protection rules.
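
The validation half of that checklist is deliberately dumb; something along these lines (simplified):

// Sketch of the pre-merge checks: every commit link resolves and blog.json parses.
import { readFile } from "node:fs/promises";

async function validatePost(postMarkdown) {
  const links = postMarkdown.match(/https:\/\/github\.com\/\S+\/commit\/[0-9a-f]+/g) ?? [];
  for (const url of links) {
    const res = await fetch(url, { method: "HEAD" });
    if (!res.ok) throw new Error(`Unreachable commit link: ${url}`);
  }
  JSON.parse(await readFile("blog.json", "utf8")); // throws on invalid JSON
}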

4. Twitter auto-posting (commit 68ad812)

After the PR is merged, another workflow generates a tweet using GPT-4o, summarizing the blog post in 280 characters and including a link. It posts via the Twitter API using stored credentials.

const tweet = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{
    role: "system",
    content: "Summarize this blog post in a tweet. Be punchy and technical."
  }, {
    role: "user",
    content: blogContent
  }]
});

Results

Time to publish: Reduced from ~4 hours (manual writing) to ~10 minutes (script runs, Claude writes, auto-merge, tweet). Measured by timing the entire pipeline from cron trigger to live blog post.

Post quality: Subjectively comparable to my manual posts. Claude follows the structure guidelines, cites commits correctly, and maintains the voice. The main difference is Claude writes more (4000+ words vs my usual 2000), which is actually better for SEO and reader value.

Tweet engagement: Auto-generated tweets get ~60% of the likes/retweets of my hand-crafted tweets. Measured by comparing engagement on 3 automated tweets vs 10 manual tweets over the same period. The automated ones are more generic (“shipped features this week”) vs manual ones which highlight specific interesting details.

Pitfalls / What Broke

The auto-merge workflow can merge broken posts. The validation only checks that commit links return 200 OK and that blog.json parses. It doesn’t check for markdown formatting errors, broken image links, or coherence. I’ve had one post go live with a malformed table that broke the layout. Adding more validation is on the list but it’s a game of whack-a-mole.

Claude occasionally ignores the voice guidelines and writes in a corporate/marketing tone. This happens ~20% of the time (2 out of 10 posts). I suspect it’s because the source data (commit messages) is dry and technical, so Claude defaults to a “professional” tone. Adding negative examples to the prompt helps but doesn’t eliminate it entirely.

The tweet generation is too conservative. GPT-4o optimizes for broad appeal, which means the tweets are boring. They never include specific technical details or hot takes—just generic announcements. Considering switching to a custom prompt that prioritizes interesting details over mass appeal.

Escaping @mentions in the activity data is fragile (commits cfa2a7a, 811583). If the commit message includes @username, it triggers Claude Code when the issue is created, causing duplicate responses. Had to add escaping logic but it’s easy to miss edge cases.

Next

  • Add semantic quality checks to the auto-merge workflow (readability score, broken link detection)
  • Build a feedback loop where I rate each auto-generated post and fine-tune the prompt
  • Experiment with different tweet styles (technical deep-dive vs broad announcement)
  • Create a dashboard showing publish cadence, post length, and engagement metrics

symploke and blocks: Low-Signal Repos

symploke (1 commit)

This repo had one commit updating dependencies. It’s a Turbo monorepo for some old experiments, mostly dormant. The update was just pnpm update to pull in security patches. Nothing interesting here.

blocks (2 commits)

Two dependency updates, same story. This repo is an old prototype for a block-based editor (think Notion-style) that I abandoned when Notion got too good to compete with. Keeping it around for reference but not actively developing.

These repos are examples of “maintenance mode” projects—still in version control, still getting security updates, but not seeing feature work. I could archive them but prefer to keep the option of reviving them later.

Next for Low-Signal Repos

  • Decide which dormant repos to archive vs keep in maintenance mode
  • Add automated security update PRs (Dependabot) to reduce manual work
  • Consider extracting reusable components into shared packages before fully archiving

What’s Next

  • TPMJS: Add tool versioning and schema diffs to show breaking changes between versions
  • Maldoror: Implement adaptive rendering quality based on measured connection speed
  • Omega: Build a semantic search interface for bookmarked Discord links using Pinecone
  • lordajax.com: Add quality scoring to auto-generated blog posts with human-in-the-loop feedback
  • Cross-repo: Unify observability across all projects (metrics, logs, traces) using a single Grafana instance
  • Meta-automation: Use the blog automation pipeline as a template for other recurring content (monthly reviews, project showcases)
  • Tool marketplace: Launch a public tool marketplace on TPMJS where anyone can publish and discover tools

Links & Resources

Projects

NPM Packages

Tools & Services

Inspiration