How Ghostty Creator Mitchell Hashimoto Uses AI Agents [With Examples]

Q: What does Mitchell Hashimoto first recommend to AI skeptics?

Not code generation, but 'delegating research.' He considers this the lowest-risk, highest-reward use case - for example, having an agent enumerate every library in a specific language with a specific license type, and summarize each candidate's pros, cons, level of activity, and reputation in a few pages.

Mitchell Hashimoto — co-founder of HashiCorp and the creator of Terraform, Vagrant, Vault, and more. He now develops Ghostty, a fast terminal emulator. He was originally an engineer skeptical of AI, which makes his hands-on perspective all the more persuasive.

Mitchell Hashimoto speaks about how to work with AI agents from the standpoint of a "former AI skeptic" who refuses to be swept up by the hype. The keys: "agents, not chatbots," "start with research," "harness engineering," and the principle that the human must remain the expert and reviewer. In this article we introduce his approach with concrete examples.

This article is based on our research of Mitchell Hashimoto's blog (mitchellh.com, primary source) and The Pragmatic Engineer, which reported on it (secondary source). For quotes from X (formerly Twitter), we recommend a final check of the exact wording. Sources are listed at the end of the article.

1. Use "agents," not "chatbots"

The first thing he emphasizes is that the value comes from agents, not chatbots. An agent is "an LLM you can chat with that can invoke external behavior in a loop" - at minimum something that can read files, run programs, and send HTTP requests. He candidly admits he was "not very impressed" with Claude Code at first, but argues that adopting any tool always goes through three stages.

(1) a period of inefficiency (2) a period of adequacy, then finally (3) a period of workflow and life-altering discovery.
— My AI Adoption Journey (2026-02-05)

2. Delegate "research" first

The entry point he recommends to AI skeptics is not code generation but delegating research. He says this is the lowest-risk, highest-reward use case.

There's a lot of people like, "I don't want it to write code for me." But just delegate some of the research part.
— The Pragmatic Engineer (secondary source, quoting his own words)

Specifically, on his blog he gives uses such as the following.

Deep research: delegate investigations like "enumerate every library in a specific language that has a specific license type," and have it summarize each candidate's pros/cons/level of activity/reputation in a few pages
Parallel agents: have multiple agents simultaneously try out "vague ideas you wanted to attempt but never had time to start"

3. A concrete example: practice in Ghostty

What sets him apart is that he publishes real examples from his own large-scale OSS project, Ghostty, rather than staying abstract.

The command palette that ships for macOS in Ghostty today is only very lightly modified from what Gemini produced for me in seconds.
— My AI Adoption Journey (2026-02-05)

He also documents in detail the process of building the macOS "subtle auto-update notification" feature in a piece titled "Vibing a Non-Trivial Ghostty Feature", disclosing concrete numbers.

$15.98

Token cost

Agent sessions

~8 hours

Actual working time over 3 days

What stands out is that he writes about failing roughly four times and changing strategy to fix one bug. He did not just let the agent run on autopilot - the human intervened and switched approaches. That is where his way of working shows through.

4. "Harness engineering"

His core concept is harness engineering. The definition goes like this.

Anytime you find an agent makes a mistake, you take the time to engineer a solution such that the agent never makes that mistake again.
— My AI Adoption Journey (2026-02-05)

In other words, "every time an agent makes a mistake, engineer a solution so it never makes that mistake again." He says there are two means.

Implicit prompt improvement (AGENTS.md) — Ghostty's repository actually contains a real file called src/inspector/AGENTS.md
Tools you actually program — such as scripts that take screenshots, or scripts that run a narrowed-down set of tests

On verification, too, he leaves words that resonate with Boris Cherny and Simon Willison.

If you give an agent a way to verify its work, it more often [than] not fixes its own mistakes and prevents regressions.
— My AI Adoption Journey (2026-02-05)

5. The warning of "agent psychosis"

And here is his sharpest warning. An agent he had optimize the renderer produced an "amazing result," cutting frame time from 88ms to 2ms and allocations from ~150,000 to 500. But —

Sounds good, right? Wrong. This is exactly why agent psychosis is a big fucking problem.
— Mitchell Hashimoto's post on X (primary source; wording to be verified)

In fact, the version he rewrote by hand ran in about 0.02ms (roughly 75x faster than the agent's version), with zero allocations. The danger of taking "plausible-looking results" at face value - he calls this "agent psychosis" and treats it as his greatest caution.

Lesson: the better an agent's result looks, the more you should think, analyze, and verify it yourself. For OSS, he goes so far as to say "I used to trust by default; now I reject by default," and even says he deliberately plants prompt-injection "traps" in AGENTS.md and code comments to catch people who pass AI code along without reviewing it (primary source, X; to be verified).

Our perspective: Hashimoto's stance strongly overlaps with Hashito System's philosophy. AI is a tool that amplifies rather than replaces experts, and humans bear ultimate responsibility for quality. "Start with research," "eliminate mistakes through systems," "verify rather than trust results" - these are universal practices for steadily bringing new technology into real work. His honest self-assessment that "I can currently run background agents only about 10-20% of my working days" is also useful as a realistic, unexaggerated target.

Series: Learning How to Use Claude Code from Renowned Engineers

Overview: Learning How to Use Claude Code from Renowned Engineers
Armin Ronacher — Optimizing the Loop (Environment)
Simon Willison — Designing the Agent Loop
Boris Cherny — How the Claude Code Developer Himself Uses It
Mitchell Hashimoto (this article)

References

Mitchell Hashimoto, "My AI Adoption Journey" (February 5, 2026 / primary source) — https://mitchellh.com/writing/my-ai-adoption-journey
Mitchell Hashimoto, "Vibing a Non-Trivial Ghostty Feature" (primary source) — https://mitchellh.com/writing/non-trivial-vibing
Gergely Orosz / The Pragmatic Engineer, "Mitchell Hashimoto's new way of writing code" (February 25, 2026 / secondary source) — https://newsletter.pragmaticengineer.com/p/mitchell-hashimoto
Mitchell Hashimoto's posts on X (@mitchellh) (primary source; each quote should be verified live) — https://x.com/mitchellh

Frequently Asked Questions (FAQ)

What does Mitchell Hashimoto first recommend to AI skeptics?

Not code generation, but "delegating research." He considers this the lowest-risk, highest-reward use case - for example, having an agent enumerate every library in a specific language with a specific license type, and summarize each candidate's pros, cons, level of activity, and reputation in a few pages.

What is Mitchell Hashimoto's "harness engineering"?

Every time an agent makes a mistake, you engineer a solution so the agent never makes that mistake again. There are two means: implicit prompt improvement via AGENTS.md (Ghostty actually has a real src/inspector/AGENTS.md file), and tools you actually program, such as scripts that take screenshots or run a narrowed-down set of tests.

What does Mitchell Hashimoto mean by "agent psychosis"?

It is a warning about the danger of taking plausible-looking agent results at face value. He cites an agent that reduced frame time from 88ms to 2ms - which nearly impressed him - but when he rewrote it by hand it ran in about 0.02ms (roughly 75x faster) with zero allocations, arguing that the better a result looks, the more you should think, analyze, and verify it yourself.