Hermes has been part of my daily setup for a while. Most of the time it just quietly does its job — but a few moments from these months are worth writing down.

A diagnosis from a machine I’d already left

This one was the unexpected one.

One afternoon I was testing a program on my office MacBook. Initial spot-checks looked clean, so I packed up and headed home for the day. On the way back, I got a call from a colleague: the dashboard was showing odd behavior from my service, but it was running on my personal office machine, so he couldn’t reach the logs himself.

I started debating whether to head back to the office to take a look. Then I remembered: Hermes was still running on that MacBook.

So I asked it:

“There’s a program called xxx running on this Mac. Check whether it’s behaving normally — look at the terminal output and the process logs.”

Hermes did exactly that. It located the process, read what was on the screen, and tailed the relevant logs. A few minutes later, it came back with a clear diagnosis: a different service in the chain — one I depend on for tokens — had just shipped a new interface version, and the environment variables it needed for the new flow hadn’t been configured on that side. My test couldn’t get a valid token, and that was the upstream of every error my colleague was seeing on the dashboard.

I sent the diagnosis over, my colleague configured the missing variables on the interface side, and the issue closed itself. I never had to go back to the office.

That was the moment Hermes paid for itself, several times over.

A bad upgrade, and a surprising self-repair

The roughest moment I’ve had with Hermes happened right after an upgrade.

I noticed my ChatGPT usage was climbing unusually fast. But I was running heavy Codex sessions that day, so the rising usage seemed to track. Plausible. I shrugged it off.

The next day I barely touched Codex. The usage kept climbing.

That’s when I went and looked at Hermes’ own logs. The new version had been quietly erroring out in a tight loop, ever since the upgrade — silent on the foreground, expensive on the backend.

What happened next surprised me more than the bug itself. I copied the error log and handed it back to Hermes:

“Here’s a chunk of your own log from the last 24 hours. Find what’s wrong and fix it.”

It read the log, traced the failure, located the regression, and patched itself. The loop stopped. Usage flattened. I didn’t have to file a bug report or wait for an upstream fix.

The friction is real: a silent runaway after an upgrade is exactly the kind of thing observability is supposed to catch before it shows up on a bill. I’d like Hermes to warn me about its own error rate, not wait for me to notice.

But the recovery — feeding it its own logs and watching it patch itself — was the most “agent-like” experience I’ve had with any tool this year.

The quieter work, too

The stories above are the moments worth telling. Most days, Hermes’ job is much smaller. Two recurring tasks I no longer do by hand:

Bootstrapping a fresh machine

I work across multiple machines — a work laptop, a personal Mac, and a Linux box I keep around for heavier jobs. Every new one used to mean the same 90-minute ritual: Homebrew, dotfiles, zsh, Hugo, Node, Python, keys, aliases. I kept a bash script for it. It always broke on edge cases.

Now I describe the goal:

“Bootstrap this Mac the way you bootstrapped my last one. Skip anything already installed.”

It checks state, decides what’s missing, installs in a reasonable order, and reports what it skipped. I still verify before trusting it with secrets — but the boring 80% is hands-off.

Auditing my own website

Once in a while I have Hermes go through wanglong.cv — stale meta tags, missing alt text, slow images, occasional unindexed URLs. The kind of housekeeping I would otherwise never get around to.

“Audit the latest posts. Flag anything that hurts SEO, accessibility, or load time. Suggest concrete fixes — don’t lecture me.”

I accept some suggestions and push back on others. The point isn’t that it’s always right. The point is that without it, the audit doesn’t happen at all.

A few months in

I don’t have a tidy thesis about Hermes. It’s the tool I reach for when something needs doing on or around my machines, especially when I’m not at the keyboard.

The diagnosis from across town and the self-repair are the standout moments. Most days it’s quieter than that — and that’s the part I’ve come to rely on most.