Michael Zhu
Vibe FPS

I spent the last two days vibe coding a 1v1 browser-based FPS game.

The goal was to have the AI do as much of the work as possible, with my responsibility limited to playtesting. I wasn't able to fully delegate the entire software development process, but I made it pretty far: about 95% of the code was written by AI.

I stepped in a few times when a feature/bug had to be addressed via an architectural change. The triggers for this were:

It should be possible for an agent to handle these cases with the right prompting, but I haven't yet figured out a reliable way to do so. Amp and oh-my-opencode have an interesting idea: calling an oracle model to help with architectural/design decisions. The oh-my-opencode prompt invokes the oracle when the user explicitly requests architectural guidance or when the main agent fails a task 2+ times. The issue with this is that the main implementation agent can fail silently.
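As a sketch of that escalation rule (the names here are hypothetical, not the actual Amp or oh-my-opencode internals):

```typescript
// Hypothetical sketch of a failure-count-based oracle escalation policy.
// AgentResult and shouldConsultOracle are illustrative names, not a real API.
interface AgentResult {
  success: boolean; // self-reported by the agent -- this is the weak point
}

function shouldConsultOracle(history: AgentResult[], threshold = 2): boolean {
  // Count the trailing run of failures; escalate to the oracle
  // once the agent has failed `threshold` or more times in a row.
  let failures = 0;
  for (let i = history.length - 1; i >= 0 && !history[i].success; i--) {
    failures++;
  }
  return failures >= threshold;
}

// The flaw: if the agent *reports* success (fake tests, hacky fixes),
// the failure counter never trips and the oracle is never consulted.
```

The counter only sees self-reported outcomes, which is exactly why silent failures defeat this trigger.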

Some silent failure cases that I observed:

The fake-tests case can be solved by simply writing tests before implementation.
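For instance, pinning the behavior with an assertion before any implementation exists leaves the agent nothing to game (a hypothetical example; `applyDamage` is an illustrative name, not code from my project):

```typescript
// Test written first: this assertion defines the contract before the
// agent writes any implementation, so it can't substitute a fake test.
function applyDamage(hp: number, damage: number): number {
  // Implementation comes second; health must never drop below zero.
  return Math.max(0, hp - damage);
}

console.assert(applyDamage(100, 30) === 70);
console.assert(applyDamage(10, 50) === 0); // clamped, never negative
```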

For the hacky-solution case, maybe it's fine that a fix breaks design patterns as long as it works? At some point, technical debt builds up until a refactor is necessary, and all the hacky spot fixes will, in theory, be resolved then.

The tradeoff with this approach is that your codebase will often be in a state of poor readability. But honestly, if you fully embrace vibe coding to the point of not even looking at the code, then maybe that's fine.

Overall, I'm pleasantly surprised. My trust in the agents is growing. My main takeaway is that it's SUPER worth investing in code quality upfront (testing, typechecking, linting, etc.) to reduce the cost of tech debt and maximize your long-term shipping speed.