LLM Coding Tools Compared: Claude Code, OpenAI Codex, Google Antigravity & Mistral Vibe

Claude Code has been my go-to tool for vibe coding for a while now. Running in the Terminal, it has access to your entire codebase and can independently open, edit, and create files. I still use ChatGPT for ideation and second opinions.

When Google launched Antigravity in mid-November, I got curious. Antigravity does the same thing, but with a visual interface. That lowers the barrier for those who see the Terminal as an obstacle. OpenAI came out with Codex and Mistral now has Vibe, a CLI (Command Line Interface, working via the Terminal) that puts Europe in the game too.

With four LLM providers now offering their own coding tools, a comparison seemed worthwhile. If you generate a lot of code, going directly to the provider is also financially attractive: you pay direct API costs, without a markup from an intermediary. I tested all four with the same assignment.

// THE_TEST_ASSIGNMENT

The test assignment

To compare the tools fairly, I used one shared assignment: build a webapp called Think Ahead Modules that:

1. Adopts the visual style of any website (colors, fonts, button style)
2. Generates matching "plug-in blocks" (FAQ, newsletter signup, pricing table)
3. Shows a live preview
4. Exports the result as ZIP for integration in other sites
5. Runs an automatic self-check to confirm everything works
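
To give an idea of what the last two requirements involve, here is a minimal Python sketch of a ZIP export followed by a self-check. It is not the code any of the tools produced (their apps were written in JavaScript/TypeScript), and the names `export_modules`, `self_check`, and the example file paths are made up for illustration.

```python
import io
import zipfile


def export_modules(modules: dict[str, str]) -> bytes:
    """Pack generated module files (path -> text content) into an in-memory ZIP."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path, content in modules.items():
            archive.writestr(path, content)
    return buffer.getvalue()


def self_check(zip_bytes: bytes, expected: set[str]) -> list[str]:
    """Return a list of problems; an empty list means the export passed."""
    problems = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        # testzip() returns the first corrupt member name, or None if all are fine.
        if archive.testzip() is not None:
            problems.append("archive is corrupt")
        names = set(archive.namelist())
        for missing in sorted(expected - names):
            problems.append(f"missing file: {missing}")
        for name in sorted(names):
            if not archive.read(name).strip():
                problems.append(f"empty file: {name}")
    return problems


# Hypothetical output of one generated "plug-in block" (plain HTML/CSS/JS).
modules = {
    "faq/index.html": "<section class='faq'>...</section>",
    "faq/style.css": ".faq { color: #111; }",
    "faq/script.js": "console.log('faq ready');",
}
archive_bytes = export_modules(modules)
problems = self_check(archive_bytes, set(modules))
print("self-check passed" if not problems else problems)
```

A real self-check would probably go further, for example by loading the exported HTML in a browser, but even this level of verification catches missing or empty files before a user downloads the ZIP.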

// TEST_SETUP

Test setup

What was the same

• Same app goal and user flow
• Same UI layout (dark, orange accent, sharp corners)
• Same 7 one-click example prompts
• Same output format (plain HTML/CSS/JS)

What was different

Each LLM was instructed to use its own provider's API for module generation: Claude via Anthropic, Antigravity via Google Vertex, Codex via OpenAI, and Vibe via Mistral.

// THE_TOOLS

The four tools tested

"Vibe coding" doesn't have to be done via the Terminal. Most LLM providers offer multiple ways to work with their tools. Below per provider: what options there are, which one I used, and how the test went.

Claude (Anthropic)

Anthropic offers multiple ways to code with Claude:

Interface	Description
CLI (Terminal)	Full access to your local codebase.
VS Code extension	Integration in your editor.
Claude App	Web/desktop interface, without direct file access.
GitHub integration	Works directly in repositories via the browser.

Claude Code CLI

Claude Code was one of the first "agentic" coding CLIs to run locally with full codebase access. Type claude in your terminal and the tool can independently create files, edit them, and execute terminal commands. The other tools in this test follow the same model.

The CLI works well, is fast, and offers a basic but solid interface. Claude Code also has an app interface (since November 2025), but I didn't use it in this test.

Claude Code took about 11.5 minutes to set up a complete tech stack from an empty folder (just a logo) and build the application.

// APP_DEMO

Demo: the webapp built with Claude Code in action

Result: The app worked immediately after starting the local server. Module generation, however, struggled with style extraction: not all elements were retrieved consistently, and prompt tweaks helped only a little.

The UX of the generated web app is reasonably good, but the color contrast is too low to meet WCAG guidelines. After style extraction from the source URL, the extracted values are shown on screen, which wasn't always the case with the other tools. The assignment left that open, so each LLM made its own call. The placement of the publish and ZIP download buttons isn't ideal. A nice touch is the loader animation, which shows that something is being generated.
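
For readers who want to check contrast themselves: WCAG 2.x defines a contrast ratio based on the relative luminance of the two colors, and level AA asks for at least 4.5:1 for normal text and 3:1 for large text. The Python sketch below implements that formula. The function names and the example colors are my own illustration, not values measured from the generated app.

```python
def relative_luminance(hex_color: str) -> float:
    """Relative luminance of an sRGB color given as '#rrggbb' (WCAG 2.x definition)."""
    hex_color = hex_color.lstrip("#")
    channels = [int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Convert each gamma-encoded channel to linear light.
    linear = [
        c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    r, g, b = linear
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio between two colors, from 1.0 (identical) up to 21.0."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground: str, background: str, large_text: bool = False) -> bool:
    """True if the color pair meets the WCAG AA threshold."""
    return contrast_ratio(foreground, background) >= (3.0 if large_text else 4.5)


# Hypothetical colors in the spirit of the test UI (dark background, orange accent).
for fg, bg in [("#ff6600", "#111111"), ("#777777", "#ffffff")]:
    print(fg, "on", bg, round(contrast_ratio(fg, bg), 2), passes_aa(fg, bg))
```

A check like this could also have been part of the app's automatic self-check, so that low-contrast modules are flagged before export.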

Google Gemini

Google's coding tools fall under the Gemini Code Assist umbrella:

Interface	Description
Antigravity	Proprietary visual interface, no Terminal.
CLI (Terminal)	Gemini CLI for local codebase access.
VS Code / JetBrains	Extensions for popular editors.
GitHub Marketplace	Code review agent for repositories.
Firebase / Android Studio	Integrations for Google's own platforms.
Google Cloud	Enterprise integration with GCP services.

Google Antigravity

Antigravity distinguishes itself with a visual interface: on the left you see the file structure "emerge", on the right code is actively written. A Chrome extension enables direct browser communication: taking screenshots, pausing the browser during changes.

The Antigravity interface is really nice, and it looks much like VS Code if you want to write code in it yourself. The overview of your codebase, the implementation plan, and the tasks being executed give a much better sense of oversight and control than the Terminal, where you can only work with text.

The initial setup took only 5 minutes, the fastest of the four.

// APP_DEMO

Demo: the webapp built with Antigravity in action

Result: The app has a clean interface with clear buttons and good contrast. Unfortunately, the style extracted from the source URL isn't shown; it's only used in the background. The only sign that extraction has run is an indication in the bottom left. The buttons appear in logical, visible locations.

The generated module code works, but style extraction was particularly difficult, as it was for all four apps. After the first delivery, a few tweaks were needed to fix some minor bugs.

OpenAI Codex

In 2025, Codex evolved into an ecosystem connected through your ChatGPT account:

Interface	Description
CLI (Terminal)	Open source, built in Rust.
VS Code extension	Editor integration, also for Cursor and Windsurf.
Codex Cloud	Via chatgpt.com/codex. Tasks run in cloud sandboxes.
GitHub integration	Tag @codex in PR comments for reviews or tasks.
ChatGPT sidebar	In the ChatGPT interface and mobile app.

Unified workflow: All interfaces share the same context. Start a task in Terminal, monitor via webapp, review on your phone.

OpenAI Codex CLI

The Codex CLI is open source and built in Rust, which helps it start up quickly. As with Claude Code, you navigate to a directory and type codex to start.

The Codex CLI is comparable to Claude Code: it also works well and fast, but is entirely text-based.

The initial setup took about 7.5 minutes.

// APP_DEMO

Demo: the webapp built with Codex in action

Result: The delivered app is reasonably good UX-wise, but it looks busy because of the font used, and the white area where results should appear gives the app an unfinished feel.

After extraction, the styles are presented, but the generated modules were poorly designed: often white text on a white background. A few tweaks improved this only marginally.

Mistral

Mistral released a complete stack of coding tools in 2025:

Interface	Description
Vibe CLI (Terminal)	Open source CLI for vibe coding.
Mistral Code (VS Code / JetBrains)	Enterprise IDE plugin, currently in private beta.
Zed extension	Vibe CLI as extension in Zed editor.
Le Chat	Chat interface for code tasks, without direct codebase access.

Mistral Vibe CLI

Vibe is open source and runs on Mistral's own models: Europe's answer to American AI. The CLI is relatively new and still lacks some features that competitors have.

Vibe CLI works similarly to the Claude Code and Codex CLIs, although it displays actions somewhat differently. Unfortunately, I couldn't copy text from the Terminal (even though a notification appeared when I tried), and I couldn't upload screenshots to clarify a problem.

After 12 minutes, Vibe reported that the task was complete. However, the app didn't start: all I got was a blue screen without content.

What went wrong:

After repeated attempts to let Vibe fix the problem itself, I received advice that would have cost me a lot of time had I followed it, up to and including hardware upgrades. Also strange: it used an API call that Mistral itself no longer supports, so it was working from outdated information.

Eventually, ChatGPT solved it with one observation:

"Your HTML has a fixed inset-0 div with an opaque gradient that comes after your content in the DOM. It gets 'painted' on top of everything. pointer-events-none only prevents clicks, not visual covering."

A rather silly mistake, but Mistral didn't see it.

// APP_DEMO

Demo: the webapp built with Vibe in action (after fixes)

Result: Once running, the app was fine and comparable to the others. The preview area had no placeholder, though, which made it look as if something had failed to load.

// TECH_STACK

Notable: the tech stack choices

Each tool could decide for itself how to build the assignment. The choices were interesting:

Tool	Framework	Notable
Claude Code	Next.js/React/TypeScript	"All-in-one" approach
Antigravity	Next.js/React/TypeScript	Similar choice
Mistral Vibe	Next.js/React/TypeScript	+ Puppeteer for dynamic styling
Codex	Node/Express + JSON	More pragmatic, looser setup

Three out of four independently chose the same stack. Codex deviated with a simpler backend approach.

Mistral's choice of Puppeteer is interesting: it suggests the tool gave more thought to websites that only build their styling after the page loads. Too bad the rest of the execution was less smooth.

// STUMBLING_BLOCK

The big stumbling block: style extraction

With all four tools, style extraction proved to be the hardest part. Modern websites spread their styling across:

• Multiple stylesheets
• Semantic HTML and class structures
• JavaScript-driven rendering
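
To make the difficulty concrete, here is a deliberately naive Python sketch of style extraction: it counts hex colors, font families, and border radii across a set of stylesheets and picks the most frequent ones. This is my own illustration, not how any of the four tools approached it; `extract_style` and the sample CSS are hypothetical.

```python
import re
from collections import Counter

# 3- or 6-digit hex colors; 8-digit (alpha) values are ignored in this sketch.
HEX_COLOR = re.compile(r"#(?:[0-9a-fA-F]{6}|[0-9a-fA-F]{3})\b")
FONT_FAMILY = re.compile(r"font-family\s*:\s*([^;}]+)")
BORDER_RADIUS = re.compile(r"border-radius\s*:\s*([^;}]+)")


def normalize_hex(color: str) -> str:
    """Lowercase a hex color and expand the short form (#abc -> #aabbcc)."""
    color = color.lower()
    if len(color) == 4:
        color = "#" + "".join(ch * 2 for ch in color[1:])
    return color


def extract_style(stylesheets: list[str]) -> dict:
    """Guess a site's dominant colors, font, and corner radius by frequency."""
    colors, fonts, radii = Counter(), Counter(), Counter()
    for css in stylesheets:
        colors.update(normalize_hex(m) for m in HEX_COLOR.findall(css))
        fonts.update(m.strip() for m in FONT_FAMILY.findall(css))
        radii.update(m.strip() for m in BORDER_RADIUS.findall(css))
    return {
        "colors": [color for color, _ in colors.most_common(3)],
        "font": fonts.most_common(1)[0][0] if fonts else None,
        "radius": radii.most_common(1)[0][0] if radii else None,
    }


# Hypothetical stylesheets, already downloaded as text.
sheets = [
    "body { font-family: Inter, sans-serif; color: #111; background: #FFFFFF; }",
    ".btn { background: #ff6600; border-radius: 0; font-family: Inter, sans-serif; }",
    ".btn:hover { background: #FF6600; }",
]
style = extract_style(sheets)
print(style)
```

Even on this toy input the limits show: the sketch cannot tell an accent color from a body color, it ignores CSS variables, rgb() and hsl() values, it can mistake an ID selector such as #add for a color, and it never sees styles that JavaScript injects after the page loads.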

"The style" is rarely one source you can simply copy. None of the tools solved this convincingly.

// CONCLUSION

My experience

Until now, Claude Code has been my first choice for development, with ChatGPT and Gemini for advice, second opinions, and code reviews. But given the rapid developments at Google and how nicely the app works, I now have reason to make Antigravity my primary tool, perhaps with Claude Code as backup. Time for more testing in the coming period.

In my opinion, Google Antigravity is the winner for now, with Claude Code close behind. OpenAI Codex comes third (it clearly had more difficulty generating usable modules in one go), and Mistral Vibe comes last because of its interface and the mediocre first result of the web app it built.
