“You should never build a CMS”

Written by Knut Melvær

We're just going to call it: up until recently, cursor.com was powered by Sanity as its CMS.

Then Lee Robinson sat down and spent 344 agent requests and around $260 to migrate the content and setup to markdown files, GitHub, Vercel, and a vibe-coded media management interface.

He did a great write-up of the process on his blog. He was classy and didn't name us.

Of course, when a high-profile customer moves off your product and the story resonates with builders you respect, you pay attention.

The weird twist here is that we sort of agree with Lee’s take. He has a lot of great points. The conversation around complexity and abstractions that a headless CMS brings reflects real frustration. The way things have been done for the past decade deserved criticism.

But Lee's post doesn't tell the full story. We see what people are trying to solve when it comes to content every day. We live and breathe this CMS stuff. So let us add some context.

What Lee got right

The headless CMS industry built complexity that didn't deliver proportional value for many. This is true.

Preview workflows are clunky. Draft modes, toolbar toggles, account requirements just to see what your content looks like before it goes live. Having to add data attributes everywhere to connect front ends with backend fields feels unnecessary. Real friction for something that feels it should be simple.

Auth fragmentation is annoying. CMS login. GitHub login. Hosting provider login. Three systems to get a preview working.

Their CDN costs was largely caused by hosting a video from our file storage. It’s not an ideal way to host videos in front of Cursor’s massive audience. We should have made it more obvious that there are better and cheaper ways, like using the Mux plugin.

332K lines of code was removed in exchange for 43K new ones. That sounds a great win. We love getting rid of code too.

And here's the one that actually matters: AI agents couldn't easily reach content behind authenticated APIs. When your coding agent can grep your codebase but can't query your CMS, that's a real problem. Lee felt this friction and responded to it. (We did too, and the new very capable MCP server is out).

These complaints are valid. We're not going to pretend otherwise.

What Lee actually built (spoiler: a CMS)

Here's the thing though. Read his post carefully and look at what he ended up with:

An asset management GUI (built with "3-4 prompts," which, to be fair, is impressive)
User management via GitHub permissions
Version control via git
Localization tooling
A content model (markdown frontmatter with specific fields)

These are CMS features. Distributed across npm scripts, GitHub's permission system, and Vercel's infrastructure.

The features exist because the problems are real. You can delete the CMS, but you can't delete the need to manage assets, control who can publish what, track changes, and structure your content for reusability and distribution at scale.

Give it six months. The bespoke tooling will grow. The edge cases will multiply. Someone will need to schedule a post. Someone will need to preview on mobile. Someone will want to revert a change from three weeks ago and git reflog won't cut it. The "simple" system will accrete complexity because content management is complex.

Even with agents. Who were mostly trained within the constraints of these patterns.

What breaks at scale

Lee's model is clean: one markdown file equals one page. Simple. Grep-able.

This works until it doesn't.

The content === page trap

What happens when your pricing lives in three places? The pricing page, the comparison table, the footer CTA. In markdown-land, you update three files. Or you build a templating system that pulls from a canonical source. At which point you've invented content references. At which point you're building a CMS.

What happens when legal needs to update the compliance language that appears on 47 pages? You grep for the old string and replace it. Except the string has slight variations. Except someone reworded it slightly on the enterprise page. Except now you need to verify each change because regex can't understand intent. Now you are building a CMS.

What happens when you want to know "where is this product mentioned?" You can grep for the product name. You can't grep for "content that references this product entity" because markdown doesn't have entities. It has strings.

Suddenly you’re parsing thousands of files on every build to check for broken links (that you can’t query). And yes, you are building a CMS.

Structured content breaks the content = page assumption on purpose. A product is a document. A landing page document references that product and both are rendered together on the website. And in an app. And the support article for that product. When the product information changes, that changes is reflected in all these places. When you need to find every mention, you query the references, not the strings.

Engineers understand this. It's normalization. It's the same reason you don't store customer_name as a string in every order row. You store a customer_id and join.

Markdown files are the content equivalent of denormalized strings everywhere. It works for small datasets. It becomes a maintenance nightmare at scale.

Git is not a content collaboration tool

Git is a version control system built for code. Code has specific properties that make git work well:

Merge conflicts are mechanical. Two people edited the same function. The resolution is structural.
Line-based diffing makes sense. Code is organized in lines that map to logical units.
Branching maps to features. You branch to build something, then merge when it's done.
Async is fine. You don't need to see someone else's changes until they push.

Content has different properties:

Merge conflicts are semantic. Two people edited the same paragraph with different intentions. The "correct" merge requires understanding what both people meant.
Line-based diffing is arbitrary. A paragraph rewrite shows as one changed line that actually changes everything. If you have block content (like Notion) this breaks apart even more.
Branching doesn't map to content workflows. "I'm working on the Q3 campaign" isn't a branch. It's 30 pieces of content across 12 pages with 4 people contributing.
Real-time matters. When your content team is distributed, "I'm editing this doc" needs to be visible now, not after a commit and push. Even more so with AI automation and agents in the mix.

None of this is git's fault. Git solved the problem it was built for brilliantly. Content collaboration isn't that problem.

We know this because every team that scales content on git builds the same workarounds:

Lock files or Slack conversations to prevent simultaneous editing
"I'm working on X, don't touch it" announcements
Elaborate PR review processes that become bottlenecks
Content freezes before launches because merge complexity is too high

Sound familiar? These are the problems CMSes were built to solve. Real-time collaboration. Conflict-free editing. Workflow states that aren't git branches.

Why "Agents can grep" only works up to a point

Lee's core argument: AI agents can now grep the codebase, so content should live in the codebase.

This sounds reasonable until you think about what grep actually does. It's string matching. Pattern finding. It's great for "find all files containing X."

It's not great for:

"All blog posts mentioning feature Y published after September"
"Products with price > $100 that are in stock"
"Content tagged 'enterprise' that hasn't been translated to German yet"
"The three most recent case studies in the finance category"

Here's what that last one looks like in GROQ:

Internal server error