Content is Data: Announcing Sanity Content Lake

Written by Simen Svale

Today, we're excited to announce the release of Content Lake, a real-time database that gives you access to your content however and wherever you need it. It is the culmination of months of engineering work, and years of iterating on a vision, that puts in place a key piece of our long-term strategy to become a unified content platform. Read on to learn more about Content Lake and the GROQ features that are included in this release.

Why we built Content Lake

We’ve been told by countless customers that they view Sanity as the only content platform that covers their needs. We designed Sanity to free organizations from the trappings of typical content management systems (CMS) that force you to think of your content in the context of web pages and content hierarchies. Most headless CMSes have the same limitations but offer an API as if that would address the fundamental flaw of their approach. Although we hate being compared to a headless CMS, the framing is helpful for understanding why we built the Content Lake. A headless CMS is made up of two components: an authoring layer and a database. This post will focus on our enhancements to the database, but suffice it to say that we’ve had a powerful authoring layer for quite some time.

The database is so important to Sanity because of our structured content foundation. Most platforms in our space treat content as a soup of information that is limited and defined by the format it was authored in. Sanity spearheaded a new approach to content and turned the old model on its head. Instead of thinking of the world in terms of web pages, structured content frees you to think of content as data – well-formed records that let you reshape and present your content in any format. But structured content alone doesn’t get modern organizations to where we think they need to be: a place where they can create compelling digital experiences that resonate with their audiences.

You also need a robust, open-source query language that lets you interact with your content. You need support for real-time editing and patching of your content so you can collaborate with humans (and bots!) without locking others out or accidentally overwriting their changes. You need developer tooling that feels familiar the moment you interact with it. You need the ability to transform and shape your content on the fly. You need to base your content delivery on best practices like Portable Text. Our motivation to build this ideal database comes down to a desire to move the entire industry forward. It’s what we think our customers need, and in the end, we felt we had to build it ourselves to ensure it was done properly.

We believe today’s announcement fundamentally transforms how the industry can interact with content today and in the future.

Getting it right for the launch

When we launched Sanity.io in 2017, our APIs were already real-time, patch-based, and had full revision history down to the keystroke. We knew it’d be almost impossible to retrofit our APIs with this functionality later, so we took our time to get it right from the start. In addition, we are very, VERY serious about having APIs that don’t break. This means our team needs to keep APIs active, even those with inconsistent behavior, because customers could be inadvertently relying on those inconsistencies as part of their content management workflows. That makes optimization hard and impractical, and it is one reason we have left our APIs mostly untouched since our launch several years ago.

Enter Content Lake. The Content Lake is the database that we’ve rebuilt from scratch over the past 10 months. We've reimplemented parsers and query planners to create a faster, more consistent solution that is easier for us to optimize. Starting today, the Content Lake is available to all Sanity customers.

API versioning

As mentioned, we really don’t like to make breaking API changes. That's why we are introducing API versioning: even as we fix bugs and release new features, customers relying on our platform can keep offering the same great experiences to their users while they decide how and when to migrate to the new versions of our APIs.

Our API versioning approach is pretty straightforward. Following the convention used by Stripe, all Sanity endpoints are now versioned by ISO date.

The following query URL uses API version 2021-03-25:
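
As a rough sketch of what a date-versioned query URL could look like, the TypeScript helper below assembles one from its parts. The helper name `buildQueryUrl`, the placeholder project ID and dataset, and the sample GROQ query are all hypothetical, and the host and path layout are an assumption based on Sanity's public HTTP API rather than something stated in this post.

```typescript
// Hypothetical helper, for illustration only: builds a query URL that pins
// an API version expressed as an ISO date (YYYY-MM-DD).
// Assumption: the URL layout is https://<projectId>.api.sanity.io/v<date>/data/query/<dataset>
interface QueryUrlOptions {
  projectId: string;  // placeholder, not a real project
  dataset: string;    // e.g. "production"
  apiVersion: string; // ISO date, e.g. "2021-03-25"
  query: string;      // GROQ query string
}

function buildQueryUrl(opts: QueryUrlOptions): string {
  // Reject anything that is not shaped like an ISO date before building the URL.
  if (!/^\d{4}-\d{2}-\d{2}$/.test(opts.apiVersion)) {
    throw new Error(`apiVersion must look like YYYY-MM-DD, got "${opts.apiVersion}"`);
  }
  const base =
    `https://${opts.projectId}.api.sanity.io/v${opts.apiVersion}` +
    `/data/query/${opts.dataset}`;
  // The query is URL-encoded so brackets, quotes, and spaces survive transport.
  return `${base}?query=${encodeURIComponent(opts.query)}`;
}

const exampleUrl = buildQueryUrl({
  projectId: "your-project-id",
  dataset: "production",
  apiVersion: "2021-03-25",
  query: '*[_type == "post"]',
});
console.log(exampleUrl);
```

Pinning the date in every request is what lets older integrations keep their existing behavior while newer ones opt in to fixes by choosing a later date.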
