MotherDuck alternatives: managed DuckDB without lock-in
DuckDB is one of my favorite databases. It's an embedded analytical engine that does for OLAP what SQLite did for OLTP: takes a category that used to require a server and makes it a library you pip install or pnpm add. The columnar storage is fast, the SQL dialect is generous, and the ability to query CSV and Parquet files directly from disk without any import step is genuinely revolutionary for anyone who's spent a career building ETL pipelines.
The hosting story for DuckDB is more complicated. MotherDuck is a company that works in partnership with DuckDB's creators to commercialize it as a cloud service. It works well and has real product polish. But MotherDuck is not a thin managed layer over DuckDB. It's a hybrid execution engine (MotherDuck calls it dual execution) that adds a proprietary layer on top, with its own compute-unit pricing model, its own SQL extensions, and its own lock-in. That bothers some people and not others. If you're in the bothered camp, this post is for you.
Contents
- What MotherDuck is and isn't
- Option 1: Layerbase Cloud
- Option 2: Run it locally with SpinDB
- Option 3: Layerbase Desktop
- Option 4: Embed DuckDB in your app
- Option 5: DuckDB on a single VM
- Which one to pick
What MotherDuck is and isn't
MotherDuck is two things at once. It's a hosted DuckDB you can run queries against, and it's a proprietary "hybrid execution" engine that splits work between your laptop and their cloud based on where the data lives. The hybrid execution is the clever part. It's also the part that means MotherDuck is not interchangeable with plain DuckDB.
A few things that surprise people coming from open-source DuckDB:
The SQL dialect is mostly compatible but adds MotherDuck-specific extensions. Cross-database queries between your local DuckDB and MotherDuck's cloud use syntax that doesn't exist in vanilla DuckDB. If your queries lean on those features, migrating off MotherDuck takes work.
The pricing model is compute-unit based. Predicting cost requires understanding when queries run locally versus in the cloud, which involves their hybrid execution logic.
Lock-in is real even though it doesn't look like it. You can export your data anytime (it's just DuckDB underneath), but rewriting queries that relied on MotherDuck-specific features takes effort.
None of this is bad. MotherDuck is a real product solving real problems. It's just not what some people think they're signing up for when they search "hosted DuckDB."
Option 1: Layerbase Cloud
Layerbase Cloud hosts plain DuckDB. No hybrid execution, no proprietary SQL extensions, no compute-unit pricing. Just DuckDB behind a PostgreSQL wire protocol proxy so you can connect with any standard pg client.
psql "postgresql://layerbase:password@<your-host>.cloud.layerbase.dev/duck1?sslmode=require&sslnegotiation=direct"
How this works: DuckDB itself doesn't have a server. It's embedded. To make it network-accessible, we run duckgres (a PostgreSQL-wire proxy in front of DuckDB) inside the container. From the client side, it looks like a Postgres database. From the server side, queries are running through the DuckDB engine with its full feature set.
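Because the client only ever sees a Postgres-style URL, the one thing that tends to go wrong is a password containing characters like @ or /. Here's a minimal sketch of assembling that URL safely in Node. The buildConnectionUrl helper, the example-host name, and the password are all made up for illustration. Also note that sslnegotiation is a libpq parameter, so I'd assume other drivers may ignore or reject it.

```javascript
// Hypothetical helper: assemble a PostgreSQL-style connection URL from parts.
// Percent-encoding the credentials keeps characters like "@" or "/" from
// being misread as URL delimiters.
function buildConnectionUrl({ user, password, host, port, database, params = {} }) {
  const auth = `${encodeURIComponent(user)}:${encodeURIComponent(password)}`;
  const hostPort = port ? `${host}:${port}` : host;
  const query = new URLSearchParams(params).toString();
  const base = `postgresql://${auth}@${hostPort}/${encodeURIComponent(database)}`;
  return query ? `${base}?${query}` : base;
}

// Placeholder values, not real Layerbase credentials.
const cloudUrl = buildConnectionUrl({
  user: 'layerbase',
  password: 'p@ss/word',
  host: 'example-host.cloud.layerbase.dev',
  database: 'duck1',
  params: { sslmode: 'require', sslnegotiation: 'direct' },
});

console.log(cloudUrl);
```

The result can be passed to psql or to any driver that accepts a connection string.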
What you get:
- Standard DuckDB SQL, no extensions or rewrites.
- Direct CSV/Parquet querying from S3 or HTTP works the same as local DuckDB.
- PostgreSQL drivers everywhere. Drizzle ORM, Prisma's PG adapter, the pg npm package, psql, DBeaver, all work.
- Scale-to-zero for free instances. DuckDB cold-starts fast.
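To make the direct-querying point in the list above concrete, here's a rough sketch of building a query that aggregates a remote Parquet file in place. The file URL is a placeholder, the column names are borrowed from the sales example later in this post, and both helper functions are hypothetical. I'm assuming DuckDB's httpfs extension is available on the server, and that a private S3 bucket would need credentials configured separately.

```javascript
// Quote a value as a SQL string literal by doubling embedded single quotes.
function sqlString(value) {
  return `'${String(value).replace(/'/g, "''")}'`;
}

// Hypothetical helper: build a DuckDB query that reads a remote Parquet file
// directly, with no import step.
function remoteParquetQuery(fileUrl) {
  return [
    'SELECT category, SUM(quantity * price) AS revenue',
    `FROM read_parquet(${sqlString(fileUrl)})`,
    'GROUP BY category',
    'ORDER BY revenue DESC',
  ].join('\n');
}

// Placeholder URL; swap in a real https:// or s3:// location.
console.log(remoteParquetQuery('https://example.com/data/sales.parquet'));
```

The same SQL text should work unchanged in a local embedded DuckDB, which is the whole point of hosting the plain engine.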
What you don't get:
- MotherDuck's hybrid execution. If your workload requires splitting work between a local DuckDB and a cloud one, this isn't a substitute. For pure cloud workloads it's fine.
- Multi-user concurrency at MotherDuck's scale. DuckDB is single-writer, multiple-reader by design. For a personal or small-team analytics workload, this is rarely the bottleneck. For a large customer-facing analytics product, ClickHouse is the better fit.
For anyone who likes DuckDB and wants it managed without rewriting queries, this is the option to try first.
Option 2: Run it locally with SpinDB
DuckDB is an embedded database, so "running it" usually means import duckdb in Python or pnpm add duckdb in Node. That works great for most use cases. But sometimes you want DuckDB as a process you can connect to from multiple clients on your laptop, and that's what SpinDB gives you. (What is SpinDB?)
npm i -g spindb # npm
pnpm add -g spindb # pnpm
Create a DuckDB instance:
spindb create duck1 -e duckdb --start
Get the connection URL:
spindb url duck1
postgresql://127.0.0.1:5432/duck1
That's a real DuckDB running behind a PG-wire proxy on your machine. Same setup as Layerbase Cloud, but local. You can connect from any tool that speaks PostgreSQL: psql, DBeaver, TablePlus, the pg npm package, the Postgres adapter in your favorite ORM.
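If you'd rather connect from Node than from a GUI, here's a minimal sketch using the pg npm package against the local URL shown above. The runQuery and loadPgClient names are mine. The driver is loaded lazily and can be swapped out, which is only there so the function can be exercised without a running instance.

```javascript
// Load the `pg` driver on demand (requires `pnpm add pg`).
async function loadPgClient() {
  const pg = await import('pg');
  return (pg.default ?? pg).Client;
}

// Run one query against a DuckDB instance exposed over the PostgreSQL wire
// protocol and return the result rows. `loadClient` is injectable for testing.
async function runQuery(connectionString, sql, loadClient = loadPgClient) {
  const Client = await loadClient();
  const client = new Client({ connectionString });
  await client.connect();
  try {
    const result = await client.query(sql);
    return result.rows;
  } finally {
    // Always release the connection, even if the query fails.
    await client.end();
  }
}

// Usage, assuming the duck1 instance from above is running:
//   runQuery('postgresql://127.0.0.1:5432/duck1', 'SELECT 42 AS answer')
//     .then(console.log);
```

A second script or a notebook can open its own connection to the same URL at the same time, which is the case the embedded library doesn't cover.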
spindb stop duck1
spindb start duck1
spindb list
Use cases where this beats the embedded DuckDB library:
- You want to run queries from multiple clients at once (e.g., a Jupyter notebook and a Node script).
- You want to inspect DuckDB tables with a GUI database tool without writing a Python script.
- You're testing what a Postgres-wire DuckDB feels like before deploying to Layerbase Cloud.
For pure scripting and ETL, the embedded library is still simpler.
Option 3: Layerbase Desktop
Layerbase Desktop wraps SpinDB in a Mac app. For DuckDB, it's especially useful because it gives you a stable hostname/port for a local instance you can leave running. New instance, pick DuckDB, copy the connection string into TablePlus or DBeaver.
If you do a lot of ad-hoc analytical work and don't want to fight with Python environments or Jupyter kernels just to run a SELECT, this is a comfortable way to keep a DuckDB around for query work.
Option 4: Embed DuckDB in your app
This is the option DuckDB was actually designed for. pnpm add duckdb (or the Python, Go, Rust, or .NET equivalent) and you have a database. No server, no connection string, no managed anything.
import duckdb from 'duckdb'
// An in-memory database; pass a file path instead to persist data.
const db = new duckdb.Database(':memory:')
const conn = db.connect()
// Aggregate the CSV directly from disk, no import step.
conn.all(`
  SELECT category, SUM(quantity * price) AS revenue
  FROM read_csv_auto('sales.csv')
  GROUP BY category
  ORDER BY revenue DESC
`, (err, rows) => {
  if (err) throw err
  console.log(rows)
})
That's a complete DuckDB program. No daemon to start, no port to allocate. The data lives in a single .duckdb file (or in memory if you pass :memory:), and your application talks to it through the embedded API.
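The callback style above gets awkward once you chain more than one query. Here's a small sketch of wrapping conn.all in a Promise so it works with async/await. The queryAll name is my own, and it assumes only the callback signature used in the example above (SQL text, optional parameters, then a callback).

```javascript
// Hypothetical helper: wrap the callback-style `all` method of a DuckDB
// connection in a Promise. Extra arguments are passed through as query
// parameters, ahead of the callback.
function queryAll(conn, sql, ...params) {
  return new Promise((resolve, reject) => {
    conn.all(sql, ...params, (err, rows) => {
      if (err) reject(err);
      else resolve(rows);
    });
  });
}

// Usage with the `conn` object from the example above:
//   const rows = await queryAll(conn, 'SELECT ?::INTEGER AS n', 42);
//   console.log(rows);
```

The wrapper doesn't change how DuckDB runs the query; it only changes how the result is delivered to your code.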
When this is the right choice:
- The data lives in the same process as the code that queries it.
- You don't need network access to the database.
- You're doing batch analytical work, not serving concurrent dashboard users.
For data engineering pipelines, internal CLI tools, and analytics inside web app backends where you're loading data on the fly, embedded DuckDB is almost always the right answer. The managed and local-server options become useful when the embedded model doesn't fit (multiple processes, GUI tools, network access).
Option 5: DuckDB on a single VM
If you want to self-host DuckDB as a network-accessible service, run duckgres yourself on a small VM. It's a Postgres-wire proxy that wraps a DuckDB file. Stand up a VM, install DuckDB, run duckgres, point it at your DuckDB file, expose the Postgres port.
You're doing roughly what Layerbase Cloud does internally for DuckDB instances. The trade-off is that you're now responsible for TLS, backups, restarts, monitoring, and disk pressure. For most people, the managed version is worth the price.
Which one to pick
Short version:
- Want managed DuckDB without proprietary extensions? Layerbase Cloud.
- Need DuckDB as a service running on your laptop? SpinDB.
- Want the same thing in a Mac app? Layerbase Desktop.
- Doing analytics inside an existing application or pipeline? The embedded DuckDB library.
- Specifically want MotherDuck's hybrid execution model? Use MotherDuck. It's good at what it does.
The general framing: MotherDuck is great when MotherDuck-specific features are the reason you're picking it. Plain hosted DuckDB is great when you want DuckDB itself and don't want to learn a new product surface to get it. Both are legitimate choices.
If you want a deeper walkthrough of DuckDB itself, getting started with DuckDB covers the engine in detail with a real example, and the ClickHouse vs DuckDB comparison lays out when to pick DuckDB versus a production analytics server.