Hey everyone, Alex here. Welcome back to another edition of Coding with Alex on sysseder.com.
If you're building a modern web application, a local-first desktop app, a mobile client, or even a microservice, there's a very high chance you've reached for SQLite. It is, without a doubt, one of the most reliable, battle-tested, and ubiquitous database engines on the planet. And when it comes to designing your schema, there’s another industry default we almost always reach for without thinking twice: the UUID (Universally Unique Identifier) as a primary key.
It makes sense, right? UUIDs let you generate IDs on the client side without waiting for a database round-trip. They prevent ID enumeration attacks (where malicious users guess auto-incrementing integers like /users/42), and they make syncing offline data a breeze. But if you are using UUIDs as primary keys in SQLite, you might be stepping directly into a massive performance trap.
Today, we're going to dive deep into the internals of SQLite to understand why standard UUIDs (specifically UUIDv4) can absolutely wreck your database's write performance, fragment your disk, and bloat your file sizes—and exactly what you should do instead to keep your apps blazing fast.
How SQLite Organizes Data Under the Hood
To understand why UUIDs hurt SQLite, we first have to understand how SQLite stores its data on disk. Unlike some larger database management systems (like PostgreSQL) which store data in heap files and point to them using indexes, SQLite is fundamentally a index-organized database.
By default, unless you explicitly specify otherwise, every standard table in SQLite is organized as a B-Tree (specifically a B+Tree for table data). If your table has a standard INTEGER PRIMARY KEY, SQLite optimizes this beautifully. This column becomes an alias for the internal rowid.
In a standard rowid table, data is stored sequentially on disk pages. When you insert a new row with an auto-incrementing integer key, SQLite does something highly efficient: it appends the new record to the very end of the database file. The database engine simply traverses the B-Tree, finds the rightmost page, and drops the new row in. This is an $O(\log N)$ operation that practically behaves like $O(1)$ sequential disk I/O.
But what happens when we use a UUID?
The Chaos of Randomness
A standard UUIDv4 is 128 bits of pure, beautiful randomness. It looks something like this: f81d4fae-7dec-11d0-a765-00a0c91e6bf6
When you insert a row with a random UUID primary key, SQLite cannot simply append it to the end of the file. To maintain the sorted order of the B-Tree index, SQLite must insert the new key into its exact sorted position within the tree. Because the UUID is completely random, this position could be anywhere: at the beginning of the file, in the middle, or at the end.
This randomness triggers three massive performance penalties: Page Splits, Write Amplification, and Cache Thrashing.
The Triple Threat: Page Splits, Write Amplification, and Thrashing
1. Page Splits
SQLite reads and writes data in fixed-size chunks called pages (usually 4096 bytes). When you insert a random UUID, SQLite must load the page where that UUID belongs. If that page is already full of data, SQLite cannot just squeeze the new row in. It is forced to perform a page split: it allocates a brand new page, moves half of the data from the original page to the new one, and then writes the new row.
This doesn't just happen once. As your database grows, random inserts cause cascading page splits across your entire database file, leaving pages only partially full and fragmenting your data on physical disk.
2. Write Amplification
Because SQLite must write entire pages to disk, inserting a tiny 16-byte UUID and its associated row can force SQLite to rewrite multiple 4096-byte pages back to disk. This is known as write amplification. It degrades SSD lifespan and chokes disk I/O throughput, making your application feel sluggish, especially on write-heavy workloads.
3. Cache Thrashing
SQLite relies on an in-memory page cache to keep hot data readily accessible. When inserting sequential IDs, only the most recent pages need to be kept in cache. But with random UUIDs, every single insert could target any random page in the entire database. If your database is larger than your allocated cache size, SQLite will constantly drop pages from memory and read new ones from disk. This cache thrashing completely kills read and write performance.
Quantifying the Pain: A Quick Benchmark
Let's look at some actual code to see how this plays out in practice. Below is a conceptual Node.js script using the popular better-sqlite3 driver to compare sequential inserts with auto-incrementing integers versus random UUIDv4 strings.
const Database = require('better-sqlite3');
const { v4: uuidv4 } = require('uuid');
const db = new Database(':memory:'); // Using memory for demonstration, but disk shows larger deltas
// Setup Tables
db.exec(`
CREATE TABLE sequential_table (
id INTEGER PRIMARY KEY AUTOINCREMENT,
data TEXT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE uuid_table (
id TEXT PRIMARY KEY,
data TEXT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
`);
const insertSeq = db.prepare('INSERT INTO sequential_table (data) VALUES (?)');
const insertUuid = db.prepare('INSERT INTO uuid_table (id, data) VALUES (?, ?)');
// Benchmark Sequential
console.time('Sequential Inserts (100k)');
db.transaction(() => {
for (let i = 0; i < 100000; i++) {
insertSeq.run(`User data payload ${i}`);
}
})();
console.timeEnd('Sequential Inserts (100k)');
// Benchmark UUIDv4
console.time('UUIDv4 Inserts (100k)');
db.transaction(() => {
for (let i = 0; i < 100000; i++) {
insertUuid.run(uuidv4(), `User data payload ${i}`);
}
})();
console.timeEnd('UUIDv4 Inserts (100k)');
If you run this benchmark on physical storage (like an SSD or NVMe drive), you will notice a stark difference. While the sequential inserts complete almost instantly because they are cleanly appended, the UUIDv4 inserts take significantly longer. Over millions of rows, the performance gap grows exponentially, accompanied by a bloated database file size due to sparse, half-filled pages caused by splits.
The Solution: Lexicographically Sortable Identifiers
So, does this mean we have to abandon the benefits of UUIDs and go back to sequential integers? Absolutely not. We just need identifiers that are time-ordered (lexicographically sortable).
If the first portion of your ID is based on a timestamp, and the second portion contains random data to prevent collisions, you get the best of both worlds: client-side generation, unguessable IDs, and perfect append-only behavior for SQLite B-Trees.
Here are the two best modern alternatives to UUIDv4:
1. ULID (Universally Unique Lexicographically Sortable Identifier)
A ULID is a 128-bit identifier compatible with UUIDs. It consists of a 48-bit timestamp followed by 80 bits of randomness. It is encoded as a 26-character string (using Crockford's Base32) which is URL-safe and case-insensitive.
- Example:
01ARZ3NDEKTSV4RRFFQ69G5FAV - Why it works: Because the timestamp comes first, newly generated ULIDs are always sorted after older ones. This means SQLite can insert them cleanly at the end of the B-Tree index, avoiding page splits entirely.
2. UUIDv7 (The New Standard)
If you prefer to stick to the official RFC standard, UUIDv7 is your answer. Specified in RFC 9562, UUIDv7 uses a 48-bit Unix timestamp at the start, followed by random bits, while maintaining the familiar 8-4-4-4-12 string representation.
- Example:
018c6e28-7600-7000-8000-000000000001 - Why it works: It is native, standardized, and perfectly sortable. SQLite can index it linearly, preserving page density and keeping writes incredibly fast.
Implementing UUIDv7 in SQLite
If you are generating IDs in your application code, simply use a library (like uuid in Node, uuid in Rust, or uuid in Python) to generate a UUIDv7 and insert it as a string or, even better, as a 16-byte BLOB to save space.
Here is an example of creating a highly optimized SQLite table using UUIDv7 stored as a BLOB (which takes up only 16 bytes on disk, compared to 36 bytes for a string):
-- Optimize for space and speed using binary UUIDv7
CREATE TABLE users (
id BLOB PRIMARY KEY, -- Stores raw 16-byte UUIDv7
email TEXT UNIQUE NOT NULL,
name TEXT
) WITHOUT ROWID; -- WITHOUT ROWID saves space when the primary key is not an integer
Note: The WITHOUT ROWID optimization is great for tables where you have a non-integer primary key (like a BLOB UUIDv7) and you want to prevent SQLite from maintaining an extra, redundant internal index.
Conclusion: Choose Your Keys Wisely
Architectural decisions that seem trivial on day one—like choosing UUIDv4 for convenience—often become the silent performance killers of day one hundred. By understanding how SQLite structures its data on disk, we can make informed design decisions that keep our databases lean, fast, and scalable.
If you're starting a new project with SQLite or refactoring an existing one, stop using random UUIDv4s. Transition to UUIDv7 or ULIDs. Your application’s disk I/O, cache hit ratios, and future self will thank you.
What about you?
Have you run into scaling bottlenecks with SQLite in production? What's your go-to ID generation strategy? Let me know in the comments below, or share this article with your team's lead architect!
Until next time, happy coding!