Hey everyone, Alex here. Welcome back to another edition of Coding with Alex on sysseder.com.
If you're building a modern web application, a local-first desktop app, an edge service on Cloudflare D1, or even a mobile app, there’s a high probability you’re using SQLite. It is, without a doubt, one of the most successful and widely deployed database engines on the planet. And because we’ve all been trained by modern distributed systems best practices, your go-to choice for primary keys is probably the trusty UUID (Universally Unique Identifier).
It makes perfect sense on paper. UUIDs prevent ID enumeration attacks, make offline record creation a breeze, and let you merge databases without key collisions. But if you are blindly using random UUIDs (specifically UUIDv4) as primary keys in SQLite, you are walking into a massive performance trap.
Today, we're diving deep into the storage engine of SQLite to understand why UUIDs cripple its performance, how it affects your memory and CPU, and the practical, modern alternatives you should be using instead to keep your applications lightning-fast.
Understanding SQLite’s Under-the-Hood Storage
To understand why UUIDs hurt SQLite, we first have to understand how SQLite stores data. Unlike some enterprise databases that support clustered and non-clustered indexes in highly customizable configurations, SQLite has a very specific, opinionated way of organizing tables.
By default, unless you explicitly define a table as a WITHOUT ROWID table, every table in SQLite has an implicit, 64-bit signed integer key called the rowid.
If you declare a column as INTEGER PRIMARY KEY, SQLite doesn't create a new index. Instead, that column becomes an alias for the internal rowid. This is incredibly efficient because the table is physically stored as a B-Tree organized by this integer key. Lookups by an INTEGER PRIMARY KEY are lightning-fast because the database engine can traverse the B-Tree directly to find the row data.
Here is what happens when we use a standard UUID primary key in a typical SQLite schema:
CREATE TABLE users (
id TEXT PRIMARY KEY, -- Storing UUID as a string '3f5da1b2-...'
email TEXT NOT NULL,
created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);
Under the hood, SQLite does something you might not expect. Because id is of type TEXT and not INTEGER, it cannot be an alias for the rowid. Therefore, SQLite maintains:
- An implicit, auto-incrementing 64-bit
rowidto organize the physical table B-Tree. - A secondary, unique index B-Tree mapped to your
TEXTUUID column to enforce the primary key constraint.
This means every time you query a user by their UUID, SQLite must perform a lookup in the secondary index B-Tree to find the corresponding rowid, and then perform a second lookup in the table B-Tree to fetch the actual row data. You've instantly doubled your lookup complexity.
The Chaos of Randomness: B-Tree Fragmentation
While the double lookup is bad, the real killer is random write performance, and this comes down to how B-Trees manage memory on disk. This is a concept known as B-Tree page fragmentation.
SQLite organizes its database file into fixed-size "pages" (usually 4096 bytes). When you insert data into a table or index, SQLite writes to these pages. If you use monotonically increasing keys (like standard auto-incrementing integers), SQLite writes data sequentially. It fills up one page, seals it, and moves to the next. The database file grows cleanly on disk, and the cache remains highly locality-friendly.
Now, let's look at what happens with UUIDv4. A UUIDv4 is completely random. When you insert millions of rows with random UUID keys, they must be inserted into the index B-Tree in sorted order. Because the keys are random, they target random pages across your entire index database file.
If a target page is already full, SQLite must perform a page split. It allocates a new page, moves half of the data from the old page to the new one, and inserts the new key. This results in several disastrous side effects:
- High Write Amplification: Instead of writing a few bytes to the end of a file, SQLite is forced to rewrite entire 4KB pages over and over.
- Cache Thrashing: Because the inserts are random, SQLite cannot keep the hot pages in memory. It must constantly read pages from disk into the cache, modify them, and write them back.
- Bloated Database Files: Due to frequent page splits, many pages in your database remain only 50% to 70% full, leading to massive disk space waste.
The Performance Math: String UUIDs vs. Integers
Let's look at the storage size difference. A standard UUID is 128 bits of data. However, many developers store UUIDs as 36-character hexadecimal strings (e.g., '123e4567-e89b-12d3-a456-426614174000').
In SQLite, text is stored using UTF-8 or UTF-16 encoding. Storing a 36-character string takes 36 bytes of storage. Compare this to an 8-byte 64-bit integer. You are using 4.5 times more space per key! When you scale to millions of rows, this difference determines whether your index fits entirely in RAM or if your database has to constantly hit slow SSD/NVMe storage.
A Quick Benchmarking Visual
To visualize this, imagine inserting 1,000,000 records into two different schemas: one using sequential integers, and one using random string UUIDv4s. Here is what the insertion time curve looks like over time:
Insertion Time (ms) ^ | / (UUIDv4 Random Text) | / | / | / | / |________________________________/________ (Sequential Integer / rowid) +------------------------------------------> 0 1M Records
With auto-incrementing integers, the insert time remains flat. With random UUIDs, the insert time starts fast but degrades exponentially as the database size exceeds the page cache limit and page splitting takes over.
How to Fix It: Three Modern Strategies
We can't just abandon unique identifiers. We still need them for APIs, sync engines, and security. Fortunately, there are three excellent strategies to solve this problem without sacrificing the benefits of UUIDs.
Strategy 1: Use ULIDs or UUIDv7 (The Best Approach)
The industry is rapidly moving away from UUIDv4 towards lexicographically sortable identifiers. The two most popular choices are ULIDs (Universally Unique Lexicographically Sortable Identifier) and the new UUIDv7 standard.
Both of these formats combine a millisecond-precision timestamp at the beginning of the identifier with random bits at the end. Because the timestamp is at the front, these IDs are naturally ordered sequentially over time.
When you insert a UUIDv7 or ULID, they are always appended to the right side of the B-Tree index. This completely eliminates random page splits, reduces write amplification, and restores your database write performance back to near-integer speeds.
Here is how you can generate a UUIDv7 in Python and save it to SQLite:
import sqlite3
import uuid6 # pip install uuid6 (supports UUIDv7)
conn = sqlite3.connect('app.db')
cursor = conn.cursor()
# Create table using TEXT but ensure we insert sequential UUIDv7
cursor.execute('''
CREATE TABLE IF NOT EXISTS orders (
id TEXT PRIMARY KEY,
total_amount REAL
)
''')
# Generate a sequential UUIDv7
new_id = str(uuid6.uuid7()) # e.g., '018dc3a1-7b00-7000-8000-0123456789ab'
cursor.execute("INSERT INTO orders (id, total_amount) VALUES (?, ?)", (new_id, 99.99))
conn.commit()
conn.close()
Strategy 2: Storing UUIDs as BLOB (Binary)
If you absolutely must use UUIDs and want to save space, do not store them as 36-character strings. Store them as raw 16-byte binary data using SQLite's BLOB type.
CREATE TABLE users (
id BLOB PRIMARY KEY, -- 16 bytes instead of 36 bytes!
email TEXT
);
When inserting, convert your hex string UUID into raw bytes. In Python, this is incredibly simple:
import sqlite3
import uuid
conn = sqlite3.connect('app.db')
# Generate a standard UUIDv4
my_uuid = uuid.uuid4()
binary_uuid = my_uuid.bytes # Convert to 16-byte representation
conn.execute("INSERT INTO users (id, email) VALUES (?, ?)", (binary_uuid, "alex@sysseder.com"))
conn.commit()
By switching from TEXT to BLOB, you reduce your index storage footprint by more than 50%, allowing much more of your index to fit into SQLite's page cache.
Strategy 3: The Hybrid "Internal vs. External" ID Pattern
If you want the absolute best of both worlds (the maximum performance of SQLite's native rowid and the external safety of a UUID), use the Hybrid Pattern.
In this pattern, you use a standard INTEGER PRIMARY KEY for all internal database joins, foreign keys, and indexing. You then expose a TEXT UNIQUE UUIDv7/v4 column solely for public APIs and client-side lookups.
CREATE TABLE products (
id INTEGER PRIMARY KEY AUTOINCREMENT, -- Super-fast internal joins
public_id TEXT UNIQUE NOT NULL, -- Exposed to the API/Frontend
name TEXT NOT NULL
);
This keeps your foreign key tables tiny (they only need to store 8-byte integers instead of heavy strings), makes your joins incredibly fast, and prevents external users from guessing your database size or iterating through your records.
Conclusion
SQLite is an absolute powerhouse of a database, but it relies on sequential data access to perform at its best. Blindly throwing random UUIDv4 strings into your primary keys is a fast track to fragmented databases, bloated files, and degraded performance.
By transitioning to lexicographically sortable identifiers like UUIDv7, converting your storage type to BLOB, or adopting a hybrid key architecture, you can enjoy all the benefits of distributed unique keys without sacrificing the raw, legendary speed of SQLite.
What about you? Are you currently running UUIDs in production SQLite databases? Have you experienced performance degradation as your tables grew? Let's chat in the comments below!
Until next time, happy coding!