Sharding Financial Databases

Strategies for scaling ledgers. Partitioning by User ID vs. Entity ID. The "Hot Tenant" problem and handling cross-shard transactions.

As a fintech scales to millions of users and billions of rows, a single PostgreSQL instance becomes the bottleneck. Sharding—splitting the data across multiple physical database nodes—is inevitable. However, financial data has strict consistency requirements that make sharding dangerous. Splitting the ledger incorrectly can lead to broken transactions and "Split Brain" accounting.

Partition Keys: User vs. Entity

User ID Sharding: Good for B2C apps (Cash App, Venmo). All data for User_A lives on Shard 1. Transactions between User_A and User_B (who is on Shard 2) become complex "Cross-Shard" transactions.

Entity/Tenant Sharding: Good for B2B platforms. All data for Company_X lives on Shard 1. This isolates tenants but creates a "Hot Tenant" risk—if one company is huge, that shard overloads while others sit idle.

The Cross-Shard Transaction Problem

If User_A pays User_B, and they are on different shards, you cannot use a standard SQL transaction.

Two-Phase Commit (2PC): The system prepares the commit on both shards. If both say "Ready," it commits. If one fails, both roll back. This is slow and blocks locks.

Saga Pattern (Preferred):

Debit User_A (Shard 1). Status: Pending.

Message Queue event sent to Shard 2.

Credit User_B (Shard 2).

If step 3 fails, send "Compensating Transaction" to Shard 1 to refund User_A.

Frequently Asked Questions

Can I use CockroachDB/TiDB instead?

Yes. Distributed SQL databases handle sharding automatically at the storage layer while presenting a single logical SQL interface. This is often preferred over manual application-layer sharding.

Related Guides