The assumption “you need sharding at scale” is wrong for most workloads. OpenAI has shown that a single PostgreSQL primary with ~50 read replicas handles millions of queries per second for ChatGPT’s backend. Their strategy is boring: connection pooling (PgBouncer cut connection setup time from 50ms to 5ms), read/write separation, careful schema discipline (no new tables on the primary, no full-table rewrites, index creation and removal done CONCURRENTLY), and good observability. Write-heavy workloads are offloaded to Azure Cosmos DB and new tables go to sharded systems, but the core PostgreSQL cluster itself stays unsharded.
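That schema discipline is easy to picture in plain SQL. A minimal sketch, assuming a hypothetical `messages` table (the talk doesn’t publish OpenAI’s actual schema):

```sql
-- Build and drop indexes without blocking writes on a hot primary.
-- CONCURRENTLY trades a slower build for no long-held exclusive lock.
CREATE INDEX CONCURRENTLY idx_messages_user_id ON messages (user_id);
DROP INDEX CONCURRENTLY idx_messages_stale;

-- Avoid full-table rewrites: on PostgreSQL 11+ a column with a
-- constant default is added as metadata only, no row rewrite.
ALTER TABLE messages ADD COLUMN archived boolean NOT NULL DEFAULT false;
```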
This is “Boring tech wins for AI-native startups” in action: a simpler stack means faster AI-assisted shipping, even at extreme scale. If OpenAI can run its core workload on PostgreSQL at millions of QPS, most startups definitely don’t need exotic databases. It also validates the “three-layer AI stack: Memory, Search, Reasoning” vision, where all three layers can run on PostgreSQL (pg_textsearch for BM25, pgvector for semantic search, regular tables for everything else). The meta-lesson: PostgreSQL is infrastructure that survives because it’s the universal substrate that every layer builds on.
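Here’s a sketch of all three layers living in one PostgreSQL instance. The `docs` table is hypothetical, the embedding is shrunk to 3 dimensions so the literal stays readable (real embeddings run to 1536+ dims), and since I can’t vouch for pg_textsearch’s exact API, the lexical query uses Postgres’s built-in full-text search with `ts_rank` as a stand-in for BM25:

```sql
CREATE EXTENSION IF NOT EXISTS vector;  -- pgvector

-- Memory: just a regular table.
CREATE TABLE docs (
  id        bigserial PRIMARY KEY,
  body      text NOT NULL,
  embedding vector(3)
);

-- Search: vector and text indexes on the same rows.
CREATE INDEX ON docs USING hnsw (embedding vector_cosine_ops);
CREATE INDEX ON docs USING gin (to_tsvector('english', body));

-- Semantic search: nearest neighbors by cosine distance (<=>).
SELECT id, body
FROM docs
ORDER BY embedding <=> '[0.1, 0.2, 0.3]'::vector
LIMIT 5;

-- Lexical search: built-in ranking standing in for BM25.
SELECT id, body,
       ts_rank(to_tsvector('english', body),
               plainto_tsquery('english', 'connection pooling')) AS score
FROM docs
WHERE to_tsvector('english', body)
      @@ plainto_tsquery('english', 'connection pooling')
ORDER BY score DESC
LIMIT 5;
```

The point isn’t the specific queries; it’s that one boring database covers memory, lexical search, and vector search without a second system to operate.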