Should you use NoSQL?

It depends on your access pattern, not on scale hype. If your data is relational and you ask varied questions of it, stay with SQL. Pick a specific NoSQL store when its shape and its queries genuinely fit yours.

NoSQL is not one thing, and scale is not the question

"NoSQL" lumps together very different databases. A document store like MongoDB, a wide-column store like Cassandra, a key-value store like DynamoDB or Redis, and a graph database are about as alike as a hammer and a paint roller. So "should I use NoSQL?" is the wrong shape of question. The real one is always whether a specific store fits a specific access pattern.

The second confusion is scale. People reach for NoSQL because they heard it is "web scale", as if relational databases tip over at some traffic threshold. They do not. A well-tuned Postgres on decent hardware handles thousands of writes a second and tens of thousands of reads, comfortably more than the vast majority of products ever see. Scale is rarely the real reason to switch, and almost never a reason to switch early.

What decides it is the shape of your data and the shape of your queries. Get honest about both before anything else.

When NoSQL is the right call

When your access pattern is simple and known up front — fetch a session by ID, pull a whole product document, append an event to a time series — a key-value or document store does that one thing with brutal efficiency and scales horizontally without much drama. You hand it a key, it hands you a blob, and adding nodes is a first-class operation rather than a months-long sharding project.
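
As a rough illustration of "hand it a key, get back a blob", here is a minimal sketch of a key-value store that spreads keys across nodes by hashing. The ShardedKV class and the session key are made up for this example, and each "node" is just an in-memory dict. Real stores typically use more elaborate placement schemes, such as consistent hashing, so that adding a node moves only a fraction of the keys.

```python
import hashlib


class ShardedKV:
    """Toy key-value store that spreads keys across in-memory 'nodes'."""

    def __init__(self, node_count: int) -> None:
        # Each node is a plain dict here; in a real store it would be a machine.
        self.nodes = [{} for _ in range(node_count)]

    def _node_for(self, key: str) -> dict:
        # Use a stable hash so the same key always maps to the same node
        # (Python's built-in hash() for str is randomised per process).
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return self.nodes[int.from_bytes(digest[:8], "big") % len(self.nodes)]

    def put(self, key: str, value) -> None:
        self._node_for(key)[key] = value

    def get(self, key: str):
        # One key in, one value out: no scan and no join.
        return self._node_for(key).get(key)


store = ShardedKV(node_count=4)
store.put("session:abc123", {"user_id": 42})
print(store.get("session:abc123"))
```

Every read touches exactly one node, which is why this pattern scales sideways so easily. It is also why any question that is not "give me the value for this key" has nowhere to go.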

The other genuine win is write throughput at scale. Stores like Cassandra spread writes across many nodes and stay available even when some are unreachable. If you ingest a firehose of sensor data or activity events and read it back in predictable ways, that design fits the problem closely.

Document stores also shine when the data really is a self-contained document with a flexible, sparse shape — a catalog where every category has different attributes — and you almost always read the whole thing at once. No joins, no impedance mismatch. The object you store is the object you use.

When to stick with SQL

If your data is relational and you run varied, ad-hoc queries, stay with SQL. A join will run circles around lookups you hand-roll in application code, and the query planner answers questions you did not anticipate at schema-design time. The flexibility is the feature.

Skip NoSQL too when you need multi-record transactions and strong consistency by default, or when the only reason you are reaching for it is to avoid designing a schema. The schema does not disappear when you drop SQL. It moves into your code, scattered and unenforced, and you rebuild it badly.

What NoSQL actually costs

You give up the join and the ad-hoc query. You model for the queries you know, and a new question often means a new table, a duplicated copy of the data, or a slow scan over everything. The flexibility SQL hands you for free becomes engineering work.

You usually give up rich transactions. Many NoSQL stores offer single-item atomicity, not the multi-row, all-or-nothing transactions SQL gives by default. If an invariant spans records — move money between two accounts and both sides must agree — that guarantee is hard to rebuild on a store that does not provide it.
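
To make the money-transfer invariant concrete, here is a small sketch using Python's built-in sqlite3 module as a stand-in for any SQL database. The accounts table, the account names, and the helper functions are assumptions made up for this example. The credit is applied first on purpose, so that a failing debit shows the earlier write being rolled back.

```python
import sqlite3


def open_bank() -> sqlite3.Connection:
    """Create an in-memory database with two illustrative accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " id TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Move amount from src to dst; both updates apply, or neither does."""
    try:
        # Used as a context manager, the connection commits on success
        # and rolls back if an exception escapes the block.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            # An overdraft violates the CHECK constraint and raises here.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn: sqlite3.Connection) -> dict:
    """Return every balance as a dict keyed by account id."""
    return dict(conn.execute("SELECT id, balance FROM accounts").fetchall())
```

Calling transfer(conn, "alice", "bob", 500) on this data returns False and leaves both balances untouched, even though the credit to bob had already been written inside the transaction. On a store with only single-item atomicity, that undo step is yours to build.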

And you give up the database enforcing your schema. Three years and four developers later, the same field exists in six slightly different shapes across your documents, and nothing stopped it. That cost shows up slowly, which is exactly why it surprises people.
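
A small sketch of that difference, with a Python list of dicts standing in for a schemaless collection and the built-in sqlite3 module standing in for a SQL engine. The users table, the field names, and the insert_user helper are hypothetical.

```python
import sqlite3

# A schemaless "collection": every shape is accepted without complaint.
documents = []
documents.append({"id": 1, "email": "a@example.com"})
documents.append({"id": 2, "e_mail": "b@example.com"})  # misspelt key
documents.append({"id": 3, "email": None})              # missing value

# The relational version declares the shape once, and the engine enforces it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")


def insert_user(conn: sqlite3.Connection, user_id: int, email) -> bool:
    """Insert a user; return False if the engine rejects the row."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO users (id, email) VALUES (?, ?)", (user_id, email)
            )
        return True
    except sqlite3.IntegrityError:
        # Raised for NOT NULL and other constraint violations.
        return False
```

All three documents land in the list, while insert_user(conn, 3, None) is refused by the NOT NULL constraint. Some document stores offer optional schema validation, but it is something you opt into rather than the default posture.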

The trap: modelling NoSQL like SQL

The classic mistake is bringing relational habits to a non-relational store. People normalise into separate collections, then need to join them, so they fetch one collection, loop, and fetch the related items one by one in application code. That is a hand-rolled join — slow, and without any of the optimisation a real query planner applies. You have taken the worst of both worlds.
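
The hand-rolled join can be sketched as follows. For brevity, SQLite (via Python's sqlite3 module) plays both roles here, purely so the number of round trips can be counted. The authors and posts tables and both function names are invented for the example. Against a real document store, each lookup inside the loop would be a separate network call.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL REFERENCES authors(id),
        title TEXT NOT NULL
    );
    INSERT INTO authors VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO posts VALUES
        (1, 1, 'On engines'), (2, 2, 'On compilers'), (3, 1, 'On notes');
""")


def titles_hand_rolled(conn):
    """Anti-pattern: fetch the posts, then look up each author one by one."""
    queries = 1
    posts = conn.execute(
        "SELECT author_id, title FROM posts ORDER BY id"
    ).fetchall()
    result = []
    for author_id, title in posts:
        (name,) = conn.execute(
            "SELECT name FROM authors WHERE id = ?", (author_id,)
        ).fetchone()
        queries += 1  # one extra round trip per post
        result.append((title, name))
    return result, queries


def titles_joined(conn):
    """One query; the engine decides how to match the rows."""
    rows = conn.execute(
        "SELECT p.title, a.name FROM posts p "
        "JOIN authors a ON a.id = p.author_id ORDER BY p.id"
    ).fetchall()
    return rows, 1
```

Both functions return the same rows, but the first issues one query per post plus one, so its cost grows with the data. With three posts that is four round trips against one.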

In NoSQL the queries come first, then the model. You decide exactly how you will read the data, then shape the storage — often denormalised, the same value duplicated in several places — so each read is a single lookup. The denormalisation is the deal: you trade write cost, storage, and the risk of inconsistent copies for fast, simple reads. If you are not willing to make that trade deliberately, you are not ready for the store.
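
Here is a minimal sketch of query-first modelling, with a plain dict standing in for the store. The two queries, the key formats, and the function names are assumptions chosen for illustration. Each known read becomes one lookup, and the price shows up in rename_author, which has to chase every duplicated copy.

```python
def save_post(store: dict, post_id: int, title: str,
              author_id: int, author_name: str) -> None:
    """Write one post in the shapes that the two known queries need."""
    # Query 1: "show a post with its author's name" is one lookup by post id.
    store[f"post:{post_id}"] = {
        "title": title,
        "author_id": author_id,
        "author_name": author_name,  # duplicated on purpose
    }
    # Query 2: "list an author's posts" is one lookup by author id.
    store.setdefault(f"author_posts:{author_id}", []).append(
        {"post_id": post_id, "title": title}
    )


def rename_author(store: dict, author_id: int, new_name: str) -> int:
    """The write cost: every copy of the name must be found and updated."""
    touched = 0
    for entry in store.get(f"author_posts:{author_id}", []):
        store[f"post:{entry['post_id']}"]["author_name"] = new_name
        touched += 1
    return touched


store = {}
save_post(store, 1, "On engines", 7, "Ada")
save_post(store, 2, "On notes", 7, "Ada")
print(store["post:2"]["author_name"])  # a single lookup, no join
```

If rename_author fails halfway through on a real store, some posts carry the old name and some the new one. That is the risk of inconsistent copies in practice.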

NoSQL vs a relational database

A relational database is the flexible generalist: rich queries, joins, transactions, and a schema the engine enforces. The price is that scaling writes horizontally is hard, and the design assumes one big node can hold the working set, which stays true for most products for a long time. A NoSQL store is the specialist: pick it and one access pattern gets blazing fast and scales sideways cleanly, at the price of joins, ad-hoc queries, and cross-record transactions.

Rule of thumb: default to relational, because it carries you further than the hype implies. Reach for a specific NoSQL store when you can name the pattern it serves better — "we read sessions by key millions of times a day and never query across them," "we ingest events and read them back by time range." A concrete pattern, not a vibe about future growth.

How to choose without regret

Start relational and prove the access pattern before you specialise. When you can point at a real workload that a particular store serves better, add that store for that workload — not for the whole system.

It is rarely all-or-nothing. The mature answer is usually polyglot: Postgres for the core data, Redis in front for hot key-value reads, maybe a document store for one genuinely document-shaped corner. Pick the store per workload, justify each by the pattern it serves, and you will rarely look back.
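
One common way to put a key-value store "in front" of the core database is the cache-aside pattern, sketched below. This is an illustration under loud assumptions: a dict stands in for Redis, SQLite stands in for Postgres, and the ProductReader class and products table are invented. A real deployment would also need expiry times and a considered invalidation strategy.

```python
import sqlite3


def open_core_db() -> sqlite3.Connection:
    """SQLite stand-in for the relational system of record."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    db.execute("INSERT INTO products VALUES (1, 'kettle')")
    db.commit()
    return db


class ProductReader:
    """Cache-aside reads: try the cache first, fall back to the database."""

    def __init__(self, db: sqlite3.Connection) -> None:
        self.db = db
        self.cache = {}  # dict stand-in for a key-value store such as Redis
        self.hits = 0
        self.misses = 0

    def name(self, product_id: int):
        key = f"product:{product_id}"
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        row = self.db.execute(
            "SELECT name FROM products WHERE id = ?", (product_id,)
        ).fetchone()
        if row is None:
            return None
        self.cache[key] = row[0]  # populate the cache for the next reader
        return row[0]

    def rename(self, product_id: int, new_name: str) -> None:
        # Write to the system of record, then drop the stale cache entry.
        with self.db:
            self.db.execute(
                "UPDATE products SET name = ? WHERE id = ?", (new_name, product_id)
            )
        self.cache.pop(f"product:{product_id}", None)
```

The relational database stays the source of truth, and the key-value layer only serves the one pattern it is good at: repeated reads by key.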

Quick reference

When it fits, when it doesn't

Reach for it when

  • Your access pattern is simple and known up front — look up by one key, read a whole document.
  • You need to scale writes horizontally beyond what one relational node handles comfortably.
  • The data is naturally a document or a wide, sparse table and you rarely join across it.
  • A few seconds of staleness is acceptable for the values in question.

Skip it when

  • Your data is relational and you run varied, ad-hoc queries — a SQL join will run circles around hand-rolled lookups.
  • You need multi-record transactions and strong consistency by default.
  • You are choosing NoSQL to avoid designing a schema; you will rebuild that schema in application code, badly.

Common mistakes

  • Modelling NoSQL like a relational database, then bolting joins back on in the application layer.
  • Assuming "NoSQL means web scale" — a well-tuned Postgres handles more than most products ever reach.
  • Designing tables before you know the queries; in NoSQL the query pattern must come first.