ELI5: What is database consistency?

From blobs, to lakes, to what a database itself is, there is a monumental amount of terminology around data, databases and how they can be used.

One such term is “database consistency”.

So, here’s an overview to answer the question “what is database consistency?”

What it is

Database consistency refers to a set of rules or requirements within database systems that outline how and when data is input and changed. It’s about ensuring only valid data gets written to the database — anything that invalidates the rules will not be added.

For this to work, ‘validation’ rules must be established. These rules outline values and formats that all data points within (and entering) the database must align to. The result? Even though your database is constantly changing, it changes consistently.

Validation rules

Validation rules can take a few forms. For instance:

  • Constraints – Rules outlining how data insertion, updates, etc., must be done. I.e.: data must do/look like this.
  • Cascades – If one data point changes or gets deleted, everything related to it is updated to reflect the change.
  • Triggers – Procedural code that automatically happens following a specified event within the database.

Types of database consistency

The type of database consistency in place varies based on the type of database.

Typically, relational databases support strong consistency, while it can be fuzzier with non-relational databases.

A SQL, or relational, database tends to follow the tenets of ‘ACID’ database consistency.

ACID: Atomicity, Consistency, Isolation, Durability

  • Atomicity:

If any part of the process fails, the database returns to its original state.

  • Consistency:

The database remains structurally sound, as all changes maintain data integrity or get cancelled. (All constraints, cascades, and triggers apply to all data.)

  • Isolation:

Each transaction (use of the database) is independent of other transactions.

  • Durability:

All successful operations/transactions are permanently preserved.

Meanwhile, non-relational, NoSQL databases are better suited to BASE database consistency.

BASE: Basic Availability, Soft state, Eventual consistency

  • Basic Availability

The database works most of the time and is available to use. Even if a part fails, the rest will still be usable until the issue is resolved.

  • Soft state

The state of the database system may change over time.

  • Eventual consistency

Updates will eventually ripple through all servers until the database is completely consistent again.

Why DB consistency is important

Database consistency is what keeps your databases working properly.

DB consistency means that you have better database health — less space is wasted on archaic or incorrect data. This in turn means faster retrieval of the right data, and no errors to inhibit your processes.

Whenever new information enters your database, it conforms to the right format. And in the case of relational databases, it gains the same links to potentially related data.  

Meanwhile, updates to existing information are displayed in a consistent manner, with the update affecting every/any related point. And, no matter where you access the database, the data will be consistent.

In short, database consistency is what keeps data and databases usable.

Database consistency

TL; DR: database consistency is all about making sure the integrity of your databases remains intact. In its simplest terms, it’s a set of rules that data must comply with in order to be accepted into the database — and that outline what happens when things go wrong.

Useful links

ELI5: What is a blob?

Using ThinkAutomation as an ETL tool

8 causes of poor data quality, and how automation can help