The data lifecycle: explained



Yes, data is an invaluable resource for modern businesses. But it’s also fiddly to collect, use and protect. And, like most things in life, its utility is subject to different phases and processes.

Just as humans are born, grow up, work, and retire, data too has a lifecycle of sorts. And knowing it can help you recognise opportunities for automation.

So, here are the six phases of the data lifecycle, explained.


Birth: capture

The data lifecycle begins with data capture. Capture (a.k.a. ingestion) is the process of collecting data from various sources. It includes both the creation of new data and the acquisition of data from external sources.

So, the first phase of the data lifecycle is where data comes into your organisation. And there are a lot of different places for this data to come from. For data capture, then, you need a tool that can pull in raw data from a wide range of sources. Automation software is one example.
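To make the idea of pulling raw data in from several sources concrete, here is a minimal Python sketch. The two sources (a web form export in CSV and an API response in JSON), the function names, and the sample values are all hypothetical and only there for illustration. Each captured record is tagged with where it came from and otherwise left untouched.

```python
import csv
import io
import json


def capture_csv(text, source):
    """Read CSV text and tag each row with the source it came from."""
    reader = csv.DictReader(io.StringIO(text))
    return [{"source": source, "payload": dict(row)} for row in reader]


def capture_json(text, source):
    """Read a JSON array of objects and tag each one with its source."""
    return [{"source": source, "payload": item} for item in json.loads(text)]


# Hypothetical inputs standing in for a web form export and an API response.
form_export = "name,email\nAda,ada@example.com\n"
api_response = '[{"full_name": "Grace", "mail": "grace@example.com"}]'

# Capture only collects the data; the field names still differ per source.
raw_records = (
    capture_csv(form_export, "web_form")
    + capture_json(api_response, "crm_api")
)
```

Note that the field names are still inconsistent at this point (name versus full_name). Capture only brings the data in; tidying it up belongs to the next phase.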


School: maintenance

The second phase of the data lifecycle is ‘maintenance’. This phase involves processing the data but not gaining any benefit or insight from it (yet).

Because your data will have come from a host of sources, it’ll be in a cacophony of different formats. The goal of this phase is to take this raw, disorganised data, and transform it into an understandable, consistent format.

This is the role played by Extract, Transform, Load (ETL) tools. Automation software is, again, an example. It can parse raw data and extract useful information. Then, it can transform that information into a usable format and load it into your database.
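As a rough illustration of the extract, transform, load steps, here is a small Python sketch. It assumes raw input arrives as loosely formatted "key=value; key=value" lines and loads the result into an in-memory SQLite table. The input format, the contacts table, and the function names are assumptions made for this example, not a description of how any particular ETL tool works.

```python
import sqlite3


def extract(raw):
    """Parse a raw 'key=value; key=value' line into a dict of fields."""
    fields = {}
    for part in raw.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            fields[key.strip().lower()] = value.strip()
    return fields


def transform(fields):
    """Normalise the fields into one consistent (name, email) shape."""
    name = fields.get("name", "").title()
    email = fields.get("email", "").lower()
    return (name, email)


def load(conn, rows):
    """Write the cleaned rows into a (hypothetical) contacts table."""
    conn.execute("CREATE TABLE IF NOT EXISTS contacts (name TEXT, email TEXT)")
    conn.executemany("INSERT INTO contacts (name, email) VALUES (?, ?)", rows)
    conn.commit()


# Hypothetical raw lines with inconsistent casing, spacing, and field order.
raw_lines = [
    "Name= ada lovelace ; EMAIL=Ada@Example.com",
    "email=GRACE@example.com; name=grace hopper",
]

conn = sqlite3.connect(":memory:")
load(conn, [transform(extract(line)) for line in raw_lines])
stored = conn.execute("SELECT name, email FROM contacts ORDER BY name").fetchall()
```

Both lines end up in the same shape, regardless of how they were written at the source, which is the point of the maintenance phase.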


Work: usage

Phase three of the data lifecycle is the point where you use the data to generate insights, benefits, and results. It is, in essence, the raison d’être of your data lifecycle.

Usage covers a host of tasks and applications. It varies widely based on your business, the type of data, and why you collected it to begin with.

For instance, you might use it to inform your customer service and provide personalised experiences. Or, you might use it to inspire your future features. You could use it to inform marketing campaigns or create effective automated emails. You could even use it to train a machine learning algorithm.


Hobbies: publication

The fourth phase of the data lifecycle is a conditional one. That is, it doesn’t apply to all data. It’s known as data publication.

Publication doesn’t necessarily mean that you’re making the data accessible to the wider public. Rather, it simply refers to any time it leaves your organisation as part of its use. For example, in a statement for a client, or as part of a study you’re using it for.


Retirement: archival

Archival is the penultimate phase of the data lifecycle, marking the beginning of the end, as it were. Archival is the process of storing data that you no longer actively use.

Data is in phase five when it’s no longer in any other phase of the data lifecycle. It means that the data in question is effectively retired. It’s not subject to maintenance, usage, or publication. The only reason to store it is in case you need it again.

The only concern for archived data is keeping it secure.


Death: purging

Marking the very end of the data lifecycle is data purging. The final phase is where every copy of the data gets deleted (‘purged’) from your organisation. By this point, the data has reached the end of its potential usefulness and is only a liability.

In most cases, the data that gets purged comes from your archives and is simply archaic data. In some cases, data might skip archival and go straight to purge. This is usually due to data protection measures.

You can use automation to help with this last phase of the data lifecycle, too. You can define triggers that tell your automation software when to delete unwanted data. (For instance, once data has been held for a specific length of time.)
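A time-based trigger of this kind could look something like the Python sketch below. The one-year retention period, the archived_at field, and the purge_expired function are assumptions chosen for illustration. The sketch only decides which records are due for purging; a real purge would also have to delete every copy from wherever the data is stored.

```python
from datetime import datetime, timedelta, timezone

# Assumed retention period; the right value depends on your own policies.
RETENTION = timedelta(days=365)


def purge_expired(records, now, retention=RETENTION):
    """Split records into (kept, purged) based on how long ago they were archived."""
    cutoff = now - retention
    kept = [r for r in records if r["archived_at"] >= cutoff]
    purged = [r for r in records if r["archived_at"] < cutoff]
    return kept, purged


# Hypothetical archive with one old record and one recent record.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
archive = [
    {"id": 1, "archived_at": datetime(2022, 1, 15, tzinfo=timezone.utc)},
    {"id": 2, "archived_at": datetime(2024, 3, 1, tzinfo=timezone.utc)},
]

kept, purged = purge_expired(archive, now)
```

Passing the current time in as an argument, rather than reading the clock inside the function, keeps the rule easy to check against known dates.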


The data lifecycle

We’re in an age of enhanced awareness of our data. Alongside the benefits of collecting it come the risks of failing to protect it. By knowing the data lifecycle, you know the different phases in which you need to protect it.

Data enters and leaves businesses in a near-constant stream. But every piece of information you collect will likely go through most of the phases of the data lifecycle.

