As I’ve noted elsewhere in this blog, we are data governance experts masquerading as IT garbagemen. We have spent far too much time and money cleaning up data from legacy applications rather than investing that effort in delivering value.

Bottom line – failing to migrate all data subject to records retention to the successor application is bad data governance. It creates technical debt, costs money, and requires time and effort to clean up.

With that said, as part of planning for the application’s retirement, a data governance plan should be created (if one doesn’t already exist). It will identify the data elements that are subject to records retention and define how they are going to be retained. The options are basically:

  1. Migrate all data to the successor application
  2. Migrate some data to the successor application
  3. Re-enter into the successor application
  4. Start with no transactional data in the successor application

One of These Things is not Like the Others …

Or, put another way, option 2 is poor data governance. In this scenario, data migration procedures are developed to move only some of the data into the successor application, and the data that isn’t migrated is typically the older transactions. The usual reasons given for not migrating the entire data set are:

  • It would cost too much to migrate all of the data.
  • There isn’t enough time during the cutover window to move all of the data.
  • The storage of the old data in the new environment would cost too much.
  • There will be performance issues in the new application due to the volume of data.

Let’s break down these scenarios.

Costs Too Much to Migrate All of the Data

In the first scenario, all data elements are mapped and migrated to the successor application, but some data elements are only partially migrated, normally limited by some time consideration. For example, only batch records less than three years old are migrated, or only hiring actions from the prior five years are brought into the successor application.

Normally this scenario doesn’t hold up under scrutiny. All of the necessary mappings have been completed and tested, and limiting the data being migrated actually adds complexity and requires additional testing.
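To illustrate the point, here is a minimal sketch (hypothetical code, not any particular migration tool): a full extract is a straight copy, while a partial extract introduces a date predicate, and with it a boundary condition that has to be decided and tested.

```python
from datetime import date

def extract_batches(records, cutoff=None):
    """Extract batch records for migration.

    With no cutoff this is a straight copy. Adding a cutoff adds
    filtering logic and an inclusive/exclusive boundary decision,
    both of which need additional test cases.
    """
    if cutoff is None:
        # Full migration: no filtering logic at all.
        return list(records)
    # Partial migration: the >= boundary is a choice that must be tested.
    return [r for r in records if r["created"] >= cutoff]

records = [
    {"id": 1, "created": date(2019, 6, 1)},
    {"id": 2, "created": date(2022, 6, 1)},
]
full = extract_batches(records)
partial = extract_batches(records, cutoff=date(2021, 1, 1))
```

The full extract keeps both records; the partial one silently drops the 2019 record, which is exactly the data-governance gap being discussed.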

Limited Time to Migrate Data in Cutover Window

This can be a real challenge for applications that have large data sets and/or very limited windows for downtime (e.g. manufacturing applications). Creative thinking and planning can mitigate the impact, but it can be a valid reason for not migrating all of the data.
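One common mitigation, sketched below in hypothetical Python (the function and field names are illustrative), is to split the migration in two: bulk-load the historical data ahead of the cutover while the legacy system is still live, then move only the delta (records changed since the bulk load) during the short cutover window itself.

```python
from datetime import datetime

def migrate(source, target, changed_since=None):
    """Copy rows from source to target; if changed_since is given,
    copy only rows modified after that timestamp (the 'delta')."""
    for row in source:
        if changed_since is None or row["modified"] > changed_since:
            target[row["id"]] = row

# Phase 1: bulk-load all history before the cutover window.
source = [
    {"id": 1, "modified": datetime(2020, 1, 1)},
    {"id": 2, "modified": datetime(2023, 1, 1)},
]
target = {}
bulk_load_time = datetime(2023, 6, 1)
migrate(source, target)

# The legacy system keeps running, so new activity accumulates.
source.append({"id": 3, "modified": datetime(2023, 7, 1)})

# Phase 2: during the cutover window, move only the delta.
migrate(source, target, changed_since=bulk_load_time)
```

The expensive bulk load happens outside the window; only the small delta has to fit inside it.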

Cost of Data Storage in Successor Application

Data storage costs have dropped enormously over time, but applications deployed on newer technologies (in-memory, SaaS) can drive the unit cost back up. When the legacy application has large data volumes, this can be a significant consideration.

SaaS applications may also not be designed for very large data sets, which can make a full migration prohibitive.

Performance of the Successor Application

In most cases the successor application has a more modern architecture than the legacy application, but there can still be concerns about performance if the legacy application has a huge data set. This can be a particular issue when a huge data set is moving to a SaaS solution that wasn’t designed for that volume of data.

Conclusion

Even when there are valid reasons why the entire data set cannot be migrated to the successor application, the impact can still be lessened by good planning. If timing is the issue, can the migration be performed in two batches? If performance is a concern, is it really that impactful, or can it be accepted for a period? If SaaS storage cost is a concern, is it really more expensive than the technical debt?

Regardless of the reason, leaving data behind in the legacy application is a failure of data governance.