“Data migration is one of the most underestimated parts of data-related projects and often causes budget overruns and delays in the go-live of a new solution. Therefore, data migration (or, as we call it, data onboarding) should be an integrated part of the project and be organized as a separate workstream with its own activities, resources and governance. This will mitigate the risk of delays and budget challenges.” – Jos Schreurs, partner and co-founder of Squadra, is very clear on the importance of data migration in any data-related program.
What is data migration and why is it so important?
Data migration is the process of moving data between systems, which makes it possible to change applications and/or databases. Though it might look straightforward, a migration often has to deal with changes in storage, database structures and processes. A data migration involves at least the transform and load phases of the extract/transform/load (ETL) process. Essentially, this means that extracted data has to be prepared before it is loaded into the target location.
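The transform and load phases can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the source field names (`ART_NO`, `DESCR`, `PRICE`), the target `product` table, and the use of an in-memory SQLite database as a stand-in for the target system.

```python
import sqlite3

# Hypothetical rows as they might come out of a legacy source system.
extracted_rows = [
    {"ART_NO": " 1001 ", "DESCR": "cucumber", "PRICE": "0,89"},
    {"ART_NO": "1002", "DESCR": "Tomato ", "PRICE": "1,25"},
]


def transform(row):
    """Map a source row onto the (assumed) target structure."""
    return {
        "sku": row["ART_NO"].strip(),
        "name": row["DESCR"].strip().capitalize(),
        # The source uses a decimal comma; the target stores whole cents.
        "price_cents": round(float(row["PRICE"].replace(",", ".")) * 100),
    }


def load(rows, connection):
    """Load transformed rows into the target table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS product "
        "(sku TEXT PRIMARY KEY, name TEXT, price_cents INTEGER)"
    )
    connection.executemany(
        "INSERT INTO product (sku, name, price_cents) "
        "VALUES (:sku, :name, :price_cents)",
        rows,
    )
    connection.commit()


target = sqlite3.connect(":memory:")
load([transform(row) for row in extracted_rows], target)
migrated = target.execute(
    "SELECT sku, name, price_cents FROM product ORDER BY sku"
).fetchall()
print(migrated)
```

In a real project the transformation rules would come from the agreed data model rather than being hard-coded like this, but the order of the steps stays the same: prepare first, then load.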
Data migration is frequently needed because systems run into limitations and have to be replaced. With the increased focus on data as part of an overall digitization or data-driven strategy, data migration and its challenges have become more important than ever. The implementation of a PIM (Product Information Management) or MDM (Master Data Management) solution is a key element in many of these programs. Such PIM and MDM projects often involve implementing a new PIM/MDM system to manage data as part of a digital data foundation. As a result, data that is stored in other sources, such as ERP systems (often several of them), legacy systems, spreadsheets and shared drives, needs to be onboarded into the new solution. This is the point where the data is extracted from the different source systems and subsequently has to be merged again in the target location.
In a way, data migration also helps to increase the quality of the data. A sound data model for the new solution is required to properly migrate the often unstructured or loosely structured data into the new system. Creating such a data model leads to a more insightful transfer and transformation of data. The model does not by itself improve the quality of the data, but it does enforce a standard that the data has to meet before it is loaded into the target system.
How is data migration executed?
There are two common data migration strategies. In a “big bang migration”, everything is moved in a short time period, which makes it a fairly radical approach. In an “agile migration”, the process is divided into steps that are executed over a longer period of time. Onno, MDM associate at Squadra, uses an example to illustrate why the step-by-step approach is used most often. When a supermarket wants to migrate its entire product range, doing so in a few days would be an immense task. Instead, it could start with, for example, the vegetables assortment and load it into a test environment first. There you may identify errors that have to be corrected before the cucumbers are transferred to the new environment. In the end, the choice of migration strategy strongly depends on the organization, its product range and its demands.
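The supermarket example can be sketched as a migration that runs one category at a time and holds back records that fail a check. This is only an illustration under stated assumptions: the record fields, the two validation rules and the function names `validate` and `migrate_in_batches` are hypothetical and not taken from any specific tool.

```python
from collections import defaultdict

# Hypothetical product records; "category" decides which batch they join.
products = [
    {"sku": "1001", "name": "Cucumber", "category": "vegetables"},
    {"sku": "1002", "name": "", "category": "vegetables"},
    {"sku": "2001", "name": "Milk", "category": "dairy"},
]


def validate(product):
    """Return a list of problems; an empty list means the record is ready."""
    problems = []
    if not product["sku"]:
        problems.append("missing sku")
    if not product["name"]:
        problems.append("missing name")
    return problems


def migrate_in_batches(records, batch_order):
    """Process one category at a time, holding back records that fail validation."""
    batches = defaultdict(list)
    for record in records:
        batches[record["category"]].append(record)

    migrated, rejected = [], []
    for category in batch_order:
        for record in batches.get(category, []):
            problems = validate(record)
            if problems:
                rejected.append((record["sku"], problems))
            else:
                migrated.append(record["sku"])
    return migrated, rejected


migrated, rejected = migrate_in_batches(products, ["vegetables", "dairy"])
print(migrated)
print(rejected)
```

The rejected list is what you would correct in the test environment before running the same batch again. A big bang migration would process every category in a single run instead.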
To migrate data, you could decide to use tools to assist you. Marc, also MDM associate at Squadra, states that 90% of the work is often still done in Excel. However, he adds that the right data tool strongly depends on the type of company.
What are the challenges of data migration and how can they be overcome?
Data migration is a complex and rather technical process that comes with a number of challenges. The most frequently occurring challenge is inconsistent data in the source system. Because data formats vary so widely, companies often deliver data that does not fit the target location. These errors are often caused by human mistakes, which lead to wrong and inconsistent data and, in turn, to problems in processing that data. Marc uses an example to illustrate inconsistent data quality. In a warehouse, products come in boxes of many different sizes. Employees can measure the same box in different ways, depending on how they place it on the measuring device (on its side, for example). The width of the box then becomes the depth, or vice versa.
This results in data that is not clearly specified and needs to be transformed. At this point, data engineers are of great importance, because they can transform this data so that it lands correctly at the target location. Improving the data until it can be transferred to the target location is an iterative process.
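One possible transformation for the box example is to store dimensions in a fixed order, regardless of how the box was placed on the measuring device. The sketch below assumes a convention of sorting the three measurements from largest to smallest. That convention, the field names and the function `normalize_dimensions` are assumptions for illustration, and a real target model may define width, depth and height differently.

```python
def normalize_dimensions(width, depth, height):
    """Return the three measurements sorted from largest to smallest."""
    longest, middle, shortest = sorted((width, depth, height), reverse=True)
    return {"length_cm": longest, "width_cm": middle, "height_cm": shortest}


# The same box measured twice, the second time rotated on the device
# so that width and depth are swapped.
first = normalize_dimensions(width=30, depth=20, height=10)
second = normalize_dimensions(width=20, depth=30, height=10)
print(first == second)
```

With a rule like this, both measurements end up as the same record, so the box is no longer treated as two different sizes. The drawback is that the original orientation is lost, which matters when the orientation itself carries meaning for the business.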
Data migration comes with another challenge: many companies do not understand their data. Jos argues that many companies still underestimate the importance of high-quality data. “Companies assume that data migration is a simple process. It is also assumed that IT or data engineers will know how to properly use and transform the data, but this is not always realistic, as they often do not speak the language of the business and only look at the data and not at the context (e.g., processes) in which the data is used.” Squadra associates are able to understand the business-specific knowledge and translate it into IT knowledge. This is where Squadra makes the data more accessible, by translating business knowledge into technical information (or vice versa). In short, Squadra forms a bridge between business knowledge and successful data transformation.
Squadra therefore advises not only to manage data migration as a separate workstream, but also to run a “data readiness assessment” at the very start of any data-related project, and especially of PIM or MDM projects.