Data Transformation – Limit the Scope
Keep things simple – the amount of data transformation and generation should be minimal. Conceptually, data migration should be quite simple: you take data from one place, apply some transformation to it, and put it in another place.
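The extract–transform–load idea above can be sketched in a few lines. This is a minimal illustration, not a real migration framework – every name here (the field names, `extract`, `transform`, `load`, `migrate`) is hypothetical, and the "systems" are just in-memory lists:

```python
# A minimal sketch of extract -> transform -> load.
# All names and data are hypothetical, for illustration only.

def extract():
    """Pretend source: rows pulled from the legacy system."""
    return [
        {"cust_no": "001", "name": "alice smith"},
        {"cust_no": "002", "name": "bob jones"},
    ]

def transform(row):
    """Keep the transformation step deliberately small:
    rename one field and normalise capitalisation."""
    return {"customer_id": row["cust_no"], "name": row["name"].title()}

def load(rows, target):
    """Pretend target: append to an in-memory list."""
    target.extend(rows)

def migrate():
    target = []
    load([transform(r) for r in extract()], target)
    return target
```

The point of keeping `transform` this thin is that every extra rule added to it is another place for the migration to go wrong.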
Proof of Concept
Often, complexity arises from trying to do too much at once. For example, a cardinal sin is to work with full sets of data from the beginning. At this stage, don’t worry about the quality of the data being sent by the business.
Get the internal process in order before attacking issues on the extract side. Use small subsets of data to test. If the framework works on 100 customer accounts, it should also work on 1 million.
There are issues, such as performance, that only come into play when there are large quantities of data. If, however, you have a proof of concept on a small set, then all you need to do to make your framework scalable is fine-tune it. The measurable progress recorded in the small test runs gives stakeholders the much-needed visibility and confidence that everything is on the right track.
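One way to put the small-subset approach into practice is a timed dry run on a reproducible sample. The sketch below is a hedged illustration, assuming a Python pipeline – `run_migration` is a hypothetical stand-in for the real transform-and-load step, and the sample size and seed are arbitrary choices:

```python
import random
import time

def run_migration(rows):
    # Hypothetical placeholder for the real transform-and-load step.
    return [dict(r, migrated=True) for r in rows]

def dry_run(all_rows, sample_size=100, seed=42):
    """Migrate a small, reproducible subset and time it, so the
    full-volume runtime can be roughly estimated before scaling up."""
    rng = random.Random(seed)  # fixed seed so reruns use the same subset
    subset = rng.sample(all_rows, min(sample_size, len(all_rows)))
    start = time.perf_counter()
    migrated = run_migration(subset)
    elapsed = time.perf_counter() - start
    per_row = elapsed / len(subset)  # naive per-row cost estimate
    return migrated, per_row

rows = [{"id": i} for i in range(1_000)]
migrated, per_row = dry_run(rows)
```

A per-row figure from a run like this is only a first-order estimate – it won't capture effects that appear at scale, such as index contention or memory pressure – but it gives stakeholders a concrete, repeatable number early on.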
Scaling up without a proof of concept
When you write code to transform data, or even to generate additional data to fit the new system's requirements, you are adding an extra place to make mistakes and increasing the runtime of your process.
A few years ago I worked on a project where the business was sold the idea that they could provide a minimal amount of base data and the functionality within the new system could be used to generate the rest – including around 15 years of historical data. Nobody ever tested whether the functionality could process such volumes in one go. Even after months of performance upgrades, the calculated time for a single migration run was three months. Needless to say, we never finished a migration run and the project never saw the light of day.