Data migration is the process of selecting, preparing, extracting, and transforming data and permanently transferring it from one computer storage system to another. Validating the migrated data for completeness and decommissioning the legacy data storage are also considered part of the overall process. Data migration is a key consideration for any system implementation, upgrade, or consolidation, and it is typically automated as far as possible to free human resources from tedious tasks. It occurs for a variety of reasons, including server or storage equipment replacement, maintenance or upgrades, application migration, website consolidation, disaster recovery, and data center relocation. Proper planning is therefore critical for an effective data migration. While the specifics of a data migration plan may vary, sometimes significantly, from project to project, IBM suggests there are three main phases to almost any data migration project: planning, migration, and post-migration.
Categories
Data is stored on various media in files or databases, and is generated and consumed by software applications, which in turn support business processes. The need to transfer and convert data can be driven by multiple business requirements, and the approach taken to the migration depends on those requirements. Four major migration categories are proposed on this basis.
Storage migration
A business may choose to rationalize the physical media to take advantage of more efficient storage technologies.
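As an illustration of moving files between storage locations with a completeness check, the following is a minimal sketch and not a prescribed method. It assumes both the old and new storage are reachable as ordinary directory trees. The function names `sha256_of` and `migrate_tree` are hypothetical and chosen for this example.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate_tree(source: Path, target: Path) -> list[str]:
    """Copy every file under `source` into `target`.

    Returns the relative paths whose copied content does not match the
    original, so an empty list means every file was verified.
    Assumption: `target` is not located inside `source`.
    """
    mismatches = []
    for src_file in sorted(p for p in source.rglob("*") if p.is_file()):
        relative = src_file.relative_to(source)
        dst_file = target / relative
        dst_file.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, dst_file)  # copies content and file metadata
        if sha256_of(src_file) != sha256_of(dst_file):
            mismatches.append(relative.as_posix())
    return mismatches
```

Under these assumptions, the legacy storage would be decommissioned only after `migrate_tree` returns an empty list.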
Database migration
Similarly, it may be necessary to move from one database vendor to another, or to upgrade the database software being used. An upgrade is less likely to require a physical data migration, but one can be needed for major upgrades, because the underlying data format can change significantly and a physical transformation process is then required. This may or may not affect behavior in the applications layer, depending largely on whether the data manipulation language or protocol has changed. However, some modern applications are written to be almost entirely agnostic to the database technology, so a change from Sybase, MySQL, IBM Db2 or SQL Server to Oracle should only require a testing cycle to be confident that neither functional nor non-functional performance has been adversely affected.
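A database migration with a simple completeness check might be sketched as follows. This is an illustrative example only: it uses Python's built-in `sqlite3` module for both source and target so that it is self-contained, whereas a real cross-vendor migration would use each vendor's own driver. The table names (`legacy_customer`, `customer`), the column names, and both function names are assumptions made for this example.

```python
import sqlite3


def migrate_customers(source: sqlite3.Connection, target: sqlite3.Connection) -> int:
    """Copy rows from the hypothetical legacy table into a new schema.

    Returns the number of rows written to the target.
    """
    target.execute(
        "CREATE TABLE IF NOT EXISTS customer ("
        "id INTEGER PRIMARY KEY, full_name TEXT NOT NULL, email TEXT)"
    )
    rows = source.execute(
        "SELECT cust_id, first, last, email FROM legacy_customer ORDER BY cust_id"
    )
    # Transform: merge the name columns and normalize e-mail case.
    converted = [
        (cust_id, f"{first or ''} {last or ''}".strip(), email.lower() if email else None)
        for cust_id, first, last, email in rows
    ]
    with target:  # commits on success, rolls back if an insert fails
        target.executemany(
            "INSERT INTO customer (id, full_name, email) VALUES (?, ?, ?)", converted
        )
    return len(converted)


def validate_counts(source: sqlite3.Connection, target: sqlite3.Connection) -> bool:
    """Completeness check: both sides must hold the same number of rows."""
    src = source.execute("SELECT COUNT(*) FROM legacy_customer").fetchone()[0]
    dst = target.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
    return src == dst
```

A row-count comparison is the weakest useful validation. Comparing per-row checksums or sampled values would catch transformation errors that leave the counts unchanged.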
Application migration
Changing application vendor—for instance a new CRM or ERP platform—will inevitably involve substantial transformation as almost every application or suite operates on its own specific data model and also interacts with other applications and systems within the enterprise application integration environment. Furthermore, to allow the application to be sold to the widest possible market, commercial off-the-shelf packages are generally configured for each customer using metadata. Application programming interfaces (APIs) may be supplied by vendors to protect the integrity of the data they must handle. When no API is available, controlled browser tools such as Selenium or Playwright can be used to programmatically drive a web browser, extracting data from one web application and inserting it into another.
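The transformation between two application data models can be illustrated with a declarative field mapping. The sketch below is hypothetical throughout: the field names, the converters, and the `FIELD_MAP` and `transform` identifiers are invented for illustration and do not correspond to any particular CRM or ERP product.

```python
from typing import Any, Callable

# Hypothetical mapping: target field -> (source field, converter function).
FIELD_MAP: dict[str, tuple[str, Callable[[Any], Any]]] = {
    "account_name": ("CompanyName", str.strip),
    "annual_revenue": ("Revenue", lambda v: round(float(v), 2)),
    "is_active": ("Status", lambda v: v == "A"),
}


def transform(record: dict[str, Any]) -> tuple[dict[str, Any], list[str]]:
    """Map one source record onto the target model.

    Returns the converted record plus the names of source fields that
    have no mapping, so that dropped data is reported rather than lost
    silently.
    """
    target = {}
    for target_field, (source_field, convert) in FIELD_MAP.items():
        if source_field in record:
            target[target_field] = convert(record[source_field])
    mapped_sources = {source for source, _ in FIELD_MAP.values()}
    unmapped = sorted(set(record) - mapped_sources)
    return target, unmapped
```

For example, `transform({"CompanyName": "  Acme Ltd ", "Revenue": "1200.5", "Status": "A", "Fax": "555"})` yields the converted record together with `["Fax"]` as the list of unmapped fields. Keeping the mapping in data rather than in code makes it easier to review with business stakeholders.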
Business process migration
Business processes operate through a combination of human and application systems actions, often orchestrated by business process management tools. When these change they can require the movement of data from one store, database or application to another to reflect the changes to the organization and information about customers, products and operations. Examples of such migration drivers are mergers and acquisitions, business optimization, and reorganization to attack new markets or respond to competitive threat.
The first two categories of migration are usually routine operational activities that the IT department takes care of without the involvement of the rest of the business. The last two categories directly affect the operational users of processes and applications and are necessarily complex, and delivering them without significant business downtime can be challenging. A highly adaptive approach, concurrent synchronization, a business-oriented audit capability, and clear visibility of the migration for stakeholders (through a project management office or data governance team) are likely to be key requirements in such migrations. Outside of information systems, reproducing brittle newspapers onto microfilm is an example of migrating content from one medium to another.
Disadvantages
- Migration addresses the possible obsolescence of the data carrier, but it does not address the risk that technologies that use the data may be abandoned altogether, which would render the migration useless.
- Time-consuming – migration is a continual process, which must be repeated every time a medium reaches obsolescence, for all data objects stored on that medium.
- Costly – an institution must purchase additional data storage media at each migration.
Data portability
See also
- Data conversion
- Data curation
- Data preservation
- Data transformation
- Digital preservation
- Extract, transform, load (ETL)
- System migration
