Congratulations! You have convinced your boss or the CIO to move your database to the cloud. Or
you are the boss, CIO, or both, and you finally decided to jump on the bandwagon. What you’re
trying to do is move your application to the new environment, platform, or technology (aka
application modernization), because usually people don’t move databases for fun.
Database migration is a complex, multiphase process, which usually includes assessment, database
schema conversion (if you are changing engines), script conversion, data migration, functional
testing, performance tuning, and many other steps.
Tools like the AWS Database Migration Service (AWS DMS) and AWS Schema Conversion Tool (AWS
SCT), native engine tools, and others can help to automate some phases of the database migration
process. The key to success, though, is to make your database migration project predictable. You
don’t want any surprises when you are 10 terabytes deep in the data migration phase.
So you are eager to start. But do you have all the information required to finish the project
successfully?
This blog post discusses common problems that can occur when you move your database to the
cloud. The last section covers details that are specific to migrating to AWS.
https://aws.amazon.com/blogs/database/database-migration-what-do-you-need-to-know-before-you-start/ 1/8
2.2.2019 Database Migration—What Do You Need to Know Before You Start? | AWS Database Blog
a part of a bigger application modernization project, which means the database migration (even for
homogeneous migrations) will involve some changes to application code. A common scenario we’ve
all heard of is a 15-year-old, 300,000-line COBOL application, developed by dozens (or even
hundreds) of developers, none of whom still works at the company. But even if you are lucky
and you have code written in a modern language by three developers still proudly working for you,
somebody will need to assess what changes to the application code the database migration will
require. Finally, depending on the cloud provider, you will need somebody on your team who is
familiar with the particular provider ecosystem you have. To make a long story short, you will need
your best people on this project.
Again, we can go to the extreme and think about a 15-year-old database that can still support the
first version of your product, despite the fact that support for this version was discontinued years
ago. We can think about the database that has thousands of tables and dozens of schemas and still
has these stored procedures which you wrote when you joined the company more than a decade
ago…
But let’s say you are lucky and it’s not that bad. Can you answer one simple question: What is the size
of the database you are trying to migrate? You might be surprised—most customers can’t answer
this question. The size of the database is important when determining the correct approach to use
for your migration project. It also helps to estimate how long the data copy will take. What is even
more important is how many schemas and tables you are going to migrate. Knowing the layout of
your database can help you define a migration project and speed up your data copy phase
significantly. For extra credit, you can try to find out which tables participate in which transactions. If
you know this, you can execute the migration of multiple disjointed table sets in parallel.
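As a rough illustration of the "extra credit" step, the sketch below groups tables into disjoint sets with a small union-find, so that tables touched by the same transaction stay together. The function name and the input shape (a list of table names plus a list of per-transaction table lists) are assumptions made for this example; how you collect that information depends on your engine and your application.

```python
from collections import defaultdict


def disjoint_table_sets(tables, transactions):
    """Group tables so that tables sharing a transaction end up in one set.

    `tables` is an iterable of table names; `transactions` is an iterable of
    table-name collections, one per transaction type (hypothetical input).
    """
    parent = {t: t for t in tables}

    def find(t):
        # Register unknown tables on first sight, then walk up to the root.
        parent.setdefault(t, t)
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path halving
            t = parent[t]
        return t

    for txn_tables in transactions:
        txn_tables = list(txn_tables)
        for t in txn_tables:
            find(t)
        # Merge every table in the transaction into the first table's set.
        for other in txn_tables[1:]:
            root_a, root_b = find(txn_tables[0]), find(other)
            if root_a != root_b:
                parent[root_b] = root_a

    groups = defaultdict(list)
    for t in list(parent):
        groups[find(t)].append(t)
    return sorted(sorted(g) for g in groups.values())


if __name__ == "__main__":
    for group in disjoint_table_sets(
        ["orders", "order_items", "customers", "audit_log", "sessions"],
        [["orders", "order_items"], ["orders", "customers"], ["sessions"]],
    ):
        print(group)
```

Each resulting group is a candidate for its own parallel migration task, because no transaction spans two groups.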
Another important question to ask is: How many very large tables exist in your database? “Very
large” is subjective, but for migration purposes, tables larger than 200 gigabytes and with hundreds
of millions of rows might end up being the long tail for your data migration. If the tables are
partitioned, some native and AWS tools, such as AWS DMS, know how to use this and load data in
parallel.
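One way to make the "very large tables" question concrete is to flag candidates from whatever size statistics you have already collected. The sketch below is only an illustration: the `TableStats` record is hypothetical, and the default thresholds (200 GiB and 100 million rows) are assumptions derived from the rule of thumb above, not fixed limits.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass
class TableStats:
    """Hypothetical per-table statistics gathered during assessment."""
    name: str
    size_bytes: int
    row_count: int
    partitioned: bool = False


def long_tail_candidates(stats, min_bytes=200 * GIB, min_rows=100_000_000):
    """Return tables that exceed both thresholds, largest first."""
    big = [t for t in stats if t.size_bytes > min_bytes and t.row_count >= min_rows]
    return sorted(big, key=lambda t: t.size_bytes, reverse=True)


if __name__ == "__main__":
    sample = [
        TableStats("events", 900 * GIB, 2_000_000_000, partitioned=True),
        TableStats("users", 5 * GIB, 3_000_000),
        TableStats("orders", 250 * GIB, 400_000_000),
    ]
    for table in long_tail_candidates(sample):
        note = "partitioned, may load in parallel" if table.partitioned else "not partitioned"
        print(f"{table.name}: {table.size_bytes / GIB:.0f} GiB ({note})")
```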
Okay, suppose you’ve done some initial digging and now you have data about the size, schema, and
tables of the database to migrate. The next question you should ask is: Do we have some “exotic”
engine-specific data types in our schemas? Having those usually presents a challenge, especially if
you want to switch database engines (such as moving from Oracle to PostgreSQL). Even if you are
planning a homogeneous migration, many tools won’t support all the data types involved. As usual,
checking the documentation is a good start. AWS DMS has a feature called premigration validation
that verifies types involved in the migration and warns you about the issues you might have with
your data types. Later, when you migrate your data, verify that all your data looks correct on the
other end before you shut down the power at your data center. Again, many tools such as AWS DMS
Data Validation can help you compare entire data sets or samples of them.
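To give a feel for what comparing data on both ends involves, here is a minimal, tool-agnostic sketch that computes an order-independent fingerprint of a table's rows. It is not how AWS DMS Data Validation works internally; the function name is hypothetical, and it assumes both drivers return values whose Python `repr` is identical for identical data, which often needs normalization in practice (for example, decimals versus floats).

```python
import hashlib

MODULUS = 2 ** 256


def table_fingerprint(rows):
    """Return (row_count, checksum) for an iterable of row tuples.

    Per-row SHA-256 digests are summed modulo 2**256, so the result does
    not depend on the order in which rows are fetched.
    """
    count = 0
    checksum = 0
    for row in rows:
        digest = hashlib.sha256(repr(tuple(row)).encode("utf-8")).digest()
        checksum = (checksum + int.from_bytes(digest, "big")) % MODULUS
        count += 1
    return count, checksum


if __name__ == "__main__":
    source_rows = [(1, "alice"), (2, "bob"), (3, None)]
    target_rows = [(3, None), (1, "alice"), (2, "bob")]  # same data, other order
    if table_fingerprint(source_rows) == table_fingerprint(target_rows):
        print("fingerprints match")
    else:
        print("MISMATCH: investigate before cutover")
```

A matching fingerprint is strong evidence rather than proof, and on very large tables you would likely run it per partition or on samples.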
Now let’s talk about LOBs – Large Objects. If you don’t have them in your database, lucky you, but
keep reading to understand how lucky you are. The bad news is migrating LOBs can be slow.
Migration time can be longer, and more memory will be required on your replication server. It can be
very painful to move even small numbers of them. Another challenge you might face is that you have
LOBs in a table without primary keys or any other unique constraints (perhaps your DBA read some
cutting-edge blog 10 years ago that claimed this was a good idea). These tables will cause you some
pain if you want changes to them to be part of your change data capture (CDC) stream.
Before we discuss how not to overload your source database, it’s important to know that working
with database roles, users, and permissions can be a huge time-waster during a migration. Most
migration tools require some elevated access to the source database catalog to retrieve metadata,
and to read logs and replicate changes. Knowing your database roles and permission model, or at
least having the name of the person who knows it and can grant access, can save you a lot of time.
Finally, even if you get all the types, schemas, and LOBs right, it’s possible the migration will be slow.
In some cases, you won’t be able to afford this slowness and it will cause your migration project to
fail. A few factors can affect how long the wait will be for the last row to show up on the other end.
Some of them are easy to tune and take into consideration; others can be harder.
The first is how hot or busy your source database is. This is important to understand because most
migration tools will put some additional load on your source database. If your traffic is very high, it
might be unrealistic to plan a live migration and you should consider alternatives. Usually,
compacted databases migrate faster and with fewer issues, so you should find out when your
database was last compacted. If the answer is never, think about doing it.
Phew—there are a lot of questions to answer, and a lot of people to work with to get the answers.
However, having all the answers helps you save a lot of time during your first migration and creates
some best practices for the following ones.
Finally, there is this question: Does the connection between your source and target database have
enough bandwidth for the migration? Don’t forget, your organization can have an awesome 10 Gbps
direct pipe to your cloud provider, but that doesn’t mean it’s all yours. Make sure that when you start
to move data you know exactly how much of this pipe the migration is going to use, and that you are
not taking down mission-critical workloads by kicking the tires migrating your test database.
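A back-of-the-envelope estimate helps here. The sketch below computes how long a bulk copy might take when the migration may use only part of a shared link. The function name, the 25 percent share, and the 80 percent protocol efficiency are illustrative assumptions, not measured values; substitute figures from your own network team.

```python
def estimated_copy_hours(data_bytes, link_bits_per_second,
                         migration_share=0.25, efficiency=0.8):
    """Estimate bulk-copy duration in hours.

    migration_share: fraction of the link the migration is allowed to use.
    efficiency: fraction of that share left after protocol overhead (assumed).
    """
    if not 0 < migration_share <= 1 or not 0 < efficiency <= 1:
        raise ValueError("migration_share and efficiency must be in (0, 1]")
    usable_bps = link_bits_per_second * migration_share * efficiency
    seconds = data_bytes * 8 / usable_bps
    return seconds / 3600


if __name__ == "__main__":
    ten_terabytes = 10 * 10 ** 12   # bytes
    ten_gbps = 10 * 10 ** 9         # bits per second
    hours = estimated_copy_hours(ten_terabytes, ten_gbps)
    print(f"about {hours:.1f} hours")
```

With these assumed numbers, 10 terabytes over a quarter of a 10 Gbps link works out to roughly 11 hours, before accounting for source load, LOBs, or target write throughput.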
Even if you are doing a lift-and-shift migration, you have decisions to make about your new platform.
You have even more decisions if you are switching engines. For instance, do you know why you chose
one engine over another? I hope that it’s not because your lead developer told you, “Dude, this is
cutting-edge, everybody is using it” and that you had a more structured conversation about why this
specific engine is best for your app, your team, and your timeline.
Another important decision to make is whether all the data needs to migrate. For example, you
might have an audit log table that is 97 percent of your database. Yes, you need the logs for
compliance reasons, but could they be archived later instead of being part of the live migration?
Another example: Do you have some dead tables in your database that supported a feature not
offered on the new platform?
Even if all the data needs to move, this is a perfect time to examine alternatives for some of your
data. For example, can you make use of NoSQL stores or data warehouses on the other end? In
general, it’s always good to keep it simple—that is, move the data first, then transform, restructure,
or rearchitect it. That said, it might be that refactoring your database during the migration is the only
opportunity to do so. In any case, it doesn’t hurt to plan and analyze your data model, even if you are
going to change it after migration is complete.
Finally, after all your preparations, thousands of things can go wrong during and after the migration.
Having a contingency plan is always a good idea. Can you flip back? Can you run your app on the old
and the new database simultaneously? Can you afford to have some data unavailable for a period of
time? Answers to these and some other questions help you shape a better contingency plan.
Also, unless there is a pressing need, it’s better to postpone all schema transformations until the end
of the migration.
Migrate to AWS
Migration to every cloud has its own specific steps and requires expertise with the specific cloud
ecosystem. You need to understand details of networking, permissions, roles, accounts, and so on.
Not only do you need to know how the cloud works in general, but you also have to be very familiar
with your specific corporate cloud implementation. This section of the post will cover some of the
specifics of migrating to AWS and what help is available, and help you ask the right questions about
the benefits and limitations of your new environment.
As I said before, AWS has created a number of tools and support materials to help you to be
successful with your migration projects. The first and very important tool is the documentation.
Please, please read the documentation for DMS and the documentation for SCT. Of course, you know
how to set up supplemental logging and you remember most of the Oracle data types and how they
map to the rest of the database engines. But please spend some time reading the documentation. It
will save you a lot of time and frustration during the migration. The AWS DMS assessment report
analyzes many possible problems and warns you before you start the migration. The DMS and SCT
forums are another place to find useful information before you start the project. In addition, you can
use a set of detailed playbooks. Finally, this GitHub repository includes tools, examples, and AWS
CloudFormation templates for most of the migration-related blog posts.
For many use cases, using Amazon RDS for your target database is a natural choice; you get the
benefits of a managed service. Make sure you are familiar with those—things like backups, OS and
database patches, security, high availability, scalability, elasticity, integration with the AWS
ecosystem, and so on. Using RDS frees you from spending time on these activities.
But there are also limitations. Make sure you are familiar with those—things like storage limits, lack
of admin privileges on your databases, and so on. You need to evaluate carefully whether you really
do require access to your database host. Of course, your DBA will tell you that the world is going to
end the moment SSH access is lost to the database box, but will it really? Some apps need this
access, but in most cases the sun will shine the day after your DBA loses database admin privileges,
as it does for other customers.
Amazon RDS also offers a variety of managed high availability (HA) features. This is a good point to
review your HA requirements. Yes, your current setup includes a few MySQL replicas, but why do you
have them? Maybe you can do better, and more importantly with less pain, or maybe this is why
your DBA wanted to have host access? It’s important to understand that some of these
decisions can be made after the migration. For example, if you are migrating to a MySQL database, it
is perfectly fine to migrate to Amazon RDS Single-AZ MySQL (actually faster) and then convert it to
an Amazon RDS Multi-AZ instance, or go further and use Amazon Aurora.
Finally, AWS offers a wide list of migration partners that can help you to make an assessment, scope
the work, and even execute your migration projects.
Summary
Database migration projects can be hard, especially your first one, but the benefits of migrating your
database to the cloud are significantly greater than the challenges migrations can present. You can
make these challenges predictable and less painful through diligent preparation and collecting the
necessary information before you start. Thousands of AWS customers have already migrated with or
without our help. Some of them took some shortcuts and hit every possible problem during their
first migration; others spent some time preparing and flew through their migrations with zero pain.
The checklist below will help you to ask the right questions before you start.