Docker Containers and Persistent Data

Every time when I talk about the benefits of Docker, I get the same question -over and over again: “How do you deal with persistent data in Docker Containers? When you remove the Docker Container, the generated data inside the Container is also lost, which is a no-go for a SQL Server Container”. Let’s try to answer that question in this blog posting, and how you can deal in Docker with this specific problem.

Docker and Data

Docker Containers are really cool, because they are easy to deploy, they are very lightweight, and they don’t need that much storage space because of the used Overlay File System. Docker Containers are great for stateless services that don’t generate data that must be persisted.

But how does that story fit into our needs as SQL Server professionals? We all know that SQL Server is a database, and generates transactional data, which has a very high business value, and therefore that data must survive the lifetime of a Docker Container. One way to deal with that problem is by performing traditional SQL Server Database Backups. Prior to removing a Docker Container, you perform a backup of your database, so that you can restore them later on a different SQL Server Instance, which is maybe again a Docker Container.

But that’s only half of the story, because Docker also provides you a concept called Volumes, which deals with the problem of persisting data across the lifetime of Docker Containers.

Docker Volumes

If you want to persist data in Docker, the recommended way is to use Docker Volumes. Without the use of Docker Volumes, the writeable layer of your Docker Container is removed from your Host as soon as you remove the Docker Container. But a Docker Volume is a separate Docker Object, which is not removed when you remove a Docker Container. I want to give you know a concrete example how to work with Docker Volumes, and you can even perform an easy SQL Server update with them.

First of all, we have to create a new Docker Volume. The creation of a Docker Volume is a simple one-liner:

docker volume create sql_volume

You can get more information about the available volumes and the details of a specific volume with the following 2 Docker commands:

docker volume ls

docker volume inspect sql_volume

To be able to store data in a Docker Volume (instead of the writable layer of the Docker Container), you can map now during the creation of a Docker Container a file path of the Container to a Docker Volume. In our case with SQL Server, we want to map the file path /var/opt/mssql to the previous created Docker Volume. SQL Server stores by default the Data- and Transaction Log Files in that folder in an Ubuntu based SQL Server Container:

docker run -e ‘ACCEPT_EULA=Y’ -e ‘SA_PASSWORD=passw0rd1!’ -p 1433:1433 –name sql2019 -v sql_volume:/var/opt/mssql -d mcr.microsoft.com/mssql/server:2019-GA-ubuntu-16.04

As you can see from the command, you have to use the parameter -v to perform the mapping of a file path to a Docker Volume. Therefore, Docker stores any changes that occurs in the folder /var/opt/mssql in the created Docker Volume, instead of the writeable layer of the Docker Container. I have also used here the SQL Server 2019 RTM image, because I want to show you later how you can perform an upgrade of the Docker Container to the SQL Server 2019 GDR Image – without losing any data in the Docker Container!

Let’s restore now in the next step a database backup in the Docker Container. Therefore, we have to copy in the first step the database backup into the Docker Container. You can use here the following command line:

docker cp AdventureWorks2014.bak sql2019:/var/backups/AdventureWorks2014.bak

And then I have just restored the database backup itself:

RESTORE DATABASE AdventureWorks2014 FROM DISK = '/var/backups/AdventureWorks2014.bak'
WITH
MOVE 'AdventureWorks2014_Data' TO '/var/opt/mssql/data/AdventureWorks2014_Data.mdf',
MOVE 'AdventureWorks2014_Log' TO '/var/opt/mssql/data/AdventureWorks2014_Log.ldf'
GO

Again, all these changes that we are performing here are written to the Docker Volume, and NOT into the writeable layer of the Docker Container! That’s very important here!

Upgrading SQL Server

Let’s imagine now, you have a SQL Server based Docker Container in use, and you want now to upgrade SQL Server to a new newer version. In our case, we have deployed a Docker Container with SQL Server 2019 RTM, and we want now to upgrade to SQL Server 2019 GDR. Therefore, in the first step we stop our Docker Container:

docker stop sql2019

And now we just start up a new Docker Container by providing the SQL Server 2019 GDR Docker Image, and we use again the same Docker Volume:

docker run -e ‘ACCEPT_EULA=Y’ -e ‘SA_PASSWORD=passw0rd1!’ -p 1433:1433 –name sql2019-gdr -v sql_volume:/var/opt/mssql -d mcr.microsoft.com/mssql/server:2019-latest

And now some magic happens within our Docker Container:

When the new Docker Container starts up, SQL Server sees the already existing system databases (with are also stored in the file path /var/opt/mssql that we have mapped to a Docker Volume).
Therefore, SQL Server just performs an in-place upgrade of our databases – the various system databases and also of the AdventureWork2014 database, that we have restored previously.

When you fast enough, you can even see that upgrade within SSMS or Azure Data Studio, when you connect to your SQL Server Container:

The great thing about that upgrade process is the fact that the various CU updates of SQL Server are compatible to each other. Therefore, you can move between the various CU updates forward and backward – without losing any data! You can even rollback to the RTM version of SQL Server 2019 just by stopping the sql2019-gdr Container, and starting up the sql2019 Container:

docker stop sql2019-gdr

docker start sql2019

The use of Docker Volumes makes that magic possible!

Summary

Docker and persistent data are not mutually exclusive to each other. As you have seen in this blog posting, a file path within a Docker Container can be mapped to a Docker Volume which is stored outside of the scope of a Docker Container. Therefore, you can persist data across the lifetime of Docker Containers.

And in addition, with SQL Server, you can even perform upgrades and rollbacks between the various CU updates without losing any data. I hope that this blog posting has convinced you more that Docker is a very interesting technology that can help you in a lot of different places.

If you want to learn more about SQL Server on Linux, Docker, and Kubernetes, you can also checkout my brand-new on-demand online training about it.

Thanks for your time,

-Klaus

3 thoughts on “Docker Containers and Persistent Data”

tmpfiles
12/25/2019 at 12:54 PM

Very informative post.. thanks for sharing with us.

Log in to Reply
Jose
06/27/2020 at 1:28 PM

Great article.

I have a question regarding schemas changes. Let’s say I make some schemas changes to a docker images and I deploy a new version with a volume but soon after I find out I need to rollback to an my old docker image while still referencing the same volume, would my old image break now that I have new schema changes in the new and not old or would it gracefully know to ignore the new schemas changes and work as expected. For instance I added a column and renamed a column let’s say.

Log in to Reply
David Cobb
02/01/2021 at 10:36 PM

Super useful, thanks!

For those following along, there’s a copy/paste gotcha. The name param has two dashes in front (editor may have combined them into a single long dash.)

Log in to Reply