Containerization with Docker

Containerization with Docker: Courses and Training

Requirements: basic programming knowledge of R, Python, or another language; basic knowledge of Git

Duration: 4h

When we started implementing data science projects, containers were only used in logistics. GitHub was still in its infancy, and RStudio had just started to support R packages. At that time, the applications for our data science projects ran on specially configured virtual machines (VMs), sometimes already supported by Jenkins as a CI/CD tool. In this setup we had the same problems as everyone else at the time: the production VM was configured differently from the development laptop, so errors in the production environment could not easily be reproduced locally. R and Python versions shared between different programs, for example through Shiny Server Pro, created unnecessary dependencies between applications. The list goes on. The solution was the same for us as it was for an entire industry: Docker containers!

Nowadays our IT landscape looks different: every data science project has its own Docker image, and applications can run completely independently of one another on our customers' infrastructure or our own. These can be batch processes, Shiny dashboards, or REST APIs. What do we gain by using Docker containers?

  1. Stability, because an application behaves the same way in the development environment as it does later in the production environment.
  2. Scalability, because the number of Docker containers and their applications can be easily adapted in a cluster.
  3. Independence, because applications in Docker containers are isolated, so changes to R and Python packages can be made individually for each container.
  4. Reproducibility, because we can create the same environment on any infrastructure.

What you will learn in this training

In our Docker training we will show you our most important use cases for Docker containers:

  • A batch job in a container using the example of R
  • A REST API in a container using the example of R and plumber
  • A dashboard in a container using the example of R and Shiny
  • A preview of ShinyProxy with Kubernetes as the backend

In addition, we give you an overview of best practices for developing containers, so that your applications can be integrated as easily as possible into AWS services such as Amazon ECS and Amazon EKS.