Dockerizing Legacy Applications
Makeover
Configurations and Environment Variables
Legacy applications often rely on complex configurations and environment variables. When dockerizing such applications, it's crucial to manage these configurations efficiently, without compromising security or functionality. Docker provides multiple ways to inject configurations and environment variables into containers: via Dockerfile instructions, command-line options, environment files, and Docker Compose. Each method serves a particular use case.
Dockerfile-based configurations are suitable for immutable settings that don't change across different environments. For instance, setting the JAVA_HOME variable for a Java application can be done in the Dockerfile:
FROM openjdk:11
ENV JAVA_HOME /usr/lib/jvm/java-11-openjdk-amd64
However, hard-coding sensitive or environment-specific information in the Dockerfile is not recommended, because it compromises security and reduces flexibility.
For more dynamic configurations, use the -e option with docker run to set environment variables:
docker run -e "DB_HOST=database.local" -e "DB_PORT=3306" my-application
While convenient for a few variables, this approach becomes unwieldy with a growing list. As a more scalable alternative, Docker allows you to specify an environment file:
# .env file
DB_HOST=database.local
DB_PORT=3306
Then, run the container as follows:
docker run --env-file .env my-application
This method keeps configurations organized, is easy to manage with version control systems, and separates the configurations from application code. However, exercise caution; ensure the .env files, especially those containing sensitive information, are adequately secured and not accidentally committed to public repositories.
In multi-container setups orchestrated with Docker Compose, you can define environment variables in the docker-compose.yml file:
services:
  my-application:
    image: my-application:latest
    environment:
      DB_HOST: database.local
      DB_PORT: 3306
For variable data across different environments (development, staging, production), Docker Compose supports variable substitution:
services:
  my-application:
    image: my-application:${TAG-latest}
    environment:
      DB_HOST: ${DB_HOST}
      DB_PORT: ${DB_PORT}
Run it with environment variables sourced from a .env file or directly from the shell:
DB_HOST=database.local DB_PORT=3306 docker-compose up
Configuration files necessary for your application can be managed using Docker volumes. Place the configuration files on the host system and mount them into the container:
docker run -v /path/to/config/on/host:/path/to/config/in/container my-application
In Docker Compose, use:
services:
  my-application:
    image: my-application:latest
    volumes:
      - /path/to/config/on/host:/path/to/config/in/container
This approach provides a live link between host and container: Changes to the files on the host appear inside the container immediately, although the application may still need to reload its configuration to pick them up.
Docker and Security Concerns
Securing Docker containers requires checking every layer: the host system, the Docker daemon, images, containers, and networking. Mistakes in any of these layers can expose your application to a variety of threats, including unauthorized data access, denial of service, code execution attacks, and many others.
Start by securing the host system running the Docker daemon. Limit access to the Docker Unix socket, typically /var/run/docker.sock. This socket allows communication with the Docker daemon and, if compromised, grants full control over Docker. Use Unix permissions to restrict access to authorized users.
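On most distributions, the socket should be owned by root and the docker group, with no access for other users. A quick check and fix, assuming a docker group already exists, looks like this:

ls -l /var/run/docker.sock
sudo chown root:docker /var/run/docker.sock
sudo chmod 660 /var/run/docker.sock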
Always fetch Docker images from trusted sources. Scan images for vulnerabilities using a tool like Docker Scout [6] or Clair [7].
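With the Docker Scout CLI installed, for example, scanning an image for known CVEs is a one-liner (the image name here is a placeholder):

docker scout cves my-application:latest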
Implement least privilege principles for containers. For instance, don't run containers as the root user. Specify a non-root user in the Dockerfile:
FROM ubuntu:latest
RUN useradd -ms /bin/bash myuser
USER myuser
Containers should also run with the fewest privileges possible. The following example:
docker run --cap-drop=ALL --cap-add=NET_BIND_SERVICE my-application
starts a container with all capabilities dropped and then adds back only the NET_BIND_SERVICE capability required to bind to ports lower than 1024.
Use read-only mounts for sensitive files or directories to prevent tampering:
docker run -v /my-secure-data:/data:ro my-application
If the container needs to write to a filesystem, consider using Docker volumes and restricting read/write permissions appropriately.
It is also important to implement logging and monitoring to detect abnormal container behavior, such as unexpected outgoing traffic or resource utilization spikes.
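At the most basic level, Docker's built-in tooling already surfaces this information (the container name is a placeholder):

docker logs -f --tail 100 my-application
docker stats my-application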
Dockerizing a Legacy CRM System
To dockerize a legacy Customer Relationship Management (CRM) system effectively, you need to first understand its current architecture. The hypothetical legacy CRM I'll dockerize consists of an Apache web server, a PHP back end, and a MySQL database. The application currently runs on a single, aging physical server, handling functions from customer data storage to sales analytics.
The CRM's monolithic architecture means that the web server, PHP back end, and database are tightly integrated, all residing on the same machine. The web server listens on port 80 and communicates directly with the PHP back end, which in turn talks to the MySQL database on port 3306. Clients interact with the CRM through a web interface served by the Apache server.
The reasons for migrating the CRM to a container environment are as follows:
- Scalability: The system's monolithic nature makes it hard to scale individual components.
- Maintainability: Patching or updating one part of the application often requires taking the entire system offline.
- Deployment: New feature rollouts are time-consuming and prone to errors.
- Resource utilization: The aging hardware is underutilized but can't be decommissioned due to the monolithic architecture.
To containerize the CRM, you need to take the following steps.
Step 1: Initial Isolation of Components and Dependencies
Before you dive into dockerization, it is important to isolate the individual components of the legacy CRM system: the Apache web server, PHP back end, and MySQL database. This step will lay the groundwork for creating containerized versions of these components. However, the tightly integrated monolithic architecture presents challenges in isolation, specifically in ensuring that dependencies are correctly mapped and that no features break in the process.
Start by decoupling the Apache web server from the rest of the system. One approach is to create a reverse proxy that routes incoming HTTP requests to a separate machine or container where Apache is installed. You can achieve this using NGINX:
# nginx.conf
server {
    listen 80;
    location / {
        proxy_pass http://web:80;
    }
}
Next, move the PHP back end to its own environment. Use PHP-FPM to manage PHP processes separately. Update Apache's httpd.conf to route PHP requests to the PHP-FPM service:
# httpd.conf
ProxyPassMatch ^/(.*\.php(/.*)?)$ fcgi://php:9000/path/to/app/$1
For the MySQL database, configure a new MySQL instance on a separate machine. Update the PHP back end to connect to this new database by altering the database connection string in the configuration:
<?php
$db = new PDO('mysql:host=db;dbname=your_db', 'user', 'password');
?>
During this isolation, you might find that some components have shared libraries or dependencies that are stored locally, such as PHP extensions or Apache modules. These should be identified and installed in the respective isolated environments. Missing out on these dependencies can cause runtime errors or functional issues.
While moving the MySQL database, ensuring data consistency can be a challenge. Use tools like mysqldump [8] for data migration and validate the consistency (Listing 7).
Listing 7
MySQL Data
# Data export from old MySQL
mysqldump -u username -p database_name > data-dump.sql
# Data import to new MySQL
mysql -u username -p new_database_name < data-dump.sql
If user sessions were previously managed by storing session data locally, you'll need to migrate this functionality to a distributed session management system like Redis.
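A minimal sketch of what that looks like with the phpredis extension – assuming a Redis service reachable under the hostname redis – is two lines in php.ini:

; php.ini
session.save_handler = redis
session.save_path = "tcp://redis:6379"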
Step 2: Creating Dockerfiles and Basic Containers
Once components and dependencies are isolated, the next step is crafting Dockerfiles for each element: the Apache web server, PHP back end, and MySQL database. For Apache, the Dockerfile starts from a base Apache image and copies the necessary HTML and configuration files. A simplified Dockerfile appears in Listing 8.
Listing 8
Dockerfile for Apache
# Use an official Apache runtime as base image
FROM httpd:2.4
# Copy configuration and web files
COPY ./my-httpd.conf /usr/local/apache2/conf/httpd.conf
COPY ./html/ /usr/local/apache2/htdocs/
Build the Apache image with:
docker build -t my-apache-image .
Then, run the container:
docker run --name my-apache-container -d my-apache-image
For PHP, start with a base PHP image and then install needed extensions. Add your PHP code afterwards (Listing 9).
Listing 9
Dockerfile for PHP
# Use an official PHP runtime as base image
FROM php:8.2-fpm
# Install PHP extensions
RUN docker-php-ext-install pdo pdo_mysql
# Copy PHP files
COPY ./php/ /var/www/html/
Build and run the PHP image similarly to Apache:
docker build -t my-php-image .
docker run --name my-php-container -d my-php-image
MySQL Dockerfiles are less common because the official MySQL Docker images are configurable via environment variables. However, if you have SQL scripts to run at startup, you can include them (Listing 10).
Listing 10
Dockerfile for MySQL Startup Scripts
# Use the official MySQL image
FROM mysql:8.0
# Initialize database schema
COPY ./sql-scripts/ /docker-entrypoint-initdb.d/
Run the MySQL container with environment variables to set up the database name, user, and password:
docker run --name my-mysql-container \
  -e MYSQL_ROOT_PASSWORD=my-secret \
  -e MYSQL_DATABASE=crm \
  -e MYSQL_USER=crm-user \
  -e MYSQL_PASSWORD=crm-pass \
  -d my-mysql-image
For production, you'll need to optimize these Dockerfiles and runtime commands with critical settings, such as specifying non-root users to run services in containers, fine-tuning Apache and PHP settings for performance, and enabling secure connections to MySQL.
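As one example of the latter, MySQL 8 generates self-signed TLS certificates on first initialization, so you can require encrypted client connections by passing a server flag at startup (a sketch; a production setup would use properly issued certificates):

docker run --name my-mysql-container -e MYSQL_ROOT_PASSWORD=my-secret -d my-mysql-image --require_secure_transport=ON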
Step 3: Networking and Data Management
At this point, the decoupled components – Apache, PHP, and MySQL – each reside in a separate container. For these containers to function cohesively as your legacy CRM system, appropriate networking and data management strategies are vital.
Containers should communicate over a user-defined bridge network rather than Docker's default bridge to enable hostname-based communication. Create a user-defined network:
docker network create crm-network
Then attach each container to this network (Listing 11).
Listing 11
Network Setup
docker run --network crm-network --name my-apache-container -d my-apache-image
docker run --network crm-network --name my-php-container -d my-php-image
docker run --network crm-network --name my-mysql-container -e MYSQL_ROOT_PASSWORD=my-secret -d my-mysql-image
Now, each container can reach another using an alias or the service name as the hostname. For instance, in your PHP database connection string, you can replace the hostname with my-mysql-container.
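Applied to the connection string from earlier, that looks as follows (the credentials remain placeholders):

<?php
// Docker's embedded DNS resolves container names on user-defined networks
$db = new PDO('mysql:host=my-mysql-container;dbname=your_db', 'user', 'password');
?>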
Data in Docker containers is ephemeral. For a database system, losing data upon container termination is unacceptable. You can use Docker volumes to make certain data persistent and manageable:
docker volume create mysql-data
Bind this volume to the MySQL container:
docker run --network crm-network --name my-mysql-container -e MYSQL_ROOT_PASSWORD=my-secret -v mysql-data:/var/lib/mysql -d my-mysql-image
For the Apache web server and PHP back end, you should map any writable directories (e.g., for logs or uploads) to Docker volumes.
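A named volume for Apache's log directory, for example, might look like this (the path matches the official httpd image; adjust it to your own images):

docker volume create apache-logs
docker run --network crm-network --name my-apache-container -v apache-logs:/usr/local/apache2/logs -d my-apache-image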
Docker Compose facilitates running multi-container applications. Create a docker-compose.yml
file as shown in Listing 12.
Listing 12
docker-compose.yml Network Setup
services:
  web:
    image: my-apache-image
    networks:
      - crm-network
  php:
    image: my-php-image
    networks:
      - crm-network
  db:
    image: my-mysql-image
    environment:
      MYSQL_ROOT_PASSWORD: my-secret
    volumes:
      - mysql-data:/var/lib/mysql
    networks:
      - crm-network
networks:
  crm-network:
    driver: bridge
volumes:
  mysql-data:
Execute docker-compose up, and all your services will start on the defined network with the appropriate volumes for data persistence. Note that user-defined bridge networks incur a small overhead. Although this overhead is negligible for most applications, high-throughput systems might require host or macvlan networks.
If you decide to run your app in Kubernetes, for example, you will not need to worry about Docker networking, because Kubernetes has its own networking plugins.
Step 4: Configuration Management and Environment Variables
Configuration management and environment variables form the backbone of a flexible, maintainable dockerized application. They allow you to parametrize your containers so that the same image can be used in multiple contexts, such as development, testing, and production, without alteration. These parameters might include database credentials, API keys, or feature flags.
You can pass environment variables to a container at runtime via the -e flag:
docker run --name my-php-container -e API_KEY=my-api-key -d my-php-image
In your PHP code, the API_KEY variable can be accessed as $_ENV['API_KEY'] or getenv('API_KEY'). For a more comprehensive approach, Docker Compose allows you to specify environment variables for each service in the docker-compose.yml file:
services:
  db:
    image: my-mysql-image
    environment:
      MYSQL_ROOT_PASSWORD: my-secret
Alternatively, you can use a .env file in the same directory as your docker-compose.yml. Place your environment variables in the .env file:
API_KEY=my-api-key
MYSQL_ROOT_PASSWORD=my-secret
Reference these in docker-compose.yml:
services:
  db:
    image: my-mysql-image
    environment:
      MYSQL_ROOT_PASSWORD: ${MYSQL_ROOT_PASSWORD}
Running docker-compose up will load these environment variables automatically. Never commit sensitive information like passwords or API keys in your Dockerfiles or code.
Configuration files for Apache, PHP, or MySQL should never be hard-coded into the image. Instead, mount them as volumes at runtime. If you're using Docker Compose, you can specify a volume using the volumes directive:
services:
  web:
    image: my-apache-image
    volumes:
      - ./my-httpd.conf:/usr/local/apache2/conf/httpd.conf
Some configurations might differ between environments (e.g., development and production). Use templates for your configuration files where variables can be replaced at runtime by environment variables. Tools like envsubst can assist in this substitution before the service starts:
envsubst < my-httpd-template.conf > /usr/local/apache2/conf/httpd.conf
Strive for immutable configurations and idempotent operations to ensure your system's consistency. Once a container is running, changing its configuration should not require manual intervention. If a change is needed, deploy a new container with the updated configuration.
While this approach is flexible, it introduces complexity into the system, requiring well-documented procedures for setting environment variables and mounting configurations. Remember that incorrect handling of secrets and environment variables can lead to security vulnerabilities.
Step 5: Testing and Validation
Testing and validation are nonnegotiables in the transition from a legacy system to a dockerized architecture. Ignoring or cutting corners in this phase jeopardizes the integrity of the system, often culminating in performance bottlenecks, functional inconsistencies, or security vulnerabilities. The CRM system, being business-critical, demands meticulous validation.
The most basic level of validation is functional testing to ensure feature parity with the legacy system. Automated tools like Selenium [9] for web UI testing or Postman [10] for API testing offer this capability. Running a test suite against both the legacy and dockerized environments verifies consistent behavior. For example, to run your Selenium tests against a browser in a Docker container, you would type commands similar to the following:
docker run -d -p 4444:4444 selenium/standalone-chrome
python my_test_script.py   # connects to http://localhost:4444/wd/hub
Once functionality is confirmed, performance metrics such as latency, throughput, and resource utilization must be gauged using tools like Apache JMeter, Gatling, or custom scripts. You should also simulate extreme conditions to validate the system's reliability under strain.
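For a quick smoke test of latency and throughput, even ApacheBench (ab), which ships with the Apache httpd distribution, will do – here against a locally exposed CRM front end (URL and numbers are illustrative):

ab -n 1000 -c 50 http://localhost/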
Static application security testing (SAST) and dynamic application security testing (DAST) should also be employed. Tools like OWASP ZAP can be dockerized and incorporated into the testing pipeline for dynamic testing. While testing, activate monitoring solutions like Prometheus and Grafana or ELK stack for real-time metrics and logs. These tools will identify potential bottlenecks or security vulnerabilities dynamically.
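For instance, ZAP's baseline scan runs directly from the official image against a target URL (the URL here is a placeholder):

docker run -t owasp/zap2docker-stable zap-baseline.py -t http://crm.example.com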
Despite rigorous testing, unforeseen issues might surface post-deployment. Therefore, formulate a rollback strategy beforehand. Container orchestration systems, such as Kubernetes and Swarm, provide the ability to easily roll out changes and roll back when issues occur.
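In Kubernetes, for example, reverting a bad release is a single command, assuming the CRM front end runs as a Deployment named crm-web (a hypothetical name):

kubectl rollout undo deployment/crm-web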
Step 6: Deployment
Deployment into a production environment is the final phase of dockerizing a legacy CRM system. The delivery method will depend on the application and your role as a developer. Many containerized applications reside today in application repositories, including Docker's own Docker Hub container image library. If you are deploying the application within your own infrastructure, you will likely opt for a container orchestration solution already in use, such as Kubernetes.