Understanding Docker: A Beginner’s Guide for Geophysics Students

Docker offers a simple and efficient way to run software consistently across different environments. This beginner-friendly guide breaks down the fundamentals of Docker, explains its key concepts, and walks you through creating your own Docker image from a simple Linux base. Whether you're just getting started with computational tools or looking to streamline your research workflows, it covers what you need to start using Docker in your geophysics studies.

Introduction

Researchers and professionals across many fields rely on software tools to run complex analyses and simulations. For geophysics students, managing and deploying these tools can be challenging because of differences in operating systems, software dependencies, and configurations. Docker addresses this with containerization: an application and its dependencies are packaged together so that they can be run on any machine with Docker installed. This guide gives an overview of Docker and explains its concepts and benefits in an accessible manner.

What is Docker?

Docker is an open-source platform for building, distributing, and running applications in containers. Containers are lightweight, standalone executable packages that include everything needed to run a piece of software, such as code, runtime, system tools, libraries, and settings. This ensures that the application will run consistently across different computing environments, whether it’s your personal laptop, a university server, or a cloud platform.

Why Use Docker?

Consistency: Docker ensures that applications run the same way regardless of where they are deployed. This eliminates the classic “it works on my machine” problem.

Efficiency: Containers are lightweight and use system resources more efficiently compared to traditional virtual machines.

Portability: Docker containers can be easily shared and deployed across various platforms, facilitating collaboration among researchers and developers.

Scalability: Docker makes it straightforward to scale applications up or down to meet changing demands.

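Isolation: Each container runs in its own isolated environment, which prevents conflicts between applications (for example, two tools that need different versions of the same library) and limits what a container can see of the host system.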

Understanding Key Concepts

Before diving into using Docker, it’s essential to understand some fundamental concepts:

  1. Images
    • An image is a read-only template that contains the instructions for creating a Docker container. It includes the application code, libraries, and dependencies needed to run the application.
    • Images are built using a file called a Dockerfile, which contains a series of commands that specify how to construct the image.
    • Images can be downloaded from repositories such as Docker Hub, a public registry where developers share their Docker images.
  2. Containers
    • A container is a runnable instance of an image. When you run an image, it becomes a container.
    • Containers are isolated from each other and the host system but can communicate through well-defined channels.
    • You can create, start, stop, move, or delete containers using simple Docker commands.
  3. Dockerfile
    • A Dockerfile is a text file that contains all the commands needed to assemble an image.
    • It allows you to automate the image creation process, ensuring that images are built consistently every time.
    • A basic Dockerfile specifies a base image and includes instructions to add files, install dependencies, and configure the environment.
  4. Docker Hub
    • Docker Hub is a cloud-based registry where Docker users can store and share their images.
    • It contains a vast collection of pre-built images for various applications and services, which can be used as starting points for your projects.
    • Users can also push their custom images to Docker Hub for others to use.
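
The relationship between images and containers can be made concrete with a few commands. The following is a minimal sketch, assuming Docker is installed and the daemon is running; the container names demo-a and demo-b are placeholders chosen for this example. It starts two containers from the same image and shows that removing them leaves the image in place.

```shell
# Image used for the demonstration (the same base image used later in this guide).
image="ubuntu:20.04"

# Only run the Docker commands if the daemon is reachable.
if docker info >/dev/null 2>&1; then
    docker pull "$image"                 # download the image from Docker Hub
    docker image ls ubuntu               # the image is a stored, read-only template

    # Start two independent containers from the same image.
    # demo-a and demo-b are hypothetical names chosen for this example.
    docker run --name demo-a "$image" echo "container A"
    docker run --name demo-b "$image" echo "container B"

    docker ps -a --filter "name=demo-"   # both containers exist, now stopped
    docker rm demo-a demo-b              # removing containers leaves the image in place
else
    echo "Docker is not available; skipping the demonstration."
fi
```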

Docker vs. Virtual Machines

While both Docker containers and virtual machines (VMs) are used to create isolated environments, there are key differences:

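  • Resource Utilization: Containers share the host system’s kernel instead of booting a full guest operating system, as VMs do. This makes them more lightweight, with faster start-up times and lower memory and disk usage.
  • Performance: Because there is no hardware virtualization layer between the application and the kernel, containers on a Linux host generally run with little overhead compared to VMs.
  • Portability: A container image can run on any machine with a compatible container runtime, whereas VM images are larger and often tied to a particular hypervisor format. Note that Linux containers need a Linux kernel, so Docker Desktop on Windows and macOS runs them inside a lightweight Linux virtual machine.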
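
One way to observe the shared kernel is to compare the kernel release reported by the host with the one reported inside a container. This is a sketch that assumes Docker is installed and uses the ubuntu:20.04 image. On a Linux host the two values are expected to match, while on Docker Desktop for Windows or macOS the container reports the kernel of the Linux virtual machine that Docker Desktop runs in the background.

```shell
# Kernel release as seen by the host.
host_kernel="$(uname -r)"
echo "host kernel:      $host_kernel"

if docker info >/dev/null 2>&1; then
    # Kernel release as seen from inside a throwaway container
    # (--rm deletes the container as soon as the command exits).
    container_kernel="$(docker run --rm ubuntu:20.04 uname -r)"
    echo "container kernel: $container_kernel"
else
    echo "Docker is not available; skipping the container check."
fi
```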

Installing Docker

Docker is straightforward to install and supports Windows, macOS, and Linux. Here’s how to install Docker on a typical system:

For Windows and macOS:

  • Visit the Docker Desktop download page and download the appropriate installer for your OS.
  • Run the installer and follow the on-screen instructions.
  • After installation, verify by opening a terminal and running:
docker --version

This should display the installed Docker version.

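For Linux (Debian or Ubuntu):

  • The docker-ce packages come from Docker’s own apt repository rather than the distribution’s default sources, so first add that repository by following the official installation instructions for your distribution. Then open a terminal and run the following commands:
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io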

Verify the installation by running:

docker --version
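
Note that docker --version only confirms that the command-line client is installed. As a rough sketch of a fuller check, the snippet below also tests whether the Docker daemon is reachable and, if so, runs the small hello-world image from Docker Hub. The docker_status variable is a name introduced here for illustration.

```shell
# Classify the state of the local Docker installation.
if ! command -v docker >/dev/null 2>&1; then
    docker_status="cli-missing"          # the docker command is not on the PATH
elif ! docker info >/dev/null 2>&1; then
    docker_status="daemon-unreachable"   # client found, but it cannot talk to the daemon
else
    docker_status="ok"
    docker run --rm hello-world          # pulls a tiny test image and prints a greeting
fi
echo "Docker status: $docker_status"
```

On Linux, a daemon-unreachable result often means that the Docker service is not running or that your user lacks permission to talk to it; repeating the command with sudo is a quick way to tell the two apart.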

Getting Started with Docker

Let’s walk through a simple example of using Docker to run a Python application.

Step 1: Pull a Docker Image

  • Run the following command to download the official Python image from Docker Hub:
docker pull python:3.8-slim

Step 2: Run a Container

  • Start a container using the pulled image:
docker run -it --name my-python-app python:3.8-slim bash

This command creates and starts a new container from the python:3.8-slim image, names it my-python-app, and opens an interactive Bash shell inside it. The -it option combines two flags: -i keeps standard input open, and -t allocates a terminal, so you can type commands inside the container much as you would in a terminal on your local machine. The result is a lightweight environment with Python 3.8 installed, ready for running Python code or other commands.

Step 3: Write and Run a Simple Python Script

  • Inside the container’s terminal, create a simple Python script:
echo "print('Hello, Docker')" > app.py
python app.py

You should see the output: Hello, Docker
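
In day-to-day work you will usually edit scripts on your own machine rather than inside the container. One common pattern, sketched here under the assumption that Docker is installed, is to mount a host directory into the container with -v and run the script from there. The directory name docker-demo and the mount point /work are arbitrary choices for this example.

```shell
# Create a small project directory on the host with the same script as above.
mkdir -p docker-demo
echo "print('Hello, Docker')" > docker-demo/app.py

if docker info >/dev/null 2>&1; then
    # -v mounts the host directory at /work, -w makes /work the working directory,
    # and --rm removes the container once the script finishes.
    docker run --rm -v "$PWD/docker-demo":/work -w /work python:3.8-slim python app.py
else
    echo "Docker is not available; skipping the run."
fi
```

Because the script lives on the host, changes you make in your editor are picked up the next time you run the command, with no need to re-enter the container.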

Step 4: Exiting and Managing the Container

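  • Exit the container by typing exit. Because the Bash shell is the container’s main process, exiting it also stops the container.
  • To see running containers, use docker ps. To include stopped containers, such as the one you just exited, add the -a flag:
docker ps -a
  • To start or stop the container:
docker start my-python-app
docker stop my-python-app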
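
A stopped container keeps its files until it is removed. The sketch below assumes the my-python-app container from Step 2 still exists and that app.py was created in it during Step 3. It restarts the container in the background, runs a command in it with docker exec, and then stops and removes it.

```shell
# Name given to the container in Step 2.
name="my-python-app"

# Proceed only if Docker is reachable and the container exists.
if docker container inspect "$name" >/dev/null 2>&1; then
    docker start "$name"                 # restart the container in the background
    docker exec "$name" python app.py    # assumes app.py was created in Step 3
    docker stop "$name"
    docker rm "$name"                    # delete the container; the image is kept
    docker ps -a --filter "name=$name"   # the container is no longer listed
else
    echo "Container $name not found (or Docker is not available); nothing to do."
fi
```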

Writing a Dockerfile with a Simple Linux Base

Now that you understand the basics of Docker, let’s dive into creating your own Docker image using a Dockerfile. A Dockerfile is essentially a script containing a series of instructions to build an image. In this section, we’ll create a Dockerfile that uses a simple Linux base, such as Ubuntu, and sets up a basic environment.

Step 1: Create a Dockerfile

Start by creating a new directory for your project and navigating into it. Inside this directory, create a file named Dockerfile (without any extension). This file will contain the instructions to build your Docker image.

mkdir my-docker-app
cd my-docker-app
touch Dockerfile

Step 2: Choose a Base Image

The first instruction in your Dockerfile should specify the base image that your image will be built upon. For simplicity, we’ll use an official Ubuntu base image.

# Use an official Ubuntu as a parent image
FROM ubuntu:20.04

FROM specifies the base image. Here, ubuntu:20.04 refers to the Ubuntu 20.04 LTS image.

Step 3: Install Dependencies

Next, you’ll want to install any necessary software packages. For this example, we’ll install curl and git. curl downloads files over HTTP and git fetches source code, so both are useful for pulling data and code into an image.

# Install curl and git
RUN apt-get update && apt-get install -y curl git

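RUN executes commands in a new layer on top of the current image and commits the results. This command updates the package lists and installs the required packages. Chaining the two steps with && keeps them in a single layer, so the install never runs against a stale, cached package list.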
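
Since each RUN instruction adds a layer, files left behind by a command stay in the image. A common refinement, shown here as a sketch rather than a required change, is to skip recommended packages and delete the apt package lists in the same RUN instruction. The file name Dockerfile.slim and the tag my-ubuntu-slim are hypothetical and are used only so the example does not overwrite the Dockerfile built in this guide.

```shell
# Write an alternative Dockerfile next to the main one (hypothetical file name).
cat > Dockerfile.slim <<'EOF'
FROM ubuntu:20.04

# Install curl and git without recommended extras (ca-certificates is listed
# explicitly so HTTPS downloads still work), then remove the apt package
# lists in the same layer so they do not add to the image size.
RUN apt-get update \
    && apt-get install -y --no-install-recommends ca-certificates curl git \
    && rm -rf /var/lib/apt/lists/*
EOF

if docker info >/dev/null 2>&1; then
    # -f selects the Dockerfile; the tag my-ubuntu-slim is a placeholder.
    docker build -f Dockerfile.slim -t my-ubuntu-slim .
    docker image ls my-ubuntu-slim
else
    echo "Docker is not available; Dockerfile.slim was written but not built."
fi
```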

Step 4: Add Your Application Code

If you have an application or script that you want to include in the image, you can add it using the COPY instruction. Let’s say you have a simple shell script called hello.sh.

First, create the hello.sh script in the same directory as your Dockerfile:

echo 'echo "Hello, Docker!"' > hello.sh

Then, add the following line to your Dockerfile:

# Copy the hello.sh script into the container
COPY hello.sh /usr/local/bin/hello.sh

COPY takes the hello.sh file from your build context (here, the project directory) and places it inside the image at /usr/local/bin/hello.sh, so every container started from the image contains it.

Step 5: Set Default Command

The CMD instruction specifies the default command that runs when a container is started from your image. Here, we’ll make our script the default command.

# Run the hello.sh script by default
CMD ["bash", "/usr/local/bin/hello.sh"]

CMD defines the command to run within the container when it starts. This command can be overridden by specifying a different command when running the container.
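
To illustrate the override, the sketch below assumes the image has already been built and tagged my-ubuntu-app, as described in Step 6. Anything placed after the image name on the docker run command line replaces the CMD from the Dockerfile.

```shell
# Image name used when building in Step 6.
image="my-ubuntu-app"

# Proceed only if the image has already been built.
if docker image inspect "$image" >/dev/null 2>&1; then
    docker run --rm "$image"                        # runs the default CMD (hello.sh)
    docker run --rm "$image" echo "CMD overridden"  # runs echo instead of hello.sh
    docker run --rm "$image" git --version          # any program in the image can be run
else
    echo "Image $image not found (or Docker is not available); build it first."
fi
```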

Step 6: Build the Docker Image

The complete Dockerfile now looks like this:

# Use an official Ubuntu as a parent image
FROM ubuntu:20.04

# Install curl and git
RUN apt-get update && apt-get install -y curl git

# Copy the hello.sh script into the container
COPY hello.sh /usr/local/bin/hello.sh

# Run the hello.sh script by default
CMD ["bash", "/usr/local/bin/hello.sh"]

With your Dockerfile complete, it’s time to build the image. Run the following command in your terminal:

docker build -t my-ubuntu-app .
  • docker build creates a Docker image from your Dockerfile.
  • -t my-ubuntu-app tags your image with the name my-ubuntu-app.
  • The . at the end specifies the current directory as the build context.

The build output should look similar to this:

(base) ➜  docker-learn docker build -t my-ubuntu-app .

[+] Building 31.3s (9/9) FINISHED                                                                                 docker:desktop-linux
 => [internal] load build definition from Dockerfile                                                                              0.0s
 => => transferring dockerfile: 337B                                                                                              0.0s
 => [internal] load metadata for docker.io/library/ubuntu:20.04                                                                   1.5s
 => [auth] library/ubuntu:pull token for registry-1.docker.io                                                                     0.0s
 => [internal] load .dockerignore                                                                                                 0.0s
 => => transferring context: 2B                                                                                                   0.0s
 => [1/3] FROM docker.io/library/ubuntu:20.04@sha256:fa17826afb526a9fc7250e0fbcbfd18d03fe7a54849472f86879d8bf562c629e             1.3s
 => => resolve docker.io/library/ubuntu:20.04@sha256:fa17826afb526a9fc7250e0fbcbfd18d03fe7a54849472f86879d8bf562c629e             0.0s
 => => sha256:fa17826afb526a9fc7250e0fbcbfd18d03fe7a54849472f86879d8bf562c629e 1.34kB / 1.34kB                                    0.0s
 => => sha256:420b6f4cc783dc199667eae3316d9e066bd6e19931448c1e4b6f9a3759e52106 424B / 424B                                        0.0s
 => => sha256:2788af2ba581a02af2023a67e83596fcf92f74591bdea327f03cc3fd9ca25fe7 2.31kB / 2.31kB                                    0.0s
 => => sha256:6a1df50fc4815789598fa24d3ecacb70451e506447ab9e45665024b9f3f0233b 25.97MB / 25.97MB                                  0.6s
 => => extracting sha256:6a1df50fc4815789598fa24d3ecacb70451e506447ab9e45665024b9f3f0233b                                         0.6s
 => [internal] load build context                                                                                                 0.0s
 => => transferring context: 57B                                                                                                  0.0s
 => [2/3] RUN apt-get update && apt-get install -y curl git                                                                      28.2s
 => [3/3] COPY hello.sh /usr/local/bin/hello.sh                                                                                   0.0s 
 => exporting to image                                                                                                            0.3s 
 => => exporting layers                                                                                                           0.2s
 => => writing image sha256:0a54d76e8cdcb37ea83d7991c51e3cb69a177dee49df6a26d3e8fc1fd514e035                                      0.0s
 => => naming to docker.io/library/my-ubuntu-app                                                                                  0.0s

View build details: docker-desktop://dashboard/build/desktop-linux/desktop-linux/rt1u2fdeh6pex8aienjyjskf1

What's next:
    View a summary of image vulnerabilities and recommendations → docker scout quickview
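
The build context is sent to Docker in full, so it helps to keep large or irrelevant files out of it. The sketch below writes a .dockerignore file in the project directory and then lists the layers of the finished image with docker history. The ignore patterns are illustrative assumptions (the data directory and the .sac and .mseed extensions stand in for seismic waveform files you might keep next to your code) and should be adapted to your project.

```shell
# Write a .dockerignore file next to the Dockerfile.
cat > .dockerignore <<'EOF'
# Version-control metadata is not needed inside the image.
.git
# Hypothetical data directory and waveform files; adjust to your project.
data
**/*.sac
**/*.mseed
EOF

# If the image from this step exists, show the layers it is made of.
if docker image inspect my-ubuntu-app >/dev/null 2>&1; then
    docker history my-ubuntu-app
else
    echo "Image my-ubuntu-app not found (or Docker is not available)."
fi
```

Each row of the docker history output corresponds to an instruction in the Dockerfile, which makes it easy to see which step contributes most to the image size.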

Step 7: Run the Docker Container

Once the image is built, you can create and run a container from it:

docker run my-ubuntu-app
  • docker run creates and starts a container from the specified image (my-ubuntu-app).
  • The container should output Hello, Docker! as defined in your script.
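
To pass the image to a collaborator or to another machine that has Docker installed, without going through a registry, one option is to export it to a tar archive. The sketch below assumes the my-ubuntu-app image exists; the version tag 0.1 and the archive name are arbitrary choices for this example.

```shell
image="my-ubuntu-app"
archive="my-ubuntu-app-0.1.tar"   # hypothetical archive name

if docker image inspect "$image" >/dev/null 2>&1; then
    docker tag "$image" "$image:0.1"        # add an explicit version tag
    docker save -o "$archive" "$image:0.1"  # write the image to a tar archive
    ls -lh "$archive"
    # On the receiving machine, the image can be restored with:
    #   docker load -i my-ubuntu-app-0.1.tar
else
    echo "Image $image not found (or Docker is not available); build it first."
fi
```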

Conclusion

Docker is a powerful tool that streamlines the process of deploying and managing applications across various environments. For geophysics students and researchers, Docker offers a way to handle complex software setups efficiently, ensuring consistency and reproducibility in research projects. By understanding and utilizing Docker, you can enhance your computational workflows and collaborate more effectively within the scientific community.

Utpal Kumar

Geophysicist | Geodesist | Seismologist | Open-source Developer
I am a geophysicist with a background in computational geophysics, currently working as a postdoctoral researcher at UC Berkeley. My research focuses on seismic data analysis, structural health monitoring, and understanding deep Earth structures. I have had the opportunity to work on diverse projects, from investigating building characteristics using smartphone data to developing 3D models of the Earth's mantle beneath the Yellowstone hotspot.

In addition to my research, I have experience in cloud computing, high-performance computing, and single-board computers, which I have applied in various projects. This includes working with platforms like AWS, GCP, Linode, DigitalOcean, as well as supercomputing environments such as STAMPEDE2, ANVIL, Savio and PERLMUTTER (and CORI). My work involves developing innovative solutions for structural health monitoring and advancing real-time seismic response analysis. I am committed to applying these skills to further research in computational seismology and structural health monitoring.
