Remember Why

A small box
My small Ikea remembrance box

A Minimalist Upbringing

My mother embraced a minimalist lifestyle, constantly removing excess from our home. As a child, I was the opposite, cherishing every doodle and pebble, leading to a clash of perspectives. To manage this conflict, she introduced a clever solution: she bought four small boxes, one for each child, and declared that we could keep only what fit inside our individual box. This system granted us freedom within limits and taught us to prioritize what truly mattered. Decades later, I still have my box. Overflowing now, it recently prompted me to sort through it and discard unnecessary items—a lesson in decluttering that resonates deeply.

The Connection to IT

This personal experience parallels challenges in information technology. Early IT systems were straightforward, with developers who both wrote code and managed infrastructure. Their focus was directly aligned with achieving business objectives. Over time, as IT systems grew more complex, specialization emerged. Unfortunately, this often resulted in a siloed, self-focused approach that overlooked the broader business goals. Without clear alignment to business needs, IT evolved into a cluttered space—much like a hoarder’s collection.

In my career across large IT environments, I’ve witnessed this firsthand. Many times, I couldn’t tell how the applications I supported contributed to the business, if at all. This lack of alignment highlights a critical need for IT to rediscover its “business box.”

The Business Box: A Framework for IT Alignment

To address inefficiencies, IT must adopt a “business box” approach—a framework that emphasizes activities and applications that serve clear business purposes while discarding those that don’t. Here’s how this concept can be applied:

  • Focus on Outcomes: Evaluate projects and applications based on their contribution to business objectives rather than internal IT concerns.
  • Illustrate Gaps in Business Terms: Express risks and inefficiencies in language that resonates with stakeholders—through metrics tied to revenue, profit, and risk.

Key Business Priorities for Public Entities:

  1. Revenue: Maximizing and sustaining income streams.
  2. Profit: Ensuring profitability by controlling costs and increasing efficiency.
  3. Risk: Minimizing threats to revenue or profitability.

Fostering Business-Centric IT in Government and Non-Profits

Government and non-profit organizations have distinct priorities: achieving their mission, aligning with political directives, and managing operations within strict budgetary constraints. IT in these sectors often faces unique challenges, including limited resources and diverse stakeholder demands. To overcome these obstacles, leaders must shift their focus from micromanaging tasks or outsourcing IT functions to empowering their teams through education and alignment with the “business box” concept.

The Current State of IT: The Problem of Operational Debt

Over the past three years, my work as a Staff Solutions Architect at VMware has exposed me to IT strategies across various Fortune 500 companies. One recurring theme in these engagements is the industry’s focus on creating a stable and secure initial state. While crucial, this emphasis often neglects the operational realities post-deployment. Each “initial state” creates operational debt, which compounds over time due to the dynamic nature of IT systems. This debt accounts for approximately 70% of operational spending, limiting innovation and agility in IT organizations.

Three significant factors exacerbate operational debt:

  1. Automation of Provisioning
  2. Public Cloud Adoption
  3. Dynamic Nature of Containers

Key Drivers of Operational Debt

1. Automation of Provisioning

Provisioning automation aims to reduce delivery times for development and production assets. While it increases agility, it often accelerates the accumulation of operational debt due to insufficient governance. Enhanced self-service capabilities increase consumption, compounding long-term operational demands.

2. Public Cloud Adoption

Initially adopted for cost savings, public cloud services now appeal for other reasons:

  • Reduction of infrastructure operational debt.
  • Access to unique services like machine learning and serverless functions.
  • Proximity to data already in the cloud.

Although public clouds abstract infrastructure components into software and streamline operations, they still focus on the initial state. Operational costs and vendor lock-in remain critical concerns.

3. Dynamic Nature of Containers

Containers promise immutability and agility, but they introduce new challenges:

  • Increased observability and management complexity.
  • Rapid proliferation of containerized microservices, expanding operational scope.
  • Short-lived container lifespans (minutes or hours) exacerbate operational demands.

For example, Google addressed container-driven complexity by creating the Site Reliability Engineer (SRE) role, blending development and operations to scale effectively.


Strategies for Reducing Operational Debt

1. Recognizing the Problem

Operational debt is split into two categories:

  • Common Operational Debt: Shared across organizations (e.g., patching, monitoring, hardware refresh).
  • Unique Operational Debt: Specific to the organization’s processes or systems.

2. Identifying Toil Tasks

Use the following criteria to identify toil tasks:

  • Repetition: Use ticketing systems to track common tasks.
  • No Human Judgment Needed: Tasks requiring no decision-making or creativity.
  • Interrupt-Driven: Reactive tasks triggered by tickets or notifications.

3. Automating Toil Tasks

Once identified, prioritize automation of repetitive tasks. Focus on transitioning these tasks from human operators to automated systems, reducing latency and improving efficiency.

4. Adopting Service-Oriented Models

Operational tasks should be automated as part of the service deployment. This integration minimizes toil and aligns operations with service delivery.

A Roadmap to Address Operational Debt

The steps include:

  1. Implement Software Abstraction – Adopt software abstraction to enable automation and eliminate infrastructure debt.
  2. Prioritize Toil Automation – Leverage ticketing systems to create a prioritized list of repetitive tasks for automation.
  3. Transition to Declarative Models – Shift to declarative infrastructure models to enforce expected states post-deployment, reducing the need for manual oversight.
  4. Continually Reduce Toil – Even with declarative models, ongoing effort is required to address residual operational debt.

The Business Impact of Operational Debt Reduction

Organizations that adopt these strategies report up to a 50% reduction in operational costs, allowing IT to shift from a cost center to a strategic business enabler. This approach not only drives innovation but also enhances agility, empowering IT to better support organizational goals.

By proactively addressing operational debt, IT can unlock sustained efficiency gains and long-term value, transforming the enterprise’s approach to technology management.

Operational Docker: Removing the extra layers when you build containers

Building containers creates many different layers. When you change an element of the build, every following layer has to be rebuilt because it can no longer be taken from the cache. Here is a simple example I have used before that has far too many layers:

docker build -t test .
Sending build context to Docker daemon  5.632kB
Step 1/7 : FROM ubuntu:latest
 ---> 7698f282e524
Step 2/7 : RUN echo "Test"
 ---> Using cache
 ---> 75ac3bfbeaba
Step 3/7 : COPY 1 .
 ---> Using cache
 ---> d457a7492d2c
Step 4/7 : ADD 2 .
 ---> Using cache
 ---> 1c3284c1e6a0
Step 5/7 : ADD 3 .
 ---> Using cache
 ---> 96ab91bcf3df
Step 6/7 : ADD 4 .
 ---> Using cache
 ---> 2889643b631b
Step 7/7 : ADD 5 .
 ---> Using cache
 ---> adb6797fe48a
Successfully built adb6797fe48a
Successfully tagged test:latest

If you check for dangling images (images no longer connected to an active tag) after my build, there are none:

docker images -f dangling=true
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE

Now I am going to modify the file that is copied at step 3/7 so that it triggers a rebuild:

docker build -t test .
Sending build context to Docker daemon   5.12kB
Step 1/7 : FROM ubuntu:latest
 ---> 7698f282e524
Step 2/7 : RUN echo "Test"
 ---> Using cache
 ---> 75ac3bfbeaba
Step 3/7 : COPY 1 .
 ---> Using cache
 ---> d457a7492d2c
Step 4/7 : ADD 2 .
 ---> a93a35a37ebe
Step 5/7 : ADD 3 .
 ---> b90d01ee0806
Step 6/7 : ADD 4 .
 ---> 43af309b28d8
Step 7/7 : ADD 5 .
 ---> 540007e7c833
Successfully built 540007e7c833
Successfully tagged test:latest

Now if we check for dangling images, we see the old image whose layer was replaced at step 3/7:

docker images -f dangling=true
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
<none>              <none>              adb6797fe48a        10 minutes ago      69.9MB

You can identify where these dangling layers are stored by doing docker image inspect:

docker image inspect test
[
    {
        "Id": "sha256:540007e7c833d09da0edaad933d4126075bffa32badb1170da363e1e1f220c4c",
        "RepoTags": [
            "test:latest"
        ],
        "RepoDigests": [],
        "Parent": "sha256:43af309b28d81faa983b87d2db2f64b27b7658f93639f10b07d53b50dded7c45",
        "Comment": "",
        "Created": "2019-06-19T02:46:38.5065201Z",
        "Container": "",
        "ContainerConfig": {
            "Hostname": "",
            "Domainname": "",
            "User": "",
            "AttachStdin": false,
            "AttachStdout": false,
            "AttachStderr": false,
            "Tty": false,
            "OpenStdin": false,
            "StdinOnce": false,
            "Env": [
                "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
            ],
            "Cmd": [
                "/bin/sh",
                "-c",
                "#(nop) ADD file:f5b9e73db3de1fef2d430837d0d31af0af9c9405c64349def90530b2fc8ca6d2 in . "
            ],
            "ArgsEscaped": true,
            "Image": "sha256:43af309b28d81faa983b87d2db2f64b27b7658f93639f10b07d53b50dded7c45",
            "Volumes": null,
            "WorkingDir": "",
            "Entrypoint": null,
            "OnBuild": null,
            "Labels": null
        },
        "DockerVersion": "18.09.2",
        "Author": "",
        "Config": {
            "Hostname": "",
            "Domainname": "",
            "User": "",
            "AttachStdin": false,
            "AttachStdout": false,
            "AttachStderr": false,
            "Tty": false,
            "OpenStdin": false,
            "StdinOnce": false,
            "Env": [
                "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
            ],
            "Cmd": [
                "/bin/bash"
            ],
            "ArgsEscaped": true,
            "Image": "sha256:43af309b28d81faa983b87d2db2f64b27b7658f93639f10b07d53b50dded7c45",
            "Volumes": null,
            "WorkingDir": "",
            "Entrypoint": null,
            "OnBuild": null,
            "Labels": null
        },
        "Architecture": "amd64",
        "Os": "linux",
        "Size": 69859108,
        "VirtualSize": 69859108,
        "GraphDriver": {
            "Data": {
                "LowerDir": "/var/lib/docker/overlay2/77fcf68e04407dd297d93203e242e52cff942b2d8508b3b6e4db2f62d53f38bc/diff:/var/lib/docker/overlay2/dd3f244fb32b847a5d8b2b18e219142bbe3b2e61fe107e290c144154b4513d5e/diff:/var/lib/docker/overlay2/839011eb0edb75c6fedb5e4a9155de2e4bd6305d233ed085d40a4e5166328736/diff:/var/lib/docker/overlay2/3dc3a1f4d37b525f672196b55033c6b82e4ddfc20de16aa803c36fe2358bcb32/diff:/var/lib/docker/overlay2/6e11b07d20377b78ee134a037fc6e661364d273d861419eb77126d0d228abbf0/diff:/var/lib/docker/overlay2/f8c5f20e6ebd1ec759101d926a5101a36ef2378af828ef57a0f8e4a8a467f76f/diff:/var/lib/docker/overlay2/77a101af01c69427ced57be20f01d4a6a688ff2b13d50260be7a7fda1bd7fbf5/diff",
                "MergedDir": "/var/lib/docker/overlay2/28f3e078ee98a327274c59c653858559ad81b866a40e62a70cca989ee403f2a6/merged",
                "UpperDir": "/var/lib/docker/overlay2/28f3e078ee98a327274c59c653858559ad81b866a40e62a70cca989ee403f2a6/diff",
                "WorkDir": "/var/lib/docker/overlay2/28f3e078ee98a327274c59c653858559ad81b866a40e62a70cca989ee403f2a6/work"
            },
            "Name": "overlay2"
        },
        "RootFS": {
            "Type": "layers",
            "Layers": [
                "sha256:02571d034293cb241c078d7ecbf7a84b83a5df2508f11a91de26ec38eb6122f1",
                "sha256:270f934787edf0135132b6780cead0f12ca11690c5d6a5d395e44d290912100a",
                "sha256:8d267010480fed7e616b9b7861854042aad4ef5e55f8771f2c738061640d2cb0",
                "sha256:ea9703e9d50c6fdd693103fee05c65e8cc25be44c6e6587dd89c6559d8df2de7",
                "sha256:69d3f4708a57a9355cf65a99274e6b79788a052564c4fb0fd90f5283c109946a",
                "sha256:d18953dc7e1eef0e19b52db05c2ff34089e9f1166766c8f57b8475db5a3c79b8",
                "sha256:f1ce2d9ca96cc9cd13caab945986580eae2404e87d81b1b485b12ee242c37889",
                "sha256:aeb58c1f315c5baacbe4c2db1745dec548753197e2b251a958704addfd33a8c2"
            ]
        },
        "Metadata": {
            "LastTagTime": "2019-06-19T02:46:38.5547836Z"
        }
    }
]

You can see that all the layers are stored in /var/lib/docker/overlay2. Using the dangling image ID you can locate how much space is now wasted on your hard drive. You can remove these dangling images with:

docker rmi $(docker images -f dangling=true -q)

These dangling images can eat up a ton of space on your build machine, so you should automate the cleanup process to avoid wasting space.
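
One lightweight way to automate that cleanup is a scheduled prune job. The sketch below assumes a nightly window is acceptable on your build machine and that the cron user can talk to the Docker daemon:

# Remove dangling images immediately (same effect as the rmi command above)
docker image prune -f

# Example crontab entry to prune dangling images every night at 02:00
0 2 * * * /usr/bin/docker image prune -f > /dev/null 2>&1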

Learning Docker: Image Layers and Cache Best Practices

If you Google Dockerfile or learning Docker, you will be assaulted with example Dockerfiles to run in your environment. Many are missing a basic understanding of how a Dockerfile operates. Its layering technology and cache provide a host of best practices to consider when building your ideal state.

Layers:

Each layer of a container image is read-only except the final layer, which is applied during the docker run command. In older versions of Docker it was critical to minimize the number of layers to ensure performance. Layers are added by the following commands:

  • RUN, COPY, ADD, FROM

All other commands just create intermediate images which are thrown away after the build. You can also use multi-stage builds to copy only the required artifacts into the final image. A few examples to illustrate the impact of layers:

First start with a simple Dockerfile

FROM ubuntu:latest

Create an image from this file:

docker build -t test .
Sending build context to Docker daemon  4.608kB
Step 1/1 : FROM ubuntu:latest
 ---> 7698f282e524
Successfully built 7698f282e524
Successfully tagged test:latest

We have a single step, which means only one layer, and that layer became our final image. Time to add one more layer:

FROM ubuntu:latest
RUN echo "Test"

Creating the image we now have two steps and two layers:

docker build -t test .
Sending build context to Docker daemon  4.608kB
Step 1/2 : FROM ubuntu:latest
 ---> 7698f282e524
Step 2/2 : RUN echo "Test"
 ---> Running in 7f4aba5459b1
Test
Removing intermediate container 7f4aba5459b1
 ---> 57fda831491f
Successfully built 57fda831491f
Successfully tagged test:latest

I created a number of zero-byte files using touch:

touch 1 2 3 4 5

Adding these one at a time using ADD or COPY creates multiple layers:

FROM ubuntu:latest
RUN echo "Test"
COPY 1 .
ADD 2 .
ADD 3 .
ADD 4 .
ADD 5 .

Building the image:

docker build -t test .
Sending build context to Docker daemon  4.608kB
Step 1/7 : FROM ubuntu:latest
 ---> 7698f282e524
Step 2/7 : RUN echo "Test"
 ---> Using cache
 ---> 57fda831491f
Step 3/7 : COPY 1 .
 ---> 1025060f36d4
Step 4/7 : ADD 2 .
 ---> 35cff57055a1
Step 5/7 : ADD 3 .
 ---> 0357c97e0c37
Step 6/7 : ADD 4 .
 ---> 389612774b90
Step 7/7 : ADD 5 .
 ---> de67547a97df
Successfully built de67547a97df
Successfully tagged test:latest

We now have seven layers in the image. These statements can be consolidated to reduce the number of layers. For this example I will only consolidate 4 and 5.

FROM ubuntu:latest
RUN echo "Test"
COPY 1 .
ADD 2 .
ADD 3 .
ADD 4 5 /

Build image:

docker build -t test .
Sending build context to Docker daemon  4.608kB
Step 1/6 : FROM ubuntu:latest
 ---> 7698f282e524
Step 2/6 : RUN echo "Test"
 ---> Using cache
 ---> 57fda831491f
Step 3/6 : COPY 1 .
 ---> Using cache
 ---> 1025060f36d4
Step 4/6 : ADD 2 .
 ---> Using cache
 ---> 35cff57055a1
Step 5/6 : ADD 3 .
 ---> Using cache
 ---> 0357c97e0c37
Step 6/6 : ADD 4 5 /
 ---> 856f9a3a90d8
Successfully built 856f9a3a90d8
Successfully tagged test:latest

As you can see, we have one less intermediate layer by combining the last two ADD statements. Many of the layers were pulled from cache because they didn't change. When we use a COPY command we have to be careful, because the cache will be invalidated if the file changes. I am going to add the text "hello" to file 1, which is added via COPY. Notice the impact on the other layers:

docker build -t test .
Sending build context to Docker daemon   5.12kB
Step 1/6 : FROM ubuntu:latest
 ---> 7698f282e524
Step 2/6 : RUN echo "Test"
 ---> Using cache
 ---> 57fda831491f
Step 3/6 : COPY 1 .
 ---> 2e9f2b068ab4
Step 4/6 : ADD 2 .
 ---> 7a8132435424
Step 5/6 : ADD 3 .
 ---> d6ced004f0e1
Step 6/6 : ADD 4 5 /
 ---> 1b2b9be67d0f
Successfully built 1b2b9be67d0f
Successfully tagged test:latest

Notice that every layer after step 3 cannot be built from cache, because the COPY file has changed, invalidating all later layers. For this reason, you should place COPY and ADD lines toward the end of a Dockerfile. Building layers is an expensive, time-consuming operation, so we need to limit the number of layers that change. The best version of this Dockerfile is this:

FROM ubuntu:latest
RUN echo "Test"
COPY 1 2 3 4 5 /

I combined the ADD and COPY statements into a single COPY because neither was doing anything different (use COPY for local files and ADD when the source is remote or a tar archive). When you build the image you now have the fewest possible layers:

docker build -t test .
Sending build context to Docker daemon  5.632kB
Step 1/3 : FROM ubuntu:latest
 ---> 7698f282e524
Step 2/3 : RUN echo "Test"
 ---> Using cache
 ---> 57fda831491f
Step 3/3 : COPY 1 2 3 4 5 /
 ---> 0f500aea029d
Successfully built 0f500aea029d
Successfully tagged test:latest

Now we only have three layers doing the same thing as seven before.  

Cache:

In the previous section we demonstrated how the cache gets used, but it's important to understand what types of actions trigger a rebuild instead of a cache hit:

  • All cached layers below a changed layer are invalidated (the invalidation cascades down)
  • Any change to a RUN instruction invalidates the cache (changing RUN apt-get install bob -y to RUN apt-get install bob -yq forces a rebuild)
  • For ADD and COPY, a checksum of each file's contents is compared; if the contents change, the cache is invalidated (last-accessed and last-modified times are not part of the checksum)
  • Only RUN, COPY, and ADD create layers; all other instructions create temporary intermediate images

This list illustrates one of the largest problems with the cache. ubuntu:latest will change as new versions are released, but if you have it cached it will not be updated from the repository. RUN commands whose text has not changed will not be re-executed. For example, if you have the following in your Dockerfile:

RUN apt-get upgrade -qy

On the first run, Docker will execute that command in the container and cache the resulting layer. This is a point-in-time cached layer. If you run the upgrade a week from today the image should change, yet because it's a cached layer you don't get the new updates. This is the danger of the cache. You can force a rebuild of cached layers with:

--no-cache
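
For example, to force our earlier test image to rebuild every layer instead of reusing the cache, you would run:

docker build --no-cache -t test .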

One command that can help you understand the inner workings of your docker images is the history command:

docker history test
IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
5b55f21f1701        21 minutes ago      /bin/sh -c #(nop) COPY multi:3b9dfb231e0b141…   12B
75ac3bfbeaba        21 minutes ago      /bin/sh -c echo "Test"                          0B
7698f282e524        4 weeks ago         /bin/sh -c #(nop)  CMD ["/bin/bash"]            0B
<missing>           4 weeks ago         /bin/sh -c mkdir -p /run/systemd && echo 'do…   7B
<missing>           4 weeks ago         /bin/sh -c rm -rf /var/lib/apt/lists/*          0B
<missing>           4 weeks ago         /bin/sh -c set -xe   && echo '#!/bin/sh' > /…   745B
<missing>           4 weeks ago         /bin/sh -c #(nop) ADD file:1f4fdc61e133d2f90…   69.9MB

Learning Docker: Create your own micro-image

In the last article I wrote about how you can create your own small image using the Docker scratch image. The scratch image has the ability to execute basic binary files. I assume that you will have some code base that is compiled and inserted into the scratch image. To do this, you can maintain a build machine that creates Linux executables, or you can use another Docker image to create the binary and copy it into the scratch image. This is known as a multi-stage build, and it produces the smallest possible end-state container. The whole process can be done in a single Dockerfile. Let's start with a basic C program that prints Hello from Docker when executed:

#include <stdio.h>
  
int main() {
    printf("Hello from Docker\n");
    return 0;
}

This should be saved in the current directory as hello.c. We then need to build a machine with gcc to compile the C program into a binary. We will call this machine builder. The Dockerfile for builder looks like this:

FROM ubuntu:latest AS builder
# Install gcc
RUN apt-get update -qy
RUN apt-get upgrade -qy
RUN apt-get install build-essential -qy
COPY hello.c .
# Build a static binary named hello
RUN gcc -o hello -static hello.c

This does the following:

  • Use ubuntu:latest as the image
  • RUN the commands to update and upgrade base operating system (-qy is to run quiet (-q) and answer yes (-y) to all questions)
  • RUN the command to install build-essential which includes the gcc binary and libraries
  • COPY the file hello.c from the local file system into current directory
  • RUN gcc to compile hello.c into hello – this step is critical because the -static flag tells the compiler to include all required libraries; without it the executable will fail in scratch while looking for dynamically linked libraries

Let's manually test these commands and the static linking inside a container built from a small Dockerfile:

FROM ubuntu:latest

Now let's turn this into a container and test our commands to make sure we have the right steps, in the right order, to create our builder container:

docker build -t builder .

This will build a container image called builder from ubuntu:latest on Docker Hub. Now let's run an instance of this container and give it a try.

docker run -it builder /bin/bash

You are now connected to the container, and you can test all your commands to ensure they work:

apt-get update -qy
apt-get upgrade -qy
apt-get install build-essential -qy
# We cannot run the COPY command inside a running container, so install vim (only for this test) and create hello.c by hand
apt-get install vim -qy
# Paste the contents of hello.c into a file named hello.c using vim
# COPY hello.c .
# Build the binary (dynamically linked for now)
gcc -o hello hello.c

Let's check if hello has dependencies on dynamically linked libraries:

root@917d6b3c9ea9:/# ldd hello
	linux-vdso.so.1 (0x00007ffc35dbe000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fa76c376000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fa76c969000)

As you can see, it has dynamically linked libraries; those will not work in scratch because they will not exist there. Let's statically link them using this command:

gcc -o hello -static hello.c
root@917d6b3c9ea9:/# ldd hello
	not a dynamic executable

As you can see, making sure we are not dynamically linking executables is critical. Now that we know we have a working builder, we can take the executable and copy it into the scratch container for a very small container image. This same process could be used to make very fast-acting functions as a service on demand.

FROM scratch
# Copy our static executable.
COPY --from=builder hello /
# Run the hello binary.
ENTRYPOINT ["/hello"]

This takes the hello binary from the builder stage and puts it into our final image. Putting it all together in a single Dockerfile looks like this:

FROM ubuntu:latest AS builder
# Install gcc
RUN apt-get update -qy
RUN apt-get upgrade -qy
RUN apt-get install build-essential -qy
#COPY the hello.c file from OS
COPY hello.c .
# Build the binary.
RUN gcc -o hello -static hello.c
FROM scratch
# Copy our static executable.
COPY --from=builder hello /
# Run the hello binary.
ENTRYPOINT ["/hello"]

Build the container which we will call csample using this command:

docker build -t csample .

Sending build context to Docker daemon  3.584kB
Step 1/9 : FROM ubuntu:latest AS builder
 ---> 7698f282e524
Step 2/9 : RUN apt-get update -qy
 ---> Using cache
 ---> 04915027a821
Step 3/9 : RUN apt-get upgrade -qy
 ---> Using cache
 ---> 998ea043503f
Step 4/9 : RUN apt-get install build-essential -qy
 ---> Using cache
 ---> e8e3631eaba6
Step 5/9 : COPY hello.c .
 ---> Using cache
 ---> 406ad6aafe8f
Step 6/9 : RUN gcc -o hello -static hello.c
 ---> Using cache
 ---> 3ebd38451f71
Step 7/9 : FROM scratch
 ---> 
Step 8/9 : COPY --from=builder hello /
 ---> Using cache
 ---> 8e1bcbc0d012
Step 9/9 : ENTRYPOINT ["/hello"]
 ---> Using cache
 ---> 5beac5519b31
Successfully built 5beac5519b31
Successfully tagged csample:latest

Try starting csample with docker:

docker run csample
Hello from Docker

As you can see we have now used a container to build the executable for our container.
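
To see how small the result is, you can compare the builder image with the final scratch-based image. The exact sizes will vary with your base image and compiler, so none are shown here:

docker images builder
docker images csample
# csample contains only the static hello binary, while builder carries the full Ubuntu toolchain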

Learning Docker: Creating your own base image

Docker images are compiled in layers using a set of instructions contained in a text file called a Dockerfile. Every container image starts with a base; in many cases this base image is pulled from Docker Hub or your own repository. When creating your own base image you have two choices: build one yourself or use scratch.

Scratch

Scratch is built into Docker and is provided as a minimal Linux environment that cannot do anything on its own. If you have a compiled binary that will work in the container, scratch may be the perfect minimal base. Do not expect scratch to have a package manager or even a command line. For our example, let's assume we have a basic C program called hello that is already compiled:

FROM scratch
ADD hello /
CMD ["/hello"]

This would start the container, run the executable, and end the container. My base container size with scratch alone is 1.84 kilobytes.
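
As a quick sketch (the tag name hello-scratch is just an example), building and running this image looks like:

docker build -t hello-scratch .
docker run --rm hello-scratch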

Building your own Image

Building your own image starts with installing the target operating system. Since Red Hat and Ubuntu seem to be the most common operating systems today, I'll provide instructions for both. It is possible to build minimal base images without package managers, but these instructions produce multi-purpose base images. In both cases the process installs a minimal version of the operating system into a subdirectory and then builds the Docker image from that directory.

Ubuntu

Debian-based systems make it really easy with the debootstrap command, which is available from the standard Ubuntu repositories. We will set up the image using Ubuntu 19.04 Disco Dingo.

sudo debootstrap disco disco > /dev/null
sudo tar -C disco -c . | docker import - disco

You now have a docker image called disco that is a minimal Ubuntu 19.04.

docker images
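
To verify the new base image works, you can start a shell in it as a quick sanity check:

docker run -it disco /bin/bash
cat /etc/os-release   # should report Ubuntu 19.04
exit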

Red Hat / CentOS

I'll use CentOS since I don't personally own any Red Hat licenses, but the process is exactly the same. This will build the same version of the OS you are currently running. You will need to change the RPM-GPG-KEY to match your version of CentOS. I read about the CentOS process in this article.

# Create a folder for our new root structure
export centos_root='/image/rootfs'

mkdir -p $centos_root

rpm --root $centos_root --initdb

yum reinstall --downloadonly --downloaddir . centos-release

rpm --root $centos_root -ivh --nodeps centos-release*.rpm

rpm --root $centos_root --import  $centos_root/etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7

yum -y --installroot=$centos_root --setopt=tsflags='nodocs' --setopt=override_install_langs=en_US.utf8 install yum  

sed -i "/distroverpkg=centos-release/a override_install_langs=en_US.utf8\ntsflags=nodocs" $centos_root/etc/yum.conf

cp /etc/resolv.conf $centos_root/etc

mount -o bind /dev $centos_root/dev

#Enter the file system for our image to clean the yum cache; type exit to leave the chroot
chroot $centos_root /bin/bash 

# Run this command then type exit to leave the chroot
yum clean all

rm -f $centos_root/etc/resolv.conf

umount $centos_root/dev

#Create the docker image
tar -C $centos_root -c . | docker import - centos

You now have an image called centos.

Put it together

Building your own images ensures that no one can put something unexpected into your image. Scratch is a great way to run minimal containers that are very small. If you need a fuller operating system you can use Ubuntu or CentOS.

Installing Docker on Linux

There are literally hundreds of guides on the internet for installing Docker on Linux. I wanted to provide brief guides on how to install it on CentOS and Ubuntu. You can always find the latest guides at docs.docker.com. This guide covers installing the community edition of Docker. In some cases you might want to install the vendor-provided version (for example from Red Hat); in that case follow your vendor's recommendations.

Install Docker CE on CentOS (Red Hat)

First install dependencies:

sudo yum install -y yum-utils device-mapper-persistent-data lvm2

Then add the repository for Docker CE:

sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo

Now install the server, cli and drivers

sudo yum install docker-ce docker-ce-cli containerd.io

Start up the server

sudo systemctl start docker

Enable server to start at boot time

sudo systemctl enable docker

Installing Docker CE on Ubuntu

First install dependencies:

sudo apt-get install apt-transport-https ca-certificates curl software-properties-common

Then add the repository for Docker CE:

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"

Now refresh the package index and install the server, cli and drivers

sudo apt-get update
sudo apt-get install docker-ce

Start up the server

sudo systemctl start docker

Enable server to start at boot time

sudo systemctl enable docker

Testing Docker

Test docker by checking the installed version

docker --version

and running a basic container

docker run hello-world
Running hello-world prints a short message that confirms Docker is working.

Architecture of Docker

I have been spending a lot of my free time the last few months learning Kubernetes. Currently most implementations of Kubernetes use Docker as their container runtime. I wanted to share some of the knowledge I gained along the way. Since I claim to be an architect, I wanted to start with the basic architecture of Docker.

What is a container?

It is a segmented process that contains only the required elements to complete its expected job. While a normal operating system has many libraries available to make it flexible, a container has only the runtime and libraries required to do its function. This reduced scope makes containers small and independent of the underlying operating system. The segmentation is enforced by the container server, which runs as a process on another operating system.

Architecture of Docker

Docker is a server that runs a process called dockerd. This server provides a REST API for creating, managing and running containers. For ease of management, Docker provides the docker command line interface to interact with the REST API. There is a company called Docker that provides a supported version called Docker Enterprise. Most people seem to use Docker Community Edition, which is licensed under the Apache 2.0 license.

What is a registry?

A registry is a place to store container images. Docker maintains Docker Hub, a huge public registry. Anyone can publish an image to Docker Hub, allowing anyone else to consume it. Many companies choose to use a private registry to protect their company data and applications. Docker has two functions for working with a registry, push and pull (a short example follows the list):

  • Push – sends a local image to the registry
  • Pull – retrieves an image from the registry and stores it locally
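
As a quick illustration (the repository name myrepo/myimage is just a placeholder), pushing and pulling look like this:

# Tag a local image and push it to a registry
docker tag test myrepo/myimage:1.0
docker push myrepo/myimage:1.0

# Pull the image down on another machine
docker pull myrepo/myimage:1.0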

What is a docker image?

Docker images are built in layers and are read-only. Each layer in an image can be based on a previous image or on some unique customization. Images are compiled from a set of instructions stored in a file called a Dockerfile.

Basic Dockerfile

This Dockerfile defines a basic image that does nothing but ping google.com forever. When compiled, this image has three layers (the full Dockerfile is reconstructed after the layer breakdown):

Layer 1: FROM ubuntu:latest

  • Use the ubuntu base operating system with the tag of latest

Layer 2: RUN apt-get update -q && apt-get install -qy iputils-ping

  • Execute the command listed above that updates the operating system and installs iputils-ping

Layer 3: CMD ["ping", "google.com"]

  • Run the command ping google.com forever
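
Reconstructed from the three layers described above, the full Dockerfile looks like this:

FROM ubuntu:latest
RUN apt-get update -q && apt-get install -qy iputils-ping
CMD ["ping", "google.com"]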

Once compiled, this new image can be uploaded to a repository as a new image for others to use.

What is a container?

It is a runnable image. Images can be stored locally or in a remote repository. Once you start running an image, it becomes a unique, writable container. All changes are unique to that instance of the container and are not written back to the image. You can spawn hundreds or thousands of containers from a single image.

What about isolation?

Isolation is critical otherwise the container is just a process on an operating system. This isolation in docker is provided by three things:

  • namespaces – makes a container look and feel like a separate machine
  • cgroups – A way to group processes together and apply resource limits
  • capabilities – superuser privileges that can be enabled or disabled for a process

So cgroups group processes together and apply resource limits, while namespaces give each group its own isolated view of resources such as the network and filesystem, and capabilities trim the privileges available inside that view. Together these provide the impression of a separate, isolated machine.
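
You can see cgroups and capabilities in action directly from the docker command line; the limits below are arbitrary examples:

# cgroups: cap the container at half a CPU and 256 MB of RAM
# capabilities: drop the ability to create raw sockets
docker run --rm --cpus=0.5 --memory=256m --cap-drop=NET_RAW ubuntu:latest echo "constrained container"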

What about networking?

Containers need to talk to the outside world, so networking is implemented through the network namespace alongside the other Linux namespaces Docker uses. Early Docker networking was very limited; as an active open source project it continues to get better. I will skip the deep dive on Docker networking since it is mostly not part of Kubernetes.

Why do I care?

An honest question. Containers enable very rapid deployment of new code. They allow the implementation of microservices, which in turn should improve the rate at which new features ship. So it's really about speed. As a simple comparison, the fact that I could set up this WordPress blog in 15 seconds with Docker should help you understand the speed on offer.

Imperative vs Declarative IT

This seems to come up a lot in discussions, so I wanted to provide my view on the differences. Imperative is focused on the steps required to reach an outcome. Declarative is focused on defining the end state without specifying the steps. To illustrate the differences I like to use visuals.

Imperative

In the imperative model we build our lunch by assembling the various components ourselves. In this model we can have many specialists involved to ensure we have the best product in our lunch: a cheese specialist ensuring awesome cheese, a meat specialist choosing prime cuts. When someone comes to the bar to assemble their lunch, chaos becomes reality. I may not fully understand the flavors provided by the specialists and choose to assemble a mess. If I include every specialist in the assembly I am likely to get a great sandwich, but that process cannot scale. The imperative model is thus focused on the individual steps to produce an outcome.

Declarative

In the declarative model the end state is defined and the system is trusted to produce the outcome. The meal above represents my request for a good dinner. I was not concerned with plating or cooking; I just wanted a meal.

Why should you care about my lunch/dinner?

Allow me to illustrate the value of declarative models in detail. Let us assume you have two switches and two routers in your network:

In the imperative model we are hyper-focused on the steps to make this networking construct redundant. It may involve linking switches and using BGP or OSPF to ensure optimal paths. This added human complexity provides a job for many people. Now let's examine the functional differences between two options:

Functionally, assuming the user can detect upstream failures, there is no difference between the two options: you avoid a single point of failure and communication continues. In a declarative model, all we would need to define is an IP path from the user to the router with no single point of failure. Kubernetes implements a declarative model that creates the initial state and ensures the desired state continues until it is changed, which is the real power of declarative models. For example, let's look at this application definition:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: site
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: front-end
          image: nginx
          ports:
            - containerPort: 80
        - name: reader
          image: nginx
          ports:
            - containerPort: 88

This declarative YAML creates a deployment of two pods, each running two containers (front-end and reader). If you manually remove one of these pods, a new pod is deployed to ensure the declared state still exists. By implementing declarative models we can ensure desired state long after deployment, which imperative models cannot.
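
A minimal sketch of that declarative loop, assuming the definition above is saved as site.yaml:

# Apply the desired state
kubectl apply -f site.yaml

# Two pods are created; delete one to simulate a failure
kubectl get pods
kubectl delete pod <one-of-the-site-pods>

# The Deployment controller notices the drift and recreates the pod to match the declared state
kubectl get pods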