DevSecOps for ML
Ian Hellström | 21 April 2021 | 3 min read
Go distroless and reduce the image size as well as the number of CVEs for machine learning containers.
Smaller container images suit the lean data and machine learning operations philosophy, but there is another reason why going distroless makes sense: fewer common vulnerabilities and exposures (CVEs). That leaves two questions:
- How can you scan for vulnerabilities?
- How do you create a distroless base image for Python-based frameworks?
There are several vulnerability scanners for containers, but for the purposes of this post I shall use Anchore Grype, which is FOSS and comes highly recommended.
In what follows, I shall refer to the variables IMAGE and DISTROLESS_IMAGE, the standard base image and the distroless version of it, respectively.
I shall use TensorFlow 2.4.1 as an example, but the same techniques also work for other machine learning frameworks, including of course PyTorch.
As of this writing, 2.4.1 is the latest stable release of TensorFlow.
VERSION="2.4.1"
IMAGE="tensorflow/tensorflow:$VERSION"
DISTROLESS_IMAGE="databaseline/tensorflow-cpu:$VERSION"
Docker comes with its own scanner. To see it in action, we can run it against the official TensorFlow Docker image, which weighs 1.57 GB:
docker scan "$IMAGE"
When I ran it, Docker uncovered 73 vulnerabilities, although the only high-severity one was in OpenSSL:
✗ High severity vulnerability found in openssl/libssl1.1
  Description: NULL Pointer Dereference
  Info: https://snyk.io/vuln/SNYK-UBUNTU1804-OPENSSL-1089073
  Introduced through: meta-common-packages@meta, email@example.com~ubuntu1.18.04.4
  From: meta-common-packages@meta > firstname.lastname@example.org~18.04.7
  From: email@example.com~ubuntu1.18.04.4 > firstname.lastname@example.org > email@example.com~18.04.7
  Fixed in: 1.1.1-1ubuntu2.1~18.04.9
Grype scans an image with the following command:
grype "$IMAGE"
It found 275 vulnerabilities, with the following 15 high-severity vulnerabilities:
NAME            INSTALLED                 FIXED-IN                  VULNERABILITY
cryptography    2.1.4                     2.3                       GHSA-fcf9-3qw3-gxmj
flatbuffers     1.12                                                CVE-2020-35864
libssl1.1       1.1.1-1ubuntu2.1~18.04.7  1.1.1-1ubuntu2.1~18.04.9  CVE-2021-3449
linux-libc-dev  4.15.0-134.138            4.15.0-139.143            CVE-2021-27365
linux-libc-dev  4.15.0-134.138            4.15.0-140.144            CVE-2020-27170
linux-libc-dev  4.15.0-134.138            4.15.0-140.144            CVE-2020-27171
openssl         1.1.1-1ubuntu2.1~18.04.7  1.1.1-1ubuntu2.1~18.04.9  CVE-2021-3449
pip             20.2.4                                              CVE-2018-20225
pip             9.0.1                                               CVE-2018-20225
pip             9.0.1                                               CVE-2019-20916
protobuf        3.14.0                                              CVE-2015-5237
pycrypto        2.6.1                                               CVE-2018-6594
pyxdg           0.25                                                CVE-2019-12761
pyxdg           0.25                      0.26                      GHSA-r6v3-hpxj-r8rv
urllib3         1.26.2                    1.26.4                    GHSA-5phf-pp7p-vc2r
CVE-2021-3449 is the same vulnerability discovered by docker scan.
Note that there were no critical vulnerabilities found.
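In an automated pipeline, the same scan can act as a gate rather than a report. The sketch below wraps Grype's --fail-on option, which makes the command exit with a non-zero status when a finding at or above the given severity is present. The function name scan_gate and the GRYPE override variable are hypothetical, introduced only for illustration.

```shell
# Hypothetical wrapper: fail a pipeline step when Grype reports findings at or
# above a severity threshold. GRYPE is an assumed override for the binary path.
scan_gate() {
  local image="$1"
  local threshold="${2:-high}"
  # --fail-on makes grype exit non-zero for findings at or above the threshold.
  "${GRYPE:-grype}" "$image" --fail-on "$threshold"
}

# Usage (requires grype and access to the image):
#   scan_gate "$IMAGE" critical || echo "critical vulnerabilities found"
```

With a threshold of high, the official image scanned above would fail the gate, whereas a threshold of critical would let it pass.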
Distroless Base Image for ML
Since distroless images ship without a package manager or shell, a multi-stage build is needed: one stage generates the required artifacts, which are then copied to the distroless base image in a subsequent stage.
In the Dockerfile shown below, there is a requirements.txt file.
It allows any dependencies to be included, as you cannot pip install into the distroless image afterwards.
Here, the requirements file contains only a single line, namely the TensorFlow package.
# Stage 1: install the Python dependencies with pip
FROM python:3.7-slim AS py
WORKDIR /app
COPY requirements.txt requirements.txt
RUN python3 -m pip install --no-cache-dir --upgrade pip && \
    python3 -m pip install --no-cache-dir -r requirements.txt

# Stage 2: copy only the installed packages into the distroless image
FROM gcr.io/distroless/python3:nonroot
COPY --from=py /usr/local/lib/python3.7/site-packages /site-packages
ENV PYTHONPATH=/site-packages
ENV LANG=C.UTF-8
ENTRYPOINT ["/usr/bin/python3"]
For the first stage (py), the official Python 3.7 image is a sensible choice.
Since we have no need of various packages that come with Debian, the ‘slim’ edition is fine.
Both Debian and Ubuntu are good defaults, in case you want to rely on your own base Python image.
The only reason I pick a leaner base image is that it downloads faster and comes with Python and pip; Alpine can be used too, but Python would have to be installed with apk add --update python3 first.
The size of the base image for the first stage is irrelevant: we copy the site packages into the distroless container image, so whatever we use before that step disappears after the second stage.
There is nothing in the Dockerfile that is specific to TensorFlow.
It follows a generic pattern for Python-based frameworks and libraries.
Please note that a non-root base image is used to avoid running the container as a privileged user.
To build the distroless image, execute:
docker build -t "$DISTROLESS_IMAGE" .
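Once the build succeeds, a quick smoke test can confirm that the copied site packages are importable. The sketch below relies on the image's ENTRYPOINT being python3, so anything after the image name is passed straight to the interpreter. The function name smoke_test and the DOCKER override variable are hypothetical.

```shell
# Hypothetical helper: check that TensorFlow imports inside the built image.
# DOCKER is an assumed override for the docker binary.
smoke_test() {
  local image="$1"
  # The ENTRYPOINT is /usr/bin/python3, so "-c ..." is handed to Python.
  "${DOCKER:-docker}" run --rm "$image" -c 'import tensorflow as tf; print(tf.__version__)'
}

# Usage (requires Docker and the built image):
#   smoke_test "$DISTROLESS_IMAGE"
```

If the import fails, the most likely cause is a mismatch between the Python version of the first stage and the interpreter in the distroless image.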
If you want a GPU-ready distroless base image, you also have to copy drivers and system libraries between stages.
The distroless image weighs 756 MB, a 52% decrease in size. What about vulnerabilities?
CVEs in the Distroless Image
With grype "$DISTROLESS_IMAGE", we see only 3 high-severity vulnerabilities, down from 15:
NAME         INSTALLED  FIXED-IN  VULNERABILITY
flatbuffers  1.12                 CVE-2020-35864
pip          21.0.1               CVE-2018-20225
protobuf     3.15.8               CVE-2015-5237
All in all, that is a reduction of more than 50% in image size and 80% in high-severity vulnerabilities, with an overall reduction of 99% across all vulnerabilities. Note that no known critical vulnerabilities were uncovered in any of the scans.
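The size and high-severity percentages follow from simple arithmetic on the figures reported earlier. The sketch below reproduces them with a hypothetical helper, pct_reduction, and assumes decimal units, so that 1.57 GB is taken as 1570 MB.

```shell
# Hypothetical helper: percentage reduction from a "before" to an "after" value.
pct_reduction() {
  # Arguments: value before, value after (same unit). Prints a whole percentage.
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.0f\n", 100 * (before - after) / before }'
}

pct_reduction 1570 756   # image size in MB: prints 52
pct_reduction 15 3       # high-severity findings: prints 80
```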
It is important to note that not all vulnerabilities are created equal.
For example, the pip vulnerability is disputed, as it describes expected behaviour.
The protobuf issue may have already been fixed in 3.4.0.
With the pattern shown, it is easy to build distroless base images for machine learning applications.
To ensure each base image’s Dockerfile is not only consistent but also compliant with best practices, a linter, such as hadolint, is recommended:
hadolint --ignore DL3013 Dockerfile
Thanks to CLI tools such as docker scan and grype, scanning for vulnerabilities is a breeze.
That covers base images, but what about machine learning code?
The Dockerfile allows you to install any dependencies needed for the model itself without modification, since that is covered by requirements.txt.
All that is left is the model code itself, which can be copied into the distroless image.
That code can be scanned separately with Bandit, if need be.
These steps ought to be automated in a standardized D/MLOps process that ensures what ships to production follows security best practices. With templatable Dockerfiles, building many distroless images for machine learning becomes manageable.
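As a minimal sketch of what a templatable Dockerfile could look like, the shell function below renders the two-stage pattern from this post with the Python version and the requirements file as parameters. The function name render_dockerfile and its arguments are hypothetical; a real setup might well use a dedicated templating tool instead.

```shell
# Hypothetical template: print a two-stage distroless Dockerfile to stdout.
# $1: Python minor version of the build stage (default 3.7). It must match the
#     interpreter shipped in the distroless image, or imports may fail.
# $2: requirements file to install from (default requirements.txt).
render_dockerfile() {
  local py_version="${1:-3.7}"
  local requirements="${2:-requirements.txt}"
  cat <<EOF
FROM python:${py_version}-slim AS py
WORKDIR /app
COPY ${requirements} requirements.txt
RUN python3 -m pip install --no-cache-dir --upgrade pip && \\
    python3 -m pip install --no-cache-dir -r requirements.txt

FROM gcr.io/distroless/python3:nonroot
COPY --from=py /usr/local/lib/python${py_version}/site-packages /site-packages
ENV PYTHONPATH=/site-packages
ENV LANG=C.UTF-8
ENTRYPOINT ["/usr/bin/python3"]
EOF
}

# Usage (writes the Dockerfile, which can then be built as before):
#   render_dockerfile 3.7 requirements.txt > Dockerfile
```

Keeping the template in one place means every framework-specific image differs only in its requirements file, which makes linting and scanning uniform across images.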