Merge pull request #909 from yuvipanda/feat/new-base
Let `FROM <base_image>` in the Dockerfile template be configurable
minrk authored Jun 9, 2023
2 parents 671c423 + e1051c3 commit e8eab15
Showing 18 changed files with 159 additions and 62 deletions.
6 changes: 3 additions & 3 deletions Dockerfile
@@ -1,8 +1,8 @@
# syntax = docker/dockerfile:1.3
ARG ALPINE_VERSION=3.16
ARG ALPINE_VERSION=3.17
FROM alpine:${ALPINE_VERSION}

RUN apk add --no-cache git python3 python3-dev py-pip build-base
RUN apk add --no-cache git python3 python3-dev py3-pip py3-setuptools build-base

# build wheels in first image
ADD . /tmp/src
@@ -16,7 +16,7 @@ RUN mkdir /tmp/wheelhouse \
FROM alpine:${ALPINE_VERSION}

# install python, git, bash, mercurial
RUN apk add --no-cache git git-lfs python3 py-pip bash docker mercurial
RUN apk add --no-cache git git-lfs python3 py3-pip py3-setuptools bash docker mercurial

# install hg-evolve (Mercurial extensions)
RUN pip3 install hg-evolve --user --no-cache-dir
33 changes: 33 additions & 0 deletions docs/source/howto/base_image.md
@@ -0,0 +1,33 @@
# Change the base image used by Docker

You can change the base image used in the `Dockerfile` that repo2docker generates to build images.
This is equivalent to changing the `FROM <base_image>` line in the Dockerfile.

To do so, use the `base_image` traitlet when invoking `repo2docker`.
Note that this is not configurable by individual repositories; it is set when you invoke the `repo2docker` command.
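
As a minimal sketch, assuming the application class is named `Repo2Docker` and that you load a traitlets config file through the `--config` option, the setting might look like the following. The file name and the image tag are illustrative choices, not requirements.

```python
# repo2docker_config.py (hypothetical file name)
# `get_config()` is provided by traitlets when it loads a config file;
# this fragment is not meant to be run as a standalone script.
c = get_config()  # noqa: F821

# Assumption: the configurable class is called `Repo2Docker`.
# The tag below is only an example of an Ubuntu-based buildpack-deps image.
c.Repo2Docker.base_image = "docker.io/library/buildpack-deps:jammy"
```

You would then run something like `repo2docker --config repo2docker_config.py <repository>`.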

```{note}
By default repo2docker builds on top of the `buildpack-deps:bionic` base image, an Ubuntu-based image.
```

## Requirements for your base image

`repo2docker` will only work if a specific set of packages exists in the base image.
Only images that match the following criteria are supported:

- Ubuntu-based distributions (minimum `18.04`)
- Images that contain the set of base packages installed in [the `buildpack-deps` image family](https://hub.docker.com/_/buildpack-deps).

Other images _may_ work, but are not officially supported.

## This will affect reproducibility 🚨

Changing the base image may have an impact on the reproducibility of repositories that are built.
There are **no guarantees that repositories will behave the same way as other repo2docker builds if you change the base image**.
For example, the following two scenarios would make your repositories non-reproducible:

- **Your base image differs from `buildpack-deps:bionic`.**
If you choose a base image other than repo2docker's default (the Ubuntu `bionic` image), then repositories that **you** build with repo2docker may be significantly different from those that **other** instances of repo2docker build (e.g., those from [`mybinder.org`](https://mybinder.org)).
- **Your base image changes over time.**
If you choose a base image that changes its composition over time (e.g., an image provided by some other community), then repositories built with your base image may change in unpredictable ways.
We recommend choosing a base image that you know to be stable and trustworthy.
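
One possible way to guard against a base image that changes over time is to reference it by content digest rather than by tag. This is an assumption on our part, not something repo2docker enforces. The helper below is a hypothetical sketch that only checks the shape of an image reference; it does not contact a registry.

```python
def is_pinned_by_digest(image_ref: str) -> bool:
    """Return True if the reference ends in a well-formed sha256 digest."""
    _, sep, digest = image_ref.partition("@")
    if not sep or not digest.startswith("sha256:"):
        return False
    hex_part = digest[len("sha256:"):]
    # A sha256 digest is 64 lowercase hexadecimal characters.
    return len(hex_part) == 64 and all(ch in "0123456789abcdef" for ch in hex_part)


# A tag can be re-pointed by its publisher; a digest identifies fixed content.
print(is_pinned_by_digest("docker.io/library/buildpack-deps:bionic"))
print(is_pinned_by_digest("example.org/base@sha256:" + "0" * 64))
```

The second reference uses a placeholder registry name and an all-zero digest purely for illustration.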
1 change: 1 addition & 0 deletions docs/source/howto/index.rst
@@ -15,3 +15,4 @@ Select from the pages listed below to get started.
lab_workspaces
jupyterhub_images
deploy
base_image
21 changes: 19 additions & 2 deletions repo2docker/app.py
@@ -447,6 +447,21 @@ def _dry_run_changed(self, change):
""",
)

base_image = Unicode(
"docker.io/library/buildpack-deps:bionic",
config=True,
help="""
Base image to use when building docker images.
Only images that match the following criteria are supported:
- Ubuntu based distributions, minimum 18.04
- Contains set of base packages installed with the buildpack-deps
image family: https://hub.docker.com/_/buildpack-deps
Other images *may* work, but are not officially supported.
""",
)

def get_engine(self):
"""Return an instance of the container engine.
@@ -793,12 +808,14 @@ def build(self):

with chdir(checkout_path):
for BP in self.buildpacks:
bp = BP()
bp = BP(base_image=self.base_image)
if bp.detect():
picked_buildpack = bp
break
else:
picked_buildpack = self.default_buildpack()
picked_buildpack = self.default_buildpack(
base_image=self.base_image
)

picked_buildpack.platform = self.platform
picked_buildpack.appendix = self.appendix
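
The selection loop in this hunk relies on every buildpack constructor accepting `base_image`. A minimal, self-contained sketch of that pattern follows; `NeverMatches`, `AlwaysMatches`, and `pick_buildpack` are hypothetical stand-ins, not repo2docker classes.

```python
class NeverMatches:
    """Hypothetical buildpack whose detect() never claims the repository."""

    def __init__(self, base_image):
        self.base_image = base_image

    def detect(self):
        return False


class AlwaysMatches(NeverMatches):
    """Hypothetical fallback buildpack, playing the role of the default."""

    def detect(self):
        return True


def pick_buildpack(candidates, default, base_image):
    # Try each candidate in order; the first one whose detect() is true wins.
    for candidate in candidates:
        bp = candidate(base_image=base_image)
        if bp.detect():
            return bp
    # No candidate matched: fall back to the default, with the same base image.
    return default(base_image=base_image)


picked = pick_buildpack([NeverMatches], AlwaysMatches, "buildpack-deps:bionic")
print(type(picked).__name__, picked.base_image)
```

Passing `base_image` through the constructor means the fallback buildpack and the detected one are configured the same way.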
29 changes: 23 additions & 6 deletions repo2docker/buildpacks/_r_base.py
@@ -14,9 +14,19 @@ def rstudio_base_scripts(r_version):
shiny_proxy_version = "1.1"
shiny_sha256sum = "80f1e48f6c824be7ef9c843bb7911d4981ac7e8a963e0eff823936a8b28476ee"

rstudio_url = "https://download2.rstudio.org/server/bionic/amd64/rstudio-server-2022.02.1-461-amd64.deb"
rstudio_sha256sum = (
"239e8d93e103872e7c6d827113d88871965f82ffb0397f5638025100520d8a54"
# RStudio server has different builds based on whether OpenSSL 3 or 1.1 is available in the base
# image. OpenSSL 3 is present in Jammy+, 1.1 until then. Instead of hardcoding URLs based on distro, we actually
# check for the dependency itself directly in the code below. You can find these URLs in
# https://posit.co/download/rstudio-server/, toggling between Ubuntu 22 (for openssl3) vs earlier versions (openssl 1.1)
# you may forget about openssl, but openssl never forgets you.
rstudio_openssl3_url = "https://download2.rstudio.org/server/jammy/amd64/rstudio-server-2022.12.0-353-amd64.deb"
rstudio_openssl3_sha256sum = (
"a5aa2202786f9017a6de368a410488ea2e4fc6c739f78998977af214df0d6288"
)

rstudio_openssl1_url = "https://download2.rstudio.org/server/bionic/amd64/rstudio-server-2022.12.0-353-amd64.deb"
rstudio_openssl1_sha256sum = (
"bb88e37328c304881e60d6205d7dac145525a5c2aaaf9da26f1cb625b7d47e6e"
)
rsession_proxy_version = "2.0.1"

@@ -27,11 +37,18 @@ def rstudio_base_scripts(r_version):
# but here it's important because these recommend r-base,
# which will upgrade the installed version of R, undoing our pinned version
rf"""
curl --silent --location --fail {rstudio_url} > /tmp/rstudio.deb && \
apt-get update > /dev/null && \
if apt-cache search libssl3 | grep -q libssl3; then \
RSTUDIO_URL="{rstudio_openssl3_url}" ;\
RSTUDIO_HASH="{rstudio_openssl3_sha256sum}" ;\
else \
RSTUDIO_URL="{rstudio_openssl1_url}" ;\
RSTUDIO_HASH="{rstudio_openssl1_sha256sum}" ;\
fi && \
curl --silent --location --fail ${{RSTUDIO_URL}} > /tmp/rstudio.deb && \
curl --silent --location --fail {shiny_server_url} > /tmp/shiny.deb && \
echo '{rstudio_sha256sum} /tmp/rstudio.deb' | sha256sum -c - && \
echo "${{RSTUDIO_HASH}} /tmp/rstudio.deb" | sha256sum -c - && \
echo '{shiny_sha256sum} /tmp/shiny.deb' | sha256sum -c - && \
apt-get update > /dev/null && \
apt install -y --no-install-recommends /tmp/rstudio.deb /tmp/shiny.deb && \
rm /tmp/*.deb && \
apt-get -qq purge && \
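
The shell logic in this hunk picks a download URL and checksum depending on whether OpenSSL 3 is available, then verifies the download. The same idea can be sketched in Python as follows. The function names are hypothetical, the URLs and checksums are copied from the hunk, and the `has_libssl3` flag stands in for whatever dependency check the build actually performs.

```python
import hashlib

# URLs and checksums copied from the hunk above.
RSTUDIO_BUILDS = {
    "openssl3": (
        "https://download2.rstudio.org/server/jammy/amd64/rstudio-server-2022.12.0-353-amd64.deb",
        "a5aa2202786f9017a6de368a410488ea2e4fc6c739f78998977af214df0d6288",
    ),
    "openssl1": (
        "https://download2.rstudio.org/server/bionic/amd64/rstudio-server-2022.12.0-353-amd64.deb",
        "bb88e37328c304881e60d6205d7dac145525a5c2aaaf9da26f1cb625b7d47e6e",
    ),
}


def choose_rstudio_build(has_libssl3):
    """Return (url, sha256) for the build matching the available OpenSSL."""
    return RSTUDIO_BUILDS["openssl3" if has_libssl3 else "openssl1"]


def checksum_matches(payload, expected_sha256):
    """Compare the payload's SHA-256 digest to the pinned value."""
    return hashlib.sha256(payload).hexdigest() == expected_sha256


url, pinned = choose_rstudio_build(has_libssl3=True)
print(url)
```

Keeping the URL and its checksum together in one table makes it harder to update one without the other.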
12 changes: 9 additions & 3 deletions repo2docker/buildpacks/base.py
@@ -14,7 +14,7 @@

# Only use syntax features supported by Docker 17.09
TEMPLATE = r"""
FROM buildpack-deps:bionic
FROM {{base_image}}
# Avoid prompts from apt
ENV DEBIAN_FRONTEND=noninteractive
@@ -211,7 +211,6 @@ class BuildPack:
Specifically used for creating Dockerfiles for use with repo2docker only.
Things that are kept constant:
- base image
- some environment variables (such as locale)
- user creation & ownership of home directory
- working directory
@@ -221,9 +220,13 @@ class BuildPack:
"""

def __init__(self):
def __init__(self, base_image):
"""
base_image specifies the base image to use when building docker images
"""
self.log = logging.getLogger("repo2docker")
self.appendix = ""
self.base_image = base_image
self.labels = {}
if sys.platform.startswith("win"):
self.log.warning(
@@ -257,6 +260,8 @@ def get_base_packages(self):
# Utils!
"less",
"unzip",
# Gives us envsubst
"gettext-base",
}

@lru_cache()
@@ -535,6 +540,7 @@ def render(self, build_args=None):
appendix=self.appendix,
# For docker 17.09 `COPY --chown`, 19.03 would allow using $NBUSER
user=build_args.get("NB_UID", DEFAULT_NB_UID),
base_image=self.base_image,
)

@staticmethod
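
Taken together, the hunks in this file store `base_image` on the buildpack and pass it to the template at render time. The sketch below illustrates that flow with a hypothetical `MiniBuildPack` class. It uses Python's built-in `str.format` as a stand-in for the real template engine, so the placeholder syntax differs from the `{{base_image}}` used in the actual template.

```python
# Simplified template; the real one contains many more instructions.
MINI_TEMPLATE = "FROM {base_image}\nENV DEBIAN_FRONTEND=noninteractive\n"


class MiniBuildPack:
    """Hypothetical, stripped-down buildpack that only renders a FROM line."""

    def __init__(self, base_image):
        # The base image is supplied by the caller instead of being hard-coded.
        self.base_image = base_image

    def render(self):
        return MINI_TEMPLATE.format(base_image=self.base_image)


print(MiniBuildPack("docker.io/library/buildpack-deps:bionic").render())
```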
3 changes: 3 additions & 0 deletions repo2docker/buildpacks/legacy/__init__.py
@@ -16,6 +16,9 @@ class LegacyBinderDockerBuildPack:
This buildpack has been deprecated.
"""

def __init__(self, *args, **kwargs):
pass

def detect(self):
"""Check if current repo should be built with the Legacy BuildPack."""
log = logging.getLogger("repo2docker")
21 changes: 15 additions & 6 deletions repo2docker/buildpacks/r.py
@@ -196,6 +196,7 @@ def get_packages(self):
"libapparmor1",
"sudo",
"lsb-release",
"libssl-dev",
]

return super().get_packages().union(packages)
@@ -216,7 +217,10 @@ def get_rspm_snapshot_url(self, snapshot_date, max_days_prior=7):
# Construct a snapshot URL that will give us binary packages for the base image's Ubuntu release
if "upsi" in snapshots:
return (
"https://packagemanager.posit.co/all/__linux__/bionic/"
# Env variables here are expanded by envsubst in the Dockerfile, after sourcing
# /etc/os-release. This allows us to use distro specific variables here to get
# appropriate binary packages without having to hard code version names here.
"https://packagemanager.posit.co/all/__linux__/${VERSION_CODENAME}/"
+ snapshots["upsi"]
)
raise ValueError(
@@ -262,7 +266,10 @@ def get_devtools_snapshot_url(self):
# Hardcoded rather than dynamically determined from a date to avoid extra API calls
# Plus, we can always use packagemanager.posit.co here as we always install the
# necessary apt packages.
return "https://packagemanager.posit.co/all/__linux__/bionic/2022-01-04+Y3JhbiwyOjQ1MjYyMTU7NzlBRkJEMzg"
# Env variables here are expanded by envsubst in the Dockerfile, after sourcing
# /etc/os-release. This allows us to use distro specific variables here to get
# appropriate binary packages without having to hard code version names here.
return "https://packagemanager.posit.co/all/__linux__/${VERSION_CODENAME}/2022-06-03+Y3JhbiwyOjQ1MjYyMTU7RkM5ODcwN0M"

@lru_cache()
def get_build_scripts(self):
@@ -343,16 +350,18 @@ def get_build_scripts(self):
rf"""
R RHOME && \
mkdir -p /etc/rstudio && \
echo 'options(repos = c(CRAN = "{cran_mirror_url}"))' > /opt/R/{self.r_version}/lib/R/etc/Rprofile.site && \
echo 'r-cran-repos={cran_mirror_url}' > /etc/rstudio/rsession.conf
EXPANDED_CRAN_MIRROR_URL="$(. /etc/os-release && echo {cran_mirror_url} | envsubst)" && \
echo "options(repos = c(CRAN = \"${{EXPANDED_CRAN_MIRROR_URL}}\"))" > /opt/R/{self.r_version}/lib/R/etc/Rprofile.site && \
echo "r-cran-repos=${{EXPANDED_CRAN_MIRROR_URL}}" > /etc/rstudio/rsession.conf
""",
),
(
"${NB_USER}",
# Install a pinned version of devtools, IRKernel and shiny
rf"""
R --quiet -e "install.packages(c('devtools', 'IRkernel', 'shiny'), repos='{self.get_devtools_snapshot_url()}')" && \
R --quiet -e "IRkernel::installspec(prefix='$NB_PYTHON_PREFIX')"
export EXPANDED_CRAN_MIRROR_URL="$(. /etc/os-release && echo {cran_mirror_url} | envsubst)" && \
R --quiet -e "install.packages(c('devtools', 'IRkernel', 'shiny'), repos=Sys.getenv(\"EXPANDED_CRAN_MIRROR_URL\"))" && \
R --quiet -e "IRkernel::installspec(prefix=Sys.getenv(\"NB_PYTHON_PREFIX\"))"
""",
),
]
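
The build scripts in this file source `/etc/os-release` and let `envsubst` expand `${VERSION_CODENAME}` in the snapshot URL. The effect can be illustrated in Python with `string.Template`. This is only a sketch: `parse_os_release` and `expand_url` are hypothetical helpers, the sample file content is made up for the example, and unlike `envsubst`, `safe_substitute` leaves unknown variables in place instead of replacing them with an empty string.

```python
from string import Template


def parse_os_release(text):
    """Parse KEY=value lines in the style of /etc/os-release into a dict."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be quoted; strip matching quote characters.
        values[key] = value.strip().strip('"').strip("'")
    return values


def expand_url(url, os_release_text):
    """Expand ${VAR} references in url using values from os-release text."""
    return Template(url).safe_substitute(parse_os_release(os_release_text))


SAMPLE_OS_RELEASE = 'NAME="Ubuntu"\nVERSION_ID="22.04"\nVERSION_CODENAME=jammy\n'
URL = (
    "https://packagemanager.posit.co/all/__linux__/${VERSION_CODENAME}/"
    "2022-06-03+Y3JhbiwyOjQ1MjYyMTU7RkM5ODcwN0M"
)
print(expand_url(URL, SAMPLE_OS_RELEASE))
```

With this approach the distribution codename comes from the base image itself, so the URL does not need to be edited when the base image changes.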
6 changes: 3 additions & 3 deletions repo2docker/contentproviders/git.py
@@ -45,10 +45,10 @@ def fetch(self, spec, output_dir, yield_output=False):
self.log.error(
f"Failed to check out ref {ref}", extra=dict(phase=R2dState.FAILED)
)
if ref == "master":
if ref == "master" or ref == "main":
msg = (
"Failed to check out the 'master' branch. "
"Maybe the default branch is not named 'master' "
f"Failed to check out the '{ref}' branch. "
f"Maybe the default branch is not named '{ref}' "
"for this repository.\n\nTry not explicitly "
"specifying `--ref`."
)
8 changes: 8 additions & 0 deletions tests/conftest.py
@@ -100,6 +100,14 @@ def run_test(args):
return run_test


@pytest.fixture()
def base_image():
"""
Base ubuntu image to use when testing specific BuildPacks
"""
return "buildpack-deps:bionic"


def _add_content_to_git(repo_dir):
"""Add content to file 'test' in git repository and commit."""
# use append mode so this can be called multiple times
8 changes: 4 additions & 4 deletions tests/unit/test_binder_dir.py
@@ -6,21 +6,21 @@


@pytest.mark.parametrize("binder_dir", ["binder", ".binder", ""])
def test_binder_dir(tmpdir, binder_dir):
def test_binder_dir(tmpdir, binder_dir, base_image):
tmpdir.chdir()
if binder_dir:
os.mkdir(binder_dir)

bp = buildpacks.BuildPack()
bp = buildpacks.BuildPack(base_image)
assert binder_dir == bp.binder_dir
assert bp.binder_path("foo.yaml") == os.path.join(binder_dir, "foo.yaml")


def test_exclusive_binder_dir(tmpdir):
def test_exclusive_binder_dir(tmpdir, base_image):
tmpdir.chdir()
os.mkdir("./binder")
os.mkdir("./.binder")

bp = buildpacks.BuildPack()
bp = buildpacks.BuildPack(base_image)
with pytest.raises(RuntimeError):
_ = bp.binder_dir
16 changes: 8 additions & 8 deletions tests/unit/test_buildpack.py
@@ -7,41 +7,41 @@
from repo2docker.utils import chdir


def test_legacy_raises():
def test_legacy_raises(base_image):
# check legacy buildpack raises on a repo that triggers it
with TemporaryDirectory() as repodir:
with open(pjoin(repodir, "Dockerfile"), "w") as d:
d.write("FROM andrewosh/binder-base")

with chdir(repodir):
bp = LegacyBinderDockerBuildPack()
bp = LegacyBinderDockerBuildPack(base_image)
with pytest.raises(RuntimeError):
bp.detect()


def test_legacy_doesnt_detect():
def test_legacy_doesnt_detect(base_image):
# check legacy buildpack doesn't trigger
with TemporaryDirectory() as repodir:
with open(pjoin(repodir, "Dockerfile"), "w") as d:
d.write("FROM andrewosh/some-image")

with chdir(repodir):
bp = LegacyBinderDockerBuildPack()
bp = LegacyBinderDockerBuildPack(base_image)
assert not bp.detect()


def test_legacy_on_repo_without_dockerfile():
def test_legacy_on_repo_without_dockerfile(base_image):
# check legacy buildpack doesn't trigger on a repo w/o Dockerfile
with TemporaryDirectory() as repodir:
with chdir(repodir):
bp = LegacyBinderDockerBuildPack()
bp = LegacyBinderDockerBuildPack(base_image)
assert not bp.detect()


@pytest.mark.parametrize("python_version", ["2.6", "3.0", "4.10", "3.99"])
def test_unsupported_python(tmpdir, python_version):
def test_unsupported_python(tmpdir, python_version, base_image):
tmpdir.chdir()
bp = PythonBuildPack()
bp = PythonBuildPack(base_image)
bp._python_version = python_version
assert bp.python_version == python_version
with pytest.raises(ValueError):
8 changes: 4 additions & 4 deletions tests/unit/test_cache_from.py
@@ -12,7 +12,7 @@
)


def test_cache_from_base(tmpdir):
def test_cache_from_base(tmpdir, base_image):
cache_from = ["image-1:latest"]
fake_log_value = {"stream": "fake"}
fake_client = MagicMock(spec=docker.APIClient)
@@ -21,7 +21,7 @@ def test_cache_from_base(tmpdir):

# Test base image build pack
tmpdir.chdir()
for line in BaseImage().build(
for line in BaseImage(base_image).build(
fake_client, "image-2", 100, {}, cache_from, extra_build_kwargs
):
assert line == fake_log_value
@@ -30,7 +30,7 @@ def test_cache_from_base(tmpdir):
assert called_kwargs["cache_from"] == cache_from


def test_cache_from_docker(tmpdir):
def test_cache_from_docker(tmpdir, base_image):
cache_from = ["image-1:latest"]
fake_log_value = {"stream": "fake"}
fake_client = MagicMock(spec=docker.APIClient)
@@ -42,7 +42,7 @@ def test_cache_from_docker(tmpdir):
with tmpdir.join("Dockerfile").open("w") as f:
f.write("FROM scratch\n")

for line in DockerBuildPack().build(
for line in DockerBuildPack(base_image).build(
fake_client, "image-2", 100, {}, cache_from, extra_build_kwargs
):
assert line == fake_log_value