Bug: in DinD, Since 4.0.0 postgres.get_connection_url() within another container gives broken connection url #475

Closed
Tanguy-LeFloch opened this issue Mar 14, 2024 · 44 comments · Fixed by #714

Comments

@Tanguy-LeFloch

Tanguy-LeFloch commented Mar 14, 2024

Describe the bug

Before 4.0.0, after creating a PostgresContainer within another container, but with the same external daemon, postgres.get_connection_url() would return something
like postgresql+psycopg2://test:test@172.17.0.1:32770/test, which is the correct connection URL.

However, since 4.0.0, the exact same container's postgres.get_connection_url() returns postgresql+psycopg2://test:test@localhost:5432/test.

So it changed:

  • the exposed (mapped) port to the internal one
  • the host from the gateway IP to the container host IP (localhost)

Considering that the newly formatted connection URL doesn't work for creating an engine, I wouldn't expect this behavior.

To Reproduce

# From within a container
from testcontainers.postgres import PostgresContainer
from sqlalchemy.engine import create_engine

with PostgresContainer("postgres:15.2") as postgres:
    with create_engine(postgres.get_connection_url()).connect() as connection:
        ...

Runtime environment

Operating system:
Linux d388e0f17614 5.10.179-168.710.amzn2.x86_64 #1 SMP Mon May 22 23:10:22 UTC 2023 x86_64 GNU/Linux
Python version:
Python 3.11.5

docker info
Client:
 Version:    24.0.7
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.12.1
    Path:     /root/.docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.24.0
    Path:     /root/.docker/cli-plugins/docker-compose

Server:
 Containers: 24
  Running: 5
  Paused: 0
  Stopped: 19
 Images: 97
 Server Version: 20.10.23
 Storage Driver: overlay2
  Backing Filesystem: xfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 1e1ea6e986c6c86565bc33d52e34b81b3e2bc71f
 runc version: f19387a6bec4944c770f7668ab51c4348d9c2f38
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 5.10.179-168.710.amzn2.x86_64
 Operating System: Amazon Linux 2
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 30.99GiB
 Name:
 ID:
 Docker Root Dir: /home/xxxx/data/docker_data
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  xxxxxx
 Live Restore Enabled: false
pip freeze
aiobotocore==2.12.1
aiohttp==3.9.3
aiohttp-cors==0.7.0
aioitertools==0.11.0
aiosignal==1.3.1
alembic==1.13.1
aniso8601==9.0.1
anyio==4.3.0
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
arrow==1.3.0
asttokens==2.4.1
async-lru==2.0.4
async-timeout==4.0.3
attrs==23.2.0
autogluon.common==1.0.0
autogluon.core==1.0.0
autogluon.features==1.0.0
autogluon.tabular==1.0.0
Babel==2.14.0
backoff==2.2.1
beautifulsoup4==4.12.3
binaryornot==0.4.4
black==24.2.0
bleach==6.1.0
blessed==1.20.0
blinker==1.7.0
blis==0.7.11
boto3==1.34.51
boto3-stubs==1.34.51
botocore==1.34.51
botocore-stubs==1.34.62
bump2version==1.0.1
cachetools==5.3.3
catalogue==2.0.10
catboost==1.2.3
certifi==2024.2.2
cffi==1.16.0
cfgv==3.4.0
chardet==5.2.0
charset-normalizer==3.3.2
click==8.1.7
cloudpathlib==0.16.0
cloudpickle==2.2.1
cmake==3.28.3
coloredlogs==14.0
colorful==0.5.6
comm==0.2.2
confection==0.1.4
contextlib2==21.6.0
contourpy==1.2.0
cookiecutter==2.6.0
coverage==7.4.3
croniter==2.0.2
cruft==2.15.0
cycler==0.12.1
cymem==2.0.8
dagster==1.6.9
dagster-aws==0.22.9
dagster-graphql==1.6.9
dagster-pipes==1.6.9
dagster-postgres==0.22.9
dagster-webserver==1.6.9
debugpy==1.8.1
decorator==5.1.1
deepdiff==6.7.1
defusedxml==0.7.1
deprecation==2.1.0
dill==0.3.8
distlib==0.3.8
docker==7.0.0
docstring-parser==0.15
entrypoints==0.4
et-xmlfile==1.1.0
execnet==2.0.2
executing==2.0.1
fastai==2.7.14
fastapi==0.110.0
fastcore==1.5.29
fastdownload==0.0.7
fastjsonschema==2.19.1
fastprogress==1.0.3
filelock==3.13.1
Flask==3.0.2
fonttools==4.49.0
fqdn==1.5.1
frozenlist==1.4.1
fsspec==2024.2.0
future==1.0.0
gitdb==4.0.11
GitPython==3.1.42
google-api-core==2.17.1
google-auth==2.28.2
google-pasta==0.2.0
googleapis-common-protos==1.63.0
gpustat==1.1.1
gql==3.5.0
graphene==3.3
graphql-core==3.2.3
graphql-relay==3.2.0
graphviz==0.20.1
greenlet==3.0.3
grpcio==1.62.1
grpcio-health-checking==1.62.1
gunicorn==21.2.0
h11==0.14.0
httpcore==1.0.4
httptools==0.6.1
httpx==0.27.0
humanfriendly==10.0
hyperopt==0.2.7
identify==2.5.35
idna==3.6
importlib-metadata==6.11.0
incremental==22.10.0
iniconfig==2.0.0
ipykernel==6.29.3
ipython==8.22.2
isoduration==20.11.0
itsdangerous==2.1.2
jedi==0.19.1
Jinja2==3.1.3
jmespath==1.0.1
joblib==1.3.2
json5==0.9.22
jsonpointer==2.4
jsonschema==4.17.3
jsonschema-specifications==2023.12.1
jupyter-events==0.6.3
jupyter-lsp==2.2.4
jupyter_client==8.6.1
jupyter_core==5.7.2
jupyter_server==2.10.0
jupyter_server_terminals==0.5.3
jupyterlab==4.1.4
jupyterlab_pygments==0.3.0
jupyterlab_server==2.24.0
kaleido==0.2.1
kiwisolver==1.4.5
langcodes==3.3.0
lightgbm==4.1.0
lit==18.1.1
llvmlite==0.42.0
Mako==1.3.2
Markdown==3.5.2
markdown-it-py==3.0.0
MarkupSafe==2.1.5
matplotlib==3.8.3
matplotlib-inline==0.1.6
mdurl==0.1.2
mistune==3.0.2
mlflow==2.11.1
mpmath==1.3.0
msgpack==1.0.8
multidict==6.0.5
multiprocess==0.70.16
murmurhash==1.0.10
mypy==1.8.0
mypy-boto3-s3==1.34.62
mypy-extensions==1.0.0
nbclient==0.10.0
nbconvert==7.16.2
nbformat==5.10.2
nest-asyncio==1.6.0
networkx==3.2.1
nodeenv==1.8.0
notebook==7.1.1
notebook_shim==0.2.4
numba==0.59.0
numpy==1.26.4
nvidia-cublas-cu11==11.10.3.66
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu11==11.7.101
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu11==11.7.99
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu11==11.7.99
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu11==8.5.0.96
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu11==10.9.0.58
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu11==10.2.10.91
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu11==11.4.0.1
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu11==11.7.4.91
nvidia-cusparse-cu12==12.1.0.106
nvidia-ml-py==12.535.133
nvidia-nccl-cu11==2.14.3
nvidia-nccl-cu12==2.19.3
nvidia-nvjitlink-cu12==12.4.99
nvidia-nvtx-cu11==11.7.91
nvidia-nvtx-cu12==12.1.105
opencensus==0.11.4
opencensus-context==0.1.3
openpyxl==3.1.2
ordered-set==4.1.0
orjson==3.9.15
overrides==7.7.0
packaging==23.2
pandarallel==1.6.5
pandas==2.1.4
pandocfilters==1.5.1
parse==1.20.1
parse-type==0.6.2
parso==0.8.3
pathos==0.3.2
pathspec==0.12.1
patsy==0.5.6
pendulum==3.0.0
pexpect==4.9.0
pillow==10.2.0
platformdirs==3.11.0
plotly==5.19.0
pluggy==1.4.0
pox==0.3.4
ppft==1.7.6.8
pre-commit==3.6.2
preshed==3.0.9
prometheus_client==0.20.0
prompt-toolkit==3.0.43
protobuf==4.25.3
psutil==5.9.8
psycopg2==2.9.9
psycopg2-binary==2.9.9
ptyprocess==0.7.0
pure-eval==0.2.2
py-cpuinfo==9.0.0
py-spy==0.3.14
py4j==0.10.9.7
pyarrow==15.0.1
pyasn1==0.5.1
pyasn1-modules==0.3.0
pycparser==2.21
pydantic==1.10.14
pygit2==1.14.1
Pygments==2.17.2
pyparsing==3.1.2
pyrsistent==0.20.0
pyspark==3.5.1
pytest==8.0.2
pytest-alembic==0.11.0
pytest-bdd==7.1.1
pytest-benchmark==4.0.0
pytest-cov==4.1.0
pytest-xdist==3.5.0
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
python-json-logger==2.0.7
python-slugify==8.0.4
pytz==2024.1
PyYAML==6.0.1
pyzmq==25.1.2
querystring-parser==1.2.4
ray==2.6.3
referencing==0.33.0
requests==2.31.0
requests-toolbelt==1.0.0
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rich==13.7.1
rpds-py==0.18.0
rsa==4.9
ruff==0.3.1
s3fs==2024.2.0
s3transfer==0.10.0
sagemaker==2.212.0
sagemaker-ssh-helper==2.1.0
schema==0.7.5
scikit-learn==1.4.0
scipy==1.12.0
Send2Trash==1.8.2
shap==0.44.1
six==1.16.0
sklearn-pandas==2.2.0
slack_sdk==3.27.1
slicer==0.0.7
smart-open==6.4.0
smdebug-rulesconfig==1.0.1
smmap==5.0.1
sniffio==1.3.1
soupsieve==2.5
spacy==3.7.4
spacy-legacy==3.0.12
spacy-loggers==1.0.5
SQLAlchemy==2.0.28
sqlparse==0.4.4
srsly==2.4.8
stack-data==0.6.3
starlette==0.36.3
statsmodels==0.14.1
structlog==24.1.0
sympy==1.12
tabulate==0.9.0
tblib==2.0.0
tenacity==8.2.3
tensorboardX==2.6.2.2
terminado==0.18.1
testcontainers==4.0.1
text-unidecode==1.3
thinc==8.2.3
threadpoolctl==3.3.0
time-machine==2.14.0
tinycss2==1.2.1
tokenize-rt==5.2.0
tomli==2.0.1
toposort==1.10
torch==2.0.1
torchvision==0.15.2
tornado==6.4
towncrier==23.11.0
tqdm==4.66.2
traitlets==5.14.2
triton==2.0.0
typer==0.9.0
types-awscrt==0.20.5
types-jsonschema==4.21.0.20240118
types-Markdown==3.5.0.20240129
types-psycopg2==2.9.21.20240218
types-python-dateutil==2.8.19.20240311
types-PyYAML==6.0.12.12
types-requests==2.31.0.20240218
types-s3transfer==0.10.0
typing_extensions==4.10.0
tzdata==2024.1
universal_pathlib==0.2.2
uri-template==1.3.0
urllib3==2.0.7
uvicorn==0.27.1
uvloop==0.19.0
virtualenv==20.21.0
wasabi==1.1.2
watchdog==4.0.0
watchfiles==0.21.0
wcwidth==0.2.13
weasel==0.3.4
webcolors==1.13
webencodings==0.5.1
websocket-client==1.7.0
websockets==12.0
Werkzeug==3.0.1
wrapt==1.16.0
xgboost==2.0.3
yarl==1.9.4
zipp==3.18.0
@totallyzen
Collaborator

Hi!

Thanks for reporting, you are welcome to add a test case for it in the postgres module.
We're a bit short on energy and hands to deal with everything on the list of issues and PRs.
If you add a test case, it might even be easy to fix - contributions are welcome! :)

Cheers,
B

@bearrito
Contributor

@Tanguy-LeFloch

How are you running that code in a container? I could have a look if I had a clean way to look at this in a debugger.

Presumably this PR - #388 - is the cause of the regression. You might try setting the TC_HOST env var to your external host.

@Tanguy-LeFloch
Author

Tanguy-LeFloch commented Mar 15, 2024

Hello @totallyzen @bearrito thanks for the quick answers!

On your questions:

  1. I don't mind at all adding a test, but reproducing it means adding some more setup that you don't have right now (see below).
    I could give it a shot, but I'd need pointers as to 1) whether you mean to add it to the CI (it seems the dind tests are not in it right now) and 2) whether you have specific prerequisites about how docker tests should be organized and which part of the codebase they should target.
    If you don't have strong preferences for now, I could do something sticking as closely as possible to what already exists.

  2. I'm running it inside a devcontainer, but it's different from yours as I'm using the external docker daemon instead of dind.
    Basically I do:

{
    "mounts": [
        "source=/var/run/docker.sock,target=/var/run/docker.sock,type=bind"
    ],
}

Instead of:

{
    "features": {
        "ghcr.io/devcontainers/features/docker-in-docker:2": {
            "version": "latest",
            "dockerDashComposeVersion": "v2"
        }
    },
}

I was able to reproduce with this change, your devcontainer and the test code piece above.

  3. Indeed, setting TC_HOST works, but I'd rather avoid setting it manually if possible as I don't hardcode this IP anywhere currently. The tests that break run in our devcontainers as dev environments as well as in our CI (with docker run ... and the mount from above).

@bearrito
Contributor

bearrito commented Mar 15, 2024

I can repro this. But it looks largely intentional based on the PR and some of the related TC projects. A maintainer could comment more.

@totallyzen
Collaborator

Heyo!

It is somewhat intentional, but that doesn't mean I personally have no interest in supporting dind or devcontainers.
I think it's important to keep improving and apologies for it breaking in the first place!
Better to ask for forgiveness than permission in this case, so apologies!

The main issue is the energy needed to reverse-engineer what the original code was doing.
It was:

  • poorly understood (it wasn't clear what it was trying to do),
  • conflicting with the tc.host intent from the owners of TC,
  • and not covered with tests the same way as some other things out there.

it seems the dind tests are not in it right now

Yeah, the dind tests were non-existent when I arrived, so I didn't know what to include in the first place.
I think at this point I'll fiddle around with restarting the notion of dind tests by simply replicating the socket mount like you are doing!

@Tanguy-LeFloch
Author

Hello @totallyzen no worries, breaking things along the way happens!

  1. Just to clarify, dind and mounting the socket are two different ways of running docker from within docker. The former uses a daemon running inside the container, while the latter uses the host's Docker daemon (the same one that spawned the current container). The postgres get_connection_url with dind isn't broken. With the socket mount it is.
  2. If that helps I can add tests for one of them, or even both. I was just wondering above if you already had opinions about a few things:
    1. Should they run in the CI?
    2. What should their scope be: core / community / both?

Otherwise I can design them out-of-CI first and document how to run them.
I'm asking because they'll be much slower than your existing tests (assuming we don't set up any cache) and we'll need to build an image.

Let me know what you think

@totallyzen
Collaborator

Hey!

Thanks for clarifying, I wouldn't call myself a docker expert for sure. I did know about the two ways, but never bothered to understand the true difference. Life is short 😅

I'm actually keen on getting the socket mount version of the testing going anyway.

I think it's valuable for catching future regressions. I see what you mean about test slowness, and I have a few tricks up my sleeve for building the image once and then using it in a follow-up matrix (may not be super effective).

Scope is definitely only core for now, I don't want to explode the costs of the GHA. Kevin wouldn't be happy with me, haha!
Let me handle the new addition for the matrix and I'll ask you for some review comments. :)

I think it's important to boil this down to the host+port inference problem, because I think this will be an issue for other containers that make use of the same inference.

It's also worth noting there was another recent PR, #368, that should help.
Can you run some tests with the main branch to see if that changed your situation? I think inferring the correct host network solves a lot of underlying issues.

Correct me if I'm wrong, I'm still grasping networking issues on this matter

@bearrito
Contributor

I don't believe you can run DinD in GitHub Actions with managed runners? I know you can with self-hosted runners, but I don't believe I've seen it with managed ones. I'd love to be proven wrong though.

@Tanguy-LeFloch
Author

Hello,

@totallyzen you're welcome :)

So I just tested on main and unfortunately:

  1. The new ryuk feature from d019874 breaks completely in the setting of using the external daemon, as I would expect due to the incorrect host (it breaks here: https://github.com/testcontainers/testcontainers-python/blob/main/core/testcontainers/core/container.py#L205)
  2. Without ryuk the problem persists: even if the port seems okay now, the host is still localhost (which is not okay for connections, as we are within a container and the container we want to reach runs outside, on the host machine)

@bearrito I'd have to test with a dind image, I'm not sure why it wouldn't work.
However I'm sure docker run -v /var/run/docker.sock:/var/run/docker.sock ... works, as it is what we are doing right now with GitHub managed runners.

@alexanderankin
Member

sorry, reading this for the first time - not really a DinD expert, but bear with me, so to run dind, i think im going to:

docker run --rm -it --name dind --privileged docker

then i open a new terminal, and set up my python dev env?

$ docker exec -it dind sh
$ apk add bash python3 && bash # install python, bash shell (and launch it)
$ python3 -m venv .pe && . .pe/bin/activate && pip install -U pip && pip install poetry && ln -s $(which poetry) /usr/bin && deactivate # install poetry in the poetry venv
$ apk add git && git clone https://github.com/testcontainers/testcontainers-python
$ cd testcontainers-python
$ poetry install --with dev

now i am running testcontainers on dind? and i can write a test like:

cat > core/tests/test_issue_475.py <<EOF
from testcontainers.postgres import PostgresContainer
def test_issue_475():
    with PostgresContainer("postgres:alpine") as postgres:
        print(postgres.get_connection_url())
EOF

and it (poetry run pytest -s core/tests/test_issue_475.py) prints postgresql+psycopg2://test:test@localhost:32774/test

but it should print:

  1. "dind" - the name of the container
  2. $(hostname)? (in my case the "hostname" command printed: 8a51aa4b9616)
  3. one of the ip addresses?
ifconfig output
8a51aa4b9616:/testcontainers-python# ifconfig
docker0   Link encap:Ethernet  HWaddr 02:42:5A:E1:AC:CD  
          inet addr:172.18.0.1  Bcast:172.18.255.255  Mask:255.255.0.0
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:21 errors:0 dropped:0 overruns:0 frame:0
          TX packets:29 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:1020 (1020.0 B)  TX bytes:2122 (2.0 KiB)

eth0      Link encap:Ethernet  HWaddr 02:42:AC:11:00:02  
          inet addr:172.17.0.2  Bcast:172.17.255.255  Mask:255.255.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:15635 errors:0 dropped:0 overruns:0 frame:0
          TX packets:14403 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:209109459 (199.4 MiB)  TX bytes:1290642 (1.2 MiB)

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:32 errors:0 dropped:0 overruns:0 frame:0
          TX packets:32 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:2040 (1.9 KiB)  TX bytes:2040 (1.9 KiB)

the default route's gateway ip address is what is provided in the issue, but does that really make sense?

# route | grep ^default | awk '{ print $2 }'
172.17.0.1

when i do that outside of docker, i get my router so i think what we actually want is the ip address of the interface which is our outbound route:

# # inside dind container:
# apk add iproute2 # because busybox doesn't support "brief"
# route | grep ^default | awk '{ print $NF }' | xargs ip -brief addr show  | awk '{ print $3 }' | cut -d'/' -f1
172.17.0.2

# outside docker:
$ route | grep ^default | awk '{ print $NF }' | xargs ip -brief addr show  | awk '{ print $3 }' | cut -d'/' -f1
192.168.<omitted>.<omitted>

and i thought that this brings back the terribleness of the ipv6 localhost address that macs report, but maybe not:

$ # on macos:
$ route -n get default | sed -n '/interface:/{s/.*: //;p;}' | xargs ipconfig getifaddr
192.168.<omitted>.<omitted>
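
as a side note, the same "address of the outbound interface" idea can be expressed in python with the usual UDP-connect trick; this is only a sketch for discussion, not something the library does, and 8.8.8.8 is just an arbitrary routable address that never actually receives any traffic:

import socket

def outbound_interface_ip() -> str:
    # A connected UDP socket sends no packets, but the kernel still picks
    # the source address it would use for the default route.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.connect(("8.8.8.8", 80))
        return s.getsockname()[0]

print(outbound_interface_ip())  # prints e.g. 172.17.0.2 inside the dind container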

@alexanderankin
Member

interestingly enough, this one seems to give me the 172 address inside docker and a localhost address in linux, the LAN address in macos(??? - i mean 192.168), and then on windows, it gives two fe80's, an ipv6, and the LAN address also:

python -c 'import socket; print(set([a[4][0] for a in socket.getaddrinfo(socket.gethostname(), None)]))'

anyways, since all the other discussions were about the docker abstraction layer, i figured maybe we can get somewhere with just basic networking commands and some chat gpt prompting research

@alexanderankin
Member

alexanderankin commented Mar 29, 2024

what im really hoping to understand is folks' dind setup - are the commands i sent realistic? i think the answer is no, somehow there are multiple containers involved.

taking the "two scenarios" from above - docker engine in container, docker engine on host:

  1. docker engine on host - i dont see how this is distinguishable from not dind (except for compose i guess)
  2. docker engine in docker - how do you even connect to containers running inside a container if you are outside that container, don't you have to set up the port forwarding ahead of time?

are you supposed to interact with the dind container from outside of it? e.g. if you use devcontainers:

___________________________
| inside dind             |        integration tests (outside)
|                         |
|    ___________          |
|   | in D.E.   |         |
|   |           |         |
|   |  CPsql    |  CDinD--|-------------------CDev
|   |           |         |
|   |  CKafka   |         |
|   |           |         |
|   |  CValKey  |         |
|   |___________|         |
|                         |
|_________________________|

@Tanguy-LeFloch
Author

Tanguy-LeFloch commented Mar 29, 2024

Hey @alexanderankin ,

Thanks for the digging! I think there is still a slight misunderstanding.
I'll try to clarify the two scenarios:

  1. With dind
    (diagram: docker_dind_setup.drawio)

  2. With docker sock mount
    (diagram: docker_sock_mount_setup.drawio)

N.B:

  1. The green container being a devcontainer or a "regular" one shouldn't matter, but it may be easier to understand seeing it as the devcontainer
  2. The two options are two ways to do the same thing, but they have different implications, the second one being much closer to a compose setup as the containers sit side by side
  3. The difference I see with your messages @alexanderankin above is that if you are using dind, you are not supposed to interact with the dind container from outside of it; you are supposed to interact with the containers that are spawned inside it. In the case of a sock mount, you are supposed to interact with the containers alongside the container you are currently in. Which used to work with the connection string provided pre-version 4.

My issue happens in setup 2., and AFAIK not with setup 1.
Actually, as I was stating above, setup 2. is now completely broken with the ryuk addition.

Hope it's clearer 🙏🏻

@bearrito
Contributor

@alexanderankin If you are having a problem getting this set up, you can just use this project's devcontainers to get what you need.

You can follow @Tanguy-LeFloch comment here - #475 (comment)

Then tweak our file here - https://github.com/testcontainers/testcontainers-python/blob/main/.devcontainer/devcontainer.json#L6

@Tanguy-LeFloch Have you tried again since this commit - b10d916

@alexanderankin
Member

@Tanguy-LeFloch message received - will run through scenario 2 asap/schedule permitting.

@bearrito I (me) do not plan on using devcontainers until some poor shmuck (also potentially me 😄) figures out how to fix all the issues with it. jokes aside, i am extremely conservative (bordering on luddite) with the kind of software i introduce into my development stack for related reasons.

@/Tanguy-LeFloch Have you tried again since this commit - b10d916

very good point, i will try before and after when i do

@alexanderankin alexanderankin changed the title Bug: Since 4.0.0 postgres.get_connection_url() within another container gives broken connection url Bug: in DinD, Since 4.0.0 postgres.get_connection_url() within another container gives broken connection url Mar 31, 2024
@Tanguy-LeFloch
Author

@bearrito @alexanderankin yes, unfortunately I can confirm that the problem in setup 2 (localhost hostname in postgresql's connection string) still persists with the latest 4.3.0 version which includes b10d916.

@CarliJoy
Contributor

CarliJoy commented Apr 5, 2024

As a workaround I am currently using:

from pathlib import Path
from testcontainers.core.utils import inside_container
from testcontainers.mysql import MySqlContainer

class FixedMySqlContainer(MySqlContainer):
    def get_container_host_ip(self) -> str:
        if inside_container() and Path("/run/docker.sock").exists():
            return self.get_docker_client().gateway_ip(self._container.id)
        return super().get_container_host_ip()

which works well in GitLab CI with a privileged runner.

I couldn't reproduce it locally, as all connections to containers started inside my container fail; I don't know why...

@Tanguy-LeFloch
Author

Your workaround works in my setup as well @CarliJoy, thanks! I just needed to either disable ryuk (d019874) or add the workaround directly in the core (https://github.com/testcontainers/testcontainers-python/blob/main/core/testcontainers/core/container.py#L110) so that it applies to both.

I'm not sure what the state of the tests is to confirm that this also works in the dind setup (#1 here: #475 (comment)).
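
For reference, a minimal sketch of what "adding it in the core" could look like without editing the installed package is to monkeypatch the base class before any container is started, assuming (as seems to be the case in current 4.x) that the Ryuk reaper also resolves its host through DockerContainer.get_container_host_ip; this is only an illustration, not an endorsed fix:

from pathlib import Path

from testcontainers.core.container import DockerContainer
from testcontainers.core.utils import inside_container

_original_get_container_host_ip = DockerContainer.get_container_host_ip

def _patched_get_container_host_ip(self) -> str:
    # Same idea as the subclass workaround above, but applied to every
    # container, including the internally started Ryuk reaper.
    if inside_container() and Path("/run/docker.sock").exists():
        return self.get_docker_client().gateway_ip(self._container.id)
    return _original_get_container_host_ip(self)

DockerContainer.get_container_host_ip = _patched_get_container_host_ip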

@David2011Hernandez

David2011Hernandez commented May 23, 2024

I have found that this part of the code is commented out in testcontainers v4.4.1:

def get_container_host_ip(self) -> str:
    # infer from docker host
    host = self.get_docker_client().host()
    if not host:
        return "localhost"
    # see https://github.com/testcontainers/testcontainers-python/issues/415
    if host == "localnpipe" and system() == "Windows":
        return "localhost"

    # # check testcontainers itself runs inside docker container
    # if inside_container() and not os.getenv("DOCKER_HOST") and not host.startswith("http://"):
    #     # If newly spawned container's gateway IP address from the docker
    #     # "bridge" network is equal to detected host address, we should use
    #     # container IP address, otherwise fall back to detected host
    #     # address. Even it's inside container, we need to double check,
    #     # because docker host might be set to docker:dind, usually in CI/CD environment
    #     gateway_ip = self.get_docker_client().gateway_ip(self._container.id)

    #     if gateway_ip == host:
    #         return self.get_docker_client().bridge_ip(self._container.id)
    #     return gateway_ip
    return host

It seems similar to the hack @CarliJoy suggests. BTW it also works for me, thanks for sharing it.

@gield

gield commented May 28, 2024

I encountered the same issue when using Redis (AsyncRedisContainer).

I found a solution by adapting @CarliJoy's solution:

  • Use the bridge IP in get_container_host_ip() instead of the gateway IP
  • Override the get_exposed_port() method as well
  • As @Tanguy-LeFloch noted, I also disabled ryuk

from testcontainers.core.utils import inside_container
from testcontainers.redis import AsyncRedisContainer

class FixedAsyncRedisContainer(AsyncRedisContainer):
    def get_container_host_ip(self) -> str:
        if inside_container():
            return self.get_docker_client().bridge_ip(self._container.id)
        return super().get_container_host_ip()

    def get_exposed_port(self, port: int) -> str:
        if inside_container():
            return port  # or self.port
        return super().get_exposed_port(port)

Not pretty but it works both locally and in CI/CD.

@tharwan

tharwan commented Jun 13, 2024

Our workaround required something like this:

from testcontainers.core.utils import inside_container
from testcontainers.mssql import SqlServerContainer

class MySqlServerContainer(SqlServerContainer):
    def _shares_network(self):
        host = self.get_docker_client().host()
        gateway_ip = self.get_docker_client().gateway_ip(self._container.id)

        if gateway_ip == host:
            return True
        return False

    def get_container_host_ip(self) -> str:
        # in CI the code runs inside a container itself, so the testcontainer starts in parallel
        if inside_container():
            # also, at least in azure, the pipeline assigns the container its own network,
            # not the default bridge one
            if self._shares_network():
                return self.get_docker_client().bridge_ip(self._container.id)
            # therefore we have to access the testcontainer via the gateway
            return self.get_docker_client().gateway_ip(self._container.id)

        return super().get_container_host_ip()

    def get_exposed_port(self, port: int) -> str:
        # if inside a container AND the same network we can use the internal port
        if inside_container() and self._shares_network():
            return port
        # otherwise the external
        return super().get_exposed_port(port)

@alexanderankin
Member

just to be clear for everyone in the thread - if there is an idea for a contribution to make this easier to work around (even a dind feature flag, or something like that) I would merge it. I don't know what the correct solution is, but we can work together to fix it as long as the end result is less breakage and not more.

@CarliJoy
Contributor

I guess the problem is that everybody knows the workaround for their environment, but we are lacking a deep enough understanding to find a general solution.

Currently I don't have this topic on my agenda. If I do, I will try to analyse it a bit more in depth and post a solution here, asking everybody to test it.
But maybe in the meantime somebody else figures it out...

@tharwan

tharwan commented Jun 14, 2024

I can try to list all the options I currently see as possible. After digging in the dark corners of our CI pipelines, I made a lot of hypotheses about how things could look.

@wlipski

wlipski commented Aug 16, 2024

Hi guys,

Is this moving anywhere? A few of the workarounds mentioned above seem to work for our setup as well. Could we introduce one of them under a feature flag and make a release?

@alexanderankin
Member

i haven't had time. PRs are welcome. I started #622 - i can merge it if someone wants to work on top of it, or just PR to my branch.

@wlipski

wlipski commented Aug 20, 2024

@alexanderankin got it. I'll try to dig into the commented-out piece of code inside get_container_host_ip and compare it with @gield's proposition (I use that one to set up a Jenkins pipeline with dind).

Also, I still have an issue even with this patch, but I think it's because I'm not patching the Ryuk container to have correct host resolution; as a workaround I'm running with --network=host and that seems to fix it. At least I could run a pipeline with it and with Docker Desktop on macOS.

@tharwan for Azure - does it assign a specific network to the container by default? Do you have any option to remove that automatic assignment and check whether @gield's proposition suits your needs?

@CarliJoy
Contributor

CarliJoy commented Sep 5, 2024

Ok, I did some rather intensive digging with our GitLab CI and locally on my Linux machine.

Running locally, outside of Docker, the following three combinations work:

  • ✅ [1] '172.17.0.1' (mapped port) → gateway_ip
  • ✅ [2] '172.17.0.4' (original port) → brigde_ip
  • ✅ [3] 'localhost' (mapped port) → container_host_ip, docker_host

Interestingly, these three combinations also work within GitHub Codespaces.

For normal DooD (Docker out of Docker = /var/run/docker.sock mounted inside the container), [2] bridge_ip:original port always works. The gateway_ip:mapped port combination works on the GitLab runner but not locally.

For a DooD container running with an extra network (not the default bridge, i.e. the FF_NETWORK_PER_BUILD flag), I have to detect the network it is running in and attach the new container to it. Then [2] bridge_ip:original port also works (and only this).

For DinD only [3] docker_host:mapped port worked.

I prepared a test script.
If you guys could run it locally / in your CI/CD and send me or post the results, we could debug the problem a bit more, I think.
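
(For anyone who can't run the full script: the essence of what it checks per combination is a plain TCP reachability probe. The snippet below is only an illustrative sketch, and the addresses and ports are made-up examples, not part of the actual script.)

import socket

def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    # Plain TCP connect to see whether host:port is reachable from
    # wherever this code runs (host machine, DinD, DooD, CI job, ...).
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Illustrative probes for the three candidate combinations:
print(can_connect("172.17.0.1", 32770))  # gateway_ip : mapped port
print(can_connect("172.17.0.4", 5432))   # bridge_ip  : original port
print(can_connect("localhost", 32770))   # docker_host: mapped port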

@RafalSkolasinski

RafalSkolasinski commented Sep 6, 2024

Thanks @CarliJoy for taking the time to dig into it. Here's the outcome from my tests (I got here because we have failures in GitHub Actions for Python's testcontainers while the Golang ones worked fine):

On my macbook

inside_container()=False
os_name()='mac'
### Run
 - gateway_ip: '172.17.0.1 (172.17.0.1)'
 - brigde_ip: '172.17.0.2 (172.17.0.2)'
 - default_gateway_ip: '?'
 - container_host_ip, docker_host: 'localhost (127.0.0.1)'
find_host_network=None
Docker Client points to client_base_url='http+docker://localhost' (localhost (127.0.0.1))
network_name='bridge'
Successfully connections for default network:
 ✅  'localhost' (mapped port) → container_host_ip, docker_host
Could not determine network running in

On Codespaces

inside_container()=True
os_name()='linux'
### Run 
 - gateway_ip: '172.17.0.1 (172.17.0.1)'
 - brigde_ip: '172.17.0.2 (172.17.0.2)'
 - default_gateway_ip: '?'
 - container_host_ip, docker_host: 'localhost (127.0.0.1)'
find_host_network=None
Docker Client points to client_base_url='http+docker://localhost' (localhost (127.0.0.1))
network_name='bridge'
Successfully connections for default network:
 ✅  '172.17.0.1' (mapped port) → gateway_ip
 ✅  '172.17.0.2' (original port) → brigde_ip
 ✅  'localhost' (mapped port) → container_host_ip, docker_host
Could not determine network running in

And in the Github Actions

inside_container()=True
os_name()='linux'
### Run 
 - gateway_ip: '172.17.0.1 (172.17.0.1)'
 - brigde_ip: '172.17.0.2 (172.17.0.2)'
 - default_gateway_ip: '?'
 - container_host_ip, docker_host: 'localhost (127.0.0.1)'
find_host_network=None
Docker Client points to client_base_url='http+docker://localhost' (localhost (127.0.0.1))
network_name='bridge'
Successfully connections for default network:
 ✅  '172.17.0.1' (mapped port) → gateway_ip
Could not determine network running in

That's from output.txt.

@CarliJoy
Contributor

CarliJoy commented Sep 6, 2024

@RafalSkolasinski Thanks for running it.
Mac seems strange.
You can ignore the console output; it is only there for you to see what is happening. The output.txt is the interesting part as it contains the summary.

I updated the script to be more descriptive. Could you rerun it and update the output (just edit it) for Mac and GitHub Actions?

So far the summary is:

Not inside a container:

  • use docker_host:mapped port, as this is the only thing that also works on Mac

Inside a container it gets complicated:

  • bridge_ip:original port (DooD)
  • gateway_ip:mapped port (GitHub Actions)
  • docker_host:mapped port (DinD, GitLab CI)

We have to determine the difference between the GitHub Actions and the DinD in GitLab CI cases.
So I added some more debug info to the script, hoping to find a difference.

@RafalSkolasinski

RafalSkolasinski commented Sep 6, 2024

Thanks @CarliJoy, I updated the outputs for all three.

Mac seems strange.

I am using Docker Desktop, if it makes a difference. I could try Colima as well later if you'd like to have more data.


We have to determine the difference between the GitHub Actions and the DinD in GitLab CI cases.

Not sure about GitLab CI as we do not use it, but with Golang's testcontainers I did not experience any issues, so you could potentially borrow some logic from there?

@RafalSkolasinski

RafalSkolasinski commented Sep 6, 2024

OK, for me it does seem to work if I add this to my container:

    def get_container_host_ip(self) -> str:
        """
        This is a temporary workaround issue with running in Github Actions.
        See: https://github.com/testcontainers/testcontainers-python/issues/475
        """
        if inside_container():
            return self.get_docker_client().gateway_ip(self._container.id)
        return super().get_container_host_ip()

but this seemed to only propagate to my test container, and not to ryuk, as I also needed to set TESTCONTAINERS_RYUK_DISABLED: true. Not sure how to propagate the change to the Reaper as well.
This may be OK for GitHub Actions though...

@CarliJoy
Contributor

I would like to fix this properly.
But for this I need information about Azure and Windows. Could somebody please run the test script on these platforms and post the results?
After this I will hopefully be able to determine a proper algorithm that works in all common settings.

@massdosage

I have also run into this issue and came to pretty much the same conclusion, using inside_container() and then returning the gateway_ip, see #388 (comment).

#388 seems to be where this regression was introduced, but the tc.host feature doesn't solve the problem from what I could see. The changed code was left in git as commented out, which also suggests to me they weren't that confident about the change. The title says "de-prioritise docker:dind" but what actually got merged completely removed it; perhaps that wasn't the final intention?

What about just putting the commented-out code back? One further complication seems to be that the ryuk cleanup container, which was introduced around this time, also suffers from this problem, so I have had to disable it to get my fix working. Maybe it will "just work" if the code in DockerContainer is updated?

@CarliJoy
Contributor

@massdosage there are at least 3 different possibilities for "inside the container", depending on whether it's DooD (socket mount) or DinD, whether an extra network is used, and how the DinD service is mounted.

In my setup even the original code had flaws and didn't work properly.

Having yet another person say "nah, doesn't work, I use xyz" doesn't help to fix the problem properly.

Please run the test script and post the information.

We still lack information about Windows and Azure CIs.

@massdosage

Here are the results of running the test script.

GitLab CI/CD pipeline:

### Run 
 - gateway_ip: '172.17.0.1 (172.17.0.1)'
 - brigde_ip: '172.17.0.3 (172.17.0.3)'
 - default_gateway_ip: '?'
 - container_host_ip, docker_host: 'localhost (127.0.0.1)'
find_host_network=None
Docker Client points to client_base_url='http+docker://localhost' (localhost (127.0.0.1))
network_name='bridge'
Successfully connections for default network:
 ✅  '172.17.0.1' (mapped port) → gateway_ip
 ✅  '172.17.0.3' (original port) → brigde_ip
Found in_container_id='b7768bff63105ce06e589ce2fa15d5b661968d12208afe7211c9b5834cd32250'
Pulling image testcontainers/ryuk:0.8.1
INFO:testcontainers.core.container:Pulling image testcontainers/ryuk:0.8.1
Container started: f2c99af15f32
INFO:testcontainers.core.container:Container started: f2c99af15f32
Waiting for container <Container: f2c99af15f32> with image testcontainers/ryuk:0.8.1 to be ready ...
INFO:testcontainers.core.waiting_utils:Waiting for container <Container: f2c99af15f32> with image testcontainers/ryuk:0.8.1 to be ready ...
Waiting for container <Container: f2c99af15f32> with image testcontainers/ryuk:0.8.1 to be ready ...
INFO:testcontainers.core.waiting_utils:Waiting for container <Container: f2c99af15f32> with image testcontainers/ryuk:0.8.1 to be ready ...
Waiting for container <Container: f2c99af15f32> with image testcontainers/ryuk:0.8.1 to be ready ...
INFO:testcontainers.core.waiting_utils:Waiting for container <Container: f2c99af15f32> with image testcontainers/ryuk:0.8.1 to be ready ...
### Run bridge
 - gateway_ip: '172.17.0.1 (172.17.0.1)'
 - brigde_ip: '172.17.0.3 (172.17.0.3)'
 - default_gateway_ip: '?'
 - container_host_ip, docker_host: 'localhost (127.0.0.1)'
find_host_network=None
Docker Client points to client_base_url='http+docker://localhost' (localhost (127.0.0.1))
network_name='bridge'
Successfully connections for bridge:
 ✅  '172.17.0.1' (mapped port) → gateway_ip
 ✅  '172.17.0.3' (original port) → brigde_ip
▶️ Start Ryuk, connect to 172.17.0.1:8080
🌈  6.02    🔥 (('172.17.0.1', False)) ryuk failed (gateway_ip)
▶️ Start Ryuk, connect to 172.17.0.1:32768
🌈  0.01   → ✅ (('172.17.0.1', True)) ryuk ok (gateway_ip)
▶️ Start Ryuk, connect to 172.17.0.3:8080
🌈  0.01   → ✅ (('172.17.0.3', False)) ryuk ok (brigde_ip)
▶️ Start Ryuk, connect to 172.17.0.3:32768
🌈  6.02    🔥 (('172.17.0.3', True)) ryuk failed (brigde_ip)
▶️ Start Ryuk, connect to localhost:8080
🌈  6.02    🔥 (('localhost', False)) ryuk failed (container_host_ip, docker_host)
▶️ Start Ryuk, connect to localhost:32768
🌈  6.02    🔥 (('localhost', True)) ryuk failed (container_host_ip, docker_host)
▶️ Start Ryuk, connect to 172.17.0.1:8080
🌈  6.02    🔥 (('172.17.0.1', False)) ryuk failed (gateway_ip)
▶️ Start Ryuk, connect to 172.17.0.1:32769
🌈  0.01   → ✅ (('172.17.0.1', True)) ryuk ok (gateway_ip)
▶️ Start Ryuk, connect to 172.17.0.3:8080
🌈  0.01   → ✅ (('172.17.0.3', False)) ryuk ok (brigde_ip)
▶️ Start Ryuk, connect to 172.17.0.3:32769
🌈  6.02    🔥 (('172.17.0.3', True)) ryuk failed (brigde_ip)
▶️ Start Ryuk, connect to localhost:8080
🌈  6.02    🔥 (('localhost', False)) ryuk failed (container_host_ip, docker_host)
▶️ Start Ryuk, connect to localhost:32769
🌈  6.02    🔥 (('localhost', True)) ryuk failed (container_host_ip, docker_host)

Local developer run on a macbook:

▶️ Start Ryuk, connect to 172.17.0.1:8080
🌈  6.04    🔥 (('172.17.0.1', False)) ryuk failed (gateway_ip)

Waiting for container <Container: a0120b66f16c> with image testcontainers/ryuk:0.8.1 to be ready ...
▶️ Start Ryuk, connect to 172.17.0.1:49973
🌈  6.06    🔥 (('172.17.0.1', True)) ryuk failed (gateway_ip)

▶️ Start Ryuk, connect to 172.17.0.2:8080
🌈  6.04    🔥 (('172.17.0.2', False)) ryuk failed (brigde_ip)

Waiting for container <Container: a0120b66f16c> with image testcontainers/ryuk:0.8.1 to be ready ...
▶️ Start Ryuk, connect to 172.17.0.2:49973
🌈  6.04    🔥 (('172.17.0.2', True)) ryuk failed (brigde_ip)

▶️ Start Ryuk, connect to localhost:8080
🌈  6.05    🔥 (('localhost', False)) ryuk failed (container_host_ip, docker_host)

Waiting for container <Container: a0120b66f16c> with image testcontainers/ryuk:0.8.1 to be ready ...
▶️ Start Ryuk, connect to localhost:49973
🌈  0.04   → ✅ (('localhost', True)) ryuk ok (container_host_ip, docker_host)

### Run
 - gateway_ip: '172.17.0.1 (172.17.0.1)'
 - brigde_ip: '172.17.0.2 (172.17.0.2)'
 - default_gateway_ip: '?'
 - container_host_ip, docker_host: 'localhost (127.0.0.1)'
find_host_network=None
Docker Client points to client_base_url='http+docker://localhost' (localhost (127.0.0.1))
network_name='bridge'
Successfully connections for default network:
 ✅  'localhost' (mapped port) → container_host_ip, docker_host
Could not determine network running in

@Mithmi

Mithmi commented Sep 17, 2024

Run inside my sandbox docker container (DinD)

inside_container()=True
os_name()='linux'
Pulling image testcontainers/ryuk:0.8.1
Container started: 0ac4a19a5062
▶️ Start Ryuk, connect to 172.17.0.1:8080
🌈  7.02    🔥 (('172.17.0.1', False)) ryuk failed (gateway_ip)

Waiting for container <Container: 0ac4a19a5062> with image testcontainers/ryuk:0.8.1 to be ready ...
▶️ Start Ryuk, connect to 172.17.0.1:32773
🌈  0.02   → ✅ (('172.17.0.1', True)) ryuk ok (gateway_ip)

▶️ Start Ryuk, connect to 172.17.0.2:8080
🌈  6.01    🔥 (('172.17.0.2', False)) ryuk failed (brigde_ip)

Waiting for container <Container: 0ac4a19a5062> with image testcontainers/ryuk:0.8.1 to be ready ...
▶️ Start Ryuk, connect to 172.17.0.2:32773
🌈  6.03    🔥 (('172.17.0.2', True)) ryuk failed (brigde_ip)

▶️ Start Ryuk, connect to localhost:8080
🌈  6.01    🔥 (('localhost', False)) ryuk failed (container_host_ip, docker_host)

Waiting for container <Container: 0ac4a19a5062> with image testcontainers/ryuk:0.8.1 to be ready ...
▶️ Start Ryuk, connect to localhost:32773
🌈  6.02    🔥 (('localhost', True)) ryuk failed (container_host_ip, docker_host)

### Run 
 - gateway_ip: '172.17.0.1 (172.17.0.1)'
 - brigde_ip: '172.17.0.2 (172.17.0.2)'
 - default_gateway_ip: '?'
 - container_host_ip, docker_host: 'localhost (127.0.0.1)'
find_host_network=None
Docker Client points to client_base_url='http+docker://localhost' (localhost (127.0.0.1))
network_name='bridge'
Successfully connections for default network:
 ✅  '172.17.0.1' (mapped port) → gateway_ip
Could not determine network running in

local just parallel dockers.

inside_container()=False
os_name()='linux'
Pulling image testcontainers/ryuk:0.8.1
Container started: c6d57140fe85
▶️ Start Ryuk, connect to 172.16.139.1:8080
🌈  6.01    🔥 (('172.16.139.1', False)) ryuk failed (default_gateway_ip)

Waiting for container <Container: c6d57140fe85> with image testcontainers/ryuk:0.8.1 to be ready ...
▶️ Start Ryuk, connect to 172.16.139.1:32782
🌈  6.02    🔥 (('172.16.139.1', True)) ryuk failed (default_gateway_ip)

▶️ Start Ryuk, connect to 172.17.0.1:8080
🌈  6.02    🔥 (('172.17.0.1', False)) ryuk failed (gateway_ip)

Waiting for container <Container: c6d57140fe85> with image testcontainers/ryuk:0.8.1 to be ready ...
▶️ Start Ryuk, connect to 172.17.0.1:32782
🌈  0.03   → ✅ (('172.17.0.1', True)) ryuk ok (gateway_ip)

▶️ Start Ryuk, connect to 172.17.0.2:8080
🌈  0.01   → ✅ (('172.17.0.2', False)) ryuk ok (brigde_ip)

Waiting for container <Container: c6d57140fe85> with image testcontainers/ryuk:0.8.1 to be ready ...
▶️ Start Ryuk, connect to 172.17.0.2:32782
🌈  6.01    🔥 (('172.17.0.2', True)) ryuk failed (brigde_ip)

▶️ Start Ryuk, connect to localhost:8080
🌈  6.03    🔥 (('localhost', False)) ryuk failed (container_host_ip, docker_host)

Waiting for container <Container: c6d57140fe85> with image testcontainers/ryuk:0.8.1 to be ready ...
▶️ Start Ryuk, connect to localhost:32782
🌈  0.01   → ✅ (('localhost', True)) ryuk ok (container_host_ip, docker_host)

### Run 
 - default_gateway_ip: '172.16.139.1 (172.16.139.1)'
 - gateway_ip: '172.17.0.1 (172.17.0.1)'
 - brigde_ip: '172.17.0.2 (172.17.0.2)'
 - container_host_ip, docker_host: 'localhost (127.0.0.1)'
find_host_network=None
Docker Client points to client_base_url='http+docker://localhost' (localhost (127.0.0.1))
network_name='bridge'
Successfully connections for default network:
 ✅  '172.17.0.1' (mapped port) → gateway_ip
 ✅  '172.17.0.2' (original port) → brigde_ip
 ✅  'localhost' (mapped port) → container_host_ip, docker_host
Could not determine network running in

@wlipski

wlipski commented Sep 17, 2024

This is a run from our pipeline. We are running a Jenkins pipeline with DinD.

The only problem for us is that we need to pass the IP/host/URL of a container to other containers during the pipeline, so they will be able to talk. I think I mistakenly assumed that get_container_host_ip should be used to pick the IP of the container. I think in our case we should use the internal IP of the container inside the bridge network (since we don't create any other network), and that works. Example:

kafka_docker_ip = kafka_container.get_docker_client().bridge_ip(kafka_container.get_wrapped_container().id)

In order to run it in the pipeline I need to add --network=host to the Jenkins pipeline. So I'm not sure if I need any changes at that point, but below are the results.

That is the run from the pipeline (docker socket is mounted):

# --network=host

inside_container()=True
os_name()='linux'
WARNING:root:DOCKER_AUTH_CONFIG is experimental, see testcontainers/testcontainers-python#566
Pulling image testcontainers/ryuk:0.8.1
INFO:testcontainers.core.container:Pulling image testcontainers/ryuk:0.8.1
Container started: 072c200fd44d
INFO:testcontainers.core.container:Container started: 072c200fd44d

Start Ryuk, connect to 172.17.0.1:8080
 6.01    🔥(('172.17.0.1', False)) ryuk failed (gateway_ip)

Waiting for container <Container: 072c200fd44d> with image testcontainers/ryuk:0.8.1 to be ready ...
INFO:testcontainers.core.waiting_utils:Waiting for container <Container: 072c200fd44d> with image testcontainers/ryuk:0.8.1 to be ready ...
Start Ryuk, connect to 172.17.0.1:32768
 0.01    ✅(('172.17.0.1', True)) ryuk ok (gateway_ip)

Start Ryuk, connect to 172.17.0.2:8080
 0.01    ✅(('172.17.0.2', False)) ryuk ok (brigde_ip)

Waiting for container <Container: 072c200fd44d> with image testcontainers/ryuk:0.8.1 to be ready ...
INFO:testcontainers.core.waiting_utils:Waiting for container <Container: 072c200fd44d> with image testcontainers/ryuk:0.8.1 to be ready ...
Start Ryuk, connect to 172.17.0.2:32768
 6.02    🔥(('172.17.0.2', True)) ryuk failed (brigde_ip)

Start Ryuk, connect to localhost:8080
 6.01    🔥(('localhost', False)) ryuk failed (container_host_ip, docker_host)

Waiting for container <Container: 072c200fd44d> with image testcontainers/ryuk:0.8.1 to be ready ...
INFO:testcontainers.core.waiting_utils:Waiting for container <Container: 072c200fd44d> with image testcontainers/ryuk:0.8.1 to be ready ...
Start Ryuk, connect to localhost:32768
 0.01    ✅(('localhost', True)) ryuk ok (container_host_ip, docker_host)

### Run 
 - gateway_ip: '172.17.0.1 (172.17.0.1)'
 - brigde_ip: '172.17.0.2 (172.17.0.2)'
 - default_gateway_ip: '?'
 - container_host_ip, docker_host: 'localhost (127.0.0.1)'
find_host_network=None
Docker Client points to client_base_url='http+docker://localhost' (localhost (127.0.0.1))
network_name='bridge'
Successfully connections for default network:
   '172.17.0.1' (mapped port)  gateway_ip
   '172.17.0.2' (original port)  brigde_ip
   'localhost' (mapped port)  container_host_ip, docker_host
Could not determine network running in

While developing locally I'm just packing all the test code into the docker container and running it, so technically not DinD I think (I do run that container with mounted docker socket).

 inside_container()=True
os_name()='linux'
WARNING:root:DOCKER_AUTH_CONFIG is experimental, see testcontainers/testcontainers-python#566
Pulling image testcontainers/ryuk:0.8.1
INFO:testcontainers.core.container:Pulling image testcontainers/ryuk:0.8.1
Container started: 30797ba38d42
INFO:testcontainers.core.container:Container started: 30797ba38d42
▶️ Start Ryuk, connect to 172.17.0.1:8080
🌈  6.03    🔥 (('172.17.0.1', False)) ryuk failed (gateway_ip)

Waiting for container <Container: 30797ba38d42> with image testcontainers/ryuk:0.8.1 to be ready ...
INFO:testcontainers.core.waiting_utils:Waiting for container <Container: 30797ba38d42> with image testcontainers/ryuk:0.8.1 to be ready ...
▶️ Start Ryuk, connect to 172.17.0.1:60864
🌈  6.03    🔥 (('172.17.0.1', True)) ryuk failed (gateway_ip)

▶️ Start Ryuk, connect to 172.17.0.3:8080
🌈  0.01   → ✅ (('172.17.0.3', False)) ryuk ok (brigde_ip)

Waiting for container <Container: 30797ba38d42> with image testcontainers/ryuk:0.8.1 to be ready ...
INFO:testcontainers.core.waiting_utils:Waiting for container <Container: 30797ba38d42> with image testcontainers/ryuk:0.8.1 to be ready ...
▶️ Start Ryuk, connect to 172.17.0.3:60864
🌈  6.03    🔥 (('172.17.0.3', True)) ryuk failed (brigde_ip)

▶️ Start Ryuk, connect to localhost:8080
🌈  6.03    🔥 (('localhost', False)) ryuk failed (container_host_ip, docker_host)

Waiting for container <Container: 30797ba38d42> with image testcontainers/ryuk:0.8.1 to be ready ...
INFO:testcontainers.core.waiting_utils:Waiting for container <Container: 30797ba38d42> with image testcontainers/ryuk:0.8.1 to be ready ...
▶️ Start Ryuk, connect to localhost:60864
🌈  6.05    🔥 (('localhost', True)) ryuk failed (container_host_ip, docker_host)

### Run 
 - gateway_ip: '172.17.0.1 (172.17.0.1)'
 - brigde_ip: '172.17.0.3 (172.17.0.3)'
 - default_gateway_ip: '?'
 - container_host_ip, docker_host: 'localhost (127.0.0.1)'
find_host_network=None
Docker Client points to client_base_url='http+docker://localhost' (localhost (127.0.0.1))
network_name='bridge'
Successfully connections for default network:
 ✅  '172.17.0.3' (original port) → brigde_ip
Could not determine network running in
#  -e TESTCONTAINERS_HOST_OVERRIDE=host.docker.internal

inside_container()=True
os_name()='linux'
WARNING:root:DOCKER_AUTH_CONFIG is experimental, see testcontainers/testcontainers-python#566
Pulling image testcontainers/ryuk:0.8.1
INFO:testcontainers.core.container:Pulling image testcontainers/ryuk:0.8.1
Container started: b0d5d1d1beb2
INFO:testcontainers.core.container:Container started: b0d5d1d1beb2
▶️ Start Ryuk, connect to 172.17.0.1:8080
🌈  6.03    🔥 (('172.17.0.1', False)) ryuk failed (gateway_ip)

Waiting for container <Container: b0d5d1d1beb2> with image testcontainers/ryuk:0.8.1 to be ready ...
INFO:testcontainers.core.waiting_utils:Waiting for container <Container: b0d5d1d1beb2> with image testcontainers/ryuk:0.8.1 to be ready ...
▶️ Start Ryuk, connect to 172.17.0.1:60916
🌈  6.05    🔥 (('172.17.0.1', True)) ryuk failed (gateway_ip)

▶️ Start Ryuk, connect to 172.17.0.3:8080
🌈  0.03   → ✅ (('172.17.0.3', False)) ryuk ok (brigde_ip)

Waiting for container <Container: b0d5d1d1beb2> with image testcontainers/ryuk:0.8.1 to be ready ...
INFO:testcontainers.core.waiting_utils:Waiting for container <Container: b0d5d1d1beb2> with image testcontainers/ryuk:0.8.1 to be ready ...
▶️ Start Ryuk, connect to 172.17.0.3:60916
🌈  6.04    🔥 (('172.17.0.3', True)) ryuk failed (brigde_ip)

▶️ Start Ryuk, connect to host.docker.internal:8080
🌈  0.04   → ✅ (('host.docker.internal', False)) ryuk ok (container_host_ip, docker_host)

Waiting for container <Container: b0d5d1d1beb2> with image testcontainers/ryuk:0.8.1 to be ready ...
INFO:testcontainers.core.waiting_utils:Waiting for container <Container: b0d5d1d1beb2> with image testcontainers/ryuk:0.8.1 to be ready ...
▶️ Start Ryuk, connect to host.docker.internal:60916
🌈  0.03   → ✅ (('host.docker.internal', True)) ryuk ok (container_host_ip, docker_host)

### Run 
 - gateway_ip: '172.17.0.1 (172.17.0.1)'
 - brigde_ip: '172.17.0.3 (172.17.0.3)'
 - default_gateway_ip: '?'
 - container_host_ip, docker_host: 'host.docker.internal (192.168.65.254)'
find_host_network=None
Docker Client points to client_base_url='http+docker://localhost' (localhost (127.0.0.1))
network_name='bridge'
Successfully connections for default network:
 ✅  '172.17.0.3' (original port) → brigde_ip
 ✅  'host.docker.internal' (original port) → container_host_ip, docker_host
 ✅  'host.docker.internal' (mapped port) → container_host_ip, docker_host
Could not determine network running in

@CarliJoy
Contributor

CarliJoy commented Oct 8, 2024

@wlipski your local setup seems strange. In DooD settings (mounted socket) I was always able to determine the container_id of the container the code is running in. For you this doesn't work.
Could you investigate that?

Because this seemed to be the best way to determine whether we are running DinD or DooD.

What is your local setup?

@CarliJoy
Contributor

I put together a summary:

(summary spreadsheet image)

In green I marked the determined configuration that identifies a possible connection mode.

To summarize (see the sketch right after this list):

  • If not running in a container, or if the docker_host is not localhost, connect to docker_host:mapped_port.

  • Otherwise try to determine the network the current container is running in. If found, use it and connect to the container via bridge_ip:original port.
    This is rather a special case.

  • In any other case, use gateway_ip:mapped_port to connect to the container.
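
Expressed as code, the decision logic above would look roughly like the following. This is only a sketch of the proposal, not library code, and get_current_container_network is a hypothetical helper that does not exist in testcontainers today:

from testcontainers.core.utils import inside_container

def get_current_container_network():
    # Hypothetical helper: would inspect which Docker network the current
    # container is attached to; returns None when it cannot be determined.
    return None

def pick_connection_address(client, container_id, docker_host, mapped_port, original_port):
    # Sketch of the proposed connection-mode detection (not library code).
    if not inside_container() or docker_host not in ("localhost", "127.0.0.1"):
        # Case 1: plain host, or a remote/DinD docker host -> docker_host + mapped port.
        return docker_host, mapped_port

    network = get_current_container_network()
    if network is not None:
        # Case 2: we know the network we run in -> talk to the container directly.
        return client.bridge_ip(container_id), original_port

    # Case 3: fall back to the gateway IP with the mapped port.
    return client.gateway_ip(container_id), mapped_port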

This covers all cases except @wlipski's one without TESTCONTAINERS_HOST_OVERRIDE. As setting this variable fixes it, it seems to be okay. Also, as this is not a CI, I don't consider it as important.

If you see another way to determine the correct connection mode from the given settings, feel free to adapt the given sheet and post your solution here.

I will start working on a PR now.

In my humble opinion, the determination of the connection mode should already be done when the settings are loaded, with the possibility to overwrite it. I will implement it that way.

@CarliJoy
Contributor

@Mithmi and @wlipski it would be helpful to know how you set up your local DinD / DooD containers.
The exact command you ran and what kind of Docker you are using.

@wlipski

wlipski commented Oct 11, 2024

Hi @CarliJoy

Sorry for the delay and thank you for the summary!

The results above were obtained by packing the test script into a Docker image (assuming testc.py exists in the folder):

# Dockerfile
FROM python:3.10
COPY ./testc.py /
RUN pip install testcontainers==4.8.0
CMD ["python3", "testc.py"]

Env:

  • M1 Mac, MacOS 15.0.1
  • Docker version 27.3.1, build ce1223035a
  • Docker desktop 4.32.0

Commands:

docker build -f ./Dockerfile -t testc . 
docker run --rm -v /var/run/docker.sock:/var/run/docker.sock -t testc
docker run --rm -v /var/run/docker.sock:/var/run/docker.sock -e TESTCONTAINERS_HOST_OVERRIDE=host.docker.internal -t testc

@CarliJoy
Contributor

Hmm, okay, the problem lies within Docker Desktop.

I would suggest that at a later point in time we somehow add a detection of whether Docker Desktop is running (let's dig into that in another issue, as this one is already getting too big [not all answers are loaded]) and set the docker_host correctly in that case.

That would eliminate the need for TESTCONTAINERS_HOST_OVERRIDE, which is a valid workaround in the meanwhile.

CarliJoy added a commit to CarliJoy/testcontainers-python that referenced this issue Oct 11, 2024
CarliJoy added a commit to CarliJoy/testcontainers-python that referenced this issue Oct 11, 2024
@RafalSkolasinski

Hmm okay the problem lies within Docker Desktop.

FWIW, my reported results were also from using Docker Desktop. I just don't remember if I had the /var/run/docker.sock setup or if I relied fully on the docker context.

CarliJoy added a commit to CarliJoy/testcontainers-python that referenced this issue Oct 18, 2024