-
Notifications
You must be signed in to change notification settings - Fork 300
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug: in DinD, Since 4.0.0 postgres.get_connection_url()
within another container gives broken connection url
#475
Comments
Hi! Thanks for reporting, you are welcome to add a test case for it in the postgres module. Cheers, |
How are you running that code in a container? I could have a look if I had a clean way to look at this in a debugger. Presumably this PR - #388 is the cause/regression. You might try setting the TC_HOST envar to your external host. |
Hello @totallyzen @bearrito thanks for the quick answers! On your questions:
{
"mounts": [
"source=/var/run/docker.sock,target=/var/run/docker.sock,type=bind"
],
} Instead of: {
"features": {
"ghcr.io/devcontainers/features/docker-in-docker:2": {
"version": "latest",
"dockerDashComposeVersion": "v2"
}
},
} I was able to reproduce with this change, your devcontainer and the test code piece above.
|
I can repro this. But it looks largely intentional based of the PR and some of the related TC projects. A maintainer could comment more. |
Heyo! It is somewhat intentional, but I wouldn't say that I personally have no interest in supporting The main matter is energy in reverse-engineering what the original code was doing.
Yeah the |
Hello @totallyzen no worries, breaking things along the way happens!
Otherwise I can design them out-of-CI first and document how to run that. Let me know what you think |
Hey! Thanks for clarifying, I wouldn't call myself a docker expert for sure. I did know about the two ways, but never bothered to understand the true difference. Life is short 😅 I'm actually keen on getting the socket mount version of the testing going anyway. I think it's valuable for catching future regressions and I see what you mean with test slowness and I have a few tricks up my sleeve for building the image once and then using it in a follow-up matrix (may not be super effective). Scope is definintely only I think it's important to boil down the issue to the host+port inference issue because I think this will be an issue with other containers that make use of the same inference. It's also worth noting there was another recent PR with #368 that should help. Correct me if I'm wrong, I'm still grasping networking issues on this matter |
I don't believe with Managed Runners you can run DIND in Github Actions? I know with self-hosted runners you can, but don't believe I've seen it with managed. Love to be proven wrong though. |
Hello, @totallyzen you're welcome :) So I just tested on
@bearrito I'd have to test with a |
sorry, reading this for the first time - not really a DinD expert, but bear with me, so to run dind, i think im going to:
then i open a new terminal, and set up my python dev env? $ docker exec -it dind sh
$ apk add bash python3 && bash # install python, bash shell (and launch it)
$ python3 -m venv .pe && . .pe/bin/activate && pip install -U pip && pip install poetry && ln -s $(which poetry) /usr/bin && deactivate # install poetry in the poetry venv
$ apk add git && git clone https://github.com/testcontainers/testcontainers-python
$ cd testcontainers-python
$ poetry install --with dev now i am running testcontainers on dind? and i can write a test like: cat > core/tests/test_issue_475.py <<EOF
from testcontainers.postgres import PostgresContainer
def test_issue_475():
with PostgresContainer("postgres:alpine") as postgres:
print(postgres.get_connection_url())
EOF and it ( but it should print:
ifconfig output8a51aa4b9616:/testcontainers-python# ifconfig
docker0 Link encap:Ethernet HWaddr 02:42:5A:E1:AC:CD
inet addr:172.18.0.1 Bcast:172.18.255.255 Mask:255.255.0.0
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:21 errors:0 dropped:0 overruns:0 frame:0
TX packets:29 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:1020 (1020.0 B) TX bytes:2122 (2.0 KiB)
eth0 Link encap:Ethernet HWaddr 02:42:AC:11:00:02
inet addr:172.17.0.2 Bcast:172.17.255.255 Mask:255.255.0.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:15635 errors:0 dropped:0 overruns:0 frame:0
TX packets:14403 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:209109459 (199.4 MiB) TX bytes:1290642 (1.2 MiB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:32 errors:0 dropped:0 overruns:0 frame:0
TX packets:32 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:2040 (1.9 KiB) TX bytes:2040 (1.9 KiB) the default route's gateway ip address is what is provided in the issue, but does that really make sense # route | grep ^default | awk '{ print $2 }'
172.17.0.1 when i do that outside of docker, i get my router so i think what we actually want is the ip address of the interface which is our outbound route: # # inside dind container:
# apk add iproute2 # because busybox doesn't support "brief"
# route | grep ^default | awk '{ print $NF }' | xargs ip -brief addr show | awk '{ print $3 }' | cut -d'/' -f1
172.17.0.2
# outside docker:
$ route | grep ^default | awk '{ print $NF }' | xargs ip -brief addr show | awk '{ print $3 }' | cut -d'/' -f1
192.168.<omitted>.<omitted> and i thought that this brings back the terribleness of the ipv6 localhost address that macs report, but maybe not: $ # on macos:
$ route -n get default | sed -n '/interface:/{s/.*: //;p;}' | xargs ipconfig getifaddr
192.168.<omitted>.<omitted> |
interestingly enough, this one seems to give me the 172 address inside docker and a localhost address in linux, the LAN address in macos(??? - i mean 192.168), and then on windows, it gives two fe80's, an ipv6, and the LAN address also: python -c 'import socket; print(set([a[4][0] for a in socket.getaddrinfo(socket.gethostname(), None)]))' anyways, since all the other discussions were about docker abstraction layer, i figured maybe we can get somewhere with just basic networking command |
what im really hoping to understand is folks' dind setup - are the commands i sent realistic? i think the answer is no, somehow there are multiple containers involved. taking the "two scenarios" from above - docker engine in container, docker engine on host:
are you supposed to interact with the dind container from outside of it? e.g. if you use devcontainers:
|
Hey @alexanderankin , Thanks for the digging! I think there is still a slight misanderstanding. N.B:
My issue happens in the setup 2., and AFAIK not with setup 1. Hope it's clearer 🙏🏻 |
@alexanderankin If you are having a problem getting this setup, you can just use this projects devcontainers to get what you need. You can follow @Tanguy-LeFloch comment here - #475 (comment) Then tweak our file here - https://github.com/testcontainers/testcontainers-python/blob/main/.devcontainer/devcontainer.json#L6 @Tanguy-LeFloch Have you tried again since this commit - b10d916 |
@Tanguy-LeFloch message received - will run through scenario 2 asap/schedule permitting. @bearrito I (me) do not plan on using devcontainers until some poor shmuck (also potentially me 😄) figures out how to fix all the issues with it. jokes aside, i am extremely conservative (bordering on luddite) with the kind of software i introduce into my development stack for related reasons.
very good point, i will try before and after when i do |
postgres.get_connection_url()
within another container gives broken connection urlpostgres.get_connection_url()
within another container gives broken connection url
@bearrito @alexanderankin yes, unfortunately I can confirm that the problem in setup 2 ( |
As a workaround I am currently using: class FixedMySqlContainer(MySqlContainer):
def get_container_host_ip(self) -> str:
if inside_container() and Path("/run/docker.sock").exists():
return self.get_docker_client().gateway_ip(self._container.id)
return super().get_container_host_ip() which works well in the gitlab ci with a privileged runner. I couldn't reproduce it locally, as all connection to containers started inside my container fail, I don't know why... |
Your workaround works in my setup as well @CarliJoy thanks! I just needed to either disable I'm not sure what's the state of the tests to be sure that this works as well in the dind (#1 here #475 (comment)) setup. |
I have found that this part of code is commented in def get_container_host_ip(self) -> str:
# infer from docker host
host = self.get_docker_client().host()
if not host:
return "localhost"
# see https://github.com/testcontainers/testcontainers-python/issues/415
if host == "localnpipe" and system() == "Windows":
return "localhost"
# # check testcontainers itself runs inside docker container
# if inside_container() and not os.getenv("DOCKER_HOST") and not host.startswith("http://"):
# # If newly spawned container's gateway IP address from the docker
# # "bridge" network is equal to detected host address, we should use
# # container IP address, otherwise fall back to detected host
# # address. Even it's inside container, we need to double check,
# # because docker host might be set to docker:dind, usually in CI/CD environment
# gateway_ip = self.get_docker_client().gateway_ip(self._container.id)
# if gateway_ip == host:
# return self.get_docker_client().bridge_ip(self._container.id)
# return gateway_ip
return host It seems similar to the hack @CarliJoy suggest , btw it also works for me, thanks for sharing it |
I encountered the same issue when using Redis ( I found a solution by adapting @CarliJoy's solution:
class FixedAsyncRedisContainer(AsyncRedisContainer):
def get_container_host_ip(self) -> str:
if inside_container():
return self.get_docker_client().bridge_ip(self._container.id)
return super().get_container_host_ip()
def get_exposed_port(self, port: int) -> str:
if inside_container():
return port # or self.port
return super().get_exposed_port(port) Not pretty but it works both locally and in CI/CD. |
Our workaround required something like this class MySqlServerContainer(SqlServerContainer):
def _shares_network(self):
host = self.get_docker_client().host()
gateway_ip = self.get_docker_client().gateway_ip(self._container.id)
if gateway_ip == host:
return True
return False
def get_container_host_ip(self) -> str:
# in CI the code runs inside a container itself, so the testcontainer starts in parallel
if inside_container():
# also, at least in azure, the pipeline assigns the container its own network,
# not the default bridge one
if self._shares_network():
return self.get_docker_client().bridge_ip(self._container.id)
# therefore we have to access the testcontainer via the gateway
return self.get_docker_client().gateway_ip(self._container.id)
return super().get_container_host_ip()
def get_exposed_port(self, port: int) -> str:
# if inside a container AND the same network we can use the internal port
if inside_container() and self._shares_network():
return port
# otherwise the external
return super().get_exposed_port(port) |
just to be clear for all in the thread - if there is an idea for a contribution to make this easier to work around (dind feature flag even, or something like this) I would merge it. I don't know how what the correct solution is but we can work together to fix it as long as the end result is less breakage and not more. |
I guess the problem is that everybody know the workaround for their env but we are lacking a deep enough understanding to find a general solution. Currently I don't have this topic on my agenda. If I do I will try to analyse it a bit more in depth and post a solution here, asking everybody to test it. |
I can try to list all the options i currently see as possible. After digging in the dark of our CI pipelines, I made a lot of hypotheses how things could look. |
Hi guys, Does this thing moves anywhere? A few workarounds mentioned above seems to work for our setup also. Could we introduce any of those under the feature flag and make a release? |
i havent had time. pr's are welcome. I started #622 - i can merge it if someone wants to work on top of it, or just PR to my branch. |
@alexanderankin got it. I'll try to dig into the commented piece of code inside Also I do have an issue even with this patch, but I think it's because I'm not patching @tharwan for the Azure - did it specifies by default specific network to a container? Do you have any option to remove that automatic assignment and check whether @gield proposition will suit your needs? |
Ok I did some rather intensive digging with our gitlab-ci and locally on my linux machine. Running local outside of docker the following three combinations work:
Interestingly these three combinations also work within github codespaces. For normal DooD (Docker out of Docker = mounted the For DooD container running with extra network (not the default bridge, i.e. For DinD only the [3] I prepared a test script. |
Thanks @CarliJoy for taking a time to dig into it. Here's the outcome from my test (got here as we have failure in Github Action for Python's testcontainers while Golang ones worked fine): On my macbook
On Codespaces
And in the Github Actions
That's from |
@RafalSkolasinski Thank's for running it. I updated the script to be more descriptive. Could you rerun it and update the output (just edit it) for Mac and Github actions? So far the summary is: Not inside a container:
Inside a container it get complicated:
We have to determine the difference between between the github actions and DinD in gitlab-ci. |
Thanks @CarliJoy, I updated the outputs for all three.
I am using Docker for Desktop if it makes a difference. Could try Colima as well later if you'd like to have more data.
Not sure about the GitlabCi as we do not use that but with Golang's |
OK, for me it does seem to work if I add to my container def get_container_host_ip(self) -> str:
"""
This is a temporary workaround issue with running in Github Actions.
See: https://github.com/testcontainers/testcontainers-python/issues/475
"""
if inside_container():
return self.get_docker_client().gateway_ip(self._container.id)
return super().get_container_host_ip() but this seemed to only propagate to my test container, and not ryuk as I also need to set |
I would like to fix this properly. |
I have also run into this issue and came to pretty much the same conclusion with using #388 seem to be when this regression was introduced but the What about just putting the commented out code back? One further complication seems to be that the |
@massdosage there at least 3 different possibilities for In my setup even the original code had flaws and didn't work properly. Just have yet another person say "mah don't work, I use xyz" doesn't help to fix the problem properly. Please run the test script and post the information. We still lack information about Windows and Azure CI's. |
Here are the results of running the test script. GitLab CI/CD pipeline:
Local developer run on a macbook:
|
Run inside my sandbox docker container (DinD)
local just parallel dockers.
|
This is a run from our pipeline. We are running a Jenkins pipeline with DinD. The only problem for us is that we need to pass ip/host/url of a container to other containers during pipeline, so they will be able to talk. I think I've mistakenly assumed that kafka_docker_ip = kafka_container.get_docker_client().bridge_ip(kafka_container.get_wrapped_container().id) In order to run it on the pipeline I need to add That is the run from the pipeline (docker socket is mounted):
While developing locally I'm just packing all the test code into the docker container and running it, so technically not DinD I think (I do run that container with mounted docker socket).
|
@wlipski your local setup seems strange. In DooD settings (mounted socket) I was always able to determine the container_id the current container was running in. For you this doesn't work. Because this seemed to be the best way to determine if we running a DinD or DooD. What is your local setup? |
I performed a summary: In green I marked the determined configuration that clustered a possible connection mode. To summarize:
This covers all cases besides the one of @wlipski without If you see another way to determine the correct connection mode from the given settings, feel free to adopt the given I will start working on a PR now. In my humble option the determination of the connection mode should be done already when settings are loaded with the possibility to overwrite it. I will implement it that way. |
Hi @CarliJoy Sorry for a delay and thank you for the summary! Results above where archived by packing test script into a docker (assuming # Dockerfile
FROM python:3.10
COPY ./testc.py /
RUN pip install testcontainers==4.8.0
CMD ["python3", "testc.py"]
Env:
Commands: docker build -f ./Dockerfile -t testc .
docker run --rm -v /var/run/docker.sock:/var/run/docker.sock -t testc
docker run --rm -v /var/run/docker.sock:/var/run/docker.sock -e TESTCONTAINERS_HOST_OVERRIDE=host.docker.internal -t testc |
Hmm okay the problem lies within Docker Desktop. I would suggest that at a later point in time add a detection if DockerDestkop is running somehow (let's dig into that in another issue as this issue is already getting to big [not all anwers are loaded]) and set the docker_host correctly in this case. That would eliminate the need for the `TESTCONTAINERS_HOST_OVERRIDE´ which a is valid workaround in the meanwhile. |
see testcontainers#475 (comment) for a summary
see testcontainers#475 (comment) for a summary
FWIW, my reported results where also from using Docker Desktop. Just don't remember if I had the |
see testcontainers#475 (comment) for a summary
Describe the bug
Before
4.0.0
, after creating aPostgresContainer
within another container, but with the same external daemon,postgres.get_connection_url()
would return somethinglike
postgresql+psycopg2://test:[email protected]:32770/test
which is the correct connection url.However, since
4.0.0
, the exact same container'spostgres.get_connection_url()
returnspostgresql+psycopg2://test:test@localhost:5432/test
.So it changed:
Considering that the newly formatted connection url doesn't work to create an engine, I wouldn't expect this behavior.
To Reproduce
Runtime environment
Operating system:
Linux d388e0f17614 5.10.179-168.710.amzn2.x86_64 #1 SMP Mon May 22 23:10:22 UTC 2023 x86_64 GNU/Linux
Python version:
Python 3.11.5
docker info
pip freeze
The text was updated successfully, but these errors were encountered: