Troubleshooting IPv6 HTTP(S) deployment #166
Comments
Anything in /var/log/confluent/events? What about confluent_selfcheck -n t1?
Apologies for the delay; long weekend here. /var/log/confluent/events does shed a bit of light on the situation:
As for the selfcheck:
Thanks for that pointer, I'll try and work out what it wants. FWIW the deployment interface '5' has a FE80 address, which the docs have led me to believe is all that we require to provision.
Tracking this a touch further, we are in fact failing early in confluent/confluent_server/confluent/discovery/protocols/pxe.py (lines 635 to 639 in 3a0218c).
The question is... why. I might quickly add a quick log of all of Not very helpful:
I can't quite see what's wrong here, any suggestions?
For IPv6 boot, you need a more 'real' IPv6 address, like a ULA address. (The standard way is to generate 40 random bits after fd, and then you have a /48 you can allocate from.) For example:
Then you could have the deployment server be:
You'd further want IPv6 addresses corresponding to the nodes, e.g.:
LLA addresses (fe80::) can work for some things, for example, for connecting to BMCs over standard network protocols. Broadly speaking, though, some things simply require a non-LLA IPv6 address. Because LLA addresses usually require a scope index appended to the address, and would otherwise be ambiguous on multi-homed systems, they are too awkward for a lot of software to accommodate, and so a number of standards exclude them from everyday use. IIRC, UEFI is one of those contexts that does not implement pure LLA operation. You also can't use LLA in DNS or /etc/hosts.
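The ULA allocation described above can be sketched in Python with the standard library. This is only an illustration: the helper name generate_ula_prefix, the choice of the first /64, and the host numbering (::1 for the deployment server, ::2 onward for nodes) are assumptions made for the example, not anything Confluent prescribes.

```python
import ipaddress
import secrets


def generate_ula_prefix() -> ipaddress.IPv6Network:
    """Return a random /48 ULA prefix: 'fd' followed by 40 random bits.

    Hypothetical helper for illustration only.
    """
    global_id = secrets.randbits(40)
    # The top 8 bits are 0xfd, the next 40 bits are the random global ID,
    # and the remaining 80 bits (subnet ID + interface ID) are left as zero.
    prefix_int = (0xFD << 120) | (global_id << 80)
    return ipaddress.IPv6Network((prefix_int, 48))


prefix = generate_ula_prefix()

# Assumed layout: use the first /64 of the /48 for the deployment network.
subnet = next(prefix.subnets(new_prefix=64))
deployment_server = subnet[1]              # e.g. <prefix>::1
nodes = [subnet[i] for i in range(2, 5)]   # e.g. <prefix>::2 .. ::4

print("ULA /48 prefix:   ", prefix)
print("Deployment subnet:", subnet)
print("Deployment server:", deployment_server)
for index, address in enumerate(nodes, start=1):
    print(f"Node {index}:           {address}")

# A link-local address is flagged as such; the ULA addresses are not.
print("fe80::1 link-local?", ipaddress.ip_address("fe80::1").is_link_local)
print("server link-local? ", deployment_server.is_link_local)
```

Because the prefix is random, each run prints a different /48; in practice you would generate it once and record it.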
That would do it. I'm happy to call this a docs issue:
then
on https://hpc.lenovo.com/users/documentation/confluentosdeploy.html gives the impression that all that is required is an LLA.
I'll spin up some DHCP4 stuff and get on with my deployment. Do you want to close off this ticket or leave it open to track the docs updates?
Hi Team,
I'm trying to get the most basic deployment of Confluent going with two servers directly connected via ethernet cables (no switches or anything in the way).
Our desired configuration will use HTTP boot over UEFI, so I am attempting to set this up without DNS or DHCP in the first instance to trial node deployment (and image management) before we scale out to one of our smaller clusters.
I have been able to define my node, automatically discover its MAC, and assign that MAC to the defined node (I have plugged in a second interface to make sure the shared ILO port is not the issue here). My deployment interface(s) (not defined anywhere...) have IPv6 enabled and link-local (FE80) addresses. I have defined the client node to boot only from HTTP(S) UEFI options.
I see the following via tcpdump of the deployment interface (limited by node MAC) when the node attempts to HTTP boot off that port:
Confluent appears to be listening on the DHCP6 port:
And the node is primed for deployment:
I never see any response from my Confluent head node telling this node where to boot. I assume this is all supposed to happen over Layer 2 IPv6 "magic" based on the sparse confluent docs that exist. What have I missed here?
Edit:
Head Node Details: