Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Equinix Move: Rebuild SmartOS Hosts #3731

Open
6 tasks done
ryanaslett opened this issue May 17, 2024 · 22 comments
Open
6 tasks done

Equinix Move: Rebuild SmartOS Hosts #3731

ryanaslett opened this issue May 17, 2024 · 22 comments

Comments

@ryanaslett
Copy link
Contributor

ryanaslett commented May 17, 2024

This is a sub step of
#3597

The two release and four testing smartOS hosts need to be replaced.
The current ones are running SmartOS 18 and SmartOS 20, which are both EOL.

In discussions with @bahamat It was determined that the likely targets for smartos should change to be SmartOS 21 and SmartOS 23

I plan to Provision the following hosts:

  • release-mnx-smartos21-x64-1
  • release-mnx-smartos23-x64-1
  • test-mnx-smartos21-x64-1
  • test-mnx-smartos21-x64-2
  • test-mnx-smartos23-x64-1
  • test-mnx-smartos23-x64-2

Once those are up I'll connect them to jenkins and we can run some preliminary tests to make sure they work and are accepting jobs.

Theres a non-zero chance that something has broken from smartos20 all the way to smartos23, so somebody with SmartOS/node knowledge is likely going to have to manage anticipated failures.

@targos
Copy link
Member

targos commented May 18, 2024

Are release hosts needed? We stopped building SmartOS a while ago.

@ryanaslett
Copy link
Contributor Author

I've provisioned the smartos test hosts and got ansible to run through end to end. #3737
That being said I don't yet have jenkins admin access to fully add these machines, nor do I really know the next steps to verify that tests can actually run on them.

I've held off on provisioning the release hosts because @targos makes a good point that we probably do not need to provision machines that are not used.

@ryanaslett
Copy link
Contributor Author

Discovered the supporting info that we do not need release nodes for smartos: #2168

@ryanaslett
Copy link
Contributor Author

I have added the four smartos nodes to jenkins, and re-ran ansible with their jenkins secret set, however I do not yet have infra level access so I cannot change the firewall settings to allow for these nodes to connect to jenkins to test and verify that they are functioning properly.

@targos
Copy link
Member

targos commented May 30, 2024

I added the IPs to the firewall rules. The 4 machines are connected to Jenkins now.

@ryanaslett
Copy link
Contributor Author

Great. I added them with the smartos21-64 and smartos23-64 labels, and altered the node-test-commit-smartos job to attempt to run on them. https://ci.nodejs.org/job/node-test-commit-smartos/

Both smartos21 and smartos23 are failing during the make step, but this is where I dont think I can be of much assistance in getting these jobs to successfully build on smartos -> Im unfamiliar with the build/compilation process, and Im unfamiliar with smartos in general.

Example failures:

https://ci.nodejs.org/job/node-test-commit-smartos/nodes=smartos23-64/54688/console
https://ci.nodejs.org/job/node-test-commit-smartos/nodes=smartos21-64/54688/console

Is there a point of contact we can ask in the smartos community to attempt to get these jobs to build?

@targos
Copy link
Member

targos commented May 31, 2024

@nodejs/platform-smartos PTAL.

@richardlau
Copy link
Member

richardlau commented May 31, 2024

Example failures:

https://ci.nodejs.org/job/node-test-commit-smartos/nodes=smartos23-64/54688/console

21:55:02 ../deps/v8/src/base/platform/platform-posix.cc:81:16: error: conflicting declaration of C function 'int madvise(caddr_t, std::size_t, int)'
21:55:02    81 | extern "C" int madvise(caddr_t, size_t, int);
21:55:02       |                ^~~~~~~
21:55:02 In file included from ../deps/v8/src/base/platform/platform-posix.cc:20:
21:55:02 /usr/include/sys/mman.h:268:12: note: previous declaration 'int madvise(void*, std::size_t, int)'
21:55:02   268 | extern int madvise(void *, size_t, int);
21:55:02       |            ^~~~~~~
21:55:02 make[2]: *** [tools/v8_gypfiles/v8_libbase.target.mk:209: /home/iojs/build/workspace/node-test-commit-smartos/nodes/smartos23-64/out/Release/obj.target/v8_libbase/deps/v8/src/base/platform/platform-posix.o] Error 1
21:55:02 rm 7a898a1853eba0e19e78f94553f35c8ae3ef2670.intermediate

looks like #3108 (comment) which was sort of machine configuration issue with the system header files.

@ryanaslett
Copy link
Contributor Author

It appears that smartos updated those header files

illumos/illumos-gate@df5cd01#diff-e7ee1077e2440c045ad16995cdc8ecba3b955d39cab24eaa5f022fe737a234eaL38

So If I understand correctly, v8 is checking whether or not it is compiling on solaris and accounting for atypical header file structure and in the meantime, smartos/illumos has updated the atypical header structure, causing v8 to no longer compile.

This also seems like only the first issue we'll encounter trying to get v8 to compile on modern SmartOS, and this will likely take somebody some dedicated time to get smartOS building again, and that also, this is upstream v8, and not nodejs itself that needs patching.

I'm unclear how to proceed here. I don't imagine that we'll be able to get this resolved before we need to remove the smartos18/smartos20 hosts, so it appears as though smartos tests are going to be broken for the forseeable future.

@bahamat
Copy link

bahamat commented May 31, 2024

@ryanaslett I think we have patches for this. We can get @jperkin to take a look, but he’s currently out on vacation so it won’t be right away.

@jperkin
Copy link

jperkin commented Jun 3, 2024

Yeh, we've been running with this patch for the nodejs pkgsrc builds for the last few years:

NetBSD/pkgsrc@527b29c

The problem is that you can't just update the prototypes, as that would break builds on older platforms, and there's no way to test which is available using the preprocessor.

I'd recommend applying a similar patch to just remove the prototypes (it's unclear why they were added in the first place but they certainly aren't required, at least on modern illumos or Solaris platforms) and probably the posix_madvise() chunk too.

@mhdawson
Copy link
Member

mhdawson commented Jun 5, 2024

@nodejs/build to facilitate this we have:

  1. created this copy of the smartos job - https://ci.nodejs.org/job/node-test-commit-smartos-test-ryan/. It currently pulls additional patches from this PR - Jenkins: Add smartos stub patches and patch script #3753 and runs on smartos 21 and 23

  2. I'm going to give @bahamat the ability to run that job unless there are any objections by Tomorrow. EDIT: also added @jperkin as mentioned below

The path forward would be:

  1. Somebody from the smartos community (@bahamat to identify), works to get the patch to apply. Once the patch applies and the job passes we can update the PR and land.
  2. One the build PR is landed we can move the changes from the test job into the regular job for smartos. They include the lines to pull/apply the patch script, and change to run on smartos 21 and 23 instead of the older versions.
  3. If 2) cannot be completed by June 14 we will need to remove smartos testing from the PR testing job until it is.

@bahamat we'd also need you to test with the current Node.js versions which include 18, 20 and 22. The patch will either need to apply to all of them, or different patches will need to be applied for each version by the script. We can also configure the job to only run on a subset of those versions (for example latest) if that is what makes sense for the smartos community.

@bahamat
Copy link

bahamat commented Jun 6, 2024

@mhdawson The best person to handle that from my side is @jperkin.

@mhdawson
Copy link
Member

mhdawson commented Jun 6, 2024

@bahamat, you and @jperkin should now be able to run that job.

@ryanaslett
Copy link
Contributor Author

@bahamat / @jperkin I've invited you to collaborate on my build fork that contains the smartos patches. You should be able to adjust the patch, and then run the job.

@richardlau
Copy link
Member

I noticed that even the newer SmartOS hosts are running Java 11 (according to https://ci.nodejs.org/computer/). Jenkins LTS is expecting to drop support for Java 11 in October, and will need Java 17 or later.

@bahamat
Copy link

bahamat commented Sep 24, 2024

This is also something that we've been dealing with in our Jenkins as well. OpenJDK is available in zone images 22.4.0 and later. Here's my recommendation:

  • Move build hosts to platform image 20210826T002459Z
  • Move oldest zone image to 22.4.0 (21.4.x will be EOL in 3 months)
  • Move newest zone image to 23.4.0 (or wait until January and move to 24.4.0, once it's available, or do 23.4.0 now and move to 24.4.0 in January)

@sxa
Copy link
Member

sxa commented Oct 8, 2024

@bahamat Can you confirm if 22.4.0 has a build of OpenJDK17 and not just 11? I know there are third party patches to OpenJDK which will allow it to build and run, but the primary OpenJDK codebase no longer supports building JDK11 from it (I'm guessing there's a chance that SmartOS already incorporated such patches to build 11 though ;-) )

@jperkin
Copy link

jperkin commented Oct 8, 2024

@sxa Yes, 22.4 includes openjdk17-17.0.5. 23.4 includes openjdk17-17.0.11 and openjdk21-21.0.3, and the upcoming 24.4 will include whatever the latest -ga tags are of each.

@bahamat
Copy link

bahamat commented Oct 8, 2024

@bahamat Can you confirm if 22.4.0 has a build of OpenJDK17 and not just 11?

Yep. We’re already using it in our Jenkins agent instances.

See here: https://smartos.org/packages/set/2022Q4-x86_64/search/Openjdk

@richardlau
Copy link
Member

There was an ask in nodejs/node#55239 (comment) to use newer Python than 3.8.

@bahamat
Copy link

bahamat commented Oct 9, 2024

The 22.4.0 image has 3.9 through 3.11, so we're good there.

https://smartos.org/packages/set/2022Q4-x86_64/search/python3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants