
idea: log job and node events directly to journald #6624

Open
grondo opened this issue Feb 11, 2025 · 7 comments

@grondo
Contributor

grondo commented Feb 11, 2025

@morrone stopped by for a chat today and suggested an idea for handling streaming events from Flux, i.e. those currently supported by the JournalConsumer interfaces for the job manager and resource module journals.

Chris proposed sending events directly to the systemd journal, presumably using the native protocol. Consumers could then use journald APIs and commands to grab events without connecting to Flux, and the persistence problem is (presumably) foisted off to journald.
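For concreteness, a minimal sketch of what "using the native protocol" could look like: journald accepts datagrams of `FIELD=value` pairs on `/run/systemd/journal/socket` (values containing newlines use a length-prefixed encoding). The Flux-specific field names below (`FLUX_JOBID`, `FLUX_EVENT`) are purely illustrative, not an established convention.

```python
import socket
import struct

# journald's native-protocol socket (see systemd's journal-native-protocol doc)
JOURNAL_SOCKET = "/run/systemd/journal/socket"

def serialize_fields(fields: dict) -> bytes:
    """Encode fields per the native protocol: simple values as
    NAME=value\\n; values containing a newline as NAME\\n followed by a
    64-bit little-endian length, the raw value, and a trailing \\n."""
    out = bytearray()
    for name, value in fields.items():
        data = value.encode() if isinstance(value, str) else value
        if b"\n" in data:
            out += name.encode() + b"\n"
            out += struct.pack("<Q", len(data))
            out += data + b"\n"
        else:
            out += name.encode() + b"=" + data + b"\n"
    return bytes(out)

def send_event(fields: dict) -> None:
    """Send one event as a single datagram (requires a running journald)."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(serialize_fields(fields), JOURNAL_SOCKET)

# Example: a hypothetical job-manager event mapped to journal fields.
event = {
    "MESSAGE": "job event: submit",
    "PRIORITY": "6",
    "FLUX_JOBID": "f1234",
    "FLUX_EVENT": '{"name": "submit", "timestamp": 1739000000.0}',
}
```

Consumers would then read these back with `journalctl` field filters (e.g. `journalctl FLUX_JOBID=f1234 -o json`) or the `sd_journal` C API, without talking to Flux at all.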

@garlick
Member

garlick commented Feb 12, 2025

A worry is that dumping cluster RAS/job data into the systemd journal on a management node might fill up the limited journal storage allotment with Flux data and push out other useful things. The end goal presumably wouldn't be to do queries on the journal directly anyway, but to just get it out of there and into something scalable and site-specific.

So isn't the Python API we've already developed, which lets sites do their own thing, arguably the better solution?

@grondo
Contributor Author

grondo commented Feb 12, 2025

Your argument does make sense; I don't recall the exact arguments for the journald approach. I had just agreed to open an issue describing the idea.

If the worry is that the backend database may spend significant time down, and thus the Python-based consumers might have to cache large numbers of events to avoid data loss from eventlog truncation or job purges, then nothing is stopping the Python consumer from using the journal as a local store. (Though the same caveat about filling the journal applies.)
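A rough sketch of that hybrid: a Python consumer maps each Flux journal event to journal fields and forwards it to journald via the python-systemd bindings, if available. The field names and event shape are illustrative assumptions, not an established Flux convention.

```python
import json

def event_to_fields(jobid, entry):
    """Map a Flux job journal event (jobid plus an eventlog entry dict)
    to journald fields. Field names here are illustrative only."""
    return {
        "MESSAGE": f"job {jobid}: {entry['name']}",
        "FLUX_JOBID": str(jobid),
        "FLUX_EVENT_NAME": entry["name"],
        # Preserve the full entry as JSON for later reconstruction
        "FLUX_EVENT": json.dumps(entry),
    }

try:
    from systemd import journal  # python-systemd bindings, optional

    def log_event(jobid, entry):
        """Forward one event into the local journal."""
        journal.send(**event_to_fields(jobid, entry))
except ImportError:
    pass  # bindings not installed; consumer would fall back to its own cache
```

A downstream process could then drain the journal into the site database at its own pace, resuming from a saved cursor after downtime.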

@morrone
Contributor

morrone commented Feb 12, 2025

The main thing I had in mind was to offload all of the work of creating reliable, resumable journal semantics to journald. That means less for Flux to implement, and perhaps less custom API, since journald is well known (if journald really offers sufficient semantics to meet our needs).

Journald allows setting individual log limits, does it not? I'm not suggesting buffering things forever.

If Flux is going to make these journals reliable across Flux and node restarts, it needs to use disk space too.

@garlick
Member

garlick commented Feb 12, 2025

AFAIK it only allows an overall limit to be set.

https://www.man7.org/linux/man-pages/man5/journald.conf.5.html

although I'm by no means an expert.

@morrone
Contributor

morrone commented Feb 12, 2025

It looks like they have "namespaces" (see the same man page), so a Flux namespace with its own independent limits could be used. The journalctl command takes a "--namespace" option for reading, but I haven't checked the C API to see what its namespace support looks like.
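For reference, per journald.conf(5) a namespaced journald instance reads its own drop-in configuration file, so per-namespace limits could look something like this (the "flux" namespace name and the values are illustrative):

```ini
# /etc/systemd/journald@flux.conf
# Limits for a hypothetical "flux" journal namespace, independent of
# the main journal's storage allotment.
[Journal]
SystemMaxUse=2G
MaxRetentionSec=1month
```

Reading back would then be `journalctl --namespace=flux ...`; on the C side, `sd_journal_open_namespace()` exists for this.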

@garlick
Member

garlick commented Feb 12, 2025

Seems like that's a systemd 245 feature? (RHEL 8 has systemd 239)

@morrone
Contributor

morrone commented Feb 13, 2025

Seems like that's a systemd 245 feature? (RHEL 8 has systemd 239)

Yeah, that figures.
