
idea: log job and node events directly to journald #6624

Open
grondo opened this issue Feb 11, 2025 · 7 comments

@grondo
Contributor

grondo commented Feb 11, 2025

@morrone stopped by for a chat today and suggested an idea for handling streaming events from Flux, i.e. those currently supported by the JournalConsumer interfaces for the job manager and resource module journals.

Chris proposed sending events directly to the systemd journal, presumably using the native protocol. Consumers could then use journald APIs and commands to grab events without connecting to Flux, and the persistence problem is (presumably) foisted off to journald.
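For concreteness, a minimal sketch of what "using the native protocol" could look like: journald accepts datagrams of `FIELD=value` pairs on `/run/systemd/journal/socket` (values containing newlines use a length-prefixed encoding). The Flux-specific field names below (`FLUX_JOBID`, `FLUX_EVENT`) are purely illustrative, not an established convention.

```python
import socket
import struct

# journald's native-protocol socket (see systemd's journal-native-protocol doc)
JOURNAL_SOCKET = "/run/systemd/journal/socket"

def serialize_fields(fields: dict) -> bytes:
    """Encode fields per the native protocol: simple values as
    NAME=value\\n; values containing a newline as NAME\\n followed by a
    64-bit little-endian length, the raw value, and a trailing \\n."""
    out = bytearray()
    for name, value in fields.items():
        data = value.encode() if isinstance(value, str) else value
        if b"\n" in data:
            out += name.encode() + b"\n"
            out += struct.pack("<Q", len(data))
            out += data + b"\n"
        else:
            out += name.encode() + b"=" + data + b"\n"
    return bytes(out)

def send_event(fields: dict) -> None:
    """Send one event as a single datagram (requires a running journald)."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(serialize_fields(fields), JOURNAL_SOCKET)

# Example: a hypothetical job-manager event mapped to journal fields.
event = {
    "MESSAGE": "job event: submit",
    "PRIORITY": "6",
    "FLUX_JOBID": "f1234",
    "FLUX_EVENT": '{"name": "submit", "timestamp": 1739000000.0}',
}
```

Consumers would then read these back with `journalctl` field filters (e.g. `journalctl FLUX_JOBID=f1234 -o json`) or the `sd_journal` C API, without talking to Flux at all.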

@garlick
Member

garlick commented Feb 12, 2025

A worry is that dumping cluster RAS/job data into the systemd journal on a management node might fill up the limited journal storage allotment with Flux data and push out other useful things. The end goal presumably wouldn't be to do queries on the journal directly anyway, but to just get it out of there and into something scalable and site-specific.

So isn't the Python API we've already developed, which lets sites do their own thing, arguably the better solution?

@grondo
Contributor Author

grondo commented Feb 12, 2025

Your argument does make sense; I don't recall the exact arguments for the journald approach. I had just agreed to open an issue describing the idea.

If the worry is that the backend database may spend significant time down, and thus the Python-based consumers might have to cache large numbers of events to avoid data loss from eventlog truncation or job purges, then nothing is stopping the Python consumer from using the journal as a local store. (Though the same caveat about filling the journal applies.)
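A rough sketch of that hybrid: a Python consumer maps each Flux journal event to journal fields and forwards it to journald via the python-systemd bindings, if available. The field names and event shape are illustrative assumptions, not an established Flux convention.

```python
import json

def event_to_fields(jobid, entry):
    """Map a Flux job journal event (jobid plus an eventlog entry dict)
    to journald fields. Field names here are illustrative only."""
    return {
        "MESSAGE": f"job {jobid}: {entry['name']}",
        "FLUX_JOBID": str(jobid),
        "FLUX_EVENT_NAME": entry["name"],
        # Preserve the full entry as JSON for later reconstruction
        "FLUX_EVENT": json.dumps(entry),
    }

try:
    from systemd import journal  # python-systemd bindings, optional

    def log_event(jobid, entry):
        """Forward one event into the local journal."""
        journal.send(**event_to_fields(jobid, entry))
except ImportError:
    pass  # bindings not installed; consumer would fall back to its own cache
```

A downstream process could then drain the journal into the site database at its own pace, resuming from a saved cursor after downtime.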

@morrone
Contributor

morrone commented Feb 12, 2025

The main thing I had in mind was to offload all of the work of creating reliable, resumable journal semantics to journald. That means less for Flux to implement, and perhaps less custom API, since journald is well known (if journald really offers sufficient semantics to meet our needs).

Journald allows setting individual log limits, does it not? I'm not suggesting buffering things forever.

If Flux is going to make these journals reliable across Flux and node restarts, it needs to use disk space too.

@garlick
Member

garlick commented Feb 12, 2025

AFAIK it only allows an overall limit to be set.

https://www.man7.org/linux/man-pages/man5/journald.conf.5.html

although I'm by no means an expert.

@morrone
Contributor

morrone commented Feb 12, 2025

It looks like they have "namespaces" (see the same man page), so a Flux namespace with its own independent limits could be used. The journalctl command takes a "--namespace" option for reading, but I haven't checked the C API to see what its namespace support looks like.
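For reference, per journald.conf(5) a namespaced journald instance reads its own drop-in configuration file, so per-namespace limits could look something like this (the "flux" namespace name and the values are illustrative):

```ini
# /etc/systemd/journald@flux.conf
# Limits for a hypothetical "flux" journal namespace, independent of
# the main journal's storage allotment.
[Journal]
SystemMaxUse=2G
MaxRetentionSec=1month
```

Reading back would then be `journalctl --namespace=flux ...`; on the C side, `sd_journal_open_namespace()` exists for this.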

@garlick
Member

garlick commented Feb 12, 2025

Seems like that's a systemd 245 feature? (RHEL 8 has systemd 239)

@morrone
Contributor

morrone commented Feb 13, 2025

Seems like that's a systemd 245 feature? (RHEL 8 has systemd 239)

Yeah, that figures.
