
OVIS-3.4.6

@narategithub narategithub released this 23 May 15:48

Changes in 3.4.6 since 3.4.4

FUNCTIONAL CHANGES:

  • Added /usr/bin/ldms-static-test.sh and numerous test examples of ldms configuration in /usr/share/doc/ovis-ldms-3.4.6/examples/static-test. See man ldms-static-test. Includes store, sampler, and multilevel aggregation examples.

  • Added dstat sampler for monitoring ldmsd itself. Expected use is to be
    loaded on aggregator and storage ldmsd instances. See Plugin_dstat man page.

  • Added jobid collection support to lustre2_client sampler.

  • Added opa2 sampler to collect Omni-Path HFI interface metrics. See Plugin_opa2 man page.

  • Updated libgenders support for managing ports in init scripts (see man ldms-attributes):
    ldmsd_use_unix_socket
    ldmsd_sockpath
    ldmsd_use_inet_socket
    ldmsd_config_port
    ldmsd_log
    ldmsd_vg
    ldmsd_vgargfile

  • Added filters to trap and warn about common spelling and punctuation errors in genders attributes.

  • Split the build/install of libgenders/boost tool from install of systemd scripts. Systemd scripts can be used without the ldmsctl_args3 tool if the user provides the daemon configuration commands in a named script listed in ldmsd.local.conf.

  • Added missing man pages for samplers ported from LDMS v2: clock, procstat, sysclassib, jobid, lustre2_client, procsensors.

  • New and updated man pages and plugins for Cray samplers aries_linkstatus and aries_mmr.

  • Changed defaults in systemd scripts to allow more open files at aggregators and syslogid.

  • Fixed overzealous failure condition handling in ldms_jobid.

  • Added debug output of registered memory (mmalloc) in use at exit to better bound -m option value needed for ldmsd instances. New mm_stat call in lib/mmalloc supplies the data.

SECURITY CHANGES:

  • Fixed the insecure default ldmsauth file, which shipped with a commonly known secret. The default is now invalid (too short), so a site-specific secret must be supplied.
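
Because the shipped default is now invalid, each site needs its own secret file. The following is a minimal sketch of generating one with owner-only permissions. The `secretword=` key name, the file path, and the secret length are assumptions made for illustration and are not confirmed by these notes; check the LDMS authentication man page for the exact format and length limits.

```python
import os
import secrets


def write_secret_file(path, key="secretword", nbytes=32):
    """Write a random shared secret to `path`, readable by the owner only.

    ASSUMPTION: the `key=value` layout and the default key name are
    illustrative; confirm the real file format in the LDMS documentation.
    """
    secret = secrets.token_hex(nbytes)  # 2 * nbytes hexadecimal characters
    # Create the file with mode 0600 so the secret is never world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(f"{key}={secret}\n")
    # Enforce the mode even if the file already existed with looser bits.
    os.chmod(path, 0o600)
    return secret
```

The same file would then need to be distributed to every ldmsd and client that must authenticate to each other.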

RUNTIME CHANGES/BUG FIXES:

  • Fixed C bugs in store related code:

    • idx_delete
    • notification (memory leak)
    • avl (attribute/value list handling of error conditions)
    • thread locking error in store_csv
  • Fixed C bugs in network transports:

    • rdma connection resource leaks in error handling cases.
  • Fixed C bugs in samplers:

    • jobid minor fixes
    • procnfs sampler now accounts for variations in NFS file layout. The procnfs sampler has never supported NFSv4 metrics and still does not.
    • Reduced repetitive logging of the same transient failure conditions.
    • Updated several samplers to run through transient disappearance of /proc.
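
To illustrate what tolerating layout variations can look like, here is a minimal sketch of a parser for `/proc/net/rpc/nfs`-style text that does not depend on a fixed number of fields per line. It is not the procnfs sampler's C code. It assumes each line is a label followed by integers, and that `procN` lines lead with a count of the values that follow; the function name is hypothetical.

```python
def parse_rpc_nfs(text):
    """Parse /proc/net/rpc/nfs-style text into {label: [int, ...]}.

    Lines with too few fields or non-numeric values are skipped rather
    than treated as fatal, so kernel-to-kernel layout differences do not
    break collection.
    """
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # blank or malformed line
        label, raw = fields[0], fields[1:]
        try:
            values = [int(v) for v in raw]
        except ValueError:
            continue  # unexpected non-numeric field
        if label.startswith("proc"):
            # ASSUMPTION: first number declares how many counters follow.
            declared, values = values[0], values[1:]
            values = values[:declared]  # ignore extras beyond the declared count
        stats[label] = values
    return stats
```

A caller would read the file contents, pass them to `parse_rpc_nfs`, and look up only the labels it knows about, leaving unknown labels (for example `proc4`) unused.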

HOUSEKEEPING CHANGES:

  • Removed LDMS_BUILDTYPE from systemd control scripts (it was preventing relocatability, and is in any case obsolete).

  • Removed most old packaging scripts from the ldms source tree packaging/ directory.

  • Changed install permissions on pedigree script.

  • Updated rpath macro in build (deprecates some old Apple OS versions).

  • Made rpms fully relocatable without forcing the user to manually set ld and zap related environment variables before invocation. This entails wrapping all the sbin/ldms binaries in .ldms-wrapper. Thanks to Cray for assistance with this.

DEVELOPER CHANGES:

  • Updated installed include files and /usr/lib/ovis-[ldms/lib]-configvars.sh so that 3rd party plugins can be built when only the installed ldms binaries and headers are used.

  • Updated .gitignore settings.