Releases: mercury-hpc/mercury

mercury 2.4.0

28 Oct 16:52
v2.4.0

Summary

This new version brings both bug fixes and feature updates to mercury. Notable changes include a new progress mechanism, new initialization parameters for handling multi-recv buffers, and support for the cxi provider with HPE SHS 11.0.

New features

  • [HG]
    • Add HG_Get_input_payload_size()/HG_Get_output_payload_size()
      • Add the ability to query input / output payload sizes
    • Add HG_Diag_dump_counters() to dump diagnostic counters
      • Add rpc_req_recv_active_count and rpc_multi_recv_copy_count counters
    • Add HG_Class_get_counters() to retrieve internal counters
    • Add multi_recv_copy_threshold init parameter
      • Use this new parameter to fall back to memcpy to prevent starvation of multi-recv buffers
    • Add multi_recv_op_max init parameter
      • This allows users to control number of multi-recv buffers posted (libfabric plugin only)
    • Add no_overflow init option to prevent use of overflow buffers
    • Improve multi-recv buffer warning messages
    • Associate handle to HG proc
      • hg_proc_get_handle() can be used to retrieve handle within proc functions
    • Add HG_Event_get_wait_fd() to retrieve internal wait object
    • Add HG_Event_ready() / HG_Event_progress() / HG_Event_trigger() to support wait fd progress model
      • Simplify progress mechanism and remove use of internal timers
      • Always make NA progress when HG_Event_progress() is called
      • Update HG progress to use new NA progress routines
    • Add missing HG_WARN_UNUSED_RESULT to HG calls
    • Switch to using standard types and align with NA
      • Keep some uint8_t instances instead of hg_bool_t for ABI compatibility
    • Add HG_IO_ERROR return code
  • [NA]
    • Bump NA version to v5.0.0
    • Add NA_Poll() and NA_Poll_wait() routines
    • Deprecate NA_Progress() in favor of poll routines
    • Add NA_Context_get_completion_count() to retrieve size of completion queue
    • Update plugins to use new poll and poll_wait callbacks
      • poll_wait plugin callback remains for compatibility
    • Fix documentation of NA_Poll_get_fd()
    • Add missing NA_WARN_UNUSED_RESULT qualifiers
    • Remove deprecated CCI plugin
    • Return last known error when plugin loading fails
    • Add init info version compatibility wrappers
    • Add support for traffic_class init info (only supported by ofi plugin)
    • Add NA_IO_ERROR return code for generic I/O errors
      • Update OFI and UCX plugins to use new code
  • [NA OFI]
    • Support use of cxi provider with SHS 11.0
    • Add support for FI_AV_AUTH_KEY (requires libfabric >= 1.20)
      • Add runtime check for cxi provider version
      • Setting multiple auth keys disables FI_DIRECTED_RECV
      • Separate opening of AV and auth key insertion
      • Parse auth key range when FI_AV_AUTH_KEY is available
      • Encode/decode auth key when serializing addrs
    • Add support for FI_AV_USER_ID
    • Always use FI_SOURCE and FI_SOURCE_ERR when both are supported
      • Clean up handling of FI_SOURCE_ERR
      • Remove support of FI_SOURCE w/o FI_SOURCE_ERR
    • Add support for new CXI address format
    • Attempt to distribute multi-NIC domains based on selected CPU ID
    • Support selection of traffic classes (single class per NA class)
    • Add support for FI_PROTO_CXI_RNR
    • Add NA_OFI_SKIP_DOMAIN_OPS env variable to skip cxi domain ops
    • Remove unused NA_OFI_DOM_SHARED flag
  • [NA UCX]
    • Add ucx log outlet and redirect UCX log
      • Use default HG log level if UCX_LOG_LEVEL is not set
  • [HG/NA perf]
    • Add hg_first perf test to measure cost of initial RPC
    • Add -u option to control number of multi-recv ops (server only)
    • Add -i option to control number of handles posted (server only)
    • Add -f/--hostfile option to select hostfile to write to / read from
    • Add -T/--tclass option to select traffic class
    • Autodetect MPI implementation in perf utilities
      • MPI can now be autodetected and dynamically loaded in utilities, even if MERCURY_TESTING_ENABLE_PARALLEL was turned off. If MERCURY_TESTING_ENABLE_PARALLEL is turned on, tests remain manually linked against MPI as they used to be.
    • Print registration and deregistration times when -R option is used
    • Update to use new HG/NA progress routines and remove use of hg_request
    • Support forced registration in hg_bw_read/hg_bw_write
  • [HG Util]
    • Add hg_log_vwrite() to write log from va_list
    • Add hg_log_level_to_string()
    • Clean up mercury_event code and add const qualifier to hg_poll_get_fd()
    • Add const on atomic gets
    • Switch to using sys/queue.h directly
    • Remove HG_QUEUE and HG_LIST definitions
    • Add hg_dl_error() to return last error
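
The wait fd progress model listed under [HG] above can be pictured as follows: the application blocks in its own poll() loop on a file descriptor, then makes progress and triggers callbacks once that descriptor becomes readable. The sketch below only illustrates that pattern under stated assumptions. It uses a plain pipe as the wait object, and every demo_* name is a hypothetical stand-in rather than part of the Mercury API; refer to mercury.h for the actual HG_Event_*() signatures.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical stand-in for a context; NOT a Mercury type. */
struct demo_ctx {
    int wait_fd;            /* read end of a pipe, plays the role of the wait object */
    int notify_fd;          /* write end, used to simulate incoming events */
    unsigned int completed; /* events progressed whose callbacks have not run yet */
    unsigned int triggered; /* callbacks run so far */
};

/* Create the pipe that backs the wait object. Returns 0 on success. */
int demo_init(struct demo_ctx *ctx)
{
    int fds[2];

    assert(ctx != NULL);
    if (pipe(fds) != 0)
        return -1;
    ctx->wait_fd = fds[0];
    ctx->notify_fd = fds[1];
    ctx->completed = 0;
    ctx->triggered = 0;
    return 0;
}

/* Simulate the arrival of one event by making the wait fd readable. */
int demo_post(struct demo_ctx *ctx)
{
    char token = 1;

    return (write(ctx->notify_fd, &token, 1) == 1) ? 0 : -1;
}

/* "Progress": drain the wait fd without blocking and queue completions. */
unsigned int demo_progress(struct demo_ctx *ctx)
{
    struct pollfd pfd = {.fd = ctx->wait_fd, .events = POLLIN};
    unsigned int count = 0;
    char buf[64];

    while (poll(&pfd, 1, 0) > 0 && (pfd.revents & POLLIN)) {
        ssize_t n = read(ctx->wait_fd, buf, sizeof(buf));
        if (n <= 0)
            break;
        count += (unsigned int) n;
    }
    ctx->completed += count;
    return count;
}

/* "Trigger": run at most max_count queued callbacks (here they are only counted). */
unsigned int demo_trigger(struct demo_ctx *ctx, unsigned int max_count)
{
    unsigned int n = (ctx->completed < max_count) ? ctx->completed : max_count;

    ctx->completed -= n;
    ctx->triggered += n;
    return n;
}

/* One loop iteration: block on the wait fd, then progress and trigger.
 * Returns the number of callbacks triggered, 0 on timeout, -1 on error. */
int demo_wait_and_run(struct demo_ctx *ctx, int timeout_ms)
{
    struct pollfd pfd = {.fd = ctx->wait_fd, .events = POLLIN};
    int rc = poll(&pfd, 1, timeout_ms);

    if (rc < 0)
        return -1;
    if (rc == 0)
        return 0;
    demo_progress(ctx);
    return (int) demo_trigger(ctx, 16);
}

/* Release both ends of the pipe. */
void demo_fini(struct demo_ctx *ctx)
{
    close(ctx->wait_fd);
    close(ctx->notify_fd);
}
```

Because the blocking call is an ordinary poll(), the same descriptor could also be registered in an existing application event loop next to sockets or timers, which is the main appeal of exposing a wait fd.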

Bug fixes

  • [HG]
    • Fix shared-memory path that was previously disabled in conjunction with libfabric transports that use the multi-recv capability
    • Fix behavior of request_post_incr init parameter
      • request_post_incr cannot be disabled (set to -1) with multi-recv
  • [HG/NA]
    • HG NA init info is fixed to v4.0 for now and duplicates tclass info
  • [NA]
    • Fix missing free of dynamic plugin entries
  • [NA BMI/MPI]
    • Return actual msg size through cb info
  • [NA OFI]
    • Fix cxi domain ops settings and disable PROV_KEY_CACHE
    • Fix shm provider flags
    • Remove excessive MR count warning message
  • [NA UCX]
    • Fix hg_info not filtering protocol
      • Allow na_ucx_get_protocol_info() to resolve ucx tl name aliases
    • Fix context thread mode to default to UCS_THREAD_MODE_MULTI
  • [HG/NA Perf]
    • Ensure NA perf tests wait on send completion
    • Fix bulk permission flag in hg_bw_read
    • Add some missing error checks in mercury_perf
  • [HG util]
    • Multiple logging fixes:
      • Fix dlog_free not called when parent/child have separate dlogs
      • Fix mercury log to correctly generate outlet names
      • Fix log outlets to use prefixed subsys name
      • Fix use of macros in debug log
      • Use destructor to free log outlets
    • Add missing prototype to hg_atomic_fence() definition
  • [CMake]
    • Fix cmake_minimum_required() warning
    • Update kwsys and mchecksum dependencies

⚠️ Known Issues

  • [NA OFI]
    • [tcp/verbs;ofi_rxm] Using more than 256 peers requires FI_UNIVERSE_SIZE to be set.
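
For the known issue above, one way for an application to make sure FI_UNIVERSE_SIZE is set is to export it programmatically before the communication library (and therefore libfabric) is initialized. This is a minimal sketch, assuming the variable is read at initialization time; ensure_universe_size() is a hypothetical helper name and is not part of Mercury or libfabric.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: export FI_UNIVERSE_SIZE unless the user already set it.
 * Must run before the transport is initialized. Returns 0 on success. */
int ensure_universe_size(unsigned int expected_peers)
{
    char value[16];
    int n;

    assert(expected_peers > 0);
    n = snprintf(value, sizeof(value), "%u", expected_peers);
    if (n < 0 || (size_t) n >= sizeof(value))
        return -1;

    /* overwrite = 0 keeps any value that was exported in the environment */
    return setenv("FI_UNIVERSE_SIZE", value, 0);
}
```

Setting the variable from the job script (for example with export FI_UNIVERSE_SIZE=512) has the same effect and avoids touching application code.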

mercury 2.4.0rc5

26 Aug 22:01
v2.4.0rc5
Pre-release

Summary

This is a preview release of the v2.4.0 release.

New features

Added in rc5

  • [HG]
    • Add HG_Get_input_payload_size()/HG_Get_output_payload_size()
      • Add the ability to query input / output payload sizes
    • Add HG_Diag_dump_counters() to dump diagnostic counters
      • Add rpc_req_recv_active_count and rpc_multi_recv_copy_count counters
    • Add HG_Class_get_counters() to retrieve internal counters

Added in rc4

  • [HG]
    • Add multi_recv_copy_threshold init parameter
      • Use this new parameter to fall back to memcpy to prevent starvation of multi-recv buffers
    • Associate handle to HG proc
      • hg_proc_get_handle() can be used to retrieve handle within proc functions

Added in rc3

  • [HG]
    • Add multi_recv_op_max init parameter
      • This allows users to control number of multi-recv buffers posted (libfabric plugin only)
    • Add no_overflow init option to prevent use of overflow buffers
    • Improve multi-recv buffer warning messages
    • Add HG_Event_get_wait_fd() to retrieve internal wait object
    • Add HG_Event_ready() / HG_Event_progress() / HG_Event_trigger() to support wait fd progress model
      • Simplify progress mechanism and remove use of internal timers
      • Always make NA progress when HG_Event_progress() is called
      • Update HG progress to use new NA progress routines
    • Add missing HG_WARN_UNUSED_RESULT to HG calls
    • Switch to using standard types and align with NA
      • Keep some uint8_t instances instead of hg_bool_t for ABI compatibility
  • [NA]
    • Add NA_Poll() and NA_Poll_wait() routines
    • Deprecate NA_Progress() in favor of poll routines
    • Add NA_Context_get_completion_count() to retrieve size of completion queue
    • Update plugins to use new poll and poll_wait callbacks
      • poll_wait plugin callback remains for compatibility
    • Fix documentation of NA_Poll_get_fd()
    • Add missing NA_WARN_UNUSED_RESULT qualifiers
    • Bump NA version to 5.0.0
    • Remove deprecated CCI plugin
    • Return last known error when plugin loading fails
  • [NA OFI]
    • Remove unused NA_OFI_DOM_SHARED flag
    • Always use FI_SOURCE and FI_SOURCE_ERR when both are supported
  • [NA UCX]
    • Add ucx log outlet and redirect UCX log
      • Use default HG log level if UCX_LOG_LEVEL is not set
  • [HG Util]
    • Add hg_log_vwrite() to write log from va_list
    • Add hg_log_level_to_string()
    • Clean up mercury_event code and add const qualifier to hg_poll_get_fd()
    • Add const on atomic gets
    • Switch to using sys/queue.h directly
    • Remove HG_QUEUE and HG_LIST definitions
    • Add hg_dl_error() to return last error
  • [HG/NA Perf Test]
    • Add -u option to control number of multi-recv ops (server only)
    • Add -i option to control number of handles posted (server only)
    • Update to use new HG/NA progress routines and remove use of hg_request

Added in rc2

  • [NA OFI]
    • Add support for FI_AV_AUTH_KEY (requires libfabric >= 1.20)
      • Add runtime check for cxi provider version
      • Setting multiple auth keys disables FI_DIRECTED_RECV
      • Separate opening of AV and auth key insertion
      • Parse auth key range when FI_AV_AUTH_KEY is available
      • Encode/decode auth key when serializing addrs
    • Add support for FI_AV_USER_ID
    • Clean up handling of FI_SOURCE_ERR
    • Remove support of FI_SOURCE w/o FI_SOURCE_ERR
    • Add support for new CXI address format

Added in rc1

  • [NA]
    • Add init info version compatibility wrappers
    • Bump NA version to v4.1.0
    • Add support for traffic_class init info (only supported by ofi plugin)
  • [NA OFI]
    • Attempt to distribute multi-NIC domains based on selected CPU ID
    • Support selection of traffic classes (single class per NA class)
  • [HG/NA Perf Test]
    • Add -f/--hostfile option to select hostfile to write to / read from
    • Add -T/--tclass option to select traffic class
    • Autodetect MPI implementation in perf utilities
      • MPI can now be autodetected and dynamically loaded in utilities, even if MERCURY_TESTING_ENABLE_PARALLEL was turned off. If MERCURY_TESTING_ENABLE_PARALLEL is turned on, tests remain manually linked against MPI as they used to be.

Bug fixes

Added in rc5

  • [HG]
    • Make HG_Core_event_ready() non-inline to fix NA dependency and remove HG_Core_event_ready_loopback() from public API
    • Fix NA init info not correctly set from HG
  • [NA BMI/MPI]
    • Return actual msg size through cb info

Added in rc4

  • [HG]
    • Fix a couple of type changes introduced in rc1 that could have broken ABI
    • Fix shared-memory path that was previously disabled in conjunction with libfabric transports that use the multi-recv capability
  • [HG util]
    • Fix dlog_free not called when parent/child have separate dlogs
  • [HG/NA]
    • Fix init info changes made in previous rcs to prevent ABI breakage
    • HG NA init info is fixed to v4.0 for now and duplicates tclass info

Added in rc3

  • [HG]
    • Fix behavior of request_post_incr init parameter
      • request_post_incr cannot be disabled (set to -1) with multi-recv
  • [HG Util]
    • Fix mercury log to correctly generate outlet names
    • Fix log outlets to use prefixed subsys name
    • Fix use of macros in debug log
  • [CMake]
    • Fix cmake_minimum_required() warning
    • Update kwsys and mchecksum dependencies

Added in rc2

  • [HG Util]
    • Use destructor to free log outlets
  • [NA]
    • Fix missing free of dynamic plugin entries
  • [NA UCX]
    • Fix hg_info not filtering protocol
      • Allow na_ucx_get_protocol_info() to resolve ucx tl name aliases
  • [NA OFI]
    • Fix shm provider flags
  • [NA Test]
    • Remove "could not find MPI" message

Added in rc1

  • [HG Util]
    • Add missing prototype to hg_atomic_fence() definition
  • [NA OFI]
    • Remove excessive MR count warning message
  • [NA Perf]
    • Ensure perf tests wait on send completion

⚠️ Known Issues

  • [NA OFI]
    • [tcp/verbs;ofi_rxm] Using more than 256 peers requires FI_UNIVERSE_SIZE to be set.

mercury 2.4.0rc4

02 Aug 22:27
v2.4.0rc4
Pre-release

Summary

This is a preview release of the v2.4.0 release.

New features

Added in rc4

  • [HG]
    • Add multi_recv_copy_threshold init parameter
      • Use this new parameter to fall back to memcpy to prevent starvation of multi-recv buffers
    • Associate handle to HG proc
      • hg_proc_get_handle() can be used to retrieve handle within proc functions

Added in rc3

  • [HG]
    • Add multi_recv_op_max init parameter
      • This allows users to control number of multi-recv buffers posted (libfabric plugin only)
    • Add no_overflow init option to prevent use of overflow buffers
    • Improve multi-recv buffer warning messages
    • Add HG_Event_get_wait_fd() to retrieve internal wait object
    • Add HG_Event_ready() / HG_Event_progress() / HG_Event_trigger() to support wait fd progress model
      • Simplify progress mechanism and remove use of internal timers
      • Always make NA progress when HG_Event_progress() is called
      • Update HG progress to use new NA progress routines
    • Add missing HG_WARN_UNUSED_RESULT to HG calls
    • Switch to using standard types and align with NA
      • Keep some uint8_t instances instead of hg_bool_t for ABI compatibility
  • [NA]
    • Add NA_Poll() and NA_Poll_wait() routines
    • Deprecate NA_Progress() in favor of poll routines
    • Add NA_Context_get_completion_count() to retrieve size of completion queue
    • Update plugins to use new poll and poll_wait callbacks
      • poll_wait plugin callback remains for compatibility
    • Fix documentation of NA_Poll_get_fd()
    • Add missing NA_WARN_UNUSED_RESULT qualifiers
    • Bump NA version to 5.0.0
    • Remove deprecated CCI plugin
    • Return last known error when plugin loading fails
  • [NA OFI]
    • Remove unused NA_OFI_DOM_SHARED flag
    • Always use FI_SOURCE and FI_SOURCE_ERR when both are supported
  • [NA UCX]
    • Add ucx log outlet and redirect UCX log
      • Use default HG log level if UCX_LOG_LEVEL is not set
  • [HG Util]
    • Add hg_log_vwrite() to write log from va_list
    • Add hg_log_level_to_string()
    • Clean up mercury_event code and add const qualifier to hg_poll_get_fd()
    • Add const on atomic gets
    • Switch to using sys/queue.h directly
    • Remove HG_QUEUE and HG_LIST definitions
    • Add hg_dl_error() to return last error
  • [HG/NA Perf Test]
    • Add -u option to control number of multi-recv ops (server only)
    • Add -i option to control number of handles posted (server only)
    • Update to use new HG/NA progress routines and remove use of hg_request

Added in rc2

  • [NA OFI]
    • Add support for FI_AV_AUTH_KEY (requires libfabric >= 1.20)
      • Add runtime check for cxi provider version
      • Setting multiple auth keys disables FI_DIRECTED_RECV
      • Separate opening of AV and auth key insertion
      • Parse auth key range when FI_AV_AUTH_KEY is available
      • Encode/decode auth key when serializing addrs
    • Add support for FI_AV_USER_ID
    • Clean up handling of FI_SOURCE_ERR
    • Remove support of FI_SOURCE w/o FI_SOURCE_ERR
    • Add support for new CXI address format

Added in rc1

  • [NA]
    • Add init info version compatibility wrappers
    • Bump NA version to v4.1.0
    • Add support for traffic_class init info (only supported by ofi plugin)
  • [NA OFI]
    • Attempt to distribute multi-NIC domains based on selected CPU ID
    • Support selection of traffic classes (single class per NA class)
  • [HG/NA Perf Test]
    • Add -f/--hostfile option to select hostfile to write to / read from
    • Add -T/--tclass option to select traffic class
    • Autodetect MPI implementation in perf utilities
      • MPI can now be autodetected and dynamically loaded in utilities, even if MERCURY_TESTING_ENABLE_PARALLEL was turned off. If MERCURY_TESTING_ENABLE_PARALLEL is turned on, tests remain manually linked against MPI as they used to be.

Bug fixes

Added in rc4

  • [HG]
    • Fix a couple of type changes introduced in rc1 that could have broken ABI
    • Fix shared-memory path that was previously disabled in conjunction with libfabric transports that use the multi-recv capability
  • [HG util]
    • Fix dlog_free not called when parent/child have separate dlogs
  • [HG/NA]
    • Fix init info changes made in previous rcs to prevent ABI breakage
    • HG NA init info is fixed to v4.0 for now and duplicates tclass info

Added in rc3

  • [HG]
    • Fix behavior of request_post_incr init parameter
      • request_post_incr cannot be disabled (set to -1) with multi-recv
  • [HG Util]
    • Fix mercury log to correctly generate outlet names
    • Fix log outlets to use prefixed subsys name
    • Fix use of macros in debug log
  • [CMake]
    • Fix cmake_minimum_required() warning
    • Update kwsys and mchecksum dependencies

Added in rc2

  • [HG Util]
    • Use destructor to free log outlets
  • [NA]
    • Fix missing free of dynamic plugin entries
  • [NA UCX]
    • Fix hg_info not filtering protocol
      • Allow na_ucx_get_protocol_info() to resolve ucx tl name aliases
  • [NA OFI]
    • Fix shm provider flags
  • [NA Test]
    • Remove "could not find MPI" message

Added in rc1

  • [HG Util]
    • Add missing prototype to hg_atomic_fence() definition
  • [NA OFI]
    • Remove excessive MR count warning message
  • [NA Perf]
    • Ensure perf tests wait on send completion

⚠️ Known Issues

  • [NA OFI]
    • [tcp/verbs;ofi_rxm] Using more than 256 peers requires FI_UNIVERSE_SIZE to be set.

mercury 2.4.0rc3

25 Jun 23:28
v2.4.0rc3
Pre-release

Summary

This is a preview release of the v2.4.0 release.

New features

Added in rc3

  • [HG]
    • Add multi_recv_op_max init parameter
      • This allows users to control number of multi-recv buffers posted (libfabric plugin only)
    • Add no_overflow init option to prevent use of overflow buffers
    • Improve multi-recv buffer warning messages
    • Add HG_Event_get_wait_fd() to retrieve internal wait object
    • Add HG_Event_ready() / HG_Event_progress() / HG_Event_trigger() to support wait fd progress model
      • Simplify progress mechanism and remove use of internal timers
      • Always make NA progress when HG_Event_progress() is called
      • Update HG progress to use new NA progress routines
    • Add missing HG_WARN_UNUSED_RESULT to HG calls
    • Switch to using standard types and align with NA
      • Keep some uint8_t instances instead of hg_bool_t for ABI compatibility
  • [NA]
    • Add NA_Poll() and NA_Poll_wait() routines
    • Deprecate NA_Progress() in favor of poll routines
    • Add NA_Context_get_completion_count() to retrieve size of completion queue
    • Update plugins to use new poll and poll_wait callbacks
      • poll_wait plugin callback remains for compatibility
    • Fix documentation of NA_Poll_get_fd()
    • Add missing NA_WARN_UNUSED_RESULT qualifiers
    • Bump NA version to 5.0.0
    • Remove deprecated CCI plugin
    • Return last known error when plugin loading fails
  • [NA OFI]
    • Remove unused NA_OFI_DOM_SHARED flag
    • Always use FI_SOURCE and FI_SOURCE_ERR when both are supported
  • [NA UCX]
    • Add ucx log outlet and redirect UCX log
      • Use default HG log level if UCX_LOG_LEVEL is not set
  • [HG Util]
    • Add hg_log_vwrite() to write log from va_list
    • Add hg_log_level_to_string()
    • Clean up mercury_event code and add const qualifier to hg_poll_get_fd()
    • Add const on atomic gets
    • Switch to using sys/queue.h directly
    • Remove HG_QUEUE and HG_LIST definitions
    • Add hg_dl_error() to return last error
  • [HG/NA Perf Test]
    • Add -u option to control number of multi-recv ops (server only)
    • Add -i option to control number of handles posted (server only)
    • Update to use new HG/NA progress routines and remove use of hg_request
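
With HG_QUEUE and HG_LIST removed from [HG Util], code that relied on those definitions can use the <sys/queue.h> macros directly. The sketch below shows a small FIFO built on STAILQ; which macro family best replaces a given HG_QUEUE or HG_LIST use is an assumption to check against the code being ported, and the job type and job_* helpers are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical element type with an embedded singly-linked tail queue entry. */
struct job {
    int id;
    STAILQ_ENTRY(job) entry;
};

/* Declares struct job_queue as the head type for a queue of struct job. */
STAILQ_HEAD(job_queue, job);

/* Append a job with the given id. Returns 0 on success, -1 on allocation failure. */
int job_push(struct job_queue *q, int id)
{
    struct job *j;

    assert(q != NULL);
    j = malloc(sizeof(*j));
    if (j == NULL)
        return -1;
    j->id = id;
    STAILQ_INSERT_TAIL(q, j, entry);
    return 0;
}

/* Remove the oldest job and return its id, or -1 if the queue is empty. */
int job_pop(struct job_queue *q)
{
    struct job *j;
    int id;

    assert(q != NULL);
    j = STAILQ_FIRST(q);
    if (j == NULL)
        return -1;
    id = j->id;
    STAILQ_REMOVE_HEAD(q, entry);
    free(j);
    return id;
}
```

A head can be initialized statically with STAILQ_HEAD_INITIALIZER(head) or at run time with STAILQ_INIT(&head).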

Added in rc2

  • [NA OFI]
    • Add support for FI_AV_AUTH_KEY (requires libfabric >= 1.20)
      • Add runtime check for cxi provider version
      • Setting multiple auth keys disables FI_DIRECTED_RECV
      • Separate opening of AV and auth key insertion
      • Parse auth key range when FI_AV_AUTH_KEY is available
      • Encode/decode auth key when serializing addrs
    • Add support for FI_AV_USER_ID
    • Clean up handling of FI_SOURCE_ERR
    • Remove support of FI_SOURCE w/o FI_SOURCE_ERR
    • Add support for new CXI address format

Added in rc1

  • [NA]
    • Add init info version compatibility wrappers
    • Bump NA version to v4.1.0
    • Add support for traffic_class init info (only supported by ofi plugin)
  • [NA OFI]
    • Attempt to distribute multi-NIC domains based on selected CPU ID
    • Support selection of traffic classes (single class per NA class)
  • [HG/NA Perf Test]
    • Add -f/--hostfile option to select hostfile to write to / read from
    • Add -T/--tclass option to select traffic class
    • Autodetect MPI implementation in perf utilities
      • MPI can now be autodetected and dynamically loaded in utilities, even if MERCURY_TESTING_ENABLE_PARALLEL was turned off. If MERCURY_TESTING_ENABLE_PARALLEL is turned on, tests remain manually linked against MPI as they used to be.

Bug fixes

Added in rc3

  • [HG]
    • Fix behavior of request_post_incr init parameter
      • request_post_incr cannot be disabled (set to -1) with multi-recv
  • [HG Util]
    • Fix mercury log to correctly generate outlet names
    • Fix log outlets to use prefixed subsys name
    • Fix use of macros in debug log
  • [CMake]
    • Fix cmake_minimum_required() warning
    • Update kwsys and mchecksum dependencies

Added in rc2

  • [HG Util]
    • Use destructor to free log outlets
  • [NA]
    • Fix missing free of dynamic plugin entries
  • [NA UCX]
    • Fix hg_info not filtering protocol
      • Allow na_ucx_get_protocol_info() to resolve ucx tl name aliases
  • [NA OFI]
    • Fix shm provider flags
  • [NA Test]
    • Remove "could not find MPI" message

Added in rc1

  • [HG Util]
    • Add missing prototype to hg_atomic_fence() definition
  • [NA OFI]
    • Remove excessive MR count warning message
  • [NA Perf]
    • Ensure perf tests wait on send completion

⚠️ Known Issues

  • [NA OFI]
    • [tcp/verbs;ofi_rxm] Using more than 256 peers requires FI_UNIVERSE_SIZE to be set.

mercury 2.4.0rc2

07 May 20:10
v2.4.0rc2
Pre-release

Summary

This is a preview release of the v2.4.0 release.

New features

Added in rc2

  • [NA OFI]
    • Add support for FI_AV_AUTH_KEY (requires libfabric >= 1.20)
      • Add runtime check for cxi provider version
      • Setting multiple auth keys disables FI_DIRECTED_RECV
      • Separate opening of AV and auth key insertion
      • Parse auth key range when FI_AV_AUTH_KEY is available
      • Encode/decode auth key when serializing addrs
    • Add support for FI_AV_USER_ID
    • Clean up handling of FI_SOURCE_ERR
    • Remove support of FI_SOURCE w/o FI_SOURCE_ERR
    • Add support for new CXI address format

Added in rc1

  • [NA]
    • Add init info version compatibility wrappers
    • Bump NA version to v4.1.0
    • Add support for traffic_class init info (only supported by ofi plugin)
  • [HG/NA Perf Test]
    • Add -f/--hostfile option to select hostfile to write to / read from
    • Add -T/--tclass option to select traffic class
    • Autodetect MPI implementation in perf utilities
      • MPI can now be autodetected and dynamically loaded in utilities, even if MERCURY_TESTING_ENABLE_PARALLEL was turned off. If MERCURY_TESTING_ENABLE_PARALLEL is turned on, tests remain manually linked against MPI as they used to be.
  • [NA OFI]
    • Attempt to distribute multi-NIC domains based on selected CPU ID
    • Support selection of traffic classes (single class per NA class)

Bug fixes

Added in rc2

  • [HG Util]
    • Use destructor to free log outlets
  • [NA]
    • Fix missing free of dynamic plugin entries
  • [NA UCX]
    • Fix hg_info not filtering protocol
      • Allow na_ucx_get_protocol_info() to resolve ucx tl name aliases
  • [NA OFI]
    • Fix shm provider flags
  • [NA Test]
    • Remove "could not find MPI" message

Added in rc1

  • [HG Util]
    • Add missing prototype to hg_atomic_fence() definition
  • [NA OFI]
    • Remove excessive MR count warning message
  • [NA Perf]
    • Ensure perf tests wait on send completion

⚠️ Known Issues

  • [NA OFI]
    • [tcp/verbs;ofi_rxm] Using more than 256 peers requires FI_UNIVERSE_SIZE to be set.

mercury 2.4.0rc1

20 Dec 21:22
v2.4.0rc1
Pre-release

Summary

This is a preview release of the v2.4.0 release.

New features

  • [NA]
    • Add init info version compatibility wrappers
    • Bump NA version to v4.1.0
    • Add support for traffic_class init info (only supported by ofi plugin)
  • [HG/NA Perf Test]
    • Add -f/--hostfile option to select hostfile to write to / read from
    • Add -T/--tclass option to select traffic class
    • Autodetect MPI implementation in perf utilities
      • MPI can now be autodetected and dynamically loaded in utilities, even if MERCURY_TESTING_ENABLE_PARALLEL was turned off. If MERCURY_TESTING_ENABLE_PARALLEL is turned on, tests remain manually linked against MPI as they used to be.
  • [NA OFI]
    • Attempt to distribute multi-NIC domains based on selected CPU ID
    • Support selection of traffic classes (single class per NA class)

Bug fixes

  • [HG Util]
    • Add missing prototype to hg_atomic_fence() definition
  • [NA OFI]
    • Remove excessive MR count warning message
  • [NA Perf]
    • Ensure perf tests wait on send completion

⚠️ Known Issues

  • [NA OFI]
    • [tcp/verbs;ofi_rxm] Using more than 256 peers requires FI_UNIVERSE_SIZE to be set.

mercury 2.3.1

26 Oct 23:54
v2.3.1

Summary

This version brings bug fixes and updates to our v2.3.0 release.

New features

  • [HG info]
    • Add support for CSV and JSON output formats
  • [HG/NA Perf Test]
    • Enable sizes to be passed using k/m/g qualifiers
  • [NA OFI]
    • Add tcp_rxm alias for tcp;ofi_rxm
    • Find CXI svc_id or vni if auth_key components have zeros (e.g., auth_key=0:0)
      • Add VNI index for SLINGSHOT_VNIS discovery as extra auth_key parameter
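
The k/m/g size qualifiers accepted by the perf tests can be illustrated with a small parser. This is a sketch only, not the code used by the perf utilities: parse_size() is a hypothetical helper, and it assumes binary multiples (k = 1024, m = 1024^2, g = 1024^3) and case-insensitive suffixes, neither of which is stated in these notes.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parse "<digits>[k|m|g]" into a byte count.
 * Returns 0 on success, -1 on malformed input or overflow. */
int parse_size(const char *str, uint64_t *out)
{
    char *end = NULL;
    unsigned long long value;
    uint64_t mult = 1;

    assert(str != NULL && out != NULL);

    /* Reject empty strings, signs and leading whitespace up front */
    if (!isdigit((unsigned char) str[0]))
        return -1;

    errno = 0;
    value = strtoull(str, &end, 10);
    if (errno != 0)
        return -1;

    /* Optional single-letter qualifier (assumed to be a binary multiple) */
    switch (tolower((unsigned char) *end)) {
        case 'k':
            mult = UINT64_C(1) << 10;
            end++;
            break;
        case 'm':
            mult = UINT64_C(1) << 20;
            end++;
            break;
        case 'g':
            mult = UINT64_C(1) << 30;
            end++;
            break;
        case '\0':
            break;
        default:
            return -1;
    }

    /* Nothing may follow the qualifier, and the product must fit in 64 bits */
    if (*end != '\0' || value > UINT64_MAX / mult)
        return -1;

    *out = (uint64_t) value * mult;
    return 0;
}
```

Under these assumptions "4k" parses to 4096 bytes and "2M" to 2097152 bytes, while inputs such as "4kb" or "k" are rejected.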

Bug fixes

  • [HG/NA]
    • Fix potential race when checking secondary completion queue
  • [HG]
    • Prevent multiple threads from entering HG_Core_progress()
      • Add HG_ALLOW_MULTI_PROGRESS CMake option to control behavior (ON by default)
      • Disable NA_HAS_MULTI_PROGRESS if HG_ALLOW_MULTI_PROGRESS is ON
    • Fix expected operation count for handle to be atomic
      • Expected operation count can change if extra RPC payload must be transferred
    • Let poll events remain private to HG poll wait
      • Prevent a race when multiple threads call progress and HG_ALLOW_MULTI_PROGRESS is OFF
    • Separate internal list from user created list of handles
      • Address an issue where HG_Context_unpost() would unnecessarily wait
  • [HG Core]
    • Cache disabled response info in proc info
    • Add HG_Core_registered_disable(d)_response() routines
    • Refactor and optimize self RPC code path
    • Add additional logging of refcount/expected op count
    • Fixes for self RPCs with no response
  • [HG Util]
    • Prevent locking in hg_request_wait()
      • Concurrent progress in multi-threaded scenarios on the same context could complete another thread's request and leave a thread blocked in progress
  • [HG Perf]
    • Fix tests to be run in parallel with any communicator size
  • [HG Test]
    • Ensure affinity of class thread is set
    • Add concurrent multi RPC test
    • Add multi-progress test
    • Add multi-progress test with handle creation
    • Refactoring of unit test cleanup
  • [NA]
    • Fix memory leak on NA_Get_protocol_info()
  • [NA OFI]
    • Fix na_ofi_get_protocol_info() not returning opx protocol
      • Refactor na_ofi_getinfo() to account for NA_OFI_PROV_NULL type
      • Ensure there are no duplicated entries
    • Refactor parsing of init info strings and fix OPX parsing
    • Simplify parsing of some address strings
    • Bump default CQ size to have a maximum depth of 128k entries
    • Remove sockets as the only provider on macOS
    • Remove send-after-send tagged msg ordering
    • Ensure that rx_ctx_bits are not set if SEP is not used
    • Set CXI domain ops w/ slingshot 2.2 to prevent potential memory corruption
  • [NA Perf]
    • Prevent tests from being run as parallel tests
  • [CMake]
    • Pass INSTALL_NAME_DIR through target properties
      • This fixes an issue seen on macOS where libraries would not be found using @rpath

⚠️ Known Issues

  • [NA OFI]
    • [tcp/verbs;ofi_rxm] Using more than 256 peers requires FI_UNIVERSE_SIZE to be set.

mercury 2.3.1rc1

29 Aug 15:18
v2.3.1rc1
Pre-release

Summary

This version brings bug fixes and updates to our v2.3.0 release.

New features

  • [HG/NA Perf Test]
    • Enable sizes to be passed using k/m/g qualifiers

Bug fixes

  • [HG/NA]
    • Fix potential race when checking secondary completion queue
  • [HG]
    • Prevent multiple threads from entering HG_Core_progress()
      • Add HG_ALLOW_MULTI_PROGRESS CMake option to control behavior (ON by default)
      • Disable NA_HAS_MULTI_PROGRESS if HG_ALLOW_MULTI_PROGRESS is ON
    • Fix expected operation count for handle to be atomic
      • Expected operation count can change if extra RPC payload must be transferred
    • Let poll events remain private to HG poll wait
      • Prevent a race when multiple threads call progress and HG_ALLOW_MULTI_PROGRESS is OFF
    • Separate internal list from user created list of handles
      • Address an issue where HG_Context_unpost() would unnecessarily wait
  • [HG Test]
    • Ensure affinity of class thread is set
  • [NA OFI]
    • Fix na_ofi_get_protocol_info() not returning opx protocol
      • Refactor na_ofi_getinfo() to account for NA_OFI_PROV_NULL type
      • Ensure there are no duplicated entries
    • Refactor parsing of init info strings and fix OPX parsing
    • Simplify parsing of some address strings
    • Bump default CQ size to have a maximum depth of 128k entries
    • Remove sockets as the only provider on macOS
  • [CMake]
    • Pass INSTALL_NAME_DIR through target properties
      • This fixes an issue seen on macOS where libraries would not be found using @rpath

⚠️ Known Issues

  • [NA OFI]
    • [tcp/verbs;ofi_rxm] Using more than 256 peers requires FI_UNIVERSE_SIZE to be set.

mercury 2.3.0

06 Jun 23:47
v2.3.0

Summary

This version brings bug fixes and updates to our v2.0.0 release.

New features

  • [HG/NA]
    • Add HG_Init_opt2() / HG_Core_init_opt2() / NA_Initialize_opt2() to
      safely pass updated init info while maintaining ABI compatibility between
      versions
    • Add HG_Get_na_protocol_info() / HG_Free_na_protocol_info() and add
      hg_info utility for basic listing of protocols
  • [HG]
    • Add support for multi-recv operations (OFI plugin only)
      • Multi-recv is currently disabled when auto SM is on
      • When multi-recv is used, posted recv operations are decoupled from the
        pool of RPC handles
      • Add release_input_early init info flag to attempt to release input
        buffers early once input is decoded
      • Add HG_Release_input_buf() to manually release input buffer
      • Also add no_multi_recv init info option to force disabling
        multi-recv
    • Make use of subsys logs (cls, ctx, addr, rpc, poll) to control
      log output
    • Add init info struct versioning
    • Add HG_Context_unpost() / HG_Core_context_unpost() for optional
      2-step context shutdown
  • [HG bulk]
    • Update to new logging system through bulk subsys log
  • [HG proc]
    • Update to new logging system through proc subsys log
  • [HG Test]
    • Refactor tests to separate perf tests from unit tests
    • Add NA/HG test common library
    • Add hg_rate / hg_bw_write and hg_bw_read perf tests
      • Perf test now supports multi-client / multi-server workloads
    • Add BUILD_TESTING_UNIT and BUILD_TESTING_PERF CMake options
  • [NA]
    • Add support for multi-recv operations
      • Add NA_Msg_multi_recv_unexpected() and
        na_cb_info_multi_recv_unexpected cb info
      • Add flags parameter to NA_Op_create() and NA_Msg_buf_alloc()
      • Add NA_Has_opt_feature() to query multi recv capability
    • Remove int return type from NA callbacks and return void
    • Remove unused timeout parameter from NA_Trigger()
    • NA_Addr_free() / NA_Mem_handle_free() and NA_Op_destroy() now
      return void
    • na_mem_handle_t and na_addr_t types no longer include pointer type
    • Add support for dynamically loaded plugins
      • Add NA_PLUGIN_PATH env variable to optionally control plugin loading
        path (default is NA_INSTALL_PLUGIN_DIR)
      • Add NA_INSTALL_PLUGIN_DIR variable to control plugin install path
        (default is lib install path)
      • Add NA_USE_DYNAMIC_PLUGINS CMake option (OFF by default)
    • Add ability to query protocol info from plugins
      • Add NA_Get_protocol_info()/NA_Free_protocol_info() API routines
      • Add na_protocol_info struct to na_types
    • Bump NA library version to 4.0.0
  • [NA OFI]
    • Add support for multi-recv operations and use FI_MSG
    • Allocate multi-recv buffers using hugepages when available
    • Switch to using fi_senddata() with immediate data for unexpected msgs
      • NA_OFI_UNEXPECTED_TAG_MSG can be set to switch back to former
        behavior that uses tagged messages instead
    • Remove support for deprecated psm provider
    • Control CQ interrupt signaling with FI_AFFINITY (only used if thread is
      bound to a single CPU ID)
    • Enable cxi provider to use FI_WAIT_FD
    • Add NA_OFI_OP_RETRY_TIMEOUT and NA_OFI_OP_RETRY_PERIOD
      • Once NA_OFI_OP_RETRY_TIMEOUT milliseconds elapse, retry is stopped
        and operation is aborted (default is 120000ms)
      • When NA_OFI_OP_RETRY_PERIOD is set, operations are retried only
        every NA_OFI_OP_RETRY_PERIOD milliseconds (default is 0)
    • Add support for tcp with and without ofi_rxm
      • tcp defaults to tcp;ofi_rxm for libfabric < 1.18
    • Enable plugin to be built as a dynamic plugin
    • Add support for get_protocol_info to query list of protocols
    • Add support for libfabric log redirection
      • Requires libfabric >= 1.16.0, disabled if FI_LOG_LEVEL is set
      • Add libfabric log subsys (off by default)
      • Bump FI_VERSION to 1.13 when log redirection is supported
  • [NA UCX]
    • Attempt to disable UCX backtrace if UCX_HANDLE_ERRORS is not set
    • Add support for UCP_EP_PARAM_FIELD_LOCAL_SOCK_ADDR
      • With UCX >= 1.13 local src address information can now be specified
        on client to use specific interface and port
    • Set CM_REUSEADDR by default to enable reuse of existing listener addr
      after a listener exits abnormally
    • Attempt to reconnect EP if disconnected
      • This concerns cases where a peer would have reappeared after a
        previous disconnection
    • Add support for get_protocol_info to query list of protocols
    • Enable plugin to be built as a dynamic plugin
  • [NA Test]
    • Update NA test perf to use multi-recv feature
    • Update perf test to use hugepages
    • Add support for multi-targets and add lookup test
    • Install perf tests if BUILD_TESTING_PERF is ON
  • [HG util]
    • Change return type of hg_time_less() to bool
    • Add HG_LOG_WRITE_FUNC() macro to pass func/line info
      • Also add module / no_return parameters to hg_log_write()
    • Add support for hugepage allocations
    • Use isb for cpu_spinwait on aarch64
    • Add mercury_dl to support dynamically loaded modules
    • Bump HG util version to 4.0.0

Bug fixes

  • [HG]
    • Ensure init info version is compatible with previous versions of the struct
    • Clean up and refactoring fixes
    • Fix race condition in hg_core_forward with debug enabled
    • Simplify RPC map and fix hashing for RPC IDs larger than 32-bit integer
    • Refactor context pools and cleanup
    • Fix potential leak on ack buffer
    • Ensure list of created RPC handles is empty before closing context
    • Bump default number of pre-allocated requests from 256 to 512 to make use
      of 2M hugepages by default
    • Add extra error checking to prevent class mismatch
    • Fix potential race when sending one-way RPCs to self
  • [HG Bulk]
    • Add extra error checking to prevent class mismatch
  • [HG Test]
    • Refactor test_rpc to correctly handle timeout return values
    • Fix overflow of number of target / classes
      • Number of targets was limited to UINT8_MAX
  • [NA OFI]
    • Fix handling of extra caps to not always follow advertised caps
      • Also ensure that extra caps passed are honored by provider
    • Force sockets provider to use shared domains
      • This prevents a performance regression when multiple classes are
        being used (FI_THREAD_DOMAIN is therefore disabled for this provider)
    • Refactor unexpected and expected sends, retry of OFI operations, handling
      of RMA operations
    • Always include FI_DIRECTED_RECV in primary caps
    • Disable use of FI_SOURCE for most providers to reduce lookup overhead
      • Separate code paths for providers that do not support FI_SOURCE
      • Remove insert of FI addr into secondary table if FI_SOURCE is
        not used
    • Remove NA_OFI_SOURCE_MSG flag that was matching FI_SOURCE_ERR
    • Fix potential refcount race when sharing domains
    • Check domain's optimal MR count if non-zero
    • Fix potential double free of src_addr info
    • Refactor auth key parsing code to build without extension headers
    • Merge latest changes required for opx provider enablement
      • Pass FI_COMPLETION to RMA ops, as the flag is currently not ignored
        (temporary fix for prov/opx)
    • Add runtime version check
      • Ensure that runtime version is greater than min version
  • [NA SM]
    • Fix handling of 0-size messages when no receive has been posted
    • Fix issue where an expected msg that is no longer posted arrives
      • In that particular case just drop the incoming msg
    • Add perf warning message for unexpected messages without recv posted
  • [NA UCX]
    • Fix handling of UCS return types to match NA types
    • Enforce src_addr port used for connections to be 0
      • This fixes a port conflict between listener and connection ports
    • Fix handling of unexpected messages without pre-posted recv
  • [NA BMI]
    • Clean up and fix some coverity warnings
  • [NA MPI]
    • Clean up and fix some coverity warnings
  • [NA Test]
    • Fix NA latency test to ensure recvs are always pre-posted
    • Do not use MPI_Init_thread() if not needed
      • Fix missing return check of na_test_mpi_init()
  • [HG util]
    • Clean up logging and set log root to hg_all
      • hg_all subsys can now be set to turn on logging in all subsystems
    • Set log subsys to hg_all if log level env is set
    • Fixes to support WIN32 builds
  • [CMake]
    • Fix internal/external dependencies that were not correctly set
    • Fix pkg-config entries wrongly set as public/private
    • Ensure VERSION/SOVERSION is not set on MODULE libraries
    • Allow for in-source builds (RPM support)
    • Add DL lib dependency
    • Fix object target linking on CMake < 3.12
    • Ensure we build with PIC and PIE when available
  • [Examples]
    • Allow examples to build without Boost support

⚠️ Known Issues

  • [NA OFI]
    • [tcp/verbs;ofi_rxm] Using more than 256 peers requires FI_UNIVERSE_SIZE
      to be set.

mercury 2.3.0rc5

12 Apr 16:27
v2.3.0rc5
Pre-release

Summary

This version brings bug fixes and updates to our v2.0.0 release.

New features

Added in rc5

  • [HG/NA]
    • Add HG_Init_opt2() / HG_Core_init_opt2() / NA_Initialize_opt2() to
      safely pass updated init info while maintaining ABI compatibility between
      versions
  • [CMake]
    • Add NA_INSTALL_PLUGIN_DIR variable to control plugin install path
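
The idea behind the `_opt2()` entry points above, passing init info safely across versions of the struct, can be illustrated with a generic versioned-struct pattern. This is a sketch under assumptions, not mercury's implementation: the struct layouts, the `init_opt2()` name and the version numbers are all hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical layouts: v1 of an init struct, and v2 that appends a field. */
struct init_info_v1 {
    int listen;
};

struct init_info_v2 {
    int listen;
    int no_multi_recv; /* field appended in v2 */
};

/* Library-side entry point: the caller states which struct version it was
 * compiled against, so the library never reads past the caller's struct. */
static int
init_opt2(unsigned int version, const void *info, struct init_info_v2 *out)
{
    struct init_info_v2 defaults = {0, 0};

    assert(out != NULL);

    *out = defaults; /* fields unknown to the caller keep their defaults */
    if (info == NULL)
        return 0;

    switch (version) {
        case 1:
            memcpy(out, info, sizeof(struct init_info_v1));
            return 0;
        case 2:
            memcpy(out, info, sizeof(struct init_info_v2));
            return 0;
        default:
            return -1; /* struct version not known to this library */
    }
}
```

An application built against the older layout would pass version 1 and a `struct init_info_v1`, and the newer field would simply stay at its default value.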

Added in rc4

  • [HG]
    • Add HG_Context_unpost() / HG_Core_context_unpost() for optional
      2-step context shutdown

Added in rc2

  • [HG Test]
    • Perf test now supports multi-client / multi-server workloads
    • Add BUILD_TESTING_UNIT and BUILD_TESTING_PERF CMake options
  • [NA OFI]
    • Add support for libfabric log redirection
      • Requires libfabric >= 1.16.0, disabled if FI_LOG_LEVEL is set
      • Add libfabric log subsys (off by default)
      • Bump FI_VERSION to 1.13 when log redirection is supported
  • [HG util]
    • Add HG_LOG_WRITE_FUNC() macro to pass func/line info
    • Also add module / no_return parameters to hg_log_write()
    • Remove HG_ATOMIC_VAR_INIT (deprecated)

Added in rc1

  • [HG]
    • Add support for multi-recv operations (OFI plugin only)
      • Multi-recv is currently disabled when auto SM is on
      • When multi-recv is used, posted recv operations are decoupled from the
        pool of RPC handles
      • Add release_input_early init info flag to attempt to release input
        buffers early once input is decoded
      • Add HG_Release_input_buf() to manually release input buffer
      • Also add no_multi_recv init info option to force disabling
        multi-recv
    • Make use of subsys logs (cls, ctx, addr, rpc, poll) to control
      log output
    • Add init info struct versioning
  • [HG bulk]
    • Update to new logging system through bulk subsys log
  • [HG proc]
    • Update to new logging system through proc subsys log
  • [HG Test]
    • Refactor tests to separate perf tests from unit tests
    • Add NA/HG test common library
    • Add hg_rate / hg_bw_write and hg_bw_read perf tests
    • Install perf tests if BUILD_TESTING is ON
  • [NA]
    • Add support for multi-recv operations
      • Add NA_Msg_multi_recv_unexpected() and
        na_cb_info_multi_recv_unexpected cb info
      • Add flags parameter to NA_Op_create() and NA_Msg_buf_alloc()
      • Add NA_Has_opt_feature() to query multi recv capability
    • Remove int return type from NA callbacks and return void
    • Remove unused timeout parameter from NA_Trigger()
    • NA_Addr_free() / NA_Mem_handle_free() and NA_Op_destroy() now
      return void
    • na_mem_handle_t and na_addr_t types no longer include pointer type
    • Add NA_PLUGIN_PATH env variable to optionally control plugin loading
      path
    • Add NA_DEFAULT_PLUGIN_PATH CMake option to control default plugin path
      (default is lib install path)
    • Add NA_USE_DYNAMIC_PLUGINS CMake option (OFF by default)
    • Bump NA library version to 4.0.0
  • [NA OFI]
    • Add support for multi-recv operations and use FI_MSG
    • Allocate multi-recv buffers using hugepages when available
    • Switch to using fi_senddata() with immediate data for unexpected msgs
      • NA_OFI_UNEXPECTED_TAG_MSG can be set to switch back to former
        behavior that uses tagged messages instead
    • Remove support for deprecated psm provider
    • Control CQ interrupt signaling with FI_AFFINITY (only used if thread is
      bound to a single CPU ID)
    • Enable cxi provider to use FI_WAIT_FD
    • Add NA_OFI_OP_RETRY_TIMEOUT and NA_OFI_OP_RETRY_PERIOD
      • Once NA_OFI_OP_RETRY_TIMEOUT milliseconds elapse, retry is stopped
        and operation is aborted (default is 120000ms)
      • When NA_OFI_OP_RETRY_PERIOD is set, operations are retried only
        every NA_OFI_OP_RETRY_PERIOD milliseconds (default is 0)
    • Add support for tcp with and without ofi_rxm
      • tcp defaults to tcp;ofi_rxm for libfabric < 1.18
    • Enable plugin to be built as a dynamic plugin
  • [NA UCX]
    • Attempt to disable UCX backtrace if UCX_HANDLE_ERRORS is not set
    • Add support for UCP_EP_PARAM_FIELD_LOCAL_SOCK_ADDR
      • With UCX >= 1.13 local src address information can now be specified
        on client to use specific interface and port
    • Set CM_REUSEADDR by default to enable reuse of existing listener addr
      after a listener exits abnormally
    • Attempt to reconnect EP if disconnected
      • This concerns cases where a peer would have reappeared after a
        previous disconnection
    • Enable plugin to be built as a dynamic plugin
  • [NA Test]
    • Update NA test perf to use multi-recv feature
    • Update perf test to use hugepages
    • Add support for multi-targets and add lookup test
    • Install perf tests if BUILD_TESTING is ON
  • [HG util]
    • Change return type of hg_time_less() to be bool
    • Add support for hugepage allocations
    • Use isb for cpu_spinwait on aarch64
    • Add mercury_dl to support dynamically loaded modules
    • Bump HG util version to 4.0.0

Bug fixes

Added in rc5

  • [Examples]
    • Allow examples to build without Boost support
  • [CMake]
    • Fix internal/external dependencies that were not correctly set
    • Fix pkg-config entries wrongly set as public/private

Added in rc4

  • [NA OFI]
    • Add runtime version check
      • Ensure that runtime version is greater than min version
      • Replace prov/tcp compile check by runtime check
  • [NA SM]
    • Fix issue where an expected msg that is no longer posted arrives
      • In that particular case just drop the incoming msg

Added in rc3

  • [NA OFI]
    • Log redirection requires libfabric >= 1.16.0

Added in rc2

  • [HG/NA]
    • Ensure init info version is compatible
  • [NA OFI]
    • Fix handling of extra caps to not always follow advertised caps
    • Pass FI_COMPLETION to RMA ops, as the flag is currently not ignored
      (temporary fix for prov/opx)
  • [CMake]
    • Ensure VERSION/SOVERSION is not set on MODULE libraries
    • Allow for in-source builds (RPM support)
    • Add missing DL lib dependency
    • Fix object target linking on CMake < 3.12
    • Ensure we build with PIC and PIE when available

Added in rc1

  • [HG]
    • Clean up and refactoring fixes
    • Fix race condition in hg_core_forward with debug enabled
    • Simplify RPC map and fix hashing for RPC IDs larger than 32-bit integer
    • Refactor context pools and cleanup
    • Fix potential leak on ack buffer
    • Ensure list of created RPC handles is empty before closing context
    • Bump pre-allocated requests to 512 to make use of 2M hugepages
    • Add extra error checking to prevent class mismatch
    • Fix potential race when sending one-way RPCs to self
  • [HG Bulk]
    • Add extra error checking to prevent class mismatch
  • [HG Test]
    • Refactor test_rpc to correctly handle timeout return values
  • [NA OFI]
    • Force sockets provider to use shared domains
      • This prevents a performance regression when multiple classes are
        being used (FI_THREAD_DOMAIN is therefore disabled for this provider)
    • Refactor unexpected and expected sends, retry of OFI operations, handling
      of RMA operations
    • Always include FI_DIRECTED_RECV in primary caps
    • Remove NA_OFI_SOURCE_MSG flag that was matching FI_SOURCE_ERR
    • Fix potential refcount race when sharing domains
    • Check domain's optimal MR count if non-zero
    • Fix potential double free of src_addr info
    • Refactor auth key parsing code to build without extension headers
    • Merge latest changes required for opx provider enablement
  • [NA SM]
    • Fix handling of 0-size messages when no receive has been posted
  • [NA UCX]
    • Fix handling of UCS return types to match NA types
  • [NA BMI]
    • Clean up and fix some coverity warnings
  • [NA MPI]
    • Clean up and fix some coverity warnings
  • [HG util]
    • Clean up logging and set log root to hg_all
      • hg_all subsys can now be set to turn on logging in all subsystems
    • Set log subsys to hg_all if log level env is set
    • Fixes to support WIN32 builds

⚠️ Known Issues

  • [NA OFI]
    • [tcp/verbs;ofi_rxm] Using more than 256 peers requires FI_UNIVERSE_SIZE
      to be set.