Preserve ctf_{sequence,array}_text string data even if they contain NULL characters #114

Open
wants to merge 1 commit into
base: stable-2.0

Conversation

Kerilk

@Kerilk Kerilk commented Sep 17, 2020

This patch restores the behavior of babeltrace2 to that of babeltrace1.5 regarding the handling of ctf_sequence_text and ctf_array_text. The original bytes/length provided during tracing are reflected in the resulting string irrespective of it containing '\0' characters.
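To illustrate the difference at stake, here is a minimal Python sketch using only the standard ctypes module. It does not use the Babeltrace API; the payload is a made-up example. It contrasts reading a buffer with C-string semantics (stop at the first '\0') against reading it with the recorded length.

```python
import ctypes

# A 5-byte payload with an embedded NUL, standing in for traced data.
payload = b"ab\x00cd"

# Copy the payload into a C buffer (ctypes appends one trailing NUL).
buf = ctypes.create_string_buffer(payload)

# C-string semantics: reading stops at the first NUL byte.
as_c_string = ctypes.string_at(buf)

# Length-preserving semantics: read exactly the recorded number of bytes.
with_length = ctypes.string_at(buf, len(payload))

print(as_c_string)
print(with_length)
```

Only the second read returns all the bytes that were recorded, which is the behavior this patch aims to restore.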

If the project is willing to accept the pull request I could devise a test to ensure this behavior is preserved in the future.

@jgalar
Member

jgalar commented Sep 22, 2020

Hi @Kerilk!

I'm sorry but the Babeltrace 1.x release series won't see new releases beyond bug fixes.

@jgalar jgalar closed this Sep 22, 2020
@Kerilk
Author

Kerilk commented Sep 22, 2020

This change is for Babeltrace 2.0.

@compudj
Member

compudj commented Sep 22, 2020

Indeed, we need to decide whether we want to reinstate this bt1 behavior into bt2 or not.

@compudj compudj reopened this Sep 22, 2020
@Kerilk
Author

Kerilk commented Sep 22, 2020

To elaborate a bit, we use it to save blobs of memory that we can easily cast back to structured types.

@compudj
Member

compudj commented Sep 22, 2020

> To elaborate a bit, we use it to save blobs of memory that we can easily cast back to structured types.

Why can't you use ctf_sequence or ctf_array for this? The *_text variants are meant to trace zero-terminated strings.

@Kerilk
Author

Kerilk commented Sep 22, 2020

Because, as far as I can tell, I can't get a pointer back to the data without iterating over all the elements of the array and making an additional copy. That is painful and slow, especially from high-level languages (Ruby, Python, etc.).
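The cost described above can be sketched in plain Python. This is an illustration only: the 8-byte record layout is hypothetical, and the list of integers merely stands in for an integer array field that hands out one element at a time.

```python
import struct

# Hypothetical 8-byte record: a little-endian uint32 followed by a float32.
raw = struct.pack("<If", 32902, 1.5)


def decode(buffer):
    """Decode the record from any object supporting the buffer protocol."""
    return struct.unpack_from("<If", buffer)


# Array-style access: one integer per element, then an extra copy to
# rebuild a contiguous buffer before it can be decoded.
elements = list(raw)
rebuilt = bytes(elements)

# Blob-style access: decode directly from a view of the original bytes.
direct = memoryview(raw)

print(decode(rebuilt))
print(decode(direct))
```

Both paths decode the same values, but the first one touches every element and allocates a second buffer.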

@jgalar
Member

jgalar commented Sep 24, 2020

Thanks for the added clarifications and sorry for closing the issue earlier; I completely misinterpreted the description.

I understand the overall use-case for accessing "raw" binary payloads directly.

The proposed change doesn't work as it violates a precondition of bt_field_string_append_with_length() in appending a string that contains a null character.

A check for this is performed here:
https://github.com/efficios/babeltrace/blob/master/src/lib/trace-ir/field.c#L919

This check is only performed in "developer mode" for performance reasons, which I guess is why you didn't hit it. I recommend you configure your babeltrace2 build in "developer mode" (more details in README.adoc) for any future development work.
To enable those runtime checks, define BABELTRACE_DEV_MODE=1 at configure like so:

$ BABELTRACE_DEV_MODE=1 ./configure
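As a rough model of why the violation goes unnoticed in a regular build, the following Python sketch mimics a precondition that is only checked when a developer-mode switch is on. The StringField class and its dev_mode parameter are hypothetical stand-ins, not the Babeltrace 2 API, and the real library reacts to a failed precondition differently than by raising an exception.

```python
class StringField:
    """Hypothetical stand-in for a string field; not the Babeltrace 2 API."""

    def __init__(self, dev_mode: bool):
        self.dev_mode = dev_mode
        self.value = bytearray()

    def append_with_length(self, data: bytes, length: int) -> None:
        chunk = data[:length]
        # The precondition is only verified in developer mode.
        if self.dev_mode and b"\x00" in chunk:
            raise ValueError("appended string contains a null character")
        self.value += chunk


def try_append(field: StringField, data: bytes) -> str:
    try:
        field.append_with_length(data, len(data))
        return "accepted"
    except ValueError:
        return "rejected"


blob = b"\x01\x00\x02"
release_result = try_append(StringField(dev_mode=False), blob)
dev_result = try_append(StringField(dev_mode=True), blob)
print(release_result, dev_result)
```

The same call is silently accepted without the check and rejected with it, which matches the situation described above.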

Anyhow, this would be an abuse of the API that I would rather not encourage even if it did work.

I have discussed this with @eepp and, so far, we both think the inclusion of a new "blob" field class would fit best in the current API.

This field class would be used when a source identifies a field as an "unstructured" binary payload and would provide a means to access the data directly (i.e. const void * and a size). This field class would be completely separate from the array field class hierarchy.
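From a binding's point of view, a (pointer, size) pair can be consumed either by borrowing the memory or by copying it. The sketch below uses only the standard ctypes module; borrow_blob and copy_blob are hypothetical helper names, and the source array stands in for memory owned by the library.

```python
import ctypes


def borrow_blob(address: int, size: int):
    """Zero-copy view; only valid while the owner keeps the memory alive."""
    return (ctypes.c_ubyte * size).from_address(address)


def copy_blob(address: int, size: int) -> bytes:
    """Independent copy of exactly `size` bytes, embedded NULs included."""
    return ctypes.string_at(address, size)


# Stand-in for a payload owned by the library.
source = (ctypes.c_ubyte * 4)(1, 0, 2, 0)
address = ctypes.addressof(source)

view = borrow_blob(address, 4)
copy = copy_blob(address, 4)

# Mutating the owner's memory is visible through the view, not the copy.
source[0] = 255
print(view[0], copy[0])
```

Borrowing avoids the copy but ties the view's lifetime to the owning object, which is the kind of trade-off a binding would have to settle.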

I still have to give this some thought to flesh out the implications and I am open to suggestions.

To understand your use case a bit better:

  • What is the typical size of the memory dumps you include in traces?
  • How do you plan to access them from Python?

I'm trying to figure out how the Python bindings can expose this in a natural and efficient way and see how "smart" this has to be to provide an acceptable performance level.

Also, are the ruby bindings you are using publicly available?

Thanks!

@Kerilk
Author

Kerilk commented Sep 25, 2020

Sorry for the lengthy response.

General Remarks

I think your proposed solution is the best one. I can still use the modified ctf plugin in the meantime. Adopting the new feature will break compatibility for our babeltrace1 based tools, but I can build them back around babeltrace2.

Thanks for the tip for the BABELTRACE_DEV_MODE=1, I will be sure to use it in the future.

Use Case

Our use case is the model-centric tracing of heterogeneous APIs like OpenCL, CUDA or Level Zero (see the THAPI section at the bottom for more details). The idea is to dump not only the arguments of the API calls but also the data behind pointers. We try to stay reasonable: we use file I/O to dump larger objects like buffers or compiled programs, and use LTTng to record the path. Out of curiosity, we tried dumping those objects as LTTng event fields, but we never managed to keep the daemon from dropping messages when the message size neared a GiB.

  • In OpenCL we need the raw pointer to each kernel argument, because those are not typed: we only get a size and a void * pointer (see clSetKernelArg). Those are seldom bigger than a few dozen bytes.
  • In Level Zero, and to a lesser extent in CUDA, on top of untyped kernel arguments, the APIs make extensive use of structure passing, and unpacking the structures to pass them as individual LTTng fields is impractical (I already compile the tracers using a modified tracepoint.h to bypass the LTTng argument limit, but this would be pushing it). Those can go up to a KiB.
  • To access those fields, in C and Ruby (and Python, even if I don't use it in this project), I cast the pointer to the correct type, provided the field is not empty; in high-level languages this goes through an FFI struct. More about this below.
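For the untyped kernel-argument case, a reader only has the recorded bytes and their size to go on. Here is a small Python sketch of one way to present such a payload; the candidate table, the little-endian assumption, and the describe_kernel_arg name are all illustrative choices, not part of THAPI or Babeltrace.

```python
import struct

# Candidate interpretations keyed by payload size in bytes
# (little-endian is assumed here).
CANDIDATES = {
    4: {"int32": "<i", "uint32": "<I", "float32": "<f"},
    8: {"int64": "<q", "uint64": "<Q", "float64": "<d"},
}


def describe_kernel_arg(raw: bytes) -> dict:
    """Return plausible decodings of an untyped argument payload."""
    if not raw:
        return {}  # empty field: nothing to decode
    formats = CANDIDATES.get(len(raw))
    if formats is None:
        return {"hex": raw.hex()}  # unknown size: show the raw bytes
    return {name: struct.unpack(fmt, raw)[0] for name, fmt in formats.items()}


# A 32-bit zero is four NUL bytes, so it only survives if the payload
# is not truncated at the first '\0'.
print(describe_kernel_arg(struct.pack("<i", 0)))
print(describe_kernel_arg(b"\x01\x02\x03"))
```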

About bindings

The Ruby bindings I am using are for babeltrace1 and can be found here:
https://github.com/alcf-perfengr/babeltrace-ruby
They are built around FFI with minimal native C code (more or less what is in the Python babeltrace1 bindings). I plan on writing similar babeltrace2 bindings soon™. With the proposed functionality I would not need native code in the bindings, unless I discover performance bottlenecks.

The most efficient way for me to map structures in Ruby (this applies to Python as well) is to wrap the pointer returned by babeltrace into an FFI Pointer and use this pointer as backing for an FFI struct. Here is an extract of my babeltrace_ze tool that reads Level Zero LTTng traces (zeDeviceGetProperties):

15:26:28.571136164 - lttng_ust_ze:zeDeviceGetProperties_stop: { zeResult: ZE_RESULT_SUCCESS, pDeviceProperties_val: { stype: ZE_STRUCTURE_TYPE_DEVICE_PROPERTIES, pNext: 0x0000000000000000, type: ZE_DEVICE_TYPE_GPU, vendorId: 32902, deviceId: 35410, flags: [ ZE_DEVICE_PROPERTY_FLAG_INTEGRATED ], subdeviceId: 0, coreClockRate: 1100, maxMemAllocSize: 4294959104, maxHardwareContexts: 4020744488, maxCommandQueuePriority: 0, numThreadsPerEU: 7, physicalEUSimdWidth: 8, numEUsPerSubslice: 8, numSubslicesPerSlice: 8, numSlices: 1, timerResolution: 52, timestampValidBits: 4288884032, kernelTimestampValidBits: 32766, uuid: { id: 00008086-8a52-0000-0000-000000000000 }, name: Intel(R) Gen11 } }

In ruby the code (generated) looks like:

class ZEDeviceProperties < FFI::ZEStruct
 layout :stype, :ze_structure_type_t,
        :pNext, :pointer,
        :type, :ze_device_type_t,
        :vendorId, :uint32_t,
        :deviceId, :uint32_t,
        :flags, :ze_device_property_flags_t,
        :subdeviceId, :uint32_t,
        :coreClockRate, :uint32_t,
        :maxMemAllocSize, :uint64_t,
        :maxHardwareContexts, :uint32_t,
        :maxCommandQueuePriority, :uint32_t,
        :numThreadsPerEU, :uint32_t,
        :physicalEUSimdWidth, :uint32_t,
        :numEUsPerSubslice, :uint32_t,
        :numSubslicesPerSlice, :uint32_t,
        :numSlices, :uint32_t,
        :timerResolution, :uint64_t,
        :timestampValidBits, :uint32_t,
        :kernelTimestampValidBits, :uint32_t,
        :uuid, :ze_device_uuid_t,
        :name, [ :char, 256 ]
end

# Formatter for the zeDeviceGetProperties exit event; defi is indexed by field name.
$event_lambdas["lttng_ust_ze:zeDeviceGetProperties_stop"] = lambda { |defi|
  s = "{ "
  s << "zeResult: #{ZE::ZEResult.from_native(defi["zeResult"], nil)}"
  s << ', '
  # Map the struct onto a copy of the raw field bytes; an empty field is left as nil.
  s << "pDeviceProperties_val: #{defi["pDeviceProperties_val"].size > 0 ? ZE::ZEDeviceProperties.new(FFI::MemoryPointer.from_string(defi["pDeviceProperties_val"])) : nil}"
  s << " }"
}

Note that with the proposed feature I would not need to get the pointer back from the string, but could directly map the struct on the returned pointer. The same approach is used in C to build a babeltrace2 event dispatcher that allows introspection into the API structs. Here is an example of using ctypes in python to achieve the same kind of results https://xgitlab.cels.anl.gov/videau/cconfigspace/-/blob/master/bindings/python/cconfigspace/base.py#L260-273 (note that unions are somewhat broken in ctypes).
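For readers more familiar with Python, the same mapping can be sketched with the standard ctypes module. This is a reduced, hypothetical struct covering only the first few fields of the layout above, with plain uint32 standing in for the enum types; map_struct is an illustrative helper, and the payload is built locally instead of coming from a trace.

```python
import ctypes


class DevicePropertiesHead(ctypes.Structure):
    """Hypothetical subset of the device properties layout."""

    _fields_ = [
        ("stype", ctypes.c_uint32),
        ("pNext", ctypes.c_void_p),
        ("type", ctypes.c_uint32),
        ("vendorId", ctypes.c_uint32),
        ("deviceId", ctypes.c_uint32),
    ]


def map_struct(struct_type, payload: bytes):
    """Map a struct onto raw bytes; return None if the payload is too short."""
    if len(payload) < ctypes.sizeof(struct_type):
        return None
    return struct_type.from_buffer_copy(payload)


# Stand-in for the bytes recorded by the tracer (contains many NUL bytes).
payload = bytes(DevicePropertiesHead(stype=1, type=1, vendorId=32902, deviceId=35410))

props = map_struct(DevicePropertiesHead, payload)
print(props.vendorId, props.deviceId)
```

Note that from_buffer_copy copies the payload; mapping without a copy would require a writable buffer or a raw address, which is where a pointer-returning API helps.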

THAPI

The whole THAPI tracing project is hosted here:
https://xgitlab.cels.anl.gov/heteroflow/tracer
THAPI implements LTTng-based tracers for OpenCL, Level Zero and CUDA. This is not widely advertised yet, as the project is not finalized. Most of the tracers and the tools around them are generated from the APIs, so the source code that uses LTTng and babeltrace2 is not readily available for reading.

@jgalar
Member

jgalar commented Sep 25, 2020

Thanks a lot for the detailed answer.

> Our use case is the model-centric tracing of heterogeneous APIs like OpenCL, CUDA or Level Zero (see the THAPI section at the bottom for more details). The idea is to dump not only the arguments of the API calls but also the data behind pointers. We try to stay reasonable: we use file I/O to dump larger objects like buffers or compiled programs, and use LTTng to record the path. Out of curiosity, we tried dumping those objects as LTTng event fields, but we never managed to keep the daemon from dropping messages when the message size neared a GiB.

Hmm, Babeltrace would have to be a bit smarter to accommodate payloads of that size; copying them into the bt_field objects is not going to cut it... I have a couple of ideas if it does enable use-cases, though.

For the moment, I guess even a naive copy approach would help get you going.

As for LTTng, the tracers will not save event payloads that are larger than a sub-buffer, as payloads cannot span more than one sub-buffer. LTTng isn't really tuned for that kind of payload size, but if you make sure to configure sub-buffers to be larger than your expected payload, it should work. See https://lttng.org/docs/#doc-channel-subbuf-size-vs-subbuf-count.
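As a back-of-the-envelope aid for the sizing advice above, this Python sketch picks a power-of-two sub-buffer size that can hold a given payload. The overhead margin for packet and event headers and the 4096-byte minimum are assumptions chosen for illustration, not values taken from LTTng; min_subbuf_size is a hypothetical helper.

```python
def min_subbuf_size(max_payload: int, overhead: int = 4096, page_size: int = 4096) -> int:
    """Smallest power of two that fits the payload plus an assumed margin."""
    needed = max(max_payload + overhead, page_size)
    # Round up to the next power of two.
    return 1 << (needed - 1).bit_length()


print(min_subbuf_size(1000))      # 8192
print(min_subbuf_size(1 << 20))   # 2097152
```

The result could then be passed to the --subbuf-size option of lttng enable-channel, after checking it against the actual event sizes.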

You may also want to have a look at lttng-ust's blocking mode (https://lttng.org/docs/#doc-blocking-timeout-example) to leave the consumer daemon enough time to extract those payloads to disk.

> I plan on writing similar babeltrace2 bindings soon™. With the proposed functionality I would not need native code in the bindings, unless I discover performance bottlenecks.

Great! Looking forward to it 😃

I will have a look at THAPI, it sounds pretty interesting.

@eepp
Member

eepp commented Nov 20, 2020

Follow-up: the recent CTF 2 specification proposal revision includes static-length and dynamic-length BLOB field classes.

If this is accepted, it means you'll be able to use such a field class in CTF 2 instead of static-length and dynamic-length array field classes to describe BLOB fields.

A CTF 2 BLOB field class also has an associated IANA media type.

In Babeltrace 2:

  • The BLOB field C API will return a byte pointer and a byte count.
  • The BLOB field Python API will return a bytes object.

Will this satisfy your use cases @Kerilk?

@Kerilk
Author

Kerilk commented Nov 20, 2020

Yes it will, thanks a lot for making this proposal.

On my side I am almost done binding Babeltrace 2 in Ruby. I will send a message to the mailing list with any problems I have encountered when I am done, as none of the issues I have found until now need urgent fixes.
You can find the WIP here: https://github.com/alcf-perfengr/babeltrace2-ruby

Kerilk added a commit to Kerilk/spack that referenced this pull request Apr 15, 2021
…to the resulting string and not stop at the first '\0' character.
@Kerilk
Author

Kerilk commented Jul 7, 2022

Hello @eepp,

Any news on the CTF2 adoption?
In the meantime we are deploying patched versions of Babeltrace 2 using spack:
https://github.com/argonne-lcf/THAPI-spack/tree/main/packages/babeltrace2

4 participants