Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL][Doc] Add spec to get device image backend content #14811

Merged
merged 10 commits into from
Dec 3, 2024
Original file line number Diff line number Diff line change
@@ -0,0 +1,255 @@
= sycl_ext_oneapi_device_image_backend_content

:source-highlighter: coderay
:coderay-linenums-mode: table

// This section needs to be after the document title.
:doctype: book
:toc2:
:toc: left
:encoding: utf-8
:lang: en
:dpcpp: pass:[DPC++]
:endnote: —{nbsp}end{nbsp}note

// Set the default source code type in this document to C++,
// for syntax highlighting purposes. This is needed because
// docbook uses c++ and html5 uses cpp.
:language: {basebackend@docbook:c++:cpp}


== Notice

[%hardbreaks]
Copyright (C) 2024 Intel Corporation. All rights reserved.

Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are trademarks
of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. used by
permission by Khronos.


== Contact

To report problems with this extension, please open a new issue at:

https://github.com/intel/llvm/issues


== Dependencies

This extension is written against the SYCL 2020 revision 9 specification.
All references below to the "core SYCL specification" or to section numbers in
the SYCL specification refer to that revision.


== Status

This is a proposed extension specification, intended to gather community
feedback.
Interfaces defined in this specification may not be implemented yet or may be
in a preliminary state.
The specification itself may also change in incompatible ways before it is
finalized.
*Shipping software products should not rely on APIs defined in this
specification.*


== Overview

This extension adds a mechanism to obtain the raw backend content of the device
images that are in a kernel bundle.
The format of this content is implementation-defined, so applications that make
use of this extension are not expected to be portable to other implementations
of SYCL.


== Specification

=== Feature test macro

This extension provides a feature-test macro as described in the core SYCL
specification.
An implementation supporting this extension must predefine the macro
`SYCL_EXT_ONEAPI_DEVICE_IMAGE_BACKEND_CONTENT` to one of the values defined in
the table below.
Applications can test for the existence of this macro to determine if the
implementation supports this feature, or applications can test the macro's
value to determine which of the extension's features the implementation
supports.

[%header,cols="1,5"]
|===
|Value
|Description

|1
|The APIs of this experimental extension are not versioned, so the
feature-test macro always has this value.
|===

=== New member functions in the `device_image` class

This extension adds the following member functions to the `device_image` class.

[source,c++]
----
namespace sycl {

template <bundle_state State>
class device_image {
public:
backend ext_oneapi_get_backend() const noexcept;
std::vector<std::byte> ext_oneapi_get_backend_content() const;

#if defined(__cpp_lib_span)
std::span<std::byte> ext_oneapi_get_backend_content_view() const;
gmlueck marked this conversation as resolved.
Show resolved Hide resolved
#endif

/*...*/
};

} // namespace sycl
----

'''

[frame=all,grid=none,separator="@"]
!====
a@
[source,c++]
----
backend ext_oneapi_get_backend() const noexcept;
----
!====

_Returns:_ The backend that is associated with this device image.
This is always the same as the backend of the kernel bundle that contains this
device image.

'''

[frame=all,grid=none,separator="@"]
!====
a@
[source,c++]
----
std::vector<std::byte> ext_oneapi_get_backend_content() const;
----
!====

_Constraints:_ Available only when `State` is `bundle_state::executable`.

_Returns:_ A copy of the raw backend content for this device image.
The format of this data is implementation-defined.
See below for a description of the formats used by {dpcpp}.

'''

[frame=all,grid=none,separator="@"]
!====
a@
[source,c++]
----
std::span<std::byte> ext_oneapi_get_content_backend_view() const;
----
!====

Available only when the compiler defines the `__cpp_lib_span` feature-test macro
(which is defined in {cpp}20 and higher).

_Constraints:_ Available only when `State` is `bundle_state::executable`.

_Returns:_ A view of the raw backend content for this device image.
The data behind this view has the same lifetime as the `device_image` object.
The format of this data is implementation-defined.
See below for a description of the formats used by {dpcpp}.

'''


== Device image format for {dpcpp}

This section is non-normative and applies only to the {dpcpp} implementation.
The format of the data returned by
`device_image::ext_oneapi_get_backend_content` and
`device_image::ext_oneapi_get_backend_content_view` depends on the backend of the
kernel bundle that contains the device image.
Pennycook marked this conversation as resolved.
Show resolved Hide resolved

=== Format on Level Zero

The device image's backend content is native ISA for the device, which can be
AlexeySachkov marked this conversation as resolved.
Show resolved Hide resolved
passed to `zeModuleCreate` as `ZE_MODULE_FORMAT_NATIVE` format.

:ref1: ../proposed/sycl_ext_oneapi_free_function_kernels.asciidoc#level-zero-and-opencl-compatibility

[_Note:_ The interface to kernels in the device image backend content is not
defined in the general case, which means there is no portable way to invoke
kernels from a Level Zero module that is created from the raw device image
content.
However, see link:{ref1}[here] for a limited case where this portability is
guaranteed.
_{endnote}_]

=== Format on OpenCL

The device image's backend content is executable binary device code representing
one or more kernels, which can be passed to `clCreateProgramWithBinary`.

[_Note:_ The interface to kernels in the device image backend content is not
defined in the general case, which means there is no portable way to invoke
kernels from a OpenCL `cl_program` object that is created from the raw device
image content.
However, see link:{ref1}[here] for a limited case where this portability is
guaranteed.
_{endnote}_]

=== Format on CUDA

The device image's backend content is a CUBIN module representing one or more
kernels.


== Example

:ref2: ../proposed/sycl_ext_oneapi_free_function_kernels.asciidoc

A kernel bundle can contain multiple device images with different
representations of the same kernel for different devices.
This example shows how to get the device image's backend content for a
particular kernel for a particular device.
Note that this example also uses the kernel syntax described in link:{ref2}[
sycl_ext_oneapi_free_function_kernels], but it is not necessary to define
kernels in that syntax when using this extension.

[source,c++]
----
#include <sycl/sycl.hpp>
namespace syclext = sycl::ext::oneapi;
namespace syclexp = sycl::ext::oneapi::experimental;

SYCL_EXT_ONEAPI_FUNCTION_PROPERTY((syclexp::nd_range_kernel<1>))
void iota(float start, float *ptr) {
size_t id = syclext::this_work_item::get_nd_item().get_global_linear_id();
ptr[id] = start + static_cast<float>(id);
}

void main() {
sycl::device d;
sycl::queue q{d};
sycl::context ctxt = q.get_context();

// Get a kernel bundle that contains the kernel "iota".
sycl::kernel_id iota = syclexp::get_kernel_id<iota>();
auto exe_bndl =
sycl::get_kernel_bundle<sycl::bundle_state::executable>(ctxt, {iota});

std::vector<std::byte> bytes;
for (auto& img: bundle) {
// Search for the device image that contains "iota" for this device.
if (img.has_kernel(iota, dev)) {
bytes = img.ext_oneapi_get_backend_content();
break;
}
}
}
----
Original file line number Diff line number Diff line change
Expand Up @@ -883,15 +883,14 @@ int main() {
```


[[level-zero-and-opencl-compatibility]]
== {dpcpp} guaranteed compatibility with Level Zero and OpenCL backends

The contents of this section are non-normative and apply only to the {dpcpp}
implementation.
Kernels written using the free function kernel syntax can be submitted to a
device by using the Level Zero or OpenCL backends, without going through the
SYCL host runtime APIs.
This works only when the kernel is AOT compiled to native device code using the
`-fsycl-targets` compiler option.

The interface to the kernel in the native device code module is only guaranteed
when the kernel adheres to the following restrictions:
Expand All @@ -905,6 +904,16 @@ when the kernel adheres to the following restrictions:
* The translation unit containing the kernel is compiled with the
`-fno-sycl-dead-args-optimization` option.

In order to invoke a kernel using Level Zero or OpenCL, the application must
first obtain the raw backend content of the device image that contains the
kernel.
One way to do this is by using
link:../proposed/sycl_ext_oneapi_device_image_backend_content.asciidoc[
sycl_ext_oneapi_device_image_backend_content].
It is also possible to compile the application in AOT mode via the
`-fsycl-targets` compiler option and then extract the device image's backend
content from the executable file.

Both Level Zero and OpenCL identify a kernel via a _name_ string.
(See `zeKernelCreate` and `clCreateKernel` in their respective specifications.)
When a kernel is defined according to the restrictions above, the _name_ is
Expand Down