Asynchronicity in YAKL
Matt Norman edited this page Dec 11, 2021 · 1 revision
Most parallel dispatch and data copying routines in YAKL are asynchronous with respect to host code: the call returns to the host before the device work has necessarily finished. They are all launched on the same default device "stream" (however that concept maps onto each backend). The device respects dependencies between kernel launches and data movement, even though these operations are asynchronous with respect to the host.
Calls that are asynchronous with respect to the host:
- parallel_for
- Array::deep_copy_to
- Array::slice
Calls that synchronize with the host before they return:
- Array::createHostCopy
- Array::createDeviceCopy
- All yakl::intrinsics:: reductions (e.g., sum, minval, maxval, count)
To synchronize the host with work on the device, call yakl::fence(), which stalls the host until all device work has completed.
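The semantics described on this page can be illustrated with a toy model in standard C++. This is only a sketch of the idea, not YAKL code and not how YAKL is implemented: ToyStream, launch, and demo are hypothetical names invented for this example, and a single worker thread stands in for the device's default stream. Work is queued without blocking the host, runs in submission order, and the host waits only when it calls fence().

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Toy stand-in for one in-order device stream (hypothetical, not YAKL API).
class ToyStream {
public:
  ToyStream() : worker_([this] { run(); }) {}

  ~ToyStream() {
    { std::lock_guard<std::mutex> lock(m_); done_ = true; }
    cv_.notify_all();
    worker_.join();  // the worker drains any remaining tasks before exiting
  }

  // Returns immediately; the task runs later on the worker, in submission order.
  void launch(std::function<void()> task) {
    { std::lock_guard<std::mutex> lock(m_); tasks_.push(std::move(task)); }
    cv_.notify_all();
  }

  // Blocks the caller until every task submitted so far has finished.
  void fence() {
    std::unique_lock<std::mutex> lock(m_);
    cv_.wait(lock, [this] { return tasks_.empty() && !busy_; });
  }

private:
  void run() {
    std::unique_lock<std::mutex> lock(m_);
    for (;;) {
      cv_.wait(lock, [this] { return done_ || !tasks_.empty(); });
      if (tasks_.empty()) return;  // shutdown requested and nothing left to do
      std::function<void()> task = std::move(tasks_.front());
      tasks_.pop();
      busy_ = true;
      lock.unlock();
      task();          // run the "kernel" without holding the lock
      lock.lock();
      busy_ = false;
      cv_.notify_all();  // wake any host thread waiting in fence()
    }
  }

  std::mutex m_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> tasks_;
  bool busy_ = false;
  bool done_ = false;
  std::thread worker_;  // declared last so the other members exist before it starts
};

// Two dependent "kernels" followed by a fence before the host reads the data.
std::vector<int> demo() {
  std::vector<int> data(4, 0);
  ToyStream stream;

  // First "kernel": fill the array. launch() does not wait for it to run.
  stream.launch([&data] { for (int i = 0; i < 4; ++i) data[i] = i; });

  // Second "kernel": depends on the first. In-order execution guarantees it
  // sees the filled array, with no host-side synchronization in between.
  stream.launch([&data] { for (int &x : data) x *= 10; });

  // Reading data on the host at this point would be a race.
  stream.fence();
  assert(data[3] == 30);  // safe: all queued work has completed
  return data;
}
```

In this analogy, launch plays the role of an asynchronous call such as parallel_for, and ToyStream::fence plays the role of yakl::fence(). Calling demo() returns the vector only after both queued tasks have run, in the order they were submitted.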