clEnqueueSVMUnmap #890
Comments
Interesting question. We currently say for the description of the
IMHO this is ambiguous whether "otherwise the behavior is undefined" applies to just the sentence about being allocated from the same context as the command queue or whether it applies to The only relevant error condition is:
Do you have a preference how you would like this to behave? It would also be interesting to know how different implementations behave in this case. |
In our implementation, we don't check if the buffer has been mapped or not, and we perform the unmapping operation anyway, which may be redundant, but harmless. |
We (Arm) return |
I tried a couple of Intel OpenCL implementations and it looks like our CPU implementation returns I don't think it would be too hard for our GPU implementation to return an error if this is what we decide to do. Theoretically this could cause an app to fail, since we would be returning an error (and not generating an event) when we weren't previously, but this probably won't be a problem, especially given that other OpenCL implementations are generating an error in this case. In case it's helpful: https://github.com/bashbaug/SimpleOpenCLSamples/blob/svm-unmap-test/samples/99_svmunmap/main.cpp |
Our (Qualcomm) implementation returns CL_INVALID_VALUE for this situation. |
We don't return an error in Mesa for this, but it would be consistent with |
Discussed in the August 22nd teleconference:
|
I have a (draft) PR written up that adds this error condition in #979, but I'm starting to have second thoughts whether it is actually possible to reliably return an error in this case. Consider a tricky case like the following (pseudocode):
event1 = clCreateUserEvent();
clEnqueueSVMUnmap(queue1, ptr, depends on event1);
event0 = clCreateUserEvent();
clEnqueueSVMMap(queue0, ptr, depends on event0);
Observations are:
Given (3), is the expectation that implementations track which SVM regions have been mapped similarly, even though there is (currently) no queryable map count? Should the example above generate an error because clEnqueueSVMUnmap was called before clEnqueueSVMMap, even though the map may actually execute before the unmap? |
Given your interesting examples, I believe we can live without this error case. However, maybe we need a note in the spec to explain why we don't handle this case. |
We observed in the teleconference on October 17th that there could be a similar issue with buffer and image memory objects, also. Specifically, when is the map count updated, and is it used to identify the (synchronous) runtime error? Consider this example:
ptr = clEnqueueMapBuffer(queue0, buffer, blocking_map = CL_TRUE);
// map was blocking, so now MAP_COUNT is one
event0 = clCreateUserEvent();
clEnqueueUnmapMemObject(queue0, ptr, depends on event0);
// what is the MAP_COUNT now?
event1 = clCreateUserEvent();
clEnqueueUnmapMemObject(queue1, ptr, depends on event1); // is this an error?
// what is the MAP_COUNT now?
I haven't written code to try this case specifically, but I do have code in the OpenCL Intercept Layer to display the queried map count after mapping and unmapping memory objects, and empirically it seems like most implementations update the map count as soon as the host API executes. |
I just noticed that I never implemented |
I made a quick tester to try this out: Results were pretty similar across all of the implementations I've tried, though there were a few minor differences. I agree it would be good to have some CTS tests to exercise this, at least for the common cases. In summary:
Example output:
|
Consider separating into multiple issues:
All of these scenarios need CTS tests. |
|
About the function clEnqueueSVMUnmap: what happens if we call this function without calling clEnqueueSVMMap first? This is not well defined in the OpenCL 3.0 reference page (spec).