Skip to content

Conversation

@JablonskiMateusz
Copy link

get a vector of cl_device_id and then use the device ids
to populate a vector of compute::device

Signed-off-by: Mateusz Jablonski [email protected]

@coveralls
Copy link

coveralls commented Jun 24, 2020

Coverage Status

Coverage increased (+0.008%) to 84.027% when pulling 881ca42 on JablonskiMateusz:master into 36c8913 on boostorg:master.

get a vector of cl_device_id and then use the device ids
to populate a vector of compute::device

Signed-off-by: Mateusz Jablonski <[email protected]>
@JablonskiMateusz
Copy link
Author

@jszuppe could you look at this PR?

@kylelutz
Copy link
Collaborator

kylelutz commented Jul 8, 2020

Hey @JablonskiMateusz! Could you provide some detail on the motivation for this change? Was this causing issues with your OpenCL library?

For context, device is a very thin wrapper on cl_device_id, and they have identical memory layouts. There should be no need to first populate vector<cl_device_id> and then copy into vector<device>.

@JablonskiMateusz
Copy link
Author

Hi @kylelutz,
the problem is scenario with sub devices. The current implementation is that boost queries number of context devices, then creates a vector of compute::device (default constructor of compute::device is called, without calling clRetainDevice), then it passes the pointer to data of the vector to clGetContextInfo and it fills the devices (also without calling clRetainDevice) and the vector is passed to the user's app. When the app destroys the vector, then the destructor of compute::device checks if it is a sub device and it calls clReleaseDevice. This causes that boost calls clReleaseDevice without calling clRetainDevice before. According to OpenCL spec, it caues UB:

After the device reference count becomes zero and all the objects attached to device (such as
command-queues) are released, the device object is deleted. Using this function to release a
reference that was not obtained by creating the object or by calling clRetainDevice causes
undefined behavior.

IMHO the cl_device_id should be passed to compute::device to its constructor or to assignment operator, currently it is performed in a quite hacky way like memcpy to the memory of compute::device .

My change was inspired by compute::program::get_devices where the vector of compute::device is populated from vector of cl_device_id https://github.com/boostorg/compute/blob/master/include/boost/compute/program.hpp#L191

@JablonskiMateusz
Copy link
Author

@kylelutz @jszuppe any comments?

@keryell
Copy link
Contributor

keryell commented Jul 14, 2020

I think that in a real application the compute::context::get_devices should not be the critical path, so having a more correct version does not seem like a problem.

@JablonskiMateusz
Copy link
Author

@kylelutz ping

@JablonskiMateusz
Copy link
Author

@kylelutz @jszuppe ping

@JablonskiMateusz
Copy link
Author

kindly reminder @kylelutz

@JablonskiMateusz
Copy link
Author

@kylelutz @jszuppe could you merge this PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants