Wrap clCreateCommandQueueWithProperties #198

Open
inducer opened this issue Aug 19, 2017 · 10 comments

Comments

@inducer
Owner

inducer commented Aug 19, 2017

This would be the pattern to follow:

error*
create_command_queue(clobj_t *queue, clobj_t _ctx,
                     clobj_t _dev, cl_command_queue_properties props)
{
    auto ctx = static_cast<context*>(_ctx);
    auto py_dev = static_cast<device*>(_dev);
    return c_handle_error([&] {
            cl_device_id dev;
            if (py_dev) {
                dev = py_dev->data();
            } else {
                auto devs = pyopencl_get_vec_info(cl_device_id, Context,
                                                  ctx, CL_CONTEXT_DEVICES);
                if (devs.len() == 0) {
                    throw clerror("CommandQueue", CL_INVALID_VALUE,
                                  "context doesn't have any devices? -- "
                                  "don't know which one to default to");
                }
                dev = devs[0];
            }
            cl_command_queue cl_queue =
                pyopencl_call_guarded(clCreateCommandQueue, ctx, dev, props);
            *queue = new command_queue(cl_queue, false);
        });
}

@glupescu

glupescu commented Apr 21, 2018

I tried the quick hack below, but I get INVALID_COMMAND_QUEUE later on. There is no error when calling clCreateCommandQueue, though. Also, setting an invalid QUEUE_SIZE does result in INVALID_QUEUE_VALUE, for example on AMD when creating a queue with a 16MB size when 8MB is the maximum.

glupescu@5874ce0#diff-93de4fb0b5fa13ed2f66448c780b3102

From stackoverflow https://stackoverflow.com/questions/45767759/how-to-set-device-side-queue-size-in-pyopencl/49957843#49957843
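
One way to catch an oversized request before the driver rejects it is to compare the requested size with the device limit first. The following is only a minimal sketch: `checked_device_queue_size` is a hypothetical helper, not part of pyopencl, and it assumes the device object exposes the limit in bytes as `queue_on_device_max_size` (the lowercase attribute form of `CL_DEVICE_QUEUE_ON_DEVICE_MAX_SIZE`), which should be verified against the installed pyopencl version.

```python
def checked_device_queue_size(dev, requested):
    """Return `requested` if it fits the device-side queue limit, else raise.

    `dev` is expected to be a pyopencl Device. The attribute name used
    below is an assumption; check it against your pyopencl version.
    """
    max_size = dev.queue_on_device_max_size  # assumed attribute, in bytes
    if requested > max_size:
        raise ValueError(
            "requested device queue size %d exceeds device maximum %d"
            % (requested, max_size))
    return requested
```

With a limit of 8MB, as in the AMD case above, a 16MB request would raise `ValueError` here instead of failing inside the queue-creation call.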

@inducer
Owner Author

inducer commented Apr 23, 2018

Sorry, I don't have the spare cycles at this moment to investigate in detail. I've put this on my list for later in the summer.

@inducer
Owner Author

inducer commented Aug 13, 2018

#240 adds support for this. I'd be happy to hear your feedback.

@glupescu

Will definitely check this out soon - thanks for adding support on this.

@atypic

atypic commented Jul 8, 2020

Kicking this slightly back to life: did you ever manage to get a device-side queue working through pyopencl? I have spent the better part of today trying to make this work, but the closest I've gotten is the queue being created and a "clEnqueueNDRangeKernel failed: INVALID_COMMAND_QUEUE" being thrown at me when I try to enqueue a dumb kernel (that does nothing).

@inducer
Owner Author

inducer commented Jul 8, 2020

What ICD (OpenCL driver) are you using?

@atypic

atypic commented Jul 9, 2020

Edit: PEBCAK

The ranting below is because I didn't understand that you can't enqueue to a device-side queue from the host side. You need two queues: one on the host, one on the device. You can mark the device queue as default.

--
I've tried both the Nvidia (1.2) and Intel (2.1) runtimes. The method complains about incompatibility when I use Nvidia, of course.

Both using this way:
cl.CommandQueue(self._cl_context, properties=cmcq.ON_DEVICE | cmcq.ON_DEVICE_DEFAULT | cmcq.OUT_OF_ORDER_EXEC_MODE_ENABLE)

and...
cl.CommandQueue(self._cl_context, properties = [cmq.PROPERTIES, cmcq.ON_DEVICE | cmcq.ON_DEVICE_DEFAULT | cmcq.OUT_OF_ORDER_EXEC_MODE_ENABLE, cmq.SIZE, 1024])
leads to
pyopencl._cl.LogicError: clEnqueueNDRangeKernel failed: INVALID_COMMAND_QUEUE

Actually, I lie: on Nvidia this leads to a segfault, though I have read that the ...WithProperties() function is supported now.

After removing this and simply creating an in-order on-host queue (the default), the kernel runs fine...
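
The two-queue arrangement from the edit note at the top of this comment can be sketched roughly as follows. This is a hedged illustration, not tested driver code: `device_queue_properties` and `make_queues` are hypothetical helper names, the pyopencl module is passed in as `cl` only so the helpers can be exercised without a driver, and the property-list form mirrors the second `cl.CommandQueue(...)` call quoted above.

```python
def device_queue_properties(cl, size=None):
    """Build the property list for an on-device default queue.

    `cl` is the imported pyopencl module. `size` is the optional queue
    size in bytes; when omitted, the driver picks its own default.
    """
    cqp = cl.command_queue_properties
    qp = cl.queue_properties
    flags = (cqp.ON_DEVICE | cqp.ON_DEVICE_DEFAULT
             | cqp.OUT_OF_ORDER_EXEC_MODE_ENABLE)
    props = [qp.PROPERTIES, flags]
    if size is not None:
        props += [qp.SIZE, size]
    return props


def make_queues(cl, ctx, size=None):
    """Create a host queue and a device-side default queue on `ctx`."""
    # Host-side queue: the only one the host may enqueue kernels to.
    host_queue = cl.CommandQueue(ctx)
    # Device-side queue: used by kernels that enqueue other kernels.
    device_queue = cl.CommandQueue(
        ctx, properties=device_queue_properties(cl, size))
    return host_queue, device_queue
```

With a real installation this would be called as `make_queues(pyopencl, ctx, size=1024)`. Kernels are then launched from the host on `host_queue` only; the device queue just needs to stay alive while device-side enqueues run.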

@inducer
Owner Author

inducer commented Jul 9, 2020

Thanks for following up! Just to be clear: Did you get things to work on Intel? (I'd expect that to work more than I'd expect the same of Nvidia.)

@atypic

atypic commented Jul 10, 2020

Eh!

It's complicated. So, I am for sure able to create on-device queues on both Intel and Nvidia platforms. I have made the following observations:

  • Using the ...WithProperties() call is required for doing this on Nvidia. For Intel I can use both calls and it works, but only on certain cards. My desktop has a 1660 and it doesn't work (OUT OF RESOURCES error), but the same code on a Tesla V100 works. I have an AMD card as well that throws "out of host memory" when I try to make the second queue using the WithProperties() function, but I am able to use the 'normal' CreateCommandQueue().

  • I can enqueue_kernel() on both Intel and Nvidia, BUT on both platforms I get hangs if I do not turn off code caching. No idea why.

@inducer
Owner Author

inducer commented Jul 10, 2020

Thanks for reporting back! Could you share some example code? I'd like to include that in the tests, if for no other reason than to make sure that the things that are working stay working.


3 participants