-
Notifications
You must be signed in to change notification settings - Fork 73
Clarify requirements for types in device code #733
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
dac49fd
22334ab
d3238a4
f4373e2
e690144
ab95d12
7b489a9
6f04c82
da1c0db
a339d3b
c0d33c0
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -204,6 +204,7 @@ Amongst other things, this restriction makes it illegal for a | |
| However, a function may be defined in another translation unit if the | ||
| implementation defines the [code]#SYCL_EXTERNAL# macro as described in | ||
| <<subsec:syclexternal>>. | ||
| * Use of the [code]#long double# type results in undefined behavior. | ||
|
|
||
| Inside a <<discarded-statement>> or in the case of a | ||
| <<manifestly-constant-evaluated>>, any code accepted by the C++ standard in this | ||
|
|
@@ -216,132 +217,23 @@ The restriction waiver in <<discarded-statement>> or | |
| <<device-function>>. | ||
| ==== | ||
|
|
||
| [[subsec:scalartypes]] | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Note that the latest version of the "main" branch has text that refers to this section identifier. (This was part of the recent |
||
| == Built-in scalar data types | ||
| == Data types | ||
|
|
||
| In a SYCL device compiler, the device definition of all standard {cpp} | ||
| fundamental types from <<table.types.fundamental>> must match the host | ||
| definition of those types, in both size and alignment. | ||
| A device compiler may have this preconfigured so that it can match them based on | ||
| the definitions of those types on the platform, or there may be a necessity for | ||
| a device compiler command-line option to ensure the types are the same. | ||
| All fundamental types defined by the {cpp} core language must be available in | ||
| device code except for those listed in <<sec:language.restrictions.kernels>>. | ||
| Whether extended integer and/or extended floating-point types are available in | ||
| device code is implementation-defined. | ||
|
|
||
| The standard {cpp} fixed width types, e.g. [code]#int8_t#, [code]#int16_t#, | ||
| [code]#int32_t#,[code]#int64_t#, should have the same size as defined by the | ||
| {cpp} standard for host and device. | ||
|
|
||
|
|
||
| [[table.types.fundamental]] | ||
| .Fundamental data types supported by SYCL | ||
| [width="100%",options="header",separator="@",cols="65%,35%"] | ||
| |==== | ||
| @ Fundamental data type @ Description | ||
| a@ | ||
| [source] | ||
| ---- | ||
| bool | ||
| ---- | ||
| a@ A conditional data type which can be either true or false. The value | ||
| true expands to the integer constant 1 and the value false expands to the | ||
| integer constant 0. | ||
|
|
||
| a@ | ||
| [source] | ||
| ---- | ||
| char | ||
| ---- | ||
| a@ A signed or unsigned 8-bit integer, as defined by the {cpp} core language | ||
|
|
||
| a@ | ||
| [source] | ||
| ---- | ||
| signed char | ||
| ---- | ||
| a@ A signed 8-bit integer, as defined by the {cpp} core language | ||
|
|
||
| a@ | ||
| [source] | ||
| ---- | ||
| unsigned char | ||
| ---- | ||
| a@ An unsigned 8-bit integer, as defined by the {cpp} core language | ||
|
|
||
| a@ | ||
| [source] | ||
| ---- | ||
| short int | ||
| ---- | ||
| a@ A signed integer of at least 16-bits, as defined by the {cpp} core language | ||
|
|
||
| a@ | ||
| [source] | ||
| ---- | ||
| unsigned short int | ||
| ---- | ||
| a@ An unsigned integer of at least 16-bits, as defined by the {cpp} core language | ||
|
|
||
| a@ | ||
| [source] | ||
| ---- | ||
| int | ||
| ---- | ||
| a@ A signed integer of at least 16-bits, as defined by the {cpp} core language | ||
|
|
||
| a@ | ||
| [source] | ||
| ---- | ||
| unsigned int | ||
| ---- | ||
| a@ An unsigned integer of at least 16-bits, as defined by the {cpp} core language | ||
|
|
||
| a@ | ||
| [source] | ||
| ---- | ||
| long int | ||
| ---- | ||
| a@ A signed integer of at least 32-bits, as defined by the {cpp} core language | ||
|
|
||
| a@ | ||
| [source] | ||
| ---- | ||
| unsigned long int | ||
| ---- | ||
| a@ An unsigned integer of at least 32-bits, as defined by the {cpp} core language | ||
|
|
||
| a@ | ||
| [source] | ||
| ---- | ||
| long long int | ||
| ---- | ||
| a@ An integer of at least 64-bits, as defined by the {cpp} core language | ||
|
|
||
| a@ | ||
| [source] | ||
| ---- | ||
| unsigned long long int | ||
| ---- | ||
| a@ An unsigned integer of at least 64-bits, as defined by the {cpp} core language | ||
|
|
||
| a@ | ||
| [source] | ||
| ---- | ||
| float | ||
| ---- | ||
| a@ A 32-bit floating-point. The float data type must conform to the IEEE 754 | ||
| single precision storage format. | ||
|
|
||
| a@ | ||
| [source] | ||
| ---- | ||
| double | ||
| ---- | ||
| a@ A 64-bit floating-point. The double data type must conform to the IEEE 754 | ||
| double precision storage format. This type is only supported on devices | ||
| that have [code]#aspect::fp64#. | ||
Pennycook marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
|
||
| |==== | ||
| The availability of a type in device code does not guarantee that it will be | ||
| supported by all devices. | ||
| Some types are only supported on devices with specific device aspects, as | ||
| described in <<sec:device-aspects>>. | ||
|
|
||
| All types which are available in device code must have the same size, alignment | ||
| requirements, and representation on both host and device. | ||
|
Comment on lines
+232
to
+233
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think enumeration types should be addressed in this section as well. There is an implementation requirement as well as a user requirement. For enumerations with a fixed underlying type (e.g.,
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That makes sense to me. I've added wording very similar to what you wrote here in f4373e2. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The updates for enumeration types looks good to me.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm unresolving this conversation. It seems confusing to me to say that applications are required to choose the same underlying type for enumerations. Why state this specifically when there are other things an application could do that cause types to be different on host and device? For example, if an application defines members of a struct to be different on host and device, you would have a similar problem. If we feel we need to say something about this, perhaps we could say it more generally like:
Or maybe it should be "... implementation defined behavior"? I think it is useful to specifically list There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think it is reasonable to relax the requirement that (user specified) fixed underlying types must match so long as device copyability rules retain a requirement for the same (or at least a compatible) fixed underlying type.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think a general note like the one @gmlueck suggests would be useful for application developers, but I didn't think that was what @tahonermann was talking about here. The wording I added in f4373e2 is in the "Device compiler" section, and I didn't intend it to be advice for application developers. The intent was to clarify that the implementation must guarantee that enumerations have matching types. The first sentence just says that enumerations with different fixed underlying types are invalid -- that means a device compiler (and more broadly, the implementation) shouldn't introduce any of those. The second sentence is probably the more important one, because it means that a device compiler may require specific logic to ensure that it selects the same underlying type as the host compiler if an enumeration has no fixed underlying type. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. With regard to explicit fixed types, there is both an implementation and a requirement on application developers. I expect device code restrictions to be relaxed over time to allow more use of the C and C++ standard libraries in device code. Device code can already link with device-only TUs via Likewise, user-defined enumeration types must be consistently defined to be copied between the host and device. For those cases, we can require all such types to be consistently defined, or we can place the requirement in the device copyable restrictions (with or without a compatible size/alignment allowance and corresponding |
||
|
|
||
| For enumerations without a fixed underlying type, the implementation must select | ||
| the same underlying type on host and device. | ||
gmlueck marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
|
||
| == Preprocessor directives and macros | ||
|
|
||
|
|
@@ -387,6 +279,30 @@ literal with greater value. | |
| In addition, for each <<backend>> supported, the preprocessor macros described | ||
| in <<sec:backends>> must be defined by all conformant implementations. | ||
|
|
||
| [[sec:standard-library-support]] | ||
| == {cpp} standard library support | ||
|
|
||
gmlueck marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| In general, any use of functions or types from the [code]#std# namespace in | ||
| device code produces undefined behavior. | ||
| However, the functions and types listed in this section are guaranteed to have | ||
| defined semantics when used in device code. | ||
|
|
||
| === Type aliases | ||
|
|
||
| The following type aliases must alias the same types on host and device: | ||
|
|
||
| * [code]#std::size_t#; | ||
| * [code]#std::ptrdiff_t#; | ||
| * [code]#std::max_align_t#; | ||
| * [code]#std::nullptr_t#; and | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thinking about it, I was wondering about how this is implemented in C++, looking at typedef decltype(nullptr) nullptr_t;which follows https://en.cppreference.com/w/cpp/types/nullptr_t But how an implementation using for example a different host and device compiler can ensure that these are the same types on host and device side? 🤔
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I think we do care that they are the same type on host vs. device. Otherwise, you could construct constexpr values that are different on host vs. device via There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It is also important that the types of the type alias targets match so that mangled names can be correlated across host and device compilation. Consider a SYCL kernel name type that is a class template specialization with a
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The problem with |
||
| * The type aliases for integer types defined in [code]#<cstdint>#. | ||
|
|
||
| === Type traits | ||
|
|
||
| For any type available in device code, specializations of the following type | ||
| traits must have identical definitions on host and device: | ||
|
|
||
| * [code]#std::numeric_limits# | ||
|
|
||
| [[sec:optional-kernel-features]] | ||
| == Optional kernel features | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I'd say:
Statements in this list are inconsistent. Some say that using a prohibited feature causes undefined behavior, while others say that application should not use the feature. It seems to me that we should consistently say that device code must not use these features.
You may argue that an extension might lift some of these restrictions. That's fine. An extension is allowed to lift any prohibition in the spec and assign some defined meaning. There is no need to use the "undefined behavior" wording to allow an extension to do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had previously suggested that we address
long doublevia #371. I think undefined behavior is the status quo.Does "device code must not use these features" mean that such code is ill-formed? I think ISO standards reserve "must" for use in implication (A must be B because of C). I understand that other specifications use the term to specify requirements (e.g., RFC 2119), but I think its use can be confusing. I think "shall" communicates the intent better.
It wouldn't be a bad thing for the SYCL specification to explicitly state its requirement terminology; perhaps via reference to RFC 2119 or similar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still like the direction proposed in #371 -- that is that we make
long doublean optional device feature that is tied to some new aspect. However, that is probably something that needs to wait for SYCL-Next.I think the status quo in DPC++ is that use of
long doublein device code is ill-formed. I think the compiler diagnoses a compile-time error for this. (Sorry, haven't checked lately, but this is my memory.)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was curious about this so I just went to check.
By default, DPC++ throws an error because
long doubledefaults to 128-bit, which isn't supported in SPIR-V:You can circumvent that check by adding the
-mlong-double-64flag. In other words, DPC++ doesn't check forlong double, it just fails if you try to use 128- or 80-bit data types.SimSYCL also doesn't do any checks for use of
long double, and because it's a header-only implementation I can't immediately think of a way they could add these checks. It's also worth noting that device code usinglong doublealways works in SimSYCL, because the code is running on the host CPU.So I think the status quo is undefined behavior. Sometimes it's ill-formed, sometimes you can make it work (by accident), and sometimes it works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, and some people are Argonne is using this flag in production
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It just treats every instance of
long doubleas if it weredouble, so everything is generated usingOpTypeFloat 64.I'm not convinced it works properly, because all of the tests I've tried so far have returned
nan. But the code compiles without producing any errors or warnings, and runs without producing any errors or warnings.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see.
Why? Is it just because you have existing code that uses
long doubleand it's easier to pass the flag than to rewrite the code to usedouble?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly. It begs the question why it even works, and why they
long doublein the first place, but 🤷🏽There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps Argonne needs it because they are using user-defined literals (UDLs). A request for UDL support is what lead me to file #371 in the first place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If so, there may be a cleaner solution now that we have changed the spec to allow
long doublein manifestly constant-evaluated expressions. As I noted in #379, I think an application can define the the UDLoperator""asconsteval, and this will cause the evaluation of UDL's to happen at compile time as a constant expression. SYCL will allow this even in device code because we lifted the restriction aboutlong double(and all other "forbidden" C++ features) when they are used in manifestly constant-evaluated expressions.Of course,
constevalrequires the application to compile in C++20 mode.