Description
Feature gate: #![feature(stdarch_s390x)]
This is a tracking issue for the s390x (aka SystemZ) intrinsics in core::arch::s390x.
Public API
Everything in core::arch::s390x.
Missing instructions
Based on the clang vecintrin.h (roughly similar to the linkable GCC vecintrin.h).
- vec_any_nge (s390x: final batch of intrinsics, stdarch#1743)
- vec_any_ngt (s390x: final batch of intrinsics, stdarch#1743)
- vec_any_nle (s390x: final batch of intrinsics, stdarch#1743)
- vec_any_nlt (s390x: final batch of intrinsics, stdarch#1743)
- vec_all_nge (s390x: final batch of intrinsics, stdarch#1743)
- vec_all_ngt (s390x: final batch of intrinsics, stdarch#1743)
- vec_all_nle (s390x: final batch of intrinsics, stdarch#1743)
- vec_all_nlt (s390x: final batch of intrinsics, stdarch#1743)
- vec_all_nan (s390x: final batch of intrinsics, stdarch#1743)
- vec_all_numeric (s390x: final batch of intrinsics, stdarch#1743)
- vec_any_nan (s390x: final batch of intrinsics, stdarch#1743)
- vec_any_numeric (s390x: final batch of intrinsics, stdarch#1743)
- vec_genmask (s390x: add more intrinsics, stdarch#1728)
- vec_genmasks_8 (s390x: add more intrinsics, stdarch#1728)
- vec_genmasks_16 (s390x: add more intrinsics, stdarch#1728)
- vec_genmasks_32 (s390x: add more intrinsics, stdarch#1728)
- vec_genmasks_64 (s390x: add more intrinsics, stdarch#1728)
- vec_splat_u8 (s390x: add more intrinsics, stdarch#1728)
- vec_splat_s8 (s390x: add more intrinsics, stdarch#1728)
- vec_splat_u16 (s390x: add more intrinsics, stdarch#1728)
- vec_splat_s16 (s390x: add more intrinsics, stdarch#1728)
- vec_splat_u32 (s390x: add more intrinsics, stdarch#1728)
- vec_splat_s32 (s390x: add more intrinsics, stdarch#1728)
- vec_splat_u64 (s390x: add more intrinsics, stdarch#1728)
- vec_splat_s64 (s390x: add more intrinsics, stdarch#1728)
- vec_checksum (s390x: another batch of intrinsics, stdarch#1738)
- vec_gfmsum_128 (s390x: another batch of intrinsics, stdarch#1738)
- vec_gfmsum_accum_128 (s390x: another batch of intrinsics, stdarch#1738)
- vec_ceil (S390x float rounding, stdarch#1712)
- vec_roundp (S390x float rounding, stdarch#1712)
- vec_floor (S390x float rounding, stdarch#1712)
- vec_roundm (S390x float rounding, stdarch#1712)
- vec_trunc (S390x float rounding, stdarch#1712)
- vec_roundz (S390x float rounding, stdarch#1712)
- vec_rint (S390x float rounding, stdarch#1712)
- vec_roundc (S390x float rounding, stdarch#1712)
- vec_round (S390x float rounding, stdarch#1712)
- vec_doublee (s390x: final batch of intrinsics, stdarch#1743)
- vec_add_u128 (s390x: another batch of intrinsics, stdarch#1738)
- vec_addc_u128 (s390x: another batch of intrinsics, stdarch#1738)
- vec_adde_u128 (s390x: another batch of intrinsics, stdarch#1738)
- vec_addec_u128 (s390x: another batch of intrinsics, stdarch#1738)
- vec_bperm_u128 (s390x: final batch of intrinsics, stdarch#1743)
- vec_cmpeq_idx (s390x: final batch of intrinsics, stdarch#1743)
- vec_cmpeq_idx_cc (s390x: final batch of intrinsics, stdarch#1743)
- vec_cmpeq_or_0_idx (s390x: final batch of intrinsics, stdarch#1743)
- vec_cmpeq_or_0_idx_cc (s390x: final batch of intrinsics, stdarch#1743)
- vec_cmpne_idx (s390x: final batch of intrinsics, stdarch#1743)
- vec_cmpne_idx_cc (s390x: final batch of intrinsics, stdarch#1743)
- vec_cmpne_or_0_idx (s390x: final batch of intrinsics, stdarch#1743)
- vec_cmpne_or_0_idx_cc (s390x: final batch of intrinsics, stdarch#1743)
- vec_cmpnrg_cc (s390x: final batch of intrinsics, stdarch#1743)
- vec_cmpnrg_idx (s390x: final batch of intrinsics, stdarch#1743)
- vec_cmpnrg_idx_cc (s390x: final batch of intrinsics, stdarch#1743)
- vec_cmpnrg_or_0_idx (s390x: final batch of intrinsics, stdarch#1743)
- vec_cmpnrg_or_0_idx_cc (s390x: final batch of intrinsics, stdarch#1743)
- vec_cmprg_cc (s390x: final batch of intrinsics, stdarch#1743)
- vec_cmprg_idx (s390x: final batch of intrinsics, stdarch#1743)
- vec_cmprg_idx_cc (s390x: final batch of intrinsics, stdarch#1743)
- vec_cmprg_or_0_idx (s390x: final batch of intrinsics, stdarch#1743)
- vec_cmprg_or_0_idx_cc (s390x: final batch of intrinsics, stdarch#1743)
- vec_cp_until_zero (s390x: final batch of intrinsics, stdarch#1743)
- vec_cp_until_zero_cc (s390x: final batch of intrinsics, stdarch#1743)
- vec_extend_s64 (s390x: final batch of intrinsics, stdarch#1743)
- vec_find_any_eq (s390x: add more intrinsics, stdarch#1728)
- vec_find_any_eq_cc (s390x: add more intrinsics, stdarch#1728)
- vec_find_any_eq_idx (s390x: add more intrinsics, stdarch#1728)
- vec_find_any_eq_idx_cc (s390x: add more intrinsics, stdarch#1728)
- vec_find_any_eq_or_0_idx (s390x: add more intrinsics, stdarch#1728)
- vec_find_any_eq_or_0_idx_cc (s390x: add more intrinsics, stdarch#1728)
- vec_find_any_ne (s390x: add more intrinsics, stdarch#1728)
- vec_find_any_ne_cc (s390x: add more intrinsics, stdarch#1728)
- vec_find_any_ne_idx (s390x: add more intrinsics, stdarch#1728)
- vec_find_any_ne_idx_cc (s390x: add more intrinsics, stdarch#1728)
- vec_find_any_ne_or_0_idx (s390x: add more intrinsics, stdarch#1728)
- vec_find_any_ne_or_0_idx_cc (s390x: add more intrinsics, stdarch#1728)
- vec_fp_test_data_class (s390x: final batch of intrinsics, stdarch#1743)
- vec_gather_element (s390x: final batch of intrinsics, stdarch#1743)
- vec_gfmsum_accum (s390x: another batch of intrinsics, stdarch#1738)
- vec_load_bndry (s390x: another batch of intrinsics, stdarch#1738)
- vec_load_len (s390x: another batch of intrinsics, stdarch#1738)
- vec_load_len_r (s390x: another batch of intrinsics, stdarch#1738)
- vec_load_pair (s390x: another batch of intrinsics, stdarch#1738)
- vec_mergeh (s390x: add more intrinsics, stdarch#1728)
- vec_mergel (s390x: add more intrinsics, stdarch#1728)
- vec_msum_u128 (s390x: final batch of intrinsics, stdarch#1743)
- vec_packs_cc (s390x: another batch of intrinsics, stdarch#1738)
- vec_packsu_cc (s390x: another batch of intrinsics, stdarch#1738)
- vec_popcnt (S390x vector bitwise operations, stdarch#1709)
- vec_rl_mask (s390x: add more intrinsics, stdarch#1728)
- vec_scatter_element (s390x: final batch of intrinsics, stdarch#1743)
- vec_search_string_cc (s390x: final batch of intrinsics, stdarch#1743)
- vec_search_string_until_zero_cc (s390x: final batch of intrinsics, stdarch#1743)
- vec_splat (s390x: add more intrinsics, stdarch#1728)
- vec_store_len (s390x: another batch of intrinsics, stdarch#1738)
- vec_store_len_r (s390x: another batch of intrinsics, stdarch#1738)
- vec_sub_u128 (s390x: add more intrinsics, stdarch#1728)
- vec_subc_u128 (s390x: add more intrinsics, stdarch#1728)
- vec_sube_u128 (s390x: add more intrinsics, stdarch#1728)
- vec_subec_u128 (s390x: add more intrinsics, stdarch#1728)
- vec_sum_u128 (s390x: add more intrinsics, stdarch#1728)
- vec_test_mask (s390x: final batch of intrinsics, stdarch#1743)
- vec_unpackh (s390x: another batch of intrinsics, stdarch#1738)
- vec_unpackl (s390x: another batch of intrinsics, stdarch#1738)
- vec_unsigned (s390x: final batch of intrinsics, stdarch#1743)
blocked on #137447
- vec_insert_and_zero (add vec_extract, vec_insert, vec_promote and vec_insert_and_zero, stdarch#1772)
from nnp-assist (current qemu traps on these)
- vec_extend_to_fp32_hi
- vec_extend_to_fp32_lo
- vec_round_from_fp32
- vec_convert_to_fp16
- vec_convert_from_fp16
deprecated functions
- vec_ctd
- vec_ctd_s64
- vec_ctd_u64
- vec_ctsl
- vec_ctul
- vec_ld2f
- vec_st2f
- vec_xstd2
- vec_xstw4
- vec_xld2
- vec_xlw4
- vec_permi
Steps / History
Unresolved Questions
- None yet.
@rustbot label O-SystemZ
general s390x vector/intrinsics progress is tracked at #130869
cc @taiki-e
Activity
- vec_add for s390x (rust-lang/stdarch#1703)
- s390x: add vec_sub, vec_mul, vec_min, vec_max, vec_abs and vec_splats (rust-lang/stdarch#1704)
- add vec_extract, vec_insert, vec_promote and vec_insert_and_zero (rust-lang/stdarch#1772)
uweigand commented on May 16, 2025
The new machines IBM z17 and IBM LinuxONE Emperor 5 were recently announced. These machines implement the arch15 level of the z/Architecture. Support for this has been added to LLVM here: llvm/llvm-project@8424bf2
Support for the new architecture level also comes with a new revision of the vector intrinsics (implemented across GCC, LLVM, and the IBM compilers). It would be good to update the Rust implementation to match.
The new vecintrin.h file can be seen e.g. here: https://github.com/llvm/llvm-project/blob/8424bf207efd89eacf2fe893b67be98d535e1db6/clang/lib/Headers/vecintrin.h
This implements the following set of changes compared to the previous version:
Generic cleanup
While reviewing the new changes, we noticed a number of inconsistencies and deficiencies in the existing intrinsics, which were cleaned up as part of the new revision. Specifically:
- vec_and, vec_or, and vec_xor intrinsics. These are mostly redundant with the &, |, and ^ operators, but can also be used with floating-point vector arguments.
- [...] vector unsigned char. Also, the operand to be shifted should not be of any vector bool type since the result may not necessarily be a valid bool vector value. Added the following intrinsics: [...] and deprecated those intrinsics: [...]
- vec_load_len/vec_store_len and vec_load_len_r/vec_store_len_r. Both now support only vector signed char and vector unsigned char. Added intrinsics: [...] and deprecated the existing intrinsics: [...]
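The first two points in the list above can be illustrated with a portable scalar model of a single 32-bit lane. This is only a sketch: `lane_and_f32` and `is_valid_bool_lane` are hypothetical helper names, not items from core::arch::s390x, and the mask is passed as raw bits to keep the example short.

```rust
// Scalar model of one 32-bit vector lane. Illustrative only: these helpers
// are hypothetical and are not part of core::arch::s390x.

/// Bitwise AND applied to a float lane by operating on its raw bit pattern.
fn lane_and_f32(a: f32, mask: u32) -> f32 {
    f32::from_bits(a.to_bits() & mask)
}

/// A bool lane is valid only when all bits are set or all bits are clear.
fn is_valid_bool_lane(lane: u32) -> bool {
    lane == 0 || lane == u32::MAX
}

fn main() {
    // Clearing the sign bit of a float lane yields its absolute value,
    // something the plain `&` operator cannot express on float operands.
    let x = lane_and_f32(-1.5, 0x7fff_ffff);
    assert_eq!(x, 1.5);

    // Shifting an all-ones bool lane leaves a mixed bit pattern, which is
    // no longer a valid bool lane.
    let shifted = u32::MAX << 4;
    assert!(is_valid_bool_lane(u32::MAX));
    assert!(!is_valid_bool_lane(shifted));
    println!("{x} {shifted:#x}");
}
```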
Support for 128-bit integer vector types
One main feature of the arch15 ISA is support for a full set of arithmetical operations on 128-bit integer values held in vector registers. This is used to a large extent implicitly by the code-generator back end. However, there are also a number of operations that require intrinsics to fully exploit. We decided to add the following new vector types to be used with those intrinsics: [...]
Note that since the vector length is only 128 bits, these vector types only contain a single element. They are still useful as they use a different ABI (passed in vector registers and not in memory), and it seems cleaner to consistently use "vector" types with the vector intrinsics.
Note that many operations on these types can actually be performed with prior versions of the ISA, so the types have been made available unconditionally. Many existing intrinsics have been extended to support the new types: [...]
Some other intrinsics also now support the new types, but only when the vector-enhancements-3 feature is present: [...]
Finally, a number of existing intrinsics already operated on 128-bit integer types, but used vector unsigned char to represent those values in the absence of a better type. These have now all been deprecated: [...] and replaced by the following new intrinsics: [...] and new overloads of existing intrinsics: [...]
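As a rough illustration of why single-element 128-bit values are useful, the sketch below models a carry chain in portable Rust using plain u128. The function names echo the *_u128 entries in the checklist (vec_add_u128, vec_addc_u128, vec_adde_u128, vec_addec_u128), but these are hypothetical scalar helpers, not the stdarch API, and the carry semantics shown (carry as 0 or 1, carry-in taken from the lowest bit) are an assumption about how those intrinsics behave.

```rust
// Scalar model of 128-bit add-with-carry. Hypothetical helpers; the carry
// conventions are an assumption, not taken from core::arch::s390x.

/// Wrapping 128-bit sum.
fn add_u128(a: u128, b: u128) -> u128 {
    a.wrapping_add(b)
}

/// Carry out of `a + b`, returned as 0 or 1.
fn addc_u128(a: u128, b: u128) -> u128 {
    a.overflowing_add(b).1 as u128
}

/// Wrapping sum of `a + b` plus the carry-in held in the lowest bit of `c`.
fn adde_u128(a: u128, b: u128, c: u128) -> u128 {
    a.wrapping_add(b).wrapping_add(c & 1)
}

/// Carry out of `a + b + carry-in`, returned as 0 or 1.
fn addec_u128(a: u128, b: u128, c: u128) -> u128 {
    let (sum, c1) = a.overflowing_add(b);
    let (_, c2) = sum.overflowing_add(c & 1);
    (c1 | c2) as u128
}

/// 256-bit addition on (high, low) pairs, built from the helpers above.
fn add_u256(a: (u128, u128), b: (u128, u128)) -> (u128, u128) {
    let lo = add_u128(a.1, b.1);
    let carry = addc_u128(a.1, b.1);
    let hi = adde_u128(a.0, b.0, carry);
    (hi, lo)
}

fn main() {
    // The low half overflows, so a carry propagates into the high half.
    assert_eq!(add_u256((0, u128::MAX), (0, 1)), (1, 0));
    assert_eq!(addec_u128(u128::MAX, 0, 1), 1);
    println!("{:?}", add_u256((0, u128::MAX), (0, 1)));
}
```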
Other new ISA capabilities
In addition to the above, the new ISA provides an extended set of new multiplication operations on 64-bit and 128-bit integers, including 64->128 and 128->256 widening multiply. These have been added as new overloads to the following intrinsics (only available with vector-enhancements-3): [...]
Finally, there are a few completely new intrinsics to support new operations (with vector-enhancements-3): [...]
FYI @folkertdev @taiki-e @cuviper @fneddy
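For readers unfamiliar with the 64->128 and 128->256 widening multiplies mentioned above, here is a minimal scalar sketch in portable Rust. `widening_mul_u64` and `widening_mul_u128` are hypothetical names chosen for illustration; they model the arithmetic only and say nothing about how the actual intrinsics are named or typed.

```rust
// Scalar model of widening multiplication. Hypothetical helpers, not the
// core::arch::s390x API.

/// 64 x 64 -> 128 bit product.
fn widening_mul_u64(a: u64, b: u64) -> u128 {
    (a as u128) * (b as u128)
}

/// 128 x 128 -> 256 bit product, returned as (high, low) 128-bit halves.
fn widening_mul_u128(a: u128, b: u128) -> (u128, u128) {
    const MASK: u128 = u64::MAX as u128;
    let (a_hi, a_lo) = ((a >> 64) as u64, a as u64);
    let (b_hi, b_lo) = ((b >> 64) as u64, b as u64);

    // Four 64 x 64 partial products.
    let ll = widening_mul_u64(a_lo, b_lo);
    let lh = widening_mul_u64(a_lo, b_hi);
    let hl = widening_mul_u64(a_hi, b_lo);
    let hh = widening_mul_u64(a_hi, b_hi);

    // Terms landing in bits 64..128; each addend is below 2^64, so the sum
    // cannot overflow a u128.
    let mid = (ll >> 64) + (lh & MASK) + (hl & MASK);
    // The shift drops the bits of `mid` above 2^64; they are added to `hi`.
    let lo = (ll & MASK) | (mid << 64);
    let hi = hh + (lh >> 64) + (hl >> 64) + (mid >> 64);
    (hi, lo)
}

fn main() {
    assert_eq!(widening_mul_u64(1 << 63, 4), 1u128 << 65);
    // (2^128 - 1)^2 = 2^256 - 2^129 + 1
    assert_eq!(widening_mul_u128(u128::MAX, u128::MAX), (u128::MAX - 1, 1));
    println!("{:?}", widening_mul_u128(1 << 127, 2));
}
```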