Basic ops are not implemented for 'Float8_e4m3fn'

### 🐛 Describe the bug

Several basic ops do not support float8 (`torch.float8_e4m3fn`) tensors on XPU. Setup used for the cases below:

```python
import torch

device = "xpu"
dtype = torch.float8_e4m3fn
a = torch.randn((4,4), device=device).to(dtype)
b = torch.randn((4,4), device=device).to(dtype)
```

Each of the following raises `NotImplementedError`:
#### compare_xpu
`a>0` raises `"compare_xpu" not implemented for 'Float8_e4m3fn'`
#### where_xpu
`torch.where(a == 0, a, b)` raises `"where_xpu" not implemented for 'Float8_e4m3fn'`
#### normal_kernel_xpu
`a = torch.randn((4,4), device=device, dtype=torch.float8_e4m3fn)` raises `"normal_kernel_xpu" not implemented for 'Float8_e4m3fn'`
#### add_xpu
`a+b` and `a-b` raise `"add_xpu" not implemented for 'Float8_e4m3fn'`
#### mul_xpu
`a*b` raises `"mul_xpu" not implemented for 'Float8_e4m3fn'`
#### div_true_xpu
`a/b` raises `"div_true_xpu" not implemented for 'Float8_e4m3fn'`

### Versions

```
PyTorch version: 2.9.0.dev20250811+xpu
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: Ubuntu 24.04.2 LTS (x86_64)
GCC version: (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0
Clang version: Could not collect
CMake version: version 3.28.3
Libc version: glibc-2.39

Python version: 3.12.3 (main, Jun 18 2025, 17:59:45) [GCC 13.3.0] (64-bit runtime)
Python platform: Linux-5.15.0-151-generic-x86_64-with-glibc2.39
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
Is XPU available: True
XPU used to build PyTorch: 20250101
Intel GPU driver version:
* intel-opencl-icd:     25.18.33578.15-1146~24.04
* libze1:       1.21.9.0-1136~24.04
Intel GPU models onboard:
* Intel(R) Data Center GPU Max 1550
* Intel(R) Data Center GPU Max 1550
* Intel(R) Data Center GPU Max 1550
* Intel(R) Data Center GPU Max 1550
* Intel(R) Data Center GPU Max 1550
Intel GPU models detected:
* [0] _XpuDeviceProperties(name='Intel(R) Data Center GPU Max 1550', platform_name='Intel(R) oneAPI Unified Runtime over Level-Zero', type=
'gpu', device_id=0xBD5, driver_version='1.6.33578+15', total_memory=65520MB, max_compute_units=512, gpu_eu_count=512, gpu_subslice_count=64
, max_work_group_size=1024, max_num_sub_groups=64, sub_group_sizes=[16 32], has_fp16=1, has_fp64=1, has_atomic64=1)
* [1] _XpuDeviceProperties(name='Intel(R) Data Center GPU Max 1550', platform_name='Intel(R) oneAPI Unified Runtime over Level-Zero', type=
'gpu', device_id=0xBD5, driver_version='1.6.33578+15', total_memory=65520MB, max_compute_units=512, gpu_eu_count=512, gpu_subslice_count=64
, max_work_group_size=1024, max_num_sub_groups=64, sub_group_sizes=[16 32], has_fp16=1, has_fp64=1, has_atomic64=1)
* [2] _XpuDeviceProperties(name='Intel(R) Data Center GPU Max 1550', platform_name='Intel(R) oneAPI Unified Runtime over Level-Zero', type=
'gpu', device_id=0xBD5, driver_version='1.6.33578+15', total_memory=65520MB, max_compute_units=512, gpu_eu_count=512, gpu_subslice_count=64
, max_work_group_size=1024, max_num_sub_groups=64, sub_group_sizes=[16 32], has_fp16=1, has_fp64=1, has_atomic64=1)
* [3] _XpuDeviceProperties(name='Intel(R) Data Center GPU Max 1550', platform_name='Intel(R) oneAPI Unified Runtime over Level-Zero', type=
'gpu', device_id=0xBD5, driver_version='1.6.33578+15', total_memory=65520MB, max_compute_units=512, gpu_eu_count=512, gpu_subslice_count=64
, max_work_group_size=1024, max_num_sub_groups=64, sub_group_sizes=[16 32], has_fp16=1, has_fp64=1, has_atomic64=1)
* [4] _XpuDeviceProperties(name='Intel(R) Data Center GPU Max 1550', platform_name='Intel(R) oneAPI Unified Runtime over Level-Zero', type=
'gpu', device_id=0xBD5, driver_version='1.6.33578+15', total_memory=65520MB, max_compute_units=512, gpu_eu_count=512, gpu_subslice_count=64
, max_work_group_size=1024, max_num_sub_groups=64, sub_group_sizes=[16 32], has_fp16=1, has_fp64=1, has_atomic64=1)
* [5] _XpuDeviceProperties(name='Intel(R) Data Center GPU Max 1550', platform_name='Intel(R) oneAPI Unified Runtime over Level-Zero', type=
'gpu', device_id=0xBD5, driver_version='1.6.33578+15', total_memory=65520MB, max_compute_units=512, gpu_eu_count=512, gpu_subslice_count=64
, max_work_group_size=1024, max_num_sub_groups=64, sub_group_sizes=[16 32], has_fp16=1, has_fp64=1, has_atomic64=1)
* [6] _XpuDeviceProperties(name='Intel(R) Data Center GPU Max 1550', platform_name='Intel(R) oneAPI Unified Runtime over Level-Zero', type=
'gpu', device_id=0xBD5, driver_version='1.6.33578+15', total_memory=65520MB, max_compute_units=512, gpu_eu_count=512, gpu_subslice_count=64
, max_work_group_size=1024, max_num_sub_groups=64, sub_group_sizes=[16 32], has_fp16=1, has_fp64=1, has_atomic64=1)
* [7] _XpuDeviceProperties(name='Intel(R) Data Center GPU Max 1550', platform_name='Intel(R) oneAPI Unified Runtime over Level-Zero', type=
'gpu', device_id=0xBD5, driver_version='1.6.33578+15', total_memory=65520MB, max_compute_units=512, gpu_eu_count=512, gpu_subslice_count=64
, max_work_group_size=1024, max_num_sub_groups=64, sub_group_sizes=[16 32], has_fp16=1, has_fp64=1, has_atomic64=1)
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
Versions of relevant libraries:
[pip3] dpcpp-cpp-rt==2025.1.1
[pip3] galore-torch==1.0
[pip3] impi-rt==2021.15.0
[pip3] intel-cmplr-lib-rt==2025.1.1
[pip3] intel-cmplr-lib-ur==2025.1.1
[pip3] intel-cmplr-lic-rt==2025.1.1
[pip3] intel-opencl-rt==2025.1.1
[pip3] intel-openmp==2025.1.1
[pip3] intel-pti==0.12.3
[pip3] intel-sycl-rt==2025.1.1
[pip3] mkl==2025.1.0
[pip3] mypy_extensions==1.1.0
[pip3] numpy==1.26.4
[pip3] oneccl==2021.15.2
[pip3] oneccl-devel==2021.15.2
[pip3] onemkl-sycl-blas==2025.1.0
[pip3] onemkl-sycl-dft==2025.1.0
[pip3] onemkl-sycl-lapack==2025.1.0
[pip3] onemkl-sycl-rng==2025.1.0
[pip3] onemkl-sycl-sparse==2025.1.0
[pip3] onnx==1.18.0
[pip3] pytorch-msssim==1.0.0
[pip3] pytorch-triton-xpu==3.4.0+gitae324eea
[pip3] tbb==2022.1.0
[pip3] tcmlib==1.3.0
[pip3] torch==2.9.0.dev20250811+xpu
[pip3] torchaudio==2.8.0.dev20250812+xpu
[pip3] torchcodec==0.6.0
[pip3] torchdata==0.11.0
[pip3] torchvision==0.24.0.dev20250812+xpu
[pip3] triton==3.3.0
[pip3] umf==0.10.0
[conda] Could not collect
```
