Open
Labels
api:Java (issues related to the Java API)
Description
Describe the issue
Using the latest Java API (1.22.0), I cannot load the gpt-oss-20b model.
To reproduce
- Get the model from: https://huggingface.co/onnxruntime/gpt-oss-20b-onnx/tree/main/cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4
- Create an OrtSession with the model downloaded in the previous step:
import ai.onnxruntime.OrtEnvironment;
import ai.onnxruntime.OrtSession;

public class TryOnnx {
    public static void main(String[] args) throws Exception {
        var env = OrtEnvironment.getEnvironment();
        var session = env.createSession("models/gpt-oss-20b/cuda/model.onnx", new OrtSession.SessionOptions());
    }
}

-> throws
Exception in thread "main" ai.onnxruntime.OrtException: Error code - ORT_INVALID_GRAPH - message: Load model from models/gpt-oss-20b/cuda/model.onnx failed:This is an invalid model. In Node, ("/model/layers.0/attn/GroupQueryAttention", GroupQueryAttention, "com.microsoft", -1) : ("/model/layers.0/attn/qkv_proj/Add/output_0": tensor(float),"","","past_key_values.0.key": tensor(float),"past_key_values.0.value": tensor(float),"/model/attn_mask_reformat/attn_mask_subgraph/Sub/Cast/output_0": tensor(int32),"/model/attn_mask_reformat/attn_mask_subgraph/Gather/Cast/output_0": tensor(int32),"cos_cache": tensor(float),"sin_cache": tensor(float),"","","model.layers.0.attn.sinks": tensor(float),) -> ("/model/layers.0/attn/GroupQueryAttention/output_0": tensor(float),"present.0.key": tensor(float),"present.0.value": tensor(float),) , Error Node(/model/layers.0/attn/GroupQueryAttention) with schema(com.microsoft::GroupQueryAttention:1) has input size 12 not in range [min=7, max=11].
Urgency
No response
Platform
Linux
OS Version
ubuntu noble
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.22.0
ONNX Runtime API
Java
Architecture
X64
Execution Provider
Default CPU
Execution Provider Library Version
No response
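For triage, the exception message reported under "To reproduce" already carries the key numbers: the GroupQueryAttention node passes 12 inputs (the twelfth being "model.layers.0.attn.sinks"), while the schema bundled with 1.22.0 accepts between 7 and 11. The snippet below is a minimal sketch, using only the Java standard library, of pulling those numbers out of such a message. The class name InputArityCheck and the method parseArity are hypothetical helpers made up for illustration, not part of the ONNX Runtime API, and the regular expression assumes the message wording stays exactly as quoted in this report.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class InputArityCheck {
    // Assumption: the schema error ends with the wording quoted in this report.
    private static final Pattern ARITY = Pattern.compile(
            "has input size (\\d+) not in range \\[min=(\\d+), max=(\\d+)\\]");

    /** Returns {actual, min, max}, or null when the message has a different shape. */
    public static int[] parseArity(String message) {
        Matcher m = ARITY.matcher(message);
        if (!m.find()) {
            return null;
        }
        return new int[] {
            Integer.parseInt(m.group(1)),
            Integer.parseInt(m.group(2)),
            Integer.parseInt(m.group(3))
        };
    }

    public static void main(String[] args) {
        // Tail of the error message copied from the report.
        String msg = "Error Node(/model/layers.0/attn/GroupQueryAttention) with schema"
                + "(com.microsoft::GroupQueryAttention:1) has input size 12 "
                + "not in range [min=7, max=11].";
        int[] a = parseArity(msg);
        if (a == null) {
            System.out.println("Message does not look like an input-count schema error.");
        } else {
            System.out.println("Node passes " + a[0] + " inputs; schema accepts "
                    + a[1] + " to " + a[2] + ".");
        }
    }
}
```

In the reproduction program, the same helper could be fed e.getMessage() from a caught OrtException. A count above the schema maximum suggests the model was exported against a newer GroupQueryAttention definition than the installed runtime provides; whether a later release accepts the extra input is an assumption that this report does not confirm.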