
HF model tracker #899

Open
pdhirajkumarprasad opened this issue Jan 9, 2025 · 3 comments

pdhirajkumarprasad commented Jan 9, 2025

Total no. of models: 545

- PASS: 307 → 408
- Numeric: 12 → 37
- compilation
- compiled_inference
- setup and import

Detailed list


amd-vivekag commented Feb 13, 2025

Passing Summary

TOTAL TESTS = 142

| Stage | # Passing | % of Total | % of Attempted |
|---|---|---|---|
| Setup | 130 | 91.5% | 91.5% |
| IREE Compilation | 64 | 45.1% | 49.2% |
| Gold Inference | 43 | 30.3% | 67.2% |
| IREE Inference Invocation | 38 | 26.8% | 88.4% |
| Inference Comparison (PASS) | 36 | 25.4% | 94.7% |

Fail Summary

TOTAL TESTS = 142

| Stage | # Failed at Stage | % of Total |
|---|---|---|
| Setup | 12 | 8.5% |
| IREE Compilation | 66 | 46.5% |
| Gold Inference | 21 | 14.8% |
| IREE Inference Invocation | 5 | 3.5% |
| Inference Comparison | 2 | 1.4% |
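The two summaries above form a funnel: each stage is only attempted by the tests that passed the previous stage, which is why "% of Attempted" differs from "% of Total", and why each "Fail Summary" row equals attempted minus passed. A minimal sketch (stage names and counts taken from the overall summary above) that reproduces both columns:

```python
# Reproduce the "% of Total" / "% of Attempted" columns and the per-stage
# failure counts from the stage-by-stage pass counts reported above.
total = 142
stages = [
    ("Setup", 130),
    ("IREE Compilation", 64),
    ("Gold Inference", 43),
    ("IREE Inference Invocation", 38),
    ("Inference Comparison (PASS)", 36),
]

attempted = total  # every test attempts the first stage
for name, passed in stages:
    failed = attempted - passed          # matches the "Fail Summary" row
    pct_total = 100.0 * passed / total
    pct_attempted = 100.0 * passed / attempted
    print(f"{name}: {passed} passing "
          f"({pct_total:.1f}% of total, {pct_attempted:.1f}% of attempted), "
          f"{failed} failed at this stage")
    attempted = passed  # only passing tests attempt the next stage
```

For example, IREE Compilation is attempted by the 130 tests that passed Setup, so 64 passing gives 64/130 ≈ 49.2% of attempted but only 64/142 ≈ 45.1% of the total.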

Passing Summary for text-classification testcases:

TOTAL TESTS = 72

| Stage | # Passing | % of Total | % of Attempted |
|---|---|---|---|
| Setup | 72 | 100.0% | 100.0% |
| IREE Compilation | 62 | 86.1% | 86.1% |
| Gold Inference | 62 | 86.1% | 100.0% |
| IREE Inference Invocation | 61 | 84.7% | 98.4% |
| Inference Comparison (PASS) | 60 | 83.3% | 98.4% |

Fail Summary

TOTAL TESTS = 72

| Stage | # Failed at Stage | % of Total |
|---|---|---|
| Setup | 0 | 0.0% |
| IREE Compilation | 10 | 13.9% |
| Gold Inference | 0 | 0.0% |
| IREE Inference Invocation | 1 | 1.4% |
| Inference Comparison | 1 | 1.4% |

Failure summary:

| # | Stage |
|---|---|
| 61 | compilation |
| 6 | compiled_inference |
| 5 | construct_inputs |
| 15 | import_model |
| 16 | native_inference |
| 12 | setup |

GIST containing all the failures: https://gist.github.com/amd-vivekag/377a7b141b40c118f880b2ced176f95c

Setup failures categories:
Total Failures: 12

| # | Device | Issue type | Issue message | Issue no. | # Models impacted | List of models | Assignee | Status |
|---|---|---|---|---|---|---|---|---|
| 1 | CPU | setup | ImportError: Loading an AWQ quantized model requires auto-awq library (`pip install autoawq`) | #918 | 2 | hf_Midnight-Miqu-70B-v1.5-4bit, hf_Meta-Llama-3.1-8B-Instruct-AWQ-INT4 | | |
| 2 | CPU | setup | requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url | #919 | 3 | hf_Multiple_Choice, hf_multiple_choice_model, hf_Multiple_Choice_EN | | |
| 3 | CPU | setup | IndexError: index out of range in self | #920 | 1 | hf_ruRoPEBert-e5-base-2k | | |
| 4 | CPU | setup | Unknown task: fill-mask | #921 | 2 | hf_multi-qa-mpnet-base-cos-v1, hf_all-mpnet-base-v1 | | |
| 5 | CPU | setup | importlib.metadata.PackageNotFoundError: No package metadata was found for bitsandbytes | #922 | 1 | hf_Meta-Llama-3.1-8B-Instruct-bnb-4bit | | |
| 6 | CPU | setup | RuntimeError: Error(s) in loading state_dict for DebertaV2ForMultipleChoice | #923 | 1 | hf_fine-tuned-MoritzLaurer-deberta-v3-large-zeroshot-v2.0-arceasy | | |
| 7 | CPU | setup | TypeError: DisableCompileContextManager.enter....() got an unexpected keyword argument 'dtype' | #924 | 1 | hf_Llama3-8B-1.58-100B-tokens-GGUF | | |
| 8 | CPU | setup | torch.onnx.errors.UnsupportedOperatorError: Exporting the operator 'aten::bitwise_and' to ONNX opset version 14 is not supported | #925 | 1 | hf_Mistral-7B-Instruct-v0.2-GPTQ | | |
| 9 | CPU | import_model | Killed due to OOM | #926 | 1 | hf_StableBeluga2 | | |
| 10 | CPU | import_model | assertNonNull: Assertion `g.get() != nullptr` failed | #927 | 5 | hf_esm2_t36_3B_UR50D, hf_Phi-3.5-mini-instruct, hf_Phi-3-mini-128k-instruct, hf_Phi-3-mini-4k-instruct, hf_zephyr-7b-beta | | |
| 11 | CPU | import_model | assertInVersionRange: Assertion `version >= version_range.first && version <= version_range.second` failed | #928 | 8 | hf_llama-7b, hf_oasst-sft-4-pythia-12b-epoch-3.5, hf_Qwen2.5-1.5B-Instruct, hf_Qwen2.5-7B-Instruct, hf_Qwen2-7B-Instruct, hf_TinyLlama-1.1B-Chat-v1.0, hf_vicuna-7b-v1.5, hf_wasmai-7b-v1 | | |
| 12 | CPU | import_model | Assertion `node->outputs().size() < 4` failed | #929 | 1 | hf_nfnet_l0.ra2_in1k | | |
| 13 | CPU | compilation | error: failed to legalize operation 'torch.operator' that was explicitly marked illegal | #930 | 45 | hf_1_microsoft_deberta_V1.0, hf_1_microsoft_deberta_V1.1, hf_checkpoints_10_1_microsoft_deberta_V1.1_384, hf_checkpoints_1_16, hf_checkpoints_26_9_microsoft_deberta_21_9, hf_checkpoints_28_9_microsoft_deberta_V2, hf_checkpoints_28_9_microsoft_deberta_V4, hf_checkpoints_28_9_microsoft_deberta_V5, hf_checkpoints_29_9_microsoft_deberta_V1, hf_checkpoints_30_9_microsoft_deberta_V1.0_384, hf_checkpoints_3_14, hf_content, hf_deberta-base, hf_deberta_finetuned_pii, hf_deberta-large-mnli, hf_Debertalarg_model_multichoice_Version2, hf_deberta-v2-base-japanese, hf_deberta-v2-base-japanese-char-wwm, hf_deberta-v3-base, hf_deberta-v3-base-absa-v1.1, hf_deberta-v3-base_finetuned_ai4privacy_v2, hf_deberta-v3-base-injection, hf_DeBERTa-v3-base-mnli-fever-anli, hf_deberta-v3-base-squad2, hf_deberta-v3-base-zeroshot-v1.1-all-33, hf_deberta-v3-large, hf_deberta-v3-large_boolq, hf_deberta-v3-large-squad2, hf_deberta-v3-large_test, hf_deberta-v3-large_test_9e-6, hf_deberta-v3-small, hf_deberta-v3-xsmall, hf_llm-mdeberta-v3-swag, hf_mdeberta-v3-base, hf_mDeBERTa-v3-base-mnli-xnli, hf_mdeberta-v3-base-squad2, hf_mDeBERTa-v3-xnli-ft-bs-multiple-choice, hf_Medical-NER, hf_mxbai-rerank-base-v1, hf_mxbai-rerank-xsmall-v1, hf_nli-deberta-v3-base, hf_output, hf_piiranha-v1-detect-personal-information, hf_splinter-base, hf_splinter-base-qass | | |
| 14 | CPU | compilation | error: failed to legalize unresolved materialization from ('i64') to ('index') that remained live after conversion | #931 | 3 | hf_deeplabv3-mobilevit-small, hf_deeplabv3-mobilevit-xx-small, hf_mobilevit-small | | |
| 15 | CPU | compilation | error: 'flow.dispatch.workgroups' op value set has 3 dynamic dimensions but only 2 dimension values are attached | #932 | 3 | hf_beit-base-patch16-224-pt22k, hf_beit-base-patch16-224-pt22k-ft22k, hf_pedestrian_gender_recognition | | |
| 16 | CPU | compilation | error: expected sizes to be non-negative, but got -1 | #933 | 7 | hf_swin_base_patch4_window7_224.ms_in22k_ft_in1k, hf_swin-tiny-patch4-window7-224, hf_yolos-base, hf_yolos-fashionpedia, hf_yolos-small, hf_yolos-small-finetuned-license-plate-detection, hf_yolos-small-rego-plates-detection | | |
| 17 | CPU | compilation | error: 'stream.async.dispatch' op has invalid Read access range | #934 | 1 | hf_dpt-large-ade | | |
| 18 | CPU | compilation | error: 'iree_linalg_ext.pack' op write affecting operations on global resources are restricted to workgroup distributed contexts. | #935 | 1 | hf_distilhubert | | |
| 19 | CPU | compilation | error: expected offsets to be non-negative, but got -1 | #936 | 1 | hf_pnasnet5large.tf_in1k | | |
| 20 | CPU | construct_inputs | ValueError: Asking to pad but the tokenizer does not have a padding token | #938 | 4 | hf_distilgpt2, hf_gpt2, hf_llama-68m, hf_tiny-random-mistral | | |
| 21 | CPU | construct_inputs | name 'tokens' is not defined | #939 | 1 | hf_wavlm-base-plus | @amd-vivekag | |
| 22 | CPU | native_inference | IndexError: tuple index out of range | #940 | 14 | hf_bart-base, hf_gpt2-small-spanish, hf_ivila-row-layoutlm-finetuned-s2vl-v2, hf_opt-125m, hf_Qwen1.5-0.5B-Chat, hf_Qwen2-0.5B, hf_Qwen2.5-0.5B-Instruct, hf_really-tiny-falcon-testing, hf_tiny-dummy-qwen2, hf_tiny-Qwen2ForCausalLM-2.5, hf_tiny-random-GemmaForCausalLM, hf_tiny-random-LlamaForCausalLM, hf_tiny-random-mt5, hf_tiny-random-Phi3ForCausalLM | | |
| 23 | CPU | native_inference | [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Got invalid dimensions for input: pixel_values for the following indices | #941 | 1 | hf_mobilenet_v1_0.75_192 | | |
| 24 | CPU | native_inference | [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Add node | #942 | 1 | hf_eva_large_patch14_196.in22k_ft_in22k_in1k | | |
| 25 | CPU | compiled_inference | INVALID_ARGUMENT; function expected fewer input values; parsing input input.bin | #943 | 4 | hf_ko-sroberta-multitask, hf_robertuito-sentiment-analysis, hf_sbert_large_nlu_ru, hf_sentence-bert-base-ja-mean-tokens-v2 | | |
| 26 | CPU | compiled_inference | :0: FAILED_PRECONDITION; onnx.Expand input has a dim that is not statically 1 | #944 | 2 | hf_phobert-base-finetuned, hf_phobert-large-finetuned | | |

zjgarvey (Collaborator) commented Feb 13, 2025

I assume the most recent run is on CPU? Can you share the detail table in a gist? Can you also post the IREE version?

amd-vivekag commented

> I assume the most recent run is on CPU? Can you share the detail table in a gist? Can you also post the IREE version?

Yes, these are run on CPU. I was getting more failures on GPU (around 40 more). I'm using the following IREE version:

IREE (https://iree.dev):
  IREE compiler version 3.2.0rc20250206 @ f3bef2de123f08b4fc3b0ce691494891bd6760d0
  LLVM version 20.0.0git
  Optimized build

Here is the link to the detailed table:
https://gist.github.com/amd-vivekag/377a7b141b40c118f880b2ced176f95c
