LLAMA3_1-405B-99 Docker Cmind not found issues #2105
Comments
Hi @hhuo24pm, it looks like you are on an old version of the mlperf-automations repository. Can you please do
But llama3-405b is too large a model to try on |
I am having similar issues trying to run the other benchmarks, such as ResNet50. This is the result of running
[2025-02-11 17:19:37,912 main.py:1754 ERROR] - Git command failed: Command '['git', '-C', '/home/hhremote/MLC/repos/mlcommons@mlperf-automations', 'pull']' returned non-zero exit status 1. |
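The log line above only reports the exit status, not the reason git gave. One way to see the underlying message is to rerun the same git command outside MLC and read its stderr. A minimal sketch using only the Python standard library; `run_and_report` is a hypothetical helper written for this illustration and is not part of mlc or mlperf-automations:

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run a command and return (exit_code, stderr_text) without raising on failure."""
    # check=False keeps a non-zero exit status from raising CalledProcessError,
    # so the stderr text that explains the failure can be inspected.
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.returncode, result.stderr.strip()


if __name__ == "__main__":
    # Path copied from the log line above; adjust it for your own checkout.
    repo = "/home/hhremote/MLC/repos/mlcommons@mlperf-automations"
    code, err = run_and_report(["git", "-C", repo, "pull"])
    print(f"exit status: {code}")
    print(err or "(no stderr output)")
    sys.exit(code)
```

Typical causes that show up in stderr this way include local modifications or a detached or diverged branch, but that is a general observation about git, not a diagnosis of this particular failure.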
oh. Looks like it is an old dev version of
I removed MLC and then pulled the appropriate mlcommons perf automation repo, and the previous error seems to be resolved.
Checking existing Docker container: docker ps --format "{{ .ID }}," --filter "ancestor=localhost/local/mlc-script-app-mlperf-inference-generic--reference--resnet50--onnxruntime--cpu--test--r5.0-dev-default--offline:ubuntu-22.04-latest" 2> /dev/null Traceback (most recent call last): |
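The lookup in the log uses `docker ps` with a format string that prints each container ID followed by a comma, filtered by the image name. The sketch below rebuilds that command and parses its output, which can help when checking by hand whether the lookup itself works. Both function names are hypothetical and written for this illustration; they are not the functions MLC uses internally.

```python
import subprocess


def build_docker_ps_cmd(image):
    """Build the `docker ps` lookup shown in the log for a given image name."""
    return ["docker", "ps", "--format", "{{ .ID }},", "--filter", f"ancestor={image}"]


def parse_container_ids(output):
    """Turn comma-terminated IDs (one per line) into a list of container IDs."""
    ids = []
    for chunk in output.replace("\n", "").split(","):
        chunk = chunk.strip()
        if chunk:
            ids.append(chunk)
    return ids


if __name__ == "__main__":
    # Image name copied from the log line above.
    image = (
        "localhost/local/mlc-script-app-mlperf-inference-generic--reference--"
        "resnet50--onnxruntime--cpu--test--r5.0-dev-default--offline:ubuntu-22.04-latest"
    )
    try:
        result = subprocess.run(
            build_docker_ps_cmd(image), capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        print("docker executable not found on PATH")
    else:
        print("exit status:", result.returncode)
        print("containers:", parse_container_ids(result.stdout))
        if result.stderr.strip():
            print("stderr:", result.stderr.strip())
```

An empty list with exit status 0 means no matching container is running; a non-zero exit status with a message on stderr points at Docker itself rather than at the benchmark.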
Looks like docker failed. Can you please share the output of the below?
it just returns 0
Oh, so that's the problem. The user
After this you probably need to restart the shell to make it effective. |
Thank you for helping with the diagnosis, but running ResNet50 now produces a different error:
Looks like the download of the images failed, as only 18599/50000 images got downloaded. Please retry the command after doing
@anandhu-eng are we not checking the checksum for the imagenet download?
@arjunsuresh , yes we are. By default, the imagenet dataset is downloaded using |
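A partial download like the 18599/50000 case above can be caught with two independent checks: counting the extracted images and hashing the downloaded archive. The following is a rough sketch under stated assumptions: the function names are hypothetical, the expected count of 50000 comes from the comment above, and SHA-256 is used only as an example since the thread does not say which checksum MLC verifies.

```python
import hashlib
from pathlib import Path


def count_images(directory, suffixes=(".jpeg", ".jpg")):
    """Count image files under a directory, matching suffixes case-insensitively."""
    return sum(
        1
        for p in Path(directory).rglob("*")
        if p.is_file() and p.suffix.lower() in suffixes
    )


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def is_complete(directory, expected_count=50000):
    """Return True only if the directory holds exactly the expected number of images."""
    return count_images(directory) == expected_count
```

For example, `is_complete("/path/to/imagenet/val")` returning `False` would indicate a truncated extraction; the path here is a placeholder, not the location MLC uses.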
Thank you, this did fix ResNet50 for me, I can now run it. Command: Error message: Running loadgen scenario: Offline and mode: performance ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
(following instruction on https://docs.mlcommons.org/inference/benchmarks/language/llama3_1-405b/)
mlcr run-mlperf,inference,_find-performance,_full,_r5.0-dev \
   --model=llama3_1-405b-99 \
   --implementation=reference \
   --framework=pytorch \
   --category=datacenter \
   --scenario=Offline \
   --execution_mode=test \
   --device=cpu \
   --docker --quiet \
   --test_query_count=10
error after installing cmind: