./run_local.sh tf resnet50 gpu --scenario Offline --threads 2 --user_conf '/root/MLC/repos/mlcommons@mlperf-automations/script/generate-mlperf-inference-user-conf/tmp/008b42b487e843888434313954e77347.conf' --use_preprocessed_dataset --cache_dir /root/MLC/repos/local/cache/get-preprocessed-dataset-imagenet_f2fa0fec --dataset-list /root/MLC/repos/local/cache/extract-file_49f3fae9/val.txt 2>&1 | tee '/mlc-mount/home/anandhu/test_results/93e5e028e03c-reference-gpu-tf-v2.18.0-cu124/resnet50/offline/performance/run_1/console.out'; echo ${PIPESTATUS[0]} > exitstatus
python3 python/main.py --profile resnet50-tf --model "/root/MLC/repos/local/cache/download-file_a5ea13cc/resnet50_v1.pb" --dataset-path /root/MLC/repos/local/cache/get-preprocessed-dataset-imagenet_f2fa0fec --output "/mlc-mount/home/anandhu/test_results/93e5e028e03c-reference-gpu-tf-v2.18.0-cu124/resnet50/offline/performance/run_1" --scenario Offline --threads 2 --user_conf /root/MLC/repos/mlcommons@mlperf-automations/script/generate-mlperf-inference-user-conf/tmp/008b42b487e843888434313954e77347.conf --use_preprocessed_dataset --cache_dir /root/MLC/repos/local/cache/get-preprocessed-dataset-imagenet_f2fa0fec --dataset-list /root/MLC/repos/local/cache/extract-file_49f3fae9/val.txt
INFO:main:Namespace(dataset='imagenet', dataset_path='/root/MLC/repos/local/cache/get-preprocessed-dataset-imagenet_f2fa0fec', dataset_list='/root/MLC/repos/local/cache/extract-file_49f3fae9/val.txt', data_format=None, profile='resnet50-tf', scenario='Offline', max_batchsize=32, model='/root/MLC/repos/local/cache/download-file_a5ea13cc/resnet50_v1.pb', output='/mlc-mount/home/anandhu/test_results/93e5e028e03c-reference-gpu-tf-v2.18.0-cu124/resnet50/offline/performance/run_1', inputs=['input_tensor:0'], outputs=['ArgMax:0'], backend='tensorflow', device=None, model_name='resnet50', threads=2, qps=None, cache=0, cache_dir='/root/MLC/repos/local/cache/get-preprocessed-dataset-imagenet_f2fa0fec', preprocessed_dir=None, use_preprocessed_dataset=True, accuracy=False, find_peak_performance=False, debug=False, user_conf='/root/MLC/repos/mlcommons@mlperf-automations/script/generate-mlperf-inference-user-conf/tmp/008b42b487e843888434313954e77347.conf', audit_conf='audit.config', time=None, count=None, performance_sample_count=None, max_latency=None, samples_per_query=8)
2025-02-12 10:30:16.828618: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2025-02-12 10:30:16.853479: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1739356216.880204 3058 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1739356216.887595 3058 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-02-12 10:30:16.915071: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI AVX512_BF16 AVX512_FP16 AVX_VNNI AMX_TILE AMX_INT8 AMX_BF16 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
INFO:matplotlib.font_manager:generated new fontManager
INFO:imagenet:Loading 50000 preprocessed images using 2 threads
INFO:imagenet:loaded 50000 images, cache=0, already_preprocessed=True, took=0.9sec
WARNING:tensorflow:From /root/MLC/repos/local/cache/get-git-repo_c7f3aa29/inference/vision/classification_and_detection/python/backend_tf.py:55: FastGFile.__init__ (from tensorflow.python.platform.gfile) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.gfile.GFile.
WARNING:tensorflow:From /root/venv/mlc/lib/python3.10/site-packages/tensorflow/python/tools/strip_unused_lib.py:84: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This API was designed for TensorFlow v1. See https://www.tensorflow.org/guide/migrate for instructions on how to migrate your code to TensorFlow v2.
WARNING:tensorflow:From /root/venv/mlc/lib/python3.10/site-packages/tensorflow/python/tools/optimize_for_inference_lib.py:138: remove_training_nodes (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This API was designed for TensorFlow v1. See https://www.tensorflow.org/guide/migrate for instructions on how to migrate your code to TensorFlow v2.
I0000 00:00:1739356257.281273 3058 gpu_device.cc:2022] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 78665 MB memory: -> device: 0, name: NVIDIA H100 80GB HBM3, pci bus id: 0000:18:00.0, compute capability: 9.0
I0000 00:00:1739356257.287068 3058 gpu_device.cc:2022] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 78665 MB memory: -> device: 1, name: NVIDIA H100 80GB HBM3, pci bus id: 0000:2a:00.0, compute capability: 9.0
I0000 00:00:1739356257.290797 3058 gpu_device.cc:2022] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 78665 MB memory: -> device: 2, name: NVIDIA H100 80GB HBM3, pci bus id: 0000:3a:00.0, compute capability: 9.0
I0000 00:00:1739356257.294197 3058 gpu_device.cc:2022] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 78665 MB memory: -> device: 3, name: NVIDIA H100 80GB HBM3, pci bus id: 0000:5d:00.0, compute capability: 9.0
I0000 00:00:1739356257.298001 3058 gpu_device.cc:2022] Created device /job:localhost/replica:0/task:0/device:GPU:4 with 78665 MB memory: -> device: 4, name: NVIDIA H100 80GB HBM3, pci bus id: 0000:9a:00.0, compute capability: 9.0
I0000 00:00:1739356257.308591 3058 gpu_device.cc:2022] Created device /job:localhost/replica:0/task:0/device:GPU:5 with 78665 MB memory: -> device: 5, name: NVIDIA H100 80GB HBM3, pci bus id: 0000:ab:00.0, compute capability: 9.0
I0000 00:00:1739356257.312613 3058 gpu_device.cc:2022] Created device /job:localhost/replica:0/task:0/device:GPU:6 with 78665 MB memory: -> device: 6, name: NVIDIA H100 80GB HBM3, pci bus id: 0000:ba:00.0, compute capability: 9.0
I0000 00:00:1739356257.315976 3058 gpu_device.cc:2022] Created device /job:localhost/replica:0/task:0/device:GPU:7 with 78665 MB memory: -> device: 7, name: NVIDIA H100 80GB HBM3, pci bus id: 0000:db:00.0, compute capability: 9.0
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1739356257.774200 3058 mlir_graph_optimization_pass.cc:401] MLIR V1 optimization pass is not enabled
E0000 00:00:1739356261.281341 3599 cuda_dnn.cc:522] Loaded runtime CuDNN library: 9.0.0 but source was compiled with: 9.3.0. CuDNN library needs to have matching major version and equal or higher minor version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
2025-02-12 10:31:01.283436: W tensorflow/core/framework/op_kernel.cc:1841] OP_REQUIRES failed at conv_ops_fused_impl.h:625 : INVALID_ARGUMENT: No DNN in stream executor.
2025-02-12 10:31:01.283474: I tensorflow/core/framework/local_rendezvous.cc:405] Local rendezvous is aborting with status: INVALID_ARGUMENT: No DNN in stream executor.
[[{{node resnet_model/Relu}}]]
2025-02-12 10:31:01.283486: I tensorflow/core/framework/local_rendezvous.cc:405] Local rendezvous is aborting with status: INVALID_ARGUMENT: No DNN in stream executor.
[[{{node resnet_model/Relu}}]]
[[ArgMax/_3]]
2025-02-12 10:31:01.283510: I tensorflow/core/framework/local_rendezvous.cc:424] Local rendezvous recv item cancelled. Key hash: 5866837555468538586
Traceback (most recent call last):
  File "/root/venv/mlc/lib/python3.10/site-packages/tensorflow/python/client/session.py", line 1407, in _do_call
    return fn(*args)
  File "/root/venv/mlc/lib/python3.10/site-packages/tensorflow/python/client/session.py", line 1390, in _run_fn
    return self._call_tf_sessionrun(options, feed_dict, fetch_list,
  File "/root/venv/mlc/lib/python3.10/site-packages/tensorflow/python/client/session.py", line 1483, in _call_tf_sessionrun
    return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict,
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) INVALID_ARGUMENT: No DNN in stream executor.
     [[{{node resnet_model/Relu}}]]
     [[ArgMax/_3]]
  (1) INVALID_ARGUMENT: No DNN in stream executor.
     [[{{node resnet_model/Relu}}]]
0 successful operations.
0 derived errors ignored.
output log:
run command:
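For reference, the cuda_dnn.cc:522 message in the log states the compatibility rule the runtime applies: the loaded cuDNN must have the same major version as the one TensorFlow was compiled against, and an equal or higher minor version. Below is a minimal sketch of that rule as a standalone check. The function names `parse_version` and `cudnn_runtime_is_compatible` are hypothetical helpers written for this illustration, not part of TensorFlow or cuDNN; the version strings are the ones reported in the log.

```python
def parse_version(text):
    """Split a dotted version string such as '9.3.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def cudnn_runtime_is_compatible(loaded, compiled):
    """Apply the rule quoted in the log message.

    The loaded (runtime) library must match the compiled-against major
    version and have an equal or higher minor version.
    Hypothetical helper for illustration only.
    """
    loaded_major, loaded_minor = parse_version(loaded)[:2]
    compiled_major, compiled_minor = parse_version(compiled)[:2]
    return loaded_major == compiled_major and loaded_minor >= compiled_minor


if __name__ == "__main__":
    # Versions taken from the log: runtime 9.0.0, compiled with 9.3.0.
    print(cudnn_runtime_is_compatible("9.0.0", "9.3.0"))
```

With the versions from this run the check fails on the minor version (0 is lower than 3), which is consistent with the "No DNN in stream executor" errors that follow in the log.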