Problem with run GPTJ-99 #2146

Open · Agalakdak opened this issue Mar 5, 2025 · 0 comments
I tried to run GPT-J using the command below, but the run failed with a build error. What are the possible solutions?

mlcr run-mlperf,inference,_find-performance,_full,_r4.1-dev \
   --model=gptj-99 \
   --implementation=nvidia \
   --framework=tensorrt \
   --category=edge \
   --scenario=Offline \
   --execution_mode=test \
   --device=cuda  \
   --docker --quiet \
   --test_query_count=50

[ 93%] Built target layers_src
[ 93%] Building CUDA object tensorrt_llm/kernels/CMakeFiles/kernels_src.dir/decoderMaskedMultiheadAttention/decoderMaskedMultiheadAttention48_float.cu.o
/code/tensorrt_llm/cpp/tensorrt_llm/cutlass_extensions/include/cutlass_extensions/epilogue/threadblock/epilogue_tensor_op_int32.h(97): error: class template "cutlass::epilogue::threadblock::detail::DefaultIteratorsTensorOp" has already been defined
struct DefaultIteratorsTensorOp<cutlass::bfloat16_t, int32_t, 8, ThreadblockShape, WarpShape, InstructionShape,
^

/code/tensorrt_llm/cpp/tensorrt_llm/cutlass_extensions/include/cutlass_extensions/epilogue/threadblock/epilogue_tensor_op_int32.h(97): error: class template "cutlass::epilogue::threadblock::detail::DefaultIteratorsTensorOp" has already been defined
struct DefaultIteratorsTensorOp<cutlass::bfloat16_t, int32_t, 8, ThreadblockShape, WarpShape, InstructionShape,
^

/code/tensorrt_llm/cpp/tensorrt_llm/cutlass_extensions/include/cutlass_extensions/epilogue/threadblock/epilogue_tensor_op_int32.h(97): error: class template "cutlass::epilogue::threadblock::detail::DefaultIteratorsTensorOp" has already been defined
struct DefaultIteratorsTensorOp<cutlass::bfloat16_t, int32_t, 8, ThreadblockShape, WarpShape, InstructionShape,
^

1 error detected in the compilation of "/code/tensorrt_llm/cpp/tensorrt_llm/kernels/cutlass_kernels/int8_gemm/int8_gemm_bf16.cu".
/code/tensorrt_llm/cpp/tensorrt_llm/cutlass_extensions/include/cutlass_extensions/epilogue/threadblock/epilogue_tensor_op_int32.h(97): error: class template "cutlass::epilogue::threadblock::detail::DefaultIteratorsTensorOp" has already been defined
struct DefaultIteratorsTensorOp<cutlass::bfloat16_t, int32_t, 8, ThreadblockShape, WarpShape, InstructionShape,
^

gmake[3]: *** [tensorrt_llm/kernels/CMakeFiles/kernels_src.dir/build.make:12917: tensorrt_llm/kernels/CMakeFiles/kernels_src.dir/cutlass_kernels/int8_gemm/int8_gemm_bf16.cu.o] Error 2
gmake[3]: *** Waiting for unfinished jobs....
1 error detected in the compilation of "/code/tensorrt_llm/cpp/tensorrt_llm/kernels/cutlass_kernels/int8_gemm/int8_gemm_fp32.cu".
1 error detected in the compilation of "/code/tensorrt_llm/cpp/tensorrt_llm/kernels/cutlass_kernels/int8_gemm/int8_gemm_int32.cu".
gmake[3]: *** [tensorrt_llm/kernels/CMakeFiles/kernels_src.dir/build.make:12947: tensorrt_llm/kernels/CMakeFiles/kernels_src.dir/cutlass_kernels/int8_gemm/int8_gemm_fp32.cu.o] Error 2
gmake[3]: *** [tensorrt_llm/kernels/CMakeFiles/kernels_src.dir/build.make:12962: tensorrt_llm/kernels/CMakeFiles/kernels_src.dir/cutlass_kernels/int8_gemm/int8_gemm_int32.cu.o] Error 2
1 error detected in the compilation of "/code/tensorrt_llm/cpp/tensorrt_llm/kernels/cutlass_kernels/int8_gemm/int8_gemm_fp16.cu".
gmake[3]: *** [tensorrt_llm/kernels/CMakeFiles/kernels_src.dir/build.make:12932: tensorrt_llm/kernels/CMakeFiles/kernels_src.dir/cutlass_kernels/int8_gemm/int8_gemm_fp16.cu.o] Error 2
[ 93%] Built target common_src
[ 93%] Built target runtime_src
gmake[2]: *** [CMakeFiles/Makefile2:816: tensorrt_llm/kernels/CMakeFiles/kernels_src.dir/all] Error 2
gmake[1]: *** [CMakeFiles/Makefile2:771: tensorrt_llm/CMakeFiles/tensorrt_llm.dir/rule] Error 2
gmake: *** [Makefile:192: tensorrt_llm] Error 2
Traceback (most recent call last):
File "/code/tensorrt_llm/scripts/build_wheel.py", line 319, in <module>
main(**vars(args))
File "/code/tensorrt_llm/scripts/build_wheel.py", line 164, in main
build_run(
File "/usr/lib/python3.10/subprocess.py", line 526, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'cmake --build . --config Release --parallel 64 --target tensorrt_llm nvinfer_plugin_tensorrt_llm th_common bindings ' returned non-zero exit status 2.
make: *** [Makefile:102: devel_run] Error 1
make: Leaving directory '/home/user/MLC/repos/local/cache/get-git-repo_15a989f1/repo/docker'
Traceback (most recent call last):
File "/home/user/mlc/bin/mlcr", line 8, in <module>
sys.exit(mlcr())
^^^^^^
File "/home/user/mlc/lib/python3.12/site-packages/mlc/main.py", line 1715, in mlcr
main()
File "/home/user/mlc/lib/python3.12/site-packages/mlc/main.py", line 1797, in main
res = method(run_args)
^^^^^^^^^^^^^^^^
File "/home/user/mlc/lib/python3.12/site-packages/mlc/main.py", line 1529, in run
return self.call_script_module_function("run", run_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/mlc/lib/python3.12/site-packages/mlc/main.py", line 1509, in call_script_module_function
result = automation_instance.run(run_args) # Pass args to the run method
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 225, in run
r = self._run(i)
^^^^^^^^^^^^
File "/home/user/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 1772, in _run
r = customize_code.preprocess(ii)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/MLC/repos/mlcommons@mlperf-automations/script/run-mlperf-inference-app/customize.py", line 284, in preprocess
r = mlc.access(ii)
^^^^^^^^^^^^^^
File "/home/user/mlc/lib/python3.12/site-packages/mlc/main.py", line 92, in access
result = method(options)
^^^^^^^^^^^^^^^
File "/home/user/mlc/lib/python3.12/site-packages/mlc/main.py", line 1526, in docker
return self.call_script_module_function("docker", run_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/mlc/lib/python3.12/site-packages/mlc/main.py", line 1511, in call_script_module_function
result = automation_instance.docker(run_args) # Pass args to the run method
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 4691, in docker
return docker_run(self, i)
^^^^^^^^^^^^^^^^^^^
File "/home/user/MLC/repos/mlcommons@mlperf-automations/automation/script/docker.py", line 308, in docker_run
r = self_module._run_deps(
^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 3702, in _run_deps
r = self.action_object.access(ii)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/mlc/lib/python3.12/site-packages/mlc/main.py", line 92, in access
result = method(options)
^^^^^^^^^^^^^^^
File "/home/user/mlc/lib/python3.12/site-packages/mlc/main.py", line 1529, in run
return self.call_script_module_function("run", run_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/mlc/lib/python3.12/site-packages/mlc/main.py", line 1519, in call_script_module_function
raise ScriptExecutionError(f"Script {function_name} execution failed. Error : {error}")
mlc.main.ScriptExecutionError: Script run execution failed. Error : MLC script failed (name = get-ml-model-gptj, return code = 256)

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Please file an issue at https://github.com/mlcommons/mlperf-automations/issues along with the full MLC command being run and the relevant
or full console log.
