immich-machine-learning throws an "Exception in ASGI application" error on starting any kind of machine learning job #9720

Closed
LuminarLeaf opened this issue May 24, 2024 · 6 comments


@LuminarLeaf

The bug

As stated in the title, as soon as I start any machine learning job (smart search in my use case), the container downloads the model but then throws an "Exception in ASGI application" error with a long Python traceback pointing to onnxruntime.
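
For context, a quick sanity check of whether the GPU is reachable from inside the container at all (only a sketch, assuming the NVIDIA container toolkit injects nvidia-smi into the container; the Python path is the venv visible in the logs below):

docker exec -it immich_machine_learning nvidia-smi
docker exec -it immich_machine_learning /opt/venv/bin/python -c "import onnxruntime as ort; print(ort.get_available_providers())"

If nvidia-smi already fails here, the problem is in the Docker/WSL2 GPU passthrough rather than in Immich itself; the second command only shows which execution providers this onnxruntime build ships with.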

The OS that Immich Server is running on

Win11 + WSL2

Version of Immich Server

v1.105.1

Version of Immich Mobile App

v1.105.0

Platform with the issue

  • Server
  • Web
  • Mobile

Your docker-compose.yml content

#
# WARNING: Make sure to use the docker-compose.yml of the current release:
#
# https://github.com/immich-app/immich/releases/latest/download/docker-compose.yml
#
# The compose file on main may not be compatible with the latest release.
#

name: immich

services:
  immich-server:
    container_name: immich_server
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    command: ['start.sh', 'immich']
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
    env_file:
      - .env
    ports:
      - 2283:3001
    depends_on:
      - redis
      - database
    restart: always

  immich-microservices:
    container_name: immich_microservices
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    extends: # uncomment this section for hardware acceleration - see https://immich.app/docs/features/hardware-transcoding
      file: hwaccel.transcoding.yml
      service: nvenc # set to one of [nvenc, quicksync, rkmpp, vaapi, vaapi-wsl] for accelerated transcoding
    command: [ "start.sh", "microservices" ]
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
    env_file:
      - .env
    depends_on:
      - redis
      - database
    restart: always

  immich-machine-learning:
    container_name: immich_machine_learning
    # For hardware acceleration, add one of -[armnn, cuda, openvino] to the image tag.
    # Example tag: ${IMMICH_VERSION:-release}-cuda
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}-cuda
    # extends: # uncomment this section for hardware acceleration - see https://immich.app/docs/features/ml-hardware-acceleration
    #   file: hwaccel.ml.yml
    #   service: cuda # set to one of [armnn, cuda, openvino, openvino-wsl] for accelerated inference - use the `-wsl` version for WSL2 where applicable
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities:
                - gpu
    volumes:
      - model-cache:/cache
    env_file:
      - .env
    restart: always

  redis:
    container_name: immich_redis
    image: registry.hub.docker.com/library/redis:6.2-alpine@sha256:84882e87b54734154586e5f8abd4dce69fe7311315e2fc6d67c29614c8de2672
    restart: always

  database:
    container_name: immich_postgres
    image: registry.hub.docker.com/tensorchord/pgvecto-rs:pg14-v0.2.0@sha256:90724186f0a3517cf6914295b5ab410db9ce23190a2d9d0b9dd6463e3fa298f0
    ports:
      - 5432:5432
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_DB: ${DB_DATABASE_NAME}
    volumes:
      - ${DB_DATA_LOCATION}:/var/lib/postgresql/data
    restart: always

volumes:
  model-cache:
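
For comparison, the extends-based variant that the comments in the immich-machine-learning service describe (hwaccel.ml.yml with the cuda service) would look roughly like the sketch below. This is only the documented alternative to the inline deploy block I am using above, not my current config:

  immich-machine-learning:
    container_name: immich_machine_learning
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}-cuda
    extends:
      file: hwaccel.ml.yml
      service: cuda
    volumes:
      - model-cache:/cache
    env_file:
      - .env
    restart: always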

Your .env content

# You can find documentation for all the supported env variables at https://immich.app/docs/install/environment-variables

# The location where your uploaded files are stored
UPLOAD_LOCATION=./library

# The Immich version to use. You can pin this to a specific version like "v1.71.0"
IMMICH_VERSION=release

# Connection secret for postgres. You should change it to a random password
DB_PASSWORD=<password>

# External Libraries path(s)

# The values below this line do not need to be changed
###################################################################################
DB_HOSTNAME=immich_postgres
DB_USERNAME=postgres
DB_DATABASE_NAME=immich
DB_DATA_LOCATION=./postgres

REDIS_HOSTNAME=immich_redis

Reproduction steps

1. docker compose down -v
2. docker compose up -d
3. Go to the Jobs page and start the Smart Search job for all assets

Relevant log output

[05/24/24 04:39:52] INFO     Starting gunicorn 22.0.0                           
[05/24/24 04:39:52] INFO     Listening at: http://[::]:3003 (8)                 
[05/24/24 04:39:52] INFO     Using worker: app.config.CustomUvicornWorker       
[05/24/24 04:39:52] INFO     Booting worker with pid: 16                        
[05/24/24 04:39:56] INFO     Started server process [16]                        
[05/24/24 04:39:56] INFO     Waiting for application startup.                   
[05/24/24 04:39:56] INFO     Created in-memory cache with unloading after 300s  
                             of inactivity.                                     
[05/24/24 04:39:56] INFO     Initialized request thread pool with 8 threads.    
[05/24/24 04:39:56] INFO     Application startup complete.                      
[05/24/24 04:43:50] INFO     Setting 'ViT-B-32__openai' execution providers to  
                             ['CUDAExecutionProvider', 'CPUExecutionProvider'], 
                             in descending order of preference                  
[05/24/24 04:43:50] INFO     Downloading clip model 'ViT-B-32__openai'. This may
                             take a while.                                      
/opt/venv/lib/python3.11/site-packages/huggingface_hub/file_download.py:1194: UserWarning: `local_dir_use_symlinks` parameter is deprecated and will be ignored. The process to download files to a local folder has been updated and do not rely on symlinks anymore. You only need to pass a destination folder as`local_dir`.
For more details, check out https://huggingface.co/docs/huggingface_hub/main/en/guides/download#download-files-to-local-folder.
  warnings.warn(

Fetching 11 files:   0%|          | 0/11 [00:00<?, ?it/s]
Fetching 11 files:   9%|▉         | 1/11 [00:02<00:26,  2.67s/it]
Fetching 11 files:  27%|██▋       | 3/11 [00:03<00:08,  1.12s/it]
Fetching 11 files:  36%|███▋      | 4/11 [00:04<00:05,  1.25it/s]
Fetching 11 files:  45%|████▌     | 5/11 [00:19<00:34,  5.67s/it]
Fetching 11 files:  91%|█████████ | 10/11 [00:24<00:02,  2.40s/it]
Fetching 11 files: 100%|██████████| 11/11 [00:24<00:00,  2.24s/it]
[05/24/24 04:44:16] INFO     Loading clip model 'ViT-B-32__openai' to memory    
*************** EP Error ***************
EP Error /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:114 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] CUDA failure 500: named symbol not found ; GPU=32642 ; hostname=da5cd404647d ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=245 ; expr=cudaSetDevice(info_.device_id); 

 when using ['CUDAExecutionProvider', 'CPUExecutionProvider']
Falling back to ['CUDAExecutionProvider', 'CPUExecutionProvider'] and retrying.
****************************************
[05/24/24 04:44:16] INFO     Loading clip model 'ViT-B-32__openai' to memory    
*************** EP Error ***************
EP Error /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:114 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] CUDA failure 500: named symbol not found ; GPU=950236142 ; hostname=da5cd404647d ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=245 ; expr=cudaSetDevice(info_.device_id); 

 when using ['CUDAExecutionProvider', 'CPUExecutionProvider']
Falling back to ['CUDAExecutionProvider', 'CPUExecutionProvider'] and retrying.
****************************************
[05/24/24 04:44:16] ERROR    Exception in ASGI application                      
                                                                                
                             ╭─────── Traceback (most recent call last) ───────╮
                             │ /opt/venv/lib/python3.11/site-packages/onnxrunt │
                             │ ime/capi/onnxruntime_inference_collection.py:41 │
                             │ 9 in __init__                                   │
                             │                                                 │
                             │    416 │   │   disabled_optimizers = kwargs["di │
                             │        kwargs else None                         │
                             │    417 │   │                                    │
                             │    418 │   │   try:                             │
                             │ ❱  419 │   │   │   self._create_inference_sessi │
                             │        disabled_optimizers)                     │
                             │    420 │   │   except (ValueError, RuntimeError │
                             │    421 │   │   │   if self._enable_fallback:    │
                             │    422 │   │   │   │   try:                     │
                             │                                                 │
                             │ /opt/venv/lib/python3.11/site-packages/onnxrunt │
                             │ ime/capi/onnxruntime_inference_collection.py:48 │
                             │ 3 in _create_inference_session                  │
                             │                                                 │
                             │    480 │   │   │   disabled_optimizers = set(di │
                             │    481 │   │                                    │
                             │    482 │   │   # initialize the C++ InferenceSe │
                             │ ❱  483 │   │   sess.initialize_session(provider │
                             │    484 │   │                                    │
                             │    485 │   │   self._sess = sess                │
                             │    486 │   │   self._sess_options = self._sess. │
                             ╰─────────────────────────────────────────────────╯
                             RuntimeError:                                      
                             /onnxruntime_src/onnxruntime/core/providers/cuda/cu
                             da_call.cc:121 std::conditional_t<THRW, void,      
                             onnxruntime::common::Status>                       
                             onnxruntime::CudaCall(ERRTYPE, const char*, const  
                             char*, ERRTYPE, const char*, const char*, int)     
                             [with ERRTYPE = cudaError; bool THRW = true;       
                             std::conditional_t<THRW, void,                     
                             onnxruntime::common::Status> = void]               
                             /onnxruntime_src/onnxruntime/core/providers/cuda/cu
                             da_call.cc:114 std::conditional_t<THRW, void,      
                             onnxruntime::common::Status>                       
                             onnxruntime::CudaCall(ERRTYPE, const char*, const  
                             char*, ERRTYPE, const char*, const char*, int)     
                             [with ERRTYPE = cudaError; bool THRW = true;       
                             std::conditional_t<THRW, void,                     
                             onnxruntime::common::Status> = void] CUDA failure  
                             500: named symbol not found ; GPU=32642 ;          
                             hostname=da5cd404647d ;                            
                             file=/onnxruntime_src/onnxruntime/core/providers/cu
                             da/cuda_execution_provider.cc ; line=245 ;         
                             expr=cudaSetDevice(info_.device_id);               
                                                                                
                                                                                
                                                                                
                             The above exception was the direct cause of the    
                             following exception:                               
                                                                                
                             ╭─────── Traceback (most recent call last) ───────╮
                             │ /usr/src/app/main.py:116 in predict             │
                             │                                                 │
                             │   113 │   except orjson.JSONDecodeError:        │
                             │   114 │   │   raise HTTPException(400, f"Invali │
                             │   115 │                                         │
                             │ ❱ 116 │   model = await load(await model_cache. │
                             │       ttl=settings.model_ttl, **kwargs))        │
                             │   117 │   model.configure(**kwargs)             │
                             │   118 │   outputs = await run(model.predict, in │
                             │   119 │   return ORJSONResponse(outputs)        │
                             │                                                 │
                             │ /usr/src/app/main.py:137 in load                │
                             │                                                 │
                             │   134 │   │   │   model.load()                  │
                             │   135 │                                         │
                             │   136 │   try:                                  │
                             │ ❱ 137 │   │   await run(_load, model)           │
                             │   138 │   │   return model                      │
                             │   139 │   except (OSError, InvalidProtobuf, Bad │
                             │   140 │   │   log.warning(                      │
                             │                                                 │
                             │ /usr/src/app/main.py:125 in run                 │
                             │                                                 │
                             │   122 async def run(func: Callable[..., Any], i │
                             │   123 │   if thread_pool is None:               │
                             │   124 │   │   return func(inputs)               │
                             │ ❱ 125 │   return await asyncio.get_running_loop │
                             │   126                                           │
                             │   127                                           │
                             │   128 async def load(model: InferenceModel) ->  │
                             │                                                 │
                             │ /usr/local/lib/python3.11/concurrent/futures/th │
                             │ read.py:58 in run                               │
                             │                                                 │
                             │ /usr/src/app/main.py:134 in _load               │
                             │                                                 │
                             │   131 │                                         │
                             │   132 │   def _load(model: InferenceModel) -> N │
                             │   133 │   │   with lock:                        │
                             │ ❱ 134 │   │   │   model.load()                  │
                             │   135 │                                         │
                             │   136 │   try:                                  │
                             │   137 │   │   await run(_load, model)           │
                             │                                                 │
                             │ /usr/src/app/models/base.py:52 in load          │
                             │                                                 │
                             │    49 │   │   │   return                        │
                             │    50 │   │   self.download()                   │
                             │    51 │   │   log.info(f"Loading {self.model_ty │
                             │       to memory")                               │
                             │ ❱  52 │   │   self._load()                      │
                             │    53 │   │   self.loaded = True                │
                             │    54 │                                         │
                             │    55 │   def predict(self, inputs: Any, **mode │
                             │                                                 │
                             │ /usr/src/app/models/clip.py:146 in _load        │
                             │                                                 │
                             │   143 │   │   super().__init__(clean_name(model │
                             │   144 │                                         │
                             │   145 │   def _load(self) -> None:              │
                             │ ❱ 146 │   │   super()._load()                   │
                             │   147 │   │   self._load_tokenizer()            │
                             │   148 │   │                                     │
                             │   149 │   │   size: list[int] | int = self.prep │
                             │                                                 │
                             │ /usr/src/app/models/clip.py:41 in _load         │
                             │                                                 │
                             │    38 │   │                                     │
                             │    39 │   │   if self.mode == "vision" or self. │
                             │    40 │   │   │   log.debug(f"Loading clip visi │
                             │ ❱  41 │   │   │   self.vision_model = self._mak │
                             │    42 │   │   │   log.debug(f"Loaded clip visio │
                             │    43 │                                         │
                             │    44 │   def _predict(self, image_or_text: Ima │
                             │                                                 │
                             │ /usr/src/app/models/base.py:117 in              │
                             │ _make_session                                   │
                             │                                                 │
                             │   114 │   │   │   case ".armnn":                │
                             │   115 │   │   │   │   session = AnnSession(mode │
                             │   116 │   │   │   case ".onnx":                 │
                             │ ❱ 117 │   │   │   │   session = ort.InferenceSe │
                             │   118 │   │   │   │   │   model_path.as_posix() │
                             │   119 │   │   │   │   │   sess_options=self.ses │
                             │   120 │   │   │   │   │   providers=self.provid │
                             │                                                 │
                             │ /opt/venv/lib/python3.11/site-packages/onnxrunt │
                             │ ime/capi/onnxruntime_inference_collection.py:43 │
                             │ 2 in __init__                                   │
                             │                                                 │
                             │    429 │   │   │   │   │   self.disable_fallbac │
                             │    430 │   │   │   │   │   return               │
                             │    431 │   │   │   │   except Exception as fall │
                             │ ❱  432 │   │   │   │   │   raise fallback_error │
                             │    433 │   │   │   # Fallback is disabled. Rais │
                             │    434 │   │   │   raise e                      │
                             │    435                                          │
                             │                                                 │
                             │ /opt/venv/lib/python3.11/site-packages/onnxrunt │
                             │ ime/capi/onnxruntime_inference_collection.py:42 │
                             │ 7 in __init__                                   │
                             │                                                 │
                             │    424 │   │   │   │   │   print(f"EP Error {e} │
                             │    425 │   │   │   │   │   print(f"Falling back │
                             │    426 │   │   │   │   │   print("************* │
                             │ ❱  427 │   │   │   │   │   self._create_inferen │
                             │    428 │   │   │   │   │   # Fallback only once │
                             │    429 │   │   │   │   │   self.disable_fallbac │
                             │    430 │   │   │   │   │   return               │
                             │                                                 │
                             │ /opt/venv/lib/python3.11/site-packages/onnxrunt │
                             │ ime/capi/onnxruntime_inference_collection.py:48 │
                             │ 3 in _create_inference_session                  │
                             │                                                 │
                             │    480 │   │   │   disabled_optimizers = set(di │
                             │    481 │   │                                    │
                             │    482 │   │   # initialize the C++ InferenceSe │
                             │ ❱  483 │   │   sess.initialize_session(provider │
                             │    484 │   │                                    │
                             │    485 │   │   self._sess = sess                │
                             │    486 │   │   self._sess_options = self._sess. │
                             ╰─────────────────────────────────────────────────╯
                             RuntimeError:                                      
                             /onnxruntime_src/onnxruntime/core/providers/cuda/cu
                             da_call.cc:121 std::conditional_t<THRW, void,      
                             onnxruntime::common::Status>                       
                             onnxruntime::CudaCall(ERRTYPE, const char*, const  
                             char*, ERRTYPE, const char*, const char*, int)     
                             [with ERRTYPE = cudaError; bool THRW = true;       
                             std::conditional_t<THRW, void,                     
                             onnxruntime::common::Status> = void]               
                             /onnxruntime_src/onnxruntime/core/providers/cuda/cu
                             da_call.cc:114 std::conditional_t<THRW, void,      
                             onnxruntime::common::Status>                       
                             onnxruntime::CudaCall(ERRTYPE, const char*, const  
                             char*, ERRTYPE, const char*, const char*, int)     
                             [with ERRTYPE = cudaError; bool THRW = true;       
                             std::conditional_t<THRW, void,                     
                             onnxruntime::common::Status> = void] CUDA failure  
                             500: named symbol not found ; GPU=950236142 ;      
                             hostname=da5cd404647d ;                            
                             file=/onnxruntime_src/onnxruntime/core/providers/cu
                             da/cuda_execution_provider.cc ; line=245 ;         
                             expr=cudaSetDevice(info_.device_id);               
                                                                                
                                                                                
[05/24/24 04:44:17] INFO     Loading clip model 'ViT-B-32__openai' to memory    
*************** EP Error ***************
EP Error /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:114 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] CUDA failure 500: named symbol not found ; GPU=950236142 ; hostname=da5cd404647d ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=245 ; expr=cudaSetDevice(info_.device_id); 

 when using ['CUDAExecutionProvider', 'CPUExecutionProvider']
Falling back to ['CUDAExecutionProvider', 'CPUExecutionProvider'] and retrying.
****************************************
[05/24/24 04:44:18] ERROR    Exception in ASGI application                      
                                                                                
                             ╭─────── Traceback (most recent call last) ───────╮
                             │ /opt/venv/lib/python3.11/site-packages/onnxrunt │
                             │ ime/capi/onnxruntime_inference_collection.py:41 │
                             │ 9 in __init__                                   │
                             │                                                 │
                             │    416 │   │   disabled_optimizers = kwargs["di │
                             │        kwargs else None                         │
                             │    417 │   │                                    │
                             │    418 │   │   try:                             │
                             │ ❱  419 │   │   │   self._create_inference_sessi │
                             │        disabled_optimizers)                     │
                             │    420 │   │   except (ValueError, RuntimeError │
                             │    421 │   │   │   if self._enable_fallback:    │
                             │    422 │   │   │   │   try:                     │
                             │                                                 │
                             │ /opt/venv/lib/python3.11/site-packages/onnxrunt │
                             │ ime/capi/onnxruntime_inference_collection.py:48 │
                             │ 3 in _create_inference_session                  │
                             │                                                 │
                             │    480 │   │   │   disabled_optimizers = set(di │
                             │    481 │   │                                    │
                             │    482 │   │   # initialize the C++ InferenceSe │
                             │ ❱  483 │   │   sess.initialize_session(provider │
                             │    484 │   │                                    │
                             │    485 │   │   self._sess = sess                │
                             │    486 │   │   self._sess_options = self._sess. │
                             ╰─────────────────────────────────────────────────╯
                             RuntimeError:                                      
                             /onnxruntime_src/onnxruntime/core/providers/cuda/cu
                             da_call.cc:121 std::conditional_t<THRW, void,      
                             onnxruntime::common::Status>                       
                             onnxruntime::CudaCall(ERRTYPE, const char*, const  
                             char*, ERRTYPE, const char*, const char*, int)     
                             [with ERRTYPE = cudaError; bool THRW = true;       
                             std::conditional_t<THRW, void,                     
                             onnxruntime::common::Status> = void]               
                             /onnxruntime_src/onnxruntime/core/providers/cuda/cu
                             da_call.cc:114 std::conditional_t<THRW, void,      
                             onnxruntime::common::Status>                       
                             onnxruntime::CudaCall(ERRTYPE, const char*, const  
                             char*, ERRTYPE, const char*, const char*, int)     
                             [with ERRTYPE = cudaError; bool THRW = true;       
                             std::conditional_t<THRW, void,                     
                             onnxruntime::common::Status> = void] CUDA failure  
                             500: named symbol not found ; GPU=950236142 ;      
                             hostname=da5cd404647d ;                            
                             file=/onnxruntime_src/onnxruntime/core/providers/cu
                             da/cuda_execution_provider.cc ; line=245 ;         
                             expr=cudaSetDevice(info_.device_id);               
                                                                                
                                                                                
                                                                                
                             The above exception was the direct cause of the    
                             following exception:                               
                                                                                
                             ╭─────── Traceback (most recent call last) ───────╮
                             │ /usr/src/app/main.py:116 in predict             │
                             │                                                 │
                             │   113 │   except orjson.JSONDecodeError:        │
                             │   114 │   │   raise HTTPException(400, f"Invali │
                             │   115 │                                         │
                             │ ❱ 116 │   model = await load(await model_cache. │
                             │       ttl=settings.model_ttl, **kwargs))        │
                             │   117 │   model.configure(**kwargs)             │
                             │   118 │   outputs = await run(model.predict, in │
                             │   119 │   return ORJSONResponse(outputs)        │
                             │                                                 │
                             │ /usr/src/app/main.py:137 in load                │
                             │                                                 │
                             │   134 │   │   │   model.load()                  │
                             │   135 │                                         │
                             │   136 │   try:                                  │
                             │ ❱ 137 │   │   await run(_load, model)           │
                             │   138 │   │   return model                      │
                             │   139 │   except (OSError, InvalidProtobuf, Bad │
                             │   140 │   │   log.warning(                      │
                             │                                                 │
                             │ /usr/src/app/main.py:125 in run                 │
                             │                                                 │
                             │   122 async def run(func: Callable[..., Any], i │
                             │   123 │   if thread_pool is None:               │
                             │   124 │   │   return func(inputs)               │
                             │ ❱ 125 │   return await asyncio.get_running_loop │
                             │   126                                           │
                             │   127                                           │
                             │   128 async def load(model: InferenceModel) ->  │
                             │                                                 │
                             │ /usr/local/lib/python3.11/concurrent/futures/th │
                             │ read.py:58 in run                               │
                             │                                                 │
                             │ /usr/src/app/main.py:134 in _load               │
                             │                                                 │
                             │   131 │                                         │
                             │   132 │   def _load(model: InferenceModel) -> N │
                             │   133 │   │   with lock:                        │
                             │ ❱ 134 │   │   │   model.load()                  │
                             │   135 │                                         │
                             │   136 │   try:                                  │
                             │   137 │   │   await run(_load, model)           │
                             │                                                 │
                             │ /usr/src/app/models/base.py:52 in load          │
                             │                                                 │
                             │    49 │   │   │   return                        │
                             │    50 │   │   self.download()                   │
                             │    51 │   │   log.info(f"Loading {self.model_ty │
                             │       to memory")                               │
                             │ ❱  52 │   │   self._load()                      │
                             │    53 │   │   self.loaded = True                │
                             │    54 │                                         │
                             │    55 │   def predict(self, inputs: Any, **mode │
                             │                                                 │
                             │ /usr/src/app/models/clip.py:146 in _load        │
                             │                                                 │
                             │   143 │   │   super().__init__(clean_name(model │
                             │   144 │                                         │
                             │   145 │   def _load(self) -> None:              │
                             │ ❱ 146 │   │   super()._load()                   │
                             │   147 │   │   self._load_tokenizer()            │
                             │   148 │   │                                     │
                             │   149 │   │   size: list[int] | int = self.prep │
                             │                                                 │
                             │ /usr/src/app/models/clip.py:41 in _load         │
                             │                                                 │
                             │    38 │   │                                     │
                             │    39 │   │   if self.mode == "vision" or self. │
                             │    40 │   │   │   log.debug(f"Loading clip visi │
                             │ ❱  41 │   │   │   self.vision_model = self._mak │
                             │    42 │   │   │   log.debug(f"Loaded clip visio │
                             │    43 │                                         │
                             │    44 │   def _predict(self, image_or_text: Ima │
                             │                                                 │
                             │ /usr/src/app/models/base.py:117 in              │
                             │ _make_session                                   │
                             │                                                 │
                             │   114 │   │   │   case ".armnn":                │
                             │   115 │   │   │   │   session = AnnSession(mode │
                             │   116 │   │   │   case ".onnx":                 │
                             │ ❱ 117 │   │   │   │   session = ort.InferenceSe │
                             │   118 │   │   │   │   │   model_path.as_posix() │
                             │   119 │   │   │   │   │   sess_options=self.ses │
                             │   120 │   │   │   │   │   providers=self.provid │
                             │                                                 │
                             │ /opt/venv/lib/python3.11/site-packages/onnxrunt │
                             │ ime/capi/onnxruntime_inference_collection.py:43 │
                             │ 2 in __init__                                   │
                             │                                                 │
                             │    429 │   │   │   │   │   self.disable_fallbac │
                             │    430 │   │   │   │   │   return               │
                             │    431 │   │   │   │   except Exception as fall │
                             │ ❱  432 │   │   │   │   │   raise fallback_error │
                             │    433 │   │   │   # Fallback is disabled. Rais │
                             │    434 │   │   │   raise e                      │
                             │    435                                          │
                             │                                                 │
                             │ /opt/venv/lib/python3.11/site-packages/onnxrunt │
                             │ ime/capi/onnxruntime_inference_collection.py:42 │
                             │ 7 in __init__                                   │
                             │                                                 │
                             │    424 │   │   │   │   │   print(f"EP Error {e} │
                             │    425 │   │   │   │   │   print(f"Falling back │
                             │    426 │   │   │   │   │   print("************* │
                             │ ❱  427 │   │   │   │   │   self._create_inferen │
                             │    428 │   │   │   │   │   # Fallback only once │
                             │    429 │   │   │   │   │   self.disable_fallbac │
                             │    430 │   │   │   │   │   return               │
                             │                                                 │
                             │ /opt/venv/lib/python3.11/site-packages/onnxrunt │
                             │ ime/capi/onnxruntime_inference_collection.py:48 │
                             │ 3 in _create_inference_session                  │
                             │                                                 │
                             │    480 │   │   │   disabled_optimizers = set(di │
                             │    481 │   │                                    │
                             │    482 │   │   # initialize the C++ InferenceSe │
                             │ ❱  483 │   │   sess.initialize_session(provider │
                             │    484 │   │                                    │
                             │    485 │   │   self._sess = sess                │
                             │    486 │   │   self._sess_options = self._sess. │
                             ╰─────────────────────────────────────────────────╯
                             RuntimeError:                                      
                             /onnxruntime_src/onnxruntime/core/providers/cuda/cu
                             da_call.cc:121 std::conditional_t<THRW, void,      
                             onnxruntime::common::Status>                       
                             onnxruntime::CudaCall(ERRTYPE, const char*, const  
                             char*, ERRTYPE, const char*, const char*, int)     
                             [with ERRTYPE = cudaError; bool THRW = true;       
                             std::conditional_t<THRW, void,                     
                             onnxruntime::common::Status> = void]               
                             /onnxruntime_src/onnxruntime/core/providers/cuda/cu
                             da_call.cc:114 std::conditional_t<THRW, void,      
                             onnxruntime::common::Status>                       
                             onnxruntime::CudaCall(ERRTYPE, const char*, const  
                             char*, ERRTYPE, const char*, const char*, int)     
                             [with ERRTYPE = cudaError; bool THRW = true;       
                             std::conditional_t<THRW, void,                     
                             onnxruntime::common::Status> = void] CUDA failure  
                             500: named symbol not found ; GPU=950236142 ;      
                             hostname=da5cd404647d ;                            
                             file=/onnxruntime_src/onnxruntime/core/providers/cu
                             da/cuda_execution_provider.cc ; line=245 ;         
                             expr=cudaSetDevice(info_.device_id);               
                                                                                
                                                                                
[05/24/24 04:44:18] INFO     Loading clip model 'ViT-B-32__openai' to memory    
*************** EP Error ***************
EP Error /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:114 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] CUDA failure 500: named symbol not found ; GPU=950236142 ; hostname=da5cd404647d ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=245 ; expr=cudaSetDevice(info_.device_id); 

 when using ['CUDAExecutionProvider', 'CPUExecutionProvider']
Falling back to ['CUDAExecutionProvider', 'CPUExecutionProvider'] and retrying.
****************************************

Additional information

This didn't happen with previous versions and only started after updating to 1.105.x.
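In case it helps with reproducing: a rough way to confirm the regression, assuming the stack uses the standard Immich .env + compose setup, is to pin the image tag back to the last release that worked here (the exact previous version below is an assumption) and restart the stack:

# .env — hypothetical pin to the last known-good release
IMMICH_VERSION=v1.104.0

docker compose down
docker compose pull
docker compose up -d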

@LuminarLeaf
Author

I tried creating new instances on Arch and Windows; Arch worked without any problems, but Windows still had the same error even with a fresh instance.

@bo0tzz
Member

bo0tzz commented May 24, 2024

@mertalev I've seen a few cases of this "CUDA failure 500: named symbol not found" error now. Is it an issue on our end, or just misconfiguration?

@LuminarLeaf
Author

Here's another thing that happened:

[05/24/24 09:56:22] INFO     Loading clip model 'ViT-B-32__openai' to memory    
*************** EP Error ***************
EP Error /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:114 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] CUDA failure 500: named symbol not found ; GPU=-1980571051 ; hostname=822e5a2b482e ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=245 ; expr=cudaSetDevice(info_.device_id); 

 when using ['CUDAExecutionProvider', 'CPUExecutionProvider']
Falling back to ['CUDAExecutionProvider', 'CPUExecutionProvider'] and retrying.
****************************************
[05/24/24 09:56:23] INFO     Loading clip model 'ViT-B-32__openai' to memory    
[05/24/24 09:56:25] ERROR    Worker (pid:17) was sent SIGKILL! Perhaps out of   
                             memory?                                            
[05/24/24 09:56:25] INFO     Booting worker with pid: 524                       
[05/24/24 09:56:30] INFO     Started server process [524]                       
[05/24/24 09:56:30] INFO     Waiting for application startup.                   
[05/24/24 09:56:30] INFO     Created in-memory cache with unloading after 300s  
                             of inactivity.                                     
[05/24/24 09:56:30] INFO     Initialized request thread pool with 8 threads.    
[05/24/24 09:56:30] INFO     Application startup complete.

@LuminarLeaf
Author

Well, it turns out nothing is able to access the GPU inside WSL for some reason.

I have tried these two:

docker run --rm -it --gpus=all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
Output
Run "nbody -benchmark [-numbodies=<numBodies>]" to measure performance.
        -fullscreen       (run n-body simulation in fullscreen mode)
        -fp64             (use double precision floating point values for simulation)
        -hostmem          (stores simulation data in host memory)
        -benchmark        (run benchmark to measure performance)
        -numbodies=<N>    (number of bodies (>= 1) to run in simulation)
        -device=<d>       (where d=0,1,2.... for the CUDA device to use)
        -numdevices=<i>   (where i=(number of CUDA devices > 0) to use for simulation)
        -compare          (compares simulation results running once on the default GPU and once on the CPU)
        -cpu              (run n-body simulation on the CPU)
        -tipsy=<file.bin> (load a tipsy model file for simulation)

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

Error: only 0 Devices available, 1 requested.  Exiting.

and running tf.config.list_physical_devices() inside of this:

docker run --rm -it -p 8888:8888 --gpus all tensorflow/tensorflow:latest-gpu-jupyter
Output
2024-05-24 13:08:45.278336: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-05-24 13:08:45.570992: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-05-24 13:09:00.572513: E external/local_xla/xla/stream_executor/cuda/cuda_driver.cc:282] failed call to cuInit: CUDA_ERROR_NOT_FOUND: named symbol not found
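For anyone else debugging this, one more quick check (just a sketch; the nvidia/cuda image tag below is an assumption and any recent base tag should do) is to run nvidia-smi both directly in the WSL2 distro and inside a plain CUDA container. Comparing the two narrows down whether the failure is in the WSL2 GPU passthrough itself or only in the container toolkit layer:

# directly in the WSL2 shell (the Windows driver exposes this binary to WSL)
nvidia-smi

# inside a minimal CUDA container (image tag is an assumption)
docker run --rm --gpus all nvidia/cuda:12.3.2-base-ubuntu22.04 nvidia-smi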

Closing this issue as it is not related to Immich, but I would appreciate any kind of help in solving it.

@cliffwoolley

Please see NVIDIA/nvidia-container-toolkit#520.

@jasonbrimblecombe

There's a version update for Docker Desktop to 4.31.0 that resolves this issue; it contains the updated NVIDIA Container Toolkit 1.15.0.
https://docs.docker.com/desktop/release-notes/#4310
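For anyone checking whether they already have the fixed toolkit, a hedged way to verify (assuming the toolkit CLIs are reachable from your shell; with Docker Desktop they ship inside its own WSL distro, so simply updating Docker Desktop and re-running the container tests above may be the easier confirmation) is:

# report the installed NVIDIA Container Toolkit version
nvidia-ctk --version
nvidia-container-cli --version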
