
Commit ba1043b ("first commit")
1 parent: 51f5f2b


76 files changed (+5830, -3 lines)

.gitignore

Lines changed: 4 additions & 0 deletions

```
/result/
/.idea/
/__pycache__/
/weights/
```

README.md

Lines changed: 175 additions & 3 deletions
# AdaCLIP (Detecting Anomalies for Novel Categories)

[![HuggingFace Space](https://img.shields.io/badge/🤗-HuggingFace%20Space-cyan.svg)]()

> [**ECCV 24**] [**AdaCLIP: Adapting CLIP with Hybrid Learnable Prompts for Zero-Shot Anomaly Detection**]()
>
> by [Yunkang Cao](https://caoyunkang.github.io/), [Jiangning Zhang](https://zhangzjn.github.io/), [Luca Frittoli](https://scholar.google.com/citations?user=cdML_XUAAAAJ),
> [Yuqi Cheng](https://scholar.google.com/citations?user=02BC-WgAAAAJ&hl=en), [Weiming Shen](https://scholar.google.com/citations?user=FuSHsx4AAAAJ&hl=en), [Giacomo Boracchi](https://boracchi.faculty.polimi.it/)

## Introduction

Zero-shot anomaly detection (ZSAD) targets the identification of anomalies within images from arbitrary novel categories.
This study introduces AdaCLIP for the ZSAD task, leveraging a pre-trained vision-language model (VLM), CLIP.
AdaCLIP incorporates learnable prompts into CLIP and optimizes them through training on auxiliary annotated anomaly detection data.
Two types of learnable prompts are proposed: *static* and *dynamic*. Static prompts are shared across all images, serving to preliminarily adapt CLIP for ZSAD.
In contrast, dynamic prompts are generated for each test image, providing CLIP with dynamic adaptation capabilities.
The combination of static and dynamic prompts, referred to as hybrid prompts, yields enhanced ZSAD performance.
Extensive experiments conducted across 14 real-world anomaly detection datasets from industrial and medical domains indicate that AdaCLIP outperforms other ZSAD methods and generalizes better to different categories and even domains.
Finally, our analysis highlights the importance of diverse auxiliary data and optimized prompts for enhanced generalization capacity.

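As a rough sketch of the idea (not the implementation in this repository, which injects prompts at several CLIP layers; cf. `prompting_depth`, `prompting_length`, and `prompting_type` in `app.py`), hybrid prompting can be pictured as a shared static prompt tensor plus an image-conditioned dynamic one:

```python
import torch
import torch.nn as nn


class HybridPromptSketch(nn.Module):
    """Toy model of hybrid prompting: static prompts are shared learnable
    parameters; dynamic prompts are predicted from each image embedding."""

    def __init__(self, prompt_len: int = 5, dim: int = 768):
        super().__init__()
        # Static prompts: optimized on auxiliary data, shared by every image.
        self.static_prompts = nn.Parameter(torch.randn(prompt_len, dim) * 0.02)
        # Dynamic prompts: generated per image at test time.
        self.dynamic_generator = nn.Linear(dim, prompt_len * dim)

    def forward(self, image_embedding: torch.Tensor) -> torch.Tensor:
        # image_embedding: (batch, dim)
        batch, dim = image_embedding.shape
        dynamic = self.dynamic_generator(image_embedding).view(batch, -1, dim)
        # Hybrid prompts = static (shared) + dynamic (image-specific).
        return self.static_prompts.unsqueeze(0) + dynamic  # (batch, prompt_len, dim)


# One fake CLIP image embedding of width 768:
print(HybridPromptSketch()(torch.randn(1, 768)).shape)  # torch.Size([1, 5, 768])
```
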
## Overview of AdaCLIP

![overview](asset/framework.png)

## 🛠️ Getting Started

### Installation

To set up the AdaCLIP environment, follow one of the methods below:

- Clone this repo:
```shell
git clone https://github.com/caoyunkang/AdaCLIP.git && cd AdaCLIP
```
- You can use our provided installation script for an automated setup:
```shell
sh install.sh
```
- If you prefer to construct the experimental environment manually, follow these steps:
```shell
conda create -n AdaCLIP python=3.9.5 -y
conda activate AdaCLIP
pip install torch==1.10.1+cu111 torchvision==0.11.2+cu111 torchaudio==0.10.1 -f https://download.pytorch.org/whl/cu111/torch_stable.html
pip install tqdm tensorboard setuptools==58.0.4 opencv-python scikit-image scikit-learn matplotlib seaborn ftfy regex numpy==1.26.4
pip install gradio # Optional, for the demo app
```
- Remember to update the dataset root in `config.py` according to your preference:
```python
DATA_ROOT = '../datasets' # Original setting
```

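Optionally, a quick sanity check (a small snippet of our own, not part of the repository) confirms that the pinned packages import cleanly and that CUDA is visible:

```python
# Quick environment check; run inside the AdaCLIP conda environment.
import torch, numpy, cv2
print(torch.__version__, torch.cuda.is_available())  # expect 1.10.1+cu111 and True on a CUDA 11.1 machine
print(numpy.__version__, cv2.__version__)            # numpy should be 1.26.4 per the install command
```
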
### Dataset Preparation

Please download our processed visual anomaly detection datasets to your `DATA_ROOT` as needed.

#### Industrial Visual Anomaly Detection Datasets

Note: some links are still being prepared...

| Dataset | Google Drive | Baidu Drive | Task |
|------------|------------------|------------------|------------------|
| MVTec AD | [Google Drive](link) | [Baidu Drive](link) | Anomaly Detection & Localization |
| VisA | [Google Drive](link) | [Baidu Drive](link) | Anomaly Detection & Localization |
| MPDD | [Google Drive](link) | [Baidu Drive](link) | Anomaly Detection & Localization |
| BTAD | [Google Drive](link) | [Baidu Drive](link) | Anomaly Detection & Localization |
| KSDD | [Google Drive](link) | [Baidu Drive](link) | Anomaly Detection & Localization |
| DAGM | [Google Drive](link) | [Baidu Drive](link) | Anomaly Detection & Localization |
| DTD-Synthetic | [Google Drive](link) | [Baidu Drive](link) | Anomaly Detection & Localization |

#### Medical Visual Anomaly Detection Datasets

| Dataset | Google Drive | Baidu Drive | Task |
|------------|------------------|------------------|------------------|
| HeadCT | [Google Drive](link) | [Baidu Drive](link) | Anomaly Detection |
| BrainMRI | [Google Drive](link) | [Baidu Drive](link) | Anomaly Detection |
| Br35H | [Google Drive](link) | [Baidu Drive](link) | Anomaly Detection |
| ISIC | [Google Drive](link) | [Baidu Drive](link) | Anomaly Localization |
| ColonDB | [Google Drive](link) | [Baidu Drive](link) | Anomaly Localization |
| ClinicDB | [Google Drive](link) | [Baidu Drive](link) | Anomaly Localization |
| TN3K | [Google Drive](link) | [Baidu Drive](link) | Anomaly Localization |

#### Custom Datasets

To use your custom dataset, follow these steps:

1. Refer to the instructions in `./data_preprocess` to generate the JSON file for your dataset (a sketch of the expected layout is shown after this list).
2. Use `./dataset/base_dataset.py` to construct your own dataset.

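For reference, the preprocessing scripts in this commit (e.g. `data_preprocess/br35h.py`) write a `meta.json` with the structure sketched below; your own generator should produce the same keys. The paths and class names here are illustrative placeholders:

```python
# Sketch of the meta.json layout produced by the scripts in ./data_preprocess
# (see data_preprocess/br35h.py); values below are placeholders, not real data.
import json

meta = {
    "train": {
        "my_class": [
            {
                "img_path": "my_class/train/good/000.png",  # relative to the dataset root
                "mask_path": "",                             # empty when no pixel-level mask exists
                "cls_name": "my_class",
                "specie_name": "good",
                "anomaly": 0,                                # 0 = normal, 1 = anomalous
            },
        ],
    },
    "test": {
        "my_class": [],
    },
}

with open("meta.json", "w") as f:
    f.write(json.dumps(meta, indent=4) + "\n")
```
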
### Weight Preparation

We offer various pre-trained weights on different auxiliary datasets.
Please download the pre-trained weights into `./weights`.

| Pre-trained Datasets | Google Drive | Baidu Drive |
|------------|------------------|------------------|
| MVTec AD & ClinicDB | [Google Drive](https://drive.google.com/file/d/1xVXANHGuJBRx59rqPRir7iqbkYzq45W0/view?usp=drive_link) | [Baidu Drive](link) |
| VisA & ColonDB | [Google Drive](https://drive.google.com/file/d/1QGmPB0ByPZQ7FucvGODMSz7r5Ke5wx9W/view?usp=drive_link) | [Baidu Drive](link) |
| All Datasets Mentioned Above | [Google Drive](https://drive.google.com/file/d/1Cgkfx3GAaSYnXPLolx-P7pFqYV0IVzZF/view?usp=drive_link) | [Baidu Drive](link) |

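After downloading, `app.py` in this commit looks for the checkpoints under the names listed below; a minimal check, assuming the files were saved into `./weights`:

```python
# Checkpoint names referenced by app.py:
#   weights/pretrained_mvtec_colondb.pth
#   weights/pretrained_visa_clinicdb.pth
#   weights/pretrained_all.pth
from pathlib import Path

Path("weights").mkdir(exist_ok=True)
print(sorted(p.name for p in Path("weights").glob("*.pth")))  # should list the files above
```
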
### Train

By default, we use MVTec AD & ClinicDB for training and VisA for validation:

```shell
CUDA_VISIBLE_DEVICES=0 python train.py --save_fig True --training_data mvtec colondb --testing_data visa
```

Alternatively, for evaluation on MVTec AD & ClinicDB, we use VisA & ColonDB for training and MVTec AD for validation:

```shell
CUDA_VISIBLE_DEVICES=0 python train.py --save_fig True --training_data visa clinicdb --testing_data mvtec
```

Since we use half-precision (FP16) for training, the training process can occasionally be unstable.
It is recommended to run training several times and pick the model that performs best on the validation set as the final model.

To construct a robust ZSAD model for demonstration, we also train AdaCLIP on all of the AD datasets mentioned above:

```shell
CUDA_VISIBLE_DEVICES=0 python train.py --save_fig True \
  --training_data \
  br35h brain_mri btad clinicdb colondb \
  dagm dtd headct isic mpdd mvtec sdd tn3k visa \
  --testing_data mvtec
```

### Test

Manually select the best models from the validation set and place them in the `weights/` directory. Then run the following testing script:

```shell
sh test.sh
```

If you want to test on a single image, you can refer to `test_single_image.sh`:

```shell
CUDA_VISIBLE_DEVICES=0 python test.py --testing_model image --ckt_path weights/pretrained_all.pth --save_fig True \
  --image_path asset/img.png --class_name candle --save_name test.png
```

## Main Results

Due to differences in the versions utilized, the reported performance may vary slightly from the detection performance obtained with the provided pre-trained weights; some categories may score higher and others lower.

![Table_industrial](./asset/Table_industrial.png)
![Table_medical](./asset/Table_medical.png)
![Fig_detection_results](./asset/Fig_detection_results.png)

### :page_facing_up: Demo App

To run the demo application, use the following command:

```bash
python app.py
```

![Demo](./asset/Fig_app.png)

## 💘 Acknowledgements

Our work is largely inspired by the following projects. Thanks for their admirable contributions.

- [VAND-APRIL-GAN](https://github.com/ByChelsea/VAND-APRIL-GAN)
- [AnomalyCLIP](https://github.com/zqhang/AnomalyCLIP)
- [SAA](https://github.com/caoyunkang/Segment-Any-Anomaly)

## Stargazers over time

[![Stargazers over time](https://starchart.cc/caoyunkang/AdaCLIP.svg?variant=adaptive)](https://starchart.cc/caoyunkang/AdaCLIP)

## Citation

If you find this project helpful for your research, please consider citing the following BibTeX entry.

```BibTex
```

app.py

Lines changed: 133 additions & 0 deletions
```python
import gradio as gr
from PIL import Image, ImageDraw, ImageFont
import warnings
import os
os.environ['CUBLAS_WORKSPACE_CONFIG'] = ':4096:8'
import json
import torch
from scipy.ndimage import gaussian_filter
import cv2
from method import AdaCLIP_Trainer
import numpy as np

############ Init Model
ckt_path1 = 'weights/pretrained_mvtec_colondb.pth'
ckt_path2 = "weights/pretrained_visa_clinicdb.pth"
ckt_path3 = 'weights/pretrained_all.pth'

# Configurations
image_size = 518
device = 'cuda' if torch.cuda.is_available() else 'cpu'
# device = 'cpu'
model = "ViT-L-14-336"
prompting_depth = 4
prompting_length = 5
prompting_type = 'SD'
prompting_branch = 'VL'
use_hsf = True
k_clusters = 20

config_path = os.path.join('./model_configs', f'{model}.json')

# Prepare model
with open(config_path, 'r') as f:
    model_configs = json.load(f)

# Set up the feature hierarchy: take features after each quarter of the vision layers
n_layers = model_configs['vision_cfg']['layers']
substage = n_layers // 4
features_list = [substage, substage * 2, substage * 3, substage * 4]

# Note: `model` holds the backbone name above and is rebound to the trainer here
model = AdaCLIP_Trainer(
    backbone=model,
    feat_list=features_list,
    input_dim=model_configs['vision_cfg']['width'],
    output_dim=model_configs['embed_dim'],
    learning_rate=0.,
    device=device,
    image_size=image_size,
    prompting_depth=prompting_depth,
    prompting_length=prompting_length,
    prompting_branch=prompting_branch,
    prompting_type=prompting_type,
    use_hsf=use_hsf,
    k_clusters=k_clusters
).to(device)


def process_image(image, text, options):
    # Load the checkpoint that matches the selected pre-training option
    if 'MVTec AD+Colondb' in options:
        model.load(ckt_path1)
    elif 'VisA+Clinicdb' in options:
        model.load(ckt_path2)
    elif 'All' in options:
        model.load(ckt_path3)
    else:
        # Default to 'All' if no valid option is provided
        model.load(ckt_path3)
        print('Invalid option. Defaulting to All.')

    # Ensure image is in RGB mode
    image = image.convert('RGB')

    # Convert PIL image to NumPy array
    np_image = np.array(image)

    # Convert RGB to BGR for OpenCV and resize to the model's input size
    np_image = cv2.cvtColor(np_image, cv2.COLOR_RGB2BGR)
    np_image = cv2.resize(np_image, (image_size, image_size))

    # Preprocess the image and run the model
    img_input = model.preprocess(image).unsqueeze(0)
    img_input = img_input.to(model.device)

    with torch.no_grad():
        anomaly_map, anomaly_score = model.clip_model(img_input, [text], aggregation=True)

    # Process anomaly map: smooth it and scale to 0-255 for visualization
    anomaly_map = anomaly_map[0, :, :].cpu().numpy()
    anomaly_score = anomaly_score[0].cpu().numpy()
    anomaly_map = gaussian_filter(anomaly_map, sigma=4)
    anomaly_map = (anomaly_map * 255).astype(np.uint8)

    # Apply color map and blend with original image
    heat_map = cv2.applyColorMap(anomaly_map, cv2.COLORMAP_JET)
    vis_map = cv2.addWeighted(heat_map, 0.5, np_image, 0.5, 0)

    # Convert OpenCV image back to PIL image for Gradio
    vis_map_pil = Image.fromarray(cv2.cvtColor(vis_map, cv2.COLOR_BGR2RGB))

    return vis_map_pil, f'{anomaly_score:.3f}'

# Define examples
examples = [
    ["asset/img.png", "candle", "MVTec AD+Colondb"],
    ["asset/img2.png", "bottle", "VisA+Clinicdb"],
    ["asset/img3.png", "button", "All"],
]

# Gradio interface layout
demo = gr.Interface(
    fn=process_image,
    inputs=[
        gr.Image(type="pil", label="Upload Image"),
        gr.Textbox(label="Class Name"),
        gr.Radio(["MVTec AD+Colondb",
                  "VisA+Clinicdb",
                  "All"],
                 label="Pre-trained Datasets")
    ],
    outputs=[
        gr.Image(type="pil", label="Output Image"),
        gr.Textbox(label="Anomaly Score"),
    ],
    examples=examples,
    title="AdaCLIP -- Zero-shot Anomaly Detection",
    description="Upload an image, enter the class name, and select pre-trained datasets to do zero-shot anomaly detection"
)

# Launch the demo
demo.launch()
# demo.launch(server_name="0.0.0.0", server_port=10002)
```
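The commented-out line above shows how to bind the demo to a specific host and port. Another option, not used in the original script and included here only as a sketch, is Gradio's standard `share` flag, which creates a temporary public link:

```python
# Alternative launch: generate a temporary public Gradio URL (needs outbound internet access).
demo.launch(share=True)
```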

asset/Fig_app.png (262 KB)
asset/Fig_detection_results.png (355 KB)
asset/Table_industrial.png (392 KB)
asset/Table_medical.png (284 KB)
asset/framework.png (430 KB)
asset/img.png (1.36 MB)
asset/img2.png (535 KB)
asset/img3.png (610 KB)

config.py

Lines changed: 1 addition & 0 deletions
```python
DATA_ROOT = '../datasets'
```

data_preprocess/br35h.py

Lines changed: 50 additions & 0 deletions
```python
import os
import json
import random
from config import DATA_ROOT

Br35h_ROOT = os.path.join(DATA_ROOT, 'Br35h_anomaly_detection')


class Br35hSolver(object):
    CLSNAMES = [
        'br35h',
    ]

    def __init__(self, root=Br35h_ROOT, train_ratio=0.5):
        self.root = root
        self.meta_path = f'{root}/meta.json'
        self.train_ratio = train_ratio

    def run(self):
        self.generate_meta_info()

    def generate_meta_info(self):
        info = dict(train={}, test={})
        for cls_name in self.CLSNAMES:
            cls_dir = f'{self.root}/{cls_name}'
            for phase in ['train', 'test']:
                cls_info = []
                # Each sub-folder ("specie") holds one image group; 'good' marks normal images
                species = os.listdir(f'{cls_dir}/{phase}')
                for specie in species:
                    is_abnormal = specie not in ['good']
                    img_names = os.listdir(f'{cls_dir}/{phase}/{specie}')
                    img_names.sort()

                    for idx, img_name in enumerate(img_names):
                        info_img = dict(
                            img_path=f'{cls_name}/{phase}/{specie}/{img_name}',
                            mask_path='',  # Br35H is detection-only, so no pixel-level masks
                            cls_name=cls_name,
                            specie_name=specie,
                            anomaly=1 if is_abnormal else 0,
                        )
                        cls_info.append(info_img)

                info[phase][cls_name] = cls_info

        with open(self.meta_path, 'w') as f:
            f.write(json.dumps(info, indent=4) + "\n")


if __name__ == '__main__':
    runner = Br35hSolver(root=Br35h_ROOT)
    runner.run()
```
