Allow "tiff" and more extensions in DetectionDataset.from_yolo function #1636

Merged
merged 7 commits into from
Jan 8, 2025
17 changes: 13 additions & 4 deletions supervision/dataset/formats/yolo.py
@@ -2,8 +2,8 @@
from pathlib import Path
from typing import TYPE_CHECKING, Dict, List, Optional, Tuple

-import cv2
import numpy as np
+from PIL import Image

from supervision.config import ORIENTED_BOX_COORDINATES
from supervision.dataset.utils import approximate_mask_with_polygons
@@ -153,7 +153,7 @@ def load_yolo_annotations(
image_paths = [
str(path)
for path in list_files_with_extensions(
-directory=images_directory_path, extensions=["jpg", "jpeg", "png"]
+directory=images_directory_path, extensions=["*"]
Collaborator

There is a small but important side-effect of this change. If there are other files than images in the directory, we will also try to load them. For example, macOS puts a .DS_Store file in the directory. @patel-zeel I suggest putting here a list of image extensions that you have tested.

Contributor Author

I agree. On it!

)
]
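
The side effect raised in the thread above (a wildcard also matches non-image files such as `.DS_Store`) can be avoided with an explicit allow-list of extensions. The following is a minimal sketch using only `pathlib`. The helper name `list_image_files` and the contents of `IMAGE_EXTENSIONS` are assumptions made for illustration; they are not the list adopted in this PR and not part of the supervision API.

```python
from pathlib import Path
from typing import Iterable, List

# Assumption: an illustrative allow-list, not the set of extensions tested in the PR.
IMAGE_EXTENSIONS = ("jpg", "jpeg", "png", "tif", "tiff")


def list_image_files(
    directory: str, extensions: Iterable[str] = IMAGE_EXTENSIONS
) -> List[str]:
    """Return sorted paths of regular files whose suffix is in the allow-list."""
    # Normalize to lowercase ".ext" form so that "TIFF" and ".tiff" both match.
    allowed = {"." + ext.lower().lstrip(".") for ext in extensions}
    return sorted(
        str(path)
        for path in Path(directory).iterdir()
        # Hidden files such as ".DS_Store" have an empty suffix and are skipped.
        if path.is_file() and path.suffix.lower() in allowed
    )
```

Matching on the lowercased suffix keeps files such as `scan.TIFF` while leaving out anything that is not on the list, so stray files never reach the image loader.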

@@ -167,10 +167,19 @@ def load_yolo_annotations(
annotations[image_path] = Detections.empty()
continue

-image = cv2.imread(image_path)
+# PIL is much faster than cv2 for checking image shape and mode: https://github.com/roboflow/supervision/issues/1554
+image = Image.open(image_path)
lines = read_txt_file(file_path=annotation_path, skip_empty=True)
-h, w, _ = image.shape
+w, h = image.size
resolution_wh = (w, h)
+if image.mode != "RGB":
Collaborator

@patel-zeel looks like we can simplify the code here. There is no need for nested ifs.

if image.mode not in ("RGB", "L"):
    raise ValueError(
        f"Images must be 'RGB' or 'grayscale', but {image_path} mode is '{image.mode}'."
    )

Contributor Author

I agree.

+if image.mode == "L":
+image = image.convert("RGB")
Collaborator

It seems to me that conversion to RGB is not necessary. The image is not used in the further part of the function.

Contributor Author

Right, that'd save us the time wasted in the conversion. I checked the other extreme as well. If we convert all images to RGB by default, that adds a bit of unnecessary overhead. So, this change will improve the speed further. Thank you for pointing it out, @SkalskiP.

+else:
+raise ValueError(
+f"Images must be 'RGB' or 'grayscale', \
+but {image_path} mode is '{image.mode}'."
+)

with_masks = _with_mask(lines=lines)
with_masks = force_masks if force_masks else with_masks
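
Taken together, the two review suggestions in this hunk (a single flat mode check, and no conversion to RGB) could look roughly like the sketch below. It assumes Pillow is installed. The function name `read_resolution_wh` is hypothetical and is not the code that was merged; it only illustrates reading the size and mode without converting the image.

```python
from typing import Tuple

from PIL import Image


def read_resolution_wh(image_path: str) -> Tuple[int, int]:
    """Return (width, height), rejecting modes other than RGB and grayscale."""
    # Image.open is lazy: it identifies the file and reads its header, and
    # defers decoding the pixel data until the image is actually processed.
    with Image.open(image_path) as image:
        if image.mode not in ("RGB", "L"):
            raise ValueError(
                f"Images must be 'RGB' or 'grayscale', "
                f"but {image_path} mode is '{image.mode}'."
            )
        # PIL reports size as (width, height); no conversion is needed here.
        return image.size
```

The message is built from two adjacent f-string literals rather than a backslash continuation inside one string, because a continuation inside the literal would carry the next line's indentation into the error text.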