Using supervision with tflite #1644

acode-x · 2024-11-02T00:24:02Z

acode-x
Nov 2, 2024

Hi,
I have a yolo11.tflite model. How can I use supervision with tflite?

I found an existing implementation (but for images): https://github.com/ultralytics/ultralytics/blob/main/examples/YOLOv8-OpenCV-int8-tflite-Python/main.py (the latest commit there has an optimisation fix)

Is it possible to reuse the same/similar implementation for video processing?
I couldn't use ultralytics directly because I have to run inference on an edge device and it was too heavy.

Thanks!

LinasKo · 2024-11-02T07:26:19Z

LinasKo
Nov 2, 2024
Maintainer

Hi @acode-x 👋

Without going deeper into the file you sent - yes, it should be possible.

When you use supervision, you always start by converting the model output to a general format. This means calling a method such as from_ultralytics or from_inference to get sv.Detections. So, if you can create from_tflite, you can use the result in supervision.

What you need to do is make a function that constructs sv.Detections. If you look inside supervision/detections/core.py, you'll see that its central structure is simple - it's a class that holds a few arrays, each of size N, or None. There's also a data dict, where each value is a list/array also of size N. It's basically a container for all detections in an image.

Therefore, if you can print the results of tflite, find the right values, and call the constructor of sv.Detections, you can use supervision. You'll need:

xyxy: bounding box (or box around the segmentation mask), array (N, 4), np.float32; the only arg that cannot be None.
mask: array (N, H, W), bool
confidence: array (N,), np.float32
class_id: array (N,), int
tracker_id: ignore this one, it's set automatically
data: contains arrays (N, any). Some keys are special, such as "class_name": array (N,), str, as that's what LabelAnnotator looks for when drawing class labels. Special keys are defined in config.py

Also, you may return sv.Detections.empty() when nothing is detected.
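
The field list above can be sketched as a small helper that coerces raw model outputs into arrays with those shapes and dtypes. This is a minimal sketch, not part of supervision: `to_detections_kwargs` is a hypothetical name, and it assumes the boxes are already in absolute `(x_min, y_min, x_max, y_max)` pixel coordinates.

```python
import numpy as np


def to_detections_kwargs(boxes, scores, class_ids):
    """Hypothetical helper: coerce raw outputs to the shapes sv.Detections expects."""
    # xyxy must be float32 with shape (N, 4); reshape also handles empty input.
    xyxy = np.asarray(boxes, dtype=np.float32).reshape(-1, 4)
    # confidence and class_id are flat arrays of length N.
    confidence = np.asarray(scores, dtype=np.float32).reshape(-1)
    class_id = np.asarray(class_ids, dtype=int).reshape(-1)
    if not (len(xyxy) == len(confidence) == len(class_id)):
        raise ValueError("boxes, scores and class_ids must have the same length")
    return {"xyxy": xyxy, "confidence": confidence, "class_id": class_id}


kwargs = to_detections_kwargs(
    boxes=[[10, 20, 110, 220], [30, 40, 60, 90]],
    scores=[0.9, 0.4],
    class_ids=[0, 2],
)
print(kwargs["xyxy"].shape, kwargs["confidence"].shape, kwargs["class_id"].shape)
```

With supervision installed, the result can be passed on as `sv.Detections(**kwargs)`. When nothing is detected, `sv.Detections.empty()` is the simpler choice.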

Does that answer your question?

1 reply

acode-x Nov 2, 2024
Author

Yes @LinasKo

I am also trying to use the reference code that @onuralpszr shared previously in another thread: https://colab.research.google.com/drive/1eDM_MFuMgvb3znAXq31GtwQHYHPcNaL6?usp=sharing

Will keep you updated. I should be able to test this in a few hours.
Thanks

acode-x · 2024-11-02T12:11:31Z

acode-x
Nov 2, 2024
Author

The YOLOv8-TFLite-Python/main.py was recently updated here:
https://github.com/ultralytics/ultralytics/blob/main/examples/YOLOv8-TFLite-Python/main.py

I only modified the postprocess and detect functions from the file above:

def postprocess(self, img: np.ndarray, outputs: np.ndarray, pad: Tuple[float, float]) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
    outputs[:, 0] -= pad[1]
    outputs[:, 1] -= pad[0]
    outputs[:, :4] *= max(img.shape)

    outputs = outputs.transpose(0, 2, 1)
    outputs[..., 0] -= outputs[..., 2] / 2
    outputs[..., 1] -= outputs[..., 3] / 2

    all_boxes = []
    all_scores = []
    all_class_ids = []

    for out in outputs:
        scores = out[:, 4:].max(-1)
        keep = scores > self.conf
        if np.any(keep):
            boxes = out[keep, :4]
            scores = scores[keep]
            class_ids = out[keep, 4:].argmax(-1)

            all_boxes.append(boxes)
            all_scores.append(scores)
            all_class_ids.append(class_ids)

        # Convert lists to 2D numpy arrays for consistency
        if all_boxes:
            return (
                np.concatenate(all_boxes, axis=0),
                np.concatenate(all_scores, axis=0),
                np.concatenate(all_class_ids, axis=0),
            )
        else:
            return np.zeros((0, 4)), np.zeros((0,)), np.zeros((0,))

def detect(self, img: np.ndarray) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
    x, pad = self.preprocess(img)
    if self.int8:
        x = (x / self.in_scale + self.in_zero_point).astype(np.int8)
    self.model.set_tensor(self.in_index, x)

    self.model.invoke()

    y = self.model.get_tensor(self.out_index)

    if self.int8:
        y = (y.astype(np.float32) - self.out_zero_point) * self.out_scale

    return self.postprocess(img, y, pad)

I also removed the NMS and draw_detections calls that were used internally, and apply the same steps with the supervision APIs instead.

I then pass the return value of detect to the supervision APIs:

def callback(frame: np_ndarray, _: int) -> np_ndarray:
    detection_output = detector.detect(frame) # YOLOv8TFLite class instance
    if detection_output is None or len(detection_output[0]) == 0:
        return frame

    boxes, scores, class_ids = detection_output
    detections = Detections(xyxy=boxes, confidence=scores, class_id=class_ids)
    detections = detections.with_nms(threshold=0.5)
    detections = tracker.update_with_detections(detections)

Somehow it fails to detect any objects. Is there anything obvious I could be missing?

0 replies

acode-x · 2024-11-03T12:28:41Z

acode-x
Nov 3, 2024
Author

I was able to fix the code:

    def postprocess(
        self, img: np_ndarray, outputs: np_ndarray, pad: Tuple[float, float]
    ) -> Tuple[np_ndarray, np_ndarray, np_ndarray]:
        outputs[:, 0] -= pad[1]
        outputs[:, 1] -= pad[0]
        outputs[:, :4] *= max(img.shape)

        outputs = outputs.transpose(0, 2, 1)
        outputs[..., 0] -= outputs[..., 2] / 2
        outputs[..., 1] -= outputs[..., 3] / 2

        out = outputs[0]  # batch size 1  (sufficient for my use case)
        scores = out[:, 4:].max(-1)
        keep = scores > self.conf
        boxes = out[keep, :4]
        scores = scores[keep]
        class_ids = out[keep, 4:].argmax(-1)

        if len(scores) == 0:
            return [], [], []

        boxes = xywh2xyxy(boxes)
        indices = NMSBoxes(boxes, scores, self.conf, self.iou).flatten()
        return boxes[indices], scores[indices], class_ids[indices]

    def detect(
        self, img: np_ndarray
    ) -> Tuple[np_ndarray, np_ndarray, np_ndarray]:
        x, pad = self.preprocess(img)
        if self.int8:
            x = (x / self.in_scale + self.in_zero_point).astype(np_int8)
        self.model.set_tensor(self.in_index, x)

        self.model.invoke()

        y = self.model.get_tensor(self.out_index)

        if self.int8:
            y = (y.astype(np_float32) - self.out_zero_point) * self.out_scale

        return self.postprocess(img, y, pad)

Not sure if it is super optimised. But it works.

0 replies

acode-x · 2024-11-03T12:42:17Z

acode-x
Nov 3, 2024
Author

However, the predicted boxes are slightly shifted. Could it be because of the quantization, or the code I've written? Any thoughts?

2 replies

LinasKo Nov 4, 2024
Maintainer

Looks like the model gave the (x, y) of the box center, but you interpreted it as the top-left corner. Shift it by half the width and half the height and it'll be fine.

acode-x Nov 4, 2024
Author

Thanks! It is fixed now 👍
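
The fix described above can be sketched as follows, assuming each row of the model output is `(center_x, center_y, width, height)` in pixels. `cxcywh_to_xyxy` is a hypothetical helper name used for illustration; it is not a supervision or ultralytics function.

```python
import numpy as np


def cxcywh_to_xyxy(boxes):
    """Convert (center_x, center_y, width, height) rows to (x1, y1, x2, y2)."""
    boxes = np.asarray(boxes, dtype=np.float32).reshape(-1, 4)
    half_wh = boxes[:, 2:] / 2
    # Top-left corner is the center minus half the size.
    top_left = boxes[:, :2] - half_wh
    # Bottom-right corner is the center plus half the size.
    bottom_right = boxes[:, :2] + half_wh
    return np.concatenate([top_left, bottom_right], axis=1)


# A 40x20 box centred at (100, 50) has corners (80, 40) and (120, 60).
print(cxcywh_to_xyxy([[100, 50, 40, 20]]))
```

Whichever box convention a pipeline uses, the half-size shift should be applied exactly once; applying it in two places moves every box up and to the left.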
