Too low (frame) FPS compared to documentation #583

clausMeko · 2024-08-16T08:44:38Z

Search before asking

I have searched the Inference issues and found no similar bug report.

Bug

Set Up

I use a Basler Camera acA1920-40uc.

It provides ~50 fps as cv2.Image via opencv. I use your sdk to post those images.

On the same device, a Jetson Orin Nano (so no network latency), Docker runs the inference-server-jets-5.1.1 image.

For testing I ran the same setup on my notebook (Dell Precision 5570, i7-12700H) with the CPU image.

Problem

Inference takes longer than expected: ~200 ms per frame (self-measured, ~5 fps). This is disappointing for 2 reasons:

The unspecialized CPU in my notebook is just as fast (~5 fps).
Your blog post made me expect at least 10-15 fps, assuming the Orin Nano performs roughly like the Jetson Xavier NX; in any case it should be better than its predecessor.

Question

Is there anything I am not considering so I can improve my performance?

Environment

Inference: roboflow/roboflow-inference-server-jetson-5.1.1:latest
OS: JP 5.1.1
Device: NVIDIA Jetson Orin Nano
Python: 3.9
Model Type: Roboflow 3.0 Instance Segmentation (Accurate)

Minimal Reproducible Example

Sorry - I merged 2 files if something seems odd.

import supervision as sv
import cv2
import json
from inference_sdk import InferenceHTTPClient


# Infer via the Roboflow Infer API and return the result
def infer(img: cv2.typing.MatLike) -> cv2.typing.MatLike:
    with open('roboflow_config.json') as f:
        config = json.load(f)

        ROBOFLOW_API_KEY = config["ROBOFLOW_API_KEY"]
        ROBOFLOW_MODEL = config["ROBOFLOW_MODEL"]
        ROBOFLOW_SIZE = config["ROBOFLOW_SIZE"]

        FRAMERATE = config["FRAMERATE"]
        BUFFER = config["BUFFER"]


    # local inference
    client = InferenceHTTPClient(
        api_url="http://localhost:9001",
        api_key=ROBOFLOW_API_KEY,
    )
    results = client.infer(img, model_id=ROBOFLOW_MODEL)

    # remote inference (slower)
    # from inference import get_model
    # model = get_model(ROBOFLOW_MODEL, ROBOFLOW_API_KEY)
    # results = model.infer(img)[0]

    detections = sv.Detections.from_inference(results)
    if len(detections) == 0:
        return img

    box_annotator = sv.BoxAnnotator()
    label_annotator = sv.LabelAnnotator()
    labels = [
        f"{class_name} {confidence:.2f}"
        for class_name, confidence
        in zip(detections['class_name'], detections.confidence)
    ]

    annotated_image = box_annotator.annotate(
        scene=img, detections=detections)
    annotated_image = label_annotator.annotate(
        scene=annotated_image, detections=detections, labels=labels)

    return annotated_image

'''
A simple program for grabbing video from a Basler camera and converting it to an OpenCV image.
Tested on Basler acA1300-200uc (USB3, linux 64bit , python 3.5)

'''
from pypylon import pylon
import time

# connecting to the first available camera
camera = pylon.InstantCamera(pylon.TlFactory.GetInstance().CreateFirstDevice())

# Grabbing continuously (video) with minimal delay
camera.StartGrabbing(pylon.GrabStrategy_LatestImageOnly)
converter = pylon.ImageFormatConverter()

# converting to opencv bgr format
converter.OutputPixelFormat = pylon.PixelType_BGR8packed
converter.OutputBitAlignment = pylon.OutputBitAlignment_MsbAligned

while camera.IsGrabbing():
    start_time = time.time()  # start time of the loop

    ########################
    # your fancy code here #
    ########################
    grabResult = camera.RetrieveResult(5000, pylon.TimeoutHandling_ThrowException)

    if grabResult.GrabSucceeded():
        # Access the image data
        image = converter.Convert(grabResult)
        img = image.GetArray()

        #  annotation integration
        img = infer(img)

        cv2.namedWindow('title', cv2.WINDOW_NORMAL)
        cv2.imshow('title', img)

        print("FPS: ", round(1.0 / (time.time() - start_time),2))  # FPS = 1 / time to process loop
        if cv2.waitKey(1) == ord('q'):
            break

    grabResult.Release()

# Releasing the resource
camera.StopGrabbing()

cv2.destroyAllWindows()
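
The FPS printed by the loop above covers everything between `start_time` and the `print`: waiting for the camera frame, pixel conversion, the `infer` call (which also re-reads `roboflow_config.json` and builds a new `InferenceHTTPClient` on every frame), annotation, and `cv2.imshow`. A minimal sketch for splitting that total into stages, using only the standard library, could look like this. `StageTimer` and the stage names are hypothetical helpers, not part of inference_sdk or pypylon, and the `time.sleep` calls are stand-ins for the real work.

```python
import time
from contextlib import contextmanager


class StageTimer:
    """Accumulates wall-clock time per named stage (hypothetical helper)."""

    def __init__(self):
        self.totals = {}  # stage name -> total seconds
        self.counts = {}  # stage name -> number of samples

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed = time.perf_counter() - start
            self.totals[name] = self.totals.get(name, 0.0) + elapsed
            self.counts[name] = self.counts.get(name, 0) + 1

    def mean_ms(self, name):
        count = self.counts.get(name, 0)
        if count == 0:
            raise KeyError(f"no samples recorded for stage {name!r}")
        return 1000.0 * self.totals[name] / count


timer = StageTimer()
for _ in range(3):
    with timer.stage("grab"):
        time.sleep(0.002)  # stand-in for camera.RetrieveResult + converter.Convert
    with timer.stage("infer"):
        time.sleep(0.010)  # stand-in for client.infer(img, model_id=...)
    with timer.stage("display"):
        time.sleep(0.001)  # stand-in for annotation + cv2.imshow

for name in ("grab", "infer", "display"):
    print(f"{name}: {timer.mean_ms(name):.1f} ms")
```

Wrapping the real calls the same way (for example `with timer.stage("infer"): img = infer(img)`) would show whether the ~200 ms is spent in the HTTP request itself or in the code around it.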

Additional

No response

Are you willing to submit a PR?

Yes I'd like to help by submitting a PR!

PawelPeczek-Roboflow · 2024-08-16T12:32:14Z

Hi there
Could you please tell us which model you use?

clausMeko · 2024-08-16T12:33:24Z

Model Type: Roboflow 3.0 Instance Segmentation (Accurate)

Is that what you mean?

PawelPeczek-Roboflow · 2024-08-16T12:34:56Z

no, I mean what is the name and version of the model you use

clausMeko · 2024-08-16T12:36:21Z

You mean from my login? It's not public: emerald-fbdh0/4

PawelPeczek-Roboflow · 2024-08-16T12:38:34Z

OK, I will take a look at the metadata and try to reproduce the problem on a similar model to profile the server.
This will take some time; realistically it can be done next week.

clausMeko · 2024-08-16T12:40:10Z

Ty,
In the meantime, could you name a public model with a known expected performance, so I can check whether my setup is generally as fast as expected?

PawelPeczek-Roboflow · 2024-08-16T12:41:01Z

ok, that would be even better

PawelPeczek-Roboflow · 2024-08-16T12:41:29Z

will check and send a link

PawelPeczek-Roboflow · 2024-08-16T12:46:58Z

that should be the model with similar characteristics: "yolov8s-seg-640"

I just checked the numbers from our benchmarks, and the last time we measured it was faster than what you report. I will redo the test once you confirm the 5 FPS on the public model, and then we will see.

clausMeko · 2024-08-16T13:03:08Z

@PawelPeczek-Roboflow

It is still 5 FPS, i.e. 200 ms/image, on the public model. I added some info.

So somehow I am stuck with 5FPS on an Orin Nano. I am glad for any ideas.

Looking forward to next week.

Cheers,
Claus

Using "yolov8s-seg-640"

➜  ~ docker run --net=host --runtime=nvidia --env INSTANCES=2 -d roboflow/roboflow-inference-server-jetson-5.1.1
2c64571536487f15d998db82ed931cc3daed943db4c8958e3e09cc9e4503f101

➜  ~ docker logs -f upbeat_hoover
UserWarning: Unable to import Axes3D. This may be due to multiple versions of Matplotlib being installed (e.g. as a system package and as a pip package). As a result, the 3D projection is not available.
SupervisionWarnings: BoundingBoxAnnotator is deprecated: `BoundingBoxAnnotator` is deprecated and has been renamed to `BoxAnnotator`. `BoundingBoxAnnotator` will be removed in supervision-0.26.0.
UserWarning: Field name "schema" in "WorkflowsBlocksSchemaDescription" shadows an attribute in parent "BaseModel"
INFO:     Started server process [19]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:9001 (Press CTRL+C to quit)
INFO:     192.168.2.12:45126 - "GET /model/registry HTTP/1.1" 200 OK
UserWarning: Specified provider 'OpenVINOExecutionProvider' is not in available provider names.Available providers: 'TensorrtExecutionProvider, CUDAExecutionProvider, CPUExecutionProvider'
INFO:     192.168.2.12:45132 - "POST /model/add HTTP/1.1" 200 OK
INFO:     192.168.2.12:43590 - "POST /infer/instance_segmentation HTTP/1.1" 200 OK
INFO:     192.168.2.12:43598 - "GET /model/registry HTTP/1.1" 200 OK
INFO:     192.168.2.12:43614 - "POST /infer/instance_segmentation HTTP/1.1" 200 OK
# goes on forever....

Using "Ran0mMod3lID"

I also checked that an arbitrary model ID is not silently accepted, so the server really does use the model you provided:

# the docker container rejects a random model ID as invalid
inference.core.exceptions.InvalidModelIDError: Model ID: `Ran0mMod3lID` is invalid.
INFO:     192.168.2.12:47106 - "POST /model/add HTTP/1.1" 400 Bad Request

PawelPeczek-Roboflow · 2024-08-16T13:04:33Z

ok, will verify on my end and get back to you

yeldarby · 2024-08-16T18:31:15Z

The benchmarks you're citing are for a nano-sized object detection model vs a small-sized instance segmentation model. Should be yolov8n-640.

clausMeko · 2024-08-19T06:12:03Z

@yeldarby your proposed Model yolov8n-640 runs at ~6fps on my orin nano.

@PawelPeczek-Roboflow I checked what a reduced resolution changes.

    client = InferenceHTTPClient(
        api_url="http://localhost:9001",
        api_key=ROBOFLOW_API_KEY,
    )
    # 100 times fewer pixels (each side scaled by 0.1)
    img = cv2.resize(img, (0, 0), fx=0.1, fy=0.1)
    results = client.infer(img, model_id=ROBOFLOW_MODEL)

It doubled the frame rate to ~11 fps.

I don't know how sensible that is - just FYI.
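
A less drastic variant of the experiment above, sketched here as an assumption rather than a recommendation from the maintainers, is to cap the longer image side at the size the model name suggests (640 for the `-640` models in this thread) instead of scaling every side by 0.1. `target_size` and `downscale` are hypothetical helper names. The 1920x1200 example frame is an assumed sensor resolution for illustration.

```python
def target_size(width, height, max_side=640):
    """Return (width, height) scaled so the longer side is at most max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # never upscale
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def downscale(img, max_side=640):
    """Resize a BGR image (NumPy array) so its longer side is at most max_side."""
    import cv2  # imported here so target_size can be used without OpenCV installed

    height, width = img.shape[:2]
    new_w, new_h = target_size(width, height, max_side)
    if (new_w, new_h) == (width, height):
        return img
    # INTER_AREA is the interpolation OpenCV suggests for shrinking images
    return cv2.resize(img, (new_w, new_h), interpolation=cv2.INTER_AREA)


print(target_size(1920, 1200))  # (640, 400)
print(target_size(320, 240))    # (320, 240)
```

Predictions returned for a downscaled frame are in the downscaled coordinate system, so boxes and masks would need to be scaled back by the inverse factor before annotating the original frame.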

PawelPeczek-Roboflow · 2024-08-19T07:16:04Z

just checking at my jetson now - my first guess was that camera may be providing high res frames, but let's see what my test shows

PawelPeczek-Roboflow · 2024-08-19T07:37:20Z

OK, it seems @clausMeko's results are right. These are the benchmarks for segmentation models:

yolov8n-seg-640

python -m inference_cli.main benchmark api-speed --model_id yolov8n-seg-640 --legacy-endpoints
Loading images...: 100%|█████████████████████████████████████████████████████████| 8/8 [00:01<00:00,  4.81it/s]
Warming up API...: 100%|███████████████████████████████████████████████████████| 10/10 [00:04<00:00,  2.12it/s]
Detected images dimensions: {(612, 612), (440, 640), (427, 640), (500, 375), (334, 500), (480, 640), (375, 500)}
avg: 123.8ms	| rps: 8.0	| p75: 127.4ms	| p90: 158.7	| %err: 0.0	|
avg: 124.9ms	| rps: 7.9	| p75: 124.0ms	| p90: 156.8	| %err: 0.0	|
avg: 125.0ms	| rps: 8.0	| p75: 125.9ms	| p90: 156.7	| %err: 0.0	|
avg: 121.7ms	| rps: 8.3	| p75: 128.5ms	| p90: 157.8	| %err: 0.0	|
avg: 122.4ms	| rps: 8.2	| p75: 133.0ms	| p90: 164.8	| %err: 0.0	|
avg: 121.6ms	| rps: 8.2	| p75: 133.3ms	| p90: 162.2	| %err: 0.0	|
avg: 122.4ms	| rps: 8.2	| p75: 134.8ms	| p90: 159.4	| %err: 0.0	|
avg: 120.5ms	| rps: 8.4	| p75: 132.4ms	| p90: 156.4	| %err: 0.0	|

yolov8s-seg-640

python -m inference_cli.main benchmark api-speed --model_id yolov8s-seg-640 --legacy-endpoints
Loading images...: 100%|█████████████████████████████████████████████████████████| 8/8 [00:01<00:00,  4.37it/s]
Warming up API...: 100%|███████████████████████████████████████████████████████| 10/10 [00:06<00:00,  1.67it/s]
Detected images dimensions: {(612, 612), (440, 640), (427, 640), (500, 375), (334, 500), (480, 640), (375, 500)}
avg: 155.5ms	| rps: 6.4	| p75: 171.1ms	| p90: 210.7	| %err: 0.0	|
avg: 153.7ms	| rps: 6.4	| p75: 174.5ms	| p90: 197.3	| %err: 0.0	|
avg: 152.1ms	| rps: 6.5	| p75: 173.5ms	| p90: 194.8	| %err: 0.0	|
avg: 149.9ms	| rps: 6.7	| p75: 175.4ms	| p90: 192.1	| %err: 0.0	|
avg: 146.5ms	| rps: 6.9	| p75: 167.9ms	| p90: 184.5	| %err: 0.0	|
avg: 149.2ms	| rps: 6.7	| p75: 174.8ms	| p90: 186.9	| %err: 0.0	|
avg: 153.1ms	| rps: 6.5	| p75: 172.0ms	| p90: 186.7	| %err: 0.0	|

The docs are probably referring to object detection models, which look like this:

yolov8n-640

python -m inference_cli.main benchmark api-speed --model_id yolov8n-640 --legacy-endpoints
Loading images...: 100%|█████████████████████████████████████████████████████████| 8/8 [00:01<00:00,  5.07it/s]
Warming up API...: 100%|███████████████████████████████████████████████████████| 10/10 [00:00<00:00, 13.53it/s]
Detected images dimensions: {(612, 612), (440, 640), (427, 640), (500, 375), (334, 500), (480, 640), (375, 500)}
avg: 67.7ms	| rps: 14.6	| p75: 69.5ms	| p90: 76.2	| %err: 0.0	|
avg: 66.0ms	| rps: 15.3	| p75: 67.3ms	| p90: 74.1	| %err: 0.0	|
avg: 66.7ms	| rps: 15.0	| p75: 67.9ms	| p90: 75.6	| %err: 0.0	|
avg: 66.8ms	| rps: 15.0	| p75: 69.0ms	| p90: 76.4	| %err: 0.0	|
avg: 65.9ms	| rps: 15.2	| p75: 67.9ms	| p90: 75.5	| %err: 0.0	|
avg: 66.4ms	| rps: 15.2	| p75: 68.5ms	| p90: 75.7	| %err: 0.0	|
avg: 67.2ms	| rps: 14.9	| p75: 69.9ms	| p90: 75.9	| %err: 0.0	|
avg: 66.4ms	| rps: 15.1	| p75: 68.0ms	| p90: 75.1	| %err: 0.0	|
avg: 66.6ms	| rps: 15.0	| p75: 67.8ms	| p90: 75.8	| %err: 0.0	|
avg: 67.6ms	| rps: 14.8	| p75: 71.3ms	| p90: 76.3	| %err: 0.0	|

yolov8s-640

python -m inference_cli.main benchmark api-speed --model_id yolov8s-640 --legacy-endpoints
Loading images...: 100%|█████████████████████████████████████████████████████████| 8/8 [00:01<00:00,  4.98it/s]
Warming up API...: 100%|███████████████████████████████████████████████████████| 10/10 [00:05<00:00,  1.82it/s]
Detected images dimensions: {(612, 612), (440, 640), (427, 640), (500, 375), (334, 500), (480, 640), (375, 500)}
avg: 85.6ms	| rps: 11.6	| p75: 86.1ms	| p90: 94.5	| %err: 0.0	|
avg: 86.1ms	| rps: 11.7	| p75: 88.6ms	| p90: 96.0	| %err: 0.0	|
avg: 86.5ms	| rps: 11.6	| p75: 89.1ms	| p90: 96.0	| %err: 0.0	|
avg: 87.1ms	| rps: 11.5	| p75: 90.0ms	| p90: 96.0	| %err: 0.0	|
avg: 86.9ms	| rps: 11.6	| p75: 89.6ms	| p90: 96.1	| %err: 0.0	|
avg: 86.2ms	| rps: 11.7	| p75: 87.0ms	| p90: 95.9	| %err: 0.0	|
avg: 86.7ms	| rps: 11.5	| p75: 90.2ms	| p90: 96.1	| %err: 0.0	|
avg: 87.5ms	| rps: 11.4	| p75: 91.8ms	| p90: 96.3	| %err: 0.0	|
avg: 89.7ms	| rps: 11.1	| p75: 95.5ms	| p90: 102.3	| %err: 0.0	|

clausMeko · 2024-08-19T08:11:45Z

@PawelPeczek-Roboflow so you would recommend choosing object detection over segmentation models if it is about performance?

PawelPeczek-Roboflow · 2024-08-19T11:36:44Z

That really depends on your use case: some tasks can be handled by both types of model, some cannot.
Additionally, for online video processing, if you skip the frames your model cannot keep up with, some applications remain feasible at 5, 10 or 15 fps even though not every frame of the stream is processed. It usually depends on the camera's point of view relative to the observed area (how close the camera is to the objects of interest) and on the speed of the observed objects. Happy to help if I know more details.
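
To make the frame-skipping point concrete, here is a small arithmetic sketch. It uses the ~50 fps camera rate and the ~200 ms latency reported earlier in this thread, plus the 66.0 ms average from the yolov8n-640 benchmark above. The function names are made up for illustration.

```python
import math


def processed_fps(latency_ms):
    """Frames per second achievable when requests are sent one at a time."""
    return 1000.0 / latency_ms


def frame_stride(camera_fps, latency_ms):
    """Smallest N such that processing every N-th camera frame keeps up."""
    frames_per_inference = camera_fps * latency_ms / 1000.0
    return max(1, math.ceil(frames_per_inference))


print(processed_fps(200))       # 5.0
print(frame_stride(50, 200))    # 10
print(frame_stride(50, 66.0))   # 4
```

At 200 ms per request, roughly every 10th frame of a 50 fps stream gets processed; at 66 ms it is roughly every 4th. Whether that is enough depends on how far objects move between processed frames.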

clausMeko · 2024-08-20T09:27:19Z

@PawelPeczek-Roboflow I would like to use concurrency for inference. For example, if a request takes ~100 ms, I could keep 3 requests in flight and start a new one every ~33 ms.

Do you have a Python code snippet to do that? I saw your env var INSTANCES=3, which probably only allows parallel access but gives no speed-up.

PawelPeczek-Roboflow · 2024-08-22T16:15:58Z

Sorry for the late response.

I believe we do not have a script to distribute requests. I also cannot find this INSTANCES env var in the codebase right now.
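
For readers looking for a starting point, the following is a minimal client-side sketch of keeping several requests in flight with the standard library. It is not an official Roboflow script. `pipelined_infer` and `fake_infer` are hypothetical names, and `fake_infer` is a stand-in for a call such as `client.infer(frame, model_id=ROBOFLOW_MODEL)`. Whether this raises throughput depends on whether the server can actually work on several requests at once, which this thread does not confirm; on a single GPU it may only hide HTTP and encoding overhead.

```python
import time
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def pipelined_infer(frames, infer_fn, max_in_flight=3):
    """Yield results in frame order while keeping up to max_in_flight calls pending."""
    if max_in_flight < 1:
        raise ValueError("max_in_flight must be at least 1")
    pending = deque()
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        for frame in frames:
            pending.append(pool.submit(infer_fn, frame))
            if len(pending) >= max_in_flight:
                # Block on the oldest request so output order matches input order
                yield pending.popleft().result()
        while pending:
            yield pending.popleft().result()


def fake_infer(frame):
    """Stand-in for an HTTP inference call that takes about 50 ms."""
    time.sleep(0.05)
    return frame * 2


if __name__ == "__main__":
    start = time.perf_counter()
    results = list(pipelined_infer(range(6), fake_infer, max_in_flight=3))
    print(results, f"{time.perf_counter() - start:.2f} s")
```

Each result arrives with the latency of a single request plus queueing, so this trades latency for throughput; for a live preview the displayed annotations will lag the camera by a few frames.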

clausMeko added the bug Something isn't working label Aug 16, 2024
