Merged
2 changes: 1 addition & 1 deletion docker-compose.yml
@@ -181,7 +181,7 @@ services:
     image: ghcr.io/shared-reality-lab/image-preprocessor-object-detection-llm:${REGISTRY_TAG}
     restart: "no"
     environment:
-      - CONF_THRESHOLD=0.9
+      - CONF_THRESHOLD=0.8
Member

One could argue this should be settable in docker-compose, so that it can be tuned by people who have different requirements for number of items vs. accuracy, or who change the model used so that the confidence scale shifts.

As discussed in Slack earlier, filtering should really be a handler issue in the long run. So the confidence threshold should be tuned to let through "too much" rather than "too little", and individual handlers can decide what tradeoffs they want to make.

Contributor Author

This is settable in docker-compose, or am I missing something?

Do you want me to change the threshold to 0 and let handlers decide what they want to do?

Member

No, go with a threshold you're comfortable with, since we don't have anyone to implement handler changes at this point. But this needs to be resolved in the future. And great, it is settable in docker-compose. Didn't remember that!

       - PII_LOGGING_ENABLED=${PII_LOGGING_ENABLED}
       - WARMUP_ENABLED=true
     labels:
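
As a rough illustration of the direction discussed in the thread above (a permissive preprocessor threshold, with each handler applying its own stricter cut), here is a minimal Python sketch. The helper names `read_threshold` and `keep_confident` are hypothetical, and how the preprocessor actually parses `CONF_THRESHOLD` is not shown in this diff, so the parsing here is an assumption.

```python
import os


def read_threshold(env=None, default=0.8):
    """Read CONF_THRESHOLD from a mapping (os.environ by default).

    Falls back to `default` on a missing or non-numeric value and clamps
    the result to the range [0, 1]. Hypothetical helper, not from the PR.
    """
    env = os.environ if env is None else env
    try:
        value = float(env.get("CONF_THRESHOLD", default))
    except ValueError:
        value = default
    return min(max(value, 0.0), 1.0)


def keep_confident(objects, threshold):
    """Handler-side filter: keep objects at or above `threshold`."""
    return [o for o in objects if o.get("confidence", 0) >= threshold]


# Made-up detections, as a permissive preprocessor might emit them.
detections = [
    {"type": "car", "confidence": 0.95},
    {"type": "tree", "confidence": 0.85},
    {"type": "bench", "confidence": 0.40},
]

# A handler that wants many items versus one that wants high accuracy.
permissive = keep_confident(detections, read_threshold({"CONF_THRESHOLD": "0.3"}))
strict = keep_confident(detections, 0.9)
```

With this split, the value set in docker-compose only bounds what reaches the handlers; each handler then picks its own tradeoff between number of items and accuracy.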
2 changes: 1 addition & 1 deletion preprocessors/object-detection-llm/Dockerfile
@@ -25,4 +25,4 @@ ENV FLASK_APP=object-detection-llm.py

 HEALTHCHECK --interval=60s --timeout=10s --start-period=120s --retries=5 CMD curl -f http://localhost:5000/health || exit 1

-CMD [ "gunicorn", "object-detection-llm:app", "-b", "0.0.0.0:5000", "--capture-output", "--log-level=debug" ]
+CMD [ "gunicorn", "object-detection-llm:app", "-b", "0.0.0.0:5000", "--capture-output", "--log-level=debug", "--timeout", "75"]
Member

Timeout adjustments should really be done in docker-compose, but that isn't implemented until #1077 is complete.

Contributor Author

Is there anything I can do about it now?

Member

I can't think of anything since #1077 is not implemented. But at least these comments should show up there now. :)
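
For reference, one possible interim approach until #1077 is complete would be to move the timeout out of the `CMD` line and into a gunicorn config file that reads an environment variable, which docker-compose could then set. This is only a sketch under stated assumptions: the file is not part of this PR, and the `GUNICORN_TIMEOUT` variable name and the `worker_timeout` helper are made up for illustration.

```python
# gunicorn.conf.py (hypothetical file, not part of this PR)
import os


def worker_timeout(env=None, default=75):
    """Return the worker timeout in seconds.

    Reads the assumed GUNICORN_TIMEOUT variable and falls back to
    `default` when it is unset, non-numeric, or not positive.
    """
    env = os.environ if env is None else env
    try:
        value = int(env.get("GUNICORN_TIMEOUT", default))
    except ValueError:
        return default
    return value if value > 0 else default


# Module-level names below are standard gunicorn settings.
bind = "0.0.0.0:5000"
capture_output = True
loglevel = "debug"
timeout = worker_timeout()
```

Such a file would be loaded with `gunicorn -c gunicorn.conf.py object-detection-llm:app`. The `--timeout` flag would have to come off the command line, since command-line flags override config-file values.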

42 changes: 29 additions & 13 deletions preprocessors/object-detection-llm/object-detection-llm.py
@@ -68,33 +68,50 @@ def normalize_bbox(bbox, width, height):
     ]


-def filter_objects_by_confidence(objects, threshold):
+def process_objects(objects, threshold):
     """
-    Filter objects based on confidence score threshold
-    and replace underscores in labels with spaces.
+    Process detected objects by filtering, transforming, and enriching them.
+
+    - Filters objects by confidence threshold
+    - Normalizes labels (replaces underscores with spaces)
+    - Renumbers IDs sequentially
+    - Calculates geometric properties (area, centroid)

     Args:
         objects (list): List of detected objects with confidence scores
         threshold (float): Minimum confidence score (0-1)

     Returns:
-        list: Filtered list of objects meeting the confidence threshold
+        list: Processed objects with computed properties
     """
-    filtered = []
+    processed = []
     for obj in objects:
         if obj.get("confidence", 0) >= threshold:
             obj['type'] = obj['type'].replace('_', ' ')
-            filtered.append(obj)
+            processed.append(obj)

     # Renumber IDs sequentially after filtering
-    for idx, obj in enumerate(filtered):
+    for idx, obj in enumerate(processed):
         obj['ID'] = idx
+
+        x1, y1, x2, y2 = obj["dimensions"]
Member

I wouldn't expect to find this in a function "filter_objects" since it has nothing to do with filtering. Make the function name more generic?

Contributor Author

Thank you, renamed.


+        # Calculate area (width * height)
+        area = (x2 - x1) * (y2 - y1)
+
+        # Calculate centroid
+        centroid_x = (x1 + x2) / 2
+        centroid_y = (y1 + y2) / 2
+
+        # Create object entry according to schema
+        obj["area"] = area
+        obj["centroid"] = [centroid_x, centroid_y]

     logging.debug(
-        f"Filtered {len(objects)} objects to {len(filtered)} "
+        f"Processed {len(objects)} objects to {len(processed)} "
         f"objects with confidence >= {threshold}"
     )
-    return filtered
+    return processed


 @app.route("/preprocessor", methods=['POST'])
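
To make the effect of the reworked function concrete, here is a small self-contained sketch of the same four steps (filter, normalize labels, renumber, add area and centroid) run on made-up data. It assumes `dimensions` holds a normalized `[x1, y1, x2, y2]` box, as the unpacking in the diff suggests. The name `process_objects_sketch` is hypothetical, and unlike the PR's `process_objects` it builds new dicts instead of mutating the input objects.

```python
def process_objects_sketch(objects, threshold):
    """Filter by confidence, tidy labels, renumber IDs, add geometry.

    Illustrative only: returns new dicts and leaves `objects` untouched.
    """
    processed = []
    for obj in objects:
        if obj.get("confidence", 0) < threshold:
            continue
        x1, y1, x2, y2 = obj["dimensions"]
        processed.append({
            **obj,
            "type": obj["type"].replace("_", " "),
            "ID": len(processed),  # sequential ID among kept objects
            "area": (x2 - x1) * (y2 - y1),
            "centroid": [(x1 + x2) / 2, (y1 + y2) / 2],
        })
    return processed


# Made-up detections with normalized [x1, y1, x2, y2] boxes.
detections = [
    {"ID": 0, "type": "traffic_light", "confidence": 0.95,
     "dimensions": [0.25, 0.5, 0.75, 1.0]},
    {"ID": 1, "type": "dog", "confidence": 0.5,
     "dimensions": [0.0, 0.0, 0.5, 0.5]},
    {"ID": 2, "type": "car", "confidence": 0.85,
     "dimensions": [0.0, 0.0, 0.5, 0.25]},
]

result = process_objects_sketch(detections, 0.8)
# The "dog" entry is dropped; "car" is renumbered from ID 2 to ID 1.
# First kept box: width 0.5 and height 0.5 give area 0.25, centroid [0.5, 0.75].
```

Because the boxes are normalized, `area` is the fraction of the image the object covers, and the centroid coordinates also fall in the 0 to 1 range.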
@@ -148,8 +165,6 @@ def detect_objects():
         parse_json=True
     )

-    logging.pii(f"LLM object detection output: {object_json}")
-
     if object_json is None or len(object_json.get("objects", [])) == 0:
         logging.error("Failed to extract objects from the graphic.")
         return jsonify({"error": "No objects extracted"}), 204
@@ -162,8 +177,9 @@ def detect_objects():
             obj["dimensions"], width, height
         )

-    # Filter objects by confidence threshold
-    object_json["objects"] = filter_objects_by_confidence(
+    # Filter objects by confidence threshold, add area and centroid,
+    # remove underscores from labels, and renumber IDs
+    object_json["objects"] = process_objects(
         object_json["objects"],
         CONF_THRESHOLD
     )
1 change: 1 addition & 0 deletions utils/llm/prompts.py
@@ -45,6 +45,7 @@
 3. Use simple and common object labels (e.g., "car", "person", "tree").
 4. Include only objects that are clearly visible and identifiable.
 5. Focus on the major and important objects in the image.
+6. Multiple objects can have the same confidence score.
 """
 ###
