Description
Hi, I am Tejas Stanley, a first-year Robotics MSc student at TU Delft.
I had previously commented in the tutorials discussion regarding adding support for the nuScenes dataset. I would like to propose adding nuScenes support for 2D object detection tasks.
I have implemented a working prototype locally in my fork and would like to get feedback from the maintainers before opening a full PR.
What I have implemented so far
- A new dataset class `NuScenesDetectionDataset` inheriting from `ImageDetectionDataset`
- Added the dataset import in `datasets/__init__.py`
- Parsing sample images and projecting nuScenes 3D bounding boxes into 2D camera boxes using the nuScenes devkit helpers
- Camera-specific loading (e.g., `CAM_FRONT`)
- Keeping original nuScenes class names (e.g., `vehicle.car`, `vehicle.truck`, `human.pedestrian.adult`)
- Dropping some nuScenes classes from the ground truth (e.g., `movable_object.barrier`), because they are so numerous that they clutter the visualization
- Tested integration inside `tutorial_image_detection.ipynb`
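For reference, the 3D-to-2D projection step can be sketched roughly as follows. This is a simplified NumPy illustration and not necessarily what the prototype does: `project_box_to_2d` is a hypothetical helper, it assumes the eight box corners are already expressed in the camera frame, and it simply ignores corners behind the camera instead of clipping the box edges against the image plane.

```python
import numpy as np


def project_box_to_2d(corners_cam, intrinsic, image_size, min_depth=0.1):
    """Project 3D box corners (3 x 8, camera frame) to an axis-aligned 2D box.

    Returns (x1, y1, x2, y2) in pixels, or None if nothing is visible.
    Hypothetical helper for illustration only.
    """
    corners_cam = np.asarray(corners_cam, dtype=float)
    intrinsic = np.asarray(intrinsic, dtype=float)

    # Keep only corners in front of the camera (simplification: no edge clipping).
    in_front = corners_cam[2] > min_depth
    if not in_front.any():
        return None

    # Pinhole projection: multiply by the intrinsics, then divide by depth.
    uvw = intrinsic @ corners_cam[:, in_front]
    uv = uvw[:2] / uvw[2]

    # Take the enclosing rectangle and clip it to the image bounds.
    width, height = image_size
    x1 = float(np.clip(uv[0].min(), 0, width))
    y1 = float(np.clip(uv[1].min(), 0, height))
    x2 = float(np.clip(uv[0].max(), 0, width))
    y2 = float(np.clip(uv[1].max(), 0, height))
    if x2 <= x1 or y2 <= y1:
        return None
    return x1, y1, x2, y2


# Made-up intrinsics and a 2 m x 1 m x 2 m box, 10 to 12 m in front of the camera.
K = np.array([[1000.0, 0.0, 800.0], [0.0, 1000.0, 450.0], [0.0, 0.0, 1.0]])
xs, ys, zs = np.meshgrid([-1.0, 1.0], [-0.5, 0.5], [10.0, 12.0], indexing="ij")
corners = np.stack([xs.ravel(), ys.ravel(), zs.ravel()])
box_2d = project_box_to_2d(corners, K, image_size=(1600, 900))
```

The intrinsics and box dimensions above are invented values chosen only to make the arithmetic easy to follow.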
Prototype implementation:
https://github.com/tejasstanley/PerceptionMetrics/blob/nuScenes_detection/perceptionmetrics/datasets/nuscenes_detection.py
Issue 1 — Bounding box scaling mismatch in tutorials (COCO + nuScenes)
While testing tutorial_image_detection.ipynb, predicted bounding boxes are not scaled back to the original image resolution during visualization.
This happens for both:
- COCO (default dataset)
- nuScenes (my integration)
Before resizing (misaligned boxes)
After resizing (temporary fix inside tutorial)
For now I handle the scaling directly inside the tutorial for debugging, but ideally this should be fixed in the dataset or visualization layer rather than in inference.
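The temporary fix amounts to something like the sketch below. It assumes the model input is a plain resize of the original image (no letterboxing or padding) and that boxes are in `(x1, y1, x2, y2)` pixel format; `rescale_boxes` is a hypothetical helper name, not an existing function in the repository.

```python
import numpy as np


def rescale_boxes(boxes_xyxy, model_size, original_size):
    """Map boxes from model-input resolution back to the original image.

    Sizes are (width, height). Assumes a plain resize without padding.
    """
    boxes = np.asarray(boxes_xyxy, dtype=float).reshape(-1, 4)
    model_w, model_h = model_size
    orig_w, orig_h = original_size
    # x coordinates scale with the width ratio, y coordinates with the height ratio.
    scale = np.array([orig_w / model_w, orig_h / model_h] * 2)
    return boxes * scale


pred = [[64, 48, 320, 240]]  # box predicted on a 640x480 input
restored = rescale_boxes(pred, model_size=(640, 480), original_size=(1600, 900))
```

If the preprocessing pads the image to preserve aspect ratio, the padding offset would have to be subtracted before scaling, which this sketch does not handle.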
Issue 2 — GUI inference shows the same scaling problem
The same mismatch appears when running inference through the GUI.
Issue 3 — Precision–Recall curve returns 0 AUC-PR
In the detection tutorial, the PR curve cell currently returns:
AUC-PR = 0
This occurs even when using:
- COCO dataset
- Provided pretrained weights
These issues might be specific to my environment (installed via Poetry). It would help if someone could verify whether this is reproducible.
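One thing that may be worth checking (a hypothesis, not a confirmed cause) is whether Issue 3 is related to Issue 1: if predictions stay at model-input resolution while ground truth is at the original resolution, the IoU between matching boxes can fall below the usual 0.5 threshold, so every prediction would count as a false positive. The sketch below uses made-up box coordinates and a hypothetical `iou_xyxy` helper to show the effect.

```python
def iou_xyxy(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


gt = (160, 90, 800, 450)            # ground truth on a 1600x900 image
pred_model_res = (64, 48, 320, 240)  # same object, still at 640x480 scale
pred_rescaled = (160, 90, 800, 450)  # after scaling back to 1600x900

iou_unscaled = iou_xyxy(gt, pred_model_res)  # well below a 0.5 threshold
iou_rescaled = iou_xyxy(gt, pred_rescaled)
```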
nuScenes-specific considerations
- Testing done using the `v1.0-mini` dataset provided by nuScenes
- Using the provided RCNN weights only to verify the pipeline (not fine-tuned), so detections are mostly incorrect.
- Requires installing the nuScenes devkit
- nuScenes provides 3D bounding boxes, which are projected into 2D camera space. Some ground truth boxes appear even when the objects are heavily occluded, and this still happens with visibility filtering enabled. It could be improved with lidar-based filtering or additional visibility heuristics.
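A possible filtering heuristic is sketched below. It relies on the `visibility_token` and `num_lidar_pts` fields of nuScenes `sample_annotation` records; the thresholds and the `keep_annotation` helper are assumptions for illustration and would need tuning. As I understand it, the nuScenes visibility level is computed over all six cameras together, which may explain why a box can pass the visibility filter and still be occluded in one specific camera.

```python
def keep_annotation(ann, min_visibility=3, min_lidar_pts=5):
    """Decide whether a nuScenes annotation record should be kept.

    visibility_token is a string from "1" (0-40% visible) to "4" (80-100%).
    The default thresholds are arbitrary starting points, not recommended values.
    """
    visibility = int(ann.get("visibility_token") or 0)
    lidar_pts = ann.get("num_lidar_pts", 0)
    return visibility >= min_visibility and lidar_pts >= min_lidar_pts


# Hand-written records that mimic the relevant annotation fields.
annotations = [
    {"token": "a", "visibility_token": "4", "num_lidar_pts": 120},
    {"token": "b", "visibility_token": "1", "num_lidar_pts": 40},
    {"token": "c", "visibility_token": "4", "num_lidar_pts": 1},
]
kept = [ann["token"] for ann in annotations if keep_annotation(ann)]
```

Here only annotation `a` passes: `b` is mostly hidden and `c` has almost no lidar returns.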
Segmentation
nuScenes does not provide 2D image segmentation masks, but it does provide lidar segmentation annotations.
Questions / guidance requested
- Is this dataset structure acceptable for integration?
- Should class names remain original or be merged into a smaller ontology?
- Preferred place to handle bbox scaling (dataset vs visualization vs tutorial)?
- Any recommendation for filtering occluded projected boxes?
- Should I also integrate nuScenes into the dataset viewer tab?
- Are there any additional requirements or conventions I should follow before opening a PR?
- Is there any LiDAR segmentation tutorial or reference implementation available in the repository? I can use it as a guide to add nuScenes LiDAR support.
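To make the ontology question concrete, merging could look like the sketch below. The left-hand names are original nuScenes categories; the grouping is loosely modeled on the coarser classes used for the nuScenes detection task, but the exact mapping, the `CLASS_MAP` name, and the `merge_class` helper are assumptions to be agreed on, and the table is deliberately incomplete.

```python
# Illustrative subset: original nuScenes category -> merged class name.
CLASS_MAP = {
    "vehicle.car": "car",
    "vehicle.truck": "truck",
    "vehicle.bus.bendy": "bus",
    "vehicle.bus.rigid": "bus",
    "vehicle.bicycle": "bicycle",
    "vehicle.motorcycle": "motorcycle",
    "human.pedestrian.adult": "pedestrian",
    "human.pedestrian.child": "pedestrian",
    "movable_object.barrier": "barrier",
}


def merge_class(name, class_map=CLASS_MAP):
    """Return the merged class name, or None if the category should be dropped."""
    return class_map.get(name)


merged = [merge_class(n) for n in ("vehicle.car", "vehicle.bus.rigid", "animal")]
```

Returning `None` for unmapped categories gives a single place to control which classes are dropped from the ground truth.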
If this direction looks good, I can clean up the code and open a PR referencing this issue.
Thanks for your time and feedback!