Add support for nuScenes dataset #393

@tejasstanley

Hi, I am Tejas Stanley, a first-year Robotics MSc student at TU Delft.
I previously commented in the tutorials discussion about adding support for the nuScenes dataset. I would now like to propose adding nuScenes support for 2D object detection tasks.

I have implemented a working prototype locally in my fork and would like to get feedback from the maintainers before opening a full PR.

What I have implemented so far

  1. A new dataset class NuScenesDetectionDataset inheriting from ImageDetectionDataset
  2. Added the dataset import in datasets/__init__.py
  3. Parsing sample images and projecting nuScenes 3D bounding boxes into 2D camera boxes using the nuScenes devkit helpers
  4. Camera-specific loading (e.g., CAM_FRONT)
  5. Keeping original nuScenes class names (e.g., vehicle.car, vehicle.truck, human.pedestrian.adult)
  6. Dropping some nuScenes classes from the ground truth, such as "movable_object.barrier", because they are so numerous that they clutter the visualization
  7. Tested integration inside tutorial_image_detection.ipynb

Prototype implementation:
https://github.com/tejasstanley/PerceptionMetrics/blob/nuScenes_detection/perceptionmetrics/datasets/nuscenes_detection.py
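
To make the projection step (item 3) concrete, here is a minimal NumPy sketch of turning the eight corners of a 3D box into an axis-aligned 2D box. It is not the code from the prototype or from the nuScenes devkit: the function name `project_box_to_2d`, the 0.1 m depth cutoff, and the choice to simply drop corners behind the camera are assumptions made for illustration.

```python
import numpy as np


def project_box_to_2d(corners_cam, intrinsic, image_size):
    """Project 3D box corners (camera frame) to an axis-aligned 2D box.

    corners_cam: (3, 8) array of x, y, z corner coordinates in the camera frame.
    intrinsic:   (3, 3) camera intrinsic matrix.
    image_size:  (width, height) in pixels.
    Returns (x_min, y_min, x_max, y_max), or None if nothing is visible.
    """
    corners_cam = np.asarray(corners_cam, dtype=float)
    # Keep only corners in front of the camera (hypothetical 0.1 m cutoff);
    # corners behind the image plane would project to meaningless pixels.
    in_front = corners_cam[2] > 0.1
    if not in_front.any():
        return None
    pts = corners_cam[:, in_front]
    # Pinhole projection: multiply by the intrinsics, then divide by depth.
    uvw = np.asarray(intrinsic, dtype=float) @ pts
    u = uvw[0] / uvw[2]
    v = uvw[1] / uvw[2]
    width, height = image_size
    # Take the extent of the projected corners and clip it to the image.
    x_min, x_max = np.clip([u.min(), u.max()], 0, width)
    y_min, y_max = np.clip([v.min(), v.max()], 0, height)
    # Discard boxes that collapse to zero area after clipping.
    if x_max <= x_min or y_max <= y_min:
        return None
    return float(x_min), float(y_min), float(x_max), float(y_max)
```

Dropping corners behind the camera is an approximation; a box that straddles the image plane would need its edges clipped properly to get a tight 2D box.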

Issue 1 — Bounding box scaling mismatch in tutorials (COCO + nuScenes)

While testing tutorial_image_detection.ipynb, I found that predicted bounding boxes are not scaled back to the original image resolution during visualization.

This happens for both:

  • COCO (default dataset)
  • nuScenes (my integration)

[Image: before resizing (misaligned boxes)]

[Image: after resizing (temporary fix inside tutorial)]

For now I handle the scaling directly inside the tutorial for debugging, but ideally it should be fixed in the dataset or visualization layer rather than in the inference code.
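
The kind of rescaling involved can be sketched as follows. This is an illustration only, not the tutorial's code: `scale_boxes_to_original` is a hypothetical helper, and it assumes the model input was produced by a plain resize (no letterbox padding or cropping) and that boxes are in `[x_min, y_min, x_max, y_max]` pixel format.

```python
import numpy as np


def scale_boxes_to_original(boxes, model_size, original_size):
    """Map boxes from resized-image pixels back to original-image pixels.

    boxes:         (N, 4) array of [x_min, y_min, x_max, y_max].
    model_size:    (width, height) of the resized image fed to the model.
    original_size: (width, height) of the original image.
    """
    boxes = np.asarray(boxes, dtype=float).reshape(-1, 4)
    # Independent scale factors per axis, since the resize may change aspect ratio.
    sx = original_size[0] / model_size[0]
    sy = original_size[1] / model_size[1]
    return boxes * np.array([sx, sy, sx, sy])


# Example: a 640x360 model input mapped back to a 1600x900 image.
scaled = scale_boxes_to_original([[10, 20, 110, 220]], (640, 360), (1600, 900))
```

If the preprocessing pads the image to keep its aspect ratio, the padding offset would have to be subtracted before scaling.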


Issue 2 — GUI inference shows the same scaling problem

The same mismatch appears when running inference through the GUI.


Issue 3 — Precision–Recall curve returns 0 AUC-PR

In the detection tutorial, the PR curve cell currently returns:

AUC-PR = 0

This occurs even when using:

  • COCO dataset
  • Provided pretrained weights

These issues might be specific to my environment (installed via Poetry). It would help if someone could verify whether this is reproducible.
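
For anyone trying to reproduce this, one simple check is whether any predicted box overlaps a ground-truth box at all once both are in the same coordinate frame. The sketch below computes pairwise IoU with NumPy; `pairwise_iou` is a hypothetical helper, not a function from the repository, and it does not claim to identify the cause of the zero AUC-PR.

```python
import numpy as np


def pairwise_iou(pred, gt):
    """Return an (N, M) matrix of IoU values between N predicted and M GT boxes.

    Both inputs use [x_min, y_min, x_max, y_max] in the same pixel frame.
    """
    pred = np.asarray(pred, dtype=float).reshape(-1, 4)
    gt = np.asarray(gt, dtype=float).reshape(-1, 4)
    # Intersection rectangle for every (pred, gt) pair via broadcasting.
    x1 = np.maximum(pred[:, None, 0], gt[None, :, 0])
    y1 = np.maximum(pred[:, None, 1], gt[None, :, 1])
    x2 = np.minimum(pred[:, None, 2], gt[None, :, 2])
    y2 = np.minimum(pred[:, None, 3], gt[None, :, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_pred = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_gt = (gt[:, 2] - gt[:, 0]) * (gt[:, 3] - gt[:, 1])
    union = area_pred[:, None] + area_gt[None, :] - inter
    # Guard against division by zero for degenerate boxes.
    return np.where(union > 0, inter / np.maximum(union, 1e-12), 0.0)
```

If the maximum IoU per ground-truth box is near zero for every image, no prediction can count as a true positive at the usual IoU thresholds.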


nuScenes-specific considerations

  • Testing was done with the v1.0-mini split provided by nuScenes.
  • The provided RCNN weights are used only to verify the pipeline (not fine-tuned), so detections are mostly incorrect.
  • Requires installing the nuScenes devkit.
  • nuScenes provides 3D bounding boxes, which are projected into 2D camera space. Some ground-truth boxes still appear for heavily occluded objects, even with visibility filtering. This could be improved with LiDAR-based filtering or additional visibility heuristics.
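
One possible form of the LiDAR-based filtering mentioned above is sketched here. It assumes each annotation is a dict shaped like a nuScenes `sample_annotation` record, with a `visibility_token` string ("1" to "4", higher meaning more visible) and a `num_lidar_pts` count. The function name and both thresholds are illustrative assumptions, not values from the prototype.

```python
def keep_annotation(ann, min_visibility=2, min_lidar_pts=1):
    """Decide whether a projected ground-truth box should be kept.

    ann: dict with 'visibility_token' (string "1".."4") and 'num_lidar_pts' (int).
    min_visibility / min_lidar_pts: illustrative thresholds, to be tuned.
    """
    # Missing or empty visibility tokens are treated as "not visible".
    visibility = int(ann.get("visibility_token") or 0)
    lidar_pts = ann.get("num_lidar_pts", 0)
    return visibility >= min_visibility and lidar_pts >= min_lidar_pts


# Example with hand-written records; real ones would come from the devkit.
annotations = [
    {"category_name": "vehicle.car", "visibility_token": "4", "num_lidar_pts": 120},
    {"category_name": "vehicle.car", "visibility_token": "1", "num_lidar_pts": 0},
]
kept = [a for a in annotations if keep_annotation(a)]
```

Requiring at least a few LiDAR points removes boxes that no sensor return supports, though it may also drop distant objects that are clearly visible in the camera image.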

Segmentation

nuScenes does not provide image segmentation masks, but it does provide LiDAR segmentation labels.


Questions / guidance requested

  • Is this dataset structure acceptable for integration?
  • Should class names remain original or be merged into a smaller ontology?
  • Preferred place to handle bbox scaling (dataset vs visualization vs tutorial)?
  • Any recommendation for filtering occluded projected boxes?
  • Should I also integrate nuScenes into the dataset viewer tab?
  • Are there any additional requirements or conventions I should follow before opening a PR?
  • Is there any LiDAR segmentation tutorial or reference implementation available in the repository? I can use it as a guide to add nuScenes LiDAR support.
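
To make the class-name question concrete, merging into a smaller ontology could look like the sketch below. The mapping is purely illustrative and covers only the class names mentioned in this issue; the target names and the helper `map_class` are assumptions, not an agreed ontology.

```python
# Illustrative mapping from original nuScenes names to merged names.
ILLUSTRATIVE_ONTOLOGY = {
    "vehicle.car": "car",
    "vehicle.truck": "truck",
    "human.pedestrian.adult": "pedestrian",
}

# Classes dropped from the ground truth (see item 6 above).
DROPPED_CLASSES = {"movable_object.barrier"}


def map_class(name):
    """Return the merged class name, or None if the class is dropped.

    Names without an entry in the mapping are passed through unchanged.
    """
    if name in DROPPED_CLASSES:
        return None
    return ILLUSTRATIVE_ONTOLOGY.get(name, name)
```

Keeping the mapping in one dictionary would let the dataset class expose either the original or the merged names through a single option.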

If this direction looks good, I can clean up the code and open a PR referencing this issue.

Thanks for your time and feedback!
