Add blender dataset + alpha training support #573

Open · wants to merge 1 commit into main
Conversation

@thomakah commented Mar 4, 2025

This pull request adds the following:

  • Support for the commonly used synthetic Blender dataset
  • Support for source images with alpha channels, plus small changes to training and evaluation to accommodate this (more info below)

The latter is the main purpose of this pull request; the former allows convenient testing on a public dataset with transparent images. Building on this work, users could train on matted photos to produce splats only for the object of interest, though that would require a small additional change to the COLMAP parsing to load the alpha channel (not included in this pull request).
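For reference, the Blender (NeRF-synthetic) format stores camera-to-world poses in `transforms_{split}.json` files alongside RGBA PNGs. Below is a minimal sketch of a loader for that format; the `load_blender_split` name and the exact return layout are my own illustration, not necessarily what this PR implements.

```python
import json
import os

import imageio.v2 as imageio
import numpy as np


def load_blender_split(data_dir: str, split: str = "train"):
    """Load RGBA images and camera poses from transforms_{split}.json."""
    with open(os.path.join(data_dir, f"transforms_{split}.json")) as f:
        meta = json.load(f)

    images, poses = [], []
    for frame in meta["frames"]:
        # file_path is stored without an extension, e.g. "./train/r_0"
        path = os.path.join(data_dir, frame["file_path"] + ".png")
        images.append(imageio.imread(path))  # (H, W, 4) uint8, alpha kept
        poses.append(np.array(frame["transform_matrix"], dtype=np.float32))

    images = np.stack(images)  # (N, H, W, 4)
    poses = np.stack(poses)    # (N, 4, 4) camera-to-world

    # Shared pinhole intrinsics derived from the horizontal field of view.
    H, W = images.shape[1:3]
    focal = 0.5 * W / np.tan(0.5 * meta["camera_angle_x"])
    K = np.array(
        [[focal, 0, W / 2], [0, focal, H / 2], [0, 0, 1]], dtype=np.float32
    )
    return images, poses, K
```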

Alpha is handled in the following way (a short sketch follows the list):

  • For evaluation, both the render and the ground-truth photo are composited onto a fixed background (defined by a new configuration parameter, defaulting to white).
  • For training:
    • if the (existing but slightly repurposed) configuration setting random_bkgd is True, the same random background color is applied to both the photo and the render on each iteration;
    • otherwise, the fixed background color is used.
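To make the compositing concrete, here is a minimal sketch of the logic described above; the tensor names, shapes, and the `prepare_pair` helper are illustrative assumptions, not the exact code in this PR.

```python
import torch


def composite(rgb: torch.Tensor, alpha: torch.Tensor, bkgd: torch.Tensor) -> torch.Tensor:
    """Alpha-composite (H, W, 3) colors over a background color (3,)."""
    return rgb * alpha + bkgd * (1.0 - alpha)


def prepare_pair(render_rgb, render_alpha, gt_rgba, random_bkgd: bool, fixed_bkgd):
    # gt_rgba: (H, W, 4) ground-truth image with alpha in [0, 1].
    gt_rgb, gt_alpha = gt_rgba[..., :3], gt_rgba[..., 3:4]
    if random_bkgd:
        # One random color per iteration, shared by render and ground truth,
        # so the loss cannot be lowered by painting the background color.
        bkgd = torch.rand(3, device=gt_rgba.device)
    else:
        bkgd = fixed_bkgd  # e.g. torch.ones(3) for a white background
    return composite(render_rgb, render_alpha, bkgd), composite(gt_rgb, gt_alpha, bkgd)
```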

Using the random background in training encourages better alpha consistency and reduces floaters. However, the final evaluation metrics are marginally worse than with a fixed background. For this reason, I have not enforced random_bkgd = True when Blender data is used.

The metrics are included below. alpha_iou is the intersection over union between the rendered alpha and the source image's alpha channel (thresholded at > 127). nerfbaselines has also evaluated the Blender dataset on an earlier version of gsplat (patching in the dataset compatibility on their end); compared to their results, six scenes show marginally better PSNR while two (materials and ficus) are marginally worse.
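For clarity, the alpha_iou metric could be computed as in the sketch below; this helper is an illustration assuming uint8 alpha masks, not necessarily the exact code in the PR.

```python
import torch


def alpha_iou(render_alpha: torch.Tensor, gt_alpha: torch.Tensor) -> float:
    """IoU of binarized alpha masks; inputs are (H, W) uint8 in [0, 255]."""
    pred = render_alpha > 127
    gt = gt_alpha > 127
    intersection = (pred & gt).sum().item()
    union = (pred | gt).sum().item()
    return intersection / union if union > 0 else 1.0
```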

| scene | strategy | background | PSNR | SSIM | LPIPS | alpha_iou |
|---|---|---|---|---|---|---|
| chair | default | fixed_bkgd | 35.823 | 0.987 | 0.008 | 0.991 |
| chair | default | random_bkgd | 35.718 | 0.987 | 0.008 | 0.997 |
| chair | mcmc | fixed_bkgd | 36.398 | 0.989 | 0.008 | 0.993 |
| chair | mcmc | random_bkgd | 36.149 | 0.989 | 0.009 | 0.997 |
| drums | default | fixed_bkgd | 26.121 | 0.954 | 0.039 | 0.982 |
| drums | default | random_bkgd | 26.024 | 0.953 | 0.040 | 0.989 |
| drums | mcmc | fixed_bkgd | 26.209 | 0.956 | 0.040 | 0.973 |
| drums | mcmc | random_bkgd | 26.127 | 0.955 | 0.042 | 0.989 |
| ficus | default | fixed_bkgd | 34.360 | 0.987 | 0.010 | 0.952 |
| ficus | default | random_bkgd | 33.639 | 0.987 | 0.012 | 0.965 |
| ficus | mcmc | fixed_bkgd | 34.957 | 0.988 | 0.008 | 0.958 |
| ficus | mcmc | random_bkgd | 34.023 | 0.987 | 0.011 | 0.967 |
| hotdog | default | fixed_bkgd | 37.668 | 0.985 | 0.014 | 0.981 |
| hotdog | default | random_bkgd | 37.417 | 0.985 | 0.013 | 0.998 |
| hotdog | mcmc | fixed_bkgd | 38.373 | 0.988 | 0.011 | 0.762 |
| hotdog | mcmc | random_bkgd | 37.717 | 0.988 | 0.010 | 0.997 |
| lego | default | fixed_bkgd | 35.286 | 0.981 | 0.012 | 0.996 |
| lego | default | random_bkgd | 35.046 | 0.981 | 0.013 | 0.997 |
| lego | mcmc | fixed_bkgd | 35.732 | 0.984 | 0.011 | 0.994 |
| lego | mcmc | random_bkgd | 35.431 | 0.984 | 0.012 | 0.997 |
| materials | default | fixed_bkgd | 29.885 | 0.960 | 0.019 | 0.997 |
| materials | default | random_bkgd | 29.791 | 0.960 | 0.021 | 0.997 |
| materials | mcmc | fixed_bkgd | 30.577 | 0.965 | 0.018 | 0.997 |
| materials | mcmc | random_bkgd | 30.464 | 0.965 | 0.018 | 0.997 |
| mic | default | fixed_bkgd | 35.335 | 0.991 | 0.006 | 0.992 |
| mic | default | random_bkgd | 35.256 | 0.992 | 0.006 | 0.994 |
| mic | mcmc | fixed_bkgd | 37.226 | 0.994 | 0.004 | 0.992 |
| mic | mcmc | random_bkgd | 36.522 | 0.993 | 0.005 | 0.994 |
| ship | default | fixed_bkgd | 30.764 | 0.906 | 0.088 | 0.955 |
| ship | default | random_bkgd | 30.324 | 0.906 | 0.088 | 0.994 |
| ship | mcmc | fixed_bkgd | 31.437 | 0.911 | 0.075 | 0.829 |
| ship | mcmc | random_bkgd | 30.739 | 0.910 | 0.077 | 0.994 |

@thomakah (Author) commented Mar 6, 2025

To better illustrate the benefit of the random background, here is a validation render from training with a fixed background (using MCMC, though similar effects appear with the default strategy). Each figure shows the ground-truth image on the left and the corresponding render on the right.

[Image: val_step29999_0167_fixed]

Here is the validation render using a random background:
[Image: val_step29999_0167_random]

Note that the large patch of white floaters below the object is no longer present after using the random background in training.
