Conversation

@RunDevelopment (Contributor) commented Nov 23, 2025

If a reference image doesn't exist or doesn't match, this will now automatically update the reference image.

This makes it a lot easier to add/change test images, because updating the references just involves running cargo test.

@fintelia (Contributor)

Wait, so running cargo test twice in a row will have it always succeed the second time? That doesn't seem ideal

@RunDevelopment (Contributor, Author)

Why does this not seem ideal to you? Do you think that people might accidentally update references?

Alternatively, I could make it so that references are only created/updated when a certain env var is set, e.g. BLESS=1 cargo test.
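The env-var gate could look roughly like this. A minimal sketch, assuming the convention named above; `BLESS` and `check_reference` are illustrative names, not the crate's actual API:

```rust
use std::fs;
use std::path::Path;

// Sketch of an env-var-gated reference check. The `bless` flag would
// typically be derived from the environment, e.g.:
//     let bless = std::env::var_os("BLESS").is_some();
fn check_reference(path: &Path, actual: &[u8], bless: bool) -> Result<(), String> {
    match fs::read(path) {
        // Reference exists and matches: the test passes.
        Ok(expected) if expected == actual => Ok(()),
        // Missing or mismatching, and blessing was requested:
        // write out the new reference instead of failing.
        _ if bless => fs::write(path, actual).map_err(|e| e.to_string()),
        // Mismatch without blessing: fail the test.
        Ok(_) => Err(format!("{} does not match", path.display())),
        // Missing without blessing: also fail.
        Err(e) => Err(format!("failed to read {}: {}", path.display(), e)),
    }
}
```

With this shape, a plain `cargo test` never touches the reference files; only an explicit `BLESS=1 cargo test` rewrites them.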

@fintelia (Contributor)

Yeah, the concern is that if there's a bug in a decoder you're trying to fix, then every time you run the tests the incorrect output will overwrite the correct values.

Theoretically folks should be getting the reference images from some other test suite or generating them using a different decoder that's already known to match the spec. Producing the references from the same decoder that's being tested prevents accidental changes in behavior, but doesn't guarantee that the output matches what anyone else thinks is correct.

@RunDevelopment (Contributor, Author) commented Nov 24, 2025

I think that our different perspectives might stem from a difference in workflow. I typically like to use larger images (larger = more than 100 pixels) to test for correct rounding, since rounding errors might only occur in very rare cases. These kinds of defects are very hard to catch with tiny test images, so I like to use images with smooth gradients to exercise lots of colors. However, this workflow is very hard to follow with the current test infrastructure: assert_eq! output is useless when it compares arrays of thousands of numbers that are mostly the same. In those cases, it's a lot easier to just write out the wrong output and use another tool (e.g. an image editor) to search for and analyze the defect. In other words, this change lets you easily visualize incorrect results to figure out the error.

Anyway, I think I'm going to change it so that missing/mismatching reference images are only written out when explicitly asked for via an env var. Your workflow will then be the default, and devs will have an extra tool. I hope this is more acceptable.

> Yeah, the concern is that if there's a bug in a decoder you're trying to fix, then every time you run the tests the incorrect output will overwrite the correct values.

We all use version control. So unless people blindly commit everything, reference files changing should cause them to investigate. (And with the wrong output in front of them, seeing exactly what is wrong should be rather easy.)

> Theoretically folks should be getting the reference images from some other test suite or generating them using a different decoder that's already known to match the spec.

Unfortunately, this doesn't work for some formats. E.g. DDS BC1 allows an error of ±3%, so there is no single correct reference image. (And yes, different implementations of BC1 do produce different results: AMD, Intel, and Nvidia hardware all decode BC1 slightly differently.)
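For formats like BC1, an exact byte comparison against a single reference can't work; the check has to be tolerance-based. A hedged sketch, where the tolerance value and rounding rules are assumptions for illustration (3% of 255 rounds to about 8 per 8-bit channel):

```rust
// Illustrative tolerance comparison: treat two 8-bit buffers as
// matching if every channel differs by at most `tol`. For BC1's
// ±3% allowance, `tol` would be roughly 3% of 255 ≈ 8; the exact
// value a given spec permits is an assumption here.
fn within_tolerance(a: &[u8], b: &[u8], tol: u8) -> bool {
    a.len() == b.len() && a.iter().zip(b.iter()).all(|(&x, &y)| x.abs_diff(y) <= tol)
}
```

A check like this lets one reference image stand in for the family of outputs that different conformant decoders may produce.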

@fintelia merged commit 791680c into image-rs:main Nov 25, 2025
7 checks passed
@RunDevelopment deleted the test-update-references branch November 25, 2025 11:07