Simplify passing PIL images #54
Comments
We already have a central location that loads the image path. I think the original intention was for this function to support str or PIL. Beyond that, we would just need to update the type hints. |
So, to throw another wrench into your system, not only do I need to specify rank, I am also using a CustomLabelsClassifier, which unfortunately gives the error
Here you can see the main code I'm trying to run. Basically I have YOLO detections for multiple insects on a flat plane in a high resolution image. I have cropped and rotated sub-images of just a single insect at a time that I want to feed to pybioclip. Before, I could do it with the @bucket-of-bugs projects, but we had to save all the cropped images to disk first and lose their location on the larger original image. I want to feed pybioclip these PIL images directly.
|
@quitmeyer CustomLabelsClassifier doesn't work with Rank. With CustomLabelsClassifier you can supply labels that aren't even part of taxonomy. What are some of the labels you are using with CustomLabelsClassifier in this scenario? |
@johnbradley here's our old example from beetlepalooza |
I have also been trying to trick bioclip by making a temporary file, but when pybioclip tries to open it I get a "Permission Denied" error (even when running as admin)
but then I'm back at my original problem of just saving a ton of files to disk that I later have to clean up manually |
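The "Permission Denied" error described above is consistent with a known Windows behavior: a file created with `tempfile.NamedTemporaryFile` cannot be reopened by name while the original handle is still open. Assuming that is the cause here, a minimal stdlib-only sketch of a workaround follows. The names `temp_image_path` and `save_fn` are hypothetical, and the placeholder writer stands in for something like `pil_image.save`.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def temp_image_path(save_fn, suffix=".png"):
    """Yield the path of a closed temporary file written by save_fn, then delete it."""
    # delete=False so that closing the handle does not remove the file.
    tmp = tempfile.NamedTemporaryFile(suffix=suffix, delete=False)
    tmp.close()  # release the handle so other code can open the path on Windows
    try:
        save_fn(tmp.name)  # with PIL this would be: pil_image.save(tmp.name)
        yield tmp.name
    finally:
        os.remove(tmp.name)  # cleanup happens automatically, no leftover files


def _write_placeholder(path):
    # Hypothetical stand-in for pil_image.save(path); writes dummy bytes.
    with open(path, "wb") as f:
        f.write(b"not a real image")


with temp_image_path(_write_placeholder) as path:
    existed_inside = os.path.exists(path)
    with open(path, "rb") as f:  # reopening by name works because the handle is closed
        payload = f.read()
removed_after = not os.path.exists(path)
```

Inside the `with` block the path could be handed to any code that expects an image path, and the file is removed on exit, so nothing has to be cleaned up by hand.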
@quitmeyer In that code the classifier isn't making use of rank (that I can tell). Perhaps you manually grouped the items. It sounds like you have custom labels that are species, but you want to group them by order (or another taxon rank). We have a new feature called binning that might be what you are looking for. See these two doc links: The feature hasn't been released yet, so you may need to install like so:
The above documentation shows how to use binning with image paths.
|
Just to provide all the options, if you can use
|
@johnbradley, @quitmeyer, for bucket-of-bugs we did just use |
There can be differences in performance based on image processing. I saw this with another project where the interpolation method was different between the demo and local running: one opened images with tf.keras.utils.load_img default at the desired size, the other resized with PIL and it changed the predictions. |
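To illustrate why the resampling method can matter, here is a minimal pure-Python sketch (not the code from either project mentioned above). It downscales the same tiny checkerboard with two hypothetical helpers, `resize_nearest` and `resize_box`, and gets different pixel values. In PIL the analogous choice is the `resample` argument of `Image.resize`.

```python
def resize_nearest(pixels, factor):
    """Downscale a 2D grid by keeping the top-left pixel of each factor x factor block."""
    return [row[::factor] for row in pixels[::factor]]


def resize_box(pixels, factor):
    """Downscale a 2D grid by averaging each factor x factor block.

    Assumes both grid dimensions are divisible by factor.
    """
    out = []
    for r in range(0, len(pixels), factor):
        out_row = []
        for c in range(0, len(pixels[0]), factor):
            block = [pixels[r + dr][c + dc]
                     for dr in range(factor) for dc in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


# 4x4 checkerboard: 255 where row + column is even, 0 elsewhere.
checker = [[255 if (r + c) % 2 == 0 else 0 for c in range(4)] for r in range(4)]

nearest = resize_nearest(checker, 2)  # samples only the bright pixels
box = resize_box(checker, 2)          # averages bright and dark pixels
```

Here `nearest` is uniformly 255 while `box` is uniformly 127.5, so a model fed the two results would see quite different inputs even though the source image is identical.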
Interesting @egrace479, it seems to only mess up really badly on this butterfly, which is also about 10-20x larger than most of the other images (small moths and beetles). Do you think there's a chance having too big of an image screws things up? Is there a way to specify the interpolation method? |
BTW I just ran a quick experiment running the same image at full, 1/4 and 1/8 resolution, and they ended up with the same results, so that might not be a problem. |
The code scales images to 224x224 as part of an image preprocessing step when using the default model: pybioclip/src/bioclip/predict.py Lines 155 to 164 in d8035ef
To match what is done in the Are you passing the order as the label to CustomLabelsClassifier? ie. |
Here's what we get up to (condensed the code a bit):
|
@quitmeyer Based on your code I think you are creating a
My understanding is bioclip was trained on species. I think you will get better results if you pass the list |
ahhh ok, i think we might have mistakenly thought that's how it was functioning if we chose orders other than taxa |
If you would like to use the TreeOfLifeClassifier but filter to the target classes and add "hole" and "smudge" I could work up some code to do that. |
Otherwise the new binning feature is probably your best bet, but generating 13K text embeddings is going to take quite some time. We have an issue with some rough notes on saving embeddings: #17 |
The main key for us is to be able to filter to specific regions and taxa only (in our case Panama and insects), so that even if it gets some things wrong it doesn't give something totally impossible (like a polar bear in Panama), and to have some reproducible way of setting up our script for other regions (like for collaborators in the US or Peru), by doing something like feeding it a different GBIF download. Here's the latest script I've been hacking on for a better picture. You can see I got it successfully working doing IDs on ROIs, thanks to you! |
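The idea of restricting results to a regional taxa list can be sketched in plain Python. This is a hypothetical post-processing step, not pybioclip's API: `filter_predictions`, the label names, and the scores are all made up for illustration, and the allowed set stands in for names taken from a GBIF download.

```python
def filter_predictions(scores, allowed):
    """Keep only labels in `allowed` and rescale their scores to sum to 1."""
    kept = {label: s for label, s in scores.items() if label in allowed}
    total = sum(kept.values())
    if total == 0:
        return {}  # nothing in the allowed set received any score
    return {label: s / total for label, s in kept.items()}


# Made-up scores for a single image crop.
scores = {
    "Ursus maritimus": 0.5,   # polar bear: impossible for a Panama insect survey
    "Morpho helenor": 0.25,
    "Atta cephalotes": 0.25,
}
# Hypothetical allowed list, e.g. species names parsed from a regional GBIF download.
panama_insects = {"Morpho helenor", "Atta cephalotes"}

filtered = filter_predictions(scores, panama_insects)
```

Swapping in a different allowed set is all it would take to reuse the same script for another region. Restricting the label list before scoring, as discussed in this thread, avoids computing embeddings for labels that would be discarded anyway.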
Ohhh and yeah, if I can save embeddings that would be amazing, because that's the thing that kinda takes the longest for us right now, I think! |
that would be incredible! |
@quitmeyer I pushed a notebook to a branch that shows how to filter the TOL classifier: https://github.com/Imageomics/pybioclip/blob/54-pil/FilterTOLExample.ipynb |
Thanks so much! I'll check it out and let you know how it goes! |
SO cool we can use PIL images now! |
Currently, to create predictions for multiple PIL images instead of image paths, users must call three functions.
Something like the following: