PaliGemma Android HF

This repository demonstrates running inference with the PaliGemma vision-language model on Android through the Hugging Face Gradio Client API, for tasks such as zero-shot object detection, image captioning, and visual question answering.
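For the zero-shot object detection task, PaliGemma returns boxes as text: each box is four `<locNNNN>` tokens in the order y_min, x_min, y_max, x_max on a 1024-bin grid, followed by the label, with boxes separated by `;`. The sketch below shows one way a client could turn such a string into pixel coordinates for drawing. It is not taken from this repository: the function name `parse_detections` and the dictionary keys are hypothetical, and it assumes the raw model string reaches the client unmodified.

```python
import re

# Four consecutive <locNNNN> tokens (y_min, x_min, y_max, x_max) followed by a label.
# Assumption: the response is the raw PaliGemma detection string.
_BOX = re.compile(r"<loc(\d{4})><loc(\d{4})><loc(\d{4})><loc(\d{4})>\s*([^;<]+)")


def parse_detections(text, width, height):
    """Convert a raw detection string into pixel-space boxes.

    `width` and `height` are the dimensions of the image the boxes
    will be drawn on. Returns a list of dicts (hypothetical layout).
    """
    boxes = []
    for y0, x0, y1, x1, label in _BOX.findall(text):
        boxes.append({
            "label": label.strip(),
            # Location tokens are bin indices on a 1024-step grid,
            # so dividing by 1024 gives a normalized coordinate.
            "left": int(x0) / 1024 * width,
            "top": int(y0) / 1024 * height,
            "right": int(x1) / 1024 * width,
            "bottom": int(y1) / 1024 * height,
        })
    return boxes


if __name__ == "__main__":
    sample = "<loc0256><loc0128><loc0768><loc0896> cat ; <loc0000><loc0000><loc0512><loc0512> dog"
    for box in parse_detections(sample, width=1024, height=512):
        print(box)
```

Text that contains no location tokens, such as a caption or an answer to a question, yields an empty list, so the same function can be called on any response without special-casing the task.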

Pipeline:

Demo Outputs:

Visual question answering, zero-shot object detection, and image captioning

Referring Expression Segmentation

Model used: Florence-2

Resources:

Citation

If you find this project useful for your work, please cite it using the following BibTeX entry:

@misc{PaliGemmaAndroidHF,
  author       = {Nitin Tiwari and Sagar Malhotra and Savio Rodrigues},
  title        = {PaliGemma on Android using Hugging Face API},
  year         = {2024},
  publisher    = {GitHub},
  howpublished = {\url{https://github.com/NSTiwari/PaliGemma-Android-HF}},
}

Acknowledgment

This project was developed during Google's ML Developer Programs AI Sprint. Thanks to the MLDP team for providing Google Cloud credits to support this project.