A tiny learning project to understand AI APIs, prompts, tokens, and structured output.
This is a small experimental project I built to understand how an AI Vision API behaves when used inside real code. The goal wasn’t to create a polished app — just to learn:
- how to send images to an AI model
- how prompts affect the response
- how token usage works
- how to request structured JSON
- how models handle errors or unexpected inputs
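The first two points above can be sketched as code. This is a minimal, hedged example of how an image is typically packaged for a vision-capable chat model; the message shape follows the OpenAI Chat Completions API, while the prompt text and the commented-out model name are illustrative, not necessarily what this project uses:

```typescript
// Shape of one content part in a vision request (text or image),
// following the OpenAI Chat Completions message format.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

// Builds the messages array for a single user turn that pairs a
// text prompt with a base64-encoded image (sent as a data URL).
function buildVisionMessages(imageBase64: string, prompt: string) {
  const content: ContentPart[] = [
    { type: "text", text: prompt },
    {
      type: "image_url",
      image_url: { url: `data:image/jpeg;base64,${imageBase64}` },
    },
  ];
  return [{ role: "user" as const, content }];
}

// The messages would then go to the SDK, e.g.:
// const response = await client.chat.completions.create({
//   model: "gpt-4o-mini", // any vision-capable model
//   messages: buildVisionMessages(b64, "What book is on this cover?"),
// });
```

Changing only the `prompt` string while keeping the image fixed is a simple way to see how prompts affect the response.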
The project classifies a book cover image and tries to extract basic details like:
- title
- author
- number of pages (if detectable)
It also displays:
- input tokens used
- output tokens generated
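The extraction and token-display steps can be sketched as follows. The parsing function is a defensive pattern for structured JSON output, with field names (`title`, `author`, `pages`) mirroring the details listed above; the `usage` fields in the trailing comment are the real names in an OpenAI Chat Completions response, but the rest is an assumption about how this project is wired up:

```typescript
// The details the prompt asks the model to return as JSON.
interface BookDetails {
  title: string | null;
  author: string | null;
  pages: number | null;
}

// Parses a model reply defensively: strips an optional ```json fence,
// then validates each field's type, returning null on unparseable input.
function parseBookReply(raw: string): BookDetails | null {
  try {
    const cleaned = raw
      .replace(/^\s*`{3}(?:json)?\s*/i, "") // leading code fence, if any
      .replace(/`{3}\s*$/, "")              // trailing code fence, if any
      .trim();
    const data = JSON.parse(cleaned);
    return {
      title: typeof data.title === "string" ? data.title : null,
      author: typeof data.author === "string" ? data.author : null,
      pages: typeof data.pages === "number" ? data.pages : null,
    };
  } catch {
    return null; // not valid JSON: treat as "could not extract"
  }
}

// Token counts come straight off the API response:
// console.log(response.usage?.prompt_tokens, response.usage?.completion_tokens);
```

Wrapping `JSON.parse` like this matters because models sometimes add a Markdown fence or prose around the JSON even when asked not to.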
Note: The UI is intentionally simple — this project was built for learning, not design.

Features:
- Upload a book cover image
- Vision model attempts to extract book details
- Handles incorrect images (non-books) gracefully
- Displays token usage for learning purposes
- Minimal UI — built only for experimentation
Create your `.env` file:

The `.env.example` file shows the correct format. Never commit your real API key.
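For illustration only, the file might look like the sketch below; the variable name `VITE_OPENAI_API_KEY` is an assumption made here, so check `.env.example` for the actual name the project expects:

```
# .env — keep this file out of version control
VITE_OPENAI_API_KEY=sk-...your-key-here...
```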
This is a learning experiment, not production code.
The project uses `dangerouslyAllowBrowser: true` for quick local testing.
Do not expose real API keys in public repos or production builds.
- UI is intentionally minimal.
- Model behavior may vary depending on the image.
- The project is expected to evolve as I learn more about LLMs, prompting, and model parameters.



