Identification Document Extraction using Fireworks AI

This project demonstrates a few end-to-end proof-of-concepts for extracting structured information from images of U.S. identification documents (drivers licenses, state IDs, and passports) using Fireworks AI.

This was done as a demo for Fireworks AI job application process in early 2025, which took 6 weeks, 6 interviews, a demo take home-home, defense of that take home, and then on completion, I got a no thank you from them. So, approximately 2 full weeks of unpaid time. They also made me sign paperwork to allow them to use the code as they see fit before the demo. This was the first time they heard of a validation agent according to the interviewer of the demo.

To Run Apps

streamlit run app.py
stremalit run test_passports.py

What's Inside

This is an agentic workflow for OCR processing. It sends the image ot Google for OCR, and then back to the DocType Operator to start the pipeline. You can see the approximate path below.

Key Features in Document Inlining

Document Inlining:
The solution leverages Fireworks AI’s Document Inlining feature to process mulitple images by embedding image data directly (via Base64 encoding) into the API request.
JSON Mode Structured Responses:
We use Fireworks AI’s JSON Mode to instruct the model to return results in a well-structured JSON format. This allows easy validation and further processing of the extracted data.

Setup and Installation (Docker removed for now)

Clone the repository:

git clone <your-repo-url>
cd <your-repo-directory>

Create a .env file with your key:

touch .env

Add these values to your .env:

BASE_URL="https://api.fireworks.ai/inference/v1"
FIREWORKS_API_KEY="xxxx"
GOOGLE_API_KEY="xxxx"

Run the apps:
```
pip install -r requirements.txt
```
To run the single image test:
```
streamlit run app.py
```
To run the multimage testing, you'll need to have images installed in synthetic_passports, along with an appropriate jsonl file. You can use the generate_passports.py file under the fine_tuning folder to select a template (hard coded), and then it should populate. From there, you simply come back, and run:
```
streamlit run test_passports.py
```

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
_archived		_archived
agents		agents
eval		eval
fine_tuning		fine_tuning
images		images
Dockerfile		Dockerfile
README.md		README.md
agents.py		agents.py
app.py		app.py
docker-compose.yml		docker-compose.yml
passport_test_results.json		passport_test_results.json
requirements.txt		requirements.txt
test_passports.py		test_passports.py
test_results.json		test_results.json
workflow.png		workflow.png

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Identification Document Extraction using Fireworks AI

To Run Apps

What's Inside

Key Features in Document Inlining

Setup and Installation (Docker removed for now)

About

Uh oh!

Releases

Packages

Languages

lesreaper/ocr-fireworks-agentic-workflow

Folders and files

Latest commit

History

Repository files navigation

Identification Document Extraction using Fireworks AI

To Run Apps

What's Inside

Key Features in Document Inlining

Setup and Installation (Docker removed for now)

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages