


- Similar Image Search
- Record User Feedback
This is the initial design for an image search engine. To process images efficiently, I have chosen a worker-based system that asynchronously processes any images uploaded to the /images folder. This approach is essential because converting an image to an embedding is computationally intensive and time-consuming, and new images may arrive continuously.
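A minimal sketch of the worker's detection step, assuming a simple polling design (the function names and the set of image extensions are illustrative, not the actual implementation):

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}

def find_new_images(folder: Path, processed: set) -> list:
    """Return image files in `folder` that have not been processed yet."""
    new_files = []
    for path in sorted(folder.iterdir()):
        if path.suffix.lower() in IMAGE_EXTENSIONS and path.name not in processed:
            new_files.append(path)
    return new_files

def worker_pass(folder: Path, processed: set, embed_and_store) -> int:
    """One polling pass: embed each new image and mark it processed."""
    new_files = find_new_images(folder, processed)
    for path in new_files:
        embed_and_store(path)  # e.g. call the model service, then store the vector
        processed.add(path.name)
    return len(new_files)
```

In the real worker this pass would run in a loop; keeping the "already processed" set outside the function means a restart could rebuild it from the vector store instead of re-embedding everything.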
In the future, we can enhance scalability by leveraging blob storage solutions such as Amazon S3 or a self-hosted object storage service.
For vector storage and similar image search, I have selected Qdrant, a vector database. This choice ensures the system can manage an ever-growing number of images effectively, providing robust and scalable similarity search functionality.
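Qdrant handles indexing and querying at scale; conceptually, each search is a cosine-similarity top-k over the stored embeddings. A small self-contained sketch of that core operation (plain Python with no Qdrant client, names illustrative):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k_matches(query_vec, image_vectors, k=5):
    """image_vectors: mapping of image name -> embedding vector.
    Returns the k most similar (name, score) pairs, best first."""
    scored = [(name, cosine_similarity(query_vec, vec))
              for name, vec in image_vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

The `score` field in the API response below corresponds to this kind of similarity value; Qdrant replaces the linear scan with an approximate nearest-neighbour index so the search stays fast as the collection grows.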
Additionally, the model (CLIP) responsible for generating embeddings is deployed as an independent service. This decoupled architecture makes it possible to replace or upgrade the model without affecting the rest of the system: if users report poor search results or a better-trained model becomes available, we can switch to it simply by updating the standalone deployment, ensuring continuous improvement in the system's accuracy and relevance.
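The decoupling can be expressed as a narrow interface between the API layer and the model service. The sketch below is illustrative only: the class and method names are assumptions, and the HTTP call to the model service is stubbed out with a placeholder.

```python
from typing import Protocol

class EmbeddingModel(Protocol):
    """The only surface the rest of the system depends on. Any service
    implementing it (CLIP today, a retrained model tomorrow) is a drop-in."""
    model_name: str
    def embed_text(self, text: str) -> list: ...

class ClipService:
    model_name = "CLIP"

    def embed_text(self, text: str) -> list:
        # In the real system this would be an HTTP call to the standalone
        # model service; here it returns a placeholder vector.
        return [float(len(text)), 1.0]

def search(model: EmbeddingModel, text: str) -> dict:
    """API layer: embed the query, then hand the vector to the vector store."""
    vector = model.embed_text(text)
    return {"text": text, "model_name": model.model_name, "vector": vector}
```

Because `search` only sees the `EmbeddingModel` interface, swapping CLIP for a new model is a deployment change, not a code change in the API layer.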
To record user feedback securely, the API uses JWTs for authentication, ensuring that feedback requests are valid and authorized.
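For illustration, an HS256 JWT can be signed and verified with the standard library alone. This is a sketch of the mechanism (no claims validation, hypothetical function names), not the service's actual auth code:

```python
import base64
import hashlib
import hmac
import json

def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign_jwt(payload: dict, secret: bytes) -> str:
    """Build a header.payload.signature token signed with HMAC-SHA256."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64url(json.dumps(payload).encode())
    signing_input = f"{header}.{body}".encode()
    sig = _b64url(hmac.new(secret, signing_input, hashlib.sha256).digest())
    return f"{header}.{body}.{sig}"

def verify_jwt(token: str, secret: bytes) -> bool:
    """Recompute the signature and compare in constant time."""
    header, body, sig = token.split(".")
    signing_input = f"{header}.{body}".encode()
    expected = _b64url(hmac.new(secret, signing_input, hashlib.sha256).digest())
    return hmac.compare_digest(sig, expected)
```

The search response below includes such a token, and the feedback endpoint verifies it so that each piece of feedback is tied to a real, unaltered search result.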
- Make command: Ensure make is installed on your system.
- Docker & Docker Compose: Required for containerized services.
```shell
# Build the service:
make build

# Run the service:
make run
```
Note: The startup process might take some time because the model service needs to download the weights for the CLIP model.
```shell
# Clean up the local environment:
make down
```
To upload images:
- Copy the image files to the /images folder located at the root of the project.
- The worker will automatically detect any new images in the folder and process them into embeddings.
```shell
curl --location 'http://localhost:3000/api/v1/search-image' \
--header 'Content-Type: application/json' \
--data '{
    "text": "tennis"
}'
```
Example response:
```jsonc
{
    "text": "tennis",              // The query text provided for the image search
    "model_name": "CLIP",          // The model used for processing (in this case, CLIP)
    "matches": [                   // Array of matched images with their respective scores
        {
            "image_name": "COCO_val2014_000000000962.jpg",  // Name of the matched image
            "score": 0.28906357    // Similarity score between the query text and the image
        }
    ],
    "jwt": "jwt_token_used_in_feedback"  // JWT token for authenticating the feedback request
}
```
User feedback scores range from 1 to 10.
```shell
curl --location 'http://localhost:3000/api/v1/create-feedback' \
--header 'Content-Type: application/json' \
--data '{
    "user_feedback": 5,
    "jwt": "jwt_token_used_in_feedback"
}'
```
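On the server side, the feedback body can be checked with a simple range validation before it is stored. This sketch is illustrative: the field names follow the request above, but the function name is an assumption.

```python
def validate_feedback(payload: dict) -> int:
    """Validate a feedback request body; return the score or raise ValueError."""
    score = payload.get("user_feedback")
    if not isinstance(score, int) or not 1 <= score <= 10:
        raise ValueError("user_feedback must be an integer from 1 to 10")
    if not payload.get("jwt"):
        raise ValueError("jwt is required to tie feedback to a search result")
    return score
```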