Event information
🗓️: Thursday Feb 13, 2025
⏰: 9 am PST / 11 am CST / 12 pm EST / 5 pm GMT
Duration: 1 hour 30 mins
Event recording will be available soon
Check resources: code, presentation slides, etc.
- Welcome
- Quick intro about AI Alliance (3 min)
- Milvus introduction (15 mins)
- Workshop: RAG pipeline With Data Prep Kit + Milvus + Granite (60 mins)
- Q&A (10 mins)
- Wrap-up
by Stefan Webb @ Zilliz
Milvus is a popular, open-source vector database. In this talk Stefan will walk through some of the core features of Milvus.
by Sujee Maniyam @ Node51
In this workshop we will do the following:
- Extract content from various documents (PDFs, DOCX, HTML) using Docling.
- Use Data Prep Kit to streamline data preparation, including markup removal, de-duplication, removal of problematic data such as spam, chunking, and embedding creation.
- Vector Database Integration: We will use Milvus, a popular open source vector DB, to manage and search vectorized data effectively.
- Use an open source LLM such as Meta Llama or IBM Granite to answer questions about the documents.
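
The steps above can be sketched end to end in a few lines of plain Python. This is only an illustration of the data flow, not the workshop code: every function and class name here (`chunk_text`, `dedupe`, `embed`, `ToyVectorStore`, `build_prompt`) is hypothetical, the bag-of-words "embedding" stands in for a real embedding model, the in-memory store stands in for Milvus, and the final prompt is what would be sent to an LLM such as Granite. Document extraction (the Docling step) is assumed to have already produced plain text.

```python
import hashlib
import math
import re
from collections import Counter


def chunk_text(text, max_words=40):
    """Split text into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def dedupe(chunks):
    """Drop exact duplicates (ignoring case and whitespace), keeping the first copy."""
    seen, unique = set(), []
    for chunk in chunks:
        normalized = " ".join(chunk.lower().split())
        key = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(chunk)
    return unique


def embed(text):
    """Toy stand-in for an embedding model: a sparse token-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """In-memory stand-in for a vector database (assumption: not the Milvus API)."""

    def __init__(self):
        self.rows = []  # list of (chunk_text, vector) pairs

    def insert(self, chunks):
        for chunk in chunks:
            self.rows.append((chunk, embed(chunk)))

    def search(self, query, top_k=2):
        query_vec = embed(query)
        ranked = sorted(self.rows, key=lambda row: cosine(query_vec, row[1]), reverse=True)
        return [text for text, _ in ranked[:top_k]]


def build_prompt(question, context_chunks):
    """Assemble the text that would be sent to the LLM."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Already-extracted document text (the extraction step is assumed done).
documents = [
    "Milvus is an open source vector database.",
    "Milvus is an open source vector database.",  # exact duplicate, removed below
    "Docling extracts content from PDF, DOCX and HTML files.",
    "Granite is a family of open source LLMs from IBM.",
]

chunks = dedupe([chunk for doc in documents for chunk in chunk_text(doc)])
store = ToyVectorStore()
store.insert(chunks)

question = "Which vector database is open source?"
hits = store.search(question, top_k=1)
prompt = build_prompt(question, hits)
print(prompt)
```

In a real pipeline each stand-in would be swapped for the corresponding component: Data Prep Kit transforms for cleaning and chunking, an embedding model for `embed`, and a Milvus collection for `ToyVectorStore`.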
Here are the components used in this RAG pipeline:
- Docling
- Data Prep Kit
- Milvus
- Granite
What do you need to participate in this workshop?
To get the most out of this hands-on workshop, we recommend the following:
- A laptop with a Python development environment (setup instructions are here)
- A Replicate account (FREE) - get one at replicate.com
Session Type:
Hands-on workshop
Audience:
LLM app developers, data scientists, data engineers
Technical Level:
Intermediate
Prerequisites:
None
📊 Presentation: RAG with Data Prep Kit + Milvus + Granite
💻 Code
https://github.com/IBM/data-prep-kit
🖥️ code
🙋 Ask questions, get help, and give us feedback at the Data Prep Kit discussion forum
Developer Evangelist @ Zilliz
AI Engineer, Developer Advocate @ Node51 (Consulting for IBM / The AI Alliance)
Sujee Maniyam is an expert in Generative AI, Machine Learning, Deep Learning, Big Data, Distributed Systems, and Cloud technologies. He is passionate about developer education and fostering community engagement. Sujee has led numerous training sessions, hackathons, and workshops. He is also an author, open source contributor, and frequent speaker at conferences and meetups.
LinkedIn •
💼 Portfolio