Event information
🗓️: Thursday Jan 30, 2025
⏰: 9 am PST / 11 am CST / 12 pm EST / 5 pm GMT
Duration: 1 hour
Event recording is available
Check the resources: code, presentation slides, etc.
- Welcome
- Quick intro about AI Alliance (3 min)
- Presentation: Data preparation with Docling (40 mins)
- Q&A (10 mins)
- Wrap-up
When building machine learning applications, a significant portion of your time will be dedicated to data wrangling, starting with content extraction from documents such as PDF and DOCX files.
Docling is an open source, versatile document processor that handles various file types.
In this session, I will introduce Docling, walk through some of its core features, and show a demo that attendees can run alongside.
More about Docling
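To give a flavor of the content-extraction step described above, here is a minimal sketch of converting a document to Markdown with Docling's `DocumentConverter`. It is not taken from the session notebooks; the helper names `select_documents` and `to_markdown` and the `CANDIDATE_SUFFIXES` set are hypothetical, chosen only for illustration, and the sketch assumes Docling is installed (for example with `pip install docling`).

```python
from pathlib import Path

# Illustrative subset of input types (the two formats named in the abstract).
# This is an assumption for the sketch, not Docling's official format list.
CANDIDATE_SUFFIXES = {".pdf", ".docx"}


def select_documents(paths):
    """Keep only the paths whose extension is in CANDIDATE_SUFFIXES."""
    return [Path(p) for p in paths if Path(p).suffix.lower() in CANDIDATE_SUFFIXES]


def to_markdown(source):
    """Convert one document (local path or URL) to Markdown text with Docling."""
    # Imported lazily so select_documents() works even without Docling installed.
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()
    result = converter.convert(str(source))
    # result.document is Docling's unified document representation.
    return result.document.export_to_markdown()
```

With Docling installed, calling `to_markdown` on each path returned by `select_documents` yields one Markdown string per document, which can then be chunked or indexed for an LLM application.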
Session Type:
Presentation and Demo
Audience:
LLM app developers, data scientists, data engineers
Technical Level:
Beginner - Intermediate
Prerequisites:
None
📺 Presentation: Data Prep for LLM Applications with Docling - Part 1
💻 Code
https://github.com/sujee/data-prep-kit-examples
Docling code examples are in /docling
You can walk through the README to set up a local Docling environment.
Let's start with this notebook: docling_1_intro.ipynb. It can be run on Google Colab.
🙋 Ask questions, get help, and give us feedback at the Data Prep Kit discussion forum
AI Engineer, Developer Advocate @ Node51 (Consulting for IBM / The AI Alliance)
Sujee Maniyam is an expert in Generative AI, Machine Learning, Deep Learning, Big Data, Distributed Systems, and Cloud technologies. He is passionate about developer education and fostering community engagement. Sujee has led numerous training sessions, hackathons, and workshops. He is also an author, open source contributor, and frequent speaker at conferences and meetups.
[email protected] •
LinkedIn •
💼 Portfolio