Commit f60c6b1

added 2025-01-30 session
1 parent 8c4d404 commit f60c6b1

2 files changed

+84
-1
lines changed

README.md

+1-1
@@ -28,4 +28,4 @@ You can find details, recordings and resources for events below.
 - **2025-Feb-20:** How to contribute to an open source project
 - **2025-Feb-13:** Open source RAG pipeline using Data Prep Kit + Milvus + Granite
 - **2025-Feb-06:** Using Data Prep Kit to process data
-- **2025-Jan-30:** Using Docling to process data
+- **2025-Jan-30:** [Data processing using Docling](events/2025-01-30__docling.md)

events/2025-01-30__docling.md

+83
@@ -0,0 +1,83 @@
# Data Preparation with Docling (2025-Jan-30)

<!-- ## 🔗 [#](#) -->

<!-- <img src="../assets/qrcode_2025-02-27__data-prep-review.png" width="400px"> -->

## Event Details

[Event information](https://www.meetup.com/ibm-developer-sf-bay-area-meetup/events/305798918/){:target="_blank" rel="noopener"}<br>
🗓️: **Thursday Jan 30, 2025** <br>
⏰: **9 am PST / 11 am CST / 12 pm EST / 5 pm GMT**<br>
Duration: **1 hour**

**[Event recording is available](https://www.youtube.com/watch?v=RyapKHqom9Q){:target="_blank" rel="noopener"}**

**[Check resources](#resources)** - code, presentation slides, etc.

---

## Agenda

- Welcome
- Quick intro about AI Alliance (3 mins)
- Presentation: Data preparation with Docling (40 mins)
- Q&A (10 mins)
- Wrap-up

## Session: Data preparation with Docling

### About

When building machine learning applications, a significant portion of your time will be dedicated to data wrangling, starting with content extraction from documents such as PDF, DOCX, etc.

Docling is an open source, versatile document processor that handles various file types.

In this session, I will introduce Docling and walk through some of its core features, followed by a demo that attendees can run alongside.


More about [Docling](https://github.com/DS4SD/docling){:target="_blank" rel="noopener"}


**Session Type:**
Presentation and Demo

**Audience**:
LLM app developers, data scientists, data engineers

**Technical Level**:
Beginner - Intermediate

**Prerequisites**:
None

### Resources


📺 **Presentation**: [Data Prep for LLM Applications with Docling - Part 1](https://docs.google.com/presentation/d/1SkghvqrdTo9wIAye36jO_KNVTWbO6v5bqfVI7CWA3-g/edit?usp=sharing){:target="_blank" rel="noopener"}


💻 **Code**

https://github.com/sujee/data-prep-kit-examples


Docling code examples are in [/docling](https://github.com/sujee/data-prep-kit-examples/tree/main/docling){:target="_blank" rel="noopener"}

You can walk through the README to set up a local Docling environment.

Let's start with this notebook: [docling_1_intro.ipynb](https://github.com/sujee/data-prep-kit-examples/blob/main/docling/docling_1_intro.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/sujee/data-prep-kit-examples/blob/main/docling/docling_1_intro.ipynb){:target="_blank" rel="noopener"}

#### Support and Community

🙋 Ask questions, get help, and give us feedback on the [Data Prep Kit discussion forum](https://github.com/IBM/data-prep-kit/discussions){:target="_blank" rel="noopener"}

## Speaker: Sujee Maniyam

**AI Engineer, Developer Advocate @ Node51 (Consulting for [IBM / The AI Alliance](https://thealliance.ai/))** <br>

Sujee Maniyam is an expert in Generative AI, Machine Learning, Deep Learning, Big Data, Distributed Systems, and Cloud technologies. He is passionate about developer education and fostering community engagement. Sujee has led numerous training sessions, hackathons, and workshops. He is also an author, open source contributor, and frequent speaker at conferences and meetups.

[email protected] &nbsp;&nbsp;
<img src="../assets/linkedin.svg" width="16 px"> [LinkedIn](https://www.linkedin.com/in/sujeemaniyam/){:target="_blank" rel="noopener"} &nbsp;&nbsp;
💼 [portfolio](https://sujee.dev/portfolio){:target="_blank" rel="noopener"}
