I've created a comprehensive codebase that covers all the sections of your workshop. Here's how the code is organized and can be used throughout the workshop:
The code includes:
- A basic OpenAI API call function
- A simple interactive chat implementation
- A LangChain example showing conversation memory
This demonstrates the fundamental ways to interact with an LLM and how LangChain can simplify these interactions.
The code provides functions for:
- Loading JSON documents
- Loading CSV files
- Fetching and parsing RSS feeds (simulated)
These functions show different document sources you can use in a RAG system.
Several text splitting strategies are implemented:
- Character-based splitting
- Token-based splitting
- Paragraph-based splitting
- Sentence-based splitting
There's also a demo function showing how each strategy affects the resulting chunks for comparison.
The code covers:
- Generating embeddings using OpenAI
- A simple in-memory vector store with cosine similarity search
- SQLite integration for document storage
- ChromaDB integration for vector storage
This demonstrates both DIY approaches and purpose-built vector databases.
Functions for:
- Basic RAG implementation
- Document summarization
- Advanced RAG with more context handling
Plus a complete end-to-end workflow example that ties all sections together.
For the actual workshop, I recommend:
-
Introduction (5 min)
- Explain RAG concepts and the workshop flow
- Set up environment variables for API keys
-
For each section:
- Present the concept (theory)
- Show the relevant code examples
- Let participants run the code
- Discuss the results and implications
-
Interactive elements:
- For @pavlik's practice sessions, have participants modify the code (e.g., changing prompt templates, adjusting chunk sizes)
- For document storage, compare results from different approaches
-
Final demonstration:
- Run the
completeWorkshopDemo()
function to show the entire workflow - Discuss potential optimizations and real-world applications
- Run the
Would you like me to elaborate on any specific section of the code or workshop structure?