After you deploy the NVIDIA RAG Blueprint, use the following procedure to start testing and experimenting in the NVIDIA RAG Blueprint User Interface (RAG UI).
:::{important}
The RAG UI is provided as a sample and for experimentation only. It is not intended for your production environment.
:::
- Open a web browser and navigate to `http://localhost:8090` for a local deployment or `http://<workstation-ip-address>:8090` for a remote deployment. The RAG UI appears.
- Click New Collection to add a new collection of documents. The Create New Collection dialog appears.
- Choose some files to upload to the collection. Wait while the files are ingested.

  The following file types are supported:

  - Documents: `.pdf`, `.docx`, `.pptx`, `.txt`, `.md`, `.html`, `.json`
  - Images: `.png`, `.jpeg`, `.bmp`, `.tiff`
  - Audio: `.mp3`, `.wav`
  - Video: `.mp4`, `.mov`, `.avi`, `.mkv`

  :::{note}
  The UI file upload interface has a hard limit of 100 files per upload batch. If you select more than 100 files, only the first 100 are processed. For bulk uploads beyond this limit, use multiple upload batches or the programmatic API.
  :::

- Create two collections, one named test_collection_1 and one named test_collection_2.
- For Collections, add the two collections that you created.
- In Ask a question about your documents, submit a query related (or not) to the documents that you uploaded to the collections. You can query a minimum of 1 and a maximum of 5 collections. The generated answer appears in the chat interface.
- (Optional) Click Sources to view the documents that were used to generate the answer.
- (Optional) Click Settings to experiment with the settings to see the effect on generated answers.
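
Before you select files in the upload step, it can help to check them against the supported file types listed in the procedure. The following is a minimal sketch, not part of the RAG UI or its API. The function names are hypothetical, and the case-insensitive match is an assumption because this page does not say how the UI treats uppercase extensions.

```python
from pathlib import Path
from typing import Optional

# Extensions copied from the supported file types list in the procedure.
SUPPORTED_TYPES = {
    "documents": {".pdf", ".docx", ".pptx", ".txt", ".md", ".html", ".json"},
    "images": {".png", ".jpeg", ".bmp", ".tiff"},
    "audio": {".mp3", ".wav"},
    "video": {".mp4", ".mov", ".avi", ".mkv"},
}


def classify_file(path: str) -> Optional[str]:
    """Return the category for a file, or None if its extension is not listed.

    Assumption: extensions are compared case-insensitively.
    """
    suffix = Path(path).suffix.lower()
    for category, extensions in SUPPORTED_TYPES.items():
        if suffix in extensions:
            return category
    return None


def split_supported(paths: list[str]) -> tuple[list[str], list[str]]:
    """Partition paths into (supported, unsupported) by extension only."""
    supported = [p for p in paths if classify_file(p) is not None]
    unsupported = [p for p in paths if classify_file(p) is None]
    return supported, unsupported
```

The check looks only at the file name. It does not inspect file contents, so a file with a listed extension can still fail during ingestion.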
The chat interface provides several features beyond basic question answering.
You can attach images to your chat messages for visual analysis:
- Click the + icon in the chat input area
- Select Add image
- Choose one or more image files (JPEG, PNG, GIF, or WebP, up to 10MB each)
- The attached images appear as previews above the input
- Type your question and send
:::{note}
Image analysis requires VLM Inference to be enabled in Settings > Feature Toggles.
:::
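
The attachment constraints above (JPEG, PNG, GIF, or WebP, up to 10MB each) can be pre-checked with a sketch like the following. It is illustrative only and is not how the UI validates attachments. The function name is hypothetical, reading "10MB" as 10 × 1024 × 1024 bytes is an assumption, and so is accepting `.jpg` alongside `.jpeg`.

```python
from pathlib import Path

# Assumption: "10MB" is interpreted as 10 MiB; the UI may use 10,000,000 bytes.
MAX_IMAGE_BYTES = 10 * 1024 * 1024

# Formats named in the steps above; ".jpg" is assumed to be accepted as JPEG.
ALLOWED_IMAGE_SUFFIXES = {".jpeg", ".jpg", ".png", ".gif", ".webp"}


def check_attachment(name: str, size_bytes: int) -> list[str]:
    """Return a list of problems for one image attachment (empty if none)."""
    problems = []
    if Path(name).suffix.lower() not in ALLOWED_IMAGE_SUFFIXES:
        problems.append(f"{name}: unsupported image format")
    if size_bytes > MAX_IMAGE_BYTES:
        problems.append(f"{name}: larger than 10 MB")
    return problems
```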
When citations are enabled, responses include source references:
- Look for numbered citations in the response text
- Click Sources to expand the citations panel
- Each citation shows:
- The source document name
- A relevance score indicating how well the content matched your query
- A preview of the relevant text or image content
To clear your conversation history:
- Click the + icon in the chat input area
- Select Clear chat
- Confirm the action in the dialog
The RAG UI provides data catalog capabilities for organizing and managing your document collections with rich metadata.
When creating a collection, you can expand the Data Catalog section to specify:
- Description: A text description of the collection's purpose and contents
- Tags: Keywords or labels for categorization and discoverability
- Owner: The person or team responsible for the collection
- Business Domain: The organizational domain or department (e.g., Engineering, Legal, HR)
- Status: The collection's current state (Active, Archived, or Deprecated)
You can define custom metadata fields that apply to all documents in a collection:
- When creating a new collection, use the Metadata Schema Editor
- Click Add Field to create a new metadata field
- For each field, specify:
- Name: The field identifier
- Type: Choose from string, integer, float, boolean, array, or datetime
- When uploading documents, you can fill in values for each metadata field
Custom metadata enables advanced filtering when querying your collections.
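
To illustrate how typed metadata fields constrain the values you enter, the following sketch checks a set of values against a schema that uses the six field types listed above. It is a hypothetical helper, not the Metadata Schema Editor's implementation. The schema shape (a mapping from field name to type name), the example field names, and the mapping to Python types are all assumptions.

```python
from datetime import datetime

# One check per field type named in the Metadata Schema Editor description.
TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    # bool is a subclass of int in Python, so exclude it explicitly.
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "float": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "boolean": lambda v: isinstance(v, bool),
    "array": lambda v: isinstance(v, list),
    "datetime": lambda v: isinstance(v, datetime),
}


def validate_metadata(schema: dict[str, str], values: dict) -> list[str]:
    """Return one error message per value that does not fit the schema."""
    errors = []
    for name, value in values.items():
        field_type = schema.get(name)
        check = TYPE_CHECKS.get(field_type)
        if field_type is None:
            errors.append(f"{name}: not defined in the schema")
        elif check is None:
            errors.append(f"{name}: unknown field type {field_type}")
        elif not check(value):
            errors.append(f"{name}: expected {field_type}")
    return errors


# Hypothetical schema for illustration only.
example_schema = {"department": "string", "page_count": "integer", "reviewed": "boolean"}
```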
Click on any collection name in the sidebar to open the collection drawer. The drawer displays:
- Collection Catalog Info Panel: Shows all metadata including description, tags, owner, business domain, and status
- Content Metrics: Displays the total file count and content type indicators (tables, charts, images, audio)
- Documents List: All documents in the collection with their individual metadata
Each document in a collection can have:
- Description: A summary or note about the document
- Tags: Document-specific labels for filtering and organization
- Custom Metadata: Values for fields defined in the collection's metadata schema
To edit document information:
- Click on a collection to open the collection drawer
- Find the document you want to edit
- Click the pencil icon next to the document
- Update the description and/or tags
- Click Save to apply changes
You can filter query results based on document metadata using the Filter Bar.
- Click in the Filters area above the chat input
- Select a metadata field from the dropdown
- Choose an operator (varies by field type):
- Text fields: =, !=, like, in, not in
- Number fields: =, !=, >, <, >=, <=, in, not in
- Boolean fields: =, !=
- Datetime fields: before, after, =, !=, >, <
- Array fields: array_contains, array_contains_all, array_contains_any
- Enter or select a value
- Press Enter to add the filter
You can add multiple filters and combine them with AND or OR logic:
- Click the logic button between filters to toggle between AND/OR
- Remove filters by clicking the X on the filter chip
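
The operator rules above can be modeled as a small lookup table. The following sketch rejects operators that do not apply to a field type and renders the combined filters as a readable string. The `Filter` class, the `describe` function, and the output format are hypothetical. The string is for display only and is not the filter expression syntax that the backend accepts. As a simplification, the sketch applies one AND/OR choice to the whole list, whereas the UI lets you toggle the logic between each pair of filters.

```python
from dataclasses import dataclass

# Operators per field type, as listed in the Filter Bar steps above.
OPERATORS_BY_TYPE = {
    "text": {"=", "!=", "like", "in", "not in"},
    "number": {"=", "!=", ">", "<", ">=", "<=", "in", "not in"},
    "boolean": {"=", "!="},
    "datetime": {"before", "after", "=", "!=", ">", "<"},
    "array": {"array_contains", "array_contains_all", "array_contains_any"},
}


@dataclass(frozen=True)
class Filter:
    field: str
    field_type: str
    operator: str
    value: object

    def __post_init__(self):
        # Reject operators that the field type does not support.
        allowed = OPERATORS_BY_TYPE.get(self.field_type)
        if allowed is None:
            raise ValueError(f"unknown field type: {self.field_type}")
        if self.operator not in allowed:
            raise ValueError(
                f"{self.operator!r} is not valid for {self.field_type} fields"
            )


def describe(filters: list[Filter], logic: str = "AND") -> str:
    """Join filters with a single AND/OR connective (display string only)."""
    if logic not in {"AND", "OR"}:
        raise ValueError("logic must be AND or OR")
    parts = [f"{f.field} {f.operator} {f.value!r}" for f in filters]
    return f" {logic} ".join(parts)
```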
The RAG UI supports automatic document summarization during ingestion. When enabled, the system generates AI-powered summaries for each uploaded document.
- Open the collection drawer by clicking on a collection name
- Click Add Sources to open the upload panel
- Expand the Collection Configuration section
- Toggle Document Summarization to enable or disable
:::{note}
Document summarization may increase processing time and costs depending on your deployment configuration. Summaries are generated asynchronously after document ingestion completes.
:::
Once documents are ingested with summarization enabled:
- Open the collection drawer
- Expand any document in the documents list
- The summary appears below the document metadata (if available)
- Click on the summary to expand/collapse the full text
The Settings panel provides configuration options for customizing RAG behavior. Access it by clicking the Settings icon in the header.
Fine-tune the retrieval and generation parameters:
| Setting | Description | Range |
|---|---|---|
| Temperature | Controls randomness in responses. Higher = more creative, lower = more focused. | 0.0 - 1.0 |
| Top P | Limits token selection to the most likely tokens whose cumulative probability reaches this value. Lower = more focused. | 0.0 - 1.0 |
| Confidence Score Threshold | Minimum confidence for document relevance. Higher = more selective. | 0.0 - 1.0 |
| Vector DB Top K | Number of documents to retrieve from the vector database. | 1 - 400 |
| Reranker Top K | Number of documents to return after reranking. | 1 - 50 |
| Max Tokens | Maximum number of tokens in the generated response. | Varies |
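
If you script experiments with these settings, a range check like the following sketch can catch values that fall outside the ranges in the table. The setting key names are hypothetical and are not the names used by the UI or the API. Max Tokens is omitted because the table gives no fixed range for it.

```python
# Inclusive ranges copied from the settings table above.
# Key names are hypothetical, chosen for this sketch only.
RANGES = {
    "temperature": (0.0, 1.0),
    "top_p": (0.0, 1.0),
    "confidence_threshold": (0.0, 1.0),
    "vdb_top_k": (1, 400),
    "reranker_top_k": (1, 50),
}


def out_of_range(settings: dict) -> dict:
    """Return {name: (value, low, high)} for each setting outside its range.

    Settings without a documented range are ignored.
    """
    problems = {}
    for name, value in settings.items():
        if name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:
                problems[name] = (value, low, high)
    return problems
```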
Enable or disable various features:
| Feature | Description | Default |
|---|---|---|
| Enable Reranker | Uses reranking to improve document relevance. | Enabled |
| Include Citations | Adds source citations to responses. | Enabled |
| Use Guardrails | Applies NeMo Guardrails for safety filtering. | Disabled |
| Query Rewriting | Rewrites queries for better retrieval. | Disabled |
| VLM Inference | Enables vision-language model for image analysis. | Disabled |
| Filter Generator | Auto-generates metadata filters from queries. | Disabled |
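
The defaults in the table above can be captured as a baseline that individual experiments override. This is a sketch only. The key names are hypothetical and do not correspond to documented configuration keys.

```python
# Defaults copied from the feature toggle table above.
# Key names are hypothetical, chosen for this sketch only.
DEFAULT_TOGGLES = {
    "enable_reranker": True,
    "include_citations": True,
    "use_guardrails": False,
    "query_rewriting": False,
    "vlm_inference": False,
    "filter_generator": False,
}


def apply_overrides(overrides: dict[str, bool]) -> dict[str, bool]:
    """Return a new toggle mapping with overrides applied to the defaults."""
    unknown = set(overrides) - set(DEFAULT_TOGGLES)
    if unknown:
        raise ValueError(f"unknown toggles: {sorted(unknown)}")
    # Build a new dict so DEFAULT_TOGGLES itself is never modified.
    return {**DEFAULT_TOGGLES, **overrides}
```

For example, `apply_overrides({"vlm_inference": True})` returns the defaults with only VLM Inference switched on, which is the toggle that image analysis requires.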
Configure the AI models used for different tasks:
- Chat/LLM Model: The language model for generating responses
- Embedding Model: The model for creating document embeddings
- Reranker Model: The model for reranking retrieved documents
Set up custom API endpoints for LLM, embedding, and reranker services.
The RAG UI provides real-time notifications for tracking document ingestion and system health.
When you upload documents:
- A notification appears showing the upload task
- Progress updates display as documents are processed
- The notification shows completion status (success or failure)
- Click on a notification to view details
The UI automatically monitors backend services and shows notifications when issues are detected:
- Databases: Vector database (Milvus/Elasticsearch) connectivity
- NIM Services: LLM, embedding, and reranker model availability
- Processing: Document ingestion service status
Health notifications include the service name, error details, and response time to help with troubleshooting.
- Click the bell icon in the header to view all notifications
- Notifications show the collection name, document count, and status
- Use Clear All to remove all notifications
- Old notifications are automatically cleaned up after 24 hours
The following issues might arise when you work with the RAG UI:
- If you try to upload multiple files at the same time, you might see an error similar to `Error uploading documents: { code: 'ECONNRESET' }`. In this case, use the API directly for bulk uploading.
- The RAG UI has a hard limit of 100 files per upload batch. For larger uploads, use multiple batches or the API. The default timeout for file uploads is 1 hour.
- Immediately after document ingestion, there might be a delay before the UI accurately reflects the number of documents in a collection.
- Document summaries may take additional time to generate after ingestion completes. The UI shows "Generating summary..." until the process finishes.
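
To stay within the 100-file limit when you have a larger set of files, you can split the file list into batches before uploading. The following is a minimal sketch of the batching step only. The function name is hypothetical, and the sketch does not call the upload API.

```python
from typing import Iterator

# Limit stated on this page: 100 files per upload batch.
MAX_FILES_PER_BATCH = 100


def batches(paths: list[str], size: int = MAX_FILES_PER_BATCH) -> Iterator[list[str]]:
    """Yield consecutive slices of paths, each with at most `size` entries."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(paths), size):
        yield paths[start:start + size]
```

With 250 files, this yields three batches of 100, 100, and 50 files, which you can then upload one batch at a time.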