Add multimodal embeddings guide (#3364)

guimachiavelli · dureuill · web-flow · commit 2854af9f5718 · 2025-09-30T16:23:30.000+02:00
---------

Co-authored-by: Louis Dureuil <louis@meilisearch.com>
diff --git a/docs.json b/docs.json
@@ -167,6 +167,7 @@
                   "learn/ai_powered_search/getting_started_with_ai_search",
                   "learn/ai_powered_search/configure_rest_embedder",
                   "learn/ai_powered_search/document_template_best_practices",
+                  "learn/ai_powered_search/image_search_with_multimodal_embeddings",
                   "learn/ai_powered_search/image_search_with_user_provided_embeddings",
                   "learn/ai_powered_search/search_with_user_provided_embeddings",
                   "learn/ai_powered_search/retrieve_related_search_results",
diff --git a/learn/ai_powered_search/image_search_with_multimodal_embeddings.mdx b/learn/ai_powered_search/image_search_with_multimodal_embeddings.mdx
@@ -0,0 +1,229 @@
+---
+title: Image search with multimodal embeddings
+description: This article shows you the main steps for performing multimodal text-to-image searches
+---
+
+This guide shows the main steps to search through a database of images using Meilisearch's experimental multimodal embeddings.
+
+## Requirements
+
+- A database of images
+- A Meilisearch project  
+- Access to a multimodal embedding provider (for example, [VoyageAI multimodal embeddings](https://docs.voyageai.com/reference/multimodal-embeddings-api))  
+
+## Enable multimodal embeddings
+
+First, enable the `multimodal` experimental feature:
+
+```sh
+curl \
+  -X PATCH 'MEILISEARCH_URL/experimental-features/' \
+  -H 'Content-Type: application/json' \
+  --data-binary '{
+    "multimodal": true
+  }'
+```
+
+You may also enable multimodal in your Meilisearch Cloud project's general settings, under "Experimental features".
+
+## Configure a multimodal embedder
+
+Much like other embedders, multimodal embedders must set their `source` to `rest` and explicitly declare their `url`. Depending on your chosen provider, you may also have to specify `apiKey`.
+
+All multimodal embedders must contain an `indexingFragments` field and a `searchFragments` field. Fragments are sets of embeddings built out of specific parts of document data.
+
+Fragments must follow the structure defined by the REST API of your chosen provider.
+
+### `indexingFragments`
+
+Use `indexingFragments` to tell Meilisearch how to send document data to the provider's API when generating document embeddings.
+
+For example, when using VoyageAI's multimodal model, an indexing fragment might look like this:
+
+```json
+"indexingFragments": {
+  "TEXTUAL_FRAGMENT_NAME": {
+    "value": {
+      "content": [
+        {
+          "type": "text",
+          "text": "A document named {{doc.title}} described as {{doc.description}}"
+        }
+      ]
+    }
+  },
+  "IMAGE_FRAGMENT_NAME": {
+    "value": {
+      "content": [
+        {
+          "type": "image_url",
+          "image_url": "{{doc.poster_url}}"
+        }
+      ]
+    }
+  }
+}
+```
+
+The example above requests Meilisearch to create two sets of embeddings during indexing: one for the textual description of an image, and another for the actual image.
+
+Any JSON string value appearing in a fragment is handled as a Liquid template, where you interpolate document data present in `doc`. In `IMAGE_FRAGMENT_NAME`, that's `image_url`, which outputs the plain URL string in the document field `poster_url`. In `TEXTUAL_FRAGMENT_NAME`, `text` contains a longer string contextualizing two document fields, `title` and `description`.
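
To make the interpolation step concrete, here is a minimal Python sketch of how a fragment value could be rendered against a document. It is not Meilisearch's implementation and not a Liquid engine: `render_fragment` and the `PLACEHOLDER` pattern are hypothetical helpers that only handle flat `{{doc.field}}` placeholders, and the sample document is made up for illustration.

```python
import re

# Assumption: only simple placeholders such as {{doc.title}} are handled.
# Real Liquid templates support filters, conditions, and nested access.
PLACEHOLDER = re.compile(r"\{\{\s*doc\.(\w+)\s*\}\}")


def render_fragment(value, doc):
    """Recursively render every string in a fragment value against `doc`.

    Raises KeyError when a referenced field is missing from the document.
    """
    if isinstance(value, str):
        return PLACEHOLDER.sub(lambda m: str(doc[m.group(1)]), value)
    if isinstance(value, list):
        return [render_fragment(item, doc) for item in value]
    if isinstance(value, dict):
        return {key: render_fragment(item, doc) for key, item in value.items()}
    return value


# The image fragment value from the example above.
image_fragment = {
    "content": [{"type": "image_url", "image_url": "{{doc.poster_url}}"}]
}
# Hypothetical document with the fields the guide's fragments reference.
document = {"id": 1, "title": "Alps", "poster_url": "https://example.com/alps.jpg"}

rendered = render_fragment(image_fragment, document)
print(rendered)
```

Strings without placeholders, such as `"image_url"` in `type`, pass through unchanged, so the rendered object keeps the shape the provider expects.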
+
+### `searchFragments`
+
+Use `searchFragments` to tell Meilisearch how to send search query data to the chosen provider's REST API when converting queries into embeddings:
+
+```json
+"searchFragments": {
+  "USER_TEXT_FRAGMENT": {
+    "value": {
+      "content": [
+        {
+          "type": "text",
+          "text": "{{q}}"
+        }
+      ]
+    }
+  },
+  "USER_SUBMITTED_IMAGE_FRAGMENT": {
+    "value": {
+      "content": [
+        {
+          "type": "image_base64",
+          "image_base64": "data:{{media.image.mime}};base64,{{media.image.data}}"
+        }
+      ]
+    }
+  }
+}
+```
+
+In this example, two modes of search are configured:
+
+1. A textual search based on the `q` parameter, which will be embedded as text
+2. An image search based on a [data URL](https://developer.mozilla.org/en-US/docs/Web/URI/Reference/Schemes/data) rebuilt from the `image.mime` and `image.data` fields in the `media` field of the query
+
+Search fragments have access to data present in the query parameters `media` and `q`.
+
+Each semantic search query for this embedder should match exactly one of its search fragments, so each fragment should have at least one disambiguating field.
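
One way to reason about disambiguation is to check which fragments a given query could fill in. The Python sketch below assumes a fragment matches when every placeholder it references resolves to a value in the query. That rule is an illustrative assumption, and Meilisearch's actual matching logic may differ. The helper names (`placeholders`, `is_present`, `matching_fragments`) are hypothetical.

```python
import re

# Captures dotted paths such as "q" or "media.image.data".
PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def placeholders(value):
    """Collect every placeholder path found in a fragment value."""
    if isinstance(value, str):
        return set(PLACEHOLDER.findall(value))
    if isinstance(value, list):
        return set().union(*(placeholders(item) for item in value))
    if isinstance(value, dict):
        return set().union(*(placeholders(item) for item in value.values()))
    return set()


def is_present(query, path):
    """Return True when the dotted path resolves to a non-null query value."""
    current = query
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return False
        current = current[key]
    return current is not None


def matching_fragments(search_fragments, query):
    """List the fragments whose placeholders can all be filled from the query."""
    return [
        name
        for name, fragment in search_fragments.items()
        if all(is_present(query, path) for path in placeholders(fragment["value"]))
    ]


# The two search fragments from the example above.
search_fragments = {
    "USER_TEXT_FRAGMENT": {
        "value": {"content": [{"type": "text", "text": "{{q}}"}]}
    },
    "USER_SUBMITTED_IMAGE_FRAGMENT": {
        "value": {
            "content": [
                {
                    "type": "image_base64",
                    "image_base64": "data:{{media.image.mime}};base64,{{media.image.data}}",
                }
            ]
        }
    },
}

text_query = {"q": "a mountain sunset with snow"}
image_query = {"media": {"image": {"mime": "image/jpeg", "data": "<BASE64_ENCODED_IMAGE>"}}}

print(matching_fragments(search_fragments, text_query))
print(matching_fragments(search_fragments, image_query))
```

Under this assumption, each query fills in exactly one fragment because the fragments reference disjoint fields (`q` versus `media.image`).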
+
+### Complete embedder configuration
+
+Your embedder should look similar to this example with all fragments and embedding provider data:
+
+```sh
+curl \
+  -X PATCH 'MEILISEARCH_URL/indexes/INDEX_NAME/settings' \
+  -H 'Content-Type: application/json' \
+  --data-binary '{
+    "embedders": {
+      "MULTIMODAL_EMBEDDER_NAME": {
+        "source": "rest",
+        "url": "https://api.voyageai.com/v1/multimodal-embeddings",
+        "apiKey": "VOYAGE_API_KEY",
+        "indexingFragments": {
+          "TEXTUAL_FRAGMENT_NAME": {
+            "value": {
+              "content": [
+                {
+                  "type": "text",
+                  "text": "A document named {{doc.title}} described as {{doc.description}}"
+                }
+              ]
+            }
+          },
+          "IMAGE_FRAGMENT_NAME": {
+            "value": {
+              "content": [
+                {
+                  "type": "image_url",
+                  "image_url": "{{doc.poster_url}}"
+                }
+              ]
+            }
+          }
+        },
+        "searchFragments": {
+          "USER_TEXT_FRAGMENT": {
+            "value": {
+              "content": [
+                {
+                  "type": "text",
+                  "text": "{{q}}"
+                }
+              ]
+            }
+          },
+          "USER_SUBMITTED_IMAGE_FRAGMENT": {
+            "value": {
+              "content": [
+                {
+                  "type": "image_base64",
+                  "image_base64": "data:{{media.image.mime}};base64,{{media.image.data}}"
+                }
+              ]
+            }
+          }
+        }
+      }
+    }
+  }'
+```
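
If you prefer to assemble the settings body programmatically instead of hand-writing nested JSON inside shell quotes, the Python sketch below builds the same kind of request without sending it. `build_embedder_request` is a hypothetical helper, the index and embedder names are placeholders, and the fragments are trimmed to one of each kind for brevity. The optional `meili_key` parameter assumes your instance is protected by an API key sent as a bearer token.

```python
import json
import urllib.request


def build_embedder_request(base_url, index, embedder_name, embedder, meili_key=None):
    """Build (but do not send) a PATCH request for the index settings route."""
    body = json.dumps({"embedders": {embedder_name: embedder}}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if meili_key:
        # Assumption: the instance requires a Meilisearch API key.
        headers["Authorization"] = f"Bearer {meili_key}"
    return urllib.request.Request(
        f"{base_url}/indexes/{index}/settings",
        data=body,
        headers=headers,
        method="PATCH",
    )


# Same structure as the curl example above, with one fragment of each kind.
embedder = {
    "source": "rest",
    "url": "https://api.voyageai.com/v1/multimodal-embeddings",
    "apiKey": "VOYAGE_API_KEY",
    "indexingFragments": {
        "IMAGE_FRAGMENT_NAME": {
            "value": {"content": [{"type": "image_url", "image_url": "{{doc.poster_url}}"}]}
        }
    },
    "searchFragments": {
        "USER_TEXT_FRAGMENT": {
            "value": {"content": [{"type": "text", "text": "{{q}}"}]}
        }
    },
}

request = build_embedder_request(
    "http://localhost:7700", "INDEX_NAME", "MULTIMODAL_EMBEDDER_NAME", embedder
)
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would send it. The sketch stops before that step so it can run without a Meilisearch instance.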
+
+## Add documents
+
+Once your embedder is configured, you can [add documents to your index](/learn/getting_started/cloud_quick_start) with the [`/documents` endpoint](/reference/api/documents).
+
+During indexing, Meilisearch will automatically generate multimodal embeddings for each document using the configured `indexingFragments`.
+
+## Perform searches
+
+The final step is to perform searches using different types of content.
+
+### Use text to search for images
+
+Use the following search query to retrieve a mix of documents with images matching the description and documents containing the specified keywords:
+
+```sh
+curl -X POST 'MEILISEARCH_URL/indexes/INDEX_NAME/search' \
+  -H 'Content-Type: application/json' \
+  --data-binary '{
+    "q": "a mountain sunset with snow",
+    "hybrid": {
+      "embedder": "MULTIMODAL_EMBEDDER_NAME"
+    }
+  }'
+```
+
+### Use an image to search for images
+
+You can also use an image to search for other, similar images:
+
+```sh
+curl -X POST 'MEILISEARCH_URL/indexes/INDEX_NAME/search' \
+  -H 'Content-Type: application/json' \
+  --data-binary '{
+    "media": {
+      "image": {
+        "mime": "image/jpeg",
+        "data": "<BASE64_ENCODED_IMAGE>"
+      }
+    },
+    "hybrid": {
+      "embedder": "MULTIMODAL_EMBEDDER_NAME"
+    }
+  }'
+```
+
+<Tip>
+In most cases you will need a graphical interface that lets users submit images and converts them to Base64. Creating this is outside the scope of this guide.
+</Tip>
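
The Base64 conversion itself is small. As a rough sketch, the Python helper below turns raw image bytes into the request body used in the image search above. `build_image_search_payload` is a hypothetical name, and the three sample bytes are only the start of a JPEG file, standing in for real image data.

```python
import base64
import json


def build_image_search_payload(image_bytes, mime, embedder):
    """Build the JSON-serializable body for an image-based search."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "media": {"image": {"mime": mime, "data": encoded}},
        "hybrid": {"embedder": embedder},
    }


# First three bytes of a JPEG file, used as a stand-in for a real image.
payload = build_image_search_payload(
    b"\xff\xd8\xff", "image/jpeg", "MULTIMODAL_EMBEDDER_NAME"
)
print(json.dumps(payload, indent=2))
```

With a real file you would pass its bytes instead, for example the result of reading an uploaded image in binary mode, and set `mime` to match the image type.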
+
+## Conclusion
+
+With multimodal embedders you can:
+
+1. Configure Meilisearch to embed both images and queries  
+2. Add image documents — Meilisearch automatically generates embeddings  
+3. Accept text or image input from users  
+4. Run hybrid searches using a mix of textual input and input from other types of media, or run pure semantic searches using only non-textual input
diff --git a/reference/api/settings.mdx b/reference/api/settings.mdx
@@ -2939,6 +2939,14 @@ For example, for [VoyageAI's multimodal embedding route](https://docs.voyageai.c
 
 Use Liquid templates to interpolate document data into the fragment fields, where `doc` gives you access to all fields within a document.
 
+<Warning>
+If a Liquid template appearing inside a fragment cannot be rendered, no embedding will be generated for that fragment and that document. If none of a document's indexing fragments can be rendered, the document will not be returned in multimodal searches. In most cases, a fragment is not rendered because a field it references is missing in the document.
+
+This is different from embeddings based on `documentTemplate`, which abort the indexing task if the document template cannot be rendered for a document.
+
+You can check which documents have embeddings for a given fragment using [vector filters](/learn/filtering_and_sorting/filter_expression_reference#vector-filters).
+</Warning>
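
Because a missing field silently skips a fragment, it can help to check documents before sending them. The Python sketch below reports, for one document, which fragments reference fields the document lacks. It is a simplified pre-flight check, not part of Meilisearch: `missing_fields_by_fragment` is a hypothetical helper, it only recognizes top-level `{{doc.field}}` references, and it treats a field as missing only when the key is absent.

```python
import json
import re

# Assumption: fragments reference top-level fields as {{doc.field}}.
DOC_FIELD = re.compile(r"\{\{\s*doc\.(\w+)")


def missing_fields_by_fragment(indexing_fragments, doc):
    """Map fragment names to the referenced fields absent from `doc`."""
    report = {}
    for name, fragment in indexing_fragments.items():
        # Serializing the value lets one regex scan every nested string.
        fields = set(DOC_FIELD.findall(json.dumps(fragment["value"])))
        missing = sorted(field for field in fields if field not in doc)
        if missing:
            report[name] = missing
    return report


indexing_fragments = {
    "TEXTUAL_FRAGMENT_NAME": {
        "value": {
            "content": [
                {
                    "type": "text",
                    "text": "A document named {{doc.title}} described as {{doc.description}}",
                }
            ]
        }
    },
    "IMAGE_FRAGMENT_NAME": {
        "value": {"content": [{"type": "image_url", "image_url": "{{doc.poster_url}}"}]}
    },
}

# Hypothetical document that has no poster yet.
document = {"id": 2, "title": "Untitled", "description": "No poster yet"}

print(missing_fields_by_fragment(indexing_fragments, document))
```

An empty report means every fragment has the fields it references. A non-empty one lists the fragments that would likely produce no embedding for that document.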
+
 `indexingFragments` is optional when using the `rest` source.
 
 `indexingFragments` is incompatible with all other embedder sources.
@@ -2974,19 +2982,31 @@ curl \
 
 As with `indexingFragments`, the content of `value` should follow your model's specification.
 
-Use Liquid templates to interpolate search query data into the fragment fields, where `media` gives you access to all multimodal data received with a query:
+Use Liquid templates to interpolate search query data into the fragment fields, where `{{media.*}}` gives you access to all [multimodal data received with a query](/reference/api/search#media) and `{{q}}` gives you access to the regular textual query:
 
 ```json
-"SEARCH_FRAGMENT_A": {
-  "value": {
-    "content": [
-      {
-        "type": "image_base64",
-        "image_base64": "data:{{media.image.mime}};base64,{{media.image.data}}"
-      }
-    ]
+{
+  "SEARCH_FRAGMENT_A": {
+    "value": {
+      "content": [
+        {
+          "type": "image_base64",
+          "image_base64": "data:{{media.image.mime}};base64,{{media.image.data}}"
+        }
+      ]
+    }
+  },
+  "SEARCH_FRAGMENT_B": {
+    "value": {
+      "content": [
+        {
+          "type": "text",
+          "text": "{{q}}"
+        }
+      ]
+    }
   }
-},
+}
 ```
 
 `searchFragments` is optional when using the `rest` source.