Commit
Milvus-doc-bot authored and Milvus-doc-bot committed Jan 29, 2024
1 parent 0f375f4 commit eb470c2
Showing 21 changed files with 278 additions and 91 deletions.
@@ -0,0 +1,55 @@
---
id: chunk_cache.md
title: Configure Chunk Cache
---

# Configure Chunk Cache

The chunk cache mechanism enables Milvus to pre-load data into a cache on the local hard disk of the query nodes before it is needed. This mechanism significantly improves vector retrieval performance by reducing the time it takes to load data from disk to memory.

## Background

Before conducting queries to retrieve vectors, you need to load the data from object storage into the cache on the local hard disk of the query nodes. This is a time-consuming process. Until all data is loaded, Milvus may respond to some vector retrieval requests with a delay.

To improve query performance, Milvus provides a chunk cache mechanism that pre-loads data from object storage into the cache on the local hard disk before it is needed. When a query request is received, Segcore first checks whether the data is in the cache rather than going to the object storage. If the data is in the cache, Segcore can quickly retrieve it and return the result to the client.
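
The cache-first lookup described above can be pictured with a short Python sketch. It is only an illustration of the read order, not Milvus or Segcore code: `ChunkCache`, `fetch_from_object_storage`, and the string chunk keys are hypothetical names introduced for this example.

```python
from typing import Callable, Dict, Iterable


class ChunkCache:
    """Toy cache-first reader (hypothetical; not Milvus or Segcore code)."""

    def __init__(self, fetch_from_object_storage: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_object_storage  # slow path: object storage
        self._cache: Dict[str, bytes] = {}       # stands in for the local cache
        self.misses = 0                          # reads that had to use the slow path

    def warm_up(self, keys: Iterable[str]) -> None:
        """Pre-load chunks before any query asks for them."""
        for key in keys:
            self._cache[key] = self._fetch(key)

    def read(self, key: str) -> bytes:
        """Check the cache first; fall back to object storage on a miss."""
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._fetch(key)
        return self._cache[key]
```

In this sketch, a read of a chunk that was warmed up never touches the slow path, while a read of any other chunk pays the object storage cost once and is served from the cache afterwards.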

## Configure Chunk Cache

This guide describes how to configure the chunk cache mechanism for a Milvus instance. The configuration method depends on how you installed the Milvus instance.

- For Milvus instances installed using Helm Charts

  Add the configuration to the `values.yaml` file under the `config` section. For details, refer to [Configure Milvus with Helm Charts](configure-helm.md).

- For Milvus instances installed using Docker Compose

  Add the configuration to the `milvus.yaml` file you have used to start the Milvus instance. For details, refer to [Configure Milvus with Docker Compose](configure-docker.md).

- For Milvus instances installed using Operator

  Add the configuration to the `spec.components` section of the `Milvus` custom resource. For details, refer to [Configure Milvus with Operator](configure_operator.md).

### Configuration options

```yaml
queryNode:
  cache:
    warmup: async
```

The `warmup` parameter determines whether Milvus pre-loads data from the object storage into the cache on the local hard disk of the query nodes before it is needed. This parameter defaults to `async`. Possible options are as follows:

- `async`: Milvus pre-loads data asynchronously in the background, which does not affect the time it takes to load a collection. However, users may experience a delay when retrieving vectors for a short period of time after the load process is complete.
- `sync`: Milvus pre-loads data synchronously, which may increase the time it takes to load a collection. However, users can perform queries immediately after the load process is complete without any delay.
- `off`: Milvus does not pre-load data into the cache.

Note that the chunk cache settings also apply when new data is inserted into collections or when collection indexes are rebuilt.
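
The trade-off between the three options can be modeled with a small Python sketch. This is a toy model of the behavior described above, not how Milvus implements warm-up: `load_collection` and `warm_up` are hypothetical names, and a background thread stands in for whatever Milvus does internally.

```python
import threading
from typing import Callable, Optional


def load_collection(warm_up: Callable[[], None],
                    mode: str = "async") -> Optional[threading.Thread]:
    """Toy model of the three `warmup` values (not Milvus code).

    Returns the background thread for "async", otherwise None.
    """
    if mode == "sync":
        warm_up()  # load blocks until the cache is warm
        return None
    if mode == "async":
        worker = threading.Thread(target=warm_up, daemon=True)
        worker.start()  # load returns at once; warm-up continues in the background
        return worker
    if mode == "off":
        return None  # nothing is pre-loaded
    raise ValueError(f"unknown warmup mode: {mode!r}")
```

With `sync`, the caller waits for `warm_up` to finish before the load returns. With `async`, the load returns immediately and queries issued before the worker finishes may be slower, which mirrors the short delay noted in the option list.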

### FAQ

- **How can I determine whether the chunk cache mechanism is working correctly?**

  Check the latency of a search or query request after loading a collection. If the latency is significantly higher than expected (for example, several seconds), the chunk cache mechanism may still be pre-loading data.

  If the query latency stays high for a long time, check the throughput of the object storage to confirm that the chunk cache is still working. In normal cases, a working chunk cache generates high throughput on the object storage. Alternatively, you can simply try the chunk cache in `sync` mode.
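
One way to run the latency check is to time the same request several times after the load completes and watch whether the numbers settle. The helper below is a minimal sketch under stated assumptions: `run_query` is a hypothetical zero-argument function that you supply (for example, one that wraps a search call from your client SDK), and the one-second threshold is an arbitrary placeholder, not a value from Milvus.

```python
import time
from typing import Callable, List


def sample_latencies(run_query: Callable[[], object],
                     samples: int = 5,
                     pause_s: float = 1.0) -> List[float]:
    """Time `run_query` several times and return the latencies in seconds."""
    latencies: List[float] = []
    for i in range(samples):
        start = time.perf_counter()
        run_query()
        latencies.append(time.perf_counter() - start)
        if i < samples - 1:
            time.sleep(pause_s)  # give background warm-up time to progress
    return latencies


def looks_warm(latencies: List[float], threshold_s: float = 1.0) -> bool:
    """True when the most recent sample is below the chosen threshold."""
    return bool(latencies) and latencies[-1] < threshold_s
```

If the samples start high and then drop, the cache was most likely still warming up. If they stay high, the object storage throughput check described above is the next step.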
@@ -0,0 +1,51 @@
---
id: limit_collection_counts.md
title: Set Limits on Collection Number
---

# Limit Collection Counts

A Milvus instance allows up to 65,536 collections. However, too many collections may result in performance issues. Therefore, it is recommended to limit the number of collections created in a Milvus instance.

This guide provides instructions on how to set limits on the number of collections in a Milvus instance.

Configuration varies with the way you install the Milvus instance.

- For Milvus instances installed using Helm Charts

  Add the configuration to the `values.yaml` file under the `config` section. For details, refer to [Configure Milvus with Helm Charts](configure-helm.md).

- For Milvus instances installed using Docker Compose

  Add the configuration to the `milvus.yaml` file you have used to start the Milvus instance. For details, refer to [Configure Milvus with Docker Compose](configure-docker.md).

- For Milvus instances installed using Operator

  Add the configuration to the `spec.components` section of the `Milvus` custom resource. For details, refer to [Configure Milvus with Operator](configure_operator.md).

## Configuration options

```yaml
rootCoord:
  maxGeneralCapacity: 1024
```

The `maxGeneralCapacity` parameter sets the maximum number of collections that the current Milvus instance can hold. The default value is `1024`.

## Calculating the number of collections

In a collection, you can set up multiple shards and partitions. Shards are logical units used to distribute data write operations among multiple data nodes. Partitions are logical units used to improve data retrieval efficiency by loading only a subset of collection data. When calculating the number of collections in the current Milvus instance, you also need to count the shards and partitions: each collection counts as its number of shards multiplied by its number of partitions.

For example, assume you have already created **100** collections, with **2** shards and **4** partitions in **60** of them and with **1** shard and **12** partitions in the remaining **40** collections. The current number of collections can be calculated as:

```
60 (collections) x 2 (shards) x 4 (partitions) + 40 (collections) x 1 (shard) x 12 (partitions) = 960
```
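
The same arithmetic can be written as a short Python sketch, which is handy when an instance has many differently shaped collections. The function name `general_capacity_used` and the tuple layout are assumptions made for this example and are not part of any Milvus API.

```python
from typing import Iterable, Tuple

# Each entry: (number of collections, shards per collection, partitions per collection)
Group = Tuple[int, int, int]


def general_capacity_used(groups: Iterable[Group]) -> int:
    """Sum shards x partitions over all collections, group by group."""
    return sum(count * shards * partitions for count, shards, partitions in groups)


# The example from the text: 60 collections with 2 shards and 4 partitions,
# plus 40 collections with 1 shard and 12 partitions.
existing = [(60, 2, 4), (40, 1, 12)]
print(general_capacity_used(existing))  # 960
```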

In the above example, you have already used **960** of the default limit of **1024**. If you now try to create a new collection with **4** shards and **20** partitions, which counts as 4 x 20 = 80, the total of 1,040 exceeds the maximum capacity and you receive the following error:

```shell
failed checking constraint: sum_collections(parition*shard) exceeding the max general capacity:
```

To avoid this error, you can reduce the number of shards or partitions in existing or new collections, delete some collections, or increase the `maxGeneralCapacity` value.
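
Before creating a collection, you can estimate whether it will fit. The following Python sketch is a rough pre-check built on the counting rule from this guide. The names `fits` and `max_partitions` are hypothetical, and the sketch assumes that a total exactly equal to `maxGeneralCapacity` is still accepted, which the text above does not confirm.

```python
def fits(used: int, shards: int, partitions: int,
         max_general_capacity: int = 1024) -> bool:
    """True if a new collection stays within the limit.

    Assumption: a total equal to the limit is still allowed.
    """
    return used + shards * partitions <= max_general_capacity


def max_partitions(used: int, shards: int,
                   max_general_capacity: int = 1024) -> int:
    """Largest partition count that still fits for a given shard count."""
    return max(0, (max_general_capacity - used) // shards)


# The example from the text: 960 already used, new collection with 4 shards and 20 partitions.
print(fits(960, 4, 20))                              # False: 960 + 80 = 1040 > 1024
print(fits(960, 4, 20, max_general_capacity=2048))   # True after raising the limit
```

Under the same assumption, `max_partitions(960, 4)` indicates how far the partition count of the new collection would need to drop to fit within the default limit.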