
Commit effd5f3

Milvus-doc-bot authored and committed
Release new docs
1 parent 8d7cfbf commit effd5f3

30 files changed: +75 -855 lines changed

v0.10.1/assets/IP.png

9.4 KB

v0.10.1/assets/normalization.png

11 KB

v0.10.1/assets/normalize.png

6.91 KB

v0.10.1/site/en/about_milvus/roadmap.md

-40
This file was deleted.

v0.10.1/site/en/about_milvus/traditional_db.md

-12
This file was deleted.

v0.10.1/site/en/contribute/contribute.md

-33
This file was deleted.

v0.10.1/site/en/faq/README.md

-4
This file was deleted.

v0.10.1/site/en/milvus_basics/index.md

+11 -11
@@ -71,30 +71,30 @@ IVF_FLAT is the most basic IVF index, and the encoded data stored in each unit i

 ### IVF_SQ8

-IVF_SQ8 does scalar quantization for each vector placed in the unit based on IVF. Scalar quantization converts each dimension of the original vector from a 4-byte floating-point number to a 1-byte unsigned integer, so the IVF_SQ8 index file occupies much less space than the IVF_FLAT index file. However, scalar quantization results in a loss of accuracy during searching vectors.
+IVF\_SQ8 does scalar quantization for each vector placed in the unit based on IVF. Scalar quantization converts each dimension of the original vector from a 4-byte floating-point number to a 1-byte unsigned integer, so the IVF\_SQ8 index file occupies much less space than the IVF\_FLAT index file. However, scalar quantization results in a loss of accuracy during searching vectors.

-- IVF_SQ8 has the same index building parameters as IVF_FLAT.
-- IVF_SQ8 has the same search parameters as IVF_FLAT.
+- IVF\_SQ8 has the same index building parameters as IVF\_FLAT.
+- IVF\_SQ8 has the same search parameters as IVF\_FLAT.

 ### IVF_SQ8H

-Optimized version of `IVF_SQ8` that requires both CPU and GPU to work. Unlike `IVF_SQ8`, `IVF_SQ8H` uses a GPU-based coarse quantizer, which greatly reduces time to quantize.
+Optimized version of IVF\_SQ8 that requires both CPU and GPU to work. Unlike IVF\_SQ8, IVF\_SQ8H uses a GPU-based coarse quantizer, which greatly reduces time to quantize.

-IVF_SQ8H is an IVF_SQ8 index that optimizes query execution.
+IVF\_SQ8H is an IVF\_SQ8 index that optimizes query execution.

 The query method is as follows:

 - If `NQ` ≥ `gpu_search_threshold`, GPU handles the entire query task.
 - If `NQ` < `gpu_search_threshold`, GPU handles the task of retrieving the `nprobe` nearest unit in the IVF index file, and CPU handles the rest.

-- IVF_SQ8H has the same index building parameters as IVF_FLAT.
-- IVF_SQ8H has the same search parameters as IVF_FLAT.
+- IVF\_SQ8H has the same index building parameters as IVF\_FLAT.
+- IVF\_SQ8H has the same search parameters as IVF\_FLAT.

 ### IVF_PQ

 `PQ` (Product Quantization) uniformly decomposes the original high-dimensional vector space into Cartesian products of `m` low-dimensional vector spaces, and then quantizes the decomposed low-dimensional vector spaces. Instead of calculating the distances between the target vector and the center of all the units, product quantization enables the calculation of distances between the target vector and the clustering center of each low-dimensional space and greatly reduces the time complexity and space complexity of the algorithm.

-IVF_PQ quantizes the product of vectors, and then performs IVF index clustering. Its index file is even smaller than IVF_SQ8, but it also causes a loss of accuracy during searching vectors.
+IVF\_PQ quantizes the product of vectors, and then performs IVF index clustering. Its index file is even smaller than IVF\_SQ8, but it also causes a loss of accuracy during searching vectors.

 - Index building parameters
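Reviewer note on the IVF\_SQ8 text in this hunk: the 4-byte-float to 1-byte-integer conversion it describes can be illustrated with a small sketch. This is not Milvus's implementation. The helper names `sq8_encode` and `sq8_decode`, and the choice of a single fixed `[lo, hi]` range for all dimensions, are assumptions made only for illustration.

```python
def sq8_encode(vector, lo, hi):
    """Map each float in [lo, hi] to an unsigned 8-bit code (0..255).

    Hypothetical helper: a uniform quantizer over one fixed range.
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    scale = 255.0 / (hi - lo)
    codes = bytearray()
    for x in vector:
        clamped = min(max(x, lo), hi)  # values outside the range saturate
        codes.append(int(round((clamped - lo) * scale)))
    return bytes(codes)


def sq8_decode(codes, lo, hi):
    """Reconstruct approximate floats from the 8-bit codes."""
    step = (hi - lo) / 255.0
    return [lo + c * step for c in codes]


vec = [-1.0, -0.5, 0.0, 0.25, 1.0]
codes = sq8_encode(vec, lo=-1.0, hi=1.0)
approx = sq8_decode(codes, lo=-1.0, hi=1.0)

# One byte per dimension instead of four, at the cost of a bounded
# reconstruction error of at most half a quantization step.
max_error = max(abs(a - b) for a, b in zip(vec, approx))
```

The reconstruction error is what the documentation refers to as the loss of accuracy during search: distances are computed on the approximate values rather than the original floats.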

@@ -105,7 +105,7 @@ IVF_PQ quantizes the product of vectors, and then performs IVF index clustering.

 **Example:** `{"nlist": 2048, "m": 16}`

-- IVF_PQ has the same search parameters as IVF_FLAT.
+- IVF\_PQ has the same search parameters as IVF\_FLAT.

 ### RNSG

@@ -154,8 +154,8 @@ Reference: <a href="https://arxiv.org/abs/1603.09320">Efficient and robust appro

 | Parameter | Description | Range |
 | ---------------- | ------------------ | --------- |
-| `M` | Maximum degree of the node | [5, 48] |
-| `efConstruction` | Search scope | [100, 500] |
+| `M` | Maximum degree of the node | [4, 64] |
+| `efConstruction` | Search scope | [8, 512] |

 **Example:** `{"M": 16, "efConstruction": 40}`
161161

v0.10.1/site/en/quick_start/get_started/create_drop_index_python.md

+6 -6
@@ -14,7 +14,7 @@ Currently, a collection only supports one index type. When you change the index
 <code>create_index()</code> specifies the index type of a collection and synchronously creates indexes for the previously inserted data. When the size of the subsequently inserted data reaches the <code>index_file_size</code>, Milvus automatically creates indexes in the background. For streaming data, it is recommended to create indexes before inserting the vector so that the system can automatically build indexes for the next data. For static data, it is recommended to import all the data at first and then create indexes. See the <a href="https://github.com/milvus-io/pymilvus/tree/master/examples/indexes">index sample program</a> for details about using index.
 </div>

-1. Prepare the parameters needed to create indexes (take `IVF_FLAT` as an example). The index parameters are stored in a JSON string, which is represented by a dictionary in the Python SDK.
+1. Prepare the parameters needed to create indexes (take IVF_FLAT as an example). The index parameters are stored in a JSON string, which is represented by a dictionary in the Python SDK.

 ```python
 # Prepare index param
@@ -27,11 +27,11 @@ Currently, a collection only supports one index type. When you change the index

 | Index Type | Index Parameter | Exmaple Parameter | Range |
 | ---------- | --------------- | ----------------- | ----- |
-| `IVFLAT` / `SQ8`/ `SQ8H` | `nlist`: The number of clusters to perform clustering operations on vector data files during index building. To facilitate later search, the index file records the results of the clustering operation, including the type of index, the center vector of each cluster, and the vectors in cluster.| `{nlist: 16384}` | `nlist`: [1, 999999] |
-| `IVFPQ` | `nlist`: The number of clusters to perform clustering operations on vector data files during index building. To facilitate later search, the index file records the results of the clustering operation, including the type of index, the center vector of each cluster, and the vectors in cluster. </br></br> `m`: The compression rate during index building. The smaller the `m`, the higher the compression rate. | `{nlist: 16384, m: 12}` | `nlist`: [1, 999999] </br></br> `m`: a value in {96, 64, 56, 48, 40, 32, 28, 24, 20, 16, 12, 8, 4, 3, 2, 1} |
-| `NSG` | `search_length`: The larger the value, the more nodes searched in the graph, the higher the recall rate, but the slower the speed. `search_length` should be less than `candidate_pool` and within [40, 80].</br></br> `out_degree`: The larger the value, the greater the memory usage and the better the search performance.</br></br> `candidate_pool`: The value affects the index quality and should be within [200,500].</br></br> `knng`: The value affects the index quality and should equal to `out_degree` + 20. | `{search_length: 45, out_degree:50, candidate_pool_size:300, knng:100}` | `search_length`: [10, 300]</br></br>`out_degree`: [5, 300]</br></br>`candidate_pool_size`: [50, 1000]</br></br>`knng`: [5, 300] |
-| `HNSW` | `M`: The value affects the build time and index quality. The larger the `M`, the longer it takes to build indexes, the higher the index quality, and the greater the memory footprint.</br></br> `efConstruction`: The value affects the build time and index quality. The larger the `efConstruction`, the longer it takes to build indexes, the higher the index quality, and the larger the memory footprint. | `{M: 16, efConstruction:500}` | `M`: [5, 48]</br>`efConstruction`: [100, 500] |
-| `ANNOY` | `n_trees`: The value affects the index building time and index size. The larger the value, the more accurate the search results, but the larger the index file. | `{"n_trees": 8}` | [1, 1024] |
+| IVF\_FLAT / IVF\_SQ8/ IVF\_SQ8H | `nlist`: The number of clusters to perform clustering operations on vector data files during index building. To facilitate later search, the index file records the results of the clustering operation, including the type of index, the center vector of each cluster, and the vectors in cluster. | `{nlist: 16384}` | `nlist`: [1, 65536] |
+| IVF\_PQ | `nlist`: The number of clusters to perform clustering operations on vector data files during index building. To facilitate later search, the index file records the results of the clustering operation, including the type of index, the center vector of each cluster, and the vectors in cluster. </br></br> `m`: The compression rate during index building. The smaller the `m`, the higher the compression rate. | `{nlist: 16384, m: 12}` | `nlist`: [1, 65536] </br></br> `m`: a value in {96, 64, 56, 48, 40, 32, 28, 24, 20, 16, 12, 8, 4, 3, 2, 1}, and the dimensions of the low-dimensional vector space must be in {1, 2, 3, 4, 6, 8, 10, 12, 16, 20, 24, 28, 32}. Besides, when computing with GPU, ensure that the result of m x 1024 does not exceed MaxSharedMemPerBlock of your graphics card. |
+| RNSG | `search_length`: The larger the value, the more nodes searched in the graph, the higher the recall rate, but the slower the speed. `search_length` should be less than `candidate_pool` and within [40, 80].</br></br> `out_degree`: The larger the value, the greater the memory usage and the better the search performance.</br></br> `candidate_pool`: The value affects the index quality and should be within [200,500].</br></br> `knng`: The value affects the index quality and should equal to `out_degree` + 20. | `{search_length: 45, out_degree:50, candidate_pool_size:300, knng:100}` | `search_length`: [10, 300]</br></br>`out_degree`: [5, 300]</br></br>`candidate_pool_size`: [50, 1000]</br></br>`knng`: [5, 300] |
+| HNSW | `M`: The value affects the build time and index quality. The larger the `M`, the longer it takes to build indexes, the higher the index quality, and the greater the memory footprint.</br></br> `efConstruction`: The value affects the build time and index quality. The larger the `efConstruction`, the longer it takes to build indexes, the higher the index quality, and the larger the memory footprint. | `{M: 16, efConstruction: 500}` | `M`: [4, 64]</br></br>`efConstruction`: [8, 512] |
+| ANNOY | `n_trees`: The value affects the index building time and index size. The larger the value, the more accurate the search results, but the larger the index file. | `{n_trees: 8}` | [1, 1024] |

 See [Milvus Index Type](index.md) for details.
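Reviewer note on the new IVF\_PQ constraints in this hunk: the added row combines several conditions on `nlist`, `m`, and the sub-vector dimension, which may be easier to follow as code. The sketch below is a hypothetical client-side check, not a Milvus or pymilvus function. It assumes that "the dimensions of the low-dimensional vector space" means `dim / m` and that `dim` must be divisible by `m`. The unit of `max_shared_mem_per_block` is not stated in the table, so the caller is expected to pass whatever value their GPU reports.

```python
PQ_ALLOWED_M = {96, 64, 56, 48, 40, 32, 28, 24, 20, 16, 12, 8, 4, 3, 2, 1}
PQ_ALLOWED_SUBDIM = {1, 2, 3, 4, 6, 8, 10, 12, 16, 20, 24, 28, 32}


def check_ivf_pq_params(dim, nlist, m, max_shared_mem_per_block=None):
    """Return a list of problems with IVF_PQ build parameters (illustrative)."""
    errors = []
    if not 1 <= nlist <= 65536:
        errors.append(f"nlist={nlist} is outside [1, 65536]")
    if m not in PQ_ALLOWED_M:
        errors.append(f"m={m} is not an allowed value")
    elif dim % m != 0:
        # Assumption: the vector must split evenly into m sub-vectors.
        errors.append(f"dim={dim} is not divisible by m={m}")
    elif dim // m not in PQ_ALLOWED_SUBDIM:
        errors.append(f"sub-vector dimension {dim // m} is not allowed")
    # GPU-only rule from the table: m x 1024 must fit in MaxSharedMemPerBlock.
    if max_shared_mem_per_block is not None and m * 1024 > max_shared_mem_per_block:
        errors.append(f"m x 1024 = {m * 1024} exceeds MaxSharedMemPerBlock")
    return errors


ok = check_ivf_pq_params(dim=128, nlist=16384, m=16)        # sub-dimension 8
too_wide = check_ivf_pq_params(dim=512, nlist=16384, m=8)   # sub-dimension 64
```

Under these assumptions, the table's example `{nlist: 16384, m: 12}` is only valid for vector dimensions that are multiples of 12 with an allowed quotient (for example 96 or 192), which may be worth stating next to the example.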

v0.10.1/site/en/quick_start/get_started/search_vector_python.md

+5 -5
@@ -20,11 +20,11 @@ Milvus supports searching vectors in a collection or partition.

 | Index Type | Search Parameter | Exmaple Parameter | Range |
 | ---------- | --------------- | ----------------- | ----- |
-| `FLAT` | - | | - |
-| `IVFLAT`/`SQ8`/`SQ8H`/`IVFPQ` | `nprobe`: The number of vector classes involved in the query. `nprobe` affects query accuracy. The larger the value, the higher the accuracy and the slower the speed. | `{nprobe: 32}` | [1, `nlist`] |
-| `NSG` | `search_length`: The larger the value, the more nodes to search in the graph, the higher the recall rate, and the slower the speed. | `{search_length:100}` | [10, 300] |
-| `HNSW` | `ef`: The larger the value, the more data to search in the index, the higher the recall rate, and the slower the speed. | `{ef: 64}` | [`top_k`, 4096] |
-| `ANNOY` | `search_k`: The value affects search performance. The larger the value, the more accurate the search results, but the longer the search time.</br>-1 represents the default value, taking 5% of the total data. | `{"search_k": -1}` | {-1} ∪ [`top_k`, ∞) |
+| FLAT | - | | - |
+| IVF\_FLAT / IVF\_SQ8/ IVF\_SQ8H / IVF\_PQ | `nprobe`: The number of vector classes involved in the query. `nprobe` affects query accuracy. The larger the value, the higher the accuracy and the slower the speed. | `{nprobe: 32}` | CPU: [1, nlist] </br> GPU: [1, min(2048, nlist)] |
+| RNSG | `search_length`: The larger the value, the more nodes to search in the graph, the higher the recall rate, and the slower the speed. | `{search_length: 100}` | [10, 300] |
+| HNSW | `ef`: The larger the value, the more data to search in the index, the higher the recall rate, and the slower the speed. | `{ef: 64}` | [`top_k`, 4096] |
+| ANNOY | `search_k`: The value affects search performance. The larger the value, the more accurate the search results, but the longer the search time.</br>-1 represents the default value, taking 5% of the total data. | `{search_k: -1}` | {-1} ∪ [`top_k`, n × n_trees] |

 <div class="alert note">
 <code>top_k</code> means searching the k vectors most similar to the target vector. It is defined during the search. The range of <code>top_k</code> is (0, 2048].
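
Reviewer note on the new search-parameter ranges in this hunk: the CPU/GPU split for `nprobe` and the dependence of `ef` on `top_k` can be sketched as follows. The function names are hypothetical and only mirror the ranges in the added table rows and the `top_k` note. They are not pymilvus calls.

```python
def nprobe_range(nlist, on_gpu):
    """Return the inclusive (low, high) range for nprobe given nlist."""
    upper = min(2048, nlist) if on_gpu else nlist
    return 1, upper


def check_ivf_search(top_k, nprobe, nlist, on_gpu=False):
    """Return a list of problems with IVF search parameters (illustrative)."""
    errors = []
    if not 0 < top_k <= 2048:
        errors.append(f"top_k={top_k} is outside (0, 2048]")
    low, high = nprobe_range(nlist, on_gpu)
    if not low <= nprobe <= high:
        errors.append(f"nprobe={nprobe} is outside [{low}, {high}]")
    return errors


def check_hnsw_search(top_k, ef):
    """HNSW: ef must lie in [top_k, 4096]."""
    return [] if top_k <= ef <= 4096 else [f"ef={ef} is outside [{top_k}, 4096]"]


cpu_range = nprobe_range(nlist=16384, on_gpu=False)
gpu_range = nprobe_range(nlist=16384, on_gpu=True)
```

With `nlist` set to 16384 as in the index example, an `nprobe` of 4096 passes the CPU check but fails the GPU check, which is the practical effect of the new GPU upper bound.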
