Commit 024cd82

Milvus-doc-bot authored and committed
Release new docs
1 parent 7175b07 commit 024cd82

File tree

5 files changed

+24
-10
lines changed

v0.10.2/assets/IP.png

9.4 KB

v0.10.2/assets/normalization.png

11 KB

v0.10.2/assets/normalize.png

6.91 KB

v0.10.2/site/en/reference/metric.md

+12-5
@@ -27,20 +27,27 @@ It's the most commonly used distance metric, and is very useful when the data is
2727

2828
### Inner product (IP)
2929

30-
IP measures the cosine of the angle between 2 vectors, and returns the normalized dot product of them.
31-
32-
So the formula for IP is:
30+
The IP distance between two embeddings is defined as follows:
3331

3432
![ip](../../../assets/ip_metric.png)
3533

36-
where A and B are vectors, `||A||` and `||B||` are the norms of A and B, and cosθ is the cosine of the angle between A and B.
34+
where A and B are embeddings, `||A||` and `||B||` are the norms of A and B.
3735

3836
IP is more useful when you care about the orientation of the vectors rather than their magnitude.
3937

4038
<div class="alert note">
41-
If the vectors are normalized, IP is equivalent to cosine similarity. Thus, Milvus does not provide a metric for cosine similarity.
39+
If you use IP to calculate embedding similarities, you must normalize your embeddings. After normalization, the inner product equals cosine similarity.
4240
</div>
4341

42+
43+
Suppose X' is normalized from embedding X:
44+
45+
![normalize](../../../assets/normalize.png)
46+
47+
The relationship between the two embeddings is as follows:
48+
49+
![normalization](../../../assets/normalization.png)
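
The note above can be checked numerically. The following is a minimal Python sketch, not Milvus code: the helper names `inner_product`, `l2_normalize`, and `cosine_similarity` are hypothetical, and it assumes that normalization means dividing an embedding by its L2 norm.

```python
import math


def inner_product(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(v):
    """Assumed normalization: divide the vector by its L2 norm."""
    norm = math.sqrt(inner_product(v, v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine_similarity(a, b):
    """Dot product divided by the product of the two norms."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)


a = [3.0, 4.0]
b = [4.0, 3.0]

# Without normalization, IP depends on the magnitudes of the vectors.
raw_ip = inner_product(a, b)

# After normalization, IP matches cosine similarity.
normalized_ip = inner_product(l2_normalize(a), l2_normalize(b))
cosine = cosine_similarity(a, b)

print(raw_ip, normalized_ip, cosine)
```

Here the raw inner product of `a` and `b` reflects their lengths, while the inner product of the normalized vectors agrees with the cosine similarity up to floating-point rounding.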
50+
4451
### Jaccard distance
4552

4653
The Jaccard similarity coefficient measures the similarity between two sample sets. It is defined as the cardinality of their intersection divided by the cardinality of their union, and it applies only to finite sample sets.
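
As a rough illustration of the definition above, here is a small Python sketch. The function names are hypothetical and are not part of Milvus. It uses the standard convention that Jaccard distance is one minus the Jaccard similarity coefficient, and it treats two empty sets as an error because the ratio is undefined in that case.

```python
def jaccard_similarity(a, b):
    """Size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        raise ValueError("Jaccard similarity is undefined for two empty sets")
    return len(a & b) / len(union)


def jaccard_distance(a, b):
    """Standard convention: distance = 1 - similarity."""
    return 1.0 - jaccard_similarity(a, b)


s1 = {1, 2, 3, 4}
s2 = {3, 4, 5, 6}

similarity = jaccard_similarity(s1, s2)  # 2 shared elements out of 6 total
distance = jaccard_distance(s1, s2)

print(similarity, distance)
```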

v0.10.2/site/zh-CN/reference/metric.md

+12-5
@@ -27,20 +27,27 @@ Milvus compares the distance between vectors using different distance calculation methods. Choosing a suitable
2727

2828
### Inner product (IP)
2929

30-
The inner product computes the cosine of the angle between two vectors and returns the corresponding dot product.
30+
The inner product distance between two vectors is calculated as follows:
3131

32-
The formula for the inner product distance is:
32+
![ip](../../../assets/IP.png)
3333

34-
![ip](../../../assets/ip_metric.png)
3534

36-
Given two vectors A and B, `||A||` and `||B||` are the norms of A and B, and cosθ is the cosine of the angle between A and B.
35+
Given two vectors A and B, `||A||` and `||B||` are the norms of A and B.
3736

3837
The inner product is better suited to measuring the orientation of vectors than their magnitude.
3938

4039
<div class="alert note">
41-
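After the vectors are normalized, the inner product is equivalent to cosine similarity. Milvus therefore does not provide cosine similarity as a separate distance metric.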
40+
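To use the dot product to calculate vector similarity, you must normalize the vectors. After normalization, the dot product is equivalent to cosine similarity.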
4241
</div>
4342

43+
Suppose X' is the normalized form of vector X:
44+
45+
![normalize](../../../assets/normalize.png)
46+
47+
The relationship between the two is as follows:
48+
49+
![normalization](../../../assets/normalization.png)
50+
4451
### Jaccard distance
4552

4653
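The Jaccard similarity coefficient measures the similarity between sample sets. It is the ratio of the number of elements in their intersection to the number of elements in their union, and can be expressed as: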
