[v2] Merge MIEB into v2 #1973

Merged
merged 331 commits into from
Feb 9, 2025
d9ab239
Update tasks table
github-actions[bot] Jan 30, 2025
9a6275e
Update tasks table
github-actions[bot] Jan 30, 2025
e35c8dd
Update tasks table
github-actions[bot] Jan 30, 2025
d510ddb
Update tasks table
github-actions[bot] Jan 30, 2025
4bb4ec6
Update tasks table
github-actions[bot] Jan 30, 2025
9076213
Update tasks table
github-actions[bot] Jan 30, 2025
12ad5bd
Update tasks table
github-actions[bot] Jan 30, 2025
6df0b8c
1.31.6
invalid-email-address Jan 30, 2025
490b59c
Update tasks table
github-actions[bot] Jan 30, 2025
93d631f
Update tasks table
github-actions[bot] Jan 30, 2025
257578c
Update tasks table
github-actions[bot] Jan 30, 2025
1275f93
Update tasks table
github-actions[bot] Jan 30, 2025
47c63c3
Update tasks table
github-actions[bot] Jan 30, 2025
e29348c
Update tasks table
github-actions[bot] Jan 30, 2025
253a499
Update tasks table
github-actions[bot] Jan 30, 2025
4d30059
Update tasks table
github-actions[bot] Jan 30, 2025
0a073df
Update tasks table
github-actions[bot] Jan 30, 2025
58d5248
Update tasks table
github-actions[bot] Jan 30, 2025
1faa897
Update tasks table
github-actions[bot] Jan 30, 2025
f539d9a
Update tasks table
github-actions[bot] Jan 30, 2025
803e973
Update tasks table
github-actions[bot] Jan 30, 2025
1b1efa7
Update tasks table
github-actions[bot] Jan 30, 2025
10d4604
Update tasks table
github-actions[bot] Jan 30, 2025
baab628
Update tasks table
github-actions[bot] Jan 30, 2025
4a15db1
Update tasks table
github-actions[bot] Jan 30, 2025
2156389
Update tasks table
github-actions[bot] Jan 30, 2025
55cf386
Update tasks table
github-actions[bot] Jan 30, 2025
750a9a9
Update tasks table
github-actions[bot] Jan 30, 2025
acb3778
Update tasks table
github-actions[bot] Jan 30, 2025
616733d
Update tasks table
github-actions[bot] Jan 30, 2025
3c57df3
Update tasks table
github-actions[bot] Jan 30, 2025
0f7206b
Update tasks table
github-actions[bot] Jan 30, 2025
fffce31
Update tasks table
github-actions[bot] Jan 30, 2025
2d9a472
Update tasks table
github-actions[bot] Jan 30, 2025
9a2665b
Update tasks table
github-actions[bot] Jan 30, 2025
28cf270
Update tasks table
github-actions[bot] Jan 30, 2025
ba68e07
Update tasks table
github-actions[bot] Jan 30, 2025
fb90fa9
Update tasks table
github-actions[bot] Jan 30, 2025
da7092c
Update tasks table
github-actions[bot] Jan 30, 2025
b755b79
Update tasks table
github-actions[bot] Jan 30, 2025
1c24ef3
Update tasks table
github-actions[bot] Jan 30, 2025
4cc0fee
Update tasks table
github-actions[bot] Jan 30, 2025
2191e83
Update tasks table
github-actions[bot] Jan 30, 2025
50f2598
Update tasks table
github-actions[bot] Jan 30, 2025
ebaa650
Update tasks table
github-actions[bot] Jan 30, 2025
605f571
Update tasks table
github-actions[bot] Jan 30, 2025
579f946
Update tasks table
github-actions[bot] Jan 31, 2025
44fc1ae
Update tasks table
github-actions[bot] Jan 31, 2025
4e2167a
Update tasks table
github-actions[bot] Jan 31, 2025
eb80d8b
Update tasks table
github-actions[bot] Jan 31, 2025
ef3fe1b
Update tasks table
github-actions[bot] Jan 31, 2025
03941bd
Update tasks table
github-actions[bot] Jan 31, 2025
d44d893
Update tasks table
github-actions[bot] Jan 31, 2025
6dd2734
Update tasks table
github-actions[bot] Jan 31, 2025
d4eaa91
Update tasks table
github-actions[bot] Jan 31, 2025
d6752c0
Update tasks table
github-actions[bot] Jan 31, 2025
fd57157
Update tasks table
github-actions[bot] Jan 31, 2025
9661751
Update tasks table
github-actions[bot] Jan 31, 2025
20a16fb
Update tasks table
github-actions[bot] Jan 31, 2025
3f09026
Update tasks table
github-actions[bot] Jan 31, 2025
92d5d17
Update tasks table
github-actions[bot] Jan 31, 2025
b60080d
Update tasks table
github-actions[bot] Jan 31, 2025
d2de690
Update tasks table
github-actions[bot] Jan 31, 2025
fe6e2cd
Update tasks table
github-actions[bot] Jan 31, 2025
cdd121a
Update tasks table
github-actions[bot] Jan 31, 2025
2a9fb4b
Update tasks table
github-actions[bot] Jan 31, 2025
9e9f2d1
Update tasks table
github-actions[bot] Jan 31, 2025
b384dae
Update tasks table
github-actions[bot] Jan 31, 2025
4cd6ad0
Update tasks table
github-actions[bot] Jan 31, 2025
74551ca
Update tasks table
github-actions[bot] Jan 31, 2025
b3eb993
Update tasks table
github-actions[bot] Jan 31, 2025
8866cc2
Update tasks table
github-actions[bot] Jan 31, 2025
50bfeaf
Update tasks table
github-actions[bot] Jan 31, 2025
d949520
Update tasks table
github-actions[bot] Jan 31, 2025
fc940e5
Update tasks table
github-actions[bot] Jan 31, 2025
e8b37f7
Update tasks table
github-actions[bot] Jan 31, 2025
1d08e42
Update tasks table
github-actions[bot] Jan 31, 2025
e1926ea
Update tasks table
github-actions[bot] Jan 31, 2025
5651f6f
Update tasks table
github-actions[bot] Jan 31, 2025
03ffb4a
Update tasks table
github-actions[bot] Jan 31, 2025
0ed6c34
Update tasks table
github-actions[bot] Jan 31, 2025
11c2452
Update tasks table
github-actions[bot] Jan 31, 2025
5f2e277
Update tasks table
github-actions[bot] Jan 31, 2025
1791fc8
Update tasks table
github-actions[bot] Jan 31, 2025
dea83b1
Update tasks table
github-actions[bot] Jan 31, 2025
b58a615
Update tasks table
github-actions[bot] Jan 31, 2025
be4f0da
Update tasks table
github-actions[bot] Jan 31, 2025
9e4166f
Update tasks table
github-actions[bot] Jan 31, 2025
8aa5a69
Update tasks table
github-actions[bot] Jan 31, 2025
813e711
Update tasks table
github-actions[bot] Jan 31, 2025
51a314c
Update tasks table
github-actions[bot] Jan 31, 2025
8baee52
Update tasks table
github-actions[bot] Jan 31, 2025
21d32f0
fix: remove SummaryRetrieval as a type (#1915)
KennethEnevoldsen Feb 1, 2025
eb837f1
Update tasks table
github-actions[bot] Feb 1, 2025
3c86eee
Update tasks table
github-actions[bot] Feb 1, 2025
07a02c5
Update tasks table
github-actions[bot] Feb 1, 2025
8ffa6cf
Update tasks table
github-actions[bot] Feb 1, 2025
4e8288d
Update tasks table
github-actions[bot] Feb 1, 2025
4977c93
Update tasks table
github-actions[bot] Feb 1, 2025
33ce26a
Update tasks table
github-actions[bot] Feb 1, 2025
da378de
Update tasks table
github-actions[bot] Feb 1, 2025
1a60580
Update tasks table
github-actions[bot] Feb 1, 2025
7184a29
Update tasks table
github-actions[bot] Feb 1, 2025
ec2cf13
Update tasks table
github-actions[bot] Feb 1, 2025
e4329f0
Update tasks table
github-actions[bot] Feb 1, 2025
42bea66
Update tasks table
github-actions[bot] Feb 1, 2025
4b88d1c
Update tasks table
github-actions[bot] Feb 1, 2025
6c0070a
Update tasks table
github-actions[bot] Feb 1, 2025
2cb0c3a
Update tasks table
github-actions[bot] Feb 1, 2025
7258174
Update tasks table
github-actions[bot] Feb 1, 2025
d2e1361
Update tasks table
github-actions[bot] Feb 1, 2025
3231736
Update tasks table
github-actions[bot] Feb 1, 2025
2b4a467
Update tasks table
github-actions[bot] Feb 1, 2025
37ef436
Update tasks table
github-actions[bot] Feb 1, 2025
a6c2841
Update tasks table
github-actions[bot] Feb 1, 2025
f70a994
Update tasks table
github-actions[bot] Feb 1, 2025
635ed80
Update tasks table
github-actions[bot] Feb 1, 2025
75ff333
fix: revert rename and add to description (#1918)
isaac-chung Feb 1, 2025
d9c9b9e
Update tasks table
github-actions[bot] Feb 1, 2025
03b2380
Update tasks table
github-actions[bot] Feb 1, 2025
88a2fe1
Update tasks table
github-actions[bot] Feb 1, 2025
1f7971f
Update tasks table
github-actions[bot] Feb 1, 2025
ad1deff
Update tasks table
github-actions[bot] Feb 1, 2025
9641319
Update tasks table
github-actions[bot] Feb 1, 2025
d9ba681
Update tasks table
github-actions[bot] Feb 1, 2025
96f3aff
Update tasks table
github-actions[bot] Feb 1, 2025
2e34cc7
Update tasks table
github-actions[bot] Feb 1, 2025
5c2cbfc
Update tasks table
github-actions[bot] Feb 1, 2025
b61de5d
Update tasks table
github-actions[bot] Feb 1, 2025
26ffe3a
Update tasks table
github-actions[bot] Feb 1, 2025
ff4e7c6
Update tasks table
github-actions[bot] Feb 1, 2025
13fd52e
Update tasks table
github-actions[bot] Feb 1, 2025
2850833
Update tasks table
github-actions[bot] Feb 1, 2025
d57f988
Update tasks table
github-actions[bot] Feb 1, 2025
c01563d
Update tasks table
github-actions[bot] Feb 1, 2025
a9be716
Update tasks table
github-actions[bot] Feb 1, 2025
c3ea285
Update tasks table
github-actions[bot] Feb 1, 2025
e823bd7
Update tasks table
github-actions[bot] Feb 1, 2025
2756d67
Update tasks table
github-actions[bot] Feb 1, 2025
6d051da
Update tasks table
github-actions[bot] Feb 1, 2025
6cb089f
Update tasks table
github-actions[bot] Feb 1, 2025
f1ea61a
Update tasks table
github-actions[bot] Feb 1, 2025
54d1bd1
Update tasks table
github-actions[bot] Feb 1, 2025
887ebf2
Update tasks table
github-actions[bot] Feb 1, 2025
471ea4c
Update tasks table
github-actions[bot] Feb 1, 2025
c72a4ba
Update tasks table
github-actions[bot] Feb 1, 2025
e5ae84f
Update tasks table
github-actions[bot] Feb 1, 2025
d6ff9d0
Update tasks table
github-actions[bot] Feb 1, 2025
745e2e6
Update tasks table
github-actions[bot] Feb 1, 2025
dafbb80
Update tasks table
github-actions[bot] Feb 1, 2025
ea6c1a2
Update tasks table
github-actions[bot] Feb 1, 2025
53f4e2e
Update tasks table
github-actions[bot] Feb 1, 2025
bf3256a
Update tasks table
github-actions[bot] Feb 1, 2025
0e8a539
Update tasks table
github-actions[bot] Feb 1, 2025
52c000d
Update tasks table
github-actions[bot] Feb 1, 2025
0ac5bf2
Update tasks table
github-actions[bot] Feb 1, 2025
f42d5d0
Update tasks table
github-actions[bot] Feb 1, 2025
df3ef70
Update tasks table
github-actions[bot] Feb 1, 2025
e183458
Update tasks table
github-actions[bot] Feb 1, 2025
2b5f320
Update tasks table
github-actions[bot] Feb 1, 2025
996c522
Update tasks table
github-actions[bot] Feb 1, 2025
de3a1f9
Update tasks table
github-actions[bot] Feb 1, 2025
0cd396e
Update tasks table
github-actions[bot] Feb 1, 2025
974ff3c
Update tasks table
github-actions[bot] Feb 1, 2025
0ae0417
Update tasks table
github-actions[bot] Feb 1, 2025
c275b10
Update tasks table
github-actions[bot] Feb 1, 2025
4294389
Update tasks table
github-actions[bot] Feb 1, 2025
8cdb25a
Update tasks table
github-actions[bot] Feb 1, 2025
9146cc3
Update tasks table
github-actions[bot] Feb 1, 2025
7474c97
Update tasks table
github-actions[bot] Feb 1, 2025
4be5352
Update tasks table
github-actions[bot] Feb 1, 2025
a420249
Update tasks table
github-actions[bot] Feb 1, 2025
597b8fc
Update tasks table
github-actions[bot] Feb 1, 2025
e344a2e
Update tasks table
github-actions[bot] Feb 1, 2025
2b95d66
Update tasks table
github-actions[bot] Feb 1, 2025
6072eae
Update tasks table
github-actions[bot] Feb 1, 2025
e932dfc
Update tasks table
github-actions[bot] Feb 1, 2025
6f673ba
docs: Add sort to domains for task metadata (#1922)
KennethEnevoldsen Feb 1, 2025
14616dc
Update tasks table
github-actions[bot] Feb 1, 2025
23f626d
1.31.7
invalid-email-address Feb 1, 2025
f3526fc
docs: Updated citation for mteb(scandinavian) (#1914)
KennethEnevoldsen Feb 1, 2025
9c762da
fix: Add datasets in CodeRAG-Bench (#1595)
hepengfe Feb 1, 2025
57db0f9
Update tasks table
github-actions[bot] Feb 1, 2025
dba7a95
1.31.8
invalid-email-address Feb 1, 2025
476afc7
Leaderboard: Acks (#1930)
Muennighoff Feb 3, 2025
07c489d
misc: add warning for save_suffix removal from AbsTask (#1940)
isaac-chung Feb 3, 2025
e16acf8
misc: add bgev1 models (#1928)
Samoed Feb 4, 2025
3cf2bed
Updated links in MTEB(eng) and eng,classic (#1948)
x-tabdeveloping Feb 4, 2025
7ef3a90
feat: add beir (#1933)
Samoed Feb 4, 2025
4e453e5
1.32.0
invalid-email-address Feb 4, 2025
c81827d
Fixed join_revisions if results are empty (#1949)
x-tabdeveloping Feb 4, 2025
6d63d06
feat: Merge MIEB into main 🎉 (#1944)
isaac-chung Feb 4, 2025
cd657f5
1.33.0
invalid-email-address Feb 4, 2025
f43b661
fix: fix task types in MIEB
KennethEnevoldsen Feb 4, 2025
a21f0b7
fix: fix task types in MIEB (#1952)
KennethEnevoldsen Feb 4, 2025
635b204
1.33.1
invalid-email-address Feb 4, 2025
de8f384
feat: Add new benchmark BEIR-NL (#1909)
nikolay-banar Feb 4, 2025
3036c05
Merge branch 'main' of https://github.com/embeddings-benchmark/mteb
KennethEnevoldsen Feb 4, 2025
d162645
Update tasks table
github-actions[bot] Feb 4, 2025
9a03325
1.34.0
invalid-email-address Feb 4, 2025
d87d17e
docs: Add instruction for running leaderboard (#1925)
KennethEnevoldsen Feb 4, 2025
c22f14d
Update ModelMeta of inf-retriever-v1 (#1964)
SamuelYang1 Feb 5, 2025
cde11ae
misc: lint main (#1961)
isaac-chung Feb 5, 2025
195e688
misc: add warnings (#1945)
Samoed Feb 5, 2025
2c62d21
Revert "feat: Merge MIEB into main 🎉 " (#1957)
isaac-chung Feb 5, 2025
4232427
fix: Changed callback for slider, accounted for None input (#1969)
x-tabdeveloping Feb 5, 2025
a37e559
Fix: Corrected model size for UAE (#1965)
x-tabdeveloping Feb 5, 2025
29713ce
1.34.1
invalid-email-address Feb 5, 2025
bc05a9d
Merge MIEB into main 🎉 (#1968)
isaac-chung Feb 5, 2025
fc6696f
Update tasks table
github-actions[bot] Feb 5, 2025
2a8c25a
Fix: added link for text-embedding-005 and other google models (#1962)
imadtyx Feb 5, 2025
b14dacb
Merge branch 'refs/heads/main' into merge_mieb_into_v2
Samoed Feb 5, 2025
2b01d3c
start integrating mieb into v2
Samoed Feb 5, 2025
a3aef69
comment out image imports
Samoed Feb 5, 2025
995f8ce
remove descripte stat from mock
Samoed Feb 5, 2025
ce8f057
add descriptive stats
Samoed Feb 5, 2025
9047eb2
remove is_multilingual
Samoed Feb 5, 2025
64c17b6
fix: leaderboard and benchmark.py inconstiencies (#1956)
KennethEnevoldsen Feb 5, 2025
65fc0cc
remove multisubset loader
Samoed Feb 5, 2025
adc06ab
1.34.2
invalid-email-address Feb 5, 2025
1520979
Add mini-gte model (#1906)
Ressnn Feb 5, 2025
9618505
Fix HagridRetrieval load_dataset "trust_remote_code" (#1989)
SamuelYang1 Feb 6, 2025
370b26c
fix: Meta information ru_sentence_models (#1991)
sergeyz-zh Feb 6, 2025
56f90a7
1.34.3
invalid-email-address Feb 6, 2025
8583383
fix: Added description and resolved bug in rangeslider (#1990)
KennethEnevoldsen Feb 6, 2025
4a37967
1.34.4
invalid-email-address Feb 6, 2025
fb6bade
CI clean up: Remove MMTEB points check (#1994)
KennethEnevoldsen Feb 6, 2025
e2c44ed
Add gte-modernbert-base (#1845)
afalf Feb 7, 2025
d810e4e
fix: training data for gritlm (#1932)
Muennighoff Feb 7, 2025
3887d83
Sort benchmarks and add temp. reference to old leaderboard (#1993)
KennethEnevoldsen Feb 7, 2025
d424037
1.34.5
invalid-email-address Feb 7, 2025
4fe4c99
fix: Update faq of on leaderboard (#2004)
Muennighoff Feb 7, 2025
7ac4650
1.34.6
invalid-email-address Feb 7, 2025
b1ac052
fix: BEIR-NL metadata mistake (#2010)
nikolay-banar Feb 7, 2025
64f256e
Update tasks table
github-actions[bot] Feb 7, 2025
eacde2f
1.34.7
invalid-email-address Feb 7, 2025
e46539a
add model memory usage (#1934)
Samoed Feb 7, 2025
612626d
Merge branch 'refs/heads/main' into merge_mieb_into_v2
Samoed Feb 7, 2025
09b3c5a
Merge branch 'refs/heads/v2.0.0' into merge_mieb_into_v2
Samoed Feb 7, 2025
f1d0268
Merge branch 'refs/heads/v2.0.0' into merge_mieb_into_v2
Samoed Feb 7, 2025
4187c57
fix metas
Samoed Feb 7, 2025
540652a
fix tests
Samoed Feb 8, 2025
af5d786
Merge branch 'refs/heads/v2.0.0' into merge_mieb_into_v2
Samoed Feb 8, 2025
f03c14e
lint
Samoed Feb 8, 2025
1375a92
merge 2 test_model_meta
Samoed Feb 8, 2025
419f101
update imports
Samoed Feb 8, 2025
039ecc0
remove todo comment from fast loading
Samoed Feb 8, 2025
cae6c9e
fix imports
Samoed Feb 8, 2025
2abad39
add abstask imports
Samoed Feb 8, 2025
08c6f23
remove empy files
Samoed Feb 8, 2025
2 changes: 1 addition & 1 deletion .github/workflows/lint.yml
@@ -15,7 +15,7 @@ jobs:

       - uses: actions/setup-python@v4
         with:
-          python-version: "3.9"
+          python-version: "3.10"
           cache: "pip"

       - name: Install dependencies
63 changes: 0 additions & 63 deletions .github/workflows/mmteb.yml

This file was deleted.

3 changes: 3 additions & 0 deletions .gitignore
@@ -149,3 +149,6 @@ uv.lock
# model loading tests
model_names.txt
mteb/leaderboard/__cached_results.json

# gradio
.gradio/
19 changes: 19 additions & 0 deletions README.md
@@ -445,6 +445,23 @@ model_w_contamination = ModelMeta(
```


</details>

<details>
<summary> Running the leaderboard locally </summary>


### Running the Leaderboard

It is possible to deploy the leaderboard entirely locally or to self-host it. This can be relevant, for example, for companies that want to
build their own benchmarks or integrate custom tasks into existing benchmarks.

Running the leaderboard is quite easy. Simply run:
```bash
python -m mteb.leaderboard.app
```

The leaderboard requires gradio, which can be installed using `pip install mteb[gradio]`, and Python >3.10.

</details>

<details>
@@ -484,6 +501,7 @@ evaluation.run(model, ...)
| 👩‍💻 [Adding a benchmark] | How to add a new benchmark to MTEB and to the leaderboard |
| 🤝 [Contributing] | How to contribute to MTEB and set it up for development |
| 🌐 [MMTEB] | An open-source effort to extend MTEB to cover a broad set of languages |
| 🖼️ [MIEB] | Extension of MTEB to image embeddings |

[Tasks]: docs/tasks.md
[Benchmarks]: docs/benchmarks.md
@@ -493,6 +511,7 @@ evaluation.run(model, ...)
[Adding a benchmark]: docs/adding_a_benchmark.md
[Leaderboard]: https://huggingface.co/spaces/mteb/leaderboard
[MMTEB]: docs/mmteb/readme.md
[MIEB]: docs/mieb.md
[Reproducible workflows]: docs/reproducible_workflow.md

## Citing
3 changes: 2 additions & 1 deletion docs/adding_a_model.md
@@ -13,6 +13,7 @@ The MTEB Leaderboard is available [here](https://huggingface.co/spaces/mteb/lead
revision="5617a9f61b028005a4858fdac845db406aefb181",
release_date="2024-06-28",
n_parameters=568_000_000,
memory_usage_mb=2167,
embed_dim=4096,
license="mit",
max_tokens=8194,
@@ -25,7 +26,7 @@ The MTEB Leaderboard is available [here](https://huggingface.co/spaces/mteb/lead
training_datasets={"your_dataset": ["train"]},
)
```
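As a rough sanity check on the `memory_usage_mb` value above: the figure is consistent with storing every parameter as a 4-byte float32 and counting only the weights. That is an assumption made here for illustration, not something the docs state, and `estimate_memory_usage_mb` is a hypothetical helper rather than part of mteb.

```python
def estimate_memory_usage_mb(n_parameters: int, bytes_per_param: int = 4) -> int:
    """Estimate weight memory in MiB (assumption: weights only, float32 by default)."""
    return round(n_parameters * bytes_per_param / 1024**2)


# 568M parameters at 4 bytes each comes to roughly 2167 MiB.
print(estimate_memory_usage_mb(568_000_000))
```

Under the same assumption, half-precision weights (`bytes_per_param=2`) would need about half as much memory.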
-By default, the model will run using the [`sentence_transformers_loader`](../mteb/models/sentence_transformer_wrapper.py) loader function. If you need to use a custom implementation, you can specify the `loader` parameter in the `ModelMeta` class. For example:
+To calculate `memory_usage_mb` you can run `model_meta.memory_usage_mb()`. By default, the model will run using the [`sentence_transformers_loader`](../mteb/models/sentence_transformer_wrapper.py) loader function. If you need to use a custom implementation, you can specify the `loader` parameter in the `ModelMeta` class. For example:
```python
from mteb.models.wrapper import Wrapper
from mteb.encoder_interface import PromptType
116 changes: 116 additions & 0 deletions docs/mieb.md
@@ -0,0 +1,116 @@
# Welcome to MIEB! 👋

The Massive Image Embedding Benchmark (MIEB) extends [MTEB](https://arxiv.org/abs/2210.07316) to cover image and image-text embedding tasks.

## 🌱 Background

MIEB intends to extend MTEB and MMTEB to cover image representation learning and image-text alignment tasks.

## 🪴 Contributing to MIEB

The FIRST step is to _always_ create an issue in the MTEB repo (this one), and add the `mieb` label. PRs without issues will not be accepted.

There are a few ways for anyone to contribute to MIEB:

1. Add a dataset as an existing task type. This means that the `AbsTask` already exists, e.g. `AbsTaskImageClassification`, and the effort is solely in adding an instance of it.
2. Add a model. This could mean either: a) The model wrapper, e.g. `OpenCLIPWrapper`, already exists, and the effort is solely in adding a filled out `ModelMeta` object, and/or b) Add a new model wrapper.
3. Add a new task type. This means that the existing task types do not cover this new task. An accompanying evaluator should also be implemented.

Let's go through an example.

## Example

Here is an example of implementing zero-shot image classification from scratch. Let's say we wish to implement CIFAR10 as a task and evaluate an OpenCLIP model on it.

To solve this task, we encode the images, encode the candidate class labels as prompts (e.g. "this is a dog pic", "this is a cat pic"), compute the similarity between each image and each prompt, and take the argmax as the class prediction for each image. We begin by implementing a model wrapper.
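
As a minimal sketch of the encode, compare, and argmax steps described above, using toy vectors in place of real model embeddings. The helper names `cosine` and `zero_shot_predict` are hypothetical and not part of mteb; a real evaluator would obtain the embeddings from the model wrapper.

```python
import math


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_predict(image_embs, label_embs):
    """Return, for each image, the index of the most similar label prompt."""
    predictions = []
    for img in image_embs:
        scores = [cosine(img, lab) for lab in label_embs]
        predictions.append(scores.index(max(scores)))  # argmax over labels
    return predictions


# Toy 2-d embeddings standing in for real model outputs.
label_prompts = ["this is a dog pic", "this is a cat pic"]
label_embs = [[1.0, 0.0], [0.0, 1.0]]
image_embs = [[0.9, 0.1], [0.2, 0.8]]

for idx in zero_shot_predict(image_embs, label_embs):
    print(label_prompts[idx])
```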

### Model Wrapper
See the [`ImageEncoder` class](https://github.com/embeddings-benchmark/mteb/blob/mieb/mteb/encoder_interface.py) for more details. The model class implements `get_text_embeddings`, `get_image_embeddings`, and `calculate_probs` methods.
As an example, [`OpenCLIPWrapper`](https://github.com/embeddings-benchmark/mteb/blob/mieb/mteb/models/openclip_models.py) is first implemented, with metadata defined below.
```python
class OpenCLIPWrapper:
    ...
```
See also [adding a model](adding_a_model.md) for reference.

### X Evaluator
With the model, [ZeroshotClassificationEvaluator](https://github.com/embeddings-benchmark/mteb/blob/mieb/mteb/evaluation/evaluators/Image/ZeroshotClassificationEvaluator.py) is implemented here. This defines how the model is used to do zero-shot classification and return results on the desired metrics.
```python
class ZeroshotClassificationEvaluator(Evaluator):
    def __init__(self, ...):
        ...
    def __call__(self, model: Encoder, *, encode_kwargs: dict[str, Any] = {}):
        """Get embeddings and calculate scores."""
        ...
```

### AbsTask X
With the evaluator, [AbsTaskZeroshotClassification](https://github.com/embeddings-benchmark/mteb/blob/mieb/mteb/abstasks/Image/AbsTaskZeroshotClassification.py) is defined. It operates on the dataset, calls the defined Evaluator, and returns the results.
```python
class AbsTaskZeroshotClassification(AbsTask):
    ...
```


### Dataset class
With all these, we can then define the dataset. [CIFAR10](https://github.com/embeddings-benchmark/mteb/blob/mieb/mteb/tasks/Image/ZeroshotClassification/eng/CIFAR.py) is implemented by subclassing `AbsTaskZeroshotClassification` and overriding the `get_candidate_labels` function, which returns prompts of the form `["a photo of {label_name}"]` to be used in the evaluator.
```python
class CIFAR10ZeroShotClassification(AbsTaskZeroshotClassification):
    metadata = TaskMetadata(...)

    def get_candidate_labels(self) -> list[str]:
        ...
```
See also [adding a dataset](adding_a_dataset.md) for reference.
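
To illustrate what `get_candidate_labels` is described as returning, here is a standalone sketch that fills the `"a photo of {label_name}"` template for the ten CIFAR-10 classes. `build_candidate_labels` is a hypothetical free function written for illustration; the actual method lives on the task class and may read the label names from the dataset instead.

```python
# The ten CIFAR-10 class names.
CIFAR10_LABELS = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
]


def build_candidate_labels(
    label_names: list[str], template: str = "a photo of {label_name}"
) -> list[str]:
    """Turn raw class names into text prompts, one prompt per class."""
    return [template.format(label_name=name) for name in label_names]


prompts = build_candidate_labels(CIFAR10_LABELS)
print(prompts[:3])
```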

### Putting them all together
With all these in place, we can then run the evaluation:
```python
import mteb

model_name = "laion/CLIP-ViT-L-14-laion2B-s32B-b82K"
model = mteb.get_model(model_name=model_name)

tasks = mteb.get_tasks(tasks=["CIFAR10ZeroShot"])
evaluation = mteb.MTEB(tasks=tasks)
results = evaluation.run(model)
```

By default, results will be under `results/laion__CLIP-ViT-L-14-laion2B-s32B-b82K/REVISION/CIFAR10ZeroShot.json`. Metrics can sometimes differ slightly from those reported in the original paper. This might be due to differences in image resolution or layout in the remake of the dataset.
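
The layout of that default path can be sketched as follows. Judging only from the example path above, the `/` in the model name appears to be replaced by `__`; that mapping is an inference, and `default_result_path` is a hypothetical helper, not an mteb function.

```python
from pathlib import PurePosixPath


def default_result_path(
    model_name: str,
    revision: str,
    task_name: str,
    output_folder: str = "results",
) -> PurePosixPath:
    """Build the expected results path (assumption: '/' in the model name becomes '__')."""
    folder_name = model_name.replace("/", "__")
    return PurePosixPath(output_folder) / folder_name / revision / f"{task_name}.json"


path = default_result_path(
    "laion/CLIP-ViT-L-14-laion2B-s32B-b82K", "REVISION", "CIFAR10ZeroShot"
)
print(path)
```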


## Specific Model running Instructions

Some models require some specific steps before running. Those are collected here.

<details>
<summary> Vista </summary>

## Set up VISTA

```
git clone https://github.com/FlagOpen/FlagEmbedding.git
cd FlagEmbedding/research/visual_bge
pip install -e .
pip install torchvision timm einops ftfy
```
Go back to the root folder of mteb and download the vision tower for bge-base:
```
cd ../../..
wget https://huggingface.co/BAAI/bge-visualized/resolve/main/Visualized_base_en_v1.5.pth?download=true
```
Rename it to `visualized_base_en_V1.5.pth`:
```
mv Visualized_base_en_v1.5.pth?download=true visualized_base_en_V1.5.pth
```
Download the vision tower for bge-m3:
```
wget https://huggingface.co/BAAI/bge-visualized/resolve/main/Visualized_m3.pth?download=true
```
Rename it to `visualized_m3.pth`:
```
mv Visualized_m3.pth?download=true visualized_m3.pth
```


</details>