Comparing changes

base repository: beir-cellar/beir
base: v0.2.0
...
head repository: beir-cellar/beir
compare: main
Showing 111 changed files with 6,333 additions and 2,276 deletions.
  1. +3 −0 .gitmodules
  2. +12 −0 .pre-commit-config.yaml
  3. +7 −0 CONTRIBUTORS.txt
  4. +1 −1 LICENSE
  5. +10 −4 NOTICE.txt
  6. +177 −348 README.md
  7. +7 −1 beir/__init__.py
  8. +49 −41 beir/datasets/data_loader.py
  9. +169 −0 beir/datasets/data_loader_hf.py
  10. +8 −1 beir/generation/__init__.py
  11. +134 −72 beir/generation/generate.py
  12. +9 −1 beir/generation/models/__init__.py
  13. +119 −56 beir/generation/models/auto_model.py
  14. +92 −0 beir/generation/models/tilde.py
  15. +7 −3 beir/logging.py
  16. +9 −1 beir/losses/__init__.py
  17. +39 −27 beir/losses/bpr_loss.py
  18. +38 −0 beir/losses/margin_mse_loss.py
  19. +7 −1 beir/reranking/__init__.py
  20. +9 −1 beir/reranking/models/__init__.py
  21. +15 −10 beir/reranking/models/cross_encoder.py
  22. +169 −0 beir/reranking/models/mono_t5.py
  23. +20 −16 beir/reranking/rerank.py
  24. +73 −34 beir/retrieval/custom_metrics.py
  25. +66 −53 beir/retrieval/evaluation.py
  26. +21 −3 beir/retrieval/models/__init__.py
  27. +27 −14 beir/retrieval/models/bpr.py
  28. +0 −42 beir/retrieval/models/dpr.py
  29. +154 −0 beir/retrieval/models/huggingface.py
  30. +114 −0 beir/retrieval/models/llm2vec.py
  31. +84 −0 beir/retrieval/models/nvembed.py
  32. +24 −0 beir/retrieval/models/pooling.py
  33. +126 −15 beir/retrieval/models/sentence_bert.py
  34. +49 −26 beir/retrieval/models/sparta.py
  35. +162 −0 beir/retrieval/models/splade.py
  36. +92 −0 beir/retrieval/models/tldr.py
  37. +149 −0 beir/retrieval/models/unicoil.py
  38. +0 −50 beir/retrieval/models/use_qa.py
  39. +19 −0 beir/retrieval/models/util.py
  40. +5 −0 beir/retrieval/search/__init__.py
  41. +15 −0 beir/retrieval/search/base.py
  42. +26 −1 beir/retrieval/search/dense/__init__.py
  43. +82 −39 beir/retrieval/search/dense/exact_search.py
  44. +274 −0 beir/retrieval/search/dense/exact_search_multi_gpu.py
  45. +62 −50 beir/retrieval/search/dense/faiss_index.py
  46. +395 −131 beir/retrieval/search/dense/faiss_search.py
  47. +21 −14 beir/retrieval/search/dense/util.py
  48. +5 −1 beir/retrieval/search/lexical/__init__.py
  49. +67 −36 beir/retrieval/search/lexical/bm25_search.py
  50. +168 −123 beir/retrieval/search/lexical/elastic_search.py
  51. +5 −1 beir/retrieval/search/sparse/__init__.py
  52. +31 −17 beir/retrieval/search/sparse/sparse_search.py
  53. +112 −82 beir/retrieval/train.py
  54. +92 −30 beir/util.py
  55. +5 −3 examples/beir-pyserini/config.py
  56. +52 −36 examples/beir-pyserini/main.py
  57. +22 −19 examples/benchmarking/benchmark_bm25.py
  58. +28 −25 examples/benchmarking/benchmark_bm25_ce_reranking.py
  59. +39 −23 examples/benchmarking/benchmark_sbert.py
  60. +30 −16 examples/dataset/download_dataset.py
  61. +1 −1 examples/dataset/md5.csv
  62. +36 −27 examples/dataset/scrape_tweets.py
  63. +54 −0 examples/generation/passage_expansion_tilde.py
  64. +21 −12 examples/generation/query_gen.py
  65. +32 −22 examples/generation/query_gen_and_train.py
  66. +28 −25 examples/generation/query_gen_multi_gpu.py
  67. +1 −1 examples/retrieval/evaluation/README.md
  68. +26 −26 examples/retrieval/evaluation/custom/evaluate_custom_dataset.py
  69. +14 −14 examples/retrieval/evaluation/custom/evaluate_custom_dataset_files.py
  70. +19 −15 examples/retrieval/evaluation/custom/evaluate_custom_metrics.py
  71. +21 −18 examples/retrieval/evaluation/custom/evaluate_custom_model.py
  72. +29 −17 examples/retrieval/evaluation/dense/evaluate_ance.py
  73. +37 −28 examples/retrieval/evaluation/dense/evaluate_bpr.py
  74. +38 −32 examples/retrieval/evaluation/dense/evaluate_dim_reduction.py
  75. +38 −20 examples/retrieval/evaluation/dense/evaluate_dpr.py
  76. +58 −39 examples/retrieval/evaluation/dense/evaluate_faiss_dense.py
  77. +111 −0 examples/retrieval/evaluation/dense/evaluate_huggingface.py
  78. +94 −0 examples/retrieval/evaluation/dense/evaluate_llm2vec.py
  79. +92 −0 examples/retrieval/evaluation/dense/evaluate_nvembed.py
  80. +50 −21 examples/retrieval/evaluation/dense/evaluate_sbert.py
  81. +86 −0 examples/retrieval/evaluation/dense/evaluate_sbert_hf_loader.py
  82. +113 −0 examples/retrieval/evaluation/dense/evaluate_sbert_multi_gpu.py
  83. +130 −0 examples/retrieval/evaluation/dense/evaluate_tldr.py
  84. +0 −60 examples/retrieval/evaluation/dense/evaluate_useqa.py
  85. +102 −0 examples/retrieval/evaluation/late-interaction/README.md
  86. +1 −0 examples/retrieval/evaluation/late-interaction/beir-ColBERT
  87. +30 −19 examples/retrieval/evaluation/lexical/evaluate_anserini_bm25.py
  88. +31 −22 examples/retrieval/evaluation/lexical/evaluate_bm25.py
  89. +33 −23 examples/retrieval/evaluation/lexical/evaluate_multilingual_bm25.py
  90. +22 −19 examples/retrieval/evaluation/reranking/evaluate_bm25_ce_reranking.py
  91. +98 −0 examples/retrieval/evaluation/reranking/evaluate_bm25_monot5_reranking.py
  92. +21 −18 examples/retrieval/evaluation/reranking/evaluate_bm25_sbert_reranking.py
  93. +45 −39 examples/retrieval/evaluation/sparse/evaluate_anserini_docT5query.py
  94. +225 −0 examples/retrieval/evaluation/sparse/evaluate_anserini_docT5query_parallel.py
  95. +66 −44 examples/retrieval/evaluation/sparse/evaluate_deepct.py
  96. +19 −16 examples/retrieval/evaluation/sparse/evaluate_sparta.py
  97. +72 −0 examples/retrieval/evaluation/sparse/evaluate_splade.py
  98. +67 −0 examples/retrieval/evaluation/sparse/evaluate_unicoil.py
  99. +44 −29 examples/retrieval/training/train_msmarco_v2.py
  100. +76 −53 examples/retrieval/training/train_msmarco_v3.py
  101. +82 −57 examples/retrieval/training/train_msmarco_v3_bpr.py
  102. +192 −0 examples/retrieval/training/train_msmarco_v3_margin_MSE.py
  103. +31 −20 examples/retrieval/training/train_sbert.py
  104. +50 −33 examples/retrieval/training/train_sbert_BM25_hardnegs.py
  105. BIN images/HF.png
  106. BIN images/tu-darmstadt.png
  107. BIN images/ukp.png
  108. BIN images/uwaterloo.png
  109. +91 −0 pyproject.toml
  110. +0 −2 setup.cfg
  111. +0 −36 setup.py
3 changes: 3 additions & 0 deletions .gitmodules
@@ -0,0 +1,3 @@
[submodule "examples/retrieval/evaluation/late-interaction/beir-ColBERT"]
	path = examples/retrieval/evaluation/late-interaction/beir-ColBERT
	url = https://github.com/NThakur20/beir-ColBERT.git
12 changes: 12 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,12 @@
repos:
- repo: https://github.com/astral-sh/ruff-pre-commit
  # Ruff version.
  rev: v0.9.4
  hooks:
    # Run the linter.
    - id: ruff
      types_or: [ python, pyi ]
      args: [--exit-non-zero-on-fix]
    # Run the formatter.
    - id: ruff-format
      types_or: [ python, pyi ]
7 changes: 7 additions & 0 deletions CONTRIBUTORS.txt
@@ -0,0 +1,7 @@
Individual Contributors to the BEIR Repository (BEIR contributors) include:
1. Nandan Thakur
2. Nils Reimers
3. Iryna Gurevych
4. Jimmy Lin
5. Andreas Rücklé
6. Abhishek Srivastava
2 changes: 1 addition & 1 deletion LICENSE
@@ -186,7 +186,7 @@
same "printed page" as the copyright notice for easier
identification within third-party archives.

-Copyright [yyyy] [name of copyright owner]
+Copyright 2020-2023 Nandan Thakur

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
14 changes: 10 additions & 4 deletions NOTICE.txt
@@ -1,5 +1,11 @@
-------------------------------------------------------------------------------
-Copyright 2021
-Ubiquitous Knowledge Processing (UKP) Lab
-Technische Universität Darmstadt
--------------------------------------------------------------------------------
+Copyright since 2022
+University of Waterloo
+-------------------------------------------------------------------------------
+
+-------------------------------------------------------------------------------
+Copyright since 2020
+Ubiquitous Knowledge Processing (UKP) Lab, Technische Universität Darmstadt
+-------------------------------------------------------------------------------
+
+For individual contributors, please refer to the CONTRIBUTORS file.