Update README.md

Xiao 2024-02-11 12:33:00 +00:00 committed by system
parent f2dfce09de
commit 3ab7155aa9
No known key found for this signature in database
GPG Key ID: 6A528E38E0733467

@@ -237,9 +237,13 @@ We compare BGE-M3 with some popular methods, including BM25, openAI embedding, e
- NarrativeQA:
![avatar](./imgs/nqa.jpg)
- BM25
- Comparison with BM25
We used Pyserini to implement BM25, and the test results can be reproduced with this [script](https://github.com/FlagOpen/FlagEmbedding/tree/master/C_MTEB/MLDR#bm25-baseline).
We tested BM25 with two different tokenizers:
one using the Lucene Analyzer and the other using the same tokenizer as M3 (i.e., the tokenizer of xlm-roberta).
The results indicate that BM25 remains a competitive baseline,
especially in long document retrieval.
![avatar](./imgs/bm25.jpg)
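
To illustrate why the tokenizer matters for a BM25 baseline, here is a minimal pure-Python sketch of BM25 scoring with a pluggable tokenizer. It is not the Pyserini implementation and not the linked script: the function names, the toy character-trigram tokenizer (a rough stand-in for a subword tokenizer), and the `k1`/`b` values are all assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Callable, List

Tokenizer = Callable[[str], List[str]]


def whitespace_tokenize(text: str) -> List[str]:
    """Hypothetical word-level tokenizer (stand-in for an analyzer)."""
    return text.lower().split()


def char_trigram_tokenize(text: str) -> List[str]:
    """Hypothetical sub-word tokenizer: overlapping character trigrams."""
    s = text.lower().replace(" ", "_")
    return [s[i:i + 3] for i in range(max(len(s) - 2, 1))]


def bm25_scores(query: str, docs: List[str], tokenize: Tokenizer,
                k1: float = 0.9, b: float = 0.4) -> List[float]:
    """Score every document against the query with BM25.

    k1 and b are illustrative values, not tuned settings.
    """
    if not docs:
        return []
    doc_tokens = [tokenize(d) for d in docs]
    n_docs = len(docs)
    # Average document length in tokens; fall back to 1.0 if all docs are empty.
    avg_len = sum(len(t) for t in doc_tokens) / n_docs or 1.0

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in doc_tokens:
        df.update(set(tokens))

    query_terms = tokenize(query)
    scores = []
    for tokens in doc_tokens:
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Non-negative IDF variant: log(1 + (N - df + 0.5) / (df + 0.5)).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

Swapping `tokenize` changes which terms can match, and therefore the ranking, while the scoring formula stays the same. Any callable that maps a string to a list of tokens can be passed in, including the `tokenize` method of a subword tokenizer.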