Update README.md

This commit is contained in:
Xiao 2024-02-01 09:09:40 +00:00 committed by system
parent 053722828b
commit a12bcd780a
No known key found for this signature in database
GPG Key ID: 6A528E38E0733467

@ -44,7 +44,19 @@ Utilizing the re-ranking model (e.g., [bge-reranker](https://github.com/FlagOpen
- Sparse retrieval (lexical matching): a vector of size equal to the vocabulary, with the majority of positions set to zero, calculating a weight only for tokens present in the text. e.g., BM25, [unicoil](https://arxiv.org/pdf/2106.14807.pdf), and [splade](https://arxiv.org/abs/2107.05720)
- Multi-vector retrieval: use multiple vectors to represent a text, e.g., [ColBERT](https://arxiv.org/abs/2004.12832).
**2. How to use BGE-M3 in other projects?**
**2. Comparison with BGE-v1.5 and other monolingual models**
BGE-M3 is a multilingual model, and its ability in monolingual embedding retrieval may not surpass models specifically designed for single languages.
However, we still recommend trying BGE-M3 because of its versatility (support for multiple languages and long texts).
Moreover, it can simultaneously generate multiple representations, and using them together can enhance accuracy and generalization,
unlike most existing models that can only perform dense retrieval.
In the open-source community, there are many excellent models (e.g., jina-embedding, colbert, e5, etc),
and users can choose a model that suits their specific needs based on practical considerations,
such as whether to require multilingual or cross-language support, and whether to process long texts.
**3. How to use BGE-M3 in other projects?**
For embedding retrieval, you can employ the BGE-M3 model using the same approach as BGE.
The only difference is that the BGE-M3 model no longer requires adding instructions to the queries.
@ -52,7 +64,7 @@ For sparse retrieval methods, most open-source libraries currently do not suppor
Contributions from the community are welcome.
**3. How to fine-tune bge-M3 model?**
**4. How to fine-tune bge-M3 model?**
You can follow the common in this [example](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune)
to fine-tune the dense embedding.