Update README.md

Xiao 2024-02-01 16:39:36 +00:00 committed by system
parent 1f5d3ac2e1
commit 4277867103
GPG Key ID: 6A528E38E0733467

@@ -209,6 +209,13 @@ print(model.compute_score(sentence_pairs,
- Long Document Retrieval
- MLDR:
![avatar](./imgs/long.jpg)
Note that MLDR is a document retrieval dataset that we constructed with an LLM. It covers 13 languages and includes training, validation, and test sets.
We used the MLDR training set to enhance the model's long-document retrieval capabilities.
A fairer comparison is therefore between the baselines and `Dense w.o.long` (fine-tuned without the long-document dataset).
We will also open-source this dataset to address the current lack of open-source multilingual long-text retrieval datasets.
We believe this data will help the open-source community train document retrieval models.
- NarrativeQA:
![avatar](./imgs/nqa.jpg)