haystack/releasenotes/notes/add-most-diverse-ranker-21cf310be4554551.yaml
Vladimir Blagojevic 540d0fad97
feat: Add DiversityRanker (#5398)
* Introduce DiversityRanker

* improve most_diverse_order speed

* Compute mean for numerical stability

* Add release note

* Add cosine similarity 

* Test both dot product and cosine similarity

* Add pydocs hook

---------

Co-authored-by: Michel Bartels <login@michelbartels.com>
2023-08-01 12:48:34 +02:00

---
prelude: >
    We're introducing a new ranker to Haystack: DiversityRanker. This
    ranker aims to maximize the overall diversity of the given documents.
    It uses sentence-transformer models to compute a semantic embedding
    for each document, then orders the documents so that each next one is,
    on average, the least similar to the documents already selected. The
    result is a list in which each subsequent document contributes the most
    to the overall diversity of the selected document set.
features:
  - |
    The DiversityRanker can be used like other rankers in Haystack. It is
    particularly helpful when the documents you have are highly relevant
    yet very similar to one another. By ensuring a diversity of documents,
    this ranker makes fuller use of the available documents and,
    particularly in RAG pipelines, may contribute to more accurate and
    richer model responses.
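
The ordering described in the prelude can be sketched in plain Python. This is a minimal illustration, not the DiversityRanker implementation: the function name `diverse_order`, the `first` parameter (which document seeds the selection), and the toy 2-D vectors standing in for sentence-transformer embeddings are all assumptions made for the example. It uses cosine similarity by normalizing the vectors first, and compares each candidate against the mean of the already selected vectors.

```python
import math
from typing import List, Sequence


def _normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def diverse_order(embeddings: Sequence[Sequence[float]], first: int = 0) -> List[int]:
    """Return document indices in a greedy, diversity-maximizing order.

    Hypothetical helper: `first` picks the seed document (an assumption;
    the release note does not say how the first document is chosen).
    """
    if not embeddings:
        return []

    unit = [_normalize(v) for v in embeddings]
    selected = [first]
    remaining = [i for i in range(len(unit)) if i != first]
    running_sum = list(unit[first])  # sum of the selected unit vectors

    while remaining:
        # The dot product with the mean of the selected vectors equals the
        # average similarity to every selected document.
        mean_vec = [s / len(selected) for s in running_sum]
        pick = min(
            remaining,
            key=lambda i: sum(a * b for a, b in zip(unit[i], mean_vec)),
        )
        selected.append(pick)
        remaining.remove(pick)
        running_sum = [s + x for s, x in zip(running_sum, unit[pick])]

    return selected


# Toy embeddings: documents 0 and 1 are near-duplicates, 2 is orthogonal
# to 0, and 3 lies in between.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.5, 0.5]]
order = diverse_order(toy_embeddings)
print(order)  # [0, 2, 1, 3]
```

Starting from document 0, the near-duplicate document 1 is pushed behind the orthogonal document 2, which is the behavior the release note describes for highly relevant yet similar document sets.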