Distinguish labels for calculating similarity scores (#1124)

* Distinguish labels for calculating similarity scores

* Explain label "0" and "1" of TextPairClassifier in Ranker
Julian Risch 2021-06-02 17:33:36 +02:00 committed by GitHub
parent b555bc525c
commit 8e3d0d1287
2 changed files with 2 additions and 1 deletions


@@ -30,6 +30,7 @@ Alternatively, [this example](https://github.com/deepset-ai/FARM/blob/master/exa
### Description
The FARMRanker consists of a Transformer-based model for document re-ranking using the TextPairClassifier of [FARM](https://github.com/deepset-ai/FARM).
Given a text pair of query and passage, the TextPairClassifier predicts either label "1" if the pair is similar or label "0" if it is dissimilar, along with a probability for the predicted label.
While the underlying model can vary (BERT, RoBERTa, DistilBERT, ...), the interface remains the same.
With a FARMRanker, you can:
* Directly get predictions (re-ranked version of the supplied list of Document) via predict() if supplying a pre-trained model


@@ -258,7 +258,7 @@ class FARMRanker(BaseRanker):
# calculate similarity of query and each document
query_and_docs = [{"text": (query, doc.text)} for doc in documents]
result = self.inferencer.inference_from_dicts(dicts=query_and_docs)
-similarity_scores = [pred["probability"] for preds in result for pred in preds["predictions"]]
+similarity_scores = [pred["probability"] if pred["label"] == "1" else 1-pred["probability"] for preds in result for pred in preds["predictions"]]
# rank documents according to scores
sorted_scores_and_documents = sorted(zip(similarity_scores, documents),
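
The effect of the changed line can be illustrated with a small standalone sketch. It does not use FARM or the actual `FARMRanker`. The `result` structure below is a hand-written stand-in that only mimics the `predictions`, `label`, and `probability` keys seen in the diff, and the helper `similarity_score` and the document names are hypothetical. The assumption, taken from the changed line, is that `probability` refers to the predicted label, so a confident "0" prediction has to be converted into a low similarity score.

```python
from typing import Dict, List


def similarity_score(pred: Dict) -> float:
    """Map a prediction to the probability that the pair is similar (label "1").

    Hypothetical helper: assumes `probability` belongs to the predicted label.
    """
    prob = pred["probability"]
    return prob if pred["label"] == "1" else 1 - prob


# Hand-written stand-in for the inferencer output (not real model output).
result: List[Dict] = [
    {"predictions": [
        {"label": "1", "probability": 0.9},  # confidently similar
        {"label": "0", "probability": 0.8},  # confidently dissimilar
    ]},
    {"predictions": [
        {"label": "0", "probability": 0.6},  # weakly dissimilar
    ]},
]
documents = ["doc_a", "doc_b", "doc_c"]  # placeholder documents

# Flatten the batches and convert each prediction to a similarity score.
similarity_scores = [
    similarity_score(pred) for preds in result for pred in preds["predictions"]
]

# Rank documents by descending similarity.
ranked = sorted(zip(similarity_scores, documents), key=lambda pair: pair[0], reverse=True)
ranked_documents = [doc for _, doc in ranked]
```

With these inputs, `ranked_documents` places `doc_a` first, then `doc_c`, then `doc_b`. Using the raw `probability` as the score, as in the removed line, would have ranked the confidently dissimilar `doc_b` above `doc_c`.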