mirror of https://github.com/deepset-ai/haystack.git synced 2026-01-10 06:07:08 +00:00

* Add api pages

* Add latest docstring and tutorial changes

* First sweep of usage docs

* Add link to conversion script

* Add import statements

* Add summarization page

* Add web crawler documentation

* Add confidence scores usage

* Add crawler api docs

* Regenerate api docs

* Update summarizer and translator api

* Add indentation (pydoc-markdown 3.10.1)

* Comment out metadata

* Remove Finder deprecation message

* Remove Finder in FAQ

* Update tutorial link

* Incorporate reviewer feedback

* Regen api docs

* Add type annotations

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

2021-04-22 16:45:29 +02:00

Translator

Texts come in different languages, and search is no exception. There are several ways to deal with this; one of them is to translate the incoming query, the documents, or the search results.

Let's imagine you have an English corpus of technical docs, but the mother tongue of many of your users is French. You can use a Translator node in your pipeline to:

1. Translate the incoming query from French to English
2. Search in your English corpus for the right document / answer
3. Translate the results back from English to French
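The three steps above can be sketched without any Haystack dependency. This is only a toy illustration of the flow, not how the Translator node is implemented: the lookup tables, the corpus, and the functions `translate`, `search` and `search_with_translation` are all hypothetical stand-ins invented for this example (a real setup would use translation models and a retriever instead).

```python
from typing import Dict, List

# Hypothetical word-for-word lookup tables standing in for real translation models.
FR_TO_EN: Dict[str, str] = {
    "installer": "install",
    "paquet": "package",
    "supprimer": "remove",
}
EN_TO_FR: Dict[str, str] = {en: fr for fr, en in FR_TO_EN.items()}

# Toy English corpus (invented for this sketch).
CORPUS: List[str] = ["install package", "remove package"]


def translate(text: str, table: Dict[str, str]) -> str:
    """Translate word by word; unknown words pass through unchanged."""
    return " ".join(table.get(word, word) for word in text.split())


def search(query: str, corpus: List[str]) -> List[str]:
    """Return documents sharing at least one word with the query, best match first."""
    query_words = set(query.split())
    scored = [(len(query_words & set(doc.split())), doc) for doc in corpus]
    ranked = sorted(scored, key=lambda pair: -pair[0])
    return [doc for score, doc in ranked if score > 0]


def search_with_translation(french_query: str) -> List[str]:
    # Step 1: translate the incoming query from French to English
    english_query = translate(french_query, FR_TO_EN)
    # Step 2: search the English corpus
    english_hits = search(english_query, CORPUS)
    # Step 3: translate the results back from English to French
    return [translate(hit, EN_TO_FR) for hit in english_hits]


results = search_with_translation("installer paquet")
print(results)
```

The point of the sketch is that the search step itself never sees French text: translation happens only at the two boundaries, which is why an existing English-only pipeline can be reused unchanged.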

Example (Stand-alone Translator)

You can use the Translator component directly to translate your query or document(s):

from haystack.schema import Document
from haystack.translator import TransformersTranslator

# A single English document to translate
DOCS = [
    Document(
        text="""Heinz von Foerster was an Austrian American scientist
                combining physics and philosophy, and widely attributed
                as the originator of Second-order cybernetics."""
    )
]

# Load an English-to-French translation model
translator = TransformersTranslator(model_name_or_path="Helsinki-NLP/opus-mt-en-fr")

# Translate the documents; query is None because no query is being translated here
res = translator.translate(documents=DOCS, query=None)

Example (Wrapping another Pipeline)

You can also wrap one of your existing pipelines and "add" the translation nodes at the beginning and at the end of your pipeline. For example, let's translate the incoming query from French to English, then do our document retrieval, and then translate the results back from English to French:

from haystack.pipeline import TranslationWrapperPipeline, DocumentSearchPipeline
from haystack.translator import TransformersTranslator

# my_dpr_retriever is assumed to be a retriever you have already initialized
pipeline = DocumentSearchPipeline(retriever=my_dpr_retriever)

in_translator = TransformersTranslator(model_name_or_path="Helsinki-NLP/opus-mt-fr-en")
out_translator = TransformersTranslator(model_name_or_path="Helsinki-NLP/opus-mt-en-fr")

pipeline_with_translation = TranslationWrapperPipeline(input_translator=in_translator,
                                                       output_translator=out_translator,
                                                       pipeline=pipeline)
