haystack/test at a92f1860f6d77bf14bbda3126a61a84aa5f69bc4

yujunjun/haystack

mirror of https://github.com/deepset-ai/haystack.git synced 2025-09-22 06:33:43 +00:00

Shahrukh Khan 4822536886

Add ImageToTextConverter and PDFToTextOCRConverter that utilize OCR (#1349)

* add image.py converter

* add PDFtoImageConverter

* add init to PDFtoImageConverter and classes to __init__

* update imagetotext pipeline

* revert change in base.py in file_conv

* Update base.py

* Update pdf.py

* add ocr file_converter testcase & update dockerfile

* fix tesseract exception message typo

* fix _image_to_text docstring

* add tesseract installation to CI

* add content test for PDF OCR converter

* update PDFToTextOCRConverter constructor docstring

* replace image files with tmp paths for image.py convert

* Update README.md

Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>

2021-09-01 16:42:25 +02:00

Implement OpenSearch ANN (#1225)

2021-07-26 10:52:52 +02:00

Refactor replicas config for Ray Pipelines (#1378)

2021-08-31 10:14:55 +02:00

conftest.py

delete_all_documents() replaced by delete_documents() (#1377)

2021-08-30 15:18:28 +02:00

pytest.ini

Add Longform-QA (LFQA), Seq2SeqGenerator for generative QA and Retribert Retriever (#1086)

2021-06-14 17:53:43 +02:00

test_classifier.py

Add FARMClassifier node for Document Classification (#1265)

2021-07-13 21:44:26 +02:00

test_connector.py

Editing docs read.me for new docs website workflow (#1372)

2021-08-30 14:59:40 +02:00

test_document_store.py

delete_all_documents() replaced by delete_documents() (#1377)

2021-08-30 15:18:28 +02:00

test_eval.py

Add new QA eval metric: Semantic Answer Similarity (SAS) (#1338)

2021-08-12 14:31:48 +02:00

test_faiss_and_milvus.py

delete_all_documents() replaced by delete_documents() (#1377)

2021-08-30 15:18:28 +02:00

test_file_converter.py

Add ImageToTextConverter and PDFToTextOCRConverter that utilize OCR (#1349)

2021-09-01 16:42:25 +02:00

test_generator.py

Improve document stores unit test parametrization (#1202)

2021-06-22 16:08:23 +02:00

test_knowledge_graph.py

knowledge graph example (#934)

2021-04-08 14:05:33 +02:00

test_pipeline.py

Removing probability field from answers in favor of score field (#1340)

2021-08-17 10:27:11 +02:00

test_preprocessor.py

Fix validation for split_respect_sentence_boundary in Preprocessor (#869)

2021-03-04 15:09:08 +01:00

test_question_generator.py

Add QuestionGenerator (#1267)

2021-07-26 17:20:43 +02:00

test_ranker.py

Add SentenceTransformersRanker with pre-trained Cross-Encoder (#1209)

2021-07-07 17:31:45 +02:00

test_ray.py

Refactor replicas config for Ray Pipelines (#1378)

2021-08-31 10:14:55 +02:00

test_reader.py

Removing probability field from answers in favor of score field (#1340)

2021-08-17 10:27:11 +02:00

test_rest_api.py

[pipeline] Allow for batch indexing when using Pipelines fix #1168 (#1231)

2021-06-30 14:13:46 +02:00

test_retriever.py

Improve document stores unit test parametrization (#1202)

2021-06-22 16:08:23 +02:00

test_schema.py

Using text hash as id to prevent document duplication (#1000)

2021-05-17 17:51:52 +02:00

test_summarizer.py

Adding Translator (standalone component & wrapper for pipelines) (#782)

2021-02-12 15:58:26 +01:00

test_translator.py

Adding Translator (standalone component & wrapper for pipelines) (#782)

2021-02-12 15:58:26 +01:00

test_utils.py

[RAG] Integrate "Retrieval-Augmented Generation" with Haystack (#484)

2020-10-30 18:06:02 +01:00

test_weaviate.py

delete_all_documents() replaced by delete_documents() (#1377)

2021-08-30 15:18:28 +02:00