haystack

mirror of https://github.com/deepset-ai/haystack.git synced 2025-11-17 18:43:58 +00:00

Author	SHA1	Message	Date
Tuana Çelik	83d33f2aed	Update README.md (#4625 )	2023-04-07 09:09:16 +02:00
Malte Pietsch	fabf77388c	Update readme with new companies using haystack (#4621 )	2023-04-06 19:42:25 +02:00
Silvano Cerza	e85dc79eaa	test: Add pytest fixture to block requests in unit tests (#4433 ) * Add pytest fixture to block requests in unit tests * Mark test correctly as integration * Fix crawler unit test failing cause it tries to install chromedriver	2023-04-06 18:04:57 +02:00
Silvano Cerza	c3abf73332	refactor: Rework prompt tests (#4600 ) * Rework some PromptNode and PromptModel tests * Remove duplicate code in PromptNode * Fix mypy * Fix test cause of missing fixture * Revert "Fix mypy" This reverts commit e530295a06cb260d9a8bd89679534958cb3d9776. * Revert "Remove duplicate code in PromptNode" This reverts commit 4a678ae81504dcc78a737372c061d12dc8799639.	2023-04-06 14:47:44 +02:00
Agnieszka Marzec	f2c6ce39e6	Docs: Fix QuestionGenerator and Summarizer docstrings (#4594 ) * Add missing params and fix the docstrings * Add reviewer's comments	2023-04-06 13:40:56 +02:00
Silvano Cerza	ee7b25b8cf	Remove unecessary literal_eval (#4570 )	2023-04-06 13:30:45 +02:00
Tuana Çelik	e0895f0ac2	Adding missing emoji (#4613 )	2023-04-06 11:20:16 +02:00
Tuana Çelik	1a37caad79	feat: Load documents from remote - helper function (#4545 ) * first draft of the load documents from remote function * resolving comments * pylint fixes * pylint fixes * fixed import * fixed black * fixing returned instance * pythonic list comprehension * Addressed comments --------- Co-authored-by: Mayank Jobanputra <mayankjobanputra@gmail.com>	2023-04-06 10:19:35 +02:00
Massimiliano Pippi	52fb935936	build xpdf on bionic (#4606 )	2023-04-05 15:52:44 +02:00
Agnieszka Marzec	bc95de5dc1	Update models and docstrings lg (#4595 )	2023-04-04 16:48:14 +02:00
Vladimir Blagojevic	a8d283cfac	Fix HF stop words (single stop word) (#4584 )	2023-04-04 14:45:10 +02:00
ZanSara	ce61eda970	feat: Haystack CLI (#4568 ) * first implementation * only version * delete rest api management * pylint	2023-04-04 14:24:00 +02:00
Stefano Fiorucci	423b135e14	refactor: remove variadic parameters in `WebSearch` initialization; make new nodes directly importable (#4581 ) * make new nodes directly importable * avoid circular imports * rm variadic parameters in __init___ * forgotten import * update docstrings * don't expose SearchEngine	2023-04-04 14:21:26 +02:00
Agnieszka Marzec	7338e60362	Docs: Hide private modules from API docs (#4555 ) * Hide private modules and fix order * Add underscore	2023-04-04 14:07:18 +02:00
Agnieszka Marzec	7c5f9313ff	Docs: Update Whisper API. (#4539 ) * Update lg * Blackify	2023-04-04 12:32:06 +02:00
Mayank Jobanputra	ce82bfb197	chore: add citation (#4573 ) * basic structure * Added names, modified title	2023-04-04 10:10:44 +02:00
Agnieszka Marzec	c00bb1b732	Docs: Shaper API update (#4542 ) * Update Shaper API * Blackify	2023-04-04 08:21:58 +02:00
Silvano Cerza	1cc4c9c651	refactor: Refactor prompt node (#4580 ) * Refactor prompt structure * Refactor prompt tests structure * Fix pylint * Move TestPromptTemplateSyntax to test_prompt_template.py	2023-04-03 11:49:49 +02:00
ZanSara	c202866093	feat!: drop Python3.7 support (#4421 ) * drop py3.7 * importlib-metadata	2023-04-03 10:34:58 +02:00
Silvano Cerza	af02803cce	Skip flaky prompt node integration test (#4572 )	2023-04-03 09:49:30 +02:00
Massimiliano Pippi	322652c306	fix: provide a fallback for PyMuPDF (#4564 ) * add a fallback xpdf alternative to PyMuPDF * add xpdpf to the base images * to be reverted * silence mypy on conditional error * do not install pdf extras in base images * bring back the xpdf build strategy * remove leftovers from old build * fix indentation * Apply suggestions from code review Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com> * revert test workflow --------- Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>	2023-03-31 14:37:05 +02:00
Julian Risch	57415ef8ab	test: Remove duplicate test and edit docstring (#4567 )	2023-03-31 12:39:18 +02:00
Silvano Cerza	458e9f1897	Checkout correct ref in docstring-labeler.yml (#4563 )	2023-03-30 18:11:43 +02:00
Eren	b09289241b	docs: fix broken readme links (#4560 ) * docs: fix broken links * fix Azure typo	2023-03-30 15:17:00 +02:00
Eren	5c6b295fb2	fix: update tutorials link (#4559 )	2023-03-30 14:58:23 +02:00
Agnieszka Marzec	815dcdebbd	docs: Update PromptNode API docs (#4549 ) * Update docstrings * adapt test to changed logging message --------- Co-authored-by: Julian Risch <julian.risch@deepset.ai>	2023-03-30 14:27:44 +02:00
Silvano Cerza	3782ebc835	ci: Fix slack messages formatting (#4556 ) * Fix slack messages formatting * Remove unneeded file	2023-03-30 10:56:20 +02:00
Tuana Çelik	0876bc13b1	update to readme (#4533 ) * update to readme * Update README.md Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com> * Update README.md Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com> * resolving comments * Small final fixes --------- Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com> Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>	2023-03-30 10:51:01 +02:00
GitIgnoreMaybe	514dea2443	feat: Change default save_dir for FARMReader.train (#4553 ) Co-authored-by: Marco <marco.herzog@nuzzera.com>	2023-03-30 13:20:25 +05:30
Stefano Fiorucci	57f87e24a3	refactor: `OpenAIAnswerGenerator` - avoid tokenizing all documents several times (#4504 )	2023-03-29 22:38:27 +02:00
Zoltan Fedor	32091d66cb	Adding filtering support for Weaviate when used for BM25 querying (#4385 )	2023-03-29 16:51:22 +02:00
Silvano Cerza	e00f1461bc	Use bigger runner for Docker release (#4538 )	2023-03-29 13:14:46 +02:00
oryx1729	5fc84904f1	fix: update envs for the backend image of annotation tool (#4535 )	2023-03-29 12:54:21 +02:00
ZanSara	16bd7d0625	add back tutorial_running() (#4534 )	2023-03-29 12:20:24 +02:00
Silvano Cerza	78216196d1	Fix docker images testing (#4536 )	2023-03-29 12:20:06 +02:00
Silvano Cerza	85ade5c878	Fix Slack messages formatting on job failure (#4520 )	2023-03-29 09:24:41 +02:00
Massimiliano Pippi	0dfa5d6ad7	fix: do not override bake's platform definitions (#4518 ) * do not override bake's platform definitions * test * fix job name and remove override from minor version job * test * bump docker login action * fix plurals * Remove platform from matrix and test both platform in a single job * Remove branch trigger used for testing --------- Co-authored-by: Silvano Cerza <silvanocerza@gmail.com>	2023-03-28 17:57:29 +02:00
ZanSara	651be37afc	proposal: `DocumentStores` and `Retrievers` (#4370 ) * add proposal * add proposal * pr number * pr number * start second draft * second draft * node examples * phrasing * get_documents -> filter_documents	2023-03-28 16:31:42 +02:00
Agnieszka Marzec	aae2ad8e5c	Add whisper api (#4511 )	2023-03-28 15:43:59 +02:00
Silvano Cerza	f4fb8dd946	Revert "ci: Change docker_release.yml workflow to run after successful PyPi release (#4293 )" (#4513 ) This reverts commit 6e241262ada9e59359d653a779246d2ad03c1223.	2023-03-28 15:28:15 +02:00
Vladimir Blagojevic	7c9f719496	refactor: Adjust WhisperTranscriber to pipeline run methods (#4510 ) * Retrofit WhisperTranscriber run methods * Add pipeline unit test --------- Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>	2023-03-28 13:52:21 +02:00
Silvano Cerza	098342da32	Use new Slack action to send failure messages (#4464 )	2023-03-28 10:49:32 +02:00
Silvano Cerza	dbdb682225	Enhance release_docs.py (#4459 )	2023-03-28 09:56:42 +02:00
Silvano Cerza	cfb8dfd470	Fix pipeline config and agent tools hashing for telemetry (#4508 )	2023-03-28 09:41:50 +02:00
ZanSara	c777302fb4	chore: disable posthog in rest api tests (#4507 )	2023-03-27 21:12:27 +02:00
bogdankostic	ed1837c0c9	feat: Deduplicate duplicate Answers resulting from overlapping Documents in `FARMReader` (#4470 ) * Deduplicate answers resulting from document split overlap * Add tests * Fix Pylint * Adapt existing test * Incorporate PR feedback	2023-03-27 20:04:59 +02:00
github-actions[bot]	de825ded1c	Update unstable version (#4506 ) Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2023-03-27 19:01:04 +02:00
tstadel	3a7ae8239c	chore: Fix imports (#4505 )	2023-03-27 18:58:53 +02:00
ZanSara	b52bbea0a5	refactor: reduce telemetry events count (#4501 ) * remove telemetry v1 * more pipeline methods to take out * send_event_2 * reduce events * mypy * consolidate llm nodes * pylint * mypy * mypy again * remove test * mypy * pylint * black * mypy	2023-03-27 18:53:56 +02:00
Vladimir Blagojevic	be25655663	feat: Add agent tools (#4437 ) * Initial commit, add search_engine * Add TopPSampler * Add more TopPSampler unit tests * Remove SearchEngineSampler (converted to TopPSampler) * Add some basic WebSearch unit tests * Rename unit tests * Add WebRetriever into agent_tools * Adjust to WebRetriever * Add WebRetriever mode [snippet|document] * Minor changes * SerperDev: add peopleAlsoAsk search results * First agent for hotpotqa * Making WebRetriever work on hotpotqa * refactor: minor WebRetriever improvements (#4377) * refactor: remove doc ids rebuild + antecipate cache * refactor: improve caching, fix Document ids * Minor WebRetriever improvements * Overlooked minor fixes * feat: add Bing API as search engine * refactor: let kwargs pass-through * feat: increase search context * check sampler result, improve batch typing * refactor: increase mypy compliance * Initial commit, add search_engine * Add TopPSampler * Add more TopPSampler unit tests * Remove SearchEngineSampler (converted to TopPSampler) * Add some basic WebSearch unit tests * Rename unit tests * Add WebRetriever into agent_tools * Adjust to WebRetriever * Add WebRetriever mode [snippet|document] * Minor changes * SerperDev: add peopleAlsoAsk search results * First agent for hotpotqa * Making WebRetriever work on hotpotqa * refactor: minor WebRetriever improvements (#4377) * refactor: remove doc ids rebuild + antecipate cache * refactor: improve caching, fix Document ids * Minor WebRetriever improvements * Overlooked minor fixes * feat: add Bing API as search engine * refactor: let kwargs pass-through * feat: increase search context * check sampler result, improve batch typing * refactor: increase mypy compliance * Fix mypy * Minor example fixes * Fix the descriptions * PR feedback updates * More fixes * TopPSampler: handle top p None value, add unit test * Add top_k to WebSearch * Use boilerpy3 instead trafilatura * Remove date finding * Add more WebRetriever docs * Refactor long methods * making the preprocessor optional * hide WebSearch and make NeuralWebSearch a pipeline * remove unused imports * add WebQAPipeline and split example into two * change example search engine to SerperDev * Turn off progress bars in WebRetriever's PreProcesssor * Agent tool examples - final updates * Add webqa test, search results ranking scores * Better answer box handling for SerperDev and SerpAPI * Minor fixes * pylint * pylint fixes * extract TopPSampler from WebRetriever * use sampler only for WebRetriever modes other than snippet * add web retriever tests * add web retriever tests * exclude rdflib@6.3.2 due to license issues * add test for preprocessed docs and kwargs examples in docstrings * Move test_webqa_pipeline to test/pipelines * change docstring for join_documents_and_scores * Use WebQAPipeline in examples/web_lfqa.py * Use WebQAPipeline in examples/web_lfqa.py * Move test_webqa_pipeline to e2e * Updated lg * Sampler added automatically in WebQAPipeline, no need to add it * Updated lg * Updated lg * :ignore Update agent tools examples to new templates (#4503) * Update examples to new templates * Add print back * fix linting and black format issues --------- Co-authored-by: Daniel Bichuetti <daniel.bichuetti@gmail.com> Co-authored-by: agnieszka-m <amarzec13@gmail.com> Co-authored-by: Julian Risch <julian.risch@deepset.ai>	2023-03-27 18:14:58 +02:00

3597 Commits