haystack

mirror of https://github.com/deepset-ai/haystack.git synced 2025-06-26 22:00:13 +00:00

Author	SHA1	Message	Date
Silvano Cerza	1ce12c7a6a	Remove example (#7458)	2024-04-03 14:27:43 +02:00
Silvano Cerza	5aee378baf	chore: Remove all examples and point to cookbooks repo (#7350) * Remove all examples and point to cookbooks repo * Remove workflow testing examples	2024-03-12 18:04:39 +01:00
Daniel Barker	e4f37e9460	Fixed pipeline import statement (#7348)	2024-03-12 15:12:35 +01:00
Sebastian Husch Lee	ceda4cd655	feat: Add support for `device_map` (#6679) * Getting device_map working to support 8bit loading and multi device inference * Update to take account the device specified by the user * add release notes * Add device_map support for ExtractiveReader * Update test * Update to model that doesn't have issues * Update test * Update pytest approx * Update release notes * Start supporting device map * Update ExtractiveReader to use new ComponentDevice * Update similarity ranker to follow extractive reader implementation * Fixing pylint * Make mypy mostly happy * Add new unit test to test device_map * Adding unit tests * Some refactoring * Add more tests * Add more tests * Add another unit test * Update first_device property to return a ComponentDevice to be able to use the to methods * Updating tests for test_device * Update tests and now explicitly modify device_map in model_kwargs * Update haystack/utils/hf.py Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com> * Make mypy happy * mypy * Remove unneeded optional flag * Update ExtractiveReader with new logic * Update ranker to follow new logic * Removing unneeded code * Make mypy happy * fxi pylint * Fix test * Adding unit tests for device_map="auto" * Add unit tests for ranker * PR comments * Make util method * Adding unit tests * Fix type annotation * Fix pylint * Fix test --------- Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>	2024-01-30 13:47:57 +01:00
Vladimir Blagojevic	c47b82c54f	Remove pipeline_utils package and dependent code (#6806)	2024-01-23 18:40:43 +01:00
ZanSara	288ed150c9	feat!: Rename `model_name` or `model_name_or_path` to `model` in all Embedder classes (#6733) * rename model parameter in the openai doc embedder * fix tests for openai doc embedder * rename model parameter in the openai text embedder * fix tests for openai text embedder * rename model parameter in the st doc embedder * fix tests for st doc embedder * rename model parameter in the st backend * fix tests for st backend * rename model parameter in the st text embedder * fix tests for st text embedder * fix docstring * fix pipeline utils * fix e2e * reno * fix the indexing pipeline _create_embedder function * fix e2e eval rag pipeline * pytest	2024-01-12 15:30:17 +01:00
ZanSara	79d67b0338	expand example to use bytestream (#6718)	2024-01-11 12:04:25 +01:00
Massimiliano Pippi	e1ec4e5e4d	refact!: Remove symbols under the `haystack.document_stores` namespace (#6714) * remove symbols under the haystack.document_stores namespace * Update haystack/document_stores/types/protocol.py Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com> * fix * same for retrievers * leftovers * more leftovers * add relnote * leftovers * one more * fix examples --------- Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>	2024-01-10 21:20:42 +01:00
ZanSara	9fe80fd225	feat: Add example script about routing metadata to converters in indexing pipelines (#6702) * support single metadata dict in markdown2document * reno * unwrap list * direct key access * typing * add example of indexing pipeline using Multiplexer * reno	2024-01-09 14:59:22 +01:00
Massimiliano Pippi	93b2aaee09	chore: move `DocumentJoiner` to new `joiners` package (#6692) * move DocumentJoiner to new joiners package * relnote * leftovers * fix docstrings generation * fix unrelated pydoc misconfiguration * more unrelated work, yay! * fix assertions	2024-01-08 22:06:27 +01:00
Stefano Fiorucci	c773c30c66	refactor!: rename all remaining `metadata` to `meta` (#6650) * change metadata to meta * release note	2023-12-28 12:18:15 +01:00
Vladimir Blagojevic	506ab81d26	chore: Rename GPT generators, deprecate old names (#6626)	2023-12-22 19:37:29 +01:00
Stefano Fiorucci	7cc6080dfa	chore: replace metadata w meta in tests/examples (#6612) * replace metadata w meta in tests/examples * do not touch already broken e2e tests * Revert "do not touch already broken e2e tests" This reverts commit 1f911920d98954b57daacfe8d8ed02fd77d136db.	2023-12-21 14:09:31 +01:00
ZanSara	ae5297bfd7	example: self-correcting loop for RAG (#6420) * add example * docstrings * reno * use condrouter * move functions * tests * reno * add component * reno * add tests * mypy * pylint * logger * module name * multiplexer * draw * query_multiplexer * reno * typo	2023-12-20 11:35:05 +01:00
Vladimir Blagojevic	628e8aa3d4	feat: Improve getting started examples (#6510) * Improve rag and indexing pipelines * Update examples * Simplify user interface and code, improve embedder model * Improve default vals for embedder * resolve typing * resolve typing 2 * Fix unit test --------- Co-authored-by: Timo Möller <timo.moeller@deepset.ai>	2023-12-09 19:01:13 +01:00
Vladimir Blagojevic	008a322023	feat: Add Indexing Pipeline (#6424) * Add build_indexing_pipeline utils function * Pylint fixes * Move into another package to avoid circular deps * Revert change * Revert haystack/utils/__init__.py change * Add example * Use DocumentStore type, remove typing checks	2023-12-04 16:08:53 +01:00
ZanSara	a38f871dbd	feat: Add RAG pipeline (#6461) * add rag pipeline * Update examples/getting_started/rag.py Co-authored-by: Massimiliano Pippi <mpippi@gmail.com> --------- Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com> Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>	2023-12-04 15:25:29 +01:00
Julian Risch	19ff30217c	docs: Add RAG pipeline example (#6446)	2023-11-30 14:38:15 +01:00
Massimiliano Pippi	00e1dd6eb8	chore: rearrange the `core` package, move tests and clean up (#6427) * rearrange code * fix tests * relnote * merge test modules * remove extra * rearrange draw tests * forgot * remove unused import	2023-11-28 09:58:56 +01:00
Julian Risch	c3a5d0d32f	docs: Add indexing example (#6412) * docs: Add indexing example * use Path for current directory	2023-11-27 18:44:44 +01:00
Silvano Cerza	db759b0717	Add black step when testing examples (#6425)	2023-11-27 15:01:33 +01:00
Malte Pietsch	09b4f53ce5	docs: Add example for loop in pipeline to autocorrect JSON (#6418) * add example for pipeline loop * add pydantic to CI * Fix comment --------- Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>	2023-11-27 13:29:16 +01:00
Massimiliano Pippi	9a8bef63c9	move snippets up one folder	2023-11-24 15:54:23 +01:00
Silvano Cerza	e6637f5ec2	Fix all tests	2023-11-24 14:48:43 +01:00
Massimiliano Pippi	09e7831f60	clean up 1.x code --------- Co-authored-by: Silvano Cerza <silvanocerza@gmail.com>	2023-11-24 11:47:47 +01:00
Timo Moeller	b34c35d982	initial (#6355)	2023-11-23 10:32:54 +01:00
Stefano Fiorucci	92a8704de4	mypy ignore specific errors (#6278)	2023-11-10 18:10:38 +01:00
Julian Risch	59e89b1031	test: Remove anthropic from "getting started" example test (#6024)	2023-10-12 22:36:49 +02:00
Nicola Procopio	c102b152dc	fix: Run update_embeddings in examples (#6008) * added hybrid search example Added an example about hybrid search for faq pipeline on covid dataset * formatted with back formatter * renamed document * fixed * fixed typos * added test added test for hybrid search * fixed withespaces * removed test for hybrid search * fixed pylint * commented logging * updated hybrid search example * release notes * Update hybrid_search_faq_pipeline.py-815df846dca7e872.yaml * Update hybrid_search_faq_pipeline.py * mention hybrid search example in release notes * reduce installed dependencies in examples test workflow * do not install cuda dependencies * skip models if API key not set; delete document indices * skip models if API key not set; delete document indices * skip models if API key not set; delete document indices * keep roberta-base model and inference extra * pylint * disable pylint no-logging-basicconfig rule --------- Co-authored-by: Julian Risch <julian.risch@deepset.ai>	2023-10-10 16:38:52 +02:00
Timo Moeller	d048bb5352	docs: Add minimal getting started code to showcase haystack + RAG (#5578) * init * Change question * Add TODO comment * Addressing feedback * Add local folder option. Move additional functions inside haystack.utils for easier imports * Apply Daria's review suggestions Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> * Add integration test * change string formatting Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com> * Add outputparser to HF * Exclude anthropic test --------- Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>	2023-09-06 12:14:08 +02:00
Vladimir Blagojevic	6787ad2435	fix: Improve imports for new rankers (#5696) * Proper imports for new rankers * Small fix	2023-08-31 13:33:29 +02:00
Vladimir Blagojevic	2118f68769	feat: Add domain scoping to WebRetriever (#5587) * WebSearch: add allowed_domains scoped search * Add talk to website example * Add release note * Add allowed_domains to WebSearch * Minor fix --------- Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>	2023-08-28 20:02:02 +02:00
Vladimir Blagojevic	da67700318	Rename web_lfqa_improved and update questions (#5588)	2023-08-17 17:10:49 +02:00
Vladimir Blagojevic	a75b9dd4bb	feat: LinkContentFetcher - add content-type resolution, user agent switching, PDF handler (#5374) * Add content type resolution, pdf handler, user agent switching --------- Co-authored-by: Daria Fokina <daria.fokina@deepset.ai> Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>	2023-08-09 18:14:04 +02:00
Vladimir Blagojevic	abc6737e63	feat: Improve LFQA Web Example (#5504) * Improve web_lfqa example * Turn off pylint for logging setup * Another way to turn off logging	2023-08-04 14:20:06 +02:00
Vladimir Blagojevic	1876c41f07	feat: Add LostInTheMiddleRanker (#5457) * Add lost in the middle ranker * Add release note * Julian's feedback: more precise version of truncate * Better comments for the litm algorithm * Sebastian PR feedback * Add check for invalid values of word_count_threshold * Remove _truncate as it is not needed any more --------- Co-authored-by: Darja Fokina <daria.f93@gmail.com>	2023-08-02 17:05:13 +02:00
Vladimir Blagojevic	40a2e9b56a	refactor: Update WebRetriever to use LinkContentFetcher (#5229) * Refactor WebRetriever to use LinkContentFetcher * PR feedback --------- Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>	2023-08-02 12:45:03 +02:00
Vladimir Blagojevic	540d0fad97	feat: Add DiversityRanker (#5398) * Introduce DiversityRanker * improve most_diverse_order speed * Compute mean for numerical stability * Add release note * Add cosine similarity * Test both dot product and cosine similarity * Add pydocs hook --------- Co-authored-by: Michel Bartels <login@michelbartels.com>	2023-08-01 12:48:34 +02:00
Nicola Procopio	8a2ab82651	feat: Added hybrid search example (#5376) * added hybrid search example Added an example about hybrid search for faq pipeline on covid dataset * formatted with back formatter * renamed document * fixed * fixed typos * added test added test for hybrid search * fixed withespaces * removed test for hybrid search * fixed pylint * commented logging	2023-07-24 12:54:21 +02:00
Vladimir Blagojevic	597df1414c	feat: Update Anthropic Claude support with the latest models, new streaming API, context window sizes (#5406) * Update Claude support with the latest models, new streaming API, context window sizes * Use Github Anthropic SDK link for tokenizer, revert _init_tokenizer * Change example key name to ANTHROPIC_API_KEY	2023-07-21 13:33:07 +02:00
Vladimir Blagojevic	f21005f8ea	refactor: Extract link retrieval from WebRetriever, introduce LinkContentRetriever (#5227) * Extract link retrieval from WebRetriever, introduce LinkContentRetriever * Add example --------- Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com> Co-authored-by: Daria Fokina <daria.f93@gmail.com>	2023-07-13 12:54:40 +02:00
Bilge Yücel	6a1b6b1ae3	feat: Update ConversationalAgent (#5065) * feat: Update ConversationalAgent * Add Tools * Add test * Change default params * fix tests * Fix circular import error * Update conversational-agent prompt * Add conversational-agent-without-tools to legacy list * Add warning to add tools to conversational agent * Add callable tools * Add example script * Fix linter errors * Update ConversationalAgent depending on the existance of tools * Initialize the base Agent with different arguments when there's tool * Inject memory to the prompt in both cases, update prompts accordingly * Override the add_tools method to prevent adding tools to ConversationalAgent without tools * Update test * Fix linter error * Remove unused import * Update docstrings and api reference * Fix imports and doc string code snippet * docstrings update * Update conversational.py * Mock PromptNode * Prevent circular import error * Add max_steps to the ConversationalAgent * Update resolver description * Add prompt_template as parameter * Change docstring --------- Co-authored-by: Darja Fokina <daria.f93@gmail.com>	2023-06-20 13:09:21 +03:00
Vladimir Blagojevic	8d8de65492	Add AgentToolLogger, unit test, and example usage (#5087)	2023-06-15 08:43:20 +02:00
ZanSara	9612aa90bb	fix examples (#5041)	2023-05-29 15:15:38 +02:00
Vladimir Blagojevic	9d52998b25	feat: Add conversational agent (#4931)	2023-05-17 15:19:09 +02:00
Vladimir Blagojevic	8091ced8d5	refactor: Extract ToolsManager, add it to Agent by composition (#4794) * Extract ToolsManager, add it to Agent by the composition * PR feedback Massi --------- Co-authored-by: Massimiliano Pippi <mpippi@gmail.com> Co-authored-by: Darja Fokina <daria.f93@gmail.com>	2023-05-03 16:45:40 +02:00
Vladimir Blagojevic	3fefc475b4	fix: Deprecate Seq2SeqGenerator and RAGenerator (#4745) * Deprecate Seq2SeqGenerator * changed the warning to include suggestion * Added example and msg to API reference docs * Added RAG deprecation * renamed name to adapt to naming conven * update docstrings --------- Co-authored-by: Mayank Jobanputra <mayankjobanputra@gmail.com> Co-authored-by: Darja Fokina <daria.f93@gmail.com>	2023-04-26 13:59:35 +02:00
Vladimir Blagojevic	be25655663	feat: Add agent tools (#4437) * Initial commit, add search_engine * Add TopPSampler * Add more TopPSampler unit tests * Remove SearchEngineSampler (converted to TopPSampler) * Add some basic WebSearch unit tests * Rename unit tests * Add WebRetriever into agent_tools * Adjust to WebRetriever * Add WebRetriever mode [snippet|document] * Minor changes * SerperDev: add peopleAlsoAsk search results * First agent for hotpotqa * Making WebRetriever work on hotpotqa * refactor: minor WebRetriever improvements (#4377) * refactor: remove doc ids rebuild + antecipate cache * refactor: improve caching, fix Document ids * Minor WebRetriever improvements * Overlooked minor fixes * feat: add Bing API as search engine * refactor: let kwargs pass-through * feat: increase search context * check sampler result, improve batch typing * refactor: increase mypy compliance * Initial commit, add search_engine * Add TopPSampler * Add more TopPSampler unit tests * Remove SearchEngineSampler (converted to TopPSampler) * Add some basic WebSearch unit tests * Rename unit tests * Add WebRetriever into agent_tools * Adjust to WebRetriever * Add WebRetriever mode [snippet|document] * Minor changes * SerperDev: add peopleAlsoAsk search results * First agent for hotpotqa * Making WebRetriever work on hotpotqa * refactor: minor WebRetriever improvements (#4377) * refactor: remove doc ids rebuild + antecipate cache * refactor: improve caching, fix Document ids * Minor WebRetriever improvements * Overlooked minor fixes * feat: add Bing API as search engine * refactor: let kwargs pass-through * feat: increase search context * check sampler result, improve batch typing * refactor: increase mypy compliance * Fix mypy * Minor example fixes * Fix the descriptions * PR feedback updates * More fixes * TopPSampler: handle top p None value, add unit test * Add top_k to WebSearch * Use boilerpy3 instead trafilatura * Remove date finding * Add more WebRetriever docs * Refactor long methods * making the preprocessor optional * hide WebSearch and make NeuralWebSearch a pipeline * remove unused imports * add WebQAPipeline and split example into two * change example search engine to SerperDev * Turn off progress bars in WebRetriever's PreProcesssor * Agent tool examples - final updates * Add webqa test, search results ranking scores * Better answer box handling for SerperDev and SerpAPI * Minor fixes * pylint * pylint fixes * extract TopPSampler from WebRetriever * use sampler only for WebRetriever modes other than snippet * add web retriever tests * add web retriever tests * exclude rdflib@6.3.2 due to license issues * add test for preprocessed docs and kwargs examples in docstrings * Move test_webqa_pipeline to test/pipelines * change docstring for join_documents_and_scores * Use WebQAPipeline in examples/web_lfqa.py * Use WebQAPipeline in examples/web_lfqa.py * Move test_webqa_pipeline to e2e * Updated lg * Updated lg * Sampler added automatically in WebQAPipeline, no need to add it * Updated lg * Updated lg * :ignore Update agent tools examples to new templates (#4503) * Update examples to new templates * Add print back * fix linting and black format issues --------- Co-authored-by: Daniel Bichuetti <daniel.bichuetti@gmail.com> Co-authored-by: agnieszka-m <amarzec13@gmail.com> Co-authored-by: Julian Risch <julian.risch@deepset.ai>	2023-03-27 18:14:58 +02:00
Massimiliano Pippi	5e0de4a9ed	do not run launch_es in the CI (#3981)	2023-01-27 16:43:17 +01:00
Tuana Celik	e1502c8029	Adding Example Scripts to Haystack (#3588) * add 2 example scripts * fixing faq script * updating PR based on comments * black * updating s3 buckets * first attempt at testing * Add basic tests to two scripts PR: #3588 * make tests runnable * reformat files * only run in PRs touching an example Co-authored-by: bilgeyucel <bilgeyucel96@gmail.com> Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>	2023-01-27 14:54:59 +01:00

50 Commits