haystack

mirror of https://github.com/deepset-ai/haystack.git synced 2025-07-19 06:52:56 +00:00

Author	SHA1	Message	Date
Stefano Fiorucci	24405f851c	refactor: `InMemoryDocumentStore` - manage documents without embedding & fix mypy errors (#4113 ) * refactoring and test * try to replace error with warning * more expressive and robust get_scores methods * make get_scores methods internal	2023-02-14 17:43:11 +01:00
Silvano Cerza	d86a511cc1	Fix Docker images test on release (#4153 )	2023-02-14 14:18:49 +01:00
bogdankostic	4a88fae1e7	Update annotation tool readme (#4123 )	2023-02-14 09:53:27 +01:00
Sebastian	75ef959678	feat: Update OpenAIAnswerGenerator defaults and with learnings from PromptNode (#4038 ) * added instruction_prompt and update defaults * Change back max_tokens * Code formatting * Starting to update instruction_prompt to be a PromptTemplate * Using PromptTemplate in OpenAIAnswerGenerator * Removed hardcoded value * pylint and make examples and examples_context optional prompt parameters * Added new test for when prompt length goes past max token limit * Improve doc strings. * Make "text-davinci-003" the new default model * Renaming variable to prompt_template and name to question-answering-with-examples * Reduced repetitive code. * Added some comments to explain key logic for future debuggers * Update docs for max_tokens and increase default * Updating variable name to prompt_template and docs. * Updated test and handled Answer case where no documents are used. * Slight update to docs. * Adding more doc strings * lg updates * Blackify --------- Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai> Co-authored-by: agnieszka-m <amarzec13@gmail.com>	2023-02-12 00:08:07 +01:00
Silvano Cerza	3cdfe9ca40	Revert changes introduced in PR #4124 (#4137 )	2023-02-10 17:54:20 +01:00
Silvano Cerza	d9a7e8011f	Add load arg to docker/bake-action before testing Docker images (#4124 )	2023-02-10 17:41:27 +01:00
bogdankostic	27aaa92800	docs: Remove some classes regarding PromptNode from API reference docs (#4132 )	2023-02-10 15:56:38 +01:00
Vladimir Blagojevic	d839b9314f	Update PromptTemplate tests (#4131 )	2023-02-10 15:24:01 +01:00
bogdankostic	05950719ba	fix: Deduplicate same Documents in isolated evaluation of Reader (#4114 ) * Deduplicate same Documents in one MultiLabel * Add tests * Update label * Update label * Update test * Update test * Revert change to check CI * Revert reversion * Use deepcopy * Update tests	2023-02-10 13:55:14 +01:00
Agnieszka Marzec	3c793e4edc	Docs: Update docstrings (#4119 ) * Update docstrings * Blackify * Bring back the template wording * Blackify	2023-02-10 11:51:51 +01:00
Silvano Cerza	2cc938ff90	ci: Add workflow to label PRs that edit docstrings (#4115 ) * Add workflow to label PRs that edit docstrings * Add python-version arg in setup-python steps * Run workflow only in haystack and rest_api python files edit * Fix labeling job * Fix labeling conditional * Fix files globbing in docstrings_checksum.py * Fix typing * Rework workflow to use a single job	2023-02-09 18:57:30 +01:00
Silvano Cerza	0b23f84205	Exclude .github folder from triggering tests in CI (#4120 )	2023-02-09 18:07:27 +01:00
Jack Butler	e6b6f70ae2	fix: Fix `TableTextRetriever` for input consisting of tables only (#4048 ) * fix: update kwargs for TriAdaptiveModel * fix: squeeze batch for TTR inference * test: add test for ttr + dataframe case * test: update and reorganise ttr tests * refactor: make triadaptive model handle shapes * refactor: remove duplicate reshaping * refactor: rename test with duplicate name * fix: add device assignment back to TTR * fix: remove duplicated vars in test --------- Co-authored-by: bogdankostic <bogdankostic@web.de>	2023-02-09 11:38:16 +01:00
bogdankostic	986472c26f	feat: Add BM25 support for tables in InMemoryDocumentStore (#4090 ) * Add BM25 support for tables in InMemoryDocumentStore * Add table type to query method * Fix import order * Adapt tests	2023-02-09 10:47:35 +01:00
Mayank Jobanputra	93962c09fc	fix: fix torchaudio version (#4102 ) * fix torchaudio version * added comment for keeping torchaudio last * removed torchaudio from base	2023-02-09 15:14:10 +05:30
oryx1729	8ecadd1cac	fix: query filters in REST API (#4105 ) * Remove legacy _format_filters() * Remove test case	2023-02-09 10:42:31 +01:00
Bijay Gurung	79f57d8460	Proposal: Add a JsonConverter node (#3959 ) * Add Proposal: JsonConverter * Add jsonl support + schema to JsonConverter Proposal * Remove format option from JsonConverter Proposal --------- Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>	2023-02-09 09:57:00 +01:00
hsm207	508d9f6b32	feat: add support for custom headers (#4040 )	2023-02-09 07:08:40 +01:00
Silvano Cerza	adf4a3ea2f	Fix pylint CI check running with no files (#4097 )	2023-02-08 16:33:07 +01:00
Silvano Cerza	274746db07	style: Update black (#4101 ) * Update black version * Format file with new black style * Update black pre-commit hook version	2023-02-08 15:34:43 +01:00
Sebastian	1bbf10a376	Remove double batching in retrieve_batch (#4014 ) * Removed double batching around embed_queries * Add back tests for retrieve_batch for dpr and embedding retrievers * Updated table-text-retriever to not double batch * Fixing pylint * Update to test * Remove code breaking test * Updating dev comment to be clearer	2023-02-08 14:39:20 +01:00
Silvano Cerza	c66f855caf	Add missing env vars in rest_api CI tests (#4098 )	2023-02-08 12:48:20 +01:00
Sebastian	01d39df863	feat: Update allowed models to be used with Prompt Node (#4018 ) * Update allowed models to be used with Prompt Node * Added try except block around the config to skip over OpenAI models. * Fixing tests * Adding warning message * Adding test for different HF models that could be used in prompt node	2023-02-08 12:47:52 +01:00
Agnieszka Marzec	8135e75139	Add shaper to api docs (#4083 )	2023-02-08 12:15:08 +01:00
Stefano Fiorucci	5c009c2a1a	feat: OpenAI - warn users if `max_tokens` is too short (#4094 ) * warn users if max_tokens is too short * skip test if not API KEY * add counters * correctly run precommit	2023-02-08 10:39:40 +01:00
tstadel	92c58cfda1	feat: Support multiple document_ids in Answer object (for generative QA) (#4062 ) * initial version without shapers * set document_ids for BaseGenerator * introduce question-answering-with-references template * better prompt * make PromptTemplate control output_variable * update schema * fix add_doc_meta_data_to_answer * Revert "fix add_doc_meta_data_to_answer" This reverts commit b994db423ad8272c140ce2b785cf359d55383ff9. * fix add_doc_meta_data_to_answer * fix eval * fix pylint * fix pinecone * fix other tests * fix test * fix flaky test * Revert "fix flaky test" This reverts commit 7ab04275ffaaaca96b4477325ba05d5f34d38775. * adjust docstrings * make Label loading backward-compatible * fix Label backward compatibility for pinecone * fix Label backward compatibility for search engines * fix Label backward compatibility for deepset Cloud * fix tests * fix None issue * fix test_write_feedback * add tests for legacy label support * add document_id test for pinecone * reduce unnecessary contents * add comment to pinecone test	2023-02-08 08:37:22 +01:00
Silvano Cerza	5689c43e7e	ci: Make tests run conditionally in CI (#4086 ) * Make tests run conditionally in CI * Move rest_api test into separate workflow * Avoid running tests.yml when rest_api is modified	2023-02-07 21:16:56 +01:00
Zoltan Fedor	a3016f065f	feat: Support multiple `RayPipelines` (#4078 )	2023-02-07 11:01:07 +01:00
Silvano Cerza	3e4a2201df	ci: Change actionlint pre-commit hook to use Dockerized tool (#4060 ) * Change actionlint pre-commit hook to use Dockerized tool * Add ignore rule for actionlint --------- Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>	2023-02-07 09:34:25 +01:00
Julian Risch	0e282e5ca4	refactor: replace mutable default arguments (#4070 ) * refactor: replace mutable default arguments * change type annotation in BasePreProcessor to Optional[List]	2023-02-07 09:30:33 +01:00
Vladimir Blagojevic	3273a2714d	fix: Add PromptTemplate __repr__ method (#4058 ) Co-authored-by: ZanSara <sarazanzo94@gmail.com>	2023-02-07 08:14:32 +01:00
Sebastian	a9f13d4641	feat: Allow all training options for training a SentenceTransformers EmbeddingRetriever (#4026 ) * Add additional options to pass to the SentenceTransformers trainer * Make options accessible to the EmbeddingRetriever.train * Update file-converters.yml * Update transformers-img-to-text.yml * Update 3550-csv-converter.md * move type: ignore to correct line * Moving type ignore again * Fixing pylint and mypy * Update haystack/nodes/retriever/_embedding_encoder.py Co-authored-by: bogdankostic <bogdankostic@web.de> * Update haystack/nodes/retriever/_embedding_encoder.py Co-authored-by: bogdankostic <bogdankostic@web.de> * Update haystack/nodes/retriever/_embedding_encoder.py Co-authored-by: bogdankostic <bogdankostic@web.de> * Updated docstring to be less misleading. --------- Co-authored-by: bogdankostic <bogdankostic@web.de>	2023-02-07 08:05:21 +01:00
Silvano Cerza	bcf3bfdf79	Fix pylint workflow check running on tests files (#4076 )	2023-02-06 19:41:36 +01:00
Julian Risch	51f30487e1	fix: add inner query for mysql compatibility (#4068 )	2023-02-06 18:18:25 +01:00
Silvano Cerza	9cd94f3dc3	ci: Move formatting and linting checks out of tests.yml (#4046 ) * Move formatting and linting checks out of tests.yml * Revert "Move formatting and linting checks out of tests.yml" This reverts commit b88b54b7e6404ce10401f308770348465e44b4fc. * Move pylint and mypy out of tests.yml * Fix black version * Handle skipped but required checks	2023-02-06 16:47:48 +01:00
Zoltan Fedor	f4a30a552a	fix: use correct count of outgoing edges in RayPipeline (#4066 )	2023-02-06 10:52:32 +01:00
Julian Risch	d819d6badf	proposal: Add Agents for extended LLM support (#3925 ) * draft proposal * add link to colab notebook (api keys required) * Add alternative name ideas for MRKLAgent * Breakdown of agent steps * Added more sections * Add even more sections * simplify tool/action mentions, shorten * agents as new abstraction instead of BaseComponent * agent tools can be pipelines or nodes --------- Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>	2023-02-06 09:47:10 +01:00
Massimiliano Pippi	5e65905659	fix workflow (#4055 )	2023-02-06 08:40:13 +01:00
Stefano Fiorucci	b9ab7b3ca2	fix: make the crawler more robust on Windows (#4049 ) * first try * simplify the code a bit * fix; better docstrings * add URL	2023-02-03 16:43:18 +01:00
ZanSara	76db26f228	logging-format-interpolation (#3907 )	2023-02-03 13:30:56 +01:00
Massimiliano Pippi	8824f3a10a	re-organize pydoc config files (#4042 )	2023-02-03 12:51:10 +01:00
Jack Butler	f006eded7d	fix: allow Biadaptive & Triadaptive to work with EarlyStopping (#4033 ) * fix: allow str when saving tri/bi-adaptive models * fix: make trainer model loading class-agnostic * test: add test for DPR with EarlyStopping * refactor: simplify model reloading via classmethod --------- Co-authored-by: Julian Risch <julian.risch@deepset.ai>	2023-02-03 11:13:18 +01:00
Silvano Cerza	a092eac2c7	Add missing env var in PyPi release slack notification (#4052 )	2023-02-03 11:03:01 +01:00
Silvano Cerza	6a9cb8651b	Fix pylint version to prevent crash (#4043 )	2023-02-02 17:57:39 +01:00
Massimiliano Pippi	76bb105388	chore: remove unneeded files (#4036 ) * remove unneeded files * readme file should stay	2023-02-02 15:38:56 +01:00
tstadel	9611b64ec5	fix: document retrieval metrics for non-document_id document_relevance_criteria (#3885 ) * fix document retrieval metrics for all document_relevance_criteria * fix tests * fix eval_batch metrics * small refactorings * evaluate metrics on label level * document retrieval tests added * fix pylint * fix test * support file retrieval * add comment about threshold * rename test	2023-02-02 15:00:07 +01:00
Silvano Cerza	e62d24d0eb	ci: Add linting of workflow and related pre-commit hook (#4032 ) * Add actionlint pre-commit hook * Add workflow to lint workflows * Remove unused input in Python Cache action * Move from deprecated set-output syntax to new one * Add actionlint config to specify self-hosted runners labels	2023-02-02 14:33:23 +01:00
Massimiliano Pippi	2878c57645	Update pyproject.toml (#4035 )	2023-02-02 11:59:17 +01:00
Silvano Cerza	d79d39b28a	Bump act10ns/slack from v1 to v2 (#4031 )	2023-02-02 09:39:36 +01:00
Silvano Cerza	938cb62144	Fix PyPi release workflow (#4029 )	2023-02-02 09:36:23 +01:00

3803 Commits