haystack

mirror of https://github.com/deepset-ai/haystack.git synced 2025-11-12 08:03:50 +00:00

Author	SHA1	Message	Date
Sara Zan	54947cb840	Return intermediate nodes output in pipelines (#1558 ) * First rough implementation * Add a flag to dump the debug logs to the console as well * Typing run() and _dispatch_run() * Allow debug and debug_logs to be passed as arguments of run() * Avoid overwriting _debug, later we might want to store other objects in it * Put logs under a separate key of the _debug dictionary and add input and output of the node alongside it * Introduce global arguments for pipeline.run() that get applied to every node when defined * Change default values of debug variables to None, otherwise their default would override the params values * Remove a potential infinite recursion on the overridden __getattr__ * Do not append the output of the last node in the _debug key, it causes infinite recursion * Add tests * Move the input/output collection into _dispatch_run to gather only relevant info * Add partial Pipeline.run() docstring * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>	2021-10-07 22:13:25 +02:00
Julian Risch	7e063b77d2	Format doc classifier usage example (#1550 ) * Format doc classifier usage example * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-10-01 15:01:19 +02:00
Julian Risch	24483d7bad	TransformersDocumentClassifier replacing FARMClassifier (#1540 ) * Initial draft of TransformersClassifier * Add transformers classifier implementation * Add test for SentenceTransformersClassifier * Add truncation and corresponding test case to Classifier * Add zero-shot classification and test * Add document classifier documentation * Add latest docstring and tutorial changes * print meta data with print_documents() * Add latest docstring and tutorial changes * Remove top_k param from Classifier usage example * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-10-01 11:22:56 +02:00
Julian Risch	0e7338f0c6	Remove mentions of FARM from Ranker comments (#1535 ) * Remove mentions of FARM from Ranker comments * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-09-29 11:57:30 +02:00
Sara Zan	a30a826c6c	Standardize `delete_documents(filter=...)` across all document stores (#1509 ) * Make InMemoryDocumentStore accept and apply filters in delete_documents() * Modify test_document_store.py to test the filtered deletion in memory, sql and milvus too * Make FAISSDocumentStore accept and properly apply filters in delete_documents() * Add latest docstring and tutorial changes * Remove accidentally duplicated test * Remove unnecessary decorators from test/test_document_store.py::test_delete_documents_with_filters * Add embeddings count test for FAISS and Milvus; Milvus fails it. * Fixed a bug that made Milvus not deleting embeddings * Remove batch size parametrization in tests & update all documentstore's docstrings with a filter example * Add latest docstring and tutorial changes Co-authored-by: prafgup <prafulgupta6@gmail.com>	2021-09-29 09:27:06 +02:00
Julian Risch	f9d2f786ca	Replace FARM import statements; add dependencies (#1492 ) * Replace FARM import statements; add dependencies * Add InferenceProc., TextCl.Proc., TextPairCl.Proc. * Remove FARMRanker, add type annotations, rename max_sample * Add sample_to_features_text for InferenceProc. * Fix type annotations: model_name_or_path is str not Path * Fix mypy errors: implement _create_dataset in TextCl.Proc. * Add task_type "embeddings" in Inferencer * Allow loading AdaptiveModel for embedding task * Add SQuAD eval metrics; enable InferenceProc for embedding task * Add baskets as param to log_samples and handle empty basket list in log_samples * Remove unused dependencies * Remove FARMClassifier (doc classificer) due to ref to TextClassificationHead * Remove FARMRanker and Classifier from doc generation scripts Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-09-28 16:34:24 +02:00
Malte Pietsch	183fd5ae5a	Simplify tests & allow running on individual doc stores (#1487 ) * simplify tests for individual doc stores * WIP refactoring markers of tests * test alternative approach for tests with existing parametrization * fix skip logic of already parametrized tests * fix weaviate behaviour in tests - not parametrizing it in our general test cases. * Add latest docstring and tutorial changes * fix some tests * remove sql from document_store_types * fix markers for generator and pipeline test * remove inmemory marker * remove unneeded elasticsearch markers * update readme and contributing.md * update contributing * adjust example Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-09-27 10:52:07 +02:00
Markus Paff	d3fd888a76	Release Docs 0.10.0 (#1460 ) * updated tutorials and docstrings and new version * update to correct directory structure	2021-09-23 16:22:14 +02:00
bogdankostic	c644e2b4d0	Add comment to tutorial notebooks about restarting runtime in colab (#1486 ) * Add comment to tutorial notebooks about restarting runtime in colab * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-09-23 14:36:20 +02:00
Julian Risch	d569e66bc7	Update Tutorial1_Basic_QA_Pipeline.ipynb (#1489 ) * Update Tutorial1_Basic_QA_Pipeline.ipynb passing params to pipeline as dict * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-09-22 16:35:20 +02:00
Branden Chan	bddee2def4	Define SAS model in notebook (#1485 ) * Define SAS model in notebook * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-09-21 17:05:16 +02:00
Branden Chan	2c4baa7f4e	Regenerate API and Tutorial md files (#1480 ) * Change punctuation * Add latest docstring and tutorial changes * Change punctuation * Add documentation for Docs2Answer * Add latest docstring and tutorial changes * Generate new API docs * Replace Finder with Pipeline * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-09-21 14:42:18 +02:00
Markus Paff	39845c0624	Automate updates docstrings tutorials (#1461 ) * remove not needed githab actions and reactivate docstrings and tutorial generation * test workflow * update pydoc version * update python version * update watchdog * move to latest version pydoc-markdown * remove version check * Add latest docstring and tutorial changes * remove test workflow * test for param docstrings * pin pydoc-markdown version * add test workflow * pin watchdog version * Add latest docstring and tutorial changes * update original workflow and delete test Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-09-17 13:44:31 +02:00
oryx1729	9dd7c74f4f	Refactor communication between Pipeline Components (#1321 )	2021-09-10 11:41:16 +02:00
Bob van Luijt	c0cc8bc80f	Bump Weaviate version to 1.7.0 (#1412 ) * Bump Weaviate * Bump Weaviate * Bump Weaviate client * Bump Weaviate * Revert client version There is a change in the client API that needs to be addressed before bumping its version	2021-09-05 09:28:55 +02:00
Ikram Ali	3fc7f3f695	[docs] crawler api docs updated. (#1388 )	2021-09-01 12:07:32 +02:00
Branden Chan	1938fb001b	Add support for no Docker envs in Tutorial 13 (#1365 ) * Add support for no docker envs e.g. colab * Generate md	2021-08-31 15:22:51 +02:00
Markus Paff	be8d305190	Editing docs read.me for new docs website workflow (#1372 ) * editing docs read.me for new docs website workflow * added new links to docs	2021-08-30 14:59:40 +02:00
Shahrukh Khan	c3d8aa0643	Add query classifier usage docs (#1348 ) * Create query_classifier.md * Update query_classifier.md * Update query_classifier.md * Update query_classifier.md * Update query_classifier.md Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>	2021-08-24 15:56:11 +02:00
Markus Paff	cac15310bd	adding tutorial 13 and 14 (#1364 )	2021-08-23 11:37:06 +02:00
Markus Paff	ff2049cd45	updated tutorials (#1359 )	2021-08-19 21:16:56 +02:00
Bob van Luijt	ba071cc052	Bump Weaviate version (#1336 )	2021-08-12 09:54:09 +02:00
Markus Paff	7569ab97dd	Add faq annotation (#1333 ) * add annotation faq to read.me * design fix * add faq to docs page * changed format	2021-08-10 14:55:31 +02:00
Malte Pietsch	a0921f0c35	Remove `Finder` (#1326 ) * deprecate finder * remove import * add doc section for moving from finder to pipelines	2021-08-09 13:41:40 +02:00
Branden Chan	937247d628	Add QuestionGenerator (#1267 ) * Create basic Question Generation * Split texts into 50 word chunks * Allow prompt to be changed * Implement iteration functionality in DS * Add docstrings, create pipelines * Make pipelines work * Add comments * Add tests * Add tutorials and docs * Add doc string	2021-07-26 17:20:43 +02:00
Branden Chan	363be65a78	Implement OpenSearch ANN (#1225 ) * Simplify ODES init * Add arguments to ES init and create script * Rename similarity_fn_name and add util fn * Create OpenSearchDocumentStore * Specify params of Open Search HNSW * Add better argument handling * Update opensearch index mapping * Edit opensearch default port * Fix HNSW mapping * Force small HNSW params * Implement auto start and stopping of document store services * Fix starting and stopping of ds service * Restore HNSW params * Add opensearch query benchmarks * Add write wait time * Revert wait time * Add timeout * Update benchmarks * Update benchmarks * Update benchmarks json * Update documentation * Update documentation * Fix similarity name * Improve argument passing * Improve stopping and starting of service	2021-07-26 10:52:52 +02:00
Bob van Luijt	8dae844447	Bump Weaviate version to 1.5 (#1287 ) * bump Weaviate version to 1.5 * bump Weaviate version to 1.5	2021-07-15 08:26:22 +02:00
Branden Chan	10e332dabb	Fix Links (#1199 ) * Fix link highlight * Regen md files * Remove duplicate * Fix whitespace * fixing strings for website * Fix link Co-authored-by: PiffPaffM <markuspaff.mp@gmail.com>	2021-06-23 19:07:54 +02:00
Markus Paff	a8f3601e6a	Pin docs for 0.9.0	2021-06-22 10:38:08 +02:00
Markus Paff	6cd49105e7	update api markdown files and add markdown file for ranker (#1198 ) * update api markdown files and add markdown file for ranker * added docstrings for weaviate * new version of pydoc-markdown does not render arguments correctly. We used pydoc-markdown==3.11.0	2021-06-15 17:50:08 +02:00
Branden Chan	7dbd58f6be	Add about sections (#1195 )	2021-06-14 18:37:00 +02:00
vblagoje	2a5882578a	Add Longform-QA (LFQA), Seq2SeqGenerator for generative QA and Retribert Retriever (#1086 ) * Integrate LFQA with Haystack * Integrate LFQA with Haystack - unit tests * Properly initialize conftest default value for vector_dim * Update PR after inital feedback * Fix conftest.py import * Seq2SeqGenerator uses Callables instead of subclasses for custom model input * Update docstring * Fix Callable use * Add LFQA tutorials * Improve type error reporting for invalid input converter Callable * Generate docstrings * Format comments in tutorial script * Generate tutorial md * Add usage page Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai> Co-authored-by: brandenchan <brandenchan@icloud.com>	2021-06-14 17:53:43 +02:00
Bob van Luijt	f583d0bfaf	Minor change with a link to the Weaviate docs (#1180 ) Super minor change, but in line with other DocumentStore's	2021-06-11 21:20:23 +02:00
Branden Chan	e7937ac5d7	Reformat FAQ page (#1177 ) * Add faq page * Update faq.md * Fix mypy CI * Add question * Reformat faq	2021-06-11 11:59:52 +02:00
Branden Chan	783893c3d2	Tutorial update (#1166 ) * Add header / footer * Add Milvus example * Generate md files * Fix mypy CI	2021-06-11 11:09:15 +02:00
Branden Chan	13edff109d	Documentation update (#1162 ) * Add content * Add German BERT references * Mention preprocessor language * Fix mypy CI * Add document length recommendation * Add more languages	2021-06-11 11:06:57 +02:00
Branden Chan	41b537affe	Add FAQ page (#1151 ) * Add faq page * Update faq.md * Fix mypy CI * Add question	2021-06-10 17:29:14 +02:00
venuraja79	49886f88f0	Integrate Weaviate as another DocumentStore (#1064 ) * Annotation Tool: data is not persisted when using local version #853 * First version of weaviate * First version of weaviate * First version of weaviate * Updated comments * Updated comments * ran query, get and write tests * update embeddings, dynamic schema and filters implemented * Initial set of tests and fixes * Tests added for update_embeddings and delete documents * introduced duplicate documents fix * fixed mypy errors * Added Weaviate to requirements * Fix the weaviate docker env variables * Fixing test dependencies for now * Created weaviate test marker and fixed query * Update docstring * Add documentation * Bump up weaviate version * Bump up weaviate version in documentation * Bump up weaviate version in documentation * Updgrade weaviate version Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>	2021-06-10 09:43:53 +02:00
Branden Chan	5f0f85989a	Refresh API docs (#1152 )	2021-06-09 16:13:58 +02:00
Julian Risch	580e28344d	Add docu of confidence scores and calibration method (#1131 ) * Add docu of confidence scores and calibration method	2021-06-03 15:49:07 +02:00
Julian Risch	8e3d0d1287	Distinguish labels for calculating similarity scores (#1124 ) * Distinguish labels for calculating similarity scores * Explain label "0" and "1" of TextPairClassifier in Ranker	2021-06-02 17:33:36 +02:00
Branden Chan	b555bc525c	Remove duplicate run (#1132 )	2021-06-02 13:58:55 +02:00
Branden Chan	9356f637d4	Update Milvus benchmarks (#1128 ) * Update Milvus benchmarks * Add sentence transformers * Update sentence transformers index results * Remove duplicate row	2021-06-02 13:09:45 +02:00
Julian Risch	84c34295a1	Re-ranking component for document search without QA (#1025 ) * Adding ranker similar to retriever and reader * Sort documents according to query-document similarity scores * Reranking and model training runs for small example * Added EvalRanker node * Calculate recall@k in EvalRetriever and EvalRanker nodes * Renaming EvalRetriever to EvalDocuments and EvalReader to EvalAnswers * Added mean reciprocal rank as metric for EvalDocuments * Fix bug that appeared when ranking documents with same score * Remove commented code for unimplmented eval() of Ranker node * Add documentation of k parameter in EvalDocuments * Add Ranker docu and renaming top_k param	2021-05-31 15:31:36 +02:00
Avishekh Shrestha	c4ee32d47d	Fix typo in preprocessing.md(#1087 ) Correct variable name from 'd' to 'doc' in line 134.	2021-05-23 19:16:58 +02:00
Lalit Pagaria	f46b09c756	Using text hash as id to prevent document duplication (#1000 ) * using text hash as id to prevent document duplication. Also providing a way customize it. * Add latest docstring and tutorial changes * Fixing duplicate value test when text is same * Adding test for duplicate ids in document store * Changing exception to generic Exception type * add exception for inmemory. update docstring Document. remove id_hash_keys from object attribute * Add latest docstring and tutorial changes * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>	2021-05-17 17:51:52 +02:00
brandenchan	5b0b3e4616	Merge branch 'master' of https://github.com/deepset-ai/haystack	2021-04-30 16:41:05 +02:00
brandenchan	4cc853d1c3	Update link	2021-04-30 15:06:45 +02:00
Branden Chan	869b493b61	Regen api docs (#1015 )	2021-04-30 12:35:13 +02:00
Mario Jäckle	a00703256f	docs(document_store): add usage information for aws elastic search (#1008 ) Co-authored-by: Mario Jäckle <m.jaeckle@careerpartner.eu>	2021-04-30 11:38:25 +02:00

661 Commits