haystack

mirror of https://github.com/deepset-ai/haystack.git synced 2025-10-30 01:09:43 +00:00

Author	SHA1	Message	Date
Julian Risch	17dcb8c23e	Use Reader's device by default (#1208 ) * Use Reader's device by default * Replace get_device with initialize_device_settings * Add import statements for init_device_settings * Remove unused get_device method	2021-06-24 09:22:34 +02:00
Branden Chan	10e332dabb	Fix Links (#1199 ) * Fix link highlight * Regen md files * Remove duplicate * Fix whitespace * fixing strings for website * Fix link Co-authored-by: PiffPaffM <markuspaff.mp@gmail.com>	2021-06-23 19:07:54 +02:00
Branden Chan	efc03f72db	Make PreProcessor.process() work on lists of documents (#1163 ) * Add process_batch method * Rename methods * Fix doc string, satisfy mypy * Fix mypy CI * Fix typp * Update tutorial * Fix argument name * Change arg name * Incorporate reviewer feedback	2021-06-23 18:13:51 +02:00
oryx1729	afee4f36ce	Add scaffold for defining custom components for Pipelines (#1205 )	2021-06-23 12:01:54 +02:00
vblagoje	02fc4c7783	Improve document stores unit test parametrization (#1202 )	2021-06-22 16:08:23 +02:00
Markus Paff	a8f3601e6a	Pin docs for 0.9.0	2021-06-22 10:38:08 +02:00
Ikram Ali	d835a9cdc5	[setup] version tag added to Haystack fix #1175 (#1216 )	2021-06-22 09:43:26 +02:00
Stefano	66049abff0	Add arg to support different languages in PreProcessor's sentence segmentation (#1160 ) * Add PreProcessor optional language parameter. * Add iso639 to nltk languages. * Update docstring Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>	2021-06-21 18:53:19 +02:00
Julian Risch	9e4d7bf9be	Increase Haystack version to 0.9.0 (#1215 ) v0.9.0	2021-06-21 18:39:00 +02:00
oryx1729	0168f04385	Remove unused function _get_pseudo_prob (#1201 )	2021-06-17 10:28:48 +02:00
C V Goudar	f9c4083006	Bugfix setting of device by defaulting to "cpu" (#1182 ) * Defaulting the device to cpu in case gpu is not available and use_gpu is set to True Co-authored-by: C V Goudar <cv.goudar@emplay.et>	2021-06-16 10:26:29 +02:00
Markus Paff	6cd49105e7	update api markdown files and add markdown file for ranker (#1198 ) * update api markdown files and add markdown file for ranker * added docstrings for weaviate * new version of pydoc-markdown does not render arguments correctly. We used pydoc-markdown==3.11.0	2021-06-15 17:50:08 +02:00
Julian Risch	215c45eb8a	Remove quickfix from reader and ranker (#1196 ) * Remove quickfix from ranker * remove quickfix from reader * Use inferencer's model instead of reloaded model	2021-06-15 09:46:11 +02:00
Branden Chan	7dbd58f6be	Add about sections (#1195 )	2021-06-14 18:37:00 +02:00
vblagoje	2a5882578a	Add Longform-QA (LFQA), Seq2SeqGenerator for generative QA and Retribert Retriever (#1086 ) * Integrate LFQA with Haystack * Integrate LFQA with Haystack - unit tests * Properly initialize conftest default value for vector_dim * Update PR after inital feedback * Fix conftest.py import * Seq2SeqGenerator uses Callables instead of subclasses for custom model input * Update docstring * Fix Callable use * Add LFQA tutorials * Improve type error reporting for invalid input converter Callable * Generate docstrings * Format comments in tutorial script * Generate tutorial md * Add usage page Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai> Co-authored-by: brandenchan <brandenchan@icloud.com>	2021-06-14 17:53:43 +02:00
venuraja	ae55927f58	Weaviate: Update Embeddings - Use update instead of replace (#1181 ) * Update Embeddings logic improved * Update Embeddings logic improved	2021-06-14 17:50:55 +02:00
Shahrukh Khan	1a3b4b9c74	Fix typo in Query Classifier Exception Message(#1190 )	2021-06-14 17:40:35 +02:00
Julian Risch	f6e70f0f3d	Removed single_model_path; added infer_tokenizer to dpr load() (#1060 )	2021-06-14 14:14:46 +02:00
Julian Risch	1c31589b43	Bump to FARM 0.8.0, torch 1.8.1 and transformers 4.6.1 (#1192 ) * bump to FARM 0.8.0, which in turn bumps torch 1.8.1 and transformers 4.6.1 (#1192) * Replace deprecated force_bos_token_to_be_generated parameter	2021-06-14 13:00:41 +02:00
Bob van Luijt	f583d0bfaf	Minor change with a link to the Weaviate docs (#1180 ) Super minor change, but in line with other DocumentStore's	2021-06-11 21:20:23 +02:00
Branden Chan	e7937ac5d7	Reformat FAQ page (#1177 ) * Add faq page * Update faq.md * Fix mypy CI * Add question * Reformat faq	2021-06-11 11:59:52 +02:00
Branden Chan	783893c3d2	Tutorial update (#1166 ) * Add header / footer * Add Milvus example * Generate md files * Fix mypy CI	2021-06-11 11:09:15 +02:00
Branden Chan	13edff109d	Documentation update (#1162 ) * Add content * Add German BERT references * Mention preprocessor language * Fix mypy CI * Add document length recommendation * Add more languages	2021-06-11 11:06:57 +02:00
Branden Chan	41b537affe	Add FAQ page (#1151 ) * Add faq page * Update faq.md * Fix mypy CI * Add question	2021-06-10 17:29:14 +02:00
venuraja79	49886f88f0	Integrate Weaviate as another DocumentStore (#1064 ) * Annotation Tool: data is not persisted when using local version #853 * First version of weaviate * First version of weaviate * First version of weaviate * Updated comments * Updated comments * ran query, get and write tests * update embeddings, dynamic schema and filters implemented * Initial set of tests and fixes * Tests added for update_embeddings and delete documents * introduced duplicate documents fix * fixed mypy errors * Added Weaviate to requirements * Fix the weaviate docker env variables * Fixing test dependencies for now * Created weaviate test marker and fixed query * Update docstring * Add documentation * Bump up weaviate version * Bump up weaviate version in documentation * Bump up weaviate version in documentation * Updgrade weaviate version Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>	2021-06-10 09:43:53 +02:00
Lalit Pagaria	db17d73a82	Fixing issues caused due to mypy upgrade (#1165 )	2021-06-09 16:24:39 +02:00
Branden Chan	5f0f85989a	Refresh API docs (#1152 )	2021-06-09 16:13:58 +02:00
Shahrukh Khan	545c625a37	Add QueryClassifier incl. baseline models (#1099 ) * restructure query classifier code and add s3 based pickles * make model and vectorizer optional in query classifier * update query classifier as per init style * add query classifiers sklearn/hf * update docstrings for query classifiers * add unit test for query classifier * add type patch for sklearn classifier * fix mypy type issue * revert to pure formatting * add query classifiers * resolve conflict * add output names for query classifier * revert output and update docstring queryclassifier * Update docstring for SklearnQueryClassifier * update transformer query classifier docstring * fix typo * change arg names in query classifier classes * add set_config(). rename attributes * fix set_config() Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>	2021-06-08 15:20:13 +02:00
Malte Pietsch	600636e77b	Update README.md	2021-06-08 09:23:56 +02:00
Branden Chan	59e3c55c47	Add More top_k handling to EvalDocuments (#1133 ) * Improve top_k support * Adjust warning * Satisfy mypy * Reinit eval counts if top_k has changed * Incorporate reviewer feedback	2021-06-07 12:11:00 +02:00
Branden Chan	c513865566	Add L2 support for FAISS HNSW (#1138 )	2021-06-04 11:05:18 +02:00
Julian Risch	580e28344d	Add docu of confidence scores and calibration method (#1131 ) * Add docu of confidence scores and calibration method	2021-06-03 15:49:07 +02:00
Malte Pietsch	a1472b040c	Add badges (#1136 )	2021-06-03 14:47:08 +02:00
Malte Pietsch	b41719b7c8	Add config to JoinDocuments node to allow yaml export in pipelines (#1134 ) * add config to JoinNode to allow yaml export * remove test print	2021-06-03 11:03:25 +02:00
Julian Risch	8e3d0d1287	Distinguish labels for calculating similarity scores (#1124 ) * Distinguish labels for calculating similarity scores * Explain label "0" and "1" of TextPairClassifier in Ranker	2021-06-02 17:33:36 +02:00
Branden Chan	b555bc525c	Remove duplicate run (#1132 )	2021-06-02 13:58:55 +02:00
Branden Chan	09ba75073c	Improve Milvus HNSW Performance (#1127 ) * Add simplified script * Optimize HNSW index creation * Adjust benchmark order * Rename script	2021-06-02 13:17:35 +02:00
Branden Chan	9356f637d4	Update Milvus benchmarks (#1128 ) * Update Milvus benchmarks * Add sentence transformers * Update sentence transformers index results * Remove duplicate row	2021-06-02 13:09:45 +02:00
Branden Chan	aa6f768efa	Prevent merge of same questions on different documents during evaluation (#1119 ) * Fix duplicate question in Reader.eval() * Add duplicate question support in document store * Support duplicate questions in retriever eval * Update tutorial * Rename key_tuple * Change error message * Add warning when more than 6 labels * Allow for label grouping options * Add support for aggregating by label meta * Satisfy mypy * Fix duplicate question in Reader.eval() * Add duplicate question support in document store * Support duplicate questions in retriever eval * Update tutorial * Rename key_tuple * Change error message * Add warning when more than 6 labels * Allow for label grouping options * Add support for aggregating by label meta * Satisfy mypy * Make label field flexible, add docstrings * Satisfy mypy * Fix failing tests * Adjust docstring * Fix tutorial Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>	2021-06-02 12:09:03 +02:00
Branden Chan	d8c47ed525	Preserve whitespace (#1121 )	2021-06-02 12:08:22 +02:00
Malte Pietsch	022f8586f6	Remove Python 3.6 support (#1059 ) * Remove Python 3.6 support * change cache key for CI	2021-06-01 15:24:44 +02:00
Julian Risch	a7ba146246	Removed comma from last item in json list (#1114 )	2021-06-01 12:32:21 +02:00
Julian Risch	40ceaf418a	Fixing grpcio-tools to version of colab's pre-installed grpcio (#1113 )	2021-05-31 19:09:10 +02:00
Alvise Sembenico	6326cf5710	🐳 add PDF converter dependencies to Docker (#1107 )	2021-05-31 19:01:02 +02:00
Branden Chan	6ca6ac0632	Add OpenDistro init (#1101 )	2021-05-31 18:59:20 +02:00
Julian Risch	84c34295a1	Re-ranking component for document search without QA (#1025 ) * Adding ranker similar to retriever and reader * Sort documents according to query-document similarity scores * Reranking and model training runs for small example * Added EvalRanker node * Calculate recall@k in EvalRetriever and EvalRanker nodes * Renaming EvalRetriever to EvalDocuments and EvalReader to EvalAnswers * Added mean reciprocal rank as metric for EvalDocuments * Fix bug that appeared when ranking documents with same score * Remove commented code for unimplmented eval() of Ranker node * Add documentation of k parameter in EvalDocuments * Add Ranker docu and renaming top_k param	2021-05-31 15:31:36 +02:00
Michaël Bitard	b5cae20ddb	Fix typo in streamlit UI (#1106 )	2021-05-28 11:18:09 +02:00
Ikram Ali	94f1a2b5c9	Improve speed of FAISSDocumentStore.delete_documents() (#1095 )	2021-05-26 07:56:09 +02:00
Ikram Ali	b76ed4c5a4	Add options for handling duplicate documents (skip, fail, overwrite) (#1088 ) * [document_stores] Duplicate document implmentation added for memorystore. * [document_stores]duplicate documents implementation done for faiss store. * [document_store] Duplicate document feature added for elasticsearch document store fixed #1069 * [document_store] Duplicate documents feature added for milvus document store and bug fixed in faiss document store fixed #1069 * [document_store] Code refactored fixed #1069 * [document_store]Test cases refactored. * [document_store] mypy issue fixed. * [test_case] faiss and milvus test case refactored to support duplicate documents implementation. fixed #1069 * [document_store] duplicate_documents_options code refactored. * [document_store] Code refactored.	2021-05-25 13:30:06 +02:00
Avishekh Shrestha	c4ee32d47d	Fix typo in preprocessing.md(#1087 ) Correct variable name from 'd' to 'doc' in line 134.	2021-05-23 19:16:58 +02:00

976 Commits