haystack

mirror of https://github.com/deepset-ai/haystack.git synced 2025-11-14 00:54:22 +00:00

Author	SHA1	Message	Date
Sara Zan	a59bca3661	Apply black formatting (#2115 ) * Testing black on ui/ * Applying black on docstores * Add latest docstring and tutorial changes * Create a single GH action for Black and docs to reduce commit noise to the minimum, slightly refactor the OpenAPI action too * Remove comments * Relax constraints on pydoc-markdown * Split temporary black from the docs. Pydoc-markdown was obsolete and needs a separate PR to upgrade * Fix a couple of bugs * Add a type: ignore that was missing somehow * Give path to black * Apply Black * Apply Black * Relocate a couple of type: ignore * Update documentation * Make Linux CI run after applying Black * Triggering Black * Apply Black * Remove dependency, does not work well * Remove manually double trailing commas * Update documentation Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2022-02-03 13:43:18 +01:00
el2e10	377c20b8b1	Fix grammatical issue in optimization guides (#1941)	2022-01-03 11:06:13 +01:00
nishanthcgit	cf603042b2	Capitalize starting letter in params (#1750 ) * Capitalize starting letter in params Capitalized the starting letter in code examples for params in keeping with the latest names for nodes where first letter is capitalized. Refer: https://github.com/deepset-ai/haystack/issues/1748 * Update standard_pipelines.py Capitalized some starting letters in the docstrings in keeping with the updated node names for standard pipelines	2021-11-15 12:38:13 +01:00
Julian Risch	c9087da2ac	rename text variable of document to content (#1704)	2021-11-08 17:07:36 +01:00
Julian Risch	892ce4a760	Make weaviate more compliant to other doc stores (UUIDs and dummy embedddings) (#1656 ) * create uuid and dummy embeddding in weaviate doc store * handle and test for duplicate non-uuid-formatted ids in weaviate * add uuid and dummy embedding to doc strings * Add latest docstring and tutorial changes * Upgrade weaviate * Include weaviate in common doc store test cases * Add latest docstring and tutorial changes * Exclude weaviate doc store from eval tests * Incorporate index name in uuid generation * Ignore mypy error * Fix typo * Restore DOCS without uuid and embeddings generated by weaviate * Supply docs for retriever tests as fixture * Limit scope of fixture to function instead of session * Add comments Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-11-04 09:27:12 +01:00
oryx1729	9dd7c74f4f	Refactor communication between Pipeline Components (#1321)	2021-09-10 11:41:16 +02:00
Bob van Luijt	c0cc8bc80f	Bump Weaviate version to 1.7.0 (#1412 ) * Bump Weaviate * Bump Weaviate * Bump Weaviate client * Bump Weaviate * Revert client version There is a change in the client API that needs to be addressed before bumping its version	2021-09-05 09:28:55 +02:00
Shahrukh Khan	c3d8aa0643	Add query classifier usage docs (#1348 ) * Create query_classifier.md * Update query_classifier.md * Update query_classifier.md * Update query_classifier.md * Update query_classifier.md Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>	2021-08-24 15:56:11 +02:00
Bob van Luijt	ba071cc052	Bump Weaviate version (#1336)	2021-08-12 09:54:09 +02:00
Markus Paff	7569ab97dd	Add faq annotation (#1333 ) * add annotation faq to read.me * design fix * add faq to docs page * changed format	2021-08-10 14:55:31 +02:00
Malte Pietsch	a0921f0c35	Remove `Finder` (#1326 ) * deprecate finder * remove import * add doc section for moving from finder to pipelines	2021-08-09 13:41:40 +02:00
Branden Chan	937247d628	Add QuestionGenerator (#1267 ) * Create basic Question Generation * Split texts into 50 word chunks * Allow prompt to be changed * Implement iteration functionality in DS * Add docstrings, create pipelines * Make pipelines work * Add comments * Add tests * Add tutorials and docs * Add doc string	2021-07-26 17:20:43 +02:00
Branden Chan	363be65a78	Implement OpenSearch ANN (#1225 ) * Simplify ODES init * Add arguments to ES init and create script * Rename similarity_fn_name and add util fn * Create OpenSearchDocumentStore * Specify params of Open Search HNSW * Add better argument handling * Update opensearch index mapping * Edit opensearch default port * Fix HNSW mapping * Force small HNSW params * Implement auto start and stopping of document store services * Fix starting and stopping of ds service * Restore HNSW params * Add opensearch query benchmarks * Add write wait time * Revert wait time * Add timeout * Update benchmarks * Update benchmarks * Update benchmarks json * Update documentation * Update documentation * Fix similarity name * Improve argument passing * Improve stopping and starting of service	2021-07-26 10:52:52 +02:00
Bob van Luijt	8dae844447	Bump Weaviate version to 1.5 (#1287 ) * bump Weaviate version to 1.5 * bump Weaviate version to 1.5	2021-07-15 08:26:22 +02:00
Branden Chan	10e332dabb	Fix Links (#1199 ) * Fix link highlight * Regen md files * Remove duplicate * Fix whitespace * fixing strings for website * Fix link Co-authored-by: PiffPaffM <markuspaff.mp@gmail.com>	2021-06-23 19:07:54 +02:00
vblagoje	2a5882578a	Add Longform-QA (LFQA), Seq2SeqGenerator for generative QA and Retribert Retriever (#1086 ) * Integrate LFQA with Haystack * Integrate LFQA with Haystack - unit tests * Properly initialize conftest default value for vector_dim * Update PR after inital feedback * Fix conftest.py import * Seq2SeqGenerator uses Callables instead of subclasses for custom model input * Update docstring * Fix Callable use * Add LFQA tutorials * Improve type error reporting for invalid input converter Callable * Generate docstrings * Format comments in tutorial script * Generate tutorial md * Add usage page Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai> Co-authored-by: brandenchan <brandenchan@icloud.com>	2021-06-14 17:53:43 +02:00
Bob van Luijt	f583d0bfaf	Minor change with a link to the Weaviate docs (#1180) Super minor change, but in line with other DocumentStore's	2021-06-11 21:20:23 +02:00
Branden Chan	e7937ac5d7	Reformat FAQ page (#1177 ) * Add faq page * Update faq.md * Fix mypy CI * Add question * Reformat faq	2021-06-11 11:59:52 +02:00
Branden Chan	13edff109d	Documentation update (#1162 ) * Add content * Add German BERT references * Mention preprocessor language * Fix mypy CI * Add document length recommendation * Add more languages	2021-06-11 11:06:57 +02:00
Branden Chan	41b537affe	Add FAQ page (#1151 ) * Add faq page * Update faq.md * Fix mypy CI * Add question	2021-06-10 17:29:14 +02:00
venuraja79	49886f88f0	Integrate Weaviate as another DocumentStore (#1064 ) * Annotation Tool: data is not persisted when using local version #853 * First version of weaviate * First version of weaviate * First version of weaviate * Updated comments * Updated comments * ran query, get and write tests * update embeddings, dynamic schema and filters implemented * Initial set of tests and fixes * Tests added for update_embeddings and delete documents * introduced duplicate documents fix * fixed mypy errors * Added Weaviate to requirements * Fix the weaviate docker env variables * Fixing test dependencies for now * Created weaviate test marker and fixed query * Update docstring * Add documentation * Bump up weaviate version * Bump up weaviate version in documentation * Bump up weaviate version in documentation * Updgrade weaviate version Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>	2021-06-10 09:43:53 +02:00
Julian Risch	580e28344d	Add docu of confidence scores and calibration method (#1131 ) * Add docu of confidence scores and calibration method	2021-06-03 15:49:07 +02:00
Julian Risch	8e3d0d1287	Distinguish labels for calculating similarity scores (#1124 ) * Distinguish labels for calculating similarity scores * Explain label "0" and "1" of TextPairClassifier in Ranker	2021-06-02 17:33:36 +02:00
Julian Risch	84c34295a1	Re-ranking component for document search without QA (#1025 ) * Adding ranker similar to retriever and reader * Sort documents according to query-document similarity scores * Reranking and model training runs for small example * Added EvalRanker node * Calculate recall@k in EvalRetriever and EvalRanker nodes * Renaming EvalRetriever to EvalDocuments and EvalReader to EvalAnswers * Added mean reciprocal rank as metric for EvalDocuments * Fix bug that appeared when ranking documents with same score * Remove commented code for unimplmented eval() of Ranker node * Add documentation of k parameter in EvalDocuments * Add Ranker docu and renaming top_k param	2021-05-31 15:31:36 +02:00
Avishekh Shrestha	c4ee32d47d	Fix typo in preprocessing.md (#1087) Correct variable name from 'd' to 'doc' in line 134.	2021-05-23 19:16:58 +02:00
brandenchan	4cc853d1c3	Update link	2021-04-30 15:06:45 +02:00
Mario Jäckle	a00703256f	docs(document_store): add usage information for aws elastic search (#1008 ) Co-authored-by: Mario Jäckle <m.jaeckle@careerpartner.eu>	2021-04-30 11:38:25 +02:00
Julian Risch	65f1da00cc	knowledge graph documentation (#979 ) * Create knowledge_graph.md * add doc strings to Text2SparqlRetriever * Add doc strings to GraphDBKnowledgeGraph * Make method calls unambiguous so its clear which class is meant	2021-04-27 16:44:40 +02:00
Branden Chan	9626c0d65e	Update Documentation (#976 ) * Add api pages * Add latest docstring and tutorial changes * First sweep of usage docs * Add link to conversion script * Add import statements * Add summarization page * Add web crawler documentation * Add confidence scores usage * Add crawler api docs * Regenerate api docs * Update summarizer and translator api * Add api pages * Add latest docstring and tutorial changes * First sweep of usage docs * Add link to conversion script * Add import statements * Add summarization page * Add web crawler documentation * Add confidence scores usage * Add crawler api docs * Regenerate api docs * Update summarizer and translator api * Add indentation (pydoc-markdown 3.10.1) * Comment out metadata * Remove Finder deprecation message * Remove Finder in FAQ * Update tutorial link * Incorporate reviewer feedback * Regen api docs * Add type annotations Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-04-22 16:45:29 +02:00
Markus Paff	dfb0282b74	Update milvus links and docstrings (#959 ) * update milvus links and docstrings * Add latest docstring and tutorial changes * new milvus version * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-04-12 14:38:57 +02:00
lewtun	32050fdce3	Add Milvus to the retriever / document store table (#931)	2021-03-29 09:53:26 +02:00
Mohamed Sayed	9ec2406a05	Remove broken tf-idf youtube link (#888) The youtube link is of a deleted video.	2021-03-11 14:23:05 +01:00
Branden Chan	325a4e4d14	Add Milvus Documentation (#838 ) * First commit * Add latest docstring and tutorial changes * Add DocStore external setup info * fixed tabs * Add Milvus recommendation Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Markus Paff <markuspaff.mp@gmail.com>	2021-02-24 11:43:40 +01:00
Lalit Pagaria	5bd94ac5f7	Adding Translator (standalone component & wrapper for pipelines) (#782 ) * Adding translator with many generic input parameter support * Making dict_key as generic * Fixing mypy issue * Adding pipeline and using opus models * Add latest docstring and tutorial changes * Adding test cases for end-to-end translation for generator, summerizer etc * raise error join and merge nodes * Fix test failure * add docstrings. add usage documentation. rm skip_special_tokens param * Add latest docstring and tutorial changes * fix code snippets in md * Adding few extra configuration parameters and fixing tests * Fixingmypy issue and updating usage document * fix for mypy issue in pipeline.py * reverting renaming of pytest_collection_modifyitems method * Addressing review comments * setting skip_special_tokens to True * removing model_max_length argument as None type is not supported to many models * Removing padding parameter. Better to leave it as default otherwise it cause tensor size miss match error. If this option required by used then it can be added later. Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>	2021-02-12 15:58:26 +01:00
Branden Chan	a3a12bc95b	Remove broken link	2021-01-13 17:32:10 +01:00
brandenchan	01fd9940d8	Fix tutorial link	2021-01-13 15:29:25 +01:00
Branden Chan	7376185b65	Create DPR training tutorial (#708 ) * WIP: Start DPR training tutorial * Create basics of DPR Train tutorial * Update documentation * Allow DPR to be initialized without document store * WIP: Add param descriptions to DPR notebook * Clean tutorial * Improve loading * Make doc store optional when loading DPR * Satisfy mypy type check * Add links * Add tutorial header * Add colab badge * Clear outputs * Incorporate reviewer feedback * WIP: Start DPR training tutorial * Create basics of DPR Train tutorial * Update documentation * Allow DPR to be initialized without document store * WIP: Add param descriptions to DPR notebook * Clean tutorial * Improve loading * Make doc store optional when loading DPR * Satisfy mypy type check * Add links * Add tutorial header * Add colab badge * Clear outputs * Incorporate reviewer feedback * Add readme links * Regenerate tutorials * Add excitement * Fix typo * Fix hard negatives comment * Wrap tutorial for windows users * Fix mypy issue	2021-01-13 10:33:55 +01:00
Branden Chan	bb8aba18e0	Create Preprocessing Tutorial (#706 ) * WIP: First version of preprocessing tutorial * stride renamed overlap, ipynb and py files created * rename split_stride in test * Update preprocessor api documentation * define order for markdown files * define order of modules in api docs * Add colab links * Incorporate review feedback Co-authored-by: PiffPaffM <markuspaff.mp@gmail.com>	2021-01-06 15:54:05 +01:00
Malte Pietsch	a2e5e6b09e	Update pipeline documentation and readme (#693 ) * Update README.md * Update pipelines.md * Update pipelines.md * Update README.md	2020-12-22 13:34:28 +01:00
Markus Paff	b752da1cd5	Add docs v0.6.0 (#689 ) * new docs version * updated directory structure * Add pipelines page * Add Finder deprecation suggestion * header for pipelines file * Document MySQL support * Mention DPR train tutorial coming soon * Mention open distro ES * Update doc strings regarding similarity fn * Add link to API docs * Wrap pipelines docs in box * add api reference for pipelines * copied latest version to v0.6.0 * Remove space * Remove space * Copy to v0.6.0 Co-authored-by: brandenchan <brandenchan@icloud.com>	2020-12-18 12:47:27 +01:00
Branden Chan	d8154939fc	Scale dot product into probabilities (#667 ) * scale dot product * Add tip in documentation * Add recommendation boxes * WIP: Use similarity attribute in all doc stores * Implement similarity for InMemoryDS * Add FAISS support * Clean printout * Update documentation * Implement document field map	2020-12-11 12:10:24 +01:00
Branden Chan	79555148ac	Add link to FAISS Info in documentation (#643 ) * Add link to FAISS info * Clean link	2020-12-02 15:24:22 +01:00
brandenchan	cdd009d1ef	Better payload example spacing	2020-12-01 13:07:29 +01:00
Branden Chan	e573c9e27d	Improve User Feedback Documentation (#539 ) * Extend docs * Add User Feedback API calls * Incorporate reviewer feedback	2020-12-01 12:55:31 +01:00
brandenchan	ce6cba227f	Fix website typo	2020-11-27 16:07:29 +01:00
Markus Paff	88d0ee2c98	Add boxes for recommendations (#629 ) * add boxes for recommendations * add more recommendation boxes Co-authored-by: brandenchan <brandenchan@icloud.com>	2020-11-27 16:00:20 +01:00
Branden Chan	1e8af84ecc	Make more changes to documentation (#578 ) * First batch of changes * Add RAG tutorial links * Prettify RAG tutorial * draft of generator doc * Add text * Complete generator page * Create optimization section * Split intro * Fix formatting tutorial 7	2020-11-19 14:58:27 +01:00
brandenchan	090a8cf3e9	Revert "First batch of changes" This reverts commit c07182aa0ab77106cdb142f4ca43ff02476e6fbf.	2020-11-12 12:27:16 +01:00
brandenchan	c07182aa0a	First batch of changes	2020-11-12 12:07:02 +01:00
Markus Paff	4cca3b5290	New docs version v0.5.0 (#560)	2020-11-06 13:17:04 +01:00

58 Commits