haystack

mirror of https://github.com/deepset-ai/haystack.git synced 2025-11-11 23:54:37 +00:00

Author	SHA1	Message	Date
Shahrukh Khan	c3d8aa0643	Add query classifier usage docs (#1348 ) * Create query_classifier.md * Update query_classifier.md * Update query_classifier.md * Update query_classifier.md * Update query_classifier.md Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>	2021-08-24 15:56:11 +02:00
Markus Paff	cac15310bd	adding tutorial 13 and 14 (#1364 )	2021-08-23 11:37:06 +02:00
Markus Paff	ff2049cd45	updated tutorials (#1359 )	2021-08-19 21:16:56 +02:00
Bob van Luijt	ba071cc052	Bump Weaviate version (#1336 )	2021-08-12 09:54:09 +02:00
Markus Paff	7569ab97dd	Add faq annotation (#1333 ) * add annotation faq to read.me * design fix * add faq to docs page * changed format	2021-08-10 14:55:31 +02:00
Malte Pietsch	a0921f0c35	Remove `Finder` (#1326 ) * deprecate finder * remove import * add doc section for moving from finder to pipelines	2021-08-09 13:41:40 +02:00
Branden Chan	937247d628	Add QuestionGenerator (#1267 ) * Create basic Question Generation * Split texts into 50 word chunks * Allow prompt to be changed * Implement iteration functionality in DS * Add docstrings, create pipelines * Make pipelines work * Add comments * Add tests * Add tutorials and docs * Add doc string	2021-07-26 17:20:43 +02:00
Branden Chan	363be65a78	Implement OpenSearch ANN (#1225 ) * Simplify ODES init * Add arguments to ES init and create script * Rename similarity_fn_name and add util fn * Create OpenSearchDocumentStore * Specify params of Open Search HNSW * Add better argument handling * Update opensearch index mapping * Edit opensearch default port * Fix HNSW mapping * Force small HNSW params * Implement auto start and stopping of document store services * Fix starting and stopping of ds service * Restore HNSW params * Add opensearch query benchmarks * Add write wait time * Revert wait time * Add timeout * Update benchmarks * Update benchmarks * Update benchmarks json * Update documentation * Update documentation * Fix similarity name * Improve argument passing * Improve stopping and starting of service	2021-07-26 10:52:52 +02:00
Bob van Luijt	8dae844447	Bump Weaviate version to 1.5 (#1287 ) * bump Weaviate version to 1.5 * bump Weaviate version to 1.5	2021-07-15 08:26:22 +02:00
Branden Chan	10e332dabb	Fix Links (#1199 ) * Fix link highlight * Regen md files * Remove duplicate * Fix whitespace * fixing strings for website * Fix link Co-authored-by: PiffPaffM <markuspaff.mp@gmail.com>	2021-06-23 19:07:54 +02:00
Markus Paff	6cd49105e7	update api markdown files and add markdown file for ranker (#1198 ) * update api markdown files and add markdown file for ranker * added docstrings for weaviate * new version of pydoc-markdown does not render arguments correctly. We used pydoc-markdown==3.11.0	2021-06-15 17:50:08 +02:00
Branden Chan	7dbd58f6be	Add about sections (#1195 )	2021-06-14 18:37:00 +02:00
vblagoje	2a5882578a	Add Longform-QA (LFQA), Seq2SeqGenerator for generative QA and Retribert Retriever (#1086 ) * Integrate LFQA with Haystack * Integrate LFQA with Haystack - unit tests * Properly initialize conftest default value for vector_dim * Update PR after inital feedback * Fix conftest.py import * Seq2SeqGenerator uses Callables instead of subclasses for custom model input * Update docstring * Fix Callable use * Add LFQA tutorials * Improve type error reporting for invalid input converter Callable * Generate docstrings * Format comments in tutorial script * Generate tutorial md * Add usage page Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai> Co-authored-by: brandenchan <brandenchan@icloud.com>	2021-06-14 17:53:43 +02:00
Bob van Luijt	f583d0bfaf	Minor change with a link to the Weaviate docs (#1180 ) Super minor change, but in line with other DocumentStore's	2021-06-11 21:20:23 +02:00
Branden Chan	e7937ac5d7	Reformat FAQ page (#1177 ) * Add faq page * Update faq.md * Fix mypy CI * Add question * Reformat faq	2021-06-11 11:59:52 +02:00
Branden Chan	783893c3d2	Tutorial update (#1166 ) * Add header / footer * Add Milvus example * Generate md files * Fix mypy CI	2021-06-11 11:09:15 +02:00
Branden Chan	13edff109d	Documentation update (#1162 ) * Add content * Add German BERT references * Mention preprocessor language * Fix mypy CI * Add document length recommendation * Add more languages	2021-06-11 11:06:57 +02:00
Branden Chan	41b537affe	Add FAQ page (#1151 ) * Add faq page * Update faq.md * Fix mypy CI * Add question	2021-06-10 17:29:14 +02:00
venuraja79	49886f88f0	Integrate Weaviate as another DocumentStore (#1064 ) * Annotation Tool: data is not persisted when using local version #853 * First version of weaviate * First version of weaviate * First version of weaviate * Updated comments * Updated comments * ran query, get and write tests * update embeddings, dynamic schema and filters implemented * Initial set of tests and fixes * Tests added for update_embeddings and delete documents * introduced duplicate documents fix * fixed mypy errors * Added Weaviate to requirements * Fix the weaviate docker env variables * Fixing test dependencies for now * Created weaviate test marker and fixed query * Update docstring * Add documentation * Bump up weaviate version * Bump up weaviate version in documentation * Bump up weaviate version in documentation * Updgrade weaviate version Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>	2021-06-10 09:43:53 +02:00
Branden Chan	5f0f85989a	Refresh API docs (#1152 )	2021-06-09 16:13:58 +02:00
Julian Risch	580e28344d	Add docu of confidence scores and calibration method (#1131 ) * Add docu of confidence scores and calibration method	2021-06-03 15:49:07 +02:00
Julian Risch	8e3d0d1287	Distinguish labels for calculating similarity scores (#1124 ) * Distinguish labels for calculating similarity scores * Explain label "0" and "1" of TextPairClassifier in Ranker	2021-06-02 17:33:36 +02:00
Branden Chan	b555bc525c	Remove duplicate run (#1132 )	2021-06-02 13:58:55 +02:00
Branden Chan	9356f637d4	Update Milvus benchmarks (#1128 ) * Update Milvus benchmarks * Add sentence transformers * Update sentence transformers index results * Remove duplicate row	2021-06-02 13:09:45 +02:00
Julian Risch	84c34295a1	Re-ranking component for document search without QA (#1025 ) * Adding ranker similar to retriever and reader * Sort documents according to query-document similarity scores * Reranking and model training runs for small example * Added EvalRanker node * Calculate recall@k in EvalRetriever and EvalRanker nodes * Renaming EvalRetriever to EvalDocuments and EvalReader to EvalAnswers * Added mean reciprocal rank as metric for EvalDocuments * Fix bug that appeared when ranking documents with same score * Remove commented code for unimplmented eval() of Ranker node * Add documentation of k parameter in EvalDocuments * Add Ranker docu and renaming top_k param	2021-05-31 15:31:36 +02:00
Avishekh Shrestha	c4ee32d47d	Fix typo in preprocessing.md(#1087 ) Correct variable name from 'd' to 'doc' in line 134.	2021-05-23 19:16:58 +02:00
Lalit Pagaria	f46b09c756	Using text hash as id to prevent document duplication (#1000 ) * using text hash as id to prevent document duplication. Also providing a way customize it. * Add latest docstring and tutorial changes * Fixing duplicate value test when text is same * Adding test for duplicate ids in document store * Changing exception to generic Exception type * add exception for inmemory. update docstring Document. remove id_hash_keys from object attribute * Add latest docstring and tutorial changes * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>	2021-05-17 17:51:52 +02:00
brandenchan	5b0b3e4616	Merge branch 'master' of https://github.com/deepset-ai/haystack	2021-04-30 16:41:05 +02:00
brandenchan	4cc853d1c3	Update link	2021-04-30 15:06:45 +02:00
Branden Chan	869b493b61	Regen api docs (#1015 )	2021-04-30 12:35:13 +02:00
Mario Jäckle	a00703256f	docs(document_store): add usage information for aws elastic search (#1008 ) Co-authored-by: Mario Jäckle <m.jaeckle@careerpartner.eu>	2021-04-30 11:38:25 +02:00
Branden Chan	056be3354b	Add pipelines tutorial (#1013 )	2021-04-29 18:19:20 +02:00
Julian Risch	65f1da00cc	knowledge graph documentation (#979 ) * Create knowledge_graph.md * add doc strings to Text2SparqlRetriever * Add doc strings to GraphDBKnowledgeGraph * Make method calls unambiguous so its clear which class is meant	2021-04-27 16:44:40 +02:00
Markus Paff	cf8a622e35	Streamlit UI Evaluation mode (#920 ) * first running version of eval mode * restructuring, new naming of elements and testing * add new files to Docker, how to start with Haystack reference, remove not needed dependencies * Add latest docstring and tutorial changes * merged changes * fixing bugs after breaking changes from last release * newser version of states in streamlit, more docs for eval mode, eval file as env virable * eval file as env variable Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-04-22 17:30:17 +02:00
Branden Chan	9626c0d65e	Update Documentation (#976 ) * Add api pages * Add latest docstring and tutorial changes * First sweep of usage docs * Add link to conversion script * Add import statements * Add summarization page * Add web crawler documentation * Add confidence scores usage * Add crawler api docs * Regenerate api docs * Update summarizer and translator api * Add api pages * Add latest docstring and tutorial changes * First sweep of usage docs * Add link to conversion script * Add import statements * Add summarization page * Add web crawler documentation * Add confidence scores usage * Add crawler api docs * Regenerate api docs * Update summarizer and translator api * Add indentation (pydoc-markdown 3.10.1) * Comment out metadata * Remove Finder deprecation message * Remove Finder in FAQ * Update tutorial link * Incorporate reviewer feedback * Regen api docs * Add type annotations Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-04-22 16:45:29 +02:00
Branden Chan	77d4c2ca1c	Benchmark milvus (#850 ) * Add milvus benchmarking support * Add latest docstring and tutorial changes * Edit config * Disable docker interactive mode * Add milvus index type support * Adjust FAISS and Milvus node branching * Remove duplicate in config * Revert method for speedup * Add latest docstring and tutorial changes * Add latest benchmark run * Add latest docstring and tutorial changes * Add json files * Revert "Add latest docstring and tutorial changes" This reverts commit e2efa5f08aa4fb55bbeeed42aa76817d63fc8923. * Add latest docstring and tutorial changes * Revert "Add latest docstring and tutorial changes" This reverts commit b085a679b9d5f175e91c2c59565e73c5dec1374b. * Fix typo Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-04-13 14:54:15 +02:00
Markus Paff	b87daed62b	fixed link to dpr (#962 )	2021-04-13 09:45:04 +02:00
Markus Paff	dfb0282b74	Update milvus links and docstrings (#959 ) * update milvus links and docstrings * Add latest docstring and tutorial changes * new milvus version * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-04-12 14:38:57 +02:00
Timo Moeller	837dea4e6d	Integrate sentence transformers into benchmarks (#843 ) * Integrate sentence transformers into benchmarks * Add doc store asserts * switch data downloads from s3 client to https. add license info * Fix mypy, revert config Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-04-09 17:24:16 +02:00
Julian Risch	d38c07e0ee	knowledge graph example (#934 ) * Add knowledge graph module * Fix type hint * Add graph retriver module * Change type annotations, change return format * Add graph retriever that executes questions as sparql queries * Linking only those entities that are in the knowledge graph * Added logging and using relations extracted from Knowledge graph for linking * Preventing entity linking from linking the same token to multiple entities * Pruning triples that have no variables for select and count queries * Support knowledge graphs with Pipelines * Add text2sparql * Entity linking and relation linking consider more special cases now based on evaluation on labelled data * Separating example code from KGQA implementation * Add eval on combined extarctive and kg questions * Remove references to hp-test * Add fields sparql_query and long_answer_list to metadata * Removing modular Question2SPARQL approach * Removing additional classes used for modular kgqa approach * preparing lcquad data * change graph db * Translating namespaces in knowledge graph queries * Creating graphdb index and loading triples from .ttl file * Fetching graph config files, triples and model from S3 * Fix incompatibility issues with BaseGraphRetriever and BaseComponent * Removing unused utility functions * Adding doc strings and tutorial header * Adding sparqlwrapper dependency * Moving tutorial header * Sorting tutorials by number within name of notebook * Add latest docstring and tutorial changes * Creating test cases for knowledge graph * Changing knowledge graph example to harry potter * Add latest docstring and tutorial changes * Adapting the tutorial notebook to harry potter example * Add GraphDB fixture for tests * Add latest docstring and tutorial changes * Added GraphDB docker launch to CI * Use correct GraphDB fixture * Check if GraphDB instance is already running * Renaming question/query and incorporating other feedback from Timo and Tanay * Removed type annotation * Add latest docstring and tutorial changes Co-authored-by: oryx1729 <oryx1729@protonmail.com> Co-authored-by: Timo Moeller <timo.moeller@deepset.ai> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-04-08 14:05:33 +02:00
oryx1729	8c68699e1c	Refactor REST APIs to use Pipelines (#922 )	2021-04-07 17:53:32 +02:00
Julian Risch	64ad953c6a	Adding indentation to markup files (#947 )	2021-04-07 11:36:11 +02:00
Timo Moeller	5d2b16f3cc	Update farm version (#936 ) * Update farm version * Add new DPR loading, fix dpr param name * Add QA model confidence as answer probability, fix prams in test	2021-04-01 18:23:05 +02:00
Branden Chan	d77152c469	WIP: Add evaluation nodes for Pipelines (#904 ) * Add main eval fns * WIP: make pipeline_eval.py run * Fix typo * Add support for no_answers * Add latest docstring and tutorial changes * Working pipeline eval * Add timing of nodes * Add latest docstring and tutorial changes * Refactor and clean * Update tutorial script * Set default params * Update tutorials * Fix indent * Add latest docstring and tutorial changes * Address mypy issues * Add test * Fix mypy error * Clear outputs * Add doc strings * Incorporate reviewer feedback * Add latest docstring and tutorial changes * Revert query counting * Fix typo Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-04-01 17:35:18 +02:00
lewtun	32050fdce3	Add Milvus to the retriever / document store table (#931 )	2021-03-29 09:53:26 +02:00
Timo Moeller	1244d16010	Better default value for mp chunksize (#923 ) * Better default value for mp chunksize * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-03-25 19:00:45 +01:00
Lalit Pagaria	e904deefa7	Add Markdown file convertor (#875 )	2021-03-23 16:31:26 +01:00
Timo Moeller	f954f0db38	Fix top_k param in RAG tutorials (#906 ) * Fix top_k param * Add latest docstring and tutorial changes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2021-03-18 18:00:21 +01:00
Timo Moeller	7b559fa4e8	Improve dpr conversion (#826 ) * Bugfix dpr conversion * Add latest docstring and tutorial changes * Fix preprocessor changes	2021-03-18 14:51:01 +01:00
oryx1729	e9f0076dbd	Fix execution of Pipelines with parallel nodes (#901 )	2021-03-18 12:41:30 +01:00

426 Commits