haystack

mirror of https://github.com/deepset-ai/haystack.git synced 2025-11-11 23:54:37 +00:00

Author	SHA1	Message	Date
Branden Chan	a3a12bc95b	Remove broken link	2021-01-13 17:32:10 +01:00
brandenchan	01fd9940d8	Fix tutorial link	2021-01-13 15:29:25 +01:00
Branden Chan	7376185b65	Create DPR training tutorial (#708 ) * WIP: Start DPR training tutorial * Create basics of DPR Train tutorial * Update documentation * Allow DPR to be initialized without document store * WIP: Add param descriptions to DPR notebook * Clean tutorial * Improve loading * Make doc store optional when loading DPR * Satisfy mypy type check * Add links * Add tutorial header * Add colab badge * Clear outputs * Incorporate reviewer feedback * WIP: Start DPR training tutorial * Create basics of DPR Train tutorial * Update documentation * Allow DPR to be initialized without document store * WIP: Add param descriptions to DPR notebook * Clean tutorial * Improve loading * Make doc store optional when loading DPR * Satisfy mypy type check * Add links * Add tutorial header * Add colab badge * Clear outputs * Incorporate reviewer feedback * Add readme links * Regenerate tutorials * Add excitement * Fix typo * Fix hard negatives comment * Wrap tutorial for windows users * Fix mypy issue	2021-01-13 10:33:55 +01:00
Markus Paff	3af3ee1a12	Automate docstring and tutorial generation with every push to master (#718 ) * automate docstring and tutorial generation with every push to master * test CI for current branch * fixed yaml syntax * add setupttools to install process * checkout repo * fixed command for shell script * install wheel as it is needed for CI * install mkdocs * test without shell script * use package from github actions * test other configuration * back to right config * cleaning script	2021-01-11 16:25:43 +01:00
Branden Chan	bb8aba18e0	Create Preprocessing Tutorial (#706 ) * WIP: First version of preprocessing tutorial * stride renamed overlap, ipynb and py files created * rename split_stride in test * Update preprocessor api documentation * define order for markdown files * define order of modules in api docs * Add colab links * Incorporate review feedback Co-authored-by: PiffPaffM <markuspaff.mp@gmail.com>	2021-01-06 15:54:05 +01:00
Malte Pietsch	a2e5e6b09e	Update pipeline documentation and readme (#693 ) * Update README.md * Update pipelines.md * Update pipelines.md * Update README.md	2020-12-22 13:34:28 +01:00
Markus Paff	b752da1cd5	Add docs v0.6.0 (#689 ) * new docs version * updated directory structure * Add pipelines page * Add Finder deprecation suggestion * header for pipelines file * Document MySQL support * Mention DPR train tutorial coming soon * Mention open distro ES * Update doc strings regarding similarity fn * Add link to API docs * Wrap pipelines docs in box * add api reference for pipelines * copied latest version to v0.6.0 * Remove space * Remove space * Copy to v0.6.0 Co-authored-by: brandenchan <brandenchan@icloud.com>	2020-12-18 12:47:27 +01:00
Branden Chan	d8154939fc	Scale dot product into probabilities (#667 ) * scale dot product * Add tip in documentation * Add recommendation boxes * WIP: Use similarity attribute in all doc stores * Implement similarity for InMemoryDS * Add FAISS support * Clean printout * Update documentation * Implement document field map	2020-12-11 12:10:24 +01:00
Malte Pietsch	149d98a0fd	Add latest benchmark run (#652 ) * add latest benchmark run * update templates and fix small json errors * Change scale Co-authored-by: brandenchan <brandenchan@icloud.com>	2020-12-10 16:25:51 +01:00
Branden Chan	8c904d79d6	Fix links (#663 )	2020-12-08 10:28:31 +01:00
Malte Pietsch	e6ada08d0e	Update query arg in Tutorial 7 (#656 )	2020-12-04 08:42:09 +01:00
Branden Chan	79555148ac	Add link to FAISS Info in documentation (#643 ) * Add link to FAISS info * Clean link	2020-12-02 15:24:22 +01:00
brandenchan	cdd009d1ef	Better payload example spacing	2020-12-01 13:07:29 +01:00
Branden Chan	e573c9e27d	Improve User Feedback Documentation (#539 ) * Extend docs * Add User Feedback API calls * Incorporate reviewer feedback	2020-12-01 12:55:31 +01:00
Branden Chan	5e5dba9587	Add api md (#631 )	2020-11-27 17:26:53 +01:00
brandenchan	ce6cba227f	Fix website typo	2020-11-27 16:07:29 +01:00
Markus Paff	88d0ee2c98	Add boxes for recommendations (#629 ) * add boxes for recommendations * add more recommendation boxes Co-authored-by: brandenchan <brandenchan@icloud.com>	2020-11-27 16:00:20 +01:00
Branden Chan	ae530c3a41	Fix docstring examples (#604 ) * Fix docstring examples * Unify code example format * Add md files	2020-11-25 14:19:49 +01:00
Markus Paff	3dee284f20	cleaning the api docs (#616 )	2020-11-24 18:49:14 +01:00
Branden Chan	1e8af84ecc	Make more changes to documentation (#578 ) * First batch of changes * Add RAG tutorial links * Prettify RAG tutorial * draft of generator doc * Add text * Complete generator page * Create optimization section * Split intro * Fix formatting tutorial 7	2020-11-19 14:58:27 +01:00
Branden Chan	2aa3c071fd	Remove column in benchmark website (#608 ) * Make benchmarks clearer * remove column	2020-11-19 12:18:47 +01:00
Branden Chan	827a40b12a	Make benchmarks clearer (#606 )	2020-11-19 10:31:43 +01:00
brandenchan	090a8cf3e9	Revert "First batch of changes" This reverts commit c07182aa0ab77106cdb142f4ca43ff02476e6fbf.	2020-11-12 12:27:16 +01:00
brandenchan	c07182aa0a	First batch of changes	2020-11-12 12:07:02 +01:00
Malte Pietsch	ea0fd405d8	add concept sketch	2020-11-07 08:42:01 +01:00
Markus Paff	4cca3b5290	New docs version v0.5.0 (#560 )	2020-11-06 13:17:04 +01:00
Branden Chan	99e924aede	Update Documentation for Haystack 0.5.0 (#557 ) * Add languages and preprocessing pages * add content * address review comments * make link relative * update api ref with latest docstrings * move doc readme and update * add generator API docs * fix example code * design and link fix Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai> Co-authored-by: PiffPaffM <markuspaff.mp@gmail.com>	2020-11-06 10:53:22 +01:00
Markus Paff	40c5c8edb4	Added new formatting for examples in docstrings (#555 )	2020-11-05 15:50:08 +01:00
Malte Pietsch	df13a6830d	Update annotation docs for website (#505 ) * update annotation docs for website * add md file for docs * add user manual	2020-11-03 11:24:06 +01:00
Malte Pietsch	50709a3f9d	Fix retriever mAP benchmarks	2020-11-02 19:55:58 +01:00
Branden Chan	3793205aa3	Merge branch 'master' into fix_website	2020-10-29 10:29:25 +01:00
Branden Chan	2ba5417f8e	Fix metric for benchmarks website page	2020-10-29 10:26:48 +01:00
Branden Chan	7c81dfdc3a	Address reviewer comments	2020-10-27 12:41:11 +01:00
brandenchan	d3743d00e9	Merge branch 'master' into automate_benchmarks	2020-10-21 17:48:10 +02:00
Lalit Pagaria	63c12371b9	Change arg "model" to "model_name_or_path" in TransformersReader (#510 ) * Consistent parameter naming for TransformersReader along with removing unused imports as well. * Addressing review comments	2020-10-21 17:15:35 +02:00
Malte Pietsch	3434d5205d	Update doc string for ElasticsearchDocumentStore.write_documents() & sync markdown files (#501 ) * update doc string for ElasticsearchDocumentStore.write_documents() * update all markdowns with latest docstrings	2020-10-19 13:56:38 +02:00
Markus Paff	2531c8e061	Add versioning docs (#495 ) * add time and perf benchmark for es * Add retriever benchmarking * Add Reader benchmarking * add nq to squad conversion * add conversion stats * clean benchmarks * Add link to dataset * Update imports * add first support for neg psgs * Refactor test * set max_seq_len * cleanup benchmark * begin retriever speed benchmarking * Add support for retriever query index benchmarking * improve reader eval, retriever speed benchmarking * improve retriever speed benchmarking * Add retriever accuracy benchmark * Add neg doc shuffling * Add top_n * 3x speedup of SQL. add postgres docker run. make shuffle neg a param. add more logging * Add models to sweep * add option for faiss index type * remove unneeded line * change faiss to faiss_flat * begin automatic benchmark script * remove existing postgres docker for benchmarking * Add data processing scripts * Remove shuffle in script bc data already shuffled * switch hnsw setup from 256 to 128 * change es similarity to dot product by default * Error includes stack trace * Change ES default timeout * remove delete_docs() from timing for indexing * Add support for website export * update website on push to benchmarks * add complete benchmarks results * new json format * removed NaN as is not a valid json token * versioning for docs * unsaved changes * cleaning * cleaning * Edit format of benchmarks data * update also jsons in v0.4.0 Co-authored-by: brandenchan <brandenchan@icloud.com> Co-authored-by: deepset <deepset@Crenolape.localdomain> Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>	2020-10-19 11:46:51 +02:00
Branden Chan	1cebcb7dda	Create time and performance benchmarks for all readers and retrievers (#339 ) * add time and perf benchmark for es * Add retriever benchmarking * Add Reader benchmarking * add nq to squad conversion * add conversion stats * clean benchmarks * Add link to dataset * Update imports * add first support for neg psgs * Refactor test * set max_seq_len * cleanup benchmark * begin retriever speed benchmarking * Add support for retriever query index benchmarking * improve reader eval, retriever speed benchmarking * improve retriever speed benchmarking * Add retriever accuracy benchmark * Add neg doc shuffling * Add top_n * 3x speedup of SQL. add postgres docker run. make shuffle neg a param. add more logging * Add models to sweep * add option for faiss index type * remove unneeded line * change faiss to faiss_flat * begin automatic benchmark script * remove existing postgres docker for benchmarking * Add data processing scripts * Remove shuffle in script bc data already shuffled * switch hnsw setup from 256 to 128 * change es similarity to dot product by default * Error includes stack trace * Change ES default timeout * remove delete_docs() from timing for indexing * Add support for website export * update website on push to benchmarks * add complete benchmarks results * new json format * removed NaN as is not a valid json token * fix benchmarking for faiss hnsw queries. do sql calls in update_embeddings() as batches * update benchmarks for hnsw 128,20,80 * don't delete full index in delete_all_documents() * update texts for charts * update recall column for retriever * change scale and add units to desc * add units to legend * add axis titles. update desc * add html tags Co-authored-by: deepset <deepset@Crenolape.localdomain> Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai> Co-authored-by: PiffPaffM <markuspaff.mp@gmail.com>	2020-10-12 13:34:42 +02:00
Malte Pietsch	8edeb844f7	Remove phi normalization from FAISS, support more index types, 3x speedup (#467 ) * remove phi normalization * add special case for hnsw * rename vector_size to vector_dim * fix loading. fix extra dim in tests * switch to new ES syntax for vector similarity * 3x sql speed up. cascade deletes. add train_index() * add docstrings. remove vector_dim from load() * delete docs from faiss and sql * fix delete of docs in test * relax type hint for faiss index * rename metric to metric_type Co-authored-by: lalitpagaria <19303690+lalitpagaria@users.noreply.github.com>	2020-10-06 16:09:56 +02:00
Markus Paff	56852f820b	READ.me for Docstring Generation and remove not needed files (#468 )	2020-10-06 15:16:56 +02:00
Markus Paff	25f34babce	Separate data and view for benchmarks (#451 ) * separate data and view for benchmarks * fixed typo	2020-10-06 10:30:19 +02:00
Malte Pietsch	dfe244e287	Fix typos in roadmap (#434 )	2020-09-25 11:28:46 +02:00
Malte Pietsch	0a123707e4	Fix typos in roadmap (#433 )	2020-09-25 07:38:48 +02:00
Malte Pietsch	15c0064498	add roadmap section to docs (#432 )	2020-09-24 23:43:40 +02:00
Markus Paff	6b35e38e12	Fixed tabs for haystack-website issue (#427 )	2020-09-24 10:36:18 +02:00
Markus Paff	66a1893f79	Moved files to api directory (#418 )	2020-09-22 11:48:26 +02:00
Markus Paff	8e044dc16f	Fix typo in documentation (#406 ) Co-authored-by: Antonio Lanza <antoniolanza1996@gmail.com>	2020-09-21 13:31:00 +02:00
brandenchan	f4a1682570	Fix images	2020-09-18 14:58:03 +02:00
Branden Chan	7fdb85d63a	Create documentation website (#272 ) * Skeleton of doc website * Flesh out documentation pages * Split concepts into their own rst files * add tutorial rsts * Consistent level 1 markdown headers in tutorials * Change theme to readthedocs * Turn bullet points into prose * Populate sections * Add more text * Add more sphinx files * Add more retriever documentation * combined all documenations in one structure * rename of src to _src as it was ignored by git * Incorporate MP2's changes * add benchmark bar charts * Adapt docstrings in Readers * Improvements to intro, creation of glossary * Adapt docstrings in Retrievers * Adapt docstrings in Finder * Adapt Docstrings of Finder * Updates to text * Edit text * update doc strings * proof read tutorials * Edit text * Edit text * Add stacked chart * populate graph with data * Switch Documentation to markdown (#386) * add way to generate markdown files to sphinx * changed from rst to markdown and extended sphinx for it * fix spelling * Clean titles * delete file * change spelling * add sections to document store usage * add basic rest api docs * fix readme in setup.py * Update Tutorials * Change section names * add windows note to pip install * update intro * new renderer for markdown files * Fix typos * delete dpr_utils.py * fix windows note in get started * Fix docstrings * deleted rest api docs in api * fixed typo * Fix docstring * revert readme to rst * Fix readme * Update setup.py Co-authored-by: deepset <deepset@Crenolape.localdomain> Co-authored-by: PiffPaffM <markuspaff.mp@gmail.com> Co-authored-by: Bogdan Kostić <bogdankostic@web.de> Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>	2020-09-18 12:57:32 +02:00
Malte Pietsch	3782646948	Add logo to readme (#384 ) * add logo image * add logo to readme * change img path to master * Update README.rst	2020-09-16 18:36:22 +02:00

661 Commits