64 Commits

Author SHA1 Message Date
Massimiliano Pippi
15bb6c2ea2
remove tutorials from the repo (#3244) 2022-09-20 18:32:45 +02:00
Sara Zan
e92ea4fccb
refactor: rename master into main in documentation and links (#3063)
* master->main

* revert master rename

* Revert change to sphinx link and rename master schema
2022-08-24 19:05:12 +02:00
Ofek Lev
f6a4a14790
refactor: update package metadata (#3079)
* Update package metadata

* fix yaml

* remove Python version cap

* address review
2022-08-24 09:46:21 +02:00
Tuana Celik
2298155a20
changing Slack to Discord (#3040)
* changing Slack to Discord

* Update README.md

* updating contributing
2022-08-15 15:56:16 +03:00
Daniel Fleischer
d91a5b0e15
Typo README.md (#2895) 2022-07-27 16:00:50 +02:00
bogdankostic
353da8b1c1
Add Tutorials 16, 17 and 18 to README (#2758) 2022-07-05 12:04:58 +02:00
Sara Zan
735ffa635b
[CI refactoring] Tutorials on CI (#2547)
* Experimental Ci workflow for running tutorials

* Run on every push for now

* Not starting?

* Disabling paths temporarily

* Sort tutorials in natural order

* Install ipython

* remove ipython install

* Try running ipython with sudo

* env.pythonLocation

* Skipping tutorial2 and 9 for speed

* typo

* Use one runner per tutorial, for now

* Typo in dependent job

* Missing quotes broke scripts matrix

* Simplify setup for the tutorials, try to prevent containers conflict

* Remove needless job dependencies

* Try prevent cache issues, fix small Tut10 bug

* Missing deps for running notebook tutorials

* Create three groups of tutorials excluding the longest among them

* remove deps

* use proper bash loop

* Try with a single string

* Fix typo in echo

* Forgot do

* Typo

* Try to make the GraphDB tutorial without launching its own container

* Run notebook and script together

* Whitespace

* separate scripts and notebooks execution

* Run notebooks first

* Try caching the GoT data before running the scripts

* add note

* fix mkdir

* Fix path

* Update Documentation & Code Style

* missing -r

* Fix folder numbering

* Run notebooks as well

* Typo in notebook command

* complete path in notebook command

* Try with TIKA_LOG_PATH

* Fix folder naming

* Do not use cached data in Tut9

* extracting the number better

* Small tweaks

* Same fix on Tut10 on the notebook

* Exclude GoT cache for tut5 too

* Remove faiss files after tutorial run

* Layout

* fix remove command

* Fix path in tut10 notebook

* Fix typo in node name in tut14

* Third block was too long, rebalancing

* Reduce GoT dataset even more, why wasting time after all...

* Fix paths in tut10 again

* do git clean to make sure to cleanup everything (breaks post Python)

* Remove ES file with bad permission at the end of the run

* Split first block, takes >30mins

* take out tut15 for a moment, has an actual bug

* typo

* Forgot rm option

* Simply remove all ES files

* Improve logs of GoT reduction

* Exclude also tut16 from cache to try fix bug

* Replace ll with ls

* Reintroduce 15_TableQA

* Small regrouping

* regrouping to make the min num of runners go for about 30mins

* Add cron schedule and PR paths conditions

* Add some timing information

* Separate tutorials by diff and tutorials by cron

* temp add pull_request to tutorials nightly

* Add badge in README to keep track of the nightly tutorials run

* Remove prefixes from data folder names

* Add fetch depth to get diff with master

* Fix paths again

* typo

* Exclude long-running ones

* Typo

* Fix tutorials.yml as well

* Use head_ref

* Using an action for now

* exclude other files

* Use only the correct command to run the tutorial

* Add long running tutorials in separate runners, just for experiment

* Factor out the complex bash script

* Pass the python path to the bash script

* Fix paths

* adding log statement

* Missing dollarsign

* Resetting variable in loop

* using mini GoT dataset and improving bash script

* change dataset name

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-06-15 09:53:36 +02:00
Branden Chan
caf1336424
Adjust pydoc markdown config so methods shown with classes (#2511)
* add_member_class_prefix: true

* Update Documentation & Code Style

* Trigger redeploy

* Trigger redeploy

* Fix pydoc param

* Update Documentation & Code Style

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-06 16:00:08 +02:00
Branden Chan
75dcfd3fab
Delete files in docs/_src (#2322)
* Delete files in _src

* Filter unused images and re-add images that were in use in docs/img

* Remove all usages of user-images.githubusercontent.com

Co-authored-by: ZanSara <sarazanzo94@gmail.com>
2022-04-12 16:19:03 +02:00
Tuana Celik
a97a9d2b48
adding quotes for zsh shell issue (#2289)
* adding quotes for zsh shell issue
2022-03-08 17:29:08 +01:00
Dmitry Goryunov
548c285f8d
Add who uses Haystack section (#1975)
* Add Airbus, Alcatel-Lucent, Etlab, Deepset
* Add BetterUp, Sooth.ai, and Infineon as users

Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
2022-02-14 16:21:41 +01:00
Branden Chan
9551523ebb
Update README.md (#2160)
Add rest api and ui info
2022-02-10 15:00:09 +01:00
Branden Chan
287314b2d2
Update Readme to reflect changes to installation procedure (#2157)
* Update README.md

* change milvus to milvus1
2022-02-10 11:54:06 +01:00
Sara Zan
9af1292cda
Remove stray requirements.txt files and update README.md (#2075)
* Remove stray requirements.txt files and update README.md

* Remove requirement files

* Add details about pip bug and link to setup.cfg

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-01-27 11:22:14 +01:00
Branden Chan
bec14b63c3
Add live demo link to readme (#1839) 2021-12-03 14:34:19 +01:00
Branden Chan
da90acf650
Update README.md (#1682) 2021-10-29 18:19:21 +02:00
Branden Chan
b9ea9a8ae0
Add collapsing sections to readme (#1663)
* Add collapsing sections to readme

* Add emojis

* Test new collapse style

* Test formatting

* Test formatting

* Test formatting

* Test formatting
2021-10-29 16:39:58 +02:00
bogdankostic
9025615be7
Add TableQA tutorial (#1670)
* Add TableQA tutorial

* Add tutorial header

* Add latest docstring and tutorial changes

* Add more details

* Add latest docstring and tutorial changes

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-10-29 11:07:13 +02:00
Branden Chan
171fd7be38
Update README.md (#1653)
* Update README.md

* Incorporate link into Haystack logo

* Fix jobs link

* Update tutorials and demo

* Change order of sections

* Rename tutorial section

* Create jobs and community sections

* Change wording

* Change section title

* Change wording

* Add tutorial links and pipeline image
2021-10-27 15:55:34 +02:00
Branden Chan
9b2f40100d
Replace Haystack banner for readme (#1654)
* Replace haystack banner for readme

* Replace haystack banner for readme

* Update README.md

* Crop image

* Update README.md

revert to image from master branch
2021-10-26 17:59:45 +02:00
Andrey A
33892cf609
Link the logo in readme to the website (#1649) 2021-10-26 15:04:58 +02:00
Julian Risch
0aba5ca57d
Update jobs link in readme (#1629) 2021-10-21 12:10:18 +02:00
Malte Pietsch
451e51a224
Update code snippet in readme 2021-10-14 18:15:20 +02:00
Malte Pietsch
183fd5ae5a
Simplify tests & allow running on individual doc stores (#1487)
* simplify tests for individual doc stores

* WIP refactoring markers of tests

* test alternative approach for tests with existing parametrization

* fix skip logic of already parametrized tests

* fix weaviate behaviour in tests - not parametrizing it in our general test cases.

* Add latest docstring and tutorial changes

* fix some tests

* remove sql from document_store_types

* fix markers for generator and pipeline test

* remove inmemory marker

* remove unneeded elasticsearch markers

* update readme and contributing.md

* update contributing

* adjust example

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-09-27 10:52:07 +02:00
Malte Pietsch
ff1adb64c2
Update README.md 2021-09-21 17:56:40 +02:00
oryx1729
9dd7c74f4f
Refactor communication between Pipeline Components (#1321) 2021-09-10 11:41:16 +02:00
Bob van Luijt
c0cc8bc80f
Bump Weaviate version to 1.7.0 (#1412)
* Bump Weaviate

* Bump Weaviate

* Bump Weaviate client

* Bump Weaviate

* Revert client version

There is a change in the client API that needs to be addressed before bumping its version
2021-09-05 09:28:55 +02:00
Malte Pietsch
f3d1df1664
Enable docker-compose for GPUs & Add public UI image (#1406)
* add docker-compose-gpu file

* Update README.md

* Update docker-compose.yml

* Update docker-compose-gpu.yml

* Update docker-compose.yml

* Update docker-compose-gpu.yml
2021-09-02 17:39:21 +02:00
Shahrukh Khan
4822536886
Add ImageToTextConverter and PDFToTextOCRConverter that utilize OCR (#1349)
* add image.py converter

* add PDFtoImageConverter

* add init to PDFtoImageConverter and classes to __init__

* update imagetotext pipeline

* update imagetotext pipeline

* update imagetotext pipeline

* update imagetotext pipeline

* update imagetotext pipeline

* update imagetotext pipeline

* update imagetotext pipeline

* revert change in base.py in file_conv

* Update base.py

* Update pdf.py

* add ocr file_converter testcase & update dockerfile

* fix tesseract exception message typo

* fix _image_to_text docstring

* add tesseract installation to CI

* add tesseract installation to CI

* add content test for PDF OCR converter

* update PDFToTextOCRConverter constructor docstring

* replace image files with tmp paths for image.py convert

* replace image files with tmp paths for image.py convert

* Update README.md

Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
2021-09-01 16:42:25 +02:00
Markus Paff
be8d305190
Editing docs read.me for new docs website workflow (#1372)
* editing docs read.me for new docs website workflow

* added new links to docs
2021-08-30 14:59:40 +02:00
annagruendler
a3c746abf5
Update test documentation in readme (#1355) 2021-08-19 10:36:21 +02:00
Tanay Pant
79df82aec6
Remove empty bullet points (#1342) 2021-08-12 20:09:18 +02:00
Malte Pietsch
66b10a508b
Update TOC of readme 2021-08-09 11:40:20 +02:00
Malte Pietsch
fb4d6e0381
Update README.md 2021-08-09 11:25:47 +02:00
Malte Pietsch
5a3ea5843f
Fix Tutorial Links 2021-08-09 11:22:19 +02:00
Shahrukh Khan
f99c14268a
Update README.md for new tutorials 13 and 14 (#1325)
* Update README.md

* Update README.md
2021-08-09 10:44:42 +02:00
Julian Risch
90f826e95e
Add links to tutorial 12 to readme (#1274) 2021-07-13 11:23:10 +02:00
Malte Pietsch
600636e77b
Update README.md 2021-06-08 09:23:56 +02:00
Malte Pietsch
a1472b040c
Add badges (#1136) 2021-06-03 14:47:08 +02:00
Julian Risch
84c34295a1
Re-ranking component for document search without QA (#1025)
* Adding ranker similar to retriever and reader

* Sort documents according to query-document similarity scores

* Reranking and model training runs for small example

* Added EvalRanker node

* Calculate recall@k in EvalRetriever and EvalRanker nodes

* Renaming EvalRetriever to EvalDocuments and EvalReader to EvalAnswers

* Added mean reciprocal rank as metric for EvalDocuments

* Fix bug that appeared when ranking documents with same score

* Remove commented code for unimplemented eval() of Ranker node

* Add documentation of k parameter in EvalDocuments

* Add Ranker docu and renaming top_k param
2021-05-31 15:31:36 +02:00
Branden Chan
aadd8b049a
Add Tutorial 11 to Readme 2021-05-05 15:35:21 +02:00
Andrey A
58ea0a62e0
Add links to GitHub Discussion and SO (#984)
* Add link to Stack Overflow

* Add link to GitHub discussions and re-arrange links
2021-04-22 09:51:21 +02:00
Julian Risch
8333a13d6f
Adding tutorial on knowledge graphs to README 2021-04-12 15:26:02 +02:00
Lalit Pagaria
e904deefa7
Add Markdown file convertor (#875) 2021-03-23 16:31:26 +01:00
Peter Demin
992277e812
Run Grammarly over README.md (#890)
* Run Grammarly over README.md

* Update README.md

Co-authored-by: Andrey A. <56412611+aantti@users.noreply.github.com>

* Update README.md

Co-authored-by: Andrey A. <56412611+aantti@users.noreply.github.com>

* Update README.md

Co-authored-by: Andrey A. <56412611+aantti@users.noreply.github.com>

* Update README.md

Co-authored-by: Andrey A. <56412611+aantti@users.noreply.github.com>

* Update README.md

Co-authored-by: Andrey A. <56412611+aantti@users.noreply.github.com>

* Update README.md

Co-authored-by: Andrey A. <56412611+aantti@users.noreply.github.com>

* Update README.md

Co-authored-by: Andrey A. <56412611+aantti@users.noreply.github.com>
2021-03-16 18:00:57 +03:00
Branden Chan
a6a3b74199
Fix image in README 2021-02-16 17:05:15 +01:00
Andrey A
e0be5639ef
Update README.md 2021-02-16 18:47:14 +03:00
Andrey A
ab89fac76a
Update README.md 2021-02-16 18:45:20 +03:00
Andrey A
5c9f7d493c
Fix link to Quick Demo in ToC. (#831) 2021-02-16 16:38:04 +01:00
Branden Chan
7030c94325
Revamp Readme (#820)
* Text changes

* Add new images

* First improvements

* Next iteration

* Resize gif

* Add bold

* Update key concepts diagram

* Center image

* Initial import of a more detailed README.md

* Slight changes to ToC, requirements and across the text.

* Grammar and Streamlit UI png.

* Unfix size of gif for mobile

* Remove requirements, add formatting to numbered lists.

* Formatting, remove img size options.

* Another iteration of phrasing the note about open ports.

* Rephrase the note about the docker ports.

Co-authored-by: Andrey A <56412611+aantti@users.noreply.github.com>
2021-02-16 15:32:43 +01:00