Branden Chan
9626c0d65e
Update Documentation ( #976 )
...
* Add api pages
* Add latest docstring and tutorial changes
* First sweep of usage docs
* Add link to conversion script
* Add import statements
* Add summarization page
* Add web crawler documentation
* Add confidence scores usage
* Add crawler api docs
* Regenerate api docs
* Update summarizer and translator api
* Add indentation (pydoc-markdown 3.10.1)
* Comment out metadata
* Remove Finder deprecation message
* Remove Finder in FAQ
* Update tutorial link
* Incorporate reviewer feedback
* Regen api docs
* Add type annotations
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-04-22 16:45:29 +02:00
Malte Pietsch
b1e8ebf81a
Create pull_request_template.md
2021-04-22 15:48:39 +02:00
Andrey A
58ea0a62e0
Add links to GitHub Discussion and SO ( #984 )
...
* Add link to Stack Overflow
* Add link to GitHub discussions and re-arrange links
2021-04-22 09:51:21 +02:00
Timo Moeller
2e39361f8a
Add maxsamples and convert data dir to path ( #989 )
2021-04-22 09:35:11 +02:00
oryx1729
7269530e45
Add validation for root node in Pipeline ( #987 )
2021-04-21 12:18:33 +02:00
oryx1729
8c1e411380
Fix update_embeddings() for FAISSDocumentStore ( #978 )
2021-04-21 09:56:35 +02:00
Guillim
0051a34ff9
Add root_path option to REST API for reverse proxy deployments ( #982 )
2021-04-20 11:19:28 +02:00
oryx1729
4dd5a7a744
Make FAISS import optional ( #971 )
2021-04-15 12:26:34 +02:00
oryx1729
237172f459
Make FAISS import conditional ( #970 )
2021-04-14 17:34:01 +02:00
Mario Jäckle
84f90e82c5
feature(aws): add aws iam auth method ( #965 )
...
Co-authored-by: Mario Jäckle <m.jaeckle@careerpartner.eu>
2021-04-14 16:34:24 +02:00
oryx1729
5bb66940a9
Fix equality check in preprocessor ( #969 )
2021-04-14 16:03:48 +02:00
Markus Paff
0633dae4d0
new docs version ( #964 )
2021-04-14 13:40:05 +02:00
oryx1729
bba1d80aef
Update Haystack version
v0.8.0
2021-04-13 16:31:19 +02:00
Branden Chan
77d4c2ca1c
Benchmark milvus ( #850 )
...
* Add milvus benchmarking support
* Add latest docstring and tutorial changes
* Edit config
* Disable docker interactive mode
* Add milvus index type support
* Adjust FAISS and Milvus node branching
* Remove duplicate in config
* Revert method for speedup
* Add latest docstring and tutorial changes
* Add latest benchmark run
* Add latest docstring and tutorial changes
* Add json files
* Revert "Add latest docstring and tutorial changes"
This reverts commit e2efa5f08aa4fb55bbeeed42aa76817d63fc8923.
* Add latest docstring and tutorial changes
* Revert "Add latest docstring and tutorial changes"
This reverts commit b085a679b9d5f175e91c2c59565e73c5dec1374b.
* Fix typo
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-04-13 14:54:15 +02:00
Markus Paff
b87daed62b
fixed link to dpr ( #962 )
2021-04-13 09:45:04 +02:00
Julian Risch
8333a13d6f
Adding tutorial on knowledge graphs to README
2021-04-12 15:26:02 +02:00
Markus Paff
dfb0282b74
Update milvus links and docstrings ( #959 )
...
* update milvus links and docstrings
* Add latest docstring and tutorial changes
* new milvus version
* Add latest docstring and tutorial changes
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-04-12 14:38:57 +02:00
oryx1729
406f7fa679
Disable Gunicorn preload option ( #960 )
2021-04-12 12:46:52 +02:00
Timo Moeller
837dea4e6d
Integrate sentence transformers into benchmarks ( #843 )
...
* Integrate sentence transformers into benchmarks
* Add doc store asserts
* switch data downloads from s3 client to https. add license info
* Fix mypy, revert config
Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-04-09 17:24:16 +02:00
Julian Risch
d38c07e0ee
knowledge graph example ( #934 )
...
* Add knowledge graph module
* Fix type hint
* Add graph retriever module
* Change type annotations, change return format
* Add graph retriever that executes questions as sparql queries
* Linking only those entities that are in the knowledge graph
* Added logging and using relations extracted from Knowledge graph for linking
* Preventing entity linking from linking the same token to multiple entities
* Pruning triples that have no variables for select and count queries
* Support knowledge graphs with Pipelines
* Add text2sparql
* Entity linking and relation linking consider more special cases now based on evaluation on labelled data
* Separating example code from KGQA implementation
* Add eval on combined extractive and kg questions
* Remove references to hp-test
* Add fields sparql_query and long_answer_list to metadata
* Removing modular Question2SPARQL approach
* Removing additional classes used for modular kgqa approach
* preparing lcquad data
* change graph db
* Translating namespaces in knowledge graph queries
* Creating graphdb index and loading triples from .ttl file
* Fetching graph config files, triples and model from S3
* Fix incompatibility issues with BaseGraphRetriever and BaseComponent
* Removing unused utility functions
* Adding doc strings and tutorial header
* Adding sparqlwrapper dependency
* Moving tutorial header
* Sorting tutorials by number within name of notebook
* Add latest docstring and tutorial changes
* Creating test cases for knowledge graph
* Changing knowledge graph example to harry potter
* Add latest docstring and tutorial changes
* Adapting the tutorial notebook to harry potter example
* Add GraphDB fixture for tests
* Add latest docstring and tutorial changes
* Added GraphDB docker launch to CI
* Use correct GraphDB fixture
* Check if GraphDB instance is already running
* Renaming question/query and incorporating other feedback from Timo and Tanay
* Removed type annotation
* Add latest docstring and tutorial changes
Co-authored-by: oryx1729 <oryx1729@protonmail.com>
Co-authored-by: Timo Moeller <timo.moeller@deepset.ai>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-04-08 14:05:33 +02:00
oryx1729
fc6368c191
Fix passing a list of values as param ( #952 )
2021-04-07 19:50:50 +02:00
oryx1729
8c68699e1c
Refactor REST APIs to use Pipelines ( #922 )
2021-04-07 17:53:32 +02:00
Julian Risch
64ad953c6a
Adding indentation to markup files ( #947 )
2021-04-07 11:36:11 +02:00
lewtun
8894c4fae9
Reduce precision in pipeline eval print functions ( #943 )
...
A proposal to reduce the precision shown by `EvalRetriever.print` and `EvalReader.print` to 4 significant figures. Users who want the full precision can access the class attributes directly.
Before
```
Retriever
-----------------
has_answer recall: 0.8739495798319328 (208/238)
no_answer recall: 1.00 (120/120) (no_answer samples are always treated as correctly retrieved)
recall: 0.9162011173184358 (328 / 358)
```
After
```
Retriever
-----------------
has_answer recall: 0.8739 (208/238)
no_answer recall: 1.00 (120/120) (no_answer samples are always treated as correctly retrieved)
recall: 0.9162 (328 / 358)
```
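A minimal sketch of the formatting idea, not the actual Haystack code: the helper name `format_recall` and its parameters are hypothetical. It uses a fixed number of decimal places, which matches 4 significant figures for the values shown above, and leaves the raw ratio untouched for callers who need it.

```python
def format_recall(label: str, correct: int, total: int, decimals: int = 4) -> str:
    """Build a display line with the recall rounded to `decimals` places.

    Hypothetical helper for illustration only; the unrounded value is
    still available to the caller as correct / total.
    """
    # Guard against an empty evaluation set instead of dividing by zero.
    recall = correct / total if total else 0.0
    return f"{label}: {recall:.{decimals}f} ({correct}/{total})"


if __name__ == "__main__":
    print(format_recall("has_answer recall", 208, 238))
    print(format_recall("recall", 328, 358))
```

Rounding only at print time keeps the stored metric exact, so downstream comparisons are not affected by the shorter display.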
2021-04-06 05:11:29 +02:00
lewtun
41a1c8329d
Fix division by zero error in EvalRetriever ( #938 )
...
If the first query in the evaluation returns a document with `no_answer=True`, a division by zero error occurs because neither `self.has_answer_correct` nor `self.has_answer_count` is incremented. This fix moves the `self.has_answer_recall` calculation inside the if-else block.
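The pattern can be sketched as follows. This is an illustrative stand-in, not the real `EvalRetriever`: the class name `RecallCounter` and the `update` method are assumptions, and only the three `has_answer_*` attribute names come from the description above. The recall is recomputed only in the branch where the denominator has just been incremented.

```python
class RecallCounter:
    """Hypothetical counter showing the guarded recall update."""

    def __init__(self) -> None:
        self.no_answer_count = 0
        self.has_answer_count = 0
        self.has_answer_correct = 0
        self.has_answer_recall = 0.0

    def update(self, no_answer: bool, correct: bool) -> None:
        if no_answer:
            # No has_answer counters change here, so the recall is left alone.
            self.no_answer_count += 1
        else:
            self.has_answer_count += 1
            self.has_answer_correct += int(correct)
            # has_answer_count is at least 1 in this branch, so the
            # division cannot fail.
            self.has_answer_recall = self.has_answer_correct / self.has_answer_count


if __name__ == "__main__":
    counter = RecallCounter()
    counter.update(no_answer=True, correct=True)   # first query has no answer
    counter.update(no_answer=False, correct=True)
    print(counter.has_answer_recall)
```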
2021-04-03 18:13:36 +02:00
Timo Moeller
5d2b16f3cc
Update farm version ( #936 )
...
* Update farm version
* Add new DPR loading, fix dpr param name
* Add QA model confidence as answer probability, fix params in test
2021-04-01 18:23:05 +02:00
Branden Chan
d77152c469
WIP: Add evaluation nodes for Pipelines ( #904 )
...
* Add main eval fns
* WIP: make pipeline_eval.py run
* Fix typo
* Add support for no_answers
* Add latest docstring and tutorial changes
* Working pipeline eval
* Add timing of nodes
* Add latest docstring and tutorial changes
* Refactor and clean
* Update tutorial script
* Set default params
* Update tutorials
* Fix indent
* Add latest docstring and tutorial changes
* Address mypy issues
* Add test
* Fix mypy error
* Clear outputs
* Add doc strings
* Incorporate reviewer feedback
* Add latest docstring and tutorial changes
* Revert query counting
* Fix typo
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-04-01 17:35:18 +02:00
lewtun
32050fdce3
Add Milvus to the retriever / document store table ( #931 )
2021-03-29 09:53:26 +02:00
Guillim
55b7a820d4
Fixing inconsistency ( #926 )
...
Fixing inconsistency between pipe and p in the doc
2021-03-26 18:55:02 +01:00
Timo Moeller
1244d16010
Better default value for mp chunksize ( #923 )
...
* Better default value for mp chunksize
* Add latest docstring and tutorial changes
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-03-25 19:00:45 +01:00
Peter Adorjan
cafa1230da
Warning instead of Exception in FAISS and Milvus filtering ( #913 )
2021-03-23 17:49:47 +01:00
Lalit Pagaria
e904deefa7
Add Markdown file converter ( #875 )
2021-03-23 16:31:26 +01:00
Moshe Berchansky
47dc069afe
Fix memory allocation exception by specifying max_processes ( #910 )
2021-03-19 18:11:25 +01:00
Timo Moeller
f954f0db38
Fix top_k param in RAG tutorials ( #906 )
...
* Fix top_k param
* Add latest docstring and tutorial changes
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-03-18 18:00:21 +01:00
Branden Chan
26093452a4
Add code of conduct
2021-03-18 16:39:16 +01:00
Timo Moeller
7b559fa4e8
Improve dpr conversion ( #826 )
...
* Bugfix dpr conversion
* Add latest docstring and tutorial changes
* Fix preprocessor changes
2021-03-18 14:51:01 +01:00
oryx1729
e9f0076dbd
Fix execution of Pipelines with parallel nodes ( #901 )
2021-03-18 12:41:30 +01:00
Branden Chan
24d0c4d42d
Fix DPR training batch size ( #898 )
...
* Adjust batch size
* Add latest docstring and tutorial changes
* Update training results
* Add latest docstring and tutorial changes
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-03-17 18:33:59 +01:00
Peter Demin
992277e812
Run Grammarly over README.md ( #890 )
...
* Run Grammarly over README.md
* Update README.md
Co-authored-by: Andrey A. <56412611+aantti@users.noreply.github.com>
* Update README.md
Co-authored-by: Andrey A. <56412611+aantti@users.noreply.github.com>
* Update README.md
Co-authored-by: Andrey A. <56412611+aantti@users.noreply.github.com>
* Update README.md
Co-authored-by: Andrey A. <56412611+aantti@users.noreply.github.com>
* Update README.md
Co-authored-by: Andrey A. <56412611+aantti@users.noreply.github.com>
* Update README.md
Co-authored-by: Andrey A. <56412611+aantti@users.noreply.github.com>
* Update README.md
Co-authored-by: Andrey A. <56412611+aantti@users.noreply.github.com>
2021-03-16 18:00:57 +03:00
Mohamed Sayed
9ec2406a05
Remove broken tf-idf youtube link ( #888 )
...
The YouTube link points to a deleted video.
2021-03-11 14:23:05 +01:00
Malte Pietsch
91007c15dc
Add abstract run() method to Basecomponent ( #887 )
2021-03-11 12:47:10 +01:00
oryx1729
e0a118fd9a
Add support for parallel paths in Pipeline ( #884 )
2021-03-10 18:17:23 +01:00
oryx1729
6d00eff796
Add PDF converter in Dockerfiles ( #877 )
2021-03-08 09:55:11 +01:00
Malte Pietsch
81b83293c0
Update docker-compose.yml
2021-03-05 10:55:36 +01:00
Eric Lam
5484b8883b
Fix error when is_impossible does not exist ( #870 )
2021-03-04 18:42:42 +01:00
oryx1729
f3fb9aacce
Fix validation for split_respect_sentence_boundary in Preprocessor ( #869 )
2021-03-04 15:09:08 +01:00
oryx1729
4b188b8102
Add runtime parameters to component initialization ( #873 )
2021-03-04 12:18:12 +01:00
Paul Klyvis
1b609114b8
Fix elasticsearch auth modes ( #871 )
...
Co-authored-by: Paulius Klyvis <paul@convious.com>
2021-03-02 16:24:31 +01:00
Eric Lam
db75498278
Fix error when is_impossible not is_impossible and json dump encoding error ( #868 )
...
* Fix error when is_impossible not is_impossible and json dump encoding in multilingual data
Fixing #867
* Fix file encoding: open all files with utf-8
2021-03-02 13:54:58 +01:00
Malte Pietsch
762f194b27
Fix boolean progress_bar for disabling tqdm progressbar ( #863 )
2021-02-26 10:49:31 +01:00