lewtun
32050fdce3
Add Milvus to the retriever / document store table ( #931 )
2021-03-29 09:53:26 +02:00
Guillim
55b7a820d4
Fixing inconsistency ( #926 )
...
Fixing inconsistency between pipe and p in the doc
2021-03-26 18:55:02 +01:00
Timo Moeller
1244d16010
Better default value for mp chunksize ( #923 )
...
* Better default value for mp chunksize
* Add latest docstring and tutorial changes
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-03-25 19:00:45 +01:00
Peter Adorjan
cafa1230da
Warning instead of Exception in FAISS and Milvus filtering ( #913 )
2021-03-23 17:49:47 +01:00
Lalit Pagaria
e904deefa7
Add Markdown file converter ( #875 )
2021-03-23 16:31:26 +01:00
Moshe Berchansky
47dc069afe
Fix for memory allocation exception by specifying max_processes ( #910 )
2021-03-19 18:11:25 +01:00
Timo Moeller
f954f0db38
Fix top_k param in RAG tutorials ( #906 )
...
* Fix top_k param
* Add latest docstring and tutorial changes
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-03-18 18:00:21 +01:00
Branden Chan
26093452a4
Add code of conduct
2021-03-18 16:39:16 +01:00
Timo Moeller
7b559fa4e8
Improve dpr conversion ( #826 )
...
* Bugfix dpr conversion
* Add latest docstring and tutorial changes
* Fix preprocessor changes
2021-03-18 14:51:01 +01:00
oryx1729
e9f0076dbd
Fix execution of Pipelines with parallel nodes ( #901 )
2021-03-18 12:41:30 +01:00
Branden Chan
24d0c4d42d
Fix DPR training batch size ( #898 )
...
* Adjust batch size
* Add latest docstring and tutorial changes
* Update training results
* Add latest docstring and tutorial changes
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-03-17 18:33:59 +01:00
Peter Demin
992277e812
Run Grammarly over README.md ( #890 )
...
* Run Grammarly over README.md
* Update README.md
Co-authored-by: Andrey A. <56412611+aantti@users.noreply.github.com>
* Update README.md
Co-authored-by: Andrey A. <56412611+aantti@users.noreply.github.com>
* Update README.md
Co-authored-by: Andrey A. <56412611+aantti@users.noreply.github.com>
* Update README.md
Co-authored-by: Andrey A. <56412611+aantti@users.noreply.github.com>
* Update README.md
Co-authored-by: Andrey A. <56412611+aantti@users.noreply.github.com>
* Update README.md
Co-authored-by: Andrey A. <56412611+aantti@users.noreply.github.com>
* Update README.md
Co-authored-by: Andrey A. <56412611+aantti@users.noreply.github.com>
2021-03-16 18:00:57 +03:00
Mohamed Sayed
9ec2406a05
Remove broken tf-idf youtube link ( #888 )
...
The youtube link is of a deleted video.
2021-03-11 14:23:05 +01:00
Malte Pietsch
91007c15dc
Add abstract run() method to BaseComponent ( #887 )
2021-03-11 12:47:10 +01:00
oryx1729
e0a118fd9a
Add support for parallel paths in Pipeline ( #884 )
2021-03-10 18:17:23 +01:00
oryx1729
6d00eff796
Add PDF converter in Dockerfiles ( #877 )
2021-03-08 09:55:11 +01:00
Malte Pietsch
81b83293c0
Update docker-compose.yml
2021-03-05 10:55:36 +01:00
Eric Lam
5484b8883b
Fix error when is_impossible does not exist ( #870 )
2021-03-04 18:42:42 +01:00
oryx1729
f3fb9aacce
Fix validation for split_respect_sentence_boundary in Preprocessor ( #869 )
2021-03-04 15:09:08 +01:00
oryx1729
4b188b8102
Add runtime parameters to component initialization ( #873 )
2021-03-04 12:18:12 +01:00
Paul Klyvis
1b609114b8
Fix elasticsearch auth modes ( #871 )
...
Co-authored-by: Paulius Klyvis <paul@convious.com>
2021-03-02 16:24:31 +01:00
Eric Lam
db75498278
Fix error when is_impossible not is_impossible and json dump encoding error ( #868 )
...
* Fix error when is_impossible not is_impossible and json dump encoding in multilingual data
Fixing #867
* Fix file encoding: open all files with utf-8
2021-03-02 13:54:58 +01:00
Malte Pietsch
762f194b27
Fix boolean progress_bar for disabling tqdm progressbar ( #863 )
2021-02-26 10:49:31 +01:00
Branden Chan
325a4e4d14
Add Milvus Documentation ( #838 )
...
* First commit
* Add latest docstring and tutorial changes
* Add DocStore external setup info
* fixed tabs
* Add Milvus recommendation
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Markus Paff <markuspaff.mp@gmail.com>
2021-02-24 11:43:40 +01:00
venuraja79
e930d8a717
Annotation Tool: data is not persisted when using local version #853 ( #855 )
2021-02-21 15:35:45 +01:00
Tu NGUYEN
ba91a90dd6
Fix download nltk preprocessor ( #852 )
2021-02-21 10:17:50 +01:00
Malte Pietsch
e641bff7a6
Allow more options for elasticsearch client (auth, multiple hosts) ( #845 )
...
* allow more options for elasticsearch client (auth, multiple hosts)
* Add latest docstring and tutorial changes
* fix mypy
* Add latest docstring and tutorial changes
* test client connection via ping()
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-02-19 14:29:59 +01:00
Divya Yeruva
6c3ec540a4
Add crawler to get texts from websites ( #775 )
...
* add fetch_data_from_url to extract data and store as files
* corrected a typo
* corrected variable name error
* correction of urlparse error
* type error
* added selenium, urllib to requirements
* removed urllib
* minor changes and added function to find out inpage navigation links
* quick duplicate links fix
* quick type annotation fix
* created separate module for crawler
* type error fix
* type error fix
* import fix
* quick type error fix
* added return description
* updated include type to list
* refactor modules. Add Crawler class. rename params.
* add basic pipeline compatibility
* update docstrings
* fix mypy issues
* update args, docstrings, return filepaths
* fix mypy
* make urls optional in init
Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
2021-02-18 12:00:49 +01:00
Malte Pietsch
d700592c9a
Update GPU Docker image (CUDA 11, fix FAISS) ( #836 )
2021-02-17 12:40:00 +01:00
Malte Pietsch
abf2d63c92
Upgrade FAISS to 1.7.0 ( #834 )
2021-02-17 10:00:33 +01:00
Branden Chan
a6a3b74199
Fix image in README
2021-02-16 17:05:15 +01:00
Andrey A
e0be5639ef
Update README.md
2021-02-16 18:47:14 +03:00
Andrey A
ab89fac76a
Update README.md
2021-02-16 18:45:20 +03:00
Andrey A
5c9f7d493c
Fix link to Quick Demo in ToC. ( #831 )
2021-02-16 16:38:04 +01:00
Tanay Soni
07907f9eac
Add support for indexing pipelines ( #816 )
2021-02-16 16:24:28 +01:00
Branden Chan
7030c94325
Revamp Readme ( #820 )
...
* Text changes
* Add new images
* First improvements
* Next iteration
* Resize gif
* Add bold
* Update key concepts diagram
* Center image
* Initial import of a more detailed README.md
* Slight changes to ToC, requirements and across the text.
* Grammar and Streamlit UI png.
* Unfix size of gif for mobile
* Remove requirements, add formatting to numbered lists.
* Formatting, remove img size options.
* Another iteration of phrasing the note about open ports.
* Rephrase the note about the docker ports.
Co-authored-by: Andrey A <56412611+aantti@users.noreply.github.com>
2021-02-16 15:32:43 +01:00
Malte Pietsch
47aae14efa
relax assert precision of arrays
2021-02-15 14:52:13 +01:00
Malte Pietsch
9b1924a54a
Revert TOP_K_PER_CANDIDATE value to 3
2021-02-15 14:30:04 +01:00
Malte Pietsch
0eaae3c0dd
Fix UI when API returns fewer answers than expected ( #828 )
...
* fix ui for few answers from api. add top_k_per_sample env
* Add latest docstring and tutorial changes
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-02-15 14:27:17 +01:00
brandenchan
fe47e3a45e
Fix link in documentation
2021-02-15 11:15:54 +01:00
Malte Pietsch
6798192d40
Add API endpoint to export accuracy metrics from user feedback + created_at timestamp ( #803 )
...
* WIP feedback metrics
* fix filters and zero division
* add created_at and model_name fields to labels
* add created_at value
* remove debug log level
* fix attribute init
* move timestamp creation down to docstore / db level
* fix import
2021-02-15 10:48:59 +01:00
brandenchan
03cda26d85
Fix link in Tutorial 8
2021-02-15 10:45:27 +01:00
Lalit Pagaria
5bd94ac5f7
Adding Translator (standalone component & wrapper for pipelines) ( #782 )
...
* Adding translator with many generic input parameter support
* Making dict_key generic
* Fixing mypy issue
* Adding pipeline and using opus models
* Add latest docstring and tutorial changes
* Adding test cases for end-to-end translation for generator, summarizer etc
* raise error join and merge nodes
* Fix test failure
* add docstrings. add usage documentation. rm skip_special_tokens param
* Add latest docstring and tutorial changes
* fix code snippets in md
* Adding few extra configuration parameters and fixing tests
* Fixing mypy issue and updating usage document
* fix for mypy issue in pipeline.py
* reverting renaming of pytest_collection_modifyitems method
* Addressing review comments
* setting skip_special_tokens to True
* removing model_max_length argument as None type is not supported by many models
* Removing padding parameter. Better to leave it as default; otherwise it causes a tensor size mismatch error. If this option is required by users, it can be added later.
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
2021-02-12 15:58:26 +01:00
oryx1729
4059805d89
Fix ElasticsearchDocumentStore.query_by_embedding() ( #823 )
2021-02-12 14:57:06 +01:00
Pavel Soriano
8adf5b4737
Allow non-standard Tokenizers (e.g. CamemBERT) for DPR via new arg ( #811 )
...
* added parameter to infer DPR tokenizers class
* Add latest docstring and tutorial changes
* Update docstring. fix mypy
* Add latest docstring and tutorial changes
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
2021-02-12 14:17:55 +01:00
oryx1729
c4607cbd98
Revamp CI ( #825 )
2021-02-12 13:38:54 +01:00
Branden Chan
c807f0d050
Add key concepts diagram
2021-02-12 12:49:22 +01:00
Tanay Soni
8b0031bfc1
Remove conditional import of FAISS for Windows ( #819 )
2021-02-12 12:15:23 +01:00
Branden Chan
a1983ad84e
Add new images
2021-02-11 15:10:00 +01:00
Branden Chan
db0364c728
Fix uvloop version to maintain Python<3.7 support
...
uvloop released v0.15, which requires Python >=3.7. This commit pins the version so that Haystack can be installed directly in Colab using pip
2021-02-10 19:16:53 +01:00