Branden Chan
937247d628
Add QuestionGenerator ( #1267 )
* Create basic Question Generation
* Split texts into 50 word chunks
* Allow prompt to be changed
* Implement iteration functionality in DS
* Add docstrings, create pipelines
* Make pipelines work
* Add comments
* Add tests
* Add tutorials and docs
* Add doc string
2021-07-26 17:20:43 +02:00
Branden Chan
363be65a78
Implement OpenSearch ANN ( #1225 )
* Simplify ODES init
* Add arguments to ES init and create script
* Rename similarity_fn_name and add util fn
* Create OpenSearchDocumentStore
* Specify params of Open Search HNSW
* Add better argument handling
* Update opensearch index mapping
* Edit opensearch default port
* Fix HNSW mapping
* Force small HNSW params
* Implement auto start and stopping of document store services
* Fix starting and stopping of ds service
* Restore HNSW params
* Add opensearch query benchmarks
* Add write wait time
* Revert wait time
* Add timeout
* Update benchmarks
* Update benchmarks json
* Update documentation
* Fix similarity name
* Improve argument passing
* Improve stopping and starting of service
2021-07-26 10:52:52 +02:00
Malte Pietsch
4c2a0b914a
Remove pipeline eval example script ( #1297 )
2021-07-21 11:12:04 +02:00
Srevin Saju
7d6548100a
Add support for elasticsearch to connect without any authentication ( #1294 )
2021-07-21 10:47:52 +02:00
oryx1729
e857233313
Add Header in sample REST API Search Request ( #1293 )
2021-07-19 12:57:43 +02:00
oryx1729
3f58d4c13b
Fix SQLAlchemy relationship warnings ( #1289 )
2021-07-15 17:59:59 +02:00
Bob van Luijt
8dae844447
Bump Weaviate version to 1.5 ( #1287 )
* bump Weaviate version to 1.5
2021-07-15 08:26:22 +02:00
Ikram Ali
97c1e2cc90
[document_store] Raise warning when labels are overwritten ( #1257 )
* [document_store] SQLDocumentStore write_labels() overwrite warning added.
* [document_store] bug fixed. #1140
* [document_store] get_labels_by_id() method removed. #1140
* [document_store] Code refactor. fix #1140
* [document_store] elasticsearch document store Code refactor. fix #1140
* [document_store] Code refactor for better visibility. fix #1140
* [document_store] Inmemory document store duplicate labels warning added fix #1140
2021-07-14 16:21:04 +02:00
Branden Chan
da97d81305
Change variable names ( #1286 )
2021-07-14 14:03:34 +02:00
Branden Chan
7717e81ecc
Improve preprocessing logging ( #1263 )
* Improve preprocessing logging
* Change variable names
* Satisfy mypy
2021-07-14 14:03:13 +02:00
oryx1729
c318b5853b
Serialize crawler output to JSON ( #1284 )
2021-07-14 13:16:27 +02:00
Julian Risch
4e6f7f349d
Add FARMClassifier node for Document Classification ( #1265 )
* Add FARM classification node
* Add classification output to meta field of document
* Update usage example
* Add test case for FARMClassifier
* Replace FARMRanker with FARMClassifier in documentation strings
* Remove base method not implemented by any child class, etc.
2021-07-13 21:44:26 +02:00
Antonio De Marinis
f79d9bdca6
Upgrade streamlit and adjust height of result texts dynamically ( #1279 )
* update to latest streamlit and st-annotated-text
* improve ui results by passing dynamic height to annotated-text
2021-07-13 18:59:39 +02:00
threepointsomeone
2f93c2ddd5
Added explicit refresh call when refresh_type is false in update embedding ( #1259 )
Co-authored-by: vishwaspai <vishwas.pai@emplay.net>
2021-07-13 16:59:09 +02:00
Julian Risch
90f826e95e
Add links to tutorial 12 to readme ( #1274 )
2021-07-13 11:23:10 +02:00
Julian Risch
2a90471c73
Encapsulate tutorial code in method ( #1266 )
2021-07-09 17:08:19 +02:00
Julian Risch
dbb9efbd39
Add SentenceTransformersRanker with pre-trained Cross-Encoder ( #1209 )
* Add SentenceTransformersRanker with pre-trained Cross-Encoder
* Add test cases for Ranker nodes and update documentation
* update docstring
* Update docstring
* Update __init__.py
* update import for test
Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
2021-07-07 17:31:45 +02:00
Moshe Berchansky
495f98deba
Add global_loss_buffer_size to the DensePassageRetriever, in order to fix 'encoded data exceeds max_size' error with DDP. ( #1245 )
2021-07-06 13:56:41 +02:00
Ikram Ali
f5a8d3cf45
Add id in write_labels() for SQLDocumentStore ( #1253 )
2021-07-05 14:13:21 +02:00
Ikram Ali
8e117f5e11
ElasticsearchDocumentStore get_label_count() bug fixed. ( #1252 )
2021-07-03 20:51:33 +02:00
Ikram Ali
04a470f890
SQLDocumentStore get_label_count() bug fixed. ( #1251 )
2021-07-03 14:02:44 +02:00
Michaël Bitard
aaed22304d
Fix integer conversion of CONCURRENT_REQUEST_PER_WORKER ( #1247 )
2021-07-02 20:38:15 +02:00
Ikram Ali
29e140196b
[pipeline] Allow for batch indexing when using Pipelines fix #1168 ( #1231 )
* [pipeline] Allow for batch indexing when using Pipelines fix #1168
* [pipeline] Test case fixed fix #1168
* [file_converter] Path.suffix updated #1168
* [file_converter] meta can be one of three cases: a single dict that is applied to all files, one dict for each file being converted, or None #1168
* [file_converter] mypy error fixed.
* [rest_api] batch file upload introduced in indexing API.
* [test_case] Test_api file upload parameter name updated.
* [ui] Streamlit file upload parameter updated.
2021-06-30 14:13:46 +02:00
Malte Pietsch
5e23e72f31
Update issue templates
2021-06-30 12:12:07 +02:00
Guillim
73a4f9825a
Add env var CONCURRENT_REQUEST_PER_WORKER ( #1235 )
* Create an env var `CONCURRENT_REQUEST_PER_WORKER` following the existing naming convention (the original name was found a few commits back)
* default to 4
2021-06-29 07:44:25 +02:00
Malte Pietsch
2c964db62d
Relax typing for meta data in REST API ( #1224 )
2021-06-24 12:34:42 +02:00
Malte Pietsch
2caeea000e
Small UI and REST API fixes ( #1223 )
* small fixes
* change default question
2021-06-24 09:53:08 +02:00
Julian Risch
17dcb8c23e
Use Reader's device by default ( #1208 )
* Use Reader's device by default
* Replace get_device with initialize_device_settings
* Add import statements for init_device_settings
* Remove unused get_device method
2021-06-24 09:22:34 +02:00
Branden Chan
10e332dabb
Fix Links ( #1199 )
* Fix link highlight
* Regen md files
* Remove duplicate
* Fix whitespace
* fixing strings for website
* Fix link
Co-authored-by: PiffPaffM <markuspaff.mp@gmail.com>
2021-06-23 19:07:54 +02:00
Branden Chan
efc03f72db
Make PreProcessor.process() work on lists of documents ( #1163 )
* Add process_batch method
* Rename methods
* Fix doc string, satisfy mypy
* Fix mypy CI
* Fix typo
* Update tutorial
* Fix argument name
* Change arg name
* Incorporate reviewer feedback
2021-06-23 18:13:51 +02:00
oryx1729
afee4f36ce
Add scaffold for defining custom components for Pipelines ( #1205 )
2021-06-23 12:01:54 +02:00
vblagoje
02fc4c7783
Improve document stores unit test parametrization ( #1202 )
2021-06-22 16:08:23 +02:00
Markus Paff
a8f3601e6a
Pin docs for 0.9.0
2021-06-22 10:38:08 +02:00
Ikram Ali
d835a9cdc5
[setup] version tag added to Haystack fix #1175 ( #1216 )
2021-06-22 09:43:26 +02:00
Stefano
66049abff0
Add arg to support different languages in PreProcessor's sentence segmentation ( #1160 )
* Add PreProcessor optional language parameter.
* Add iso639 to nltk languages.
* Update docstring
Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
2021-06-21 18:53:19 +02:00
Julian Risch
9e4d7bf9be
Increase Haystack version to 0.9.0 ( #1215 )
v0.9.0
2021-06-21 18:39:00 +02:00
oryx1729
0168f04385
Remove unused function _get_pseudo_prob ( #1201 )
2021-06-17 10:28:48 +02:00
C V Goudar
f9c4083006
Bugfix setting of device by defaulting to "cpu" ( #1182 )
* Defaulting the device to cpu in case gpu is not available and use_gpu is set to True
Co-authored-by: C V Goudar <cv.goudar@emplay.et>
2021-06-16 10:26:29 +02:00
Markus Paff
6cd49105e7
update api markdown files and add markdown file for ranker ( #1198 )
* update api markdown files and add markdown file for ranker
* added docstrings for weaviate
* new version of pydoc-markdown does not render arguments correctly. We used pydoc-markdown==3.11.0
2021-06-15 17:50:08 +02:00
Julian Risch
215c45eb8a
Remove quickfix from reader and ranker ( #1196 )
* Remove quickfix from ranker
* remove quickfix from reader
* Use inferencer's model instead of reloaded model
2021-06-15 09:46:11 +02:00
Branden Chan
7dbd58f6be
Add about sections ( #1195 )
2021-06-14 18:37:00 +02:00
vblagoje
2a5882578a
Add Longform-QA (LFQA), Seq2SeqGenerator for generative QA and Retribert Retriever ( #1086 )
* Integrate LFQA with Haystack
* Integrate LFQA with Haystack - unit tests
* Properly initialize conftest default value for vector_dim
* Update PR after initial feedback
* Fix conftest.py import
* Seq2SeqGenerator uses Callables instead of subclasses for custom model input
* Update docstring
* Fix Callable use
* Add LFQA tutorials
* Improve type error reporting for invalid input converter Callable
* Generate docstrings
* Format comments in tutorial script
* Generate tutorial md
* Add usage page
Co-authored-by: Malte Pietsch <malte.pietsch@deepset.ai>
Co-authored-by: brandenchan <brandenchan@icloud.com>
2021-06-14 17:53:43 +02:00
venuraja
ae55927f58
Weaviate: Update Embeddings - Use update instead of replace ( #1181 )
* Update Embeddings logic improved
2021-06-14 17:50:55 +02:00
Shahrukh Khan
1a3b4b9c74
Fix typo in Query Classifier Exception Message ( #1190 )
2021-06-14 17:40:35 +02:00
Julian Risch
f6e70f0f3d
Removed single_model_path; added infer_tokenizer to dpr load() ( #1060 )
2021-06-14 14:14:46 +02:00
Julian Risch
1c31589b43
Bump to FARM 0.8.0, torch 1.8.1 and transformers 4.6.1 ( #1192 )
* bump to FARM 0.8.0, which in turn bumps torch 1.8.1 and transformers 4.6.1 ( #1192 )
* Replace deprecated force_bos_token_to_be_generated parameter
2021-06-14 13:00:41 +02:00
Bob van Luijt
f583d0bfaf
Minor change with a link to the Weaviate docs ( #1180 )
Super minor change, but in line with the other DocumentStores
2021-06-11 21:20:23 +02:00
Branden Chan
e7937ac5d7
Reformat FAQ page ( #1177 )
* Add faq page
* Update faq.md
* Fix mypy CI
* Add question
* Reformat faq
2021-06-11 11:59:52 +02:00
Branden Chan
783893c3d2
Tutorial update ( #1166 )
* Add header / footer
* Add Milvus example
* Generate md files
* Fix mypy CI
2021-06-11 11:09:15 +02:00
Branden Chan
13edff109d
Documentation update ( #1162 )
* Add content
* Add German BERT references
* Mention preprocessor language
* Fix mypy CI
* Add document length recommendation
* Add more languages
2021-06-11 11:06:57 +02:00