* Started making changes to use native PyTorch AMP
* Updated compute_loss functions to use torch.cuda.amp.autocast
* Updating docstrings
* Add use_amp to trainer_checkpoint
* Removed mentions of apex and started to add the necessary warnings
* Removing unused instances of use_amp variable
* Added fast training test for FARMReader. Needed to add max_query_length as a parameter in FARMReader.__init__ and FARMReader.train
* Make max_query_length optional in FARMReader.train
* Update lg
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
Co-authored-by: agnieszka-m <amarzec13@gmail.com>
* Adjust max token size for openai ADA-v2 embeddings
* Added requested changes and corrected old seq len
Apparently the limit for the older models is 2046, not 2048, so I included this change directly.
See (https://beta.openai.com/docs/guides/embeddings/what-are-embeddings) to check.
If you set the IMAGE_NAME variable, then the base image will use that name,
but the api image would previously use a hardcoded `deepset/haystack` image name.
* feat: Change `docker-compose.yml` file
* Add `volumes` to read from the local `/pipelines` folder
* Change the `PIPELINE_YAML_PATH` value and refer to the local `pipelines.haystack-pipeline.yml`
* Change the elasticsearch image
* Fix volume
* Update readme to direct users to the new demos repository
* Update pytorch base image
* Small corrections
* Revert back to load_schema() call
* reverted to import haystack for schema generation
Co-authored-by: Mayank Jobanputra <mayankjobanputra@gmail.com>
* bug: fix the docs rest api reference url
* revert openapi json changes
* remove last line on json files
* Add explanation about `servers` and remove `servers` parameter from FastAPI
* generate openapi schema without empty end line
* build pdftotext from sources
* trigger the build on my own PR - to be reverted
* trigger the build on my own PR - to be reverted
* Update docker_release.yml
* ci: add license compliance check
* ci: run check always for testing purposes
* revamp workflows
* temporarily remove path directive
* triggering ci
* check rest api and ui too
* avoid cache to make sure env is clean
* add shield on readme
* ci: trigger CI to get latest scan
Co-authored-by: ZanSara <sarazanzo94@gmail.com>
Co-authored-by: Sara Zan <sara.zanzottera@deepset.ai>
* Refactor table reader to use util functions to reduce code duplication.
* Expanding the tests for the table reader
* Adding types
* Updating tests to work for RCIReader
* Fix bug in RCIReader. Saving the wrong queries list.
* Update _flatten_inputs to not change input variable
* Remove duplicate code
* Fixing broken BM25 support with Weaviate - fixes #3720
BM25 support with Weaviate has been broken since Haystack v1.11.0; this commit fixes it.
See issue #3720 for details.
* Fixing mypy issue - method signature wasn't matching the base class
* Mypy related test fix
Mypy forced me to set the signature of the `query` method of the Weaviate document store to the same as its parent, the `KeywordDocumentStore`, where the `query` param is `Optional` but has NO default value, so it must be provided (as None) at runtime.
I am not sure why the abstract method's `query` param was declared without a default value while its type is `Optional`. I didn't want to change that, so I changed the Weaviate tests instead.
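For illustration, a minimal sketch of why an `Optional` parameter without a default still has to be passed explicitly. The class names `KeywordStoreSketch` and `WeaviateStoreSketch`, the `top_k` parameter, and the return strings are hypothetical stand-ins, not the actual Haystack code:

```python
from typing import Optional


class KeywordStoreSketch:
    """Hypothetical stand-in for the base class; not Haystack's actual code."""

    def query(self, query: Optional[str], top_k: int = 10) -> str:
        # `Optional[str]` only widens the accepted type to include None.
        # It does not give the parameter a default value.
        raise NotImplementedError


class WeaviateStoreSketch(KeywordStoreSketch):
    """Hypothetical subclass whose override matches the parent's signature."""

    def query(self, query: Optional[str], top_k: int = 10) -> str:
        if query is None:
            return f"no query text, returning up to {top_k} documents"
        return f"searching for {query!r}, top_k={top_k}"


store = WeaviateStoreSketch()

# None is a valid value, but it must be passed explicitly.
print(store.query(None))
print(store.query("bm25", top_k=3))

# Omitting the argument fails at runtime because there is no default.
try:
    store.query()  # type: ignore[call-arg]
except TypeError as err:
    print(f"TypeError: {err}")
```

This is why a test that calls `query()` with no arguments has to change to `query(None)` once the override follows the parent's signature.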
* Adding a note regarding an upcoming fix in Weaviate v1.17.0
* Apply suggestions from code review
* revert
* [EMPTY] Re-trigger CI
* first draft to add index param to tfidf
* better mypy handling
* Revert "better mypy handling"
This reverts commit 91a22516320f9dcbeae53827ec69f9dc51e1785c.
* new check in auto_fit
* new check also in retrieve
* better dict typings
* new test and improvements to other test
* remove unnecessary lambda
* improve test
* remove newline from openapi json
* fix test
* language fix
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
* language fix 2
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
* language fix 3
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
* language fix 4
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
* language fix 5
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
* language fix 6
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
* explicit index value handling
* fix test
* better error messages
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
* Fixing the `query_batch` method of the deepsetcloud document store - fixes #3722
* Trigger Build
* Trigger Build
* Trigger CI
Co-authored-by: Thomas Stadelmann <thomas.stadelmann@deepset.ai>
* first try and new test
* fix test
* fix unused import
* remove comments
* no more dataclass
* add __eq__ and extend test
* better design from review
* Update schema.py
* fix black
* fix openapi
* fix openapi 2
* new try to fix openapi
* remove newline from openapi json