* Change the name under which a crawled page is saved to avoid long file name errors on some file systems
* Custom naming function for saving crawled files
* Update Documentation & Code Style
* Remove bad characters in file name and prefix
* Add test for naming function
* Update Documentation & Code Style
* Fix expensive regex recalculation and linter warnings
* Check for exceptions on file dump
* Remove param_naming variable
* Fix file paths on Windows, Linux and Mac
* Update Documentation & Code Style
* Test using one of the docstrings examples
* Change default naming function
Update docstrings
* Applying formatting rules
* Update Documentation & Code Style
* Fix mypy incompatible assignment error
* Remove unused type declaration
* Fix typo
* Update tests for naming function
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* first version of save_to_remote for HF from FarmReader
* Update Documentation & Code Style
* Changes based on comments
* Update Documentation & Code Style
* imports order
* making small changes to pydoc
* indent fix
* Update Documentation & Code Style
* keyword arguments instead of positional
* Changing to repo_id
huggingface-hub package would have to be v0.5 or higher - checking how to handle with Thomas
* Update Documentation & Code Style
* adding huggingface-hub dependency 0.5 or above
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sara Zan <sarazanzo94@gmail.com>
* extract common code for ES and OS into a base class
* Update Documentation & Code Style
* give the base class a more obvious name
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Add new audio answer primitives
* Add AnswerToSpeech
* Add dependency group
* Update Documentation & Code Style
* Extract TextToSpeech in a helper class, create DocumentToSpeech and primitives
* Add tests
* Update Documentation & Code Style
* Add ability to compress audio and more tests
* Add audio group to test, all and all-gpu
* fix pylint
* Update Documentation & Code Style
* Accidental git tag
* Try pleasing mypy
* Update Documentation & Code Style
* fix pylint
* Add warning for missing OS library and support in CI
* Try fixing mypy
* Update Documentation & Code Style
* Add docs, simplify args for audio nodes and add tutorials
* Fix mypy
* Fix run_batch
* Feedback on tutorials
* fix mypy and pylint
* Fix mypy again
* Fix mypy yet again
* Fix the ci
* Fix dicts merge and install ffmpeg on CI
* Make the audio nodes import safe
* Trying to increase tolerance in audio test
* Fix import paths
* fix linter
* Update Documentation & Code Style
* Add audio libs in unit tests
* Update _text_to_speech.py
* Update answer_to_speech.py
* Use dedicated dataset & update telemetry
* Remove and use distilled roberta
* Revert special primitives so that the nodes run in indexing
* Improve tutorials and fix smaller bugs
* Update Documentation & Code Style
* Fix serialization issue
* Update Documentation & Code Style
* Improve tutorial
* Update Documentation & Code Style
* Update _text_to_speech.py
* Minor lg updates
* Minor lg updates to tutorial
* Making indexing work in tutorials
* Update Documentation & Code Style
* Improve docstrings
* Try to use GPU when available
* Update Documentation & Code Style
* Fix mypy and pylint
* Try to pass the device correctly
* Update Documentation & Code Style
* Use type of device
* use .cpu()
* Improve .ipynb
* update apt index to be able to download libsndfile1
* Fix SpeechDocument.from_dict()
* Change pip URL
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
* make crawler also extract hidden text
* Update Documentation & Code Style
* try to adapt test for extract_hidden_text
* Update Documentation & Code Style
* fix test bug
* fix bug in test
* added test for hidden text
* Update Documentation & Code Style
* fix bug in test
* Update Documentation & Code Style
* fix test
* Update Documentation & Code Style
* fix other test bug
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* move OpenSearchDocumentStore into its own Python module
* Update Documentation & Code Style
* mark test with (sigh) elasticsearch
* skip opensearch tests on windows
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* include meta data when calculating embeddings in EmbeddingRetriever
* Update Documentation & Code Style
* fix None meta field
* remove default values
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* add documentation regarding the score of JoinDocuments when using concatenation
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Update version to 1.4.1rc0
* Elasticsearch is not an optional dependency
* Fix import path
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* fix small typo in Document doc string
Was going through the tutorial, then digging through the code and just noticed a small typo
* generate markdown file changes from docstrings
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
* Remove BasePipeline and make a module for RayPipeline
* Can load pipelines from yaml, plenty of issues left
* Extract graph validation logic into _add_node_to_pipeline_graph & refactor load_from_config and add_node to use it
* Fix pipeline tests
* Move some tests out of test_pipeline.py and create MockDenseRetriever
* mypy and pylint (silencing too-many-public-methods)
* Fix issue found in some yaml files and in schema files
* Fix paths to YAML and fix some typos in Ray
* Fix eval tests
* Simplify MockDenseRetriever
* Fix Ray test
* Accidentally pushed merge conflict, fixed
* Typo in schemas
* Typo in _json_schema.py
* Slightly reduce noisiness of version validation warnings
* Fix version logs tests
* Fix version logs tests again
* remove seemingly unused file
* Add check and test to avoid adding the same node to the pipeline twice
* Update Documentation & Code Style
* Revert config to pipeline_config
* Remove unused import
* Complete reverting to pipeline_config
* Some more stray config=
* Update Documentation & Code Style
* Feedback
* Move back other_nodes tests into pipeline tests temporarily
* Update Documentation & Code Style
* Fixing tests
* Update Documentation & Code Style
* Fixing ray and standard pipeline tests
* Rename colliding load() methods in dense retrievers and faiss
* Update Documentation & Code Style
* Fix mypy on ray.py as well
* Add check for no root node
* Fix tests to use load_from_directory and load_index
* Try to workaround the disabled add_node of RayPipeline
* Update Documentation & Code Style
* Fix Ray test
* Fix FAISS tests
* Relax class check in _add_node_to_pipeline_graph
* Update Documentation & Code Style
* Try to fix mypy in ray.py
* unused import
* Try another fix for Ray
* Fix connector tests
* Update Documentation & Code Style
* Fix ray
* Update Documentation & Code Style
* use BaseComponent.load() in pipelines/base.py
* another round of feedback
* stray BaseComponent.load()
* Update Documentation & Code Style
* Fix FAISS tests too
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: tstadel <60758086+tstadel@users.noreply.github.com>