* Remove BasePipeline and make a module for RayPipeline
* Can load pipelines from yaml, plenty of issues left
* Extract graph validation logic into _add_node_to_pipeline_graph & refactor load_from_config and add_node to use it
* Fix pipeline tests
* Move some tests out of test_pipeline.py and create MockDenseRetriever
* mypy and pylint (silencing too-many-public-methods)
* Fix issue found in some yaml files and in schema files
* Fix paths to YAML and fix some typos in Ray
* Fix eval tests
* Simplify MockDenseRetriever
* Fix Ray test
* Accidentally pushed merge conflict, fixed
* Typo in schemas
* Typo in _json_schema.py
* Slightly reduce noisiness of version validation warnings
* Fix version logs tests
* Fix version logs tests again
* remove seemingly unused file
* Add check and test to avoid adding the same node to the pipeline twice
* Update Documentation & Code Style
* Revert config to pipeline_config
* Remove unused import
* Complete reverting to pipeline_config
* Some more stray config=
* Update Documentation & Code Style
* Feedback
* Move back other_nodes tests into pipeline tests temporarily
* Update Documentation & Code Style
* Fixing tests
* Update Documentation & Code Style
* Fixing ray and standard pipeline tests
* Rename colliding load() methods in dense retrievers and faiss
* Update Documentation & Code Style
* Fix mypy on ray.py as well
* Add check for no root node
* Fix tests to use load_from_directory and load_index
* Try to work around the disabled add_node of RayPipeline
* Update Documentation & Code Style
* Fix Ray test
* Fix FAISS tests
* Relax class check in _add_node_to_pipeline_graph
* Update Documentation & Code Style
* Try to fix mypy in ray.py
* unused import
* Try another fix for Ray
* Fix connector tests
* Update Documentation & Code Style
* Fix ray
* Update Documentation & Code Style
* use BaseComponent.load() in pipelines/base.py
* another round of feedback
* stray BaseComponent.load()
* Update Documentation & Code Style
* Fix FAISS tests too
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: tstadel <60758086+tstadel@users.noreply.github.com>
* changing the name of the retrievers from es_retriever to retriever
* Update Documentation & Code Style
* name fix 2
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Add windows specific package for python-magic
* Disable some tests on Windows and add explanatory warning in case of issues with libmagic
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Change exception into warning, add strict_version param, and remove compatibility between schemas
* Simplify update_json_schema
* Rename unstable into master
* Prevent validate_config from changing the config to validate
* Fix version validation and add tests
* Rename master into ignore
* Complete parameter rename
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Upgrade pdftotext also on pinecone and milvus1 jobs
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Add failing test
* Remove `**kwargs` from docstores' `__init__` functions (#2407)
* Remove kwargs from ESDocStore subclasses
* Remove kwargs from subclasses of SQLDocumentStore
* Remove kwargs from Weaviate
* Revert change in pinecone
* Fix tests
* Fix retriever test with weaviate
* Change Exception into DocumentStoreError
* Update Documentation & Code Style
* Remove `**kwargs` from `FARMReader` (#2413)
* Remove FARMReader kwargs without trying to replace them functionally
* Update Documentation & Code Style
* enforce same index values before and after saving/loading eval dataframes (#2398)
* Add tests for missing `__init__` and `super().__init__()` in custom nodes (#2350)
* Add tests for missing init and super
* Update Documentation & Code Style
* replace `in` with `endswith`
* Move test in pipeline.py and change test in pipeline_yaml.py
* Update Documentation & Code Style
* Use caplog to test the warning
* Update Documentation & Code Style
* move tests into test_pipeline and use get_config
* Update Documentation & Code Style
* Unmock version name
* Improve variadic args test
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* remove duplicate imports
* fix ungrouped-imports
* Fix wrong-import-position
* Fix unused-import
* pyproject.toml
* Working on wrong-import-order
* Solve wrong-import-order
* fix Pool import
* Move open_search_index_to_document_store and elasticsearch_index_to_document_store in elasticsearch.py
* remove Converter from modeling
* Fix mypy issues on adaptive_model.py
* create es_converter.py
* remove converter import
* change import path in tests
* Restructure REST API to not rely on global vars from search.py and improve tests
* Fix openapi generator
* Move variable initialization
* Change type of FilterRequest.filters
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Delete files in _src
* Filter unused images and re-add images that were in use in docs/img
* Remove all usages of user-images.githubusercontent.com
Co-authored-by: ZanSara <sarazanzo94@gmail.com>
* Make initialize_device_settings take a devices list, and change signature of FARMReader
* reintroduce use_gpu and propagate devices to other methods
* fix typing for initialize_device_settings
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Fix 'bug' on Weaviate only returning max. 100 docs on get_all_documents
* Add type
* Update Weaviate version on the CI
* Fix bug on get_document_count where there are no documents
* Add more info in the docstrings of get_all_documents and get_all_documents_generator
* Add latest docstring and tutorial changes
* Apply Black
* Update Documentation & Code Style
* Trigger pipeline
* Update Documentation & Code Style
* Include StefanBogdan feedback
* Fix mypy issues and LogicalFilterClause
* Add more types
* Update Documentation & Code Style
* update setup.cfg
* Upgrade weaviate containers too
* Allow filtering on the content field in Weaviate
* Use convert_to_weaviate instead of convert_to_pinecone
* Fix _get_all_documents_in_index
* Update docstrings and docs
* Catching an exception in get_document(s)_by_id
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: bogdankostic <bogdankostic@web.de>
* EvaluationSetClient for deepset cloud to fetch evaluation sets and labels for one specific evaluation set
* make DeepsetCloudDocumentStore able to fetch uploaded evaluation set names
* fix missing renaming of get_evaluation_set_names in DeepsetCloudDocumentStore
* update documentation for evaluation set functionality in deepset cloud document store
* DeepsetCloudDocumentStore tests for evaluation set functionality
* rename index to evaluation_set_name for DeepsetCloudDocumentStore evaluation set functionality
* raise DeepsetCloudError when no labels were found for evaluation set
* make use of .get_with_auto_paging in EvaluationSetClient
* Return result of get_with_auto_paging() as it parses the response already
* Make schema import source more specific
* fetch all evaluation sets for a workspace in deepset Cloud
* Rename evaluation_set_name to label_index
* make use of generator functionality for fetching labels
* Update Documentation & Code Style
* Adjust function input for DeepsetCloudDocumentStore.get_all_labels, adjust tests for it, fix typos, make linter happy
* Match error message with pytest.raises
* Update Documentation & Code Style
* DeepsetCloudDocumentStore.get_labels_count raises DeepsetCloudError when no evaluation set was found to count labels on
* remove unneeded import in tests
* DeepsetCloudDocumentStore tests, make response bodies a string through json.dumps
* DeepsetCloudDocumentStore.get_label_count - move raise to return
* stringify uuid before json.dump as uuid is not serializable
* DeepsetCloudDocumentStore - adjust response mocking in tests
* DeepsetCloudDocumentStore - json dump response body in test
* DeepsetCloudDocumentStore introduce label_index, EvaluationSetClient rename label_index to evaluation_set
* Update Documentation & Code Style
* DeepsetCloudDocumentStore rename evaluation_set to evaluation_set_response as there is a name clash with the input variable
* DeepsetCloudDocumentStore - rename missed variable in test
* DeepsetCloudDocumentStore - rename missed label_index to index in doc string, rename label_index to evaluation_set in EvaluationSetClient
* Update Documentation & Code Style
* DeepsetCloudDocumentStore - update docstrings for EvaluationSetClient
* DeepsetCloudDocumentStore - fix typo in doc string
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* added core install and functionality of pinecone doc store (init, upsert, query, delete)
* implemented core functionality of Pinecone doc store
* Update Documentation & Code Style
* updated filtering to use Haystack filtering and reduced default batch_size
* Update Documentation & Code Style
* removed debugging code
* updated Pinecone filtering to use filter_utils
* removed unneeded methods and minor tweaks to current methods
* fixed typing issues
* Update Documentation & Code Style
* Allow filters in all methods except get_embedding_count
* Fix skipping document store tests
* Update Documentation & Code Style
* Fix handling of Milvus1 and Milvus2 in tests
* Update Documentation & Code Style
* Fix handling of Milvus1 and Milvus2 in tests
* Update Documentation & Code Style
* Remove SQL from tests requiring embeddings
* Update Documentation & Code Style
* Fix get_embedding_count of Milvus2
* Make sure to start Milvus2 tests with a new collection
* Add pinecone to test suite
* Update Documentation & Code Style
* Fix typing
* Update Documentation & Code Style
* Add pinecone to docstores dependency
* Add PineconeDocStore to API Documentation
* Add missing comma
* Update Documentation & Code Style
* Adapt format of doc strings
* Update Documentation & Code Style
* Set API key as environment variable
* Skip Pinecone tests in forks
* Add sleep after deleting index
* Add sleep after deleting index
* Add sleep after creating index
* Add check if index ready
* Remove printing of index stats
* Create new index for each pinecone test
* Use RestAPI instead of Python API for describe_index_stats
* Fix accessing describe_index_stats
* Remove usages of describe_index_stats
* Run pinecone tests separately
* Update Documentation & Code Style
* Add pdftotext to pinecone tests
* Remove sleep from doc store fixture
* Add describe_index_stats
* Remove unused imports
* Use pull_request_target trigger
* Revert use pull_request_target trigger
* Remove set_config
* Add os to conftest
* Integrate review comments
* Set include_values to False
* Remove quotation marks from pinecone.Index type
* Update Documentation & Code Style
* Update Documentation & Code Style
* Fix number of args in error messages
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: bogdankostic <bogdankostic@web.de>
* Similar test case seems to pass
* Update Documentation & Code Style
* Improve error message
* Slightly clarify info message
* Fix mismatch between node and node_class in the schema generation
* Remove condition that node class names cannot begin with Base and update tests
* Indentation
* Update Documentation & Code Style
* feedback
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* add basic telemetry features
* change pipeline_config to _component_config
* Update Documentation & Code Style
* add super().__init__() calls to error classes
* make posthog mock work with python 3.7
* Update Documentation & Code Style
* update link to docs web page
* log exceptions, send event for raised HaystackErrors, refactor Path(CONFIG_PATH)
* add comment on send_event in BaseComponent.init() and fix mypy
* mock NonPrivateParameters and fix pylint undefined-variable
* Update Documentation & Code Style
* check model path contains multiple /
* add test for writing to file
* add test for en-/disable telemetry
* Update Documentation & Code Style
* merge file deletion methods and ignore pylint global statement
* Update Documentation & Code Style
* set env variable in demo to activate telemetry
* fix mock of HAYSTACK_TELEMETRY_ENABLED
* fix mypy and linter
* add CI as env variable to execution contexts
* remove threading, add test for custom error event
* Update Documentation & Code Style
* simplify config/log file deletion
* add test for final event being sent
* force writing config file in test
* make test compatible with python 3.7
* switch to posthog production server
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* 'os' wrapper to function for brownfield support
* Changing function names and fixing default parameter values
* Including parameter keys
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Add BasePipeline.validate_config, BasePipeline.validate_yaml, and some new custom exception classes
* Make error composition work properly
* Clarify typing
* Help mypy a bit more
* Update Documentation & Code Style
* Enable autogenerated docs for Milvus1 and 2 separately
* Revert "Enable autogenerated docs for Milvus1 and 2 separately"
This reverts commit 282be4a78a6e95862a9b4c924fc3dea5ca71e28d.
* Update Documentation & Code Style
* Re-enable 'additionalProperties: False'
* Add pipeline.type to JSON Schema, was somehow forgotten
* Disable additionalProperties on the pipeline properties too
* Fix json-schemas for 1.1.0 and 1.2.0 (should not do it again in the future)
* Call super in PipelineValidationError
* Improve _read_pipeline_config_from_yaml's error handling
* Fix generate_json_schema.py to include document stores
* Fix json schemas (retro-fix 1.1.0 again)
* Improve custom errors printing, add link to docs
* Add function in BaseComponent to list its subclasses in a module
* Make some document stores base classes abstract
* Add marker 'integration' in pytest flags
* Slightly improve validation of pipelines at load
* Adding tests for YAML loading and validation
* Make custom_query Optional for validation issues
* Fix bug in _read_pipeline_config_from_yaml
* Improve error handling in BasePipeline and Pipeline and add DAG check
* Move json schema generation into haystack/nodes/_json_schema.py (useful for tests)
* Simplify errors slightly
* Add some YAML validation tests
* Remove load_from_config from BasePipeline, it was never used anyway
* Improve tests
* Include json-schemas in package
* Fix conftest imports
* Make BasePipeline abstract
* Improve mocking by making the test independent from the YAML version
* Add exportable_to_yaml decorator to forget about set_config on mock nodes
* Fix mypy errors
* Comment out one monkeypatch
* Fix typing again
* Improve error message for validation
* Add required properties to pipelines
* Fix YAML version for REST API YAMLs to 1.2.0
* Fix load_from_yaml call in load_from_deepset_cloud
* fix HaystackError.__getattr__
* Add super().__init__() in most nodes and docstore, comment set_config
* Remove type from REST API pipelines
* Remove useless init from doc2answers
* Call super in Seq2SeqGenerator
* Typo in deepsetcloud.py
* Fix rest api indexing error mismatch and mock version of JSON schema in all tests
* Working on pipeline tests
* Improve errors printing slightly
* Add back test_pipeline.yaml
* _json_schema.py supports different versions with identical schemas
* Add type to 0.7 schema for backwards compatibility
* Fix small bug in _json_schema.py
* Try alternative to generate json schemas on the CI
* Update Documentation & Code Style
* Make linux CI match autoformat CI
* Fix super-init-not-called
* Accidentally committed file
* Update Documentation & Code Style
* fix test_summarizer_translation.py's import
* Mock YAML in a few suites, split and simplify test_pipeline_debug_and_validation.py::test_invalid_run_args
* Fix json schema for ray tests too
* Update Documentation & Code Style
* Reintroduce validation
* Use unstable version in tests and rest api
* Make unstable support the latest versions
* Update Documentation & Code Style
* Remove needless fixture
* Make type in pipeline optional in the strings validation
* Fix schemas
* Fix string validation for pipeline type
* Improve validate_config_strings
* Remove type from test pipelines
* Update Documentation & Code Style
* Fix test_pipeline
* Removing more type from pipelines
* Temporary CI patch
* Fix issue with exportable_to_yaml never invoking the wrapped init
* rm stray file
* pipeline tests are green again
* Linux CI now needs .[all] to generate the schema
* Bugfixes, pipeline tests seem to be green
* Typo in version after merge
* Implement missing methods in Weaviate
* Trying to prevent FAISS tests from running in the Milvus1 test suite
* Fix some stray test paths and faiss index dumping
* Fix pytest markers list
* Temporarily disable cache to be able to see tests failures
* Fix pyproject.toml syntax
* Use only tmp_path
* Fix preprocessor signature after merge
* Fix faiss bug
* Fix Ray test
* Fix documentation issue by removing quotes from faiss type
* Update Documentation & Code Style
* use document properly in preprocessor tests
* Update Documentation & Code Style
* make preprocessor capable of handling documents
* import document
* Revert support for documents in preprocessor, do later
* Fix bug in _json_schema.py that was breaking validation
* re-enable cache
* Update Documentation & Code Style
* Simplify calling _json_schema.py from the CI
* Remove redundant ABC inheritance
* Ensure exportable_to_yaml works only on implementations
* Rename subclass to class_ in Meta
* Make run() and get_config() abstract in BasePipeline
* Revert unintended change in preprocessor
* Move outgoing_edges_input_node check inside try block
* Rename VALID_CODE_GEN_INPUT_REGEX into VALID_INPUT_REGEX
* Add check for a RecursionError on validate_config_strings
* Address usages of _pipeline_config in data silo and elasticsearch
* Rename _pipeline_config into _init_parameters
* Fix pytest marker and remove unused imports
* Remove most redundant ABCs
* Rename _init_parameters into _component_configuration
* Remove set_config and type from _component_configuration's dict
* Remove last instances of set_config and replace with super().__init__()
* Implement __init_subclass__ approach
* Simplify checks on the existence of _component_configuration
* Fix faiss issue
* Dynamic generation of node schemas & weed out old schemas
* Add debatable test
* Add docstring to debatable test
* Positive diff between schemas implemented
* Improve diff printing
* Rename REST API YAML files to trigger IDE validation
* Fix typing issues
* Fix more typing
* Typo in YAML filename
* Remove needless type:ignore
* Add tests
* Fix tests & validation feedback for accessory classes in custom nodes
* Refactor RAGeneratorType out
* Fix broken import in conftest
* Improve source error handling
* Remove unused import in test_eval.py breaking tests
* Fix changed error message in tests matches too
* Normalize generate_openapi_specs.py and generate_json_schema.py in the actions
* Fix path to generate_openapi_specs.py in autoformat.yml
* Update Documentation & Code Style
* Add test for FAISSDocumentStore-like situations (superclass with init params)
* Update Documentation & Code Style
* Fix indentation
* Remove commented set_config
* Store model_name_or_path in FARMReader to use in DistillationDataSilo
* Rename _component_configuration into _component_config
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>