* Add page number to Documents coming from PDFConverters and PreProcessor
* Fix mypy
* Update API Docs
* Update API Docs
* Remove unused imports
* Generate JSON schema
* Generate JSON schema
* Make test variable shorter
* Make regex a separate function
* Move counting of page breaks to a function
* Generate JSON schema
* Apply suggestions from code review
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
* Update API Documentation
* Don't create instance for testing staticmethod
* Update haystack/nodes/preprocessor/preprocessor.py
Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
* move OpenSearchDocumentStore into its own Python module
* Update Documentation & Code Style
* mark test with (sigh) elasticsearch
* skip opensearch tests on windows
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Remove BasePipeline and make a module for RayPipeline
* Can load pipelines from yaml, plenty of issues left
* Extract graph validation logic into _add_node_to_pipeline_graph & refactor load_from_config and add_node to use it
* Fix pipeline tests
* Move some tests out of test_pipeline.py and create MockDenseRetriever
* mypy and pylint (silencing too-many-public-methods)
* Fix issue found in some yaml files and in schema files
* Fix paths to YAML and fix some typos in Ray
* Fix eval tests
* Simplify MockDenseRetriever
* Fix Ray test
* Accidentally pushed merge conflict, fixed
* Typo in schemas
* Typo in _json_schema.py
* Slightly reduce noisiness of version validation warnings
* Fix version logs tests
* Fix version logs tests again
* remove seemingly unused file
* Add check and test to avoid adding the same node to the pipeline twice
* Update Documentation & Code Style
* Revert config to pipeline_config
* Remove unused import
* Complete reverting to pipeline_config
* Some more stray config=
* Update Documentation & Code Style
* Feedback
* Move back other_nodes tests into pipeline tests temporarily
* Update Documentation & Code Style
* Fixing tests
* Update Documentation & Code Style
* Fixing ray and standard pipeline tests
* Rename colliding load() methods in dense retrievers and faiss
* Update Documentation & Code Style
* Fix mypy on ray.py as well
* Add check for no root node
* Fix tests to use load_from_directory and load_index
* Try to work around the disabled add_node of RayPipeline
* Update Documentation & Code Style
* Fix Ray test
* Fix FAISS tests
* Relax class check in _add_node_to_pipeline_graph
* Update Documentation & Code Style
* Try to fix mypy in ray.py
* unused import
* Try another fix for Ray
* Fix connector tests
* Update Documentation & Code Style
* Fix ray
* Update Documentation & Code Style
* use BaseComponent.load() in pipelines/base.py
* another round of feedback
* stray BaseComponent.load()
* Update Documentation & Code Style
* Fix FAISS tests too
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: tstadel <60758086+tstadel@users.noreply.github.com>
* added core install and functionality of pinecone doc store (init, upsert, query, delete)
* implemented core functionality of Pinecone doc store
* Update Documentation & Code Style
* updated filtering to use Haystack filtering and reduced default batch_size
* Update Documentation & Code Style
* removed debugging code
* updated Pinecone filtering to use filter_utils
* removed unneeded methods and minor tweaks to current methods
* fixed typing issues
* Update Documentation & Code Style
* Allow filters in all methods except get_embedding_count
* Fix skipping document store tests
* Update Documentation & Code Style
* Fix handling of Milvus1 and Milvus2 in tests
* Update Documentation & Code Style
* Fix handling of Milvus1 and Milvus2 in tests
* Update Documentation & Code Style
* Remove SQL from tests requiring embeddings
* Update Documentation & Code Style
* Fix get_embedding_count of Milvus2
* Make sure to start Milvus2 tests with a new collection
* Add pinecone to test suite
* Update Documentation & Code Style
* Fix typing
* Update Documentation & Code Style
* Add pinecone to docstores dependency
* Add PineconeDocStore to API Documentation
* Add missing comma
* Update Documentation & Code Style
* Adapt format of doc strings
* Update Documentation & Code Style
* Set API key as environment variable
* Skip Pinecone tests in forks
* Add sleep after deleting index
* Add sleep after deleting index
* Add sleep after creating index
* Add check if index ready
* Remove printing of index stats
* Create new index for each pinecone test
* Use RestAPI instead of Python API for describe_index_stats
* Fix accessing describe_index_stats
* Remove usages of describe_index_stats
* Run pinecone tests separately
* Update Documentation & Code Style
* Add pdftotext to pinecone tests
* Remove sleep from doc store fixture
* Add describe_index_stats
* Remove unused imports
* Use pull_request_target trigger
* Revert use pull_request_target trigger
* Remove set_config
* Add os to conftest
* Integrate review comments
* Set include_values to False
* Remove quotation marks from pinecone.Index type
* Update Documentation & Code Style
* Update Documentation & Code Style
* Fix number of args in error messages
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: bogdankostic <bogdankostic@web.de>
* Bring back init defs to api in v1.2
* Update Documentation & Code Style
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* update remaining occurrences of get_connection
* fix milvus2 import and fix wrong extra references
* change MilvusDocumentStore to Milvus1DocumentStore
* update milvus docstrings to reflect updated dependency management
* enable milvus 2 tests
* fix milvus2 env variable processing
* fix dropping collections for each milvus 2 test
* make Milvus 2 doc store tests work
* allow user to specify consistency level
* First attempt at running Milvus2 in the CI
* Install the correct pymilvus
* add batch deletion for milvus2
* change default from milvus 1 to milvus 2
* make milvus2 the default in the docstores extra
* Switch milvus1 and milvus2 in base test run on CI
* Rename docstore flags for pytest: 'milvus'->'milvus1', 'milvus2'->'milvus'
* Rename milvus.py->milvus1.py and milvus2x.py->milvus2.py
* Enable autogenerated docs for Milvus1 and 2 separately
* Partial fix to docstring of Milvus2DocumentStore
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Michel Bartels <kontakt@michelbartels.com>
Co-authored-by: Sara Zan <sara.zanzottera@deepset.ai>
* Upgrade pydoc-markdown and fix the YAMLs to work with it
* Pin pydoc-markdown to major version
* Generalize pydoc-markdown workflow
* Make a single Action to perform all tasks that require committing into the local branch
* Merge the code updates and the docs in the Linux CI to prevent the bot from always showing the pipeline as green
* Installing Jupyter deps for Black
* Build cache before running generation tasks
* Add check not to run the code generation on master
* Simplify push action
* Add more test deps in setup.cfg and remove from GH Action workflow
* Remove forced upgrades on pip install
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>