* Adding support for table Documents when serializing Labels in Haystack
* Fix table label equality test
* Add serialization support and __eq__ support for table answers
* Made convenience functions for converting DataFrames. Added some TODOs. Expanded schema tests for table labels. Updated MultiLabel to not convert DataFrames into strings.
* get Answer and Label to_json working with DataFrame
* Fix from_dict method of Label
* Use Dict and remove unnecessary if check
* Using pydantic instead of builtins for type detection
* Update haystack/schema.py
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* Update haystack/schema.py
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* Update haystack/schema.py
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* Separated table label equivalency tests and added pytest.mark.unit
* Added unit test for _dict_factory
* Using more descriptive variable names
* Adding json files to test to_json and from_json functions
* Added sample files for tests
---------
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* fixed test base for hub 0.13.3
* check if tests succeed from branch
* 2nd check if tests succeed from branch
* removed dependency changes
---------
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
* Upgrade to transformers 4.28.1
* Commenting out failing piece of test
* trailing-whitespace
* Adjust regex for error match - it changed between releases
* Remove RAG tests failing with transformers update
* fix recursion of death when deserializing PromptTemplate
* add test
* set api_key
* fix test
* add generic test
* work in feedback on tests
---------
Co-authored-by: bogdankostic <bogdankostic@web.de>
* bug: fix loading of local HF models in PromptNode pipeline
* Update hugging_face.py
remove duplicate validator
* update: black formatted
* update: update doc string, replace pop with get
* test HFLocalInvocationLayer with local model
* extract elasticsearch
* update pyproject.toml
* make more imports optional
* move MockBaseRetriever into conftest
* install es in the es integration tests
* Start adding support for TableCell
* Update tests to use row and col
* Added schema test to check that to_dict and from_dict work for Table documents. Also updated Doc.__eq__ to work for tables.
* Update eval test to use TableCell
* Added more schema tests for table docs, labels and answers.
* Add boolean to toggle between Span and TableCell
* Add deprecation message
* Test that table answers work as responses in the rest API
---------
Co-authored-by: agnieszka-m <amarzec13@gmail.com>
* clean up the ES instance in a more robust way
* do not sleep, refresh the index instead
* remove client warnings
* fix unit tests
* fix opensearch compatibility
* fix unit tests
* update ES version
* bump elasticsearch-py
* adjust docs
* use recreate_index param
* use same fixture strategy for Opensearch
* Update lg
---------
Co-authored-by: agnieszka-m <amarzec13@gmail.com>
* Added warning messages for documents that are skipped by RouteDocuments. Began adding support for the new option return_remaining and list-of-lists support for metadata value splitting.
* Simplify _split_by_content_type
* Added new unit test and updated _calculate_outgoing_edges
* Added some TODOs and turned assert into raising an error.
* Update logging messages and make new fixture in tests
* Update _split_by_metadata_values to work with return_remaining
* Remove unneeded code
* Documentation
* Add proper support for list of lists
* Fix mypy errors
* Added assert to make mypy happy
* Update haystack/nodes/other/route_documents.py
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* PR comments
* Remove check for logging level
* make mypy happy
* Update docstring of metadata_values
* Removed duplicate check. Make explicit check for metadata_values
---------
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* fix: add list element handling and its mapping logic to the ParsrConverter convert step, plus a unit test covering the mapping of list content from Parsr's format to Haystack's
* Code review changes
* changed the samples path after conftest changes
* added samples_path to function arg
---------
Co-authored-by: Namoush <fmpereira22@gmail.com>
Co-authored-by: Fernando Pereira <fernando.pereira@criticalsoftware.com>
Co-authored-by: Mayank Jobanputra <mayankjobanputra@gmail.com>
Co-authored-by: bogdankostic <bogdankostic@web.de>
* Add pytest fixture to block requests in unit tests
* Mark test correctly as integration
* Fix crawler unit test failing because it tries to install chromedriver
* Rework some PromptNode and PromptModel tests
* Remove duplicate code in PromptNode
* Fix mypy
* Fix test failing because of missing fixture
* Revert "Fix mypy"
This reverts commit e530295a06cb260d9a8bd89679534958cb3d9776.
* Revert "Remove duplicate code in PromptNode"
This reverts commit 4a678ae81504dcc78a737372c061d12dc8799639.
* Initial commit, add search_engine
* Add TopPSampler
* Add more TopPSampler unit tests
* Remove SearchEngineSampler (converted to TopPSampler)
* Add some basic WebSearch unit tests
* Rename unit tests
* Add WebRetriever into agent_tools
* Adjust to WebRetriever
* Add WebRetriever mode [snippet|document]
* Minor changes
* SerperDev: add peopleAlsoAsk search results
* First agent for hotpotqa
* Making WebRetriever work on hotpotqa
* refactor: minor WebRetriever improvements (#4377)
* refactor: remove doc ids rebuild + anticipate cache
* refactor: improve caching, fix Document ids
* Minor WebRetriever improvements
* Overlooked minor fixes
* feat: add Bing API as search engine
* refactor: let kwargs pass-through
* feat: increase search context
* check sampler result, improve batch typing
* refactor: increase mypy compliance
* Fix mypy
* Minor example fixes
* Fix the descriptions
* PR feedback updates
* More fixes
* TopPSampler: handle top p None value, add unit test
* Add top_k to WebSearch
* Use boilerpy3 instead of trafilatura
* Remove date finding
* Add more WebRetriever docs
* Refactor long methods
* making the preprocessor optional
* hide WebSearch and make NeuralWebSearch a pipeline
* remove unused imports
* add WebQAPipeline and split example into two
* change example search engine to SerperDev
* Turn off progress bars in WebRetriever's PreProcessor
* Agent tool examples - final updates
* Add webqa test, search results ranking scores
* Better answer box handling for SerperDev and SerpAPI
* Minor fixes
* pylint
* pylint fixes
* extract TopPSampler from WebRetriever
* use sampler only for WebRetriever modes other than snippet
* add web retriever tests
* exclude rdflib@6.3.2 due to license issues
* add test for preprocessed docs and kwargs examples in docstrings
* Move test_webqa_pipeline to test/pipelines
* change docstring for join_documents_and_scores
* Use WebQAPipeline in examples/web_lfqa.py
* Move test_webqa_pipeline to e2e
* Updated lg
* Sampler added automatically in WebQAPipeline, no need to add it
* Updated lg
* :ignore Update agent tools examples to new templates (#4503)
* Update examples to new templates
* Add print back
* fix linting and black format issues
---------
Co-authored-by: Daniel Bichuetti <daniel.bichuetti@gmail.com>
Co-authored-by: agnieszka-m <amarzec13@gmail.com>
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
* store prompt in Answer
* store prompt in eval csv
* fix tests
* chore: fix context offset loading
* add tests
* add test from PR #4476
* fix tests after merge