* bump ES version in CI
disable ssl
wait for service to start
set env vars
do not use choco to install ES
re-enable jobs deps
skip test on windows CI because of OOM
allocate more memory for ES
uniform ES installation and use default heap size
skip tests causing OOM
increase job timeout
restore memory limit for ES8
* Use latest elasticsearch version
* Add max_tokens to BaseGenerator params
* Make mypy happy
* Rebase and resolve conflicts
* Fix signature issues
* Update lg
* Add a mocked unit test method
* end-of-file-fixer corrected file
* Convert to unit test
* Mark test as integration
* make the test unit
---------
Co-authored-by: agnieszka-m <amarzec13@gmail.com>
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
* preserve root_node and add tests
* Added if statement to fix failing tests
---------
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
Co-authored-by: Sebastian Husch Lee <sjrl423@gmail.com>
* Deprecate name parameter
* Adapt existing tests and uses of PromptTemplate
* Move parameter `name` to end
* Adapt existing tests
* lg update
---------
Co-authored-by: Darja Fokina <daria.f93@gmail.com>
* fixed test base for hub 0.13.3
* check if tests succeed from branch
* 2nd check if tests succeed from branch
* removed dependency changes
---------
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
* Upgrade to transformers 4.28.1
* Commenting out failing piece of test
* trailing-whitespace
* Adjust regex for error match - it changed between releases
* Remove RAG tests failing with transformers update
* extract elasticsearch
* update pyproject.toml
* make more imports optional
* move MockBaseRetriever in conftest
* install es in the es integration tests
* Start adding support for TableCell
* Update tests to use row and col
* Added schema test to check to_dict and from_dict works for Table documents. Also updated Doc.__eq__ to work for tables.
* Update eval test to use TableCell
* Added more schema tests for table docs, labels and answers.
* Add boolean to toggle between Span and TableCell
* Add deprecation message
* Test that table answers work as responses in the rest API
---------
Co-authored-by: agnieszka-m <amarzec13@gmail.com>
* Added warning messages for documents that are skipped by RouteDocuments. Began adding support for the new option return_remaining and list-of-lists support for metadata value splitting.
* Simplify _split_by_content_type
* Added new unit test and updated _calculate_outgoing_edges
* Added some TODOs and turned assert into raising an error.
* Update logging messages and make new fixture in tests
* Update _split_by_metadata_values to work with return_remaining
* Remove unneeded code
* Documentation
* Add proper support for list of lists
* Fix mypy errors
* Added assert to make mypy happy
* Update haystack/nodes/other/route_documents.py
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* PR comments
* Remove check for logging level
* make mypy happy
* Update docstring of metadata_values
* Removed duplicate check. Make explicit check for metadata_values
---------
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
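
The RouteDocuments changes above (metadata value splitting with list-of-lists support, `return_remaining`, and warnings for skipped documents) can be illustrated with a minimal sketch. This is not the code from haystack/nodes/other/route_documents.py: the function name, the plain-dict document shape, and the choice to put leftovers on one extra output are all assumptions made for illustration.

```python
import logging
from typing import Any, Dict, List, Union

logger = logging.getLogger(__name__)

# Hypothetical stand-in for a document: a dict with a "meta" dict inside.
Doc = Dict[str, Any]


def split_by_metadata_values(
    docs: List[Doc],
    field: str,
    metadata_values: List[Union[str, List[str]]],
    return_remaining: bool = False,
) -> Dict[str, List[Doc]]:
    """Route each document to the first output whose value group matches."""
    outputs: Dict[str, List[Doc]] = {
        f"output_{i + 1}": [] for i in range(len(metadata_values))
    }
    remaining: List[Doc] = []

    for doc in docs:
        value = doc.get("meta", {}).get(field)
        for i, group in enumerate(metadata_values):
            # A plain value is treated as a one-element group, so both
            # ["en", "de"] and [["en"], ["de", "fr"]] are accepted.
            allowed = group if isinstance(group, list) else [group]
            if value in allowed:
                outputs[f"output_{i + 1}"].append(doc)
                break
        else:
            remaining.append(doc)

    if return_remaining:
        # Assumption: unmatched documents go to one additional output.
        outputs[f"output_{len(metadata_values) + 1}"] = remaining
    elif remaining:
        logger.warning(
            "%d document(s) matched none of the metadata values and were skipped.",
            len(remaining),
        )
    return outputs
```

With `metadata_values=["en", ["de", "fr"]]`, English documents land on `output_1`, German and French ones share `output_2`, and anything else is either returned on `output_3` or skipped with a warning, depending on `return_remaining`.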
* fix: add list element handling and its mapping logic to the ParsrConverter convert step, plus a unit test covering the mapping of list content from Parsr's format to Haystack's
* Code review changes
* changed the samples path after conftest changes
* added samples_path to function arg
---------
Co-authored-by: Namoush <fmpereira22@gmail.com>
Co-authored-by: Fernando Pereira <fernando.pereira@criticalsoftware.com>
Co-authored-by: Mayank Jobanputra <mayankjobanputra@gmail.com>
Co-authored-by: bogdankostic <bogdankostic@web.de>
* Add pytest fixture to block requests in unit tests
* Mark test correctly as integration
* Fix crawler unit test failing because it tries to install chromedriver
* Initial commit, add search_engine
* Add TopPSampler
* Add more TopPSampler unit tests
* Remove SearchEngineSampler (converted to TopPSampler)
* Add some basic WebSearch unit tests
* Rename unit tests
* Add WebRetriever into agent_tools
* Adjust to WebRetriever
* Add WebRetriever mode [snippet|document]
* Minor changes
* SerperDev: add peopleAlsoAsk search results
* First agent for hotpotqa
* Making WebRetriever work on hotpotqa
* refactor: minor WebRetriever improvements (#4377)
* refactor: remove doc ids rebuild + anticipate cache
* refactor: improve caching, fix Document ids
* Minor WebRetriever improvements
* Overlooked minor fixes
* feat: add Bing API as search engine
* refactor: let kwargs pass-through
* feat: increase search context
* check sampler result, improve batch typing
* refactor: increase mypy compliance
* Fix mypy
* Minor example fixes
* Fix the descriptions
* PR feedback updates
* More fixes
* TopPSampler: handle top p None value, add unit test
* Add top_k to WebSearch
* Use boilerpy3 instead of trafilatura
* Remove date finding
* Add more WebRetriever docs
* Refactor long methods
* making the preprocessor optional
* hide WebSearch and make NeuralWebSearch a pipeline
* remove unused imports
* add WebQAPipeline and split example into two
* change example search engine to SerperDev
* Turn off progress bars in WebRetriever's PreProcessor
* Agent tool examples - final updates
* Add webqa test, search results ranking scores
* Better answer box handling for SerperDev and SerpAPI
* Minor fixes
* pylint
* pylint fixes
* extract TopPSampler from WebRetriever
* use sampler only for WebRetriever modes other than snippet
* add web retriever tests
* exclude rdflib@6.3.2 due to license issues
* add test for preprocessed docs and kwargs examples in docstrings
* Move test_webqa_pipeline to test/pipelines
* change docstring for join_documents_and_scores
* Use WebQAPipeline in examples/web_lfqa.py
* Move test_webqa_pipeline to e2e
* Updated lg
* Sampler added automatically in WebQAPipeline, no need to add it
* Updated lg
* Updated lg
* :ignore Update agent tools examples to new templates (#4503)
* Update examples to new templates
* Add print back
* fix linting and black format issues
---------
Co-authored-by: Daniel Bichuetti <daniel.bichuetti@gmail.com>
Co-authored-by: agnieszka-m <amarzec13@gmail.com>
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
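
For readers unfamiliar with the idea behind the TopPSampler entries above, here is a minimal sketch of top-p (nucleus) filtering over scored items, including the `None` case mentioned in the log. It is not the TopPSampler implementation: the function name, the softmax normalization of raw scores, and the rule of keeping the item that crosses the threshold are assumptions for illustration.

```python
import math
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def top_p_filter(
    items: Sequence[T],
    scores: Sequence[float],
    top_p: Optional[float] = None,
) -> List[T]:
    """Keep the highest-scoring items whose cumulative probability reaches top_p."""
    if len(items) != len(scores):
        raise ValueError("items and scores must have the same length")
    # Assumption: top_p=None means "no filtering", so everything is returned.
    if top_p is None or not items:
        return list(items)

    # Softmax over the raw scores; subtracting the max keeps exp() stable.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Walk items from most to least probable, stopping once the cumulative
    # probability reaches top_p. At least one item is always kept.
    order = sorted(range(len(items)), key=lambda i: probs[i], reverse=True)
    kept: List[T] = []
    cumulative = 0.0
    for i in order:
        kept.append(items[i])
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return kept
```

A low `top_p` keeps only the dominant results, while `top_p=None` passes every item through unchanged, which is the case a caller hits when no sampling threshold is configured.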
* add language classifier node
* Fix a few bugs and general code style
* whitespace
* first draft and refactoring
* draft of classes separation
* improve base class
* fix invisible character; add some tests
* fix and more tests
* more docs and tests
* move __init__ to base
* add transformers node; improve tests
* incorporate feedback; little fix to other node
* labels_to_languages mapping
* better docstrings
* use logger instead of logging
---------
Co-authored-by: Stanislav Zamecnik <stanislav.zamecnik@telekom.com>
Co-authored-by: anakin87 <44616784+anakin87@users.noreply.github.com>
Co-authored-by: stazam <zamecnik.stanislav@gmail.com>
* Added changes from table-qa-pipeline
* Moved classes around to make diff to main look nicer.
* Cleaned things up. Removed option to return_no_answer (not needed), added docs and added integration marks.
* Remove unneeded code
* Added fix for test
* Add check for document_ids in answer
* Prevent passing of empty list to np.mean
* Batching doesn't work with TableQAPipeline b/c of HF issue
* Cleanup of table reader tests, added check for document ids.
* Fixing pylint
* More pylint
* PR comments
---------
Co-authored-by: bogdankostic <bogdankostic@web.de>