* Add pytest fixture to block requests in unit tests
* Mark test correctly as integration
* Fix crawler unit test failing cause it tries to install chromedriver
* Rework some PromptNode and PromptModel tests
* Remove duplicate code in PromptNode
* Fix mypy
* Fix test cause of missing fixture
* Revert "Fix mypy"
This reverts commit e530295a06cb260d9a8bd89679534958cb3d9776.
* Revert "Remove duplicate code in PromptNode"
This reverts commit 4a678ae81504dcc78a737372c061d12dc8799639.
* add a fallback xpdf alternative to PyMuPDF
* add xpdpf to the base images
* to be reverted
* silence mypy on conditional error
* do not install pdf extras in base images
* bring back the xpdf build strategy
* remove leftovers from old build
* fix indentation
* Apply suggestions from code review
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* revert test workflow
---------
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* do not override bake's platform definitions
* test
* fix job name and remove override from minor version job
* test
* bump docker login action
* fix plurals
* Remove platform from matrix and test both platform in a single job
* Remove branch trigger used for testing
---------
Co-authored-by: Silvano Cerza <silvanocerza@gmail.com>
* Initial commit, add search_engine
* Add TopPSampler
* Add more TopPSampler unit tests
* Remove SearchEngineSampler (converted to TopPSampler)
* Add some basic WebSearch unit tests
* Rename unit tests
* Add WebRetriever into agent_tools
* Adjust to WebRetriever
* Add WebRetriever mode [snippet|document]
* Minor changes
* SerperDev: add peopleAlsoAsk search results
* First agent for hotpotqa
* Making WebRetriever work on hotpotqa
* refactor: minor WebRetriever improvements (#4377)
* refactor: remove doc ids rebuild + antecipate cache
* refactor: improve caching, fix Document ids
* Minor WebRetriever improvements
* Overlooked minor fixes
* feat: add Bing API as search engine
* refactor: let kwargs pass-through
* feat: increase search context
* check sampler result, improve batch typing
* refactor: increase mypy compliance
* Initial commit, add search_engine
* Add TopPSampler
* Add more TopPSampler unit tests
* Remove SearchEngineSampler (converted to TopPSampler)
* Add some basic WebSearch unit tests
* Rename unit tests
* Add WebRetriever into agent_tools
* Adjust to WebRetriever
* Add WebRetriever mode [snippet|document]
* Minor changes
* SerperDev: add peopleAlsoAsk search results
* First agent for hotpotqa
* Making WebRetriever work on hotpotqa
* refactor: minor WebRetriever improvements (#4377)
* refactor: remove doc ids rebuild + antecipate cache
* refactor: improve caching, fix Document ids
* Minor WebRetriever improvements
* Overlooked minor fixes
* feat: add Bing API as search engine
* refactor: let kwargs pass-through
* feat: increase search context
* check sampler result, improve batch typing
* refactor: increase mypy compliance
* Fix mypy
* Minor example fixes
* Fix the descriptions
* PR feedback updates
* More fixes
* TopPSampler: handle top p None value, add unit test
* Add top_k to WebSearch
* Use boilerpy3 instead trafilatura
* Remove date finding
* Add more WebRetriever docs
* Refactor long methods
* making the preprocessor optional
* hide WebSearch and make NeuralWebSearch a pipeline
* remove unused imports
* add WebQAPipeline and split example into two
* change example search engine to SerperDev
* Turn off progress bars in WebRetriever's PreProcesssor
* Agent tool examples - final updates
* Add webqa test, search results ranking scores
* Better answer box handling for SerperDev and SerpAPI
* Minor fixes
* pylint
* pylint fixes
* extract TopPSampler from WebRetriever
* use sampler only for WebRetriever modes other than snippet
* add web retriever tests
* add web retriever tests
* exclude rdflib@6.3.2 due to license issues
* add test for preprocessed docs and kwargs examples in docstrings
* Move test_webqa_pipeline to test/pipelines
* change docstring for join_documents_and_scores
* Use WebQAPipeline in examples/web_lfqa.py
* Use WebQAPipeline in examples/web_lfqa.py
* Move test_webqa_pipeline to e2e
* Updated lg
* Sampler added automatically in WebQAPipeline, no need to add it
* Updated lg
* Updated lg
* :ignore Update agent tools examples to new templates (#4503)
* Update examples to new templates
* Add print back
* fix linting and black format issues
---------
Co-authored-by: Daniel Bichuetti <daniel.bichuetti@gmail.com>
Co-authored-by: agnieszka-m <amarzec13@gmail.com>
Co-authored-by: Julian Risch <julian.risch@deepset.ai>