Silvano Cerza
f4fb8dd946
Revert "ci: Change docker_release.yml workflow to run after successful PyPi release ( #4293 )" ( #4513 )
...
This reverts commit 6e241262ada9e59359d653a779246d2ad03c1223.
2023-03-28 15:28:15 +02:00
Vladimir Blagojevic
7c9f719496
refactor: Adjust WhisperTranscriber to pipeline run methods ( #4510 )
...
* Retrofit WhisperTranscriber run methods
* Add pipeline unit test
---------
Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>
2023-03-28 13:52:21 +02:00
Silvano Cerza
098342da32
Use new Slack action to send failure messages ( #4464 )
2023-03-28 10:49:32 +02:00
Silvano Cerza
dbdb682225
Enhance release_docs.py ( #4459 )
2023-03-28 09:56:42 +02:00
Silvano Cerza
cfb8dfd470
Fix pipeline config and agent tools hashing for telemetry ( #4508 )
2023-03-28 09:41:50 +02:00
ZanSara
c777302fb4
chore: disable posthog in rest api tests ( #4507 )
2023-03-27 21:12:27 +02:00
bogdankostic
ed1837c0c9
feat: Deduplicate duplicate Answers resulting from overlapping Documents in FARMReader ( #4470 )
...
* Deduplicate answers resulting from document split overlap
* Add tests
* Fix Pylint
* Adapt existing test
* Incorporate PR feedback
2023-03-27 20:04:59 +02:00
github-actions[bot]
de825ded1c
Update unstable version ( #4506 )
...
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2023-03-27 19:01:04 +02:00
tstadel
3a7ae8239c
chore: Fix imports ( #4505 )
2023-03-27 18:58:53 +02:00
ZanSara
b52bbea0a5
refactor: reduce telemetry events count ( #4501 )
...
* remove telemetry v1
* more pipeline methods to take out
* send_event_2
* reduce events
* mypy
* consolidate llm nodes
* pylint
* mypy
* mypy again
* remove test
* mypy
* pylint
* black
* mypy
2023-03-27 18:53:56 +02:00
Vladimir Blagojevic
be25655663
feat: Add agent tools ( #4437 )
...
* Initial commit, add search_engine
* Add TopPSampler
* Add more TopPSampler unit tests
* Remove SearchEngineSampler (converted to TopPSampler)
* Add some basic WebSearch unit tests
* Rename unit tests
* Add WebRetriever into agent_tools
* Adjust to WebRetriever
* Add WebRetriever mode [snippet|document]
* Minor changes
* SerperDev: add peopleAlsoAsk search results
* First agent for hotpotqa
* Making WebRetriever work on hotpotqa
* refactor: minor WebRetriever improvements (#4377 )
* refactor: remove doc ids rebuild + anticipate cache
* refactor: improve caching, fix Document ids
* Minor WebRetriever improvements
* Overlooked minor fixes
* feat: add Bing API as search engine
* refactor: let kwargs pass-through
* feat: increase search context
* check sampler result, improve batch typing
* refactor: increase mypy compliance
* Fix mypy
* Minor example fixes
* Fix the descriptions
* PR feedback updates
* More fixes
* TopPSampler: handle top p None value, add unit test
* Add top_k to WebSearch
* Use boilerpy3 instead of trafilatura
* Remove date finding
* Add more WebRetriever docs
* Refactor long methods
* making the preprocessor optional
* hide WebSearch and make NeuralWebSearch a pipeline
* remove unused imports
* add WebQAPipeline and split example into two
* change example search engine to SerperDev
* Turn off progress bars in WebRetriever's PreProcessor
* Agent tool examples - final updates
* Add webqa test, search results ranking scores
* Better answer box handling for SerperDev and SerpAPI
* Minor fixes
* pylint
* pylint fixes
* extract TopPSampler from WebRetriever
* use sampler only for WebRetriever modes other than snippet
* add web retriever tests
* add web retriever tests
* exclude rdflib@6.3.2 due to license issues
* add test for preprocessed docs and kwargs examples in docstrings
* Move test_webqa_pipeline to test/pipelines
* change docstring for join_documents_and_scores
* Use WebQAPipeline in examples/web_lfqa.py
* Use WebQAPipeline in examples/web_lfqa.py
* Move test_webqa_pipeline to e2e
* Updated lg
* Sampler added automatically in WebQAPipeline, no need to add it
* Updated lg
* Updated lg
* :ignore Update agent tools examples to new templates (#4503 )
* Update examples to new templates
* Add print back
* fix linting and black format issues
---------
Co-authored-by: Daniel Bichuetti <daniel.bichuetti@gmail.com>
Co-authored-by: agnieszka-m <amarzec13@gmail.com>
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2023-03-27 18:14:58 +02:00
tstadel
4f90e59796
feat: expose prompts to Answer and EvaluationResult ( #4341 )
...
* store prompt in Answer
* store prompt in eval csv
* fix tests
* chore: fix context offset loading
* add tests
* add test from PR #4476
* fix tests after merge
2023-03-27 17:54:20 +02:00
ZanSara
6d578ebf3d
refactor: remove telemetry v1 ( #4496 )
...
* remove telemetry v1
* more pipeline methods to take out
* send_event_2
* mypy
* pylint
* mypy
* mypy again
* remove test
2023-03-27 17:38:43 +02:00
Silvano Cerza
3b5223fa1c
refactor: Mark MilvusDocumentStore as deprecated ( #4498 )
...
* Mark MilvusDocumentStore as deprecated
* Fix mypy
2023-03-27 15:31:48 +02:00
Silvano Cerza
5b63c2086e
refactor: Deprecate BaseKnowledgeGraph, GraphDBKnowledgeGraph, InMemoryKnowledgeGraph and Text2SparqlRetriever ( #4500 )
...
* Deprecate BaseKnowledgeGraph and InMemoryKnowledgeGraph
* Deprecate GraphDBKnowledgeGraph
* Fix mypy
* Deprecate Text2SparqlRetriever
2023-03-27 15:31:22 +02:00
tstadel
f8bb270d62
feat: prompt at query time ( #4454 )
...
* use outputshapers in prompttemplate
* fix pylint
* first iteration on regex
* implement new promptnode syntax based on f-strings
* finish fstring implementation
* add additional tests
* add security tests
* fix mypy
* fix pylint
* fix test_prompt_templates
* fix test_prompt_template_repr
* fix test_prompt_node_with_custom_invocation_layer
* fix test_invalid_template
* more security tests
* fix test_complex_pipeline_with_all_features
* fix agent tests
* refactor get_prompt_template
* fix test_prompt_template_syntax_parser
* fix test_complex_pipeline_with_all_features
* allow functions in comprehensions
* break out of fstring test
* fix additional tests
* mark new tests as unit tests
* fix agents tests
* convert missing templates
* proper use of get_prompt_template
* refactor and add docstrings
* fix tests
* fix pylint
* fix agents test
* fix tests
* refactor globals
* make allowed functions configurable via env variable
* better dummy variable
* fix special alias
* don't replace special char variables
* more special chars, better docstrings
* cherrypick fix audio tests
* fix test
* rework shapers
* fix pylint
* fix tests
* add new templates
* add reference parsing
* add more shaper tests
* add tests for join and to_string
* fix pylint
* fix pylint
* fix pylint for real
* auto fill shaper function params
* fix reference parsing for multiple references
* fix output variable inference
* consolidate qa prompt template output and make shaper work per-document
* implement prompt at query time
* support serialized PromptTemplates
* fix tests
* add tests for prompt template at query time
* fix types after merge
* fix types after merge
* improve test
* add test for nested shaper syntax in pipelines
* better docstrings
* Correct copilot errors
* found another copilot error
* Another one
* introduce output_parser
* introduce output_parser
* Fix tests for output_parser update
* fix black
* fix tests
* fix tests
* fix tests
* better docstring
* better docstring
* fix test
* fix mypy
* rename RegexAnswerParser to AnswerParser
* rename RegexAnswerParser to AnswerParser
* better docstrings
* better docstrings
* fix docstring example
2023-03-27 14:10:20 +02:00
Silvano Cerza
123dfc1b34
refactor: Remove ElasticsearchRetriever and ElasticsearchFilterOnlyRetriever ( #4499 )
2023-03-27 13:47:47 +02:00
tstadel
382ca8094e
feat: PromptTemplate extensions ( #4378 )
...
* use outputshapers in prompttemplate
* fix pylint
* first iteration on regex
* implement new promptnode syntax based on f-strings
* finish fstring implementation
* add additional tests
* add security tests
* fix mypy
* fix pylint
* fix test_prompt_templates
* fix test_prompt_template_repr
* fix test_prompt_node_with_custom_invocation_layer
* fix test_invalid_template
* more security tests
* fix test_complex_pipeline_with_all_features
* fix agent tests
* refactor get_prompt_template
* fix test_prompt_template_syntax_parser
* fix test_complex_pipeline_with_all_features
* allow functions in comprehensions
* break out of fstring test
* fix additional tests
* mark new tests as unit tests
* fix agents tests
* convert missing templates
* proper use of get_prompt_template
* refactor and add docstrings
* fix tests
* fix pylint
* fix agents test
* fix tests
* refactor globals
* make allowed functions configurable via env variable
* better dummy variable
* fix special alias
* don't replace special char variables
* more special chars, better docstrings
* cherrypick fix audio tests
* fix test
* rework shapers
* fix pylint
* fix tests
* add new templates
* add reference parsing
* add more shaper tests
* add tests for join and to_string
* fix pylint
* fix pylint
* fix pylint for real
* auto fill shaper function params
* fix reference parsing for multiple references
* fix output variable inference
* consolidate qa prompt template output and make shaper work per-document
* fix types after merge
* introduce output_parser
* fix tests
* better docstring
* rename RegexAnswerParser to AnswerParser
* better docstrings
2023-03-27 12:14:11 +02:00
ZanSara
9518bcb7a8
remove env var ( #4497 )
2023-03-27 10:33:58 +02:00
Julian Risch
45ce87bb48
bug: Exclude rdflib 6.3.2 because of fossa license issues ( #4495 )
2023-03-27 10:07:03 +02:00
Vladimir Blagojevic
c99b58100d
feat: Add agent event callbacks ( #4491 )
...
* Implement agent callbacks with events
* Fix mypy errors
* Fix prompt_params assignment
* PR review fixes
---------
Co-authored-by: Silvano Cerza <silvanocerza@gmail.com>
2023-03-27 10:06:11 +02:00
recrudesce
2a2226d63e
fix: Fix debug on PromptNode ( #4483 )
...
* Fix debug on PromptNode
Allow the ability to control debug output on PromptNode
* added tests, simplified code
---------
Co-authored-by: Mayank Jobanputra <mayankjobanputra@gmail.com>
2023-03-24 19:37:52 +05:30
Mayank Jobanputra
5f72cdc012
fix: stop loading FAISS and InMem doc Store for indexing pipelines ( #4396 )
...
* stop loading FAISS and InMem doc Store for indexing pipelines
* pylint fix
* Addressed comments
2023-03-24 19:35:29 +05:30
Silvano Cerza
b70715a74d
Remove retry_with_exponential_backoff in favor of tenacity ( #4460 )
2023-03-24 11:14:11 +01:00
Jose Pablo Fernandez
dda350088b
feat: add additional params to file upload endpoint ( #4445 )
...
* adds additional params to file upload endpoint
* fix mypy
---------
Co-authored-by: Mayank Jobanputra <mayankjobanputra@gmail.com>
2023-03-23 14:18:16 +01:00
Vladimir Blagojevic
7bb6499c29
feat: Enable PromptNode to use text-generation models ( #4349 )
2023-03-22 07:20:36 +01:00
Vladimir Blagojevic
3272e2b9fe
refactor: Add AgentStep ( #4431 )
2023-03-17 18:21:14 +01:00
ZanSara
4d19bd13a5
refactor: consolidate telemetry events ( #4275 )
...
* add specific Ray event
* group evaluation and training events
* consolidate pipeline run events
* fix send_event import
* review feedback
* typo
* send uptime
* track embeddingRetriever openai encoder
* track embeddingRetriever openai encoder
* pylint
---------
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-03-17 18:14:35 +01:00
Florian Hardow
462484445d
feat: break retry loop for 401 unauthorized errors in promptnode ( #4389 )
...
* feat: break retry loop for 401 unauthorized errors in promptnode
* Fix black, pylint, mypy
* Update haystack/nodes/retriever/_embedding_encoder.py
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* Update haystack/utils/openai_utils.py
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* chore: blackify project
* chore: fix linting error (remove elif after raise)
---------
Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-03-17 17:07:08 +01:00
Silvano Cerza
d55bac189c
Make version semver compliant ( #4456 )
2023-03-17 14:21:36 +01:00
Vladimir Blagojevic
53528c96a0
feat: Add ChatGPT PromptNode layer ( #4357 )
...
* Initial ChatGPTInvocationLayer
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
Co-authored-by: agnieszka-m <amarzec13@gmail.com>
Co-authored-by: Sebastian <sjrl@users.noreply.github.com>
2023-03-17 14:16:41 +01:00
Silvano Cerza
0f605118d9
ci: remove python_cache internal action ( #4429 )
2023-03-17 13:55:07 +01:00
Agnieszka Marzec
26e0fbb4f8
Docs: Update language classifier docstrings ( #4413 )
...
* Update language classifier docstrings
* Apply suggestions from code review
---------
Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>
2023-03-17 12:40:02 +01:00
Sebastian
f04b2f3cee
Update test to reflect change in max token length ( #4451 )
2023-03-17 09:43:23 +01:00
Ahmed Nabil
d29342c8bf
feat: Add the New Tokenizer of gpt-3.5-turbo ( #4331 )
...
* Updated the tokenizer algorithm and pyproject.toml tiktoken version
* Update haystack/utils/openai_utils.py
Co-authored-by: Sebastian <sjrl@users.noreply.github.com>
* Update references in openai_utils.py
* Update docs/pydoc/config/extractor.yml
Co-authored-by: Sebastian <sjrl@users.noreply.github.com>
* Update docs/pydoc/config/document-classifier.yml
Co-authored-by: Sebastian <sjrl@users.noreply.github.com>
* Update docs/pydoc/config/file-converters.yml
Co-authored-by: Sebastian <sjrl@users.noreply.github.com>
* Update docs/pydoc/config/file-classifier.yml
Co-authored-by: Sebastian <sjrl@users.noreply.github.com>
* Update docs/pydoc/config/other.yml
Co-authored-by: Sebastian <sjrl@users.noreply.github.com>
* Update docs/pydoc/config/pipelines.yml
Co-authored-by: Sebastian <sjrl@users.noreply.github.com>
* Update docs/pydoc/config/preprocessor.yml
Co-authored-by: Sebastian <sjrl@users.noreply.github.com>
* Update docs/pydoc/config/primitives.yml
Co-authored-by: Sebastian <sjrl@users.noreply.github.com>
* Update docs/pydoc/config/translator.yml
Co-authored-by: Sebastian <sjrl@users.noreply.github.com>
* Update docs/pydoc/config/crawler.yml
Co-authored-by: Sebastian <sjrl@users.noreply.github.com>
* Update docs/pydoc/config/prompt-node.yml
Co-authored-by: Sebastian <sjrl@users.noreply.github.com>
* Update docs/pydoc/config/pseudo-label-generator.yml
Co-authored-by: Sebastian <sjrl@users.noreply.github.com>
* Update docs/pydoc/config/query-classifier.yml
Co-authored-by: Sebastian <sjrl@users.noreply.github.com>
* Update docs/pydoc/config/question-generator.yml
Co-authored-by: Sebastian <sjrl@users.noreply.github.com>
* Update docs/pydoc/config/reader.yml
Co-authored-by: Sebastian <sjrl@users.noreply.github.com>
* Update docs/pydoc/config/ranker.yml
Co-authored-by: Sebastian <sjrl@users.noreply.github.com>
* Update docs/pydoc/config/retriever.yml
Co-authored-by: Sebastian <sjrl@users.noreply.github.com>
* Update docs/pydoc/config/transformers-img-to-text.yml
Co-authored-by: Sebastian <sjrl@users.noreply.github.com>
* Update openai_utils.py
Adding GPT-4 tokenization handler
* try to fix black
---------
Co-authored-by: Sebastian <sjrl@users.noreply.github.com>
Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
2023-03-17 08:20:57 +01:00
ju-gu
a3409c7da6
fix: issue evaluation check for content type ( #4181 )
...
* fix: issue evaluation check for content type
Evaluation currently breaks when the content type is not a str.
* add black
* add test table eval
* add black formatting
* Expand integration test
---------
Co-authored-by: Sebastian Lee <sebastian.lee@deepset.ai>
2023-03-16 17:36:53 +01:00
Silvano Cerza
1b5df55dbb
Skip flaky test ( #4444 )
2023-03-16 16:32:28 +01:00
Silvano Cerza
22c50207c1
Run readme_sync.yml in PRs ( #4442 )
2023-03-16 15:18:13 +01:00
Massimiliano Pippi
8d4c56720c
do not run tests on osx ( #4443 )
2023-03-16 15:00:29 +01:00
Agnieszka Marzec
798fba87dd
Fix agent module ( #4441 )
2023-03-16 10:14:59 +01:00
Silvano Cerza
9802fb159a
Remove unnecessary imports in conftest.py ( #4434 )
2023-03-16 10:02:01 +01:00
Agnieszka Marzec
3a97e271fc
Fix order and category of agent ( #4440 )
2023-03-16 09:59:17 +01:00
Silvano Cerza
3591fc02e1
Mark Crawler tests correctly ( #4435 )
2023-03-16 09:26:19 +01:00
Vladimir Blagojevic
2538b4cbc9
Make promptnode test unit ( #4420 )
2023-03-15 22:17:23 +01:00
Silvano Cerza
b59cf76093
refactor: Remove AnswerToSpeech and DocumentToSpeech nodes ( #4391 )
...
* Remove AnswerToSpeech and DocumentToSpeech nodes
* Remove unused dataclasses
* Remove unnecessary dependencies
* Remove unused error class and imports
2023-03-15 19:31:13 +01:00
Vladimir Blagojevic
f13501309e
OpenAI streaming support ( #4397 )
2023-03-15 18:24:47 +01:00
ZanSara
3ecce5cbeb
refactor: rename v2 package to preview ( #4409 )
...
* v2->preview
* fossa -> py3.8
* test matrix
* test matrix
* tests
* test imports
---------
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-03-15 18:02:18 +01:00
Agnieszka Marzec
374d7c9c4f
docs: Update Agent docstrings + add api docs ( #4296 )
...
* Update docstrings + add api docs
* Update with reviewer's changes
* Fix category id and blackify
* make max iterations test more robust
---------
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2023-03-15 17:26:35 +01:00
Massimiliano Pippi
d87b310f01
feat: improve is_containerized() ( #4412 )
...
* improve is_containerized()
* ignore global-var warning
2023-03-15 17:06:46 +01:00
Silvano Cerza
b3a659cd4a
test: Fix audio tests failing ( #4418 )
...
* Fix audio tests failing
* Disable local whisper tests
2023-03-15 15:26:30 +01:00