Julian Risch
59e89b1031
test: Remove anthropic from "getting started" example test ( #6024 )
2023-10-12 22:36:49 +02:00
ZanSara
adf7e49af3
chore: review all extra ( #6029 )
2023-10-12 21:50:53 +02:00
Stefano Fiorucci
2c2549f13d
move embedding backends ( #6033 )
2023-10-12 17:52:28 +02:00
Vladimir Blagojevic
d51be9edac
Add top_k to SimilarityRanker ( #6036 )
2023-10-12 13:52:01 +02:00
Vladimir Blagojevic
4b8b6e9191
Use forward reference for AnalyzeResult ( #6030 )
2023-10-11 16:33:02 +02:00
Vladimir Blagojevic
3803d23ff6
feat: Update PyPDFToDocument to process ByteStream inputs ( #6021 )
...
* Update PyPDF converter
* Add mixed source unit test
* Update haystack/preview/components/file_converters/pypdf.py
Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>
---------
Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>
2023-10-11 10:52:08 +02:00
Vladimir Blagojevic
1a6a8863e8
feat: Update HTMLToDocument to handle ByteStream inputs ( #6020 )
...
* Update HTML converter
* Add mixed source unit test
* Update haystack/preview/components/file_converters/html.py
Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>
---------
Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>
2023-10-11 10:15:58 +02:00
Julian Risch
12fe0364dc
test: Utility to compare two lists of documents for equality ( #6005 )
...
* check that sorted lists contain same docs
* fix broken tests
2023-10-11 08:16:41 +02:00
Vladimir Blagojevic
6a50123b9f
feat: Adjust LinkContentFetcher run method, use ByteStream ( #5972 )
2023-10-10 17:48:31 +02:00
Nicola Procopio
c102b152dc
fix: Run update_embeddings in examples ( #6008 )
...
* added hybrid search example
Added an example about hybrid search for faq pipeline on covid dataset
* formatted with black formatter
* renamed document
* fixed
* fixed typos
* added test
added test for hybrid search
* fixed whitespaces
* removed test for hybrid search
* fixed pylint
* commented logging
* updated hybrid search example
* release notes
* Update hybrid_search_faq_pipeline.py-815df846dca7e872.yaml
* Update hybrid_search_faq_pipeline.py
* mention hybrid search example in release notes
* reduce installed dependencies in examples test workflow
* do not install cuda dependencies
* skip models if API key not set; delete document indices
* skip models if API key not set; delete document indices
* skip models if API key not set; delete document indices
* keep roberta-base model and inference extra
* pylint
* disable pylint no-logging-basicconfig rule
---------
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2023-10-10 16:38:52 +02:00
Vladimir Blagojevic
c05f564359
feat: Split linting preview into a separate file ( #6017 )
...
* Split linting preview into separate file
* Add not trigger paths in old workflow
2023-10-10 14:54:27 +02:00
Vladimir Blagojevic
98215aec0d
feat: Rename FileExtensionRouter to FileTypeRouter, handle ByteStream(s) ( #5998 )
...
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2023-10-10 09:14:04 +02:00
DanShatford
07048791aa
feat: allow list of file paths in convert_files_to_docs ( #5961 )
...
* feat: allow list of file paths in `convert_files_to_docs`
* Fix validation
* Fix check errors
2023-10-09 20:19:03 +02:00
David Berenstein
13fb7c5b5f
feat: added on_agent_final_answer-support to Agent callback_manager ( #5736 )
...
* chore: added on_agent_final_answer-support to Agent callback_manager
* chore: format black
* run pre-commit to format file
* updated release notes
* reverted sorted imports
---------
Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
2023-10-09 18:03:47 +02:00
Daria Fokina
d0ff3fa7c2
docs: readme-get-started ( #5993 )
...
* readme-get-started
* lg update
* lg update
Co-authored-by: Bilge Yücel <bilgeyucel96@gmail.com>
---------
Co-authored-by: Bilge Yücel <bilgeyucel96@gmail.com>
2023-10-09 15:24:47 +02:00
Timo Moeller
aea6333637
Add end2end tests as getting started to HS2.0 readme ( #5981 )
...
* Add end2end tests as getting started to HS2.0 readme
* capital heading
---------
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2023-10-09 15:03:24 +02:00
ZanSara
71f2430fd1
test: enhance e2e tests to also draw and serialize/deserialize the test pipelines ( #5910 )
...
* add draw and serialization/deserialization to e2e pipeline examples
* add comment about json serialization
* fix a small gptgenerator bug and move indexing in tests
* to json
* review feedback
2023-10-09 13:54:17 +02:00
Vladimir Blagojevic
40b83d8a47
feat: Add TopPSampler Haystack 2.0 component ( #5924 )
2023-10-09 13:44:01 +02:00
Silvano Cerza
0cb9abb1c2
Rename proposal to respect specifications ( #6002 )
2023-10-09 11:24:19 +02:00
Greg
0e9a51cfb1
fix: annotation-tool is missing DOMAIN_WHITELIST envvar ( #5997 )
...
The docker-compose.yml file for annotation tool
('annotation_tool/domain-compose.yml') is giving an error with latest
image.
The annotation-tool demands the `DOMAIN_WHITELIST` envvar to be defined
that is not a part of the given template referenced by the
documentation.
Signed-off-by: Greg Nagy <greg.nagy@deepset.ai>
2023-10-08 19:48:30 +02:00
Stefano Fiorucci
4e921c650e
rm useless pin ( #5995 )
2023-10-06 18:26:08 +02:00
Vladimir Blagojevic
1cdff6427e
feat: Add SimilarityRanker to Haystack 2.0 ( #5923 )
...
* Initial SimilarityRanker
2023-10-06 16:01:34 +02:00
Stefano Fiorucci
ccc9f010bb
fix: fix ChatGPT invocation layer (and add async support) ( #5979 )
...
* ChatGPT async
* release note
* fix tests
2023-10-05 18:43:26 +02:00
Vladimir Blagojevic
282419d82b
feat: Unfreeze Document in Haystack 2.0 ( #5974 )
...
* Unfreeze document
* Remove immutability test
2023-10-05 17:55:07 +02:00
Vladimir Blagojevic
f983e605c7
Revert "ci: added isort to pyproject.toml and pre-commit ( #5933 )" ( #5980 )
...
This reverts commit 64243540fb1f2cb6d4dfbb5b12db3aaf59a21b4a.
2023-10-05 17:45:28 +02:00
Tobias Wochinger
d5d3a9eef4
chore: adapt deepset cloud sdk endpoint format for saving pipelines ( #5969 )
...
* chore: adapt to new endpoints formats
* docs: add release notes
2023-10-05 08:56:28 +02:00
Massimiliano Pippi
c2ec3f5fde
feat: add File type to preview package ( #5873 )
...
* add Blob type
* review feedback
* fix tests and naming
* Update add-blob-type-2a9476a39841f54d.yaml
* removed unused import
---------
Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
2023-10-04 17:23:12 +02:00
dependabot[bot]
a4beec3013
build(deps): bump aws-actions/configure-aws-credentials ( #5968 )
...
Bumps [aws-actions/configure-aws-credentials](https://github.com/aws-actions/configure-aws-credentials ) from 4.0.0 to 4.0.1.
- [Release notes](https://github.com/aws-actions/configure-aws-credentials/releases )
- [Changelog](https://github.com/aws-actions/configure-aws-credentials/blob/main/CHANGELOG.md )
- [Commits](8c3f20df09...010d0da01d)
---
updated-dependencies:
- dependency-name: aws-actions/configure-aws-credentials
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-10-04 17:20:17 +02:00
Mike Lay
de4592f6f1
Fix contributing link ( #5960 )
...
* Fix broken contributor link. As this ultimately links to the actual contributing page, simply redirect to *CONTRIBUTING.md* instead of `#-contributing`
The 💙 emoji in the anchor does not actually resolve. Should be `#contributing`.
2023-10-04 12:01:51 +02:00
Matt Speck
64243540fb
ci: added isort to pyproject.toml and pre-commit ( #5933 )
2023-10-04 01:01:26 +02:00
dependabot[bot]
58192d35f1
build(deps): bump iterative/setup-cml from 1 to 2 ( #5911 )
...
Bumps [iterative/setup-cml](https://github.com/iterative/setup-cml ) from 1 to 2.
- [Release notes](https://github.com/iterative/setup-cml/releases )
- [Commits](https://github.com/iterative/setup-cml/compare/v1...v2 )
---
updated-dependencies:
- dependency-name: iterative/setup-cml
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-10-03 17:39:22 +02:00
ZanSara
b844ab8e22
chore: remove matrix from Linux CI ( #5955 )
...
* remove matrix
* workflow names
2023-10-03 17:39:04 +02:00
Stefano Fiorucci
cc70b4b613
deprecation ( #5954 )
2023-10-03 12:48:06 +02:00
Massimiliano Pippi
ac408134f4
feat: add support for async openai calls ( #5946 )
...
* add support for async openai calls
* add actual async call
* split the async api
* ask permission
* Update haystack/utils/openai_utils.py
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* Fix OpenAI content moderation tests
* Fix ChatGPT invocation layer tests
---------
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
Co-authored-by: Silvano Cerza <silvanocerza@gmail.com>
2023-10-03 10:42:21 +02:00
Lavesh Akhadkar
1ccf674d73
feat: DocumentWriter returns number of documents written ( #5939 )
...
* Make DocumentWriter return the number of documents it wrote
* Fixed return type
2023-10-03 10:02:33 +02:00
Timo Moeller
dfd9870bcd
Remove language validation ( #5948 )
2023-10-03 09:37:07 +02:00
Silvano Cerza
a933a42749
Fix release_notes.yml syntax
2023-10-02 13:24:08 -07:00
Zubeen
b8c3b68141
Update release_notes.yml ( #5949 )
...
Ignoring release notes check for PRs of type doc/ci/test
2023-10-02 22:17:55 +02:00
dependabot[bot]
69232612d0
build(deps): bump actions/checkout from 3 to 4 ( #5928 )
...
Bumps [actions/checkout](https://github.com/actions/checkout ) from 3 to 4.
- [Release notes](https://github.com/actions/checkout/releases )
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md )
- [Commits](https://github.com/actions/checkout/compare/v3...v4 )
---
updated-dependencies:
- dependency-name: actions/checkout
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-10-01 12:38:57 +02:00
ZanSara
81b2e83d04
feat: separate out preview tests ( #5639 )
...
* add preview workflows
* feedback
* feedback
* use preview extra
* remove coverage and add separate e2e
* rename workflow file for consistency
* trigger ci
* undo trigger
* torch import in testing
* add deps to unit tests
* feedback
* run container instead of service
* comment
* add if statement
* fix tika version
* separate out win integration tests
* separate out all CIs
* try installing docker on macos
* exclude tika
* remove tika docker
2023-09-29 13:16:08 +02:00
bogdankostic
d61df24b27
chore: Remove classifiers directory from preview package ( #5918 )
2023-09-29 10:38:33 +02:00
Massimiliano Pippi
0947f59545
feat: add async PromptNode run ( #5890 )
...
* add async promptnode
* Remove unnecessary calls to dict.keys()
---------
Co-authored-by: Silvano Cerza <silvanocerza@gmail.com>
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-09-29 08:40:01 +02:00
ZanSara
578f2b4bbf
feat: update canals to 0.8.1 ( #5900 )
...
* Update canals to 0.8.1
* scale up runner
2023-09-28 17:50:46 +02:00
Vladimir Blagojevic
e882a7d5c8
feat: Add HTMLToDocument component (v2) ( #5907 )
2023-09-28 17:22:28 +02:00
Massimiliano Pippi
dfa48eece9
clean up the Slack integrations ( #5908 )
2023-09-28 15:49:19 +02:00
Stefano Fiorucci
d4aacad5f9
feat: OpenAIDocumentEmbedder ( #5822 )
...
* first draft
* release note
* mypy fix
* fix test
* corrections
* pr feedback
* better secrets handling and new tests
* missing imports in embedders/__init__.py
* better format condition
* address feedback
2023-09-28 15:42:51 +02:00
ZanSara
83724b74e3
feat: Make metadata optional in AnswerBuilder ( #5909 )
...
* optional metadata
* improve docstring
2023-09-28 14:42:19 +02:00
Stefano Fiorucci
9340c572f9
alternative skipif conditions in azure ocr converter test ( #5906 )
2023-09-28 12:09:19 +02:00
Silvano Cerza
35ec8cc8fb
Rework evaluation and metrics calculation for Haystack 2.x ( #5794 )
...
* draft requirements from discussion
* Add some more information
* Update proposal given new feedback
* More drawbacks
* Decision drivers
* Nitpick
* Summary
* PR number
* Mark code snippets
Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>
* Link correct issue
* Add missing word
* More context on blind evaluation
* Rephrase confusing sentence
* Add a more detailed code example
* Ignore mypy and pylint in example file
---------
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>
2023-09-28 00:51:51 +02:00
Julian Risch
4413675e64
feat: Add TextDocumentSplitter that splits by word, sentence, passage (2.0) ( #5870 )
...
* draft split by word, sentence, passage
* naive way to split sentences without nltk
* reno
* add tests
* make input list of docs, review feedback
* add source_id and more validation
* update docstrings
* add split delimiters back to strings
---------
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2023-09-27 12:26:20 +02:00