Stefano Fiorucci
2a9a6401d2
chore: pin openai>=1.56.1 ( #8632 )
...
* pin openai>=1.56.1
* release note
2024-12-12 16:26:38 +01:00
David S. Batista
3f77d3ab6c
feat!: unify NLTKDocumentSplitter and DocumentSplitter ( #8617 )
...
* wip: initial import
* wip: refactoring
* wip: refactoring tests
* wip: refactoring tests
* making all NLTKSplitter related tests work
* refactoring
* docstrings
* refactoring and removing NLTKDocumentSplitter
* fixing tests for custom sentence tokenizer
* fixing tests for custom sentence tokenizer
* cleaning up
* adding release notes
* reverting some changes
* cleaning up tests
* fixing serialisation and adding tests
* cleaning up
* wip
* renaming and cleaning
* adding NLTK files
* updating docstring
* adding import to init
* Update haystack/components/preprocessors/document_splitter.py
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
* updating tests
* wip
* adding sentence/period change warning
* fixing LICENSE header
* Update haystack/components/preprocessors/document_splitter.py
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
---------
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2024-12-12 14:22:27 +00:00
David S. Batista
6cceaac15f
docs: add deprecation warning nltk document splitter ( #8628 )
...
* adding deprecation warning
* adding release notes
* adding release notes
* updating message
* Update haystack/components/preprocessors/nltk_document_splitter.py
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
---------
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2024-12-12 15:16:54 +01:00
Stefano Fiorucci
04fc187bc4
chore: remove deprecation warnings related to store_full_path ( #8626 )
...
* remove deprecation warnings related to store_full_path
* unused imports
2024-12-12 09:27:19 +01:00
Michele Pangrazzi
21d53d0ec6
update default value of 'store_full_path' to False in converters ( #8619 )
2024-12-10 16:03:38 +01:00
dependabot[bot]
c78eb9be4e
build(deps): bump readmeio/rdme from 8 to 9 ( #8615 )
...
Bumps [readmeio/rdme](https://github.com/readmeio/rdme ) from 8 to 9.
- [Release notes](https://github.com/readmeio/rdme/releases )
- [Changelog](https://github.com/readmeio/rdme/blob/next/CHANGELOG.md )
- [Commits](https://github.com/readmeio/rdme/compare/v8...v9 )
---
updated-dependencies:
- dependency-name: readmeio/rdme
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-12-10 13:22:08 +01:00
David S. Batista
248dccbdd3
chore: fixing pylint issues ( #8610 )
...
* initial import
* fixing internal methods
* fixing some internal methods
* modify _preprocess
* fixed internal methods
---------
Co-authored-by: anakin87 <stefanofiorucci@gmail.com>
2024-12-09 16:53:37 +00:00
Anton Pelykh
6f983a22ca
fix: add missing stream mime type assignment to the LinkContentFetcher ( #8596 )
...
* add missing stream mime type assignment to the `LinkContentFetcher`
* fix release note fmt
---------
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2024-12-09 14:51:14 +00:00
Stefano Fiorucci
09adf856dc
rm openapi spec util ( #8613 )
2024-12-09 10:59:21 +01:00
ArzelaAscoIi
ed2f37da60
fix: docstring for normalization ( #8604 )
...
* fix: docstring for normalization
* chore: add reno
* fixing docstrings and adding pylint disable too many args
---------
Co-authored-by: David S. Batista <dsbatista@gmail.com>
2024-12-06 17:13:30 +01:00
Michele Pangrazzi
b32f85cca2
remove deprecated 'converter' init parameter from PyPDFToDocument component ( #8609 )
2024-12-06 15:43:43 +01:00
David S. Batista
3da5bac8c4
refactor: converting some DocumentJoiner methods to staticmethod ( #8606 )
...
* converting some methods to static, since they don't change/depend on state of the object
* adding release notes
* removing tab
2024-12-06 10:28:41 +01:00
David S. Batista
e349a7f2fc
docs: complete docstring for DocumentJoiner code example ( #8593 )
...
* initial import
* changing a method to static
* reverting staticmethod
2024-12-05 14:04:34 +00:00
David S. Batista
2282c26f17
feat!: SentenceWindowRetriever returns List[Document] with docs ordered by split_idx_start ( #8590 )
...
* initial import
* adding a few pylint disable
* adding tests
* fixing integration tests
* adding release notes
* fixing types and docstrings
2024-12-04 16:55:56 +01:00
David S. Batista
f0638b2868
refactor: moving SentenceSplitter outside NLTKDocumentSplitter ( #8599 )
...
* initial import
* fixing imports and renaming file
* fixing imports path
* adding condition to check NLTK successfully imported
* adding one class inside the NLTK imported condition
2024-12-04 10:44:36 +01:00
David S. Batista
c5ef0b2956
chore: adding a deprecation warning on the SentenceWindowRetriever ( #8597 )
...
* linting
* improving message
* fixing header
* adding deprecation in the release notes
2024-12-03 17:41:19 +01:00
Julian Risch
41369b9e0a
chore: Mention breaking changes in PR template ( #8602 )
2024-12-03 17:18:48 +01:00
Amna Mubashar
4c8eb54049
feat: Add store_full_path to converters (3/3) ( #8585 )
...
* Add store_full_path params
2024-12-03 13:48:56 +05:00
Stefano Fiorucci
de7099e560
ci: add job to check imports ( #8594 )
...
* try checking imports
* clarify error message
* better fmt
* do not show complete list of successfully imported packages
* refinements
* relnote
* add missing forward references
* better function name
* linting
* fix linting
* Update .github/utils/check_imports.py
Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
---------
Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
2024-11-29 14:00:59 +00:00
Madeesh Kannan
163c06f3d6
chore: Revert change to deserialization error in Pipeline ( #8591 )
2024-11-28 13:28:52 +01:00
Stefano Fiorucci
c8685aa141
refactor: update components to access ChatMessage.text instead of content ( #8589 )
...
* introduce text property and deprecate content
* release note
* use chatmessage.text
* release note
* linting
2024-11-28 10:16:07 +00:00
Stefano Fiorucci
fb1baf4921
refactor: ChatMessage - introduce text property and deprecate content ( #8588 )
...
* introduce text property and deprecate content
* release note
* minor test refactoring
---------
Co-authored-by: Michele Pangrazzi <xmikex83@gmail.com>
2024-11-28 09:53:02 +00:00
Stefano Fiorucci
51c1390426
chore: use class methods to create ChatMessage ( #8581 )
...
* use class methods to build messages
* fix failing format
2024-11-28 09:35:24 +00:00
Silvano Cerza
473f7bef11
Change Pipeline.from_dict error message
2024-11-28 10:15:06 +01:00
Stefano Fiorucci
fb42c035c5
feat: PyPDFToDocument - add new customization parameters ( #8574 )
...
* deprecate converter in pypdf
* fix linting of MetaFieldGroupingRanker
* linting
* pypdftodocument: add customization params
* fix mypy
* incorporate feedback
2024-11-26 16:37:59 +01:00
Stefano Fiorucci
2440a5ee17
chore: PyPDFToDocument - deprecate converter init parameter ( #8569 )
...
* deprecate converter in pypdf
* fix linting of MetaFieldGroupingRanker
* linting
2024-11-26 14:47:04 +01:00
Haystack Bot
3d95e06722
chore: Update unstable version to 2.9.0-rc0 ( #8582 )
...
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2024-11-26 16:42:12 +05:00
Michele Pangrazzi
f0c3692cf2
Remove is_greedy deprecated argument from @component decorator ( #8580 )
...
* Remove 'is_greedy' deprecated argument from @component decorator
* Remove unused import
v2.9.0-rc0
2024-11-26 10:44:50 +00:00
Vladimir Blagojevic
59f1e182db
feat: Add variable to specify inputs as optional to ConditionalRouter ( #8568 )
...
* Add optional_variables in ConditionalRouter
* Add reno note
* Add more unit test with various complex scenarios
* Add more unit tests
* Add pylint disable=too-many-positional-arguments
* PR feedback from @sjrl
2024-11-26 10:48:55 +01:00
Matt G
e3b73e048b
fix: bug on tracing where components are in a loop in a pipeline ( #8576 )
...
* Fix to tracing parent spans on loops
* Fix linting
* Add release notes
---------
Co-authored-by: Silvano Cerza <silvanocerza@gmail.com>
2024-11-25 14:21:08 +01:00
Silvano Cerza
ab840351f8
Fix DocumentCleaner not preserving Document fields ( #8578 )
2024-11-25 13:08:59 +01:00
Amna Mubashar
9302d3d9f0
feat: Add store_full_path to converters (2/3) ( #8573 )
2024-11-25 15:22:19 +05:00
Sebastian Husch Lee
eace2a99e5
feat: Add Literal["*"] option to required_variables in ChatPromptBuilder and PromptBuilder ( #8572 )
...
* Add new option for required_variables in PromptBuilder and ChatPromptBuilder
* Add reno note
* Add tests
2024-11-22 16:27:50 +01:00
David S. Batista
b5a2fad642
feat: adding Maximum Margin Relevance Ranker ( #8554 )
...
* initial import
* linting
* adding MRR tests
* adding release notes
* fixing tests
* adding linting ignore to cross-encoder ranker
* update docstring
* refactoring
* making strategy Optional instead of Literal
* wip: adding unit tests
* refactoring MMR algorithm
* refactoring tests
* cleaning up and updating tests
* adding empty line between license + code
* bug in tests
* using Enum for strategy and similarity metric
* adding more tests
* adding empty line between license + code
* removing run time params
* PR comments
* PR comments
* fixing
* fixing serialisation
* fixing serialisation tests
* Update haystack/components/rankers/sentence_transformers_diversity.py
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
* Update haystack/components/rankers/sentence_transformers_diversity.py
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
* Update haystack/components/rankers/sentence_transformers_diversity.py
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
* Update haystack/components/rankers/sentence_transformers_diversity.py
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
* Update haystack/components/rankers/sentence_transformers_diversity.py
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
* Update haystack/components/rankers/sentence_transformers_diversity.py
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
* Update haystack/components/rankers/sentence_transformers_diversity.py
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
* fixing tests
* PR comments
* PR comments
* PR comments
* PR comments
---------
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2024-11-22 14:58:45 +00:00
Richard Hudson
a8eeb2024f
feat: Allow unverified OpenAPI calls ( #8562 )
...
* Feed through ssl_verify value to OpenAPI
* Add release note
* Update serialization methods
* Applied black formatting
2024-11-22 15:45:00 +01:00
Amna Mubashar
4e6c7967d9
fix: Remove deprecated Legacy Filter Tests ( #8567 )
2024-11-22 14:39:22 +01:00
Amna Mubashar
21906d0558
feat: Add store_full_path to converters (1/3) ( #8566 )
...
* Add store_full_path param to 3 converters
2024-11-22 13:55:08 +01:00
Stefano Fiorucci
0b2a299378
fix linting of MetaFieldGroupingRanker ( #8570 )
2024-11-22 13:37:04 +01:00
Bilge Yücel
5cc2df16ee
Update the Studio part on README.md ( #8564 )
2024-11-21 12:25:35 +03:00
Ulises M
b1353f4f0f
fix: append runtime meta to ChatMessage's extracted meta in AnswerBuilder ( #8544 )
...
* append runtime meta to extracted meta
* add pylint ignore flag to .run()
* explicitly convert reply to string
2024-11-20 20:07:04 +01:00
Julian Risch
2cc45dd5b9
docs: Update discord invite link in README.md ( #8560 )
2024-11-20 16:27:31 +01:00
Silvano Cerza
3ef8c081be
fix: OpenAIChatGenerator and OpenAIGenerator crashing when streaming with usage tracking ( #8558 )
...
* Fix OpenAIGenerator crashing with tracking usage with streaming enabled
* Fix OpenAIChatGenerator crashing with tracking usage with streaming enabled
* Add release notes
* Fix linting
2024-11-20 10:27:22 +01:00
Daria Fokina
3a30ee352e
Update component name in docstrings ( #8559 )
2024-11-19 17:25:15 +01:00
Sebastian Husch Lee
14895f6573
chore: Use token instead of use_auth_token because of deprecation warning ( #8552 )
...
* Use token instead of use_auth_token because of deprecation warning
* Fix test
* pylint
* fix linting
---------
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2024-11-18 11:58:22 +00:00
Silvano Cerza
bd77120cf3
Fix DocumentSplitter not splitting by function ( #8549 )
...
* Fix DocumentSplitter not splitting by function
* Make the split_by mapping a constant
2024-11-18 11:54:30 +01:00
Silvano Cerza
cea1e3f07e
Add Zed config folder to .gitignore ( #8550 )
2024-11-15 17:36:03 +01:00
Ivo Bellin Salarin
c78545dfc0
feat(openai): be tolerant to exceptions ( #8526 )
...
* feat: be tolerant to exceptions
if ever an error is raised by the OpenAI API, don't fail the entire processing
* fix: missing import, string separator
* Enhance error handling
* Use batched from more_itertools for compatibility with older Python versions
* Fix batching and add test
---------
Co-authored-by: Silvano Cerza <silvanocerza@gmail.com>
2024-11-15 10:52:44 +01:00
Stefano Fiorucci
f085959067
chore: declare requires-python<3.13 in pyproject ( #8547 )
...
* restrict to python<3.13
* try unpinning dulwich
* reintroduce dulwich pin
2024-11-15 09:28:39 +00:00
Sebastian Husch Lee
e45d3329a1
feat: Adding DALLE image generator ( #8448 )
...
* First pass at adding DALLE image generator
* Add missing header
* Fix tests
* Add tests
* Fix mypy
* Make mypy happy
* More unit tests
* Adding release notes
* Add a test for run
* Update haystack/components/generators/openai_dalle.py
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* Fix pylint
* Update haystack/components/generators/openai_dalle.py
Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
* Update haystack/components/generators/openai_dalle.py
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
* Update haystack/components/generators/openai_dalle.py
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
* Update haystack/components/generators/openai_dalle.py
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
* Update haystack/components/generators/openai_dalle.py
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
* Update haystack/components/generators/openai_dalle.py
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
---------
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2024-11-14 16:19:49 +01:00
Sriniketh J
a045c0eabb
feat: added split by line to DocumentSplitter ( #8525 )
...
* feat: added split by line to DocumentSplitter
* fix: pr review comments
Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>
---------
Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>
2024-11-14 16:09:01 +01:00