3778 Commits

Author SHA1 Message Date
Stefano Fiorucci
2a9a6401d2
chore: pin openai>=1.56.1 (#8632)
* pin openai>=1.56.1

* release note
2024-12-12 16:26:38 +01:00
David S. Batista
3f77d3ab6c
!feat: unify NLTKDocumentSplitter and DocumentSplitter (#8617)
* wip: initial import

* wip: refactoring

* wip: refactoring tests

* wip: refactoring tests

* making all NLTKSplitter related tests work

* refactoring

* docstrings

* refactoring and removing NLTKDocumentSplitter

* fixing tests for custom sentence tokenizer

* fixing tests for custom sentence tokenizer

* cleaning up

* adding release notes

* reverting some changes

* cleaning up tests

* fixing serialisation and adding tests

* cleaning up

* wip

* renaming and cleaning

* adding NLTK files

* updating docstring

* adding import to init

* Update haystack/components/preprocessors/document_splitter.py

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

* updating tests

* wip

* adding sentence/period change warning

* fixing LICENSE header

* Update haystack/components/preprocessors/document_splitter.py

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>

---------

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2024-12-12 14:22:27 +00:00
David S. Batista
6cceaac15f
docs: add deprecation warning nltk document splitter (#8628)
* adding deprecation warning

* adding release notes

* adding release notes

* updating message

* Update haystack/components/preprocessors/nltk_document_splitter.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

---------

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2024-12-12 15:16:54 +01:00
Stefano Fiorucci
04fc187bc4
chore: remove deprecation warnings related to store_full_path (#8626)
* remove deprecation warnings related to store_full_path

* unused imports
2024-12-12 09:27:19 +01:00
Michele Pangrazzi
21d53d0ec6
update default value of 'store_full_path' to False in converters (#8619) 2024-12-10 16:03:38 +01:00
dependabot[bot]
c78eb9be4e
build(deps): bump readmeio/rdme from 8 to 9 (#8615)
Bumps [readmeio/rdme](https://github.com/readmeio/rdme) from 8 to 9.
- [Release notes](https://github.com/readmeio/rdme/releases)
- [Changelog](https://github.com/readmeio/rdme/blob/next/CHANGELOG.md)
- [Commits](https://github.com/readmeio/rdme/compare/v8...v9)

---
updated-dependencies:
- dependency-name: readmeio/rdme
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-12-10 13:22:08 +01:00
David S. Batista
248dccbdd3
chore: fixing pylint issues (#8610)
* initial import

* fixing internal methods

* fixing some internal methods

* modify _preprocess

* fixed internal methods

---------

Co-authored-by: anakin87 <stefanofiorucci@gmail.com>
2024-12-09 16:53:37 +00:00
Anton Pelykh
6f983a22ca
fix: add missing stream mime type assignment to the LinkContentFetcher (#8596)
* add missing stream mime type assignment to the `LinkContentFetcher`

* fix release note fmt

---------

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2024-12-09 14:51:14 +00:00
Stefano Fiorucci
09adf856dc
rm openapi spec util (#8613) 2024-12-09 10:59:21 +01:00
ArzelaAscoIi
ed2f37da60
fix: docstring for normalization (#8604)
* fix: docstring for normalization

* chore: add reno

* fixing docstrings and adding pylint disable too many args

---------

Co-authored-by: David S. Batista <dsbatista@gmail.com>
2024-12-06 17:13:30 +01:00
Michele Pangrazzi
b32f85cca2
remove deprecated 'converter' init parameter from PyPDFToDocument component (#8609) 2024-12-06 15:43:43 +01:00
David S. Batista
3da5bac8c4
refactor: converting some DocumentJoiner methods to staticmethod (#8606)
* converting some methods to static, since they change/depend on state of the object

* adding release notes

* removing tab
2024-12-06 10:28:41 +01:00
David S. Batista
e349a7f2fc
docs: complete docstring for DocumentJoiner code example (#8593)
* initial import

* changing a method to static

* reverting staticmethod
2024-12-05 14:04:34 +00:00
David S. Batista
2282c26f17
feat!: SentenceWindowRetriever returns List[Document] with docs ordered by split_idx_start (#8590)
* initial import

* adding a few pylint disable

* adding tests

* fixing integration tests

* adding release notes

* fixing types and docstrings
2024-12-04 16:55:56 +01:00
David S. Batista
f0638b2868
refactor: moving SentenceSplitter outside NLTKDocumentSplitter (#8599)
* initial import

* fixing imports and renaming file

* fixing imports path

* adding condition to check NLTK successfully imported

* adding one class inside the NLTK imported condition
2024-12-04 10:44:36 +01:00
David S. Batista
c5ef0b2956
chore: adding a deprecation warning on the SentenceWindowRetriever (#8597)
* linting

* improving message

* fixing header

* adding deprecation in the release notes
2024-12-03 17:41:19 +01:00
Julian Risch
41369b9e0a
chore: Mention breaking changes in PR template (#8602) 2024-12-03 17:18:48 +01:00
Amna Mubashar
4c8eb54049
feat: Add store_full_path to converters (3/3) (#8585)
* Add store_full_path params
2024-12-03 13:48:56 +05:00
Stefano Fiorucci
de7099e560
ci: add job to check imports (#8594)
* try checking imports

* clarify error message

* better fmt

* do not show complete list of successfully imported packages

* refinements

* relnote

* add missing forward references

* better function name

* linting

* fix linting

* Update .github/utils/check_imports.py

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>

---------

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
2024-11-29 14:00:59 +00:00
Madeesh Kannan
163c06f3d6
chore: Revert change to deserialization error in Pipeline (#8591) 2024-11-28 13:28:52 +01:00
Stefano Fiorucci
c8685aa141
refactor: update components to access ChatMessage.text instead of content (#8589)
* introduce text property and deprecate content

* release note

* use chatmessage.text

* release note

* linting
2024-11-28 10:16:07 +00:00
Stefano Fiorucci
fb1baf4921
refactor: ChatMessage - introduce text property and deprecate content (#8588)
* introduce text property and deprecate content

* release note

* minor test refactoring

---------

Co-authored-by: Michele Pangrazzi <xmikex83@gmail.com>
2024-11-28 09:53:02 +00:00
Stefano Fiorucci
51c1390426
chore: use class methods to create ChatMessage (#8581)
* use class methods to build messages

* fix failing format
2024-11-28 09:35:24 +00:00
Silvano Cerza
473f7bef11 Change Pipeline.from_dict error message 2024-11-28 10:15:06 +01:00
Stefano Fiorucci
fb42c035c5
feat: PyPDFToDocument - add new customization parameters (#8574)
* deprecat converter in pypdf

* fix linting of MetaFieldGroupingRanker

* linting

* pypdftodocument: add customization params

* fix mypy

* incorporate feedback
2024-11-26 16:37:59 +01:00
Stefano Fiorucci
2440a5ee17
chore:PyPDFToDocument - deprecate converter init parameter (#8569)
* deprecat converter in pypdf

* fix linting of MetaFieldGroupingRanker

* linting
2024-11-26 14:47:04 +01:00
Haystack Bot
3d95e06722
chore: Update unstable version to 2.9.0-rc0 (#8582)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2024-11-26 16:42:12 +05:00
Michele Pangrazzi
f0c3692cf2
Remove is_greedy deprecated argument from @component decorator (#8580)
* Remove 'is_greedy' deprecated argument from @component decorator

* Remove unused import
v2.9.0-rc0
2024-11-26 10:44:50 +00:00
Vladimir Blagojevic
59f1e182db
feat: Add variable to specify inputs as optional to ConditionalRouter (#8568)
* Add optional_variables in ConditionalRouter

* Add reno note

* Add more unit test with various complex scenarios

* Add more unit tests

* Add pylint disable=too-many-positional-arguments

* PR feedback from @sjrl
2024-11-26 10:48:55 +01:00
Matt G
e3b73e048b
fix: bug on tracing where components are in a loop in a pipeline (#8576)
* Fix to tracing parent spans on loops

* Fix linting

* Add release notes

---------

Co-authored-by: Silvano Cerza <silvanocerza@gmail.com>
2024-11-25 14:21:08 +01:00
Silvano Cerza
ab840351f8
Fix DocumentCleaner not preserving Document fields (#8578) 2024-11-25 13:08:59 +01:00
Amna Mubashar
9302d3d9f0
feat: Add store_full_path to converters (2/3) (#8573) 2024-11-25 15:22:19 +05:00
Sebastian Husch Lee
eace2a99e5
feat: Add Literal["*"] option to required_variables in ChatPrompBuilder and PromptBuilder (#8572)
* Add new option for required_variables in PromptBuilder and ChatPromptBuilder

* Add reno note

* Add tests
2024-11-22 16:27:50 +01:00
David S. Batista
b5a2fad642
feat: adding Maximum Margin Relevance Ranker (#8554)
* initial import

* linting

* adding MRR tests

* adding release notes

* fixing tests

* adding linting ignore to cross-encoder ranker

* update docstring

* refactoring

* making strategy Optional instead of Literal

* wip: adding unit tests

* refactoring MMR algorithm

* refactoring tests

* cleaning up and updating tests

* adding empty line between license + code

* bug in tests

* using Enum for strategy and similarity metric

* adding more tests

* adding empty line between license + code

* removing run time params

* PR comments

* PR comments

* fixing

* fixing serialisation

* fixing serialisation tests

* Update haystack/components/rankers/sentence_transformers_diversity.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/rankers/sentence_transformers_diversity.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/rankers/sentence_transformers_diversity.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/rankers/sentence_transformers_diversity.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/rankers/sentence_transformers_diversity.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/rankers/sentence_transformers_diversity.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/rankers/sentence_transformers_diversity.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* fixing tests

* PR comments

* PR comments

* PR comments

* PR comments

---------

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2024-11-22 14:58:45 +00:00
Richard Hudson
a8eeb2024f
feat: Allow unverified OpenAPI calls (#8562)
* Feed through ssl_verify value to OpenAPI

* Add release note

* Update serialization methods

* Applied black formatting
2024-11-22 15:45:00 +01:00
Amna Mubashar
4e6c7967d9
fix: Remove deprecated Legacy Filter Tests (#8567) 2024-11-22 14:39:22 +01:00
Amna Mubashar
21906d0558
feat: Add store_full_path to converters (1/3) (#8566)
* Add store_full_path param to 3 converters
2024-11-22 13:55:08 +01:00
Stefano Fiorucci
0b2a299378
fix linting of MetaFieldGroupingRanker (#8570) 2024-11-22 13:37:04 +01:00
Bilge Yücel
5cc2df16ee
Update the Studio part on README.md (#8564) 2024-11-21 12:25:35 +03:00
Ulises M
b1353f4f0f
fix: append runtime meta to ChatMessage's extracted meta in AnswerBuilder (#8544)
* append runtime meta to extracted meta

* add pylint ignore flag to .run()

* explicitly convert reply to string
2024-11-20 20:07:04 +01:00
Julian Risch
2cc45dd5b9
docs: Update discord invite link in README.md (#8560) 2024-11-20 16:27:31 +01:00
Silvano Cerza
3ef8c081be
fix: OpenAIChatGenerator and OpenAIGenerator crashing when streaming with usage tracking (#8558)
* Fix OpenAIGenerator crashing with tracking usage with streaming enabled

* Fix OpenAIChatGenerator crashing with tracking usage with streaming enabled

* Add release notes

* Fix linting
2024-11-20 10:27:22 +01:00
Daria Fokina
3a30ee352e
Update component name in docsrtrings (#8559) 2024-11-19 17:25:15 +01:00
Sebastian Husch Lee
14895f6573
chore: Use token instead of use_auth_token because of deprecation warning (#8552)
* Use token instead of use_auth_token because of deprecation warning

* Fix test

* pylint

* fix linting

---------

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
2024-11-18 11:58:22 +00:00
Silvano Cerza
bd77120cf3
Fix DocumentSplitter not splitting by function (#8549)
* Fix DocumentSplitter not splitting by function

* Make the split_by mapping a constant
2024-11-18 11:54:30 +01:00
Silvano Cerza
cea1e3f07e
Add Zed config folder to .gitignore (#8550) 2024-11-15 17:36:03 +01:00
Ivo Bellin Salarin
c78545dfc0
feat(openai): be tolerant to exceptions (#8526)
* feat: be tolerant to exceptions

if ever an error is raised by the OpenAI API, don't fail the entire processing

* fix: missing import, string separator

* Enhance error handling

* Use batched from more_itertools for compatibility with older Python versions

* Fix batching and add test

---------

Co-authored-by: Silvano Cerza <silvanocerza@gmail.com>
2024-11-15 10:52:44 +01:00
Stefano Fiorucci
f085959067
chore: declare requires-python<3.13 in pyproject (#8547)
* restrict to python<3.13

* try unpinning dulwich

* reintroduce dulwich pin
2024-11-15 09:28:39 +00:00
Sebastian Husch Lee
e45d3329a1
feat: Adding DALLE image generator (#8448)
* First pass at adding DALLE image generator

* Add missing header

* Fix tests

* Add tests

* Fix mypy

* Make mypy happy

* More unit tests

* Adding release notes

* Add a test for run

* Update haystack/components/generators/openai_dalle.py

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* Fix pylint

* Update haystack/components/generators/openai_dalle.py

Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>

* Update haystack/components/generators/openai_dalle.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/generators/openai_dalle.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/generators/openai_dalle.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/generators/openai_dalle.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

* Update haystack/components/generators/openai_dalle.py

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>

---------

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
Co-authored-by: Amna Mubashar <amnahkhan.ak@gmail.com>
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2024-11-14 16:19:49 +01:00
Sriniketh J
a045c0eabb
feat: added split by line to DocumentSplitter (#8525)
* feat: added split by line to DocumentSplitter

* fix: pr review comments

Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>

---------

Co-authored-by: Sebastian Husch Lee <sjrl@users.noreply.github.com>
2024-11-14 16:09:01 +01:00