3174 Commits

Author SHA1 Message Date
Massimiliano Pippi
1b63cfc8b3
fix: make types work without installing pypdf (#6269)
* make types work without installing pypdf

* make pylint happy, keep pyright happy, hope mypy doesn't care
2023-11-09 20:02:22 +01:00
Vladimir Blagojevic
b4d8d1c904
feat: Add custom conversion callable to PyPDFToDocument - Haystack 2.x (#6258)
* Allow user specified converter hook

* Add a release note

* More unit tests

* PR review - Massi, use protocol as converter
2023-11-09 17:35:33 +01:00
Agnieszka Marzec
1046bebbe0
Docs: Update docstrings lg (#6260)
* Update docstrings lg

* Update test_in_memory_bm25_retriever.py

* Update test_in_memory_embedding_retriever.py

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-11-09 17:34:52 +01:00
Tuana Çelik
3be6ec7840
Update openai.py (#6263)
2023-11-09 17:33:35 +01:00
Silvano Cerza
73e2843cf1
Fix deprecation warning when calling Document.from_dict() (#6267)
2023-11-09 16:50:06 +01:00
Stefano Fiorucci
f95937b0ce
chore: move HuggingFaceLocalGenerator to the generators directory (#6264)
* move HuggingFaceLocalGenerator to right directory

* fix tests
2023-11-09 15:59:23 +01:00
Stefano Fiorucci
2b3c77e41d
fix: make JoinDocuments correctly handle duplicate documents w null scores (#6261)
* fix error with null values

* release note

* simplify
2023-11-09 14:28:56 +01:00
Domenico
676da681d0
feat: MetaField Ranker (#6189)
* proposal: meta field ranker

* Apply suggestions from code review

Co-authored-by: ZanSara <sarazanzo94@gmail.com>

* update proposal filename

* feat: add metafield ranker

* fix docstrings

* remove proposal file from pr

* add release notes

* update code according to new Document class

* separate loops for each ranking mode in __merge_scores

* change error type in init and new tests for linear score warning

* docstring upd

---------

Co-authored-by: ZanSara <sarazanzo94@gmail.com>
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2023-11-09 12:20:41 +01:00
Sebastian Husch Lee
71d0d92ea2
feat: Add model_kwargs to ExtractiveReader to impact model loading (#6257)
* Add ability to pass model_kwargs to AutoModelForQuestionAnswering

* Add testing for new model_kwargs

* Add spacing

* Add release notes

* Update haystack/preview/components/readers/extractive.py

Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>

* Make changes suggested by Stefano

---------

Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
2023-11-09 11:25:22 +01:00
Vladimir Blagojevic
cd429a73cd
feat: Add GPTChatGenerator to Haystack 2.x (#6212)
* Add GPTChatGenerator

* Apply lessons from previous PR

* PR review - Stefano
2023-11-09 10:45:41 +01:00
Daria Fokina
08e211f9d6
docs: fix whisper_local indentations docstrings (#6209)
* whisper_local indentations

* Update whisper_local.py

* fix param docstrings

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-11-08 18:15:39 +01:00
Stefano Fiorucci
72cbf3ee0b
fix: replace haystack.lazy_imports with haystack.preview.lazy_imports (#6255)
* lazy import transformers in tgi

* fix pylint

* fix wrong import
2023-11-08 17:33:07 +01:00
Massimiliano Pippi
f019896335
ci: Generate release notes in a Github workflow (#6211)
* first try

* Update config.yaml

* Update github_release.yml

* set the rc0 tag more explicitly
2023-11-08 12:29:37 +01:00
jambudipa
2f118e857c
feat: add tokenization details for gpt-4-1106-preview (#6250)
* feat: add tokenization details for gpt-4-1106-preview

* update max_tokens value

* reno

---------

Co-authored-by: jambudipa <mark.norgate@ext.ons.gov.uk>
Co-authored-by: ZanSara <sara.zanzottera@deepset.ai>
2023-11-08 12:04:08 +01:00
Massimiliano Pippi
58e357148e
ci: tag when branching off for a release (#6206)
* tag when branching off

* change minor bump workflow

* Update .github/workflows/minor_version_release.yml

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* Update minor_version_release.yml

---------

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-11-08 11:06:45 +01:00
Silvano Cerza
bf884094d1
refactor: Change Document.blob type and remove mime_type field (#6249)
* Change Document.blob type and remove mime_type field

* Add release notes

* Remove mime_type from Document docstring
2023-11-08 10:35:17 +01:00
Stefano Fiorucci
e2881e2ad3
fix: lazy import transformers in TGI Generators (#6252)
* lazy import transformers in tgi

* fix pylint
2023-11-08 00:09:42 +01:00
Vladimir Blagojevic
5497ca2a45
feat: Adapt GPTGenerator to use str input/output format in Haystack 2.x (#6214)
* Adapt GPTGenerator to string input/output

* Finishing touches

* punctuation upd

* PR feedback

* Small naming fixes

* Update haystack/preview/components/generators/openai.py

Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>

* Update class pydoc with a printed response

---------

Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2023-11-07 18:00:43 +01:00
Silvano Cerza
6c5bfe3da4
Update README links and badges (#6248)
2023-11-07 16:53:17 +01:00
Stefano Fiorucci
9b76acb165
pin openai<1 (#6244)
2023-11-06 18:11:41 +01:00
Stefano Fiorucci
982ac3df01
fix: fix failing e2e test (after moving classifiers) (#6243)
* mv classifiers

* release note

* fix e2e test
2023-11-06 17:08:20 +01:00
Stefano Fiorucci
4782bc3e93
unpin trio (#6239)
2023-11-06 13:26:26 +01:00
Stefano Fiorucci
fb96aef4dd
refactor!: move classifiers to an appropriate directory/package (#6240)
* mv classifiers

* release note
2023-11-06 12:00:01 +01:00
Vladimir Blagojevic
d7e1833c40
feat: Add HuggingFaceTGIChatGenerator Haystack 2.x component (#6199)
* Add ChatHuggingFaceTGIGenerator

* Add release note

---------

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
2023-11-06 09:48:45 +01:00
Massimiliano Pippi
03015877f3
chore: pin trio to <0.23 (#6227)
* chore: pin trio to <0.23

* Update Dockerfile.base

---------

Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
2023-11-03 14:46:18 +01:00
Stefano Fiorucci
063d27c522
refactor!: rename TextDocumentSplitter to DocumentSplitter (#6223)
* rename TextDocumentSplitter to DocumentSplitter

* reno

* fix init
2023-11-03 11:33:20 +01:00
Vladimir Blagojevic
6e2dbdc320
feat: Add HuggingFaceTGIGenerator Haystack 2.x component (#6205)
* Add HuggingFaceTGIGenerator

* PR review

* PR feedback from Stefano

---------

Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
2023-11-02 19:35:16 +01:00
Stefano Fiorucci
8511b8cd79
feat: HuggingFaceLocalGenerator- allow passing generation_kwargs in run method (#6220)
* allow custom generation_kwargs in run

* reno

* make pylint ignore too-many-public-methods
2023-11-02 15:29:38 +01:00
Vladimir Blagojevic
f2db68ef0b
fix: Add new rankers to nodes __init__.py (#6219)
* Add new rankers to nodes __init__.py

* Add release note
2023-11-02 10:56:52 +01:00
Vatsalya Vyas
712888ce40
Update README.md (#6208)
Fixed minor typo
2023-10-31 19:02:15 +01:00
Ashwin Mathur
6bf0b9dc7c
feat: Add MarkdownToTextDocument (v2) (#6159)
* Add MarkdownToTextDocument

* Add release notes

* Update GitHub workflows

* Update GitHub workflows

* Refactor code with minimal dependencies

* Update docstrings

* Apply suggestions from code review

Co-authored-by: Daria Fokina <daria.f93@gmail.com>

* Update document with content and meta for backward compatibility

* Refactor Document Class for Backward Compatibility

Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>

* Update tests

* Improve test assertions

---------

Co-authored-by: Daria Fokina <daria.f93@gmail.com>
Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
2023-10-31 18:28:13 +01:00
Techuuu
5e12230436
Update README.md (#6201)
* Update README.md

Fixed typos.

* Update README.md

Done

* Update README.md

Fixed.

* Update README.md

Fixed!

* Update README.md

---------

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
Co-authored-by: Daria Fokina <daria.f93@gmail.com>
2023-10-31 17:27:42 +01:00
Julian Risch
29b1fefaa4
feat: Add DocumentLanguageClassifier 2.0 (#6037)
* add DocumentLanguageClassifier and tests

* reno

* fix import, rename DocumentCleaner

* mark example usage as python code

* add assertions to e2e test

* use deserialized document_store

* Apply suggestions from code review

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>

* remove from/to_dict

* use renamed InMemoryDocumentStore

* adapt to Document refactoring

* improve docstring

* fix test for new Document

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
Co-authored-by: anakin87 <stefanofiorucci@gmail.com>
2023-10-31 15:35:05 +01:00
Massimiliano Pippi
209e349be3
do not run preview tests twice (#6204)
2023-10-31 13:13:32 +01:00
Silvano Cerza
7287657f0e
refactor: Rename Document's text field to content (#6181)
* Rework Document serialisation

Make Document backward compatible

Fix InMemoryDocumentStore filters

Fix InMemoryDocumentStore.bm25_retrieval

Add release notes

Fix pylint failures

Enhance Document kwargs handling and docstrings

Rename Document's text field to content

Fix e2e tests

Fix SimilarityRanker tests

Fix typo in release notes

Rename Document's metadata field to meta (#6183)

* fix bugs

* make linters happy

* fix

* more fix

* match regex

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-10-31 12:44:04 +01:00
Vladimir Blagojevic
c51aa1ee8d
feat: Add general and HF util methods (#6200)
* Add general and hf util methods
2023-10-31 11:13:11 +01:00
HardikBandhiya
431902b0db
fix: In README.md Community Section the Twitter's new name updated to 𝕏. (#6203)
* Update README.md

In README.md Community section the twitter's name is not updated to new name 𝕏

* Update README.md

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-10-31 09:33:10 +01:00
Silvano Cerza
76d5142bb8
Refactor: Document serialization and backward compatibility (#6180)
* Rework Document serialisation

* Make Document backward compatible

* Fix InMemoryDocumentStore filters

* Fix InMemoryDocumentStore.bm25_retrieval

* Add release notes

* Fix pylint failures

* Enhance Document kwargs handling and docstrings

* cosmetics

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-10-30 17:03:06 +01:00
Daria Fokina
f2bd6e266f
update anthropic doc link (#6198)
2023-10-30 15:04:47 +01:00
github-actions[bot]
104d4c374a
Update unstable version (#6197)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
v1.23.0-rc0
2023-10-30 12:57:13 +01:00
Ayush Jain
655bf68b7a
fix: Add search_engine_kwargs param to WebRetriever to pass to WebSearch (#5805)
* Add search_engine_kwargs param to WebRetriever to pass to WebSearch

* add relnote

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-10-30 12:50:00 +01:00
Stefano Fiorucci
1a64fc594c
chore: make __init__.py files uniform in preview (#6187)
* align __init__

* fix wrong class name
2023-10-30 12:31:22 +01:00
Utsav Paul
447191352d
fix: Update Indentation in dense.py (#6167)
* Update Indentation in dense.py

* Update dense.py

---------

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-10-30 12:18:08 +01:00
Utsav Paul
7b605e148e
Update openai.py (#6166)
* Update openai.py

* escape new lines

* Update openai.py

---------

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-10-30 12:17:47 +01:00
Nripesh Niketan
708d33a657
feat: add apple silicon GPU acceleration (#6151)
* feat: add apple silicon GPU acceleration

* add release notes

* small fix

* Update utils.py

* Update utils.py

* ci fix mps

* Revert "ci fix mps"

This reverts commit 783ae503940d9ff8270a970a321549fb9e69dce7.

* mps fix

* Update experiment_tracking.py

* try removing upper watermark limit

* disable mps CI

* Use xl runner

* initialise env

* small fix

* black linting

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-10-30 11:26:46 +01:00
Massimiliano Pippi
789e524de3
remove leftovers from 1.18 (#6196)
2023-10-30 11:25:54 +01:00
Domenico
6bddc5c78a
fix: missing closing quotation marks (#6195)
The proposal was missing closing quotation marks so it was formatted badly
2023-10-30 10:34:55 +01:00
Domenico
4196102a56
proposal: meta field ranker (#6141)
* proposal: meta field ranker

* Apply suggestions from code review

Co-authored-by: ZanSara <sarazanzo94@gmail.com>

* update proposal filename

* feat: add metafield ranker

* Revert "feat: add metafield ranker"

This reverts commit be760d8b037a3e1a37539c8002edde9d322c874a.

---------

Co-authored-by: ZanSara <sarazanzo94@gmail.com>
2023-10-30 09:24:23 +01:00
Aazam Thakur
3f3d2c3474
docs: add return types in docstring for transcribe and run function (#6177)
2023-10-28 20:33:22 +02:00
dependabot[bot]
8a2a3c9a3f
build(deps): bump tj-actions/changed-files from 39 to 40 (#6175)
Bumps [tj-actions/changed-files](https://github.com/tj-actions/changed-files) from 39 to 40.
- [Release notes](https://github.com/tj-actions/changed-files/releases)
- [Changelog](https://github.com/tj-actions/changed-files/blob/main/HISTORY.md)
- [Commits](https://github.com/tj-actions/changed-files/compare/v39...v40)

---
updated-dependencies:
- dependency-name: tj-actions/changed-files
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-10-27 16:58:38 +02:00