3803 Commits

Author SHA1 Message Date
Stefano Fiorucci
4782bc3e93
unpin trio (#6239) 2023-11-06 13:26:26 +01:00
Stefano Fiorucci
fb96aef4dd
refactor!: move classifiers to an appropriate directory/package (#6240)
* mv classifiers

* release note
2023-11-06 12:00:01 +01:00
Vladimir Blagojevic
d7e1833c40
feat: Add HuggingFaceTGIChatGenerator Haystack 2.x component (#6199)
* Add ChatHuggingFaceTGIGenerator

* Add release note
---------
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
2023-11-06 09:48:45 +01:00
Massimiliano Pippi
03015877f3
chore: pin trio to <0.23 (#6227)
* chore: pin trio to <0.23

* Update Dockerfile.base

---------

Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
2023-11-03 14:46:18 +01:00
Stefano Fiorucci
063d27c522
refactor!: rename TextDocumentSplitter to DocumentSplitter (#6223)
* rename TextDocumentSplitter to DocumentSplitter

* reno

* fix init
2023-11-03 11:33:20 +01:00
Vladimir Blagojevic
6e2dbdc320
feat: Add HuggingFaceTGIGenerator Haystack 2.x component (#6205)
* Add HuggingFaceTGIGenerator

* PR review

* PR feedback from Stefano

---------

Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
2023-11-02 19:35:16 +01:00
Stefano Fiorucci
8511b8cd79
feat: HuggingFaceLocalGenerator- allow passing generation_kwargs in run method (#6220)
* allow custom generation_kwargs in run

* reno

* make pylint ignore too-many-public-methods
2023-11-02 15:29:38 +01:00
Vladimir Blagojevic
f2db68ef0b
fix: Add new rankers to nodes __init__.py (#6219)
* Add new rankers to nodes __init__.py

* Add release note
2023-11-02 10:56:52 +01:00
Vatsalya Vyas
712888ce40
Update README.md (#6208)
Fixed minor typo
2023-10-31 19:02:15 +01:00
Ashwin Mathur
6bf0b9dc7c
feat: Add MarkdownToTextDocument (v2) (#6159)
* Add MarkdownToTextDocument

* Add release notes

* Update GitHub workflows

* Update GitHub workflows

* Refactor code with minimal dependencies

* Update docstrings

* Apply suggestions from code review

Co-authored-by: Daria Fokina <daria.f93@gmail.com>

* Update document with content and meta for backward compatibility

* Refactor Document Class for Backward Compatibility

Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>

* Update tests

* Improve test assertions

---------

Co-authored-by: Daria Fokina <daria.f93@gmail.com>
Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
2023-10-31 18:28:13 +01:00
Techuuu
5e12230436
Update README.md (#6201)
* Update README.md

Fixed typos.

* Update README.md

Done

* Update README.md

Fixed.

* Update README.md

Fixed!

* Update README.md

---------

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
Co-authored-by: Daria Fokina <daria.f93@gmail.com>
2023-10-31 17:27:42 +01:00
Julian Risch
29b1fefaa4
feat: Add DocumentLanguageClassifier 2.0 (#6037)
* add DocumentLanguageClassifier and tests

* reno

* fix import, rename DocumentCleaner

* mark example usage as python code

* add assertions to e2e test

* use deserialized document_store

* Apply suggestions from code review

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>

* remove from/to_dict

* use renamed InMemoryDocumentStore

* adapt to Document refactoring

* improve docstring

* fix test for new Document

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
Co-authored-by: anakin87 <stefanofiorucci@gmail.com>
2023-10-31 15:35:05 +01:00
Massimiliano Pippi
209e349be3
do not run preview tests twice (#6204) 2023-10-31 13:13:32 +01:00
Silvano Cerza
7287657f0e
refactor: Rename Document's text field to content (#6181)
* Rework Document serialisation

Make Document backward compatible

Fix InMemoryDocumentStore filters

Fix InMemoryDocumentStore.bm25_retrieval

Add release notes

Fix pylint failures

Enhance Document kwargs handling and docstrings

Rename Document's text field to content

Fix e2e tests

Fix SimilarityRanker tests

Fix typo in release notes

Rename Document's metadata field to meta (#6183)

* fix bugs

* make linters happy

* fix

* more fix

* match regex

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-10-31 12:44:04 +01:00
Vladimir Blagojevic
c51aa1ee8d
feat: Add general and HF util methods (#6200)
* Add general and hf util methods
2023-10-31 11:13:11 +01:00
HardikBandhiya
431902b0db
fix: In README.md Community Section the Twitter's new name updated to 𝕏. (#6203)
* Update README.md

In the README.md Community section, Twitter's name is not updated to its new name, 𝕏

* Update README.md

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-10-31 09:33:10 +01:00
Silvano Cerza
76d5142bb8
Refactor: Document serialization and backward compatibility (#6180)
* Rework Document serialisation

* Make Document backward compatible

* Fix InMemoryDocumentStore filters

* Fix InMemoryDocumentStore.bm25_retrieval

* Add release notes

* Fix pylint failures

* Enhance Document kwargs handling and docstrings

* cosmetics

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-10-30 17:03:06 +01:00
Daria Fokina
f2bd6e266f
update anthropic doc link (#6198) 2023-10-30 15:04:47 +01:00
github-actions[bot]
104d4c374a
Update unstable version (#6197)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
v1.23.0-rc0
2023-10-30 12:57:13 +01:00
Ayush Jain
655bf68b7a
fix: Add search_engine_kwargs param to WebRetriever to pass to WebSearch (#5805)
* Add search_engine_kwargs param to WebRetriever to pass to WebSearch

* add relnote

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-10-30 12:50:00 +01:00
Stefano Fiorucci
1a64fc594c
chore: make __init__.py files uniform in preview (#6187)
* align __init__

* fix wrong class name
2023-10-30 12:31:22 +01:00
Utsav Paul
447191352d
fix: Update Indentation in dense.py (#6167)
* Update Indentation in dense.py

* Update dense.py

---------

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-10-30 12:18:08 +01:00
Utsav Paul
7b605e148e
Update openai.py (#6166)
* Update openai.py

* escape new lines

* Update openai.py

---------

Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-10-30 12:17:47 +01:00
Nripesh Niketan
708d33a657
feat: add apple silicon GPU acceleration (#6151)
* feat: add apple silicon GPU acceleration

* add release notes

* small fix

* Update utils.py

* Update utils.py

* ci fix mps

* Revert "ci fix mps"

This reverts commit 783ae503940d9ff8270a970a321549fb9e69dce7.

* mps fix

* Update experiment_tracking.py

* try removing upper watermark limit

* disable mps CI

* Use xl runner

* initialise env

* small fix

* black linting

---------

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-10-30 11:26:46 +01:00
Massimiliano Pippi
789e524de3
remove leftovers from 1.18 (#6196) 2023-10-30 11:25:54 +01:00
Domenico
6bddc5c78a
fix: missing closing quotation marks (#6195)
The proposal was missing closing quotation marks so it was formatted badly
2023-10-30 10:34:55 +01:00
Domenico
4196102a56
proposal: meta field ranker (#6141)
* proposal: meta field ranker

* Apply suggestions from code review

Co-authored-by: ZanSara <sarazanzo94@gmail.com>

* update proposal filename

* feat: add metafield ranker

* Revert "feat: add metafield ranker"

This reverts commit be760d8b037a3e1a37539c8002edde9d322c874a.

---------

Co-authored-by: ZanSara <sarazanzo94@gmail.com>
2023-10-30 09:24:23 +01:00
Aazam Thakur
3f3d2c3474
docs: add return types in docstring for transcribe and run function (#6177) 2023-10-28 20:33:22 +02:00
dependabot[bot]
8a2a3c9a3f
build(deps): bump tj-actions/changed-files from 39 to 40 (#6175)
Bumps [tj-actions/changed-files](https://github.com/tj-actions/changed-files) from 39 to 40.
- [Release notes](https://github.com/tj-actions/changed-files/releases)
- [Changelog](https://github.com/tj-actions/changed-files/blob/main/HISTORY.md)
- [Commits](https://github.com/tj-actions/changed-files/compare/v39...v40)

---
updated-dependencies:
- dependency-name: tj-actions/changed-files
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-10-27 16:58:38 +02:00
Julian Risch
db36d6277a
docs: Add readme sync for API docs 2.0 (#6173)
* add sync docs for preview

* add example config for audio docs

* hardcode version in renderer

* use custom renderer for preview docs

* update comment and excerpt
2023-10-27 14:53:03 +02:00
Vladimir Blagojevic
f76fc04ed0
feat: Add StreamingChunk dataclass to Haystack 2.x (#6174)
* Add StreamingChunk

* Add release note

* Use default value init for metadata, turn off hashing

* Add unit tests
2023-10-26 17:42:52 +02:00
Vladimir Blagojevic
bb295d29ee
Fix failing test (#6176) 2023-10-26 17:22:24 +02:00
Ashwin Mathur
5f35e7d04a
refactor: Migrate RemoteWhisperTranscriber to OpenAI SDK. (#6149)
* Migrate RemoteWhisperTranscriber to OpenAI SDK

* Migrate RemoteWhisperTranscriber to OpenAI SDK

* Remove unnecessary imports

* Add release notes

* Fix api_key serialization

* Fix linting

* Apply suggestions from code review

Co-authored-by: ZanSara <sarazanzo94@gmail.com>

* Add additional tests for api_key

* Adapt .run() to take ByteStream inputs

* Update docstrings

* Rework implementation to use io.BytesIO

* Update error message

* Add default file name

---------

Co-authored-by: ZanSara <sarazanzo94@gmail.com>
2023-10-26 16:25:23 +02:00
Stefano Fiorucci
26a22045e4
improve docstrings (#6170)
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2023-10-26 13:08:35 +02:00
Massimiliano Pippi
7c07fb3290
Update labeler.yml (#6169) 2023-10-26 09:50:43 +02:00
Mani Yadla
3ff79d0307
Update README.md (#6163) 2023-10-25 09:42:58 +02:00
Julian Risch
fe3bc15571
chore: Rename ExtractiveReader's input from document to documents to match its type List[Document] (#6164)
* rename input param, add doc string, add example

* reno
2023-10-24 21:44:15 +02:00
Stefano Fiorucci
1f4ed3cc03
refactor!: rename SimilarityRanker to TransformersSimilarityRanker (#6100)
* rename

* release note

* Update haystack/preview/components/rankers/transformers_similarity.py

Co-authored-by: Domenico <domenico.cinque98@gmail.com>

* Update haystack/preview/components/rankers/transformers_similarity.py

Co-authored-by: Domenico <domenico.cinque98@gmail.com>

* fix test

---------

Co-authored-by: Domenico <domenico.cinque98@gmail.com>
2023-10-24 19:45:16 +02:00
Grant Williams
1cf70d3dce
build: Upgrade transformers to the latest version 4.34.1 (#5994)
* Upgrade transformers to the latest version 4.34.0 so that Haystack can support the new Mistral, Nougat, and other models.

* update release notes

* updated missing lazy import

* Update .github workflows imports

* bump more versions in .github workflows

* revert import sorting

* Update  to catch runtime errors to match haystack_hub changes

* add language parameter value to whisper test

* bump transformers version in linting preview workflow

* bump transformers version in linting preview workflow

* bump version to v4.34.1

* resolve mypy issue with reused variables

* install openai-whisper without dependencies

* remove audio extra, update whisper install instructions

* remove audio extra, update whisper install instructions

* keep audio extra but add version

* keep audio extra with no constraints

* remove audio extra

---------

Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2023-10-24 19:13:12 +02:00
Vladimir Blagojevic
b9b7d7666d
feat: Add dynamic per-user ChatMessage templating support (#6161)
* Add dynamic per-user ChatMessage templating support

* Add unit tests for dynamic templating

* Update add-dynamic-per-message-templating-908468226c5e3d45.yaml

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* Proper init ValueError raising, unit tests

---------

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-10-24 16:50:45 +02:00
Massimiliano Pippi
dd24210908
feat: add pipeline Yaml marshaller (#6137)
* add marshaller

* release notes

* add docstrings and missing tests
2023-10-23 19:02:59 +02:00
Silvano Cerza
31fb5b84e7
feature: Add mime_type field to ByteStream (#6154)
* Add mime_type field to ByteStream

* Add release notes

* Update tests
2023-10-23 16:13:40 +02:00
Vladimir Blagojevic
dcc7e63dc9
feat: Add ChatMessage class to Haystack 2.0 (#6144)
* Add ChatMessage and ChatRole
2023-10-23 16:08:05 +02:00
Shaurya Agrawal
9d8979af41
feat: Refactor SentenceTransformersDocumentEmbedder.py (#6143)
* changed sentence_transformers

* added release note

* updated release notes

* Corrected release notes

---------

Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
2023-10-23 14:02:35 +02:00
Silvano Cerza
ae812617fd
Remove Document.array field (#6139) 2023-10-23 13:01:15 +02:00
Stefano Fiorucci
047e79f256
refactor: better API keys handling in GPTGenerator (#6103)
* refactor: do not serialize API keys

* release note

* check if api key is set in the module client

* make tests more robust

* better tests
2023-10-23 12:53:52 +02:00
Ashwin Mathur
101bd816f8
refactor: Remove api_key from serialization of AzureOCRDocumentConverter and SerperDevWebSearch (#6150)
* Remove api_key from serialization of AzureOCRDocumentConverter

* Remove api_key from serialization of SerperDevWebSearch

* Add release notes

* Add init_fail_without_api_key test for SerperDevWebSearch

* Rename env var to AZURE_AI_API_KEY
2023-10-23 12:26:23 +02:00
Silvano Cerza
c8d162ced9
refactor: Change Document.embedding type to list of floats (#6135)
* Change Document.embedding type

* Add release notes

* Fix document_store testing

* Fix pylint

* Fix tests
2023-10-23 12:26:05 +02:00
Silvano Cerza
8f289282f1
refactor: Remove id_hash_keys field from Document (#6127)
* Remove id_hash_fields from Document

* Update release notes

* Remove unused import
2023-10-23 10:35:24 +02:00
Stefano Fiorucci
7e6c6becd6
fix release note (#6145) 2023-10-22 11:15:51 +02:00