HardikBandhiya
431902b0db
fix: In README.md Community Section the Twitter's new name updated to 𝕏. ( #6203 )
...
* Update README.md
In the README.md Community section, Twitter's name has not been updated to its new name, 𝕏
* Update README.md
---------
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-10-31 09:33:10 +01:00
Silvano Cerza
76d5142bb8
Refactor: Document serialization and backward compatibility ( #6180 )
...
* Rework Document serialisation
* Make Document backward compatible
* Fix InMemoryDocumentStore filters
* Fix InMemoryDocumentStore.bm25_retrieval
* Add release notes
* Fix pylint failures
* Enhance Document kwargs handling and docstrings
* cosmetics
---------
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-10-30 17:03:06 +01:00
Daria Fokina
f2bd6e266f
update anthropic doc link ( #6198 )
2023-10-30 15:04:47 +01:00
github-actions[bot]
104d4c374a
Update unstable version ( #6197 )
...
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
v1.23.0-rc0
2023-10-30 12:57:13 +01:00
Ayush Jain
655bf68b7a
fix: Add search_engine_kwargs param to WebRetriever to pass to WebSearch ( #5805 )
...
* Add search_engine_kwargs param to WebRetriever to pass to WebSearch
* add relnote
---------
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-10-30 12:50:00 +01:00
Stefano Fiorucci
1a64fc594c
chore: make __init__.py files uniform in preview ( #6187 )
...
* align __init__
* fix wrong class name
2023-10-30 12:31:22 +01:00
Utsav Paul
447191352d
fix: Update Indentation in dense.py ( #6167 )
...
* Update Indentation in dense.py
* Update dense.py
---------
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-10-30 12:18:08 +01:00
Utsav Paul
7b605e148e
Update openai.py ( #6166 )
...
* Update openai.py
* escape new lines
* Update openai.py
---------
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-10-30 12:17:47 +01:00
Nripesh Niketan
708d33a657
feat: add apple silicon GPU acceleration ( #6151 )
...
* feat: add apple silicon GPU acceleration
* add release notes
* small fix
* Update utils.py
* Update utils.py
* ci fix mps
* Revert "ci fix mps"
This reverts commit 783ae503940d9ff8270a970a321549fb9e69dce7.
* mps fix
* Update experiment_tracking.py
* try removing upper watermark limit
* disable mps CI
* Use xl runner
* initialise env
* small fix
* black linting
---------
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-10-30 11:26:46 +01:00
Massimiliano Pippi
789e524de3
remove leftovers from 1.18 ( #6196 )
2023-10-30 11:25:54 +01:00
Domenico
6bddc5c78a
fix: missing closing quotation marks ( #6195 )
...
The proposal was missing closing quotation marks so it was formatted badly
2023-10-30 10:34:55 +01:00
Domenico
4196102a56
proposal: meta field ranker ( #6141 )
...
* proposal: meta field ranker
* Apply suggestions from code review
Co-authored-by: ZanSara <sarazanzo94@gmail.com>
* update proposal filename
* feat: add metafield ranker
* Revert "feat: add metafield ranker"
This reverts commit be760d8b037a3e1a37539c8002edde9d322c874a.
---------
Co-authored-by: ZanSara <sarazanzo94@gmail.com>
2023-10-30 09:24:23 +01:00
Aazam Thakur
3f3d2c3474
docs: add return types in docstring for transcribe and run function ( #6177 )
2023-10-28 20:33:22 +02:00
dependabot[bot]
8a2a3c9a3f
build(deps): bump tj-actions/changed-files from 39 to 40 ( #6175 )
...
Bumps [tj-actions/changed-files](https://github.com/tj-actions/changed-files) from 39 to 40.
- [Release notes](https://github.com/tj-actions/changed-files/releases)
- [Changelog](https://github.com/tj-actions/changed-files/blob/main/HISTORY.md)
- [Commits](https://github.com/tj-actions/changed-files/compare/v39...v40)
---
updated-dependencies:
- dependency-name: tj-actions/changed-files
  dependency-type: direct:production
  update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-10-27 16:58:38 +02:00
Julian Risch
db36d6277a
docs: Add readme sync for API docs 2.0 ( #6173 )
...
* add sync docs for preview
* add example config for audio docs
* hardcode version in renderer
* use custom renderer for preview docs
* update comment and excerpt
2023-10-27 14:53:03 +02:00
Vladimir Blagojevic
f76fc04ed0
feat: Add StreamingChunk dataclass to Haystack 2.x ( #6174 )
...
* Add StreamingChunk
* Add release note
* Use default value init for metadata, turn off hashing
* Add unit tests
2023-10-26 17:42:52 +02:00
Vladimir Blagojevic
bb295d29ee
Fix failing test ( #6176 )
2023-10-26 17:22:24 +02:00
Ashwin Mathur
5f35e7d04a
refactor: Migrate RemoteWhisperTranscriber to OpenAI SDK. ( #6149 )
...
* Migrate RemoteWhisperTranscriber to OpenAI SDK
* Migrate RemoteWhisperTranscriber to OpenAI SDK
* Remove unnecessary imports
* Add release notes
* Fix api_key serialization
* Fix linting
* Apply suggestions from code review
Co-authored-by: ZanSara <sarazanzo94@gmail.com>
* Add additional tests for api_key
* Adapt .run() to take ByteStream inputs
* Update docstrings
* Rework implementation to use io.BytesIO
* Update error message
* Add default file name
---------
Co-authored-by: ZanSara <sarazanzo94@gmail.com>
2023-10-26 16:25:23 +02:00
Stefano Fiorucci
26a22045e4
improve docstrings ( #6170 )
...
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2023-10-26 13:08:35 +02:00
Massimiliano Pippi
7c07fb3290
Update labeler.yml ( #6169 )
2023-10-26 09:50:43 +02:00
Mani Yadla
3ff79d0307
Update README.md ( #6163 )
2023-10-25 09:42:58 +02:00
Julian Risch
fe3bc15571
chore: Rename ExtractiveReader's input from document to documents to match its type List[Document] ( #6164 )
...
* rename input param, add doc string, add example
* reno
2023-10-24 21:44:15 +02:00
Stefano Fiorucci
1f4ed3cc03
refactor!: rename SimilarityRanker to TransformersSimilarityRanker ( #6100 )
...
* rename
* release note
* Update haystack/preview/components/rankers/transformers_similarity.py
Co-authored-by: Domenico <domenico.cinque98@gmail.com>
* Update haystack/preview/components/rankers/transformers_similarity.py
Co-authored-by: Domenico <domenico.cinque98@gmail.com>
* fix test
---------
Co-authored-by: Domenico <domenico.cinque98@gmail.com>
2023-10-24 19:45:16 +02:00
Grant Williams
1cf70d3dce
build: Upgrade transformers to the latest version 4.34.1 ( #5994 )
...
* Upgrade transformers to the latest version 4.34.0 so that Haystack can support the new Mistral, Nougat, and other models.
* update release notes
* updated missing lazy import
* Update .github workflows imports
* bump more versions in .github workflows
* revert import sorting
* Update to catch runtime errors to match haystack_hub changes
* add language parameter value to whisper test
* bump transformers version in linting preview workflow
* bump transformers version in linting preview workflow
* bump version to v4.34.1
* resolve mypy issue with reused variables
* install openai-whisper without dependencies
* remove audio extra, update whisper install instructions
* remove audio extra, update whisper install instructions
* keep audio extra but add version
* keep audio extra with no constraints
* remove audio extra
---------
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2023-10-24 19:13:12 +02:00
Vladimir Blagojevic
b9b7d7666d
feat: Add dynamic per-user ChatMessage templating support ( #6161 )
...
* Add dynamic per-user ChatMessage templating support
* Add unit tests for dynamic templating
* Update add-dynamic-per-message-templating-908468226c5e3d45.yaml
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* Proper init ValueError raising, unit tests
---------
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-10-24 16:50:45 +02:00
Massimiliano Pippi
dd24210908
feat: add pipeline Yaml marshaller ( #6137 )
...
* add marshaller
* release notes
* add docstrings and missing tests
2023-10-23 19:02:59 +02:00
Silvano Cerza
31fb5b84e7
feature: Add mime_type field to ByteStream ( #6154 )
...
* Add mime_type field to ByteStream
* Add release notes
* Update tests
2023-10-23 16:13:40 +02:00
Vladimir Blagojevic
dcc7e63dc9
feat: Add ChatMessage class to Haystack 2.0 ( #6144 )
...
* Add ChatMessage and ChatRole
2023-10-23 16:08:05 +02:00
Shaurya Agrawal
9d8979af41
feat: Refactor SentenceTransformersDocumentEmbedder.py ( #6143 )
...
* changed sentense_transformers
* added release note
* updated release notes
* Corrected release notes
---------
Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
2023-10-23 14:02:35 +02:00
Silvano Cerza
ae812617fd
Remove Document.array field ( #6139 )
2023-10-23 13:01:15 +02:00
Stefano Fiorucci
047e79f256
refactor: better API keys handling in GPTGenerator ( #6103 )
...
* refactor: do not serialize API keys
* release note
* check if api key is set in the module client
* make tests more robust
* better tests
2023-10-23 12:53:52 +02:00
Ashwin Mathur
101bd816f8
refactor: Remove api_key from serialization of AzureOCRDocumentConverter and SerperDevWebSearch ( #6150 )
...
* Remove api_key from serialization of AzureOCRDocumentConverter
* Remove api_key from serialization of SerperDevWebSearch
* Add release notes
* Add init_fail_without_api_key test for SerperDevWebSearch
* Rename env var to AZURE_AI_API_KEY
2023-10-23 12:26:23 +02:00
Silvano Cerza
c8d162ced9
refactor: Change Document.embedding type to list of floats ( #6135 )
...
* Change Document.embedding type
* Add release notes
* Fix document_store testing
* Fix pylint
* Fix tests
2023-10-23 12:26:05 +02:00
Silvano Cerza
8f289282f1
refactor: Remove id_hash_keys field from Document ( #6127 )
...
* Remove id_hash_fields from Document
* Update release notes
* Remove unused import
2023-10-23 10:35:24 +02:00
Stefano Fiorucci
7e6c6becd6
fix release note ( #6145 )
2023-10-22 11:15:51 +02:00
Silvano Cerza
2a45e7cc06
refactor: Remove id_hash_keys from all file_converters ( #6125 )
...
* Remove id_hash_keys from DocumentCleaner
* Remove id_hash_keys from TextDocumentSplitter
* Remove id_hash_keys from all file_converters
* Fix pylint failure
* Update docstrings
2023-10-20 16:22:14 +02:00
Silvano Cerza
3d69094f9a
refactor: Remove id_hash_keys from TextDocumentSplitter ( #6124 )
...
* Remove id_hash_keys from DocumentCleaner
* Remove id_hash_keys from TextDocumentSplitter
2023-10-20 15:18:28 +02:00
Silvano Cerza
ec376c7dbd
Remove id_hash_keys from DocumentCleaner ( #6123 )
2023-10-20 15:16:06 +02:00
Tuana Çelik
366f0366bf
Update gpt.py docstring ( #6129 )
...
* Update gpt.py docstring
Noticed this slight issue in docstrings for GPTGenerator, so submitting a fix.
* Update haystack/preview/components/generators/openai/gpt.py
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
---------
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2023-10-20 14:45:05 +02:00
Julian Risch
64649312bc
build: Upgrade to canals==0.9.0 ( #6133 )
...
* build: Upgrade to `canals==0.9.0`
* reno
2023-10-20 13:00:24 +02:00
Silvano Cerza
3f98bd9137
refactor: Rework Document.id generation ( #6122 )
...
* Rework Document id generation
* Fix tests
* Add release notes
* Fix failing integration test
* Remove score from Document id generation
* Enhance tests
* Update release notes
---------
Co-authored-by: Julian Risch <julian.risch@deepset.ai>
2023-10-20 10:34:28 +02:00
Sunil Kumar Dash
957d1be68d
Enrich documents with embeddings for OpenAIDocumentEmbedder ( #6126 )
...
* Enrich documents with embeddings
Signed-off-by: sunilkumardash9 <sunilkumardash9@gmail.com>
* add release note
Signed-off-by: sunilkumardash9 <sunilkumardash9@gmail.com>
* try to fix typing
* change embedding field type in Document
---------
Signed-off-by: sunilkumardash9 <sunilkumardash9@gmail.com>
Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
2023-10-19 18:29:16 +02:00
Stefano Fiorucci
fe261b9986
mv StopWordsCriteria under lazy_import ( #6128 )
2023-10-19 17:48:59 +02:00
Stefano Fiorucci
025418c10e
rm unnecessary deps ( #6121 )
2023-10-19 17:01:02 +02:00
Stefano Fiorucci
ff06da8712
pin fastapi ( #6120 )
2023-10-19 13:30:50 +02:00
Stefano Fiorucci
ef40c7c728
refactor: make sure that Document's id_hash_keys has a valid value ( #6112 )
...
* fix handling id_hash_keys
* reno
* handle empty id_hash_keys in post_init
* fix
* reno
* test
2023-10-19 12:10:19 +02:00
Julian Risch
9f3b6512be
refactor: Remove reimplementations of default from_dict/to_dict and corresponding tests in 2.0 ( #6108 )
...
* whisper transcriber
* remove from/to_dict from builders
* remove from/to_dict from embedders
* remove from/to_dict from fetcher, file_converters
* remove from/to_dict from generators, preprocessors
* remove from/to_dict from ranker, reader
* remove from/to_dict from router, sampler, websearch
* pylint
* reno
* refactor import
* remove unused import
2023-10-19 11:17:02 +02:00
Stefano Fiorucci
6df077cbb4
add more-itertools to preview dependencies ( #6110 )
2023-10-18 17:53:48 +02:00
Stefano Fiorucci
21d894d85a
refactor: adopt token instead of use_auth_token in HF components ( #6040 )
...
* move embedding backends
* use token in Sentence Transformers embeddings
* more compact token handling
* token parameter in reader
* add token to ranker
* release note
* add test for reader
2023-10-17 16:32:13 +02:00
Stefano Fiorucci
4e4af99a5e
refactor!: rename MemoryDocumentStore and related Retrievers ( #6076 )
...
* rename doc store and retrievers
* release note
* fix patch
2023-10-17 16:15:16 +02:00