Silvano Cerza
83fce1bd72
Add Store class factory ( #5530 )
...
* Add Store class factory
* Add release notes
2023-08-09 13:09:36 +02:00
HP
ff86af576a
fix: TransformersImageToText.generate_captions accepts "str" #5485 ( #5491 )
...
* fix: TransformersImageToText.generate_captions accepts "str" #5485 -- fix author email
* fix: TransformersImageToText.generate_captions accepts "str" #5485 - fix mypy, pylint, black issues
* fix: TransformersImageToText.generate_captions accepts "str" #5485 - changes after pr review
2023-08-09 09:54:12 +02:00
ZanSara
c27622e1bc
chore: normalize more optional imports ( #5251 )
...
* docstore filters
* modeling metrics
* doc language classifier
* file converter
* docx converter
* tika
* preprocessor
* context matcher
* pylint
2023-08-09 09:27:53 +02:00
Stefano Fiorucci
30e6c7ac43
build: pin safetensors ( #5528 )
...
* pin safetensors
* rm unneeded optional pin
2023-08-08 18:05:56 +02:00
Vladimir Blagojevic
227bf6ca39
feat: Remove template variables from PromptNode invocation kwargs ( #5526 )
...
* Remove template params from kwargs before passing kwargs to invocation layer
* More unit tests
* Add release note
* Enable simple prompt node pipeline integration test use case
2023-08-08 16:40:23 +02:00
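Note on #5526: the idea of removing template variables from the kwargs before they reach the invocation layer can be sketched as follows. This is only an illustration, not the PromptNode code. The helper names `template_variables` and `strip_template_params` are hypothetical, and the sketch assumes template variables are written as plain `{name}` placeholders.

```python
from string import Formatter
from typing import Any, Dict, Set


def template_variables(template: str) -> Set[str]:
    """Collect the placeholder names used in a `{name}`-style template (hypothetical helper)."""
    # Formatter().parse yields (literal_text, field_name, format_spec, conversion) tuples;
    # field_name is None for trailing literal text, so those entries are skipped.
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def strip_template_params(template: str, kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of kwargs without the keys that only fill the template (hypothetical helper)."""
    names = template_variables(template)
    return {key: value for key, value in kwargs.items() if key not in names}


# Only the generation setting is left for the model call.
remaining = strip_template_params(
    "Answer {query} using {documents}",
    {"query": "What is reno?", "documents": ["doc"], "max_length": 50},
)
```

Returning a filtered copy leaves the caller's dictionary untouched, so the template can still be rendered from the original kwargs.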
Vladimir Blagojevic
84ed954c8c
feat: Improve performance and add default media support in FileTypeClassifier ( #5083 )
...
* feat: add media outgoing edge to FileTypeClassifier
* Add release note
* Update language
---------
Co-authored-by: Daniel Bichuetti <daniel.bichuetti@gmail.com>
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
Co-authored-by: agnieszka-m <amarzec13@gmail.com>
2023-08-08 15:51:07 +02:00
tstadel
d46c84bb61
feat: support dynamic filters in custom_query ( #5427 )
...
* support filters in custom_query
* better tests
* Update docstrings
---------
Co-authored-by: agnieszka-m <amarzec13@gmail.com>
2023-08-08 15:48:15 +02:00
Stefano Fiorucci
3f472995bb
refactor: update Crawler to support selenium>=4.11.0 and simplify it ( #5515 )
...
* refactor crawler
* rm unused imports
* release notes!
* rm outdated mock
2023-08-08 15:13:22 +02:00
Vladimir Blagojevic
37cf1fe49c
Tests in e2e/nodes/test_summarizer.py could be removed as pipeline e2e tests cover SearchSummarizationPipeline already ( #5454 )
...
Tests in e2e/nodes/test_translator.py can be removed as unit tests exist for the translator and the e2e test mostly checks that the model is good, which is not something we should test for
2023-08-08 13:21:11 +02:00
Fanli Lin
f6b50cfdf9
fix: StopWordsCriteria doesn't compare the stop word token ids with the input ids in a continuous and sequential order ( #5503 )
...
* bug fix
* add release note
* add unit test
* refactor
---------
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
2023-08-08 08:35:10 +02:00
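Note on #5503: the title says stop word token ids must be compared with the input ids as a continuous, in-order run. A minimal sketch of that kind of check on plain lists of ids follows. It is not the StopWordsCriteria implementation, and the function names are hypothetical. The order-insensitive variant is shown only to illustrate the kind of false positive a non-sequential comparison can produce.

```python
from typing import Sequence


def contains_in_order(input_ids: Sequence[int], stop_ids: Sequence[int]) -> bool:
    """True if stop_ids appears in input_ids as one contiguous, in-order run."""
    n = len(stop_ids)
    if n == 0 or n > len(input_ids):
        return False
    target = list(stop_ids)
    # Slide a window of length n over the input and compare it element by element.
    return any(list(input_ids[i:i + n]) == target for i in range(len(input_ids) - n + 1))


def contains_anywhere(input_ids: Sequence[int], stop_ids: Sequence[int]) -> bool:
    """Order-insensitive membership check, shown only for contrast."""
    return all(token in input_ids for token in stop_ids)


generated = [5, 7, 9, 7, 5]
sequential_hit = contains_in_order(generated, [7, 5])   # ids 7, 5 appear back to back
false_positive = contains_anywhere(generated, [5, 9])   # both ids occur, but never adjacent
sequential_miss = contains_in_order(generated, [5, 9])
```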
Daria Fokina
99cb95a63a
docs: separate abstract classes into separate API references ( #5501 )
...
* separate_abstractions
* img-to-text parent slug upd
* Apply suggestions from code review
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
---------
Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
2023-08-07 12:21:25 +02:00
Massimiliano Pippi
ac4e762422
Fix datadog client init ( #5524 )
2023-08-07 12:18:46 +02:00
Stefano Fiorucci
43d4730b6c
remove reference to the UI directory ( #5522 )
2023-08-07 11:52:37 +02:00
Fanli Lin
4496fc6afd
fix: leading whitespace is missing in the generated text when using stop_words ( #5511 )
...
* bug fix
* add release note
* Update releasenotes/notes/fix-stop-words-strip-issue-22ce51306e7b91e4.yaml
Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
* Update releasenotes/notes/fix-stop-words-strip-issue-22ce51306e7b91e4.yaml
Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
---------
Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
2023-08-04 17:40:19 +02:00
Vladimir Blagojevic
abc6737e63
feat: Improve LFQA Web Example ( #5504 )
...
* Improve web_lfqa example
* Turn off pylint for logging setup
* Another way to turn off logging
2023-08-04 14:20:06 +02:00
Massimiliano Pippi
c079576a87
chore: move base test class into haystack core ( #5509 )
...
* move base test class into haystack core
* fix linter
* do not compute coverage of testing code
2023-08-04 12:42:13 +02:00
tstadel
d26d4201fc
feat: support search_fields in DeepsetCloudDocumentStore ( #5455 )
...
* feat: support search_fields in DeepsetCloudDocumentStore
* add reno file
* make search_fields plain init arg
* Update lg
* Update releasenotes/notes/deepset-cloud-document-store-search-fields-40b2322466f808a3.yaml
* Update haystack/document_stores/deepsetcloud.py
---------
Co-authored-by: agnieszka-m <amarzec13@gmail.com>
2023-08-04 11:13:05 +02:00
Vladimir Blagojevic
d96c963bc4
test: Convert two HFLocalInvocationLayer integration tests to unit tests ( #5446 )
...
* Convert two HFLocalInvocationLayer integration tests to unit tests
* Simplify unit test
* Improve HFLocalInvocationLayer unit tests
2023-08-03 17:41:32 +02:00
Daria Fokina
1f88cd165f
Update hugging_face.py ( #5488 )
2023-08-03 13:34:45 +02:00
bogdankostic
56cea8cbbd
test: Add scripts to send benchmark results to datadog ( #5432 )
...
* Add config files
* log benchmarks to stdout
* Add top-k and batch size to configs
* Add batch size to configs
* fix: don't download files if they already exist
* Add batch size to configs
* refine script
* Remove configs using 1m docs
* update run script
* update run script
* update run script
* datadog integration
* remove out folder
* gitignore benchmarks output
* test: send benchmarks to datadog
* remove uncommented lines in script
* feat: take branch/tag argument for benchmark setup script
* fix: run.sh should ignore errors
* Remove changes unrelated to datadog
* Apply black
* Update test/benchmarks/utils.py
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
* PR feedback
* Account for reader benchmarks not doing indexing
* Change key of reader metrics
* Apply PR feedback
* Remove whitespace
---------
Co-authored-by: rjanjua <rohan.janjua@gmail.com>
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-08-03 10:09:00 +02:00
bogdankostic
a26859f065
docs: Add inherited methods to API reference documentation ( #5273 )
...
* Add inherited methods to API reference documentation
* Fix typing
2023-08-02 18:54:15 +02:00
Vladimir Blagojevic
1876c41f07
feat: Add LostInTheMiddleRanker ( #5457 )
...
* Add lost in the middle ranker
* Add release note
* Julian's feedback: more precise version of truncate
* Better comments for the litm algorithm
* Sebastian PR feedback
* Add check for invalid values of word_count_threshold
* Remove _truncate as it is not needed any more
---------
Co-authored-by: Darja Fokina <daria.f93@gmail.com>
2023-08-02 17:05:13 +02:00
Vladimir Blagojevic
0efe0ee7b3
feat: Add top_k parameter to DiversityRanker init method ( #5494 )
...
* Add top_k
* Add release note
2023-08-02 17:04:04 +02:00
Fanli Lin
8d04f28e11
fix: hf agent outputs the prompt text while the openai agent does not ( #5461 )
...
* add skip prompt
* fix formatting
* add release note
* add release note
* Update releasenotes/notes/add-skip-prompt-for-hf-model-agent-89aef2838edb907c.yaml
Co-authored-by: Daria Fokina <daria.f93@gmail.com>
* Update haystack/nodes/prompt/invocation_layer/handlers.py
Co-authored-by: bogdankostic <bogdankostic@web.de>
* Update haystack/nodes/prompt/invocation_layer/handlers.py
Co-authored-by: bogdankostic <bogdankostic@web.de>
* Update haystack/nodes/prompt/invocation_layer/hugging_face.py
Co-authored-by: bogdankostic <bogdankostic@web.de>
* add a unit test
* add a unit test2
* add skip prompt
* Revert "add skip prompt"
This reverts commit b1ba938c94b67a4fd636d321945990aabd2c5b2a.
* add unit test
---------
Co-authored-by: Daria Fokina <daria.f93@gmail.com>
Co-authored-by: bogdankostic <bogdankostic@web.de>
2023-08-02 16:34:33 +02:00
Fanli Lin
73fa796735
fix: enable passing max_length for text2text-generation task ( #5420 )
...
* bug fix
* add unit test
* reformatting
* add release note
* add release note
* Update releasenotes/notes/enable-set-max-length-during-runtime-097d65e537bf800b.yaml
Co-authored-by: bogdankostic <bogdankostic@web.de>
* Update test/prompt/invocation_layer/test_hugging_face.py
Co-authored-by: bogdankostic <bogdankostic@web.de>
* Update test/prompt/invocation_layer/test_hugging_face.py
Co-authored-by: bogdankostic <bogdankostic@web.de>
* Update test/prompt/invocation_layer/test_hugging_face.py
Co-authored-by: bogdankostic <bogdankostic@web.de>
* Update test/prompt/invocation_layer/test_hugging_face.py
Co-authored-by: bogdankostic <bogdankostic@web.de>
* bug fix
---------
Co-authored-by: bogdankostic <bogdankostic@web.de>
2023-08-02 14:13:30 +02:00
Vladimir Blagojevic
40a2e9b56a
refactor: Update WebRetriever to use LinkContentFetcher ( #5229 )
...
* Refactor WebRetriever to use LinkContentFetcher
* PR feedback
---------
Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
2023-08-02 12:45:03 +02:00
Fanli Lin
f7fd5eeb4f
feat: enable loading tokenizer for models that are not supported by the transformers library ( #5314 )
...
* add tokenizer load
* change import order
* move imports
* refactor code
* import lib
* remove pretrainedmodel
* fix linting
* update patch
* fix order
* remove tokenizer class
* use tokenizer class
* no copy
* add case for model is an instance
* fix optional
* add ut
* set default to None
* change models
* Update haystack/nodes/prompt/invocation_layer/hugging_face.py
Co-authored-by: bogdankostic <bogdankostic@web.de>
* Update haystack/nodes/prompt/invocation_layer/hugging_face.py
Co-authored-by: bogdankostic <bogdankostic@web.de>
* add unit tests
* add unit tests
* remove lib
* formatting
* formatting
* formatting
* add release note
* Update releasenotes/notes/load-tokenizer-if-not-load-by-transformers-5841cdc9ff69bcc2.yaml
Co-authored-by: bogdankostic <bogdankostic@web.de>
---------
Co-authored-by: bogdankostic <bogdankostic@web.de>
2023-08-02 11:42:23 +02:00
bogdankostic
97e4522a83
build: Remove upper bound for weaviate client ( #5486 )
...
* Set upper bound for boto3 and botocore versions
* Set lower bound for weaviate client
* Remove upper bound for version from weaviate
* Add release note
* Update release note
* Remove release note
2023-08-02 11:08:50 +02:00
Bilge Yücel
37bdfddff5
Fix Agent API ( #5483 )
...
* Fix agent.yml for new modules
* Fix ConversationalAgent docstrings
2023-08-01 17:05:13 +03:00
Vladimir Blagojevic
540d0fad97
feat: Add DiversityRanker ( #5398 )
...
* Introduce DiversityRanker
* improve most_diverse_order speed
* Compute mean for numerical stability
* Add release note
* Add cosine similarity
* Test both dot product and cosine similarity
* Add pydocs hook
---------
Co-authored-by: Michel Bartels <login@michelbartels.com>
2023-08-01 12:48:34 +02:00
Malte Pietsch
8c017ccc32
Update installation instructions in README.md ( #5480 )
2023-08-01 12:33:40 +02:00
Silvano Cerza
bc152d953c
Skip running tests in CI when editing docs Python files ( #5482 )
2023-08-01 12:31:24 +02:00
Silvano Cerza
9a359101fd
chore: Rework docs generation ( #5481 )
...
* Change docs generation to use id for parent doc instead of slug
* Rename step
2023-08-01 12:18:33 +02:00
bogdankostic
a51ca19fe4
feat: Add TextFileToDocument component (v2) ( #5467 )
...
* Add TextfileToDocument component
* Add docstrings
* Add unit tests
* Add release note file
* Make use of progress bar
* Add TextfileToDocument to __init__.py
* Use lazy % formatting in logging functions
* Remove f from non-f-string
* Add TextfileToDocument to __init__.py
* Use correct dependency extra
* Compare file path against path object
* PR feedback
* PR feedback
* Update haystack/preview/components/file_converters/txt.py
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
* Update docstrings
* Add error handling
* Add unit test
* Reintroduce falsely removed caplog
---------
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2023-08-01 11:34:52 +02:00
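Note on #5467: the bullet "Use lazy % formatting in logging functions" refers to a general Python logging practice. The sketch below is not taken from the component. The logger name and the `CountingPath` class are made up for the illustration. It shows that an f-string is formatted even when the message is discarded, while %-style arguments are only formatted if the record is actually emitted.

```python
import logging


class CountingPath:
    """Stand-in for a value whose string conversion we want to observe (hypothetical)."""

    def __init__(self) -> None:
        self.calls = 0

    def __str__(self) -> str:
        self.calls += 1
        return "some/file.txt"


logger = logging.getLogger("lazy_logging_sketch")
logger.setLevel(logging.WARNING)  # DEBUG records are dropped by this logger

eager = CountingPath()
logger.debug(f"Converting {eager}")  # the f-string is built before debug() is even called

lazy = CountingPath()
logger.debug("Converting %s", lazy)  # formatting is deferred and never happens here
```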
Muhammad Bilal
8920fd6939
feat: add optional index selection for endpoints ( #5444 )
...
* add index selection
* reformatting
* updated test script
2023-08-01 10:47:46 +02:00
Bilge Yücel
62029ba441
Add AgentStep to api reference ( #5402 )
2023-07-31 19:26:34 +03:00
Stefano Fiorucci
6f534873a5
fix: restrict supports method in the OpenAI invocation layer and a similar method in the EmbeddingRetriever ( #5458 )
...
* restrict OpenAI supports method
* better note
* Update releasenotes/notes/restrict-openai-supports-method-fb126583e4beb057.yaml
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
---------
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2023-07-31 13:14:22 +02:00
Massimiliano Pippi
d9fd1ab7bc
feat!: remove original files after indexing ( #5459 )
...
* remove original files after indexing
* fix tests
2023-07-31 13:07:16 +02:00
Massimiliano Pippi
5f01391827
add workflow to check presence of release notes ( #5449 )
2023-07-27 10:40:40 +02:00
Stefano Fiorucci
672813052d
Update invocation-layers.yml ( #5445 )
2023-07-26 15:39:08 +02:00
Vladimir Blagojevic
409e3471cb
feat: Enable Support for Meta LLama-2 Models in Amazon Sagemaker ( #5437 )
...
* Enable Support for Meta LLama-2 Models in Amazon Sagemaker
* Improve unit test for invocation layers positioning
* Small adjustment, add more unit tests
* mypy fixes
* Improve unit tests
* Update test/prompt/invocation_layer/test_sagemaker_meta.py
Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
* PR feedback
* Add pydocs for newly extracted methods
* simplify is_proper_chat_*
---------
Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
Co-authored-by: anakin87 <stefanofiorucci@gmail.com>
2023-07-26 15:26:39 +02:00
Silvano Cerza
9ab6298f1d
build: Unpin mlflow, constrain dulwich and botocore ( #5441 )
...
* Unpin mlflow
* Pin dulwich
* Pin botocore
2023-07-26 12:59:16 +02:00
Silvano Cerza
7940ec0482
Add @store decorator ( #5438 )
2023-07-26 09:32:23 +02:00
Vladimir Blagojevic
22897c17a2
fix: Improve log warnings in REST API /health endpoint ( #5381 )
...
* Improve warning in REST APIs get_health_status method
* Convert log message
* A better solution and documentation
* Add another nested try/except block
* Simplify
2023-07-25 17:06:03 +02:00
Julian Risch
5bb0a1f57a
Revert "fix: num_return_sequences should be less than num_beams, not top_k ( #5280 )" ( #5434 )
...
This reverts commit 514f93a6eb575d376b21d22e32080fac62cf785f.
2023-07-25 13:27:41 +02:00
Sebastian Husch Lee
2bc7fe1a08
test: reactivate unit tests in test_eval.py ( #5255 )
...
* Activate tests that follow unit test and integration test rules
* Adding more integration labels
* Change name to better reflect complexity of test
* Remove mark integration tags, move test to doc store test for add_eval_data
* Removing incorrect integration label
* Deactivated document store test b/c it fails for Weaviate and pinecone
* Remove unit label since test needs to be refactored to be considered a unit test
* Undo changes
* Undo change
* Check every field in the load evaluation result
* Add back label and add skip reason
* Use pytest skip instead of TODO
2023-07-24 17:07:45 +02:00
Massimiliano Pippi
363f3edbf7
feat: add reno to manage release notes ( #5397 )
...
* first draft
* add release notes
* remove old settings
* add reno usage instructions
* page the docs team when release notes are added
* add reno to the dev dependencies
* Apply suggestions from code review
Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
---------
Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
Co-authored-by: Daria Fokina <daria.fokina@deepset.ai>
2023-07-24 17:02:46 +02:00
github-actions[bot]
afabc785c3
Update unstable version ( #5424 )
...
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2023-07-24 16:59:49 +02:00
bogdankostic
345dbeb638
docs: Add Elasticsearch to API config ( #5422 )
2023-07-24 16:23:13 +02:00
Nicola Procopio
8a2ab82651
feat: Added hybrid search example ( #5376 )
...
* added hybrid search example
Added an example about hybrid search for faq pipeline on covid dataset
* formatted with back formatter
* renamed document
* fixed
* fixed typos
* added test
added test for hybrid search
* fixed whitespaces
* removed test for hybrid search
* fixed pylint
* commented logging
2023-07-24 12:54:21 +02:00