7 Commits

Author SHA1 Message Date
recrudesce
38768bffdf
fix: Tiktoken does not support Azure gpt-35-turbo (#4739)
* force support for gpt-35-turbo

Because Tiktoken doesn't support it yet; see https://github.com/openai/tiktoken/pull/72

* Update openai_utils.py

* Appeasing the linting gods

Why hast thou forsaken me?

* Remove trailing whitespace

* chg: remove redundant elif block
2023-04-25 16:43:24 +02:00
Silvano Cerza
b70715a74d
Remove retry_with_exponential_backoff in favor of tenacity (#4460)
2023-03-24 11:14:11 +01:00
Florian Hardow
462484445d
feat: break retry loop for 401 unauthorized errors in promptnode (#4389)
* feat: break retry loop for 401 unauthorized errors in promptnode

* Fix black, pylint, mypy

* Update haystack/nodes/retriever/_embedding_encoder.py

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* Update haystack/utils/openai_utils.py

Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>

* chore: blackify project

* chore: fix linting error (remove elif after raise)

---------

Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
Co-authored-by: Silvano Cerza <3314350+silvanocerza@users.noreply.github.com>
2023-03-17 17:07:08 +01:00
Vladimir Blagojevic
53528c96a0
feat: Add ChatGPT PromptNode layer (#4357)
* Initial ChatGPTInvocationLayer
Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
Co-authored-by: agnieszka-m <amarzec13@gmail.com>
Co-authored-by: Sebastian <sjrl@users.noreply.github.com>
2023-03-17 14:16:41 +01:00
Ahmed Nabil
d29342c8bf
feat: Add the New Tokenizer of gpt-3.5-turbo (#4331)
* Updated the tokenizer algorithm and pyproject.toml tiktoken version

* Update haystack/utils/openai_utils.py

Co-authored-by: Sebastian <sjrl@users.noreply.github.com>

* Update references in openai_utils.py

* Update docs/pydoc/config/extractor.yml

Co-authored-by: Sebastian <sjrl@users.noreply.github.com>

* Update docs/pydoc/config/document-classifier.yml

Co-authored-by: Sebastian <sjrl@users.noreply.github.com>

* Update docs/pydoc/config/file-converters.yml

Co-authored-by: Sebastian <sjrl@users.noreply.github.com>

* Update docs/pydoc/config/file-classifier.yml

Co-authored-by: Sebastian <sjrl@users.noreply.github.com>

* Update docs/pydoc/config/other.yml

Co-authored-by: Sebastian <sjrl@users.noreply.github.com>

* Update docs/pydoc/config/pipelines.yml

Co-authored-by: Sebastian <sjrl@users.noreply.github.com>

* Update docs/pydoc/config/preprocessor.yml

Co-authored-by: Sebastian <sjrl@users.noreply.github.com>

* Update docs/pydoc/config/primitives.yml

Co-authored-by: Sebastian <sjrl@users.noreply.github.com>

* Update docs/pydoc/config/translator.yml

Co-authored-by: Sebastian <sjrl@users.noreply.github.com>

* Update docs/pydoc/config/crawler.yml

Co-authored-by: Sebastian <sjrl@users.noreply.github.com>

* Update docs/pydoc/config/prompt-node.yml

Co-authored-by: Sebastian <sjrl@users.noreply.github.com>

* Update docs/pydoc/config/pseudo-label-generator.yml

Co-authored-by: Sebastian <sjrl@users.noreply.github.com>

* Update docs/pydoc/config/query-classifier.yml

Co-authored-by: Sebastian <sjrl@users.noreply.github.com>

* Update docs/pydoc/config/question-generator.yml

Co-authored-by: Sebastian <sjrl@users.noreply.github.com>

* Update docs/pydoc/config/reader.yml

Co-authored-by: Sebastian <sjrl@users.noreply.github.com>

* Update docs/pydoc/config/ranker.yml

Co-authored-by: Sebastian <sjrl@users.noreply.github.com>

* Update docs/pydoc/config/retriever.yml

Co-authored-by: Sebastian <sjrl@users.noreply.github.com>

* Update docs/pydoc/config/transformers-img-to-text.yml

Co-authored-by: Sebastian <sjrl@users.noreply.github.com>

* Update openai_utils.py

Adding GPT-4 tokenization handler

* try to fix black

---------

Co-authored-by: Sebastian <sjrl@users.noreply.github.com>
Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com>
2023-03-17 08:20:57 +01:00
Vladimir Blagojevic
f13501309e
OpenAI streaming support (#4397)
2023-03-15 18:24:47 +01:00
Sebastian
1a42166978
fix: Prevent going past token limit in OpenAI calls in PromptNode (#4179)
* Refactoring to remove duplicate code when using OpenAI API

* Adding docstrings

* Fix mypy issue

* Moved retry mechanism to openai_request function in openai_utils

* Migrate OpenAI embedding encoder to use the openai_request util function.

* Adding docstrings.

* pylint import errors

* More pylint import errors

* Move construction of headers into openai_request and api_key as input variable.

* Made _openai_text_completion_tokenization_details so it can be reused in PromptNode and OpenAIAnswerGenerator

* Add prompt truncation to the PromptNode.

* Removed commented out test.

* Bump version of tiktoken to 0.2.0 so we can use MODEL_TO_ENCODING to automatically determine correct tokenizer for the requested model

* Change one method back to public

* Fixed bug in token length truncation. Included answer length into truncation amount. Moved truncation higher up to PromptNode level.

* Pylint error

* Improved warning message

* Added _ensure_token_limit for HFLocalInvocationLayer. Had to remove max_length from base PromptModelInvocationLayer to ensure that max_length has a default value.

* Adding tests

* Expanded on doc strings

* Updated tests

* Update docstrings

* Update tests, and go back to how USE_TIKTOKEN was used before.

* Update haystack/nodes/prompt/prompt_node.py

Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>

* Update haystack/nodes/prompt/prompt_node.py

Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>

* Update haystack/nodes/prompt/prompt_node.py

Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>

* Update haystack/nodes/retriever/_openai_encoder.py

Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>

* Update haystack/utils/openai_utils.py

Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>

* Update haystack/utils/openai_utils.py

Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>

* Updated docstrings, and added integration marks

* Remove comment

* Update test

* Fix test

* Update test

* Updated openai_request function to work with the azure api

* Fixed error in _openai_encoder.py

---------

Co-authored-by: Agnieszka Marzec <97166305+agnieszka-m@users.noreply.github.com>
Co-authored-by: Vladimir Blagojevic <dovlex@gmail.com>
2023-03-03 13:49:21 +01:00