16 Commits

Author SHA1 Message Date
Li Jiang
5a96dc2c29
Add source to the answer for default prompt (#2289)
* Add source to the answer for default prompt

* Fix qdrant

* Fix tests

* Update docstring

* Fix check files

* Fix qdrant test error
2024-04-10 00:45:26 +00:00
Li Jiang
6b1376b04d
Add bs4 and overlap (#2271)
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2024-04-05 05:21:32 +00:00
Li Jiang
42b27b9a9d
Add isort (#2265)
* Add isort

* Apply isort on py files

* Fix circular import

* Fix format for notebooks

* Fix format

---------

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2024-04-05 02:26:06 +00:00
Gunnar Kudrjavets
b8ceb866e6
Add shebang functionality to tests (#1784)
Tests that contain `if __name__ == "__main__":` now have a shebang line
and execute permission.
2024-02-29 01:11:08 +00:00
Maxim Saplin
c80df8acab
Skip tests that depend on OpenAI via --skip-openai (#1097)
* --skip-openai

* All tests pass

* Update build.yml

* Update Contribute.md

* Fix for failing Ubuntu tests

* More tests skipped, fixing 3.10 build

* Apply suggestions from code review

Co-authored-by: Qingyun Wu <qingyun0327@gmail.com>

* Added more comments

* fixed test__wrap_function_*

---------

Co-authored-by: Qingyun Wu <qingyun0327@gmail.com>
Co-authored-by: Davor Runje <davor@airt.ai>
2023-12-31 19:37:21 +00:00
Li Jiang
07646d448c
Support custom text formats and recursive (#496)
* Add custom text types and recursive

* Add custom text types and recursive

* Fix format

* Update qdrant, Add pdf to unstructured

* Use unstructured as the default text extractor if installed

* Add tests for unstructured

* Update tests env for unstructured

* Fix error if last message is a function call, issue #569

* Remove csv, md and tsv from UNSTRUCTURED_FORMATS

* Update docstring of docs_path

* Update test for get_files_from_dir

* Update docstring of custom_text_types

* Fix missing search_string in update_context

* Add custom_text_types to notebook example
2023-11-21 03:53:50 +00:00
Li Jiang
f052977e24
Add support for unstructured (#501)
* Add support for unstructured

* Fix tests

* Add test and documents

* Fix tests

* Fix tests

* Test unstructured on linux and mac
2023-11-05 13:30:28 +00:00
Chi Wang
c4f8b1c761
Dev/v0.2 (#393)
* api_base -> base_url (#383)

* InvalidRequestError -> BadRequestError (#389)

* remove api_key_path; close #388

* close #402 (#403)

* openai client (#419)

* openai client

* client test

* _client -> client

* _client -> client

* extra kwargs

* Completion -> client (#426)

* Completion -> client

* Completion -> client

* Completion -> client

* Completion -> client

* support aoai

* fix test error

* remove commented code

* support aoai

* annotations

* import

* reduce test

* skip test

* skip test

* skip test

* debug test

* rename test

* update workflow

* update workflow

* env

* py version

* doc improvement

* docstr update

* openai<1

* add tiktoken to dependency

* filter_func

* async test

* dependency

* migration guide (#477)

* migration guide

* change in kwargs

* simplify header

* update optiguide description

* deal with azure gpt-3.5

* add back test_eval_math_responses

* timeout

* Add back tests for RetrieveChat (#480)

* Add back tests for RetrieveChat

* Fix format

* Update dependencies order

* Fix path

* Fix path

* Fix path

* Fix tests

* Do not run openai tests on macOS or Windows

* Update skip openai tests

* Remove unnecessary dependencies, improve format

* Add py3.8 for testing qdrant

* Fix multiline error of windows

* Add openai tests

* Add dependency mathchat, remove unused envs

* retrieve chat is tested

* bump version to 0.2.0b1

---------

Co-authored-by: Li Jiang <bnujli@gmail.com>
2023-11-04 04:01:49 +00:00
Chi Wang
dd90756bdb
bump version to 0.1.14 (#400)
* bump version to 0.1.14

* endpoint

* test

* test

* add ipython to retrievechat dependency

* constraints

* target
2023-10-28 00:24:04 +00:00
Yiran Wu
8c3401dd6a
Add token_count_util (#421)
* add token_count_util

* remove token_count from retrieval util

* format

* update dependency

* update test
2023-10-27 12:57:35 +00:00
Li Jiang
80954e4b8d
Fix tmp dir not exists (#401)
* Fix tmp dir not exists

* Update tests to make it more clear

* Add check if save path is not None
2023-10-24 16:09:25 +00:00
Li Jiang
f2d7553cdc
Add support for custom text splitter (#270)
* Add support for a custom text splitter function and a list of files or urls

* Add parameter to retrieve_config, add tests

* Fix tests

* Fix tests
2023-10-17 14:53:40 +00:00
Li Jiang
fa6e2a52c0
Add support to customized vectordb and embedding functions (#161)
* Add custom embedding function

* Add support to custom vector db

* Improve docstring

* Improve docstring

* Improve docstring

* Add support for a customized is_termination_msg function

* Add a test for customize vector db with lancedb

* Fix tests

* Add test for embedding_function

* Update docstring
2023-10-10 12:53:18 +00:00
Li Jiang
19f8711c1b
Update num tokens from text (#149)
* Improve num_tokens_from_text

* Format

* Update comments

* Improve docstrings
2023-10-09 02:30:11 +00:00
Mohamed Attia
a3547f82c4
Replace the use of assert in non-test code (#80)
* Replace `assert`s in the `conversable_agent` module with `if-log-raise`.

* Use a `logger` object in the `code_utils` module.

* Replace use of `assert` with `if-log-raise` in the `code_utils` module.

* Replace use of `assert` in the `math_utils` module with `if-not-raise`.

* Replace `assert` with `if` in the `oai.completion` module.

* Replace `assert` in the `retrieve_utils` module with an if statement.

* Add missing `not`.

* Blacken `completion.py`.

* Test `generate_reply` and `a_generate_reply` raise an assertion error
when there are neither `messages` nor a `sender`.

* Test `execute_code` raises an `AssertionError` when neither code nor
filename is provided.

* Test `split_text_to_chunks` raises when passed an invalid chunk mode.

* Add `tiktoken` and `chromadb` to test dependencies as they're used in
the `test_retrieve_utils` module.

* Sort the test requirements alphabetically.
2023-10-03 17:52:50 +00:00
Aaron
4adbffa94b
retrieve_utils.py - Updated to have the ability to parse text from PDF files (#50)
* UPDATE - Updated retrieve_utils.py to have the ability to parse text from pdf files

* UNDO - change to recursive condition

* UPDATE - updated agentchat_RetrieveChat.ipynb to clarify which file types are accepted to be in the docs path

* ADD - missing import

* UPDATE - setup.py to have PyPDF2 in retrievechat

* RE-ADD - urls

* ADD - tests for retrieve utils, and removed deprecated PyPDF2

* Update agentchat_RetrieveChat.ipynb

* Update retrieve_utils.py

Fix format

* Update retrieve_utils.py

Replace print with logger

* UPDATE - added more specific exception to PDF decryption try/catch

* FIX - typo, return statement at wrong indentation in extract_text_from_pdf

---------

Co-authored-by: Ward <award40@LAMU0CLP74YXVX6.uhc.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
2023-10-01 10:22:58 +00:00