* fixed spelling, minor errors and reformatted using black
* polishing
* added codespell to pre-commit hooks, fixed a number of spelling errors and a few minor bugs in the code
* update autogen library version in notebooks
* Add custom text types and recursive
* Fix format
* Update qdrant, Add pdf to unstructured
* Use unstructured as the default text extractor if installed
* Add tests for unstructured
* Update tests env for unstructured
* Fix error if last message is a function call, issue #569
* Remove csv, md and tsv from UNSTRUCTURED_FORMATS
* Update docstring of docs_path
* Update test for get_files_from_dir
* Update docstring of custom_text_types
* Fix missing search_string in update_context
* Add custom_text_types to notebook example
* UPDATE - retrieve_utils.py can now parse text from PDF files
* UNDO - change to recursive condition
* UPDATE - updated agentchat_RetrieveChat.ipynb to clarify which file types are accepted in the docs path
* ADD - missing import
* UPDATE - setup.py to have PyPDF2 in retrievechat
* RE-ADD - urls
* ADD - tests for retrieve utils, and removed deprecated PyPDF2
* Update agentchat_RetrieveChat.ipynb
* Update retrieve_utils.py: fix format
* Update retrieve_utils.py: replace print with logger
* UPDATE - added more specific exception to PDF decryption try/except
* FIX - typo, return statement at wrong indentation in extract_text_from_pdf
---------
Co-authored-by: Ward <award40@LAMU0CLP74YXVX6.uhc.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
* fix bug for Windows
* clearer example
* link to example
* add test
* format
* comment
* fix assertion error
* fix test error and links
---------
Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com>