* Added vectordb base and chromadb
* Remove timer and unused functions
* Added filter by distance
* Added test utils
* Fix format
* Fix type hint of dict
* Rename test
* Add test chromadb
* Fix test no chromadb
* Add coverage
* Don't skip test vectordb utils
* Add types
* Fix tests
* Fix docs build error
* Add types to base
* Update base
* Update utils
* Update chromadb
* Add get_docs_by_ids
* Improve docstring
* Update init params
* Update init vector db
* Add get all docs
* Move chroma_results_to_query_results to utils
* Add init vectordb
* Convert format of results for old version
* Improve type hints
* Update get_context for new query results format
* Fix typo
* Improve init db
* Update default folder
* Update logger
* Update init, add embedding func
* Update distance_threshold
* Fix logger name
* Update qdrant
* Fix init db
* Update notebooks
* Use kwargs to improve readability
* Improve docstring of vectordb, add two attributes
* Add db_config
* Update gitignore
* Update comments
* Add source
* Fix files downloaded from URLs having the same name
* Remove files added by mistake
* Improve docstring
* Update docstring
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
* Update docstring
* Update docstring
---------
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
* Initial infrastructure for notebooks page
* migrate two notebooks
* add readme notification for notebook dir
* override 'text' prism language to add basic syntactical structure to autogen's output
* Rework to retain existing directory and not expose front matter to consumers of the notebook
* improve error handling of process notebooks
* format, ruff and type fixes
* undo changes to navbar
* update readme, CI
* whitespace
* spelling mistakes
* spelling
* Add contributing guide for notebooks
* update notebook
* formatting
* fixed spelling, minor errors and reformatted using black
* polishing
* added codespell to pre-commit hooks, fixed a number of spelling errors and a few minor bugs in the code
* update autogen library version in notebooks
* Add custom text types and recursive
* Fix format
* Update qdrant, Add pdf to unstructured
* Use unstructured as the default text extractor if installed
* Add tests for unstructured
* Update tests env for unstructured
* Fix error if last message is a function call, issue #569
* Remove csv, md and tsv from UNSTRUCTURED_FORMATS
* Update docstring of docs_path
* Update test for get_files_from_dir
* Update docstring of custom_text_types
* Fix missing search_string in update_context
* Add custom_text_types to notebook example
* UPDATE - Updated retrieve_utils.py to have the ability to parse text from pdf files
* UNDO - change to recursive condition
* UPDATE - updated agentchat_RetrieveChat.ipynb to clarify which file types are accepted to be in the docs path
* ADD - missing import
* UPDATE - setup.py to have PyPDF2 in retrievechat
* RE-ADD - urls
* ADD - tests for retrieve utils, and removed deprecated PyPDF2
* Update agentchat_RetrieveChat.ipynb
* Update retrieve_utils.py: fix format
* Update retrieve_utils.py: replace print with logger
* UPDATE - added more specific exception to PDF decryption try/except
* FIX - typo, return statement at wrong indentation in extract_text_from_pdf
---------
Co-authored-by: Ward <award40@LAMU0CLP74YXVX6.uhc.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
* fix bug for windows
* more clear example
* link to example
* add test
* format
* comment
* fix assertion error
* fix test error and links
---------
Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com>