mirror of
https://github.com/microsoft/autogen.git
synced 2025-11-03 11:20:35 +00:00
Multiline docstrings fix (#2130)
* DOC FIX - Formatted docstrings for retrieve_user_proxy_agent.py and added a first single-line summary for the class RetrieveUserProxyAgent.
* DOC FIX - Formatted docstrings for the initiate_chats function and the ChatResult class in autogen/agentchat/chat.py
* Add vision capability (#2025)
* Add vision capability
* Configurate: description_prompt
* Print warning instead of raising issues for type
* Skip vision capability test if dependencies not installed
* Append "vision" to agent's system message when enabled VisionCapability
* GPT-4V notebook update with ConversableAgent
* Clean GPT-4V notebook
* Add vision capability test to workflow
* Lint import
* Update system message for vision capability
* Add a `custom_caption_func` to VisionCapability
* Add custom function example for vision capability
* Skip test Vision capability custom func
* GPT-4V notebook metadata to website
* Remove redundant files
* The custom caption function takes more inputs now
* Add a more complex example of custom caption func
* Remove trailing space
---------
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
* Native tool call support for Mistral AI API and topic notebook. (#2135)
* Support for Mistral AI API and topic notebook.
* formatting
* formatting
* New conversational chess notebook using nested chats and tool use (#2137)
* add chess notebook
* update
* update
* Update notebook with figure
* Add example link
* redirect
* Clean up example format
* address gagan's comments
* update references
* fix links
* add webarena in samples (#2114)
* add webarena in samples/tools
* Update samples/tools/webarena/README.md
Co-authored-by: gagb <gagb@users.noreply.github.com>
* Update samples/tools/webarena/README.md
Co-authored-by: gagb <gagb@users.noreply.github.com>
* Update samples/tools/webarena/README.md
Co-authored-by: gagb <gagb@users.noreply.github.com>
* update installation instructions
* black formatting
* Update README.md
---------
Co-authored-by: gagb <gagb@users.noreply.github.com>
Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>
* context to kwargs (#2064)
* context to kwargs
* add tag
* add test
* text to kwargs
---------
Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
* Bump webpack-dev-middleware from 5.3.3 to 5.3.4 in /website (#2131)
Bumps [webpack-dev-middleware](https://github.com/webpack/webpack-dev-middleware) from 5.3.3 to 5.3.4.
- [Release notes](https://github.com/webpack/webpack-dev-middleware/releases)
- [Changelog](https://github.com/webpack/webpack-dev-middleware/blob/v5.3.4/CHANGELOG.md)
- [Commits](https://github.com/webpack/webpack-dev-middleware/compare/v5.3.3...v5.3.4)
---
updated-dependencies:
- dependency-name: webpack-dev-middleware
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>
* Parse Any HTML-esh Style Tags (#2046)
* tried implementing my own regex
* improves tests
* finally works
* removes prints
* fixed test
* adds start and end
* delete unused imports
* refactored to use new tool
* significantly improved algo
* tag content -> tag attr
* fix tests + adds new field
* return full match
* return remove start and end
* update docstrings
* update docstrings
* update docstrings
---------
Co-authored-by: Beibin Li <BeibinLi@users.noreply.github.com>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
* Integrate AgentOptimizer (#1767)
* draft agent optimizer
* refactor
* remove
* change openai config interface
* notebook
* update blog
* add test
* clean up
* redir
* update
* update interface
* change model name
* move to contrib
* Update autogen/agentchat/contrib/agent_optimizer.py
Co-authored-by: Jack Gerrits <jackgerrits@users.noreply.github.com>
---------
Co-authored-by: “skzhang1” <“shaokunzhang529@gmail.com”>
Co-authored-by: Beibin Li <BeibinLi@users.noreply.github.com>
Co-authored-by: Jieyu Zhang <jieyuz2@cs.washington.edu>
Co-authored-by: Jack Gerrits <jackgerrits@users.noreply.github.com>
* Introducing IOStream protocol and adding support for websockets (#1551)
* Introducing IOStream
* bug fixing
* polishing
* refactoring
* refactoring
* refactoring
* wip: async tests
* websockets added
* wip
* merge with main
* notebook added
* FastAPI example added
* wip
* merge
* getter/setter to iostream added
* website/blog/2024-03-03-AutoGen-Update/img/dalle_gpt4v.png: convert to Git LFS
* website/blog/2024-03-03-AutoGen-Update/img/gaia.png: convert to Git LFS
* website/blog/2024-03-03-AutoGen-Update/img/teach.png: convert to Git LFS
* add SSL support
* wip
* wip
* exception handling added to on_connect()
* refactoring: default iostream is being set in a context manager
* test fix
* polishing
* polishing
* polishing
* fixed bug with new thread
* polishing
* a bit of refactoring and docs added
* notebook added to docs
* type checking added to CI
* CI fix
* CI fix
* CI fix
* polishing
* obsolete todo comment removed
* fixed precommit error
---------
Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>
* [CAP] [Feature] Get list of actors from directory service. (#2073)
* Search directory for list of actors using regex '.*' gets all actors
* docs changes
* pre-commit fixes
* Use ActorInfo from protobuf
* pre-commit
* Added zmq tests to work on removing sleeps
* minor refactor of zmq tests
* 1) Change DirSvr to user Broker. 2) Add req-router to broker 3) In ActorConnector use handshake and req/resp to remove sleep
* 1) Change DirSvr to user Broker. 2) Add req-router to broker 3) In ActorConnector use handshake and req/resp to remove sleep
* move socket creation to thread with recv
* move socket creation to thread with recv
* Better logging for DirectorySvc
* better logging for directory svc
* Use logging config
* Start removing sleeps
* pre-commit
* Cleanup monitor socket
* Mark cache as a protocol and update type hints to reflect (#2168)
* Mark cache as a protocl and update type hints to reflect
* int
* undo init change modified: autogen/agentchat/chat.py
* fix(): fix word spelling errors (#2171)
* Implement User Defined Functions for Local CLI Executor (#2102)
* Implement user defined functions feature for local cli exec, add docs
* add tests, update docs
* fixes
* fix test
* add pandas test dep
* install test
* provide template as func
* formatting
* undo change
* address comments
* add test deps
* formatting
* test only in 1 env
* formatting
* remove test for local only
---------
Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>
* simplify getting-started; update news (#2175)
* simplify getting-started; update news
* bug fix
* update (#2178)
Co-authored-by: AnonymousRepoSub <“shaokunzhang529@outlook.com” >
* Fix formatting of admonitions in udf docs (#2188)
* Fix iostream on new thread (#2181)
* fixed get_stream in new thread by introducing a global default
* fixed get_stream in new thread by introducing a global default
---------
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
* Add link for rendering notebooks docs on website (#2191)
* Transform Messages Capability (#1923)
* wip
* Adds docstrings
* fixed spellings
* wip
* fixed errors
* better class names
* adds tests
* added tests to workflow
* improved token counting
* improved notebook
* improved token counting in test
* improved docstrings
* fix inconsistencies
* changed by mistake
* fixed docstring
* fixed details
* improves tests + adds openai contrib test
* fix spelling oai contrib test
* clearer docstrings
* remove repeated docstr
* improved notebook
* adds metadata to notebook
* Improve outline and description (#2125)
* better dir structure
* clip max tokens to allowed tokens
* more accurate comments/docstrs
* add deperecation warning
* fix front matter desc
* add deperecation warning notebook
* undo local notebook settings changes
* format notebook
* format workflow
---------
Co-authored-by: gagb <gagb@users.noreply.github.com>
* Bump express from 4.18.2 to 4.19.2 in /website (#2157)
Bumps [express](https://github.com/expressjs/express) from 4.18.2 to 4.19.2.
- [Release notes](https://github.com/expressjs/express/releases)
- [Changelog](https://github.com/expressjs/express/blob/master/History.md)
- [Commits](https://github.com/expressjs/express/compare/4.18.2...4.19.2)
---
updated-dependencies:
- dependency-name: express
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* add clarity analytics (#2201)
* Docstring formatting fix: Standardized docstrings to adhere to the Google style guide, ensuring consistency and clarity, and also fixed the broken link for autogen/agentchat/chat.py
* Docstring fix: Reformatted docstrings to adhere to the Google style guide, ensuring consistency and clarity, for the agentchat/contrib/retrieve_user_proxy_agent.py file
* Fixed Pre-Commit Error, Trailing spaces on agentchat/chat.py
* Fixed Pre-Commit Error, Trailing spaces on agentchat/chat.py
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Li Jiang <bnujli@gmail.com>
Co-authored-by: Beibin Li <BeibinLi@users.noreply.github.com>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>
Co-authored-by: olgavrou <olgavrou@gmail.com>
Co-authored-by: gagb <gagb@users.noreply.github.com>
Co-authored-by: Qingyun Wu <qingyun0327@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Wael Karkoub <wael.karkoub96@gmail.com>
Co-authored-by: Shaokun Zhang <shaokunzhang529@gmail.com>
Co-authored-by: “skzhang1” <“shaokunzhang529@gmail.com”>
Co-authored-by: Jieyu Zhang <jieyuz2@cs.washington.edu>
Co-authored-by: Jack Gerrits <jackgerrits@users.noreply.github.com>
Co-authored-by: Davor Runje <davor@airt.ai>
Co-authored-by: Rajan <rajan.chari@yahoo.com>
Co-authored-by: calm <1191465097@qq.com>
Co-authored-by: AnonymousRepoSub <“shaokunzhang529@outlook.com” >
This commit is contained in:
parent 1c22a93535
commit 2053dd9f3d
@@ -26,7 +26,9 @@ class ChatResult:
     summary: str = None
     """A summary obtained from the chat."""
     cost: tuple = None  # (dict, dict) - (total_cost, actual_cost_with_cache)
-    """The cost of the chat. a tuple of (total_cost, total_actual_cost), where total_cost is a dictionary of cost information, and total_actual_cost is a dictionary of information on the actual incurred cost with cache."""
+    """The cost of the chat. a tuple of (total_cost, total_actual_cost), where total_cost is a
+    dictionary of cost information, and total_actual_cost is a dictionary of information on
+    the actual incurred cost with cache."""
     human_input: List[str] = None
     """A list of human input solicited during the chat."""
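As an aside from the diff itself, a minimal sketch of how these `ChatResult` fields are typically read after a chat finishes; the two agents and the placeholder `config_list` below are illustrative assumptions, not part of this commit:

```python
from autogen import AssistantAgent, UserProxyAgent

config_list = [{"model": "gpt-4", "api_key": "sk-..."}]  # placeholder credentials

assistant = AssistantAgent("assistant", llm_config={"config_list": config_list})
user = UserProxyAgent("user", human_input_mode="NEVER", code_execution_config=False)

chat_result = user.initiate_chat(assistant, message="What is 2 + 2?", max_turns=1)
print(chat_result.summary)      # summary obtained from the chat
print(chat_result.cost)         # (total_cost, total_actual_cost) dictionaries
print(chat_result.human_input)  # human input solicited during the chat
```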
@@ -141,25 +143,32 @@ def __post_carryover_processing(chat_info: Dict[str, Any]) -> None:


 def initiate_chats(chat_queue: List[Dict[str, Any]]) -> List[ChatResult]:
     """Initiate a list of chats.

     Args:
-        chat_queue (List[Dict]): a list of dictionaries containing the information about the chats.
-
-            Each dictionary should contain the input arguments for [`ConversableAgent.initiate_chat`](/docs/reference/agentchat/conversable_agent#initiate_chat). For example:
-                - "sender": the sender agent.
-                - "recipient": the recipient agent.
-                - "clear_history" (bool): whether to clear the chat history with the agent. Default is True.
-                - "silent" (bool or None): (Experimental) whether to print the messages in this conversation. Default is False.
-                - "cache" (AbstractCache or None): the cache client to use for this conversation. Default is None.
-                - "max_turns" (int or None): maximum number of turns for the chat. If None, the chat will continue until a termination condition is met. Default is None.
-                - "summary_method" (str or callable): a string or callable specifying the method to get a summary from the chat. Default is DEFAULT_summary_method, i.e., "last_msg".
-                - "summary_args" (dict): a dictionary of arguments to be passed to the summary_method. Default is {}.
-                - "message" (str, callable or None): if None, input() will be called to get the initial message.
-                - **context: additional context information to be passed to the chat.
-                - "carryover": It can be used to specify the carryover information to be passed to this chat.
-                    If provided, we will combine this carryover with the "message" content when generating the initial chat
-                    message in `generate_init_message`.
+        chat_queue (List[Dict]): A list of dictionaries containing the information about the chats.
+
+            Each dictionary should contain the input arguments for
+            [`ConversableAgent.initiate_chat`](/docs/reference/agentchat/conversable_agent#initiate_chat).
+            For example:
+                - `"sender"` - the sender agent.
+                - `"recipient"` - the recipient agent.
+                - `"clear_history"` (bool) - whether to clear the chat history with the agent.
+                    Default is True.
+                - `"silent"` (bool or None) - (Experimental) whether to print the messages in this
+                    conversation. Default is False.
+                - `"cache"` (Cache or None) - the cache client to use for this conversation.
+                    Default is None.
+                - `"max_turns"` (int or None) - maximum number of turns for the chat. If None, the chat
+                    will continue until a termination condition is met. Default is None.
+                - `"summary_method"` (str or callable) - a string or callable specifying the method to get
+                    a summary from the chat. Default is DEFAULT_summary_method, i.e., "last_msg".
+                - `"summary_args"` (dict) - a dictionary of arguments to be passed to the summary_method.
+                    Default is {}.
+                - `"message"` (str, callable or None) - if None, input() will be called to get the
+                    initial message.
+                - `**context` - additional context information to be passed to the chat.
+                - `"carryover"` - It can be used to specify the carryover information to be passed
+                    to this chat. If provided, we will combine this carryover with the "message" content when
+                    generating the initial chat message in `generate_init_message`.
     Returns:
         (list): a list of ChatResult objects corresponding to the finished chats in the chat_queue.
     """
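To show how the `chat_queue` entries documented above map onto real calls, here is a hedged sketch of a two-step pipeline; the agent names, the placeholder `config_list`, and the prompts are assumptions for illustration:

```python
from autogen import AssistantAgent, UserProxyAgent
from autogen.agentchat.chat import initiate_chats

config_list = [{"model": "gpt-4", "api_key": "sk-..."}]  # placeholder credentials

user = UserProxyAgent("user", human_input_mode="NEVER", code_execution_config=False)
researcher = AssistantAgent("researcher", llm_config={"config_list": config_list})
writer = AssistantAgent("writer", llm_config={"config_list": config_list})

chat_queue = [
    {   # each dict holds keyword arguments for ConversableAgent.initiate_chat
        "sender": user,
        "recipient": researcher,
        "message": "List three facts about solar panels.",
        "max_turns": 2,
        "summary_method": "last_msg",
    },
    {
        "sender": user,
        "recipient": writer,
        "message": "Turn the collected facts into one short paragraph.",
        "carryover": "Audience: general readers.",  # combined with "message" in generate_init_message
        "max_turns": 1,
    },
]

chat_results = initiate_chats(chat_queue)  # one ChatResult per finished chat
for result in chat_results:
    print(result.summary)
```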
@@ -228,11 +237,11 @@ async def a_initiate_chats(chat_queue: List[Dict[str, Any]]) -> Dict[int, ChatResult]:
     """(async) Initiate a list of chats.

     args:
-        Please refer to `initiate_chats`.
+        - Please refer to `initiate_chats`.


     returns:
-        (Dict): a dict of ChatId: ChatResult corresponding to the finished chats in the chat_queue.
+        - (Dict): a dict of ChatId: ChatResult corresponding to the finished chats in the chat_queue.
     """
     consolidate_chat_info(chat_queue)
     _validate_recipients(chat_queue)
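A small async counterpart, reusing the agents from the previous sketch. The `chat_id` and `prerequisites` keys are my assumption about how the async variant orders and keys its chats; treat them as illustrative rather than definitive:

```python
import asyncio

from autogen.agentchat.chat import a_initiate_chats

# chat_id values and the prerequisites link are illustrative assumptions
async_queue = [
    {"chat_id": 1, "sender": user, "recipient": researcher,
     "message": "List three facts about solar panels.", "max_turns": 1},
    {"chat_id": 2, "prerequisites": [1], "sender": user, "recipient": writer,
     "message": "Summarize the collected facts.", "max_turns": 1},
]


async def main() -> None:
    results = await a_initiate_chats(async_queue)  # dict of chat_id -> ChatResult
    for chat_id, result in results.items():
        print(chat_id, result.summary)


asyncio.run(main())
```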
@@ -62,6 +62,10 @@ Context is: {input_context}


 class RetrieveUserProxyAgent(UserProxyAgent):
+    """(In preview) The Retrieval-Augmented User Proxy retrieves document chunks based on the embedding
+    similarity, and sends them along with the question to the Retrieval-Augmented Assistant
+    """
+
     def __init__(
         self,
         name="RetrieveChatAgent",  # default set to RetrieveChatAgent
@@ -73,67 +77,106 @@ class RetrieveUserProxyAgent(UserProxyAgent):
         r"""
         Args:
             name (str): name of the agent.

             human_input_mode (str): whether to ask for human inputs every time a message is received.
                 Possible values are "ALWAYS", "TERMINATE", "NEVER".
                 1. When "ALWAYS", the agent prompts for human input every time a message is received.
                     Under this mode, the conversation stops when the human input is "exit",
                     or when is_termination_msg is True and there is no human input.
-                2. When "TERMINATE", the agent only prompts for human input only when a termination message is received or
-                    the number of auto reply reaches the max_consecutive_auto_reply.
-                3. When "NEVER", the agent will never prompt for human input. Under this mode, the conversation stops
-                    when the number of auto reply reaches the max_consecutive_auto_reply or when is_termination_msg is True.
+                2. When "TERMINATE", the agent only prompts for human input only when a termination
+                    message is received or the number of auto reply reaches
+                    the max_consecutive_auto_reply.
+                3. When "NEVER", the agent will never prompt for human input. Under this mode, the
+                    conversation stops when the number of auto reply reaches the
+                    max_consecutive_auto_reply or when is_termination_msg is True.

             is_termination_msg (function): a function that takes a message in the form of a dictionary
                 and returns a boolean value indicating if this received message is a termination message.
                 The dict can contain the following keys: "content", "role", "name", "function_call".

             retrieve_config (dict or None): config for the retrieve agent.
-                To use default config, set to None. Otherwise, set to a dictionary with the following keys:
-                - task (Optional, str): the task of the retrieve chat. Possible values are "code", "qa" and "default". System
-                    prompt will be different for different tasks. The default value is `default`, which supports both code and qa.
-                - client (Optional, chromadb.Client): the chromadb client. If key not provided, a default client `chromadb.Client()`
-                    will be used. If you want to use other vector db, extend this class and override the `retrieve_docs` function.
-                - docs_path (Optional, Union[str, List[str]]): the path to the docs directory. It can also be the path to a single file,
-                    the url to a single file or a list of directories, files and urls. Default is None, which works only if the collection is already created.
-                - extra_docs (Optional, bool): when true, allows adding documents with unique IDs without overwriting existing ones; when false, it replaces existing documents using default IDs, risking collection overwrite.,
-                    when set to true it enables the system to assign unique IDs starting from "length+i" for new document chunks, preventing the replacement of existing documents and facilitating the addition of more content to the collection..
-                    By default, "extra_docs" is set to false, starting document IDs from zero. This poses a risk as new documents might overwrite existing ones, potentially causing unintended loss or alteration of data in the collection.
-                - collection_name (Optional, str): the name of the collection.
+
+                To use default config, set to None. Otherwise, set to a dictionary with the
+                following keys:
+                - `task` (Optional, str) - the task of the retrieve chat. Possible values are
+                    "code", "qa" and "default". System prompt will be different for different tasks.
+                    The default value is `default`, which supports both code and qa.
+                - `client` (Optional, chromadb.Client) - the chromadb client. If key not provided, a
+                    default client `chromadb.Client()` will be used. If you want to use other
+                    vector db, extend this class and override the `retrieve_docs` function.
+                - `docs_path` (Optional, Union[str, List[str]]) - the path to the docs directory. It
+                    can also be the path to a single file, the url to a single file or a list
+                    of directories, files and urls. Default is None, which works only if the
+                    collection is already created.
+                - `extra_docs` (Optional, bool) - when true, allows adding documents with unique IDs
+                    without overwriting existing ones; when false, it replaces existing documents
+                    using default IDs, risking collection overwrite., when set to true it enables
+                    the system to assign unique IDs starting from "length+i" for new document
+                    chunks, preventing the replacement of existing documents and facilitating the
+                    addition of more content to the collection..
+                    By default, "extra_docs" is set to false, starting document IDs from zero.
+                    This poses a risk as new documents might overwrite existing ones, potentially
+                    causing unintended loss or alteration of data in the collection.
+                - `collection_name` (Optional, str) - the name of the collection.
                     If key not provided, a default name `autogen-docs` will be used.
-                - model (Optional, str): the model to use for the retrieve chat.
+                - `model` (Optional, str) - the model to use for the retrieve chat.
                     If key not provided, a default model `gpt-4` will be used.
-                - chunk_token_size (Optional, int): the chunk token size for the retrieve chat.
+                - `chunk_token_size` (Optional, int) - the chunk token size for the retrieve chat.
                     If key not provided, a default size `max_tokens * 0.4` will be used.
-                - context_max_tokens (Optional, int): the context max token size for the retrieve chat.
+                - `context_max_tokens` (Optional, int) - the context max token size for the
+                    retrieve chat.
                     If key not provided, a default size `max_tokens * 0.8` will be used.
-                - chunk_mode (Optional, str): the chunk mode for the retrieve chat. Possible values are
-                    "multi_lines" and "one_line". If key not provided, a default mode `multi_lines` will be used.
-                - must_break_at_empty_line (Optional, bool): chunk will only break at empty line if True. Default is True.
+                - `chunk_mode` (Optional, str) - the chunk mode for the retrieve chat. Possible values
+                    are "multi_lines" and "one_line". If key not provided, a default mode
+                    `multi_lines` will be used.
+                - `must_break_at_empty_line` (Optional, bool) - chunk will only break at empty line
+                    if True. Default is True.
                     If chunk_mode is "one_line", this parameter will be ignored.
-                - embedding_model (Optional, str): the embedding model to use for the retrieve chat.
-                    If key not provided, a default model `all-MiniLM-L6-v2` will be used. All available models
-                    can be found at `https://www.sbert.net/docs/pretrained_models.html`. The default model is a
-                    fast model. If you want to use a high performance model, `all-mpnet-base-v2` is recommended.
-                - embedding_function (Optional, Callable): the embedding function for creating the vector db. Default is None,
-                    SentenceTransformer with the given `embedding_model` will be used. If you want to use OpenAI, Cohere, HuggingFace or
-                    other embedding functions, you can pass it here, follow the examples in `https://docs.trychroma.com/embeddings`.
-                - customized_prompt (Optional, str): the customized prompt for the retrieve chat. Default is None.
-                - customized_answer_prefix (Optional, str): the customized answer prefix for the retrieve chat. Default is "".
-                    If not "" and the customized_answer_prefix is not in the answer, `Update Context` will be triggered.
-                - update_context (Optional, bool): if False, will not apply `Update Context` for interactive retrieval. Default is True.
-                - get_or_create (Optional, bool): if True, will create/return a collection for the retrieve chat. This is the same as that used in chromadb.
-                    Default is False. Will raise ValueError if the collection already exists and get_or_create is False. Will be set to True if docs_path is None.
-                - custom_token_count_function (Optional, Callable): a custom function to count the number of tokens in a string.
-                    The function should take (text:str, model:str) as input and return the token_count(int). the retrieve_config["model"] will be passed in the function.
-                    Default is autogen.token_count_utils.count_token that uses tiktoken, which may not be accurate for non-OpenAI models.
-                - custom_text_split_function (Optional, Callable): a custom function to split a string into a list of strings.
-                    Default is None, will use the default function in `autogen.retrieve_utils.split_text_to_chunks`.
-                - custom_text_types (Optional, List[str]): a list of file types to be processed. Default is `autogen.retrieve_utils.TEXT_FORMATS`.
-                    This only applies to files under the directories in `docs_path`. Explicitly included files and urls will be chunked regardless of their types.
-                - recursive (Optional, bool): whether to search documents recursively in the docs_path. Default is True.
+                - `embedding_model` (Optional, str) - the embedding model to use for the retrieve chat.
+                    If key not provided, a default model `all-MiniLM-L6-v2` will be used. All available
+                    models can be found at `https://www.sbert.net/docs/pretrained_models.html`.
+                    The default model is a fast model. If you want to use a high performance model,
+                    `all-mpnet-base-v2` is recommended.
+                - `embedding_function` (Optional, Callable) - the embedding function for creating the
+                    vector db. Default is None, SentenceTransformer with the given `embedding_model`
+                    will be used. If you want to use OpenAI, Cohere, HuggingFace or other embedding
+                    functions, you can pass it here,
+                    follow the examples in `https://docs.trychroma.com/embeddings`.
+                - `customized_prompt` (Optional, str) - the customized prompt for the retrieve chat.
+                    Default is None.
+                - `customized_answer_prefix` (Optional, str) - the customized answer prefix for the
+                    retrieve chat. Default is "".
+                    If not "" and the customized_answer_prefix is not in the answer,
+                    `Update Context` will be triggered.
+                - `update_context` (Optional, bool) - if False, will not apply `Update Context` for
+                    interactive retrieval. Default is True.
+                - `get_or_create` (Optional, bool) - if True, will create/return a collection for the
+                    retrieve chat. This is the same as that used in chromadb.
+                    Default is False. Will raise ValueError if the collection already exists and
+                    get_or_create is False. Will be set to True if docs_path is None.
+                - `custom_token_count_function` (Optional, Callable) - a custom function to count the
+                    number of tokens in a string.
+                    The function should take (text:str, model:str) as input and return the
+                    token_count(int). the retrieve_config["model"] will be passed in the function.
+                    Default is autogen.token_count_utils.count_token that uses tiktoken, which may
+                    not be accurate for non-OpenAI models.
+                - `custom_text_split_function` (Optional, Callable) - a custom function to split a
+                    string into a list of strings.
+                    Default is None, will use the default function in
+                    `autogen.retrieve_utils.split_text_to_chunks`.
+                - `custom_text_types` (Optional, List[str]) - a list of file types to be processed.
+                    Default is `autogen.retrieve_utils.TEXT_FORMATS`.
+                    This only applies to files under the directories in `docs_path`. Explicitly
+                    included files and urls will be chunked regardless of their types.
+                - `recursive` (Optional, bool) - whether to search documents recursively in the
+                    docs_path. Default is True.

             `**kwargs` (dict): other kwargs in [UserProxyAgent](../user_proxy_agent#__init__).

         Example:

-        Example of overriding retrieve_docs - If you have set up a customized vector db, and it's not compatible with chromadb, you can easily plug in it with below code.
+        Example of overriding retrieve_docs - If you have set up a customized vector db, and it's
+        not compatible with chromadb, you can easily plug in it with below code.
         ```python
         class MyRetrieveUserProxyAgent(RetrieveUserProxyAgent):
             def query_vector_db(
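Following the reformatted argument list above, a minimal construction sketch using only keys documented in that docstring; the docs path, model names, and collection name are placeholders:

```python
from autogen.agentchat.contrib.retrieve_user_proxy_agent import RetrieveUserProxyAgent

ragproxyagent = RetrieveUserProxyAgent(
    name="ragproxyagent",
    human_input_mode="NEVER",
    retrieve_config={
        "task": "qa",
        "docs_path": "./docs",                  # directory, single file, URL, or a list of these
        "chunk_token_size": 2000,
        "model": "gpt-4",
        "collection_name": "autogen-docs",
        "embedding_model": "all-MiniLM-L6-v2",
        "get_or_create": True,                  # reuse the collection if it already exists
    },
)
```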
@@ -416,9 +459,9 @@ class RetrieveUserProxyAgent(UserProxyAgent):
             sender (Agent): the sender agent. It should be the instance of RetrieveUserProxyAgent.
             recipient (Agent): the recipient agent. Usually it's the assistant agent.
             context (dict): the context for the message generation. It should contain the following keys:
-                - problem (str): the problem to be solved.
-                - n_results (int): the number of results to be retrieved. Default is 20.
-                - search_string (str): only docs that contain an exact match of this string will be retrieved. Default is "".
+                - `problem` (str) - the problem to be solved.
+                - `n_results` (int) - the number of results to be retrieved. Default is 20.
+                - `search_string` (str) - only docs that contain an exact match of this string will be retrieved. Default is "".
         Returns:
             str: the generated message ready to be sent to the recipient agent.
         """
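For context on how these keys arrive: `message_generator` is normally passed as the `message` callable to `initiate_chat`, and extra keyword arguments become the `context` dict described above. A hedged sketch, reusing `ragproxyagent` from the previous block; the assistant setup and the problem text are placeholders:

```python
from autogen import AssistantAgent

config_list = [{"model": "gpt-4", "api_key": "sk-..."}]  # placeholder credentials
assistant = AssistantAgent("assistant", llm_config={"config_list": config_list})

# "problem", "n_results", and "search_string" flow into the context dict
# that message_generator uses to retrieve and assemble document chunks.
ragproxyagent.initiate_chat(
    assistant,
    message=ragproxyagent.message_generator,
    problem="How does AutoGen chunk documents for retrieval?",
    n_results=10,
    search_string="chunk",
)
```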