3 Commits

Author SHA1 Message Date
Eric Zhu
225eb9d0b2
feat: introduce ModelClientStreamingChunkEvent for streaming model output and update handling in agents and console (#5208)
Resolves #3983

* introduce `model_client_stream` parameter in `AssistantAgent` to
enable token-level streaming output.
* introduce `ModelClientStreamingChunkEvent` as a type of `AgentEvent`
to pass the streaming chunks to the application via `run_stream` and
`on_messages_stream`. These chunk events do not affect the inner messages
list in the final `Response` or `TaskResult`.
* handle this new message type in `Console`.
2025-01-29 02:49:02 +00:00
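
The streaming behavior described in commit 225eb9d0b2 can be illustrated with a minimal, library-free sketch. Everything here is a stand-in written for illustration: `ChunkEvent`, `TextMessage`, `Result`, `fake_model_stream`, and this `run_stream` function are hypothetical and are not the project's actual classes or signatures. The sketch only shows the shape of the idea: chunks are yielded to the caller as they arrive, while the final result holds only the complete message.

```python
import asyncio
from dataclasses import dataclass, field
from typing import AsyncIterator, List, Union


@dataclass
class ChunkEvent:
    """Stand-in for a streaming chunk event (hypothetical, not the library class)."""
    content: str


@dataclass
class TextMessage:
    """Stand-in for a complete message produced by the agent."""
    content: str


@dataclass
class Result:
    """Stand-in for the final result; it holds complete messages only."""
    messages: List[TextMessage] = field(default_factory=list)


async def fake_model_stream() -> AsyncIterator[str]:
    # Pretend model client that emits a reply piece by piece.
    for token in ["Hel", "lo", ", ", "world"]:
        await asyncio.sleep(0)
        yield token


async def run_stream(
    stream_chunks: bool,
) -> AsyncIterator[Union[ChunkEvent, TextMessage, Result]]:
    parts: List[str] = []
    async for token in fake_model_stream():
        parts.append(token)
        if stream_chunks:
            # Chunks are surfaced to the application as they arrive...
            yield ChunkEvent(token)
    final = TextMessage("".join(parts))
    yield final
    # ...but the final result contains only the assembled message.
    yield Result(messages=[final])


async def collect(stream_chunks: bool) -> list:
    return [item async for item in run_stream(stream_chunks)]


if __name__ == "__main__":
    for item in asyncio.run(collect(stream_chunks=True)):
        print(item)
```

With `stream_chunks=False` the same generator yields only the complete message and the result, which mirrors the opt-in nature of the `model_client_stream` parameter described in the commit message.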
Sachin Joglekar
8926206479
Implement default in-memory store for ChatCompletionCache (#5188) 2025-01-25 21:07:58 +00:00
Sachin Joglekar
8bd65c672f
Add ChatCompletionCache along with AbstractStore for caching completions (#4924)
* Add ChatCompletionCache along with AbstractStore for caching completions

* Addressing comments

* Improve interface for cachestore

* Improve documentation & revert protocol

* Make cache store typed, and improve docs

* remove unnecessary casts
2025-01-16 15:47:38 -08:00