autogen/Image-chat-with-agent.md at python-v0.4.9.2

mirror of https://github.com/microsoft/autogen.git synced 2025-07-04 07:26:28 +00:00

This tutorial shows how to perform image chat with an agent using the @AutoGen.OpenAI.OpenAIChatAgent as an example.

Note

To chat about an image with an agent, the model behind the agent must support image input. Here is a partial list of models that do:

- gpt-4o
- gemini-1.5
- llava
- claude-3
- ...

This example uses the gpt-4o model as the agent's backend model.

Note

The complete code example can be found in Image_Chat_With_Agent.cs

Step 1: Install AutoGen

First, install the AutoGen package using the following command:

dotnet add package AutoGen

Step 2: Add Using Statements

[!code-csharpUsing Statements]

Step 3: Create an @AutoGen.OpenAI.OpenAIChatAgent

[!code-csharpCreate an OpenAIChatAgent]
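As a minimal sketch of Steps 2 and 3, the snippet below shows one way the using statements and agent creation might look. It assumes the API key is stored in an `OPENAI_API_KEY` environment variable and that the installed AutoGen version exposes an `OpenAIChatAgent` constructor accepting a `ChatClient` from the official OpenAI .NET SDK; check the signatures against your installed packages.

```csharp
using System;
using AutoGen.Core;
using AutoGen.OpenAI;
using AutoGen.OpenAI.Extension;
using OpenAI;

// Assumption: the key lives in the OPENAI_API_KEY environment variable.
var apiKey = Environment.GetEnvironmentVariable("OPENAI_API_KEY")
    ?? throw new InvalidOperationException("Please set the OPENAI_API_KEY environment variable.");

// The model must support image input.
var model = "gpt-4o";
var openAIClient = new OpenAIClient(apiKey);

var agent = new OpenAIChatAgent(
        chatClient: openAIClient.GetChatClient(model),
        name: "agent",
        systemMessage: "You are a helpful AI assistant")
    .RegisterMessageConnector() // converts between OpenAI and AutoGen message types
    .RegisterPrintMessage();    // prints each reply to the console
```

Registering the message connector is what should let the agent accept AutoGen message types such as `ImageMessage` and `MultiModalMessage` rather than only the OpenAI SDK's native messages.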

Step 4: Prepare Image Message

In AutoGen, you can create an image message using either @AutoGen.Core.ImageMessage or @AutoGen.Core.MultiModalMessage. @AutoGen.Core.ImageMessage takes a single image as input, whereas @AutoGen.Core.MultiModalMessage lets you combine multiple modalities, such as text and images, in one message.

Here is how to create an image message using @AutoGen.Core.ImageMessage:

[!code-csharpCreate Image Message]

Here is how to create a multimodal message using @AutoGen.Core.MultiModalMessage:

[!code-csharpCreate MultiModal Message]
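The two message types described above can be sketched as follows. The image path is hypothetical, and the constructor shapes (`ImageMessage(Role, BinaryData)` and `MultiModalMessage(Role, IEnumerable<IMessage>)`) are assumptions based on the AutoGen.Core API, so verify them against the version you have installed.

```csharp
using System;
using System.IO;
using AutoGen.Core;

// Hypothetical image location; replace with a real file on disk.
var imagePath = Path.Combine("resource", "images", "background.png");
byte[] imageBytes = File.ReadAllBytes(imagePath);

// A single image. The media type is supplied so the model knows the image format.
var imageMessage = new ImageMessage(Role.User, BinaryData.FromBytes(imageBytes, "image/png"));

// Text and image combined into one multimodal message.
var textMessage = new TextMessage(Role.User, "What's in the picture?");
var multiModalMessage = new MultiModalMessage(
    Role.User,
    new IMessage[] { textMessage, imageMessage });
```

An `ImageMessage` alone carries no question, so it is typically paired with a text prompt, either by wrapping both in a `MultiModalMessage` or by sending the text separately with the image in the chat history.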

Step 5: Generate Response

To generate a response, use one of the overloads of @AutoGen.Core.AgentExtension.SendAsync*. The following code shows how to generate a response with an image message:

[!code-csharpGenerate Response]
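A hedged sketch of the two ways to send the image follows. The helper class and method names (`ImageChatSketch`, `AskAboutImageAsync`) are hypothetical, and the code assumes that `SendAsync` has one overload taking a string plus an optional `chatHistory` and another taking an `IMessage`.

```csharp
using System.Threading.Tasks;
using AutoGen.Core;

public static class ImageChatSketch
{
    // Hypothetical helper: the agent and messages are assumed to be created as in Steps 3 and 4.
    public static async Task<IMessage> AskAboutImageAsync(
        IAgent agent,
        ImageMessage imageMessage,
        MultiModalMessage multiModalMessage)
    {
        // Option 1: send a text prompt and pass the image as chat history.
        var reply = await agent.SendAsync(
            "What's in the picture?",
            chatHistory: new IMessage[] { imageMessage });

        // Option 2: send text and image together as one multimodal message.
        // Note that this issues a second request to the model.
        reply = await agent.SendAsync(multiModalMessage);

        return reply;
    }
}
```

Both options give the model the same information; in practice you would pick one. The multimodal form keeps the prompt and image together in a single message, which can be easier to store and replay.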
