
Co-authored-by: Jack Gerrits <jack@jackgerrits.com> Co-authored-by: Ryan Sweet <rysweet@microsoft.com>
This tutorial shows how to perform image chat with an agent using the @AutoGen.OpenAI.OpenAIChatAgent as an example.
Note
To chat with an agent about an image, the model behind the agent must support image input. Here is a partial list of models that do:
- gpt-4o
- gemini-1.5
- llava
- claude-3
- ...
This example uses the gpt-4o model as the agent's backend model.
Note
The complete code example can be found in Image_Chat_With_Agent.cs
Step 1: Install AutoGen
First, install the AutoGen package using the following command:
dotnet add package AutoGen
Step 2: Add Using Statements
[!code-csharpUsing Statements]
Step 3: Create an @AutoGen.OpenAI.OpenAIChatAgent
[!code-csharpCreate an OpenAIChatAgent]
Step 4: Prepare Image Message
In AutoGen, you can create an image message using either @AutoGen.Core.ImageMessage or @AutoGen.Core.MultiModalMessage. @AutoGen.Core.ImageMessage takes a single image as input, whereas @AutoGen.Core.MultiModalMessage lets you combine multiple modalities, such as text and images, in one message.
Here is how to create an image message using @AutoGen.Core.ImageMessage:
[!code-csharpCreate Image Message]
Here is how to create a multimodal message using @AutoGen.Core.MultiModalMessage:
[!code-csharpCreate MultiModal Message]
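As a rough sketch of the two constructions described above (not the tutorial's own snippet), the helper class below builds an image message from raw bytes and then combines it with a text question. The class and method names are hypothetical. The sketch assumes the `ImageMessage(Role, BinaryData)` and `MultiModalMessage(Role, IEnumerable<IMessage>)` constructors in AutoGen.Core, which may differ between package versions.

```csharp
using System;
using AutoGen.Core;

// Hypothetical helper class, for illustration only.
public static class ImageMessageSketch
{
    // Wraps raw image bytes in a user-role ImageMessage.
    // The media type should match the actual encoding of the image.
    public static ImageMessage CreateImageMessage(byte[] imageBytes, string mediaType = "image/png")
    {
        return new ImageMessage(Role.User, BinaryData.FromBytes(imageBytes, mediaType));
    }

    // Combines a text question and an image into one multimodal message.
    public static MultiModalMessage CreateMultiModalMessage(string question, ImageMessage image)
    {
        var text = new TextMessage(Role.User, question);
        return new MultiModalMessage(Role.User, new IMessage[] { text, image });
    }
}
```

In practice the bytes would usually come from a file, for example `File.ReadAllBytes("background.png")`, where the path is a placeholder.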
Step 5: Generate Response
To generate a response, use one of the overloads of @AutoGen.Core.AgentExtension.SendAsync*. The following code shows how to generate a response to an image message:
[!code-csharpGenerate Response]
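A minimal sketch of two ways to call `SendAsync` with an image, assuming the `string` and `IMessage` overloads on `IAgent` exposed by AutoGen.Core. The class and method names are hypothetical, and `agent` stands for any agent whose model accepts image input, such as the one created in Step 3.

```csharp
using System.Threading.Tasks;
using AutoGen.Core;

// Hypothetical helper class, for illustration only.
public static class ImageChatSketch
{
    // Sends a text question and supplies the image as prior chat history.
    public static async Task<IMessage> AskAboutImageAsync(IAgent agent, ImageMessage image, string question)
    {
        return await agent.SendAsync(question, chatHistory: new IMessage[] { image });
    }

    // Sends text and image together as a single multimodal message.
    public static async Task<IMessage> AskWithMultiModalAsync(IAgent agent, MultiModalMessage message)
    {
        return await agent.SendAsync(message);
    }
}
```

Both calls return the agent's reply as an `IMessage`. Which one fits better depends on whether the question and the image are naturally one turn or the image is context from earlier in the conversation.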