
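# Configure Model and Provider

Midscene uses the OpenAI SDK by default to call AI services. Using this SDK fixes the shape of the request and response parameters, but it does not mean you can only use OpenAI's models: you can use any model service that is compatible with this interface (most platforms and tools are).

This article shows how to configure an AI provider and how to select a different model. You may want to read [Choose a Model](./choose-a-model) first to learn how to pick one.

# Choose an AI Model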


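## Configuration

### General configuration

You can customize the configuration through environment variables. The same settings can also be used in the [Chrome extension](./quick-experience).

The most commonly used settings are listed below; `OPENAI_API_KEY` is required:

| Name | Description |
|------|-------------|
| `OPENAI_API_KEY` | Required. Your OpenAI API key (e.g. "sk-abcdefghijklmnopqrstuvwxyz") |
| `OPENAI_BASE_URL` | Optional. The API endpoint URL. Commonly used to switch to another model service, e.g. `https://some_service_name.com/v1` |
| `MIDSCENE_MODEL_NAME` | Optional. Specifies a different model name (default is gpt-4o). Commonly used to switch to another model service |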

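Extra configuration for the `Qwen 2.5 VL` model:

| Name | Description |
|------|-------------|
| `MIDSCENE_USE_QWEN_VL` | Set to "1" to adapt to the Qwen 2.5 VL model |

Extra configuration for the `UI-TARS` model:

| Name | Description |
|------|-------------|
| `MIDSCENE_USE_VLM_UI_TARS` | Specifies the UI-TARS version. Supported values: `1.0`, `1.5`, `DOUBAO` (the Volcano Engine version) |

Extra configuration for the `Gemini 2.5 Pro` model:

| Name | Description |
|------|-------------|
| `MIDSCENE_USE_GEMINI` | Set to "1" to adapt to the Gemini 2.5 Pro model |

For more information about the models, see [Choose a Model](./choose-a-model).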
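
As a rough illustration of how the general settings combine, the sketch below resolves them from a plain key/value map, applying the documented default model name and rejecting a missing API key. The `resolveModelConfig` helper and the `ModelConfig` type are hypothetical names used only for this example; they are not part of Midscene's API.

```typescript
// Hypothetical helper for illustration; not part of Midscene.
interface ModelConfig {
  apiKey: string;
  baseURL?: string;
  modelName: string;
}

function resolveModelConfig(
  env: Record<string, string | undefined>,
): ModelConfig {
  const apiKey = env.OPENAI_API_KEY;
  if (!apiKey) {
    // OPENAI_API_KEY is the only required setting in the general table.
    throw new Error("OPENAI_API_KEY is required");
  }
  return {
    apiKey,
    // Optional: left undefined when no custom endpoint is set.
    baseURL: env.OPENAI_BASE_URL || undefined,
    // Optional: falls back to the documented default model.
    modelName: env.MIDSCENE_MODEL_NAME || "gpt-4o",
  };
}

const config = resolveModelConfig({ OPENAI_API_KEY: "sk-..." });
console.log(config.modelName);
```

In a Node.js script you could pass `process.env` instead of the literal object.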

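### Advanced configuration

There are also some advanced settings, which you usually do not need.

| Name | Description |
|------|-------------|
| `OPENAI_USE_AZURE` | Optional. Set to "true" to use Azure OpenAI Service. See below for details |
| `MIDSCENE_OPENAI_INIT_CONFIG_JSON` | Optional. The initialization config JSON for the OpenAI SDK |
| `MIDSCENE_OPENAI_SOCKS_PROXY` | Optional. Proxy configuration (e.g. "socks5://127.0.0.1:1080") |
| `MIDSCENE_PREFERRED_LANGUAGE` | Optional. The language of the model's response. Defaults to `Chinese` if the current timezone is GMT+8, otherwise `English` |
| `OPENAI_MAX_TOKENS` | Optional. The max_tokens value for the model's response |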

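### Debug configuration

Set the following to print more logs for debugging. These logs are also written to the `./midscene_run/log` folder.

| Name | Description |
|------|-------------|
| `DEBUG=midscene:ai:profile:stats` | Optional. Set this to print the time and token usage of AI service calls, comma-separated for easy analysis |
| `DEBUG=midscene:ai:profile:detail` | Optional. Set this to print detailed AI token usage information |
| `DEBUG=midscene:ai:call` | Optional. Set this to print AI response details |
| `DEBUG=midscene:android:adb` | Optional. Set this to print details of Android adb command calls |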

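## Two ways to set environment variables

Pick one of the following ways to set the environment variables.

### Option 1: set environment variables in your system

```bash
# Replace with your own API key
export OPENAI_API_KEY="sk-abcdefghijklmnopqrstuvwxyz"
```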

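### Option 2: set environment variables with dotenv

Our [demo project](https://github.com/web-infra-dev/midscene-example) uses this approach.

[Dotenv](https://www.npmjs.com/package/dotenv) is a zero-dependency npm package that loads environment variables from a `.env` file into `process.env`.

```bash
# Install dotenv
npm install dotenv --save
```

Create a `.env` file in the project root and add the following content. Note that you do not need to prefix each line with `export`.

```bash
OPENAI_API_KEY="sk-abcdefghijklmnopqrstuvwxyz"
```

Import the dotenv module in your script. Once imported, it automatically reads the environment variables from the `.env` file.

```typescript
import 'dotenv/config';
```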
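
To make the `.env` format concrete, here is a deliberately simplified sketch of turning `KEY="value"` lines into a key/value map. The `parseEnvFile` function is a hypothetical illustration, not dotenv's actual implementation; the real package handles more cases (for example multi-line values and escapes).

```typescript
// Simplified illustration only; not the dotenv implementation.
function parseEnvFile(content: string): Record<string, string> {
  const result: Record<string, string> = {};
  for (const rawLine of content.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Skip blank lines and comment lines.
    if (line === "" || line.charAt(0) === "#") continue;
    const eq = line.indexOf("=");
    // Skip lines without a key before the "=" sign.
    if (eq <= 0) continue;
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    // Strip one pair of matching surrounding quotes, if present.
    const first = value.charAt(0);
    const last = value.charAt(value.length - 1);
    if (value.length >= 2 && (first === '"' || first === "'") && first === last) {
      value = value.slice(1, -1);
    }
    result[key] = value;
  }
  return result;
}

const parsed = parseEnvFile(
  '# my settings\nOPENAI_API_KEY="sk-abcdefghijklmnopqrstuvwxyz"\n',
);
console.log(parsed.OPENAI_API_KEY);
```

In practice, `import 'dotenv/config'` does this work for you and writes the values into `process.env`.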

## Configuration for Azure OpenAI Service

### Using the ADT token provider

This mode cannot run in the browser extension.

```bash
# Set to 1 when using Azure OpenAI Service
export MIDSCENE_USE_AZURE_OPENAI=1

export MIDSCENE_AZURE_OPENAI_SCOPE="https://cognitiveservices.azure.com/.default"
export AZURE_OPENAI_ENDPOINT="..."
export AZURE_OPENAI_API_VERSION="2024-05-01-preview"
export AZURE_OPENAI_DEPLOYMENT="gpt-4o"
```

### Using keyless mode

```bash
export MIDSCENE_USE_AZURE_OPENAI=1
export AZURE_OPENAI_ENDPOINT="..."
export AZURE_OPENAI_KEY="..."
export AZURE_OPENAI_API_VERSION="2024-05-01-preview"
export AZURE_OPENAI_DEPLOYMENT="gpt-4o"
```
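
Before running, it can help to check that the Azure variables are all present. The sketch below is a hypothetical pre-flight check: the variable names come from the two examples above, while the `missingAzureVars` helper and the rule that either `MIDSCENE_AZURE_OPENAI_SCOPE` or `AZURE_OPENAI_KEY` must be set are assumptions inferred from those examples, not documented Midscene behavior.

```typescript
// Variables that appear in both Azure examples above.
const AZURE_COMMON_VARS: string[] = [
  "MIDSCENE_USE_AZURE_OPENAI",
  "AZURE_OPENAI_ENDPOINT",
  "AZURE_OPENAI_API_VERSION",
  "AZURE_OPENAI_DEPLOYMENT",
];

// Hypothetical helper: returns the names of settings that are not set.
function missingAzureVars(
  env: Record<string, string | undefined>,
): string[] {
  const missing = AZURE_COMMON_VARS.filter((name) => !env[name]);
  // Assumption: each example sets exactly one of these two variables.
  if (!env.MIDSCENE_AZURE_OPENAI_SCOPE && !env.AZURE_OPENAI_KEY) {
    missing.push("MIDSCENE_AZURE_OPENAI_SCOPE or AZURE_OPENAI_KEY");
  }
  return missing;
}

const report = missingAzureVars({
  MIDSCENE_USE_AZURE_OPENAI: "1",
  AZURE_OPENAI_ENDPOINT: "...",
  AZURE_OPENAI_KEY: "...",
});
console.log(report.join(", "));
```

An empty result means every variable from the chosen example is present.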

## Configure the AI service with JavaScript

You can also configure the AI service from JavaScript before running your Midscene code.

```typescript
import { overrideAIConfig } from "@midscene/web/puppeteer";
// or: import { overrideAIConfig } from "@midscene/web/playwright";
// or: import { overrideAIConfig } from "@midscene/android";

overrideAIConfig({
  MIDSCENE_MODEL_NAME: "...",
  // ...
});
```
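
As one possible way to organize such overrides, the sketch below builds the settings object for the Qwen example shown later on this page. The `qwenOverrides` helper is hypothetical, and passing every key as a string (including `"1"` for the flag) is an assumption that mirrors the environment variable form.

```typescript
// Hypothetical helper; only the keys and values come from this page.
function qwenOverrides(apiKey: string): Record<string, string> {
  return {
    OPENAI_API_KEY: apiKey,
    OPENAI_BASE_URL: "https://dashscope.aliyuncs.com/compatible-mode/v1",
    MIDSCENE_MODEL_NAME: "qwen-vl-max-latest",
    // Assumption: flags are passed as strings, like environment variables.
    MIDSCENE_USE_QWEN_VL: "1",
  };
}

const overrides = qwenOverrides("sk-...");
console.log(Object.keys(overrides).join(", "));

// With Midscene installed, the object would then be handed over:
// overrideAIConfig(overrides);
```

Keeping each provider's settings in a small function like this makes it easier to switch models in one place.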

## Example: using OpenAI's `gpt-4o` model

Set the environment variables:

```bash
export OPENAI_API_KEY="sk-..."
export OPENAI_BASE_URL="https://endpoint.some_other_provider.com/v1" # Optional, if you want an endpoint other than OpenAI's official one
export MIDSCENE_MODEL_NAME="gpt-4o-2024-11-20" # Optional, defaults to "gpt-4o"
```

## Example: using the official Alibaba Cloud `qwen-vl-max-latest` model

Set the environment variables:

```bash
export OPENAI_API_KEY="sk-..."
export OPENAI_BASE_URL="https://dashscope.aliyuncs.com/compatible-mode/v1"
export MIDSCENE_MODEL_NAME="qwen-vl-max-latest"
export MIDSCENE_USE_QWEN_VL=1
```

## Example: using the Doubao-1.5-thinking-vision-pro model

Set the environment variables:

```bash
export OPENAI_BASE_URL="https://ark-cn-beijing.bytedance.net/api/v3"
export OPENAI_API_KEY="..."
export MIDSCENE_MODEL_NAME='ep-...'
export MIDSCENE_USE_DOUBAO_VISION=1
```

## Example: using the UI-TARS model

Set the environment variables:

```bash
export OPENAI_BASE_URL="http://localhost:1234/v1"
export MIDSCENE_MODEL_NAME="ui-tars-72b-sft"
export MIDSCENE_USE_VLM_UI_TARS=1
```

## Example: using Anthropic's `claude-3-opus-20240229` model

When `MIDSCENE_USE_ANTHROPIC_SDK=1` is set, Midscene uses the Anthropic SDK (`@anthropic-ai/sdk`) to call the model.

Set the environment variables:

```bash
export MIDSCENE_USE_ANTHROPIC_SDK=1
export ANTHROPIC_API_KEY="....."
export MIDSCENE_MODEL_NAME="claude-3-opus-20240229"
```

## Debugging LLM service connection issues

To debug LLM service connection issues, you can use the `connectivity-test` directory in the example project: [https://github.com/web-infra-dev/midscene-example/tree/main/connectivity-test](https://github.com/web-infra-dev/midscene-example/tree/main/connectivity-test)

Put your `.env` file in the `connectivity-test` folder, then run `npm i && npm run test` to see what is going wrong.