midscene/apps/site/docs/zh/caching.mdx

102 lines
3.0 KiB
Plaintext
Raw Permalink Normal View History

2024-08-05 16:27:02 +08:00
# 缓存
feat(web): use xpath and yaml as cache (#711) * feat(web-integration): use xpath for cache instead of id * feat(web-integration): enhance TaskCache to support xpaths for cache matching and add new test cases * feat(web-integration): add debug log for unknown page types in TaskCache * feat(web-integration): update caching logic and cache hit conditions for Plan and Locate tasks * chore(core): update debug log * feat(web-integration): update rspress.config and enhance TaskCache structure with new properties * feat(web-integration): recalculate id when hit cache * fix(web-integration): update mock implementation in task-cache test to use evaluate method * feat(web-integration): enhance element caching by adding XPath support and improving cache hit logic * chore(core): lint * feat(web-integration): improve XPath handling in web-extractor * test(web-integration): fix tests * feat(core, web-integration): add attributes to LocateResultElement and enhance element handling * fix(core): lint * feat(web-integration): add midsceneVersion to TaskCache and update cache validation logic * fix(core): test * fix(web-integration): update cache validation logic to prevent reading outdated midscene cache files * feat(web-integration): enhance TaskCache to track used cache items and improve cache retrieval logic * fix(core): xpath logic (#710) * feat(core): resue context for locate * feat(core): build yamlFlow from aiAction * feat(core): refine task-cache * feat(core): update cache * feat(core): refine task-cache * feat(core): refine task-cache * feat(core): remove unused checkElementExistsByXPath * feat(core): use yaml file as cache * chore(core): fix lint * chore(core): print warning for previous cache * refactor(core): remove quickAnswer references and improve element matching logic * fix(core): update import path for buildYamlFlowFromPlans * chore(web-integration): update output image and skip task error test * fix(web-integration): update test snapshots to handle beta versions * fix(web-integration): adjust test snapshots for version consistency * fix(web-integration): track original cache length and adjust matching logic in tests * fix(web-integration): update test URLs to reflect new target site and enable previously skipped test * chore(core): update cache docs * fix(core): test * feat(core): try to match element from plan * fix(web-integration): cache id stable when retry in palywright * fix(web-integration): typo * style(web-integration): lint * fix(web-integration): stable cacheid in tests * fix(web-integration): cache id --------- Co-authored-by: quanruzhuoxiu <quanruzhuoxiu@gmail.com>
2025-05-16 17:16:56 +08:00
Midscene 支持缓存 Plan 的步骤与匹配到的 DOM 元素信息,减少 AI 模型的调用次数,从而大幅提升执行效率。
Android 自动化任务不支持缓存策略。
**效果**
2024-08-05 16:27:02 +08:00
通过引入缓存后用例的执行时间大幅降低了例如从51秒降低到了28秒。
2024-08-05 16:27:02 +08:00
* **before**
![](/cache/no-cache-time.png)
* **after**
![](/cache/use-cache-time.png)
## 使用方式
2024-08-05 16:27:02 +08:00
想要启用缓存特性,有两个关键点:
2024-08-05 16:27:02 +08:00
feat(web): use xpath and yaml as cache (#711) * feat(web-integration): use xpath for cache instead of id * feat(web-integration): enhance TaskCache to support xpaths for cache matching and add new test cases * feat(web-integration): add debug log for unknown page types in TaskCache * feat(web-integration): update caching logic and cache hit conditions for Plan and Locate tasks * chore(core): update debug log * feat(web-integration): update rspress.config and enhance TaskCache structure with new properties * feat(web-integration): recalculate id when hit cache * fix(web-integration): update mock implementation in task-cache test to use evaluate method * feat(web-integration): enhance element caching by adding XPath support and improving cache hit logic * chore(core): lint * feat(web-integration): improve XPath handling in web-extractor * test(web-integration): fix tests * feat(core, web-integration): add attributes to LocateResultElement and enhance element handling * fix(core): lint * feat(web-integration): add midsceneVersion to TaskCache and update cache validation logic * fix(core): test * fix(web-integration): update cache validation logic to prevent reading outdated midscene cache files * feat(web-integration): enhance TaskCache to track used cache items and improve cache retrieval logic * fix(core): xpath logic (#710) * feat(core): resue context for locate * feat(core): build yamlFlow from aiAction * feat(core): refine task-cache * feat(core): update cache * feat(core): refine task-cache * feat(core): refine task-cache * feat(core): remove unused checkElementExistsByXPath * feat(core): use yaml file as cache * chore(core): fix lint * chore(core): print warning for previous cache * refactor(core): remove quickAnswer references and improve element matching logic * fix(core): update import path for buildYamlFlowFromPlans * chore(web-integration): update output image and skip task error test * fix(web-integration): update test snapshots to handle beta versions * fix(web-integration): adjust test snapshots for version consistency * fix(web-integration): track original cache length and adjust matching logic in tests * fix(web-integration): update test URLs to reflect new target site and enable previously skipped test * chore(core): update cache docs * fix(core): test * feat(core): try to match element from plan * fix(web-integration): cache id stable when retry in palywright * fix(web-integration): typo * style(web-integration): lint * fix(web-integration): stable cacheid in tests * fix(web-integration): cache id --------- Co-authored-by: quanruzhuoxiu <quanruzhuoxiu@gmail.com>
2025-05-16 17:16:56 +08:00
1. 设置 `MIDSCENE_CACHE=1` 环境变量,用以启用缓存匹配
2. 设置 `cacheId` 来指定缓存文件名。在 Playwright 和 Yaml 模式下,`cacheId` 会自动设置为测试文件名,在 Javascript 模式下,需要手动设置 `cacheId`。
2024-08-05 16:27:02 +08:00
### Playwright
2024-08-05 16:27:02 +08:00
在 Playwright 模式下,只需要设置 `MIDSCENE_CACHE=1` 环境变量即可。
2024-08-05 16:27:02 +08:00
`cacheId` 会自动设置为测试文件名。
2024-08-05 16:27:02 +08:00
```diff
- playwright test --config=playwright.config.ts
+ MIDSCENE_CACHE=1 playwright test --config=playwright.config.ts
2024-08-05 16:27:02 +08:00
```
### Javascript agent, 例如 PuppeteerAgent, AgentOverChromeBridge
2024-08-05 16:27:02 +08:00
在 Javascript 模式下,需要设置 `MIDSCENE_CACHE=1` 环境变量,并且需要手动设置 `cacheId`。
2024-08-05 16:27:02 +08:00
```diff
- tsx demo.ts
+ MIDSCENE_CACHE=1 tsx demo.ts
```
2024-08-05 16:27:02 +08:00
```javascript
const mid = new PuppeteerAgent(originPage, {
cacheId: 'puppeteer-swag-sab', // 增加缓存标识
});
```
2024-08-05 16:27:02 +08:00
### Yaml
2024-08-05 16:27:02 +08:00
在 Yaml 模式下,需要设置 `MIDSCENE_CACHE=1` 环境变量。
2024-08-05 16:27:02 +08:00
`cacheId` 会自动设置为 yaml 文件名。
2024-08-05 16:27:02 +08:00
```diff
- npx midscene ./bing-search.yaml
+ # 增加缓存标识, cacheId 为 yaml 文件名
+ MIDSCENE_CACHE=1 npx midscene ./bing-search.yaml
2024-08-05 16:27:02 +08:00
```
## 缓存策略
feat(web): use xpath and yaml as cache (#711) * feat(web-integration): use xpath for cache instead of id * feat(web-integration): enhance TaskCache to support xpaths for cache matching and add new test cases * feat(web-integration): add debug log for unknown page types in TaskCache * feat(web-integration): update caching logic and cache hit conditions for Plan and Locate tasks * chore(core): update debug log * feat(web-integration): update rspress.config and enhance TaskCache structure with new properties * feat(web-integration): recalculate id when hit cache * fix(web-integration): update mock implementation in task-cache test to use evaluate method * feat(web-integration): enhance element caching by adding XPath support and improving cache hit logic * chore(core): lint * feat(web-integration): improve XPath handling in web-extractor * test(web-integration): fix tests * feat(core, web-integration): add attributes to LocateResultElement and enhance element handling * fix(core): lint * feat(web-integration): add midsceneVersion to TaskCache and update cache validation logic * fix(core): test * fix(web-integration): update cache validation logic to prevent reading outdated midscene cache files * feat(web-integration): enhance TaskCache to track used cache items and improve cache retrieval logic * fix(core): xpath logic (#710) * feat(core): resue context for locate * feat(core): build yamlFlow from aiAction * feat(core): refine task-cache * feat(core): update cache * feat(core): refine task-cache * feat(core): refine task-cache * feat(core): remove unused checkElementExistsByXPath * feat(core): use yaml file as cache * chore(core): fix lint * chore(core): print warning for previous cache * refactor(core): remove quickAnswer references and improve element matching logic * fix(core): update import path for buildYamlFlowFromPlans * chore(web-integration): update output image and skip task error test * fix(web-integration): update test snapshots to handle beta versions * fix(web-integration): adjust test snapshots for version consistency * fix(web-integration): track original cache length and adjust matching logic in tests * fix(web-integration): update test URLs to reflect new target site and enable previously skipped test * chore(core): update cache docs * fix(core): test * feat(core): try to match element from plan * fix(web-integration): cache id stable when retry in palywright * fix(web-integration): typo * style(web-integration): lint * fix(web-integration): stable cacheid in tests * fix(web-integration): cache id --------- Co-authored-by: quanruzhuoxiu <quanruzhuoxiu@gmail.com>
2025-05-16 17:16:56 +08:00
缓存内容会保存到 `./midscene_run/cache` 目录下,以 `.cache.yaml` 为扩展名。
2024-08-05 16:27:02 +08:00
feat(web): use xpath and yaml as cache (#711) * feat(web-integration): use xpath for cache instead of id * feat(web-integration): enhance TaskCache to support xpaths for cache matching and add new test cases * feat(web-integration): add debug log for unknown page types in TaskCache * feat(web-integration): update caching logic and cache hit conditions for Plan and Locate tasks * chore(core): update debug log * feat(web-integration): update rspress.config and enhance TaskCache structure with new properties * feat(web-integration): recalculate id when hit cache * fix(web-integration): update mock implementation in task-cache test to use evaluate method * feat(web-integration): enhance element caching by adding XPath support and improving cache hit logic * chore(core): lint * feat(web-integration): improve XPath handling in web-extractor * test(web-integration): fix tests * feat(core, web-integration): add attributes to LocateResultElement and enhance element handling * fix(core): lint * feat(web-integration): add midsceneVersion to TaskCache and update cache validation logic * fix(core): test * fix(web-integration): update cache validation logic to prevent reading outdated midscene cache files * feat(web-integration): enhance TaskCache to track used cache items and improve cache retrieval logic * fix(core): xpath logic (#710) * feat(core): resue context for locate * feat(core): build yamlFlow from aiAction * feat(core): refine task-cache * feat(core): update cache * feat(core): refine task-cache * feat(core): refine task-cache * feat(core): remove unused checkElementExistsByXPath * feat(core): use yaml file as cache * chore(core): fix lint * chore(core): print warning for previous cache * refactor(core): remove quickAnswer references and improve element matching logic * fix(core): update import path for buildYamlFlowFromPlans * chore(web-integration): update output image and skip task error test * fix(web-integration): update test snapshots to handle beta versions * fix(web-integration): adjust test snapshots for version consistency * fix(web-integration): track original cache length and adjust matching logic in tests * fix(web-integration): update test URLs to reflect new target site and enable previously skipped test * chore(core): update cache docs * fix(core): test * feat(core): try to match element from plan * fix(web-integration): cache id stable when retry in palywright * fix(web-integration): typo * style(web-integration): lint * fix(web-integration): stable cacheid in tests * fix(web-integration): cache id --------- Co-authored-by: quanruzhuoxiu <quanruzhuoxiu@gmail.com>
2025-05-16 17:16:56 +08:00
缓存内容分为两类:
2024-08-05 16:27:02 +08:00
feat(web): use xpath and yaml as cache (#711) * feat(web-integration): use xpath for cache instead of id * feat(web-integration): enhance TaskCache to support xpaths for cache matching and add new test cases * feat(web-integration): add debug log for unknown page types in TaskCache * feat(web-integration): update caching logic and cache hit conditions for Plan and Locate tasks * chore(core): update debug log * feat(web-integration): update rspress.config and enhance TaskCache structure with new properties * feat(web-integration): recalculate id when hit cache * fix(web-integration): update mock implementation in task-cache test to use evaluate method * feat(web-integration): enhance element caching by adding XPath support and improving cache hit logic * chore(core): lint * feat(web-integration): improve XPath handling in web-extractor * test(web-integration): fix tests * feat(core, web-integration): add attributes to LocateResultElement and enhance element handling * fix(core): lint * feat(web-integration): add midsceneVersion to TaskCache and update cache validation logic * fix(core): test * fix(web-integration): update cache validation logic to prevent reading outdated midscene cache files * feat(web-integration): enhance TaskCache to track used cache items and improve cache retrieval logic * fix(core): xpath logic (#710) * feat(core): resue context for locate * feat(core): build yamlFlow from aiAction * feat(core): refine task-cache * feat(core): update cache * feat(core): refine task-cache * feat(core): refine task-cache * feat(core): remove unused checkElementExistsByXPath * feat(core): use yaml file as cache * chore(core): fix lint * chore(core): print warning for previous cache * refactor(core): remove quickAnswer references and improve element matching logic * fix(core): update import path for buildYamlFlowFromPlans * chore(web-integration): update output image and skip task error test * fix(web-integration): update test snapshots to handle beta versions * fix(web-integration): adjust test snapshots for version consistency * fix(web-integration): track original cache length and adjust matching logic in tests * fix(web-integration): update test URLs to reflect new target site and enable previously skipped test * chore(core): update cache docs * fix(core): test * feat(core): try to match element from plan * fix(web-integration): cache id stable when retry in palywright * fix(web-integration): typo * style(web-integration): lint * fix(web-integration): stable cacheid in tests * fix(web-integration): cache id --------- Co-authored-by: quanruzhuoxiu <quanruzhuoxiu@gmail.com>
2025-05-16 17:16:56 +08:00
1. 任务规划结果,例如 `ai` 和 `aiAction` 方法的结果
2. 元素定位后的 XPath 数据,例如 `.aiLocate`, `.aiTap` 等方法的结果
2024-08-05 16:27:02 +08:00
feat(web): use xpath and yaml as cache (#711) * feat(web-integration): use xpath for cache instead of id * feat(web-integration): enhance TaskCache to support xpaths for cache matching and add new test cases * feat(web-integration): add debug log for unknown page types in TaskCache * feat(web-integration): update caching logic and cache hit conditions for Plan and Locate tasks * chore(core): update debug log * feat(web-integration): update rspress.config and enhance TaskCache structure with new properties * feat(web-integration): recalculate id when hit cache * fix(web-integration): update mock implementation in task-cache test to use evaluate method * feat(web-integration): enhance element caching by adding XPath support and improving cache hit logic * chore(core): lint * feat(web-integration): improve XPath handling in web-extractor * test(web-integration): fix tests * feat(core, web-integration): add attributes to LocateResultElement and enhance element handling * fix(core): lint * feat(web-integration): add midsceneVersion to TaskCache and update cache validation logic * fix(core): test * fix(web-integration): update cache validation logic to prevent reading outdated midscene cache files * feat(web-integration): enhance TaskCache to track used cache items and improve cache retrieval logic * fix(core): xpath logic (#710) * feat(core): resue context for locate * feat(core): build yamlFlow from aiAction * feat(core): refine task-cache * feat(core): update cache * feat(core): refine task-cache * feat(core): refine task-cache * feat(core): remove unused checkElementExistsByXPath * feat(core): use yaml file as cache * chore(core): fix lint * chore(core): print warning for previous cache * refactor(core): remove quickAnswer references and improve element matching logic * fix(core): update import path for buildYamlFlowFromPlans * chore(web-integration): update output image and skip task error test * fix(web-integration): update test snapshots to handle beta versions * fix(web-integration): adjust test snapshots for version consistency * fix(web-integration): track original cache length and adjust matching logic in tests * fix(web-integration): update test URLs to reflect new target site and enable previously skipped test * chore(core): update cache docs * fix(core): test * feat(core): try to match element from plan * fix(web-integration): cache id stable when retry in palywright * fix(web-integration): typo * style(web-integration): lint * fix(web-integration): stable cacheid in tests * fix(web-integration): cache id --------- Co-authored-by: quanruzhuoxiu <quanruzhuoxiu@gmail.com>
2025-05-16 17:16:56 +08:00
查询类方法,例如 `aiBoolean`, `aiQuery`, `aiAssert` 的内容不会被缓存。
如果缓存未命中Midscene 将会重新调用 AI 模型,并更新缓存文件。
## 常见问题
2024-08-05 16:27:02 +08:00
feat(web): use xpath and yaml as cache (#711) * feat(web-integration): use xpath for cache instead of id * feat(web-integration): enhance TaskCache to support xpaths for cache matching and add new test cases * feat(web-integration): add debug log for unknown page types in TaskCache * feat(web-integration): update caching logic and cache hit conditions for Plan and Locate tasks * chore(core): update debug log * feat(web-integration): update rspress.config and enhance TaskCache structure with new properties * feat(web-integration): recalculate id when hit cache * fix(web-integration): update mock implementation in task-cache test to use evaluate method * feat(web-integration): enhance element caching by adding XPath support and improving cache hit logic * chore(core): lint * feat(web-integration): improve XPath handling in web-extractor * test(web-integration): fix tests * feat(core, web-integration): add attributes to LocateResultElement and enhance element handling * fix(core): lint * feat(web-integration): add midsceneVersion to TaskCache and update cache validation logic * fix(core): test * fix(web-integration): update cache validation logic to prevent reading outdated midscene cache files * feat(web-integration): enhance TaskCache to track used cache items and improve cache retrieval logic * fix(core): xpath logic (#710) * feat(core): resue context for locate * feat(core): build yamlFlow from aiAction * feat(core): refine task-cache * feat(core): update cache * feat(core): refine task-cache * feat(core): refine task-cache * feat(core): remove unused checkElementExistsByXPath * feat(core): use yaml file as cache * chore(core): fix lint * chore(core): print warning for previous cache * refactor(core): remove quickAnswer references and improve element matching logic * fix(core): update import path for buildYamlFlowFromPlans * chore(web-integration): update output image and skip task error test * fix(web-integration): update test snapshots to handle beta versions * fix(web-integration): adjust test snapshots for version consistency * fix(web-integration): track original cache length and adjust matching logic in tests * fix(web-integration): update test URLs to reflect new target site and enable previously skipped test * chore(core): update cache docs * fix(core): test * feat(core): try to match element from plan * fix(web-integration): cache id stable when retry in palywright * fix(web-integration): typo * style(web-integration): lint * fix(web-integration): stable cacheid in tests * fix(web-integration): cache id --------- Co-authored-by: quanruzhuoxiu <quanruzhuoxiu@gmail.com>
2025-05-16 17:16:56 +08:00
### 如何检查缓存是否命中?
你可以查看报告文件。如果缓存命中,你将看到 `cache` 提示,并且执行时间大幅降低。
### 为什么在 CI 中无法命中缓存?
2024-08-05 16:27:02 +08:00
你需要在 CI 中将缓存文件提交到仓库中,并再次检查缓存命中的条件。
2024-08-05 16:27:02 +08:00
### 如果有了缓存,是否就不需要 AI 服务了?
2024-08-05 16:27:02 +08:00
feat(web): use xpath and yaml as cache (#711) * feat(web-integration): use xpath for cache instead of id * feat(web-integration): enhance TaskCache to support xpaths for cache matching and add new test cases * feat(web-integration): add debug log for unknown page types in TaskCache * feat(web-integration): update caching logic and cache hit conditions for Plan and Locate tasks * chore(core): update debug log * feat(web-integration): update rspress.config and enhance TaskCache structure with new properties * feat(web-integration): recalculate id when hit cache * fix(web-integration): update mock implementation in task-cache test to use evaluate method * feat(web-integration): enhance element caching by adding XPath support and improving cache hit logic * chore(core): lint * feat(web-integration): improve XPath handling in web-extractor * test(web-integration): fix tests * feat(core, web-integration): add attributes to LocateResultElement and enhance element handling * fix(core): lint * feat(web-integration): add midsceneVersion to TaskCache and update cache validation logic * fix(core): test * fix(web-integration): update cache validation logic to prevent reading outdated midscene cache files * feat(web-integration): enhance TaskCache to track used cache items and improve cache retrieval logic * fix(core): xpath logic (#710) * feat(core): resue context for locate * feat(core): build yamlFlow from aiAction * feat(core): refine task-cache * feat(core): update cache * feat(core): refine task-cache * feat(core): refine task-cache * feat(core): remove unused checkElementExistsByXPath * feat(core): use yaml file as cache * chore(core): fix lint * chore(core): print warning for previous cache * refactor(core): remove quickAnswer references and improve element matching logic * fix(core): update import path for buildYamlFlowFromPlans * chore(web-integration): update output image and skip task error test * fix(web-integration): update test snapshots to handle beta versions * fix(web-integration): adjust test snapshots for version consistency * fix(web-integration): track original cache length and adjust matching logic in tests * fix(web-integration): update test URLs to reflect new target site and enable previously skipped test * chore(core): update cache docs * fix(core): test * feat(core): try to match element from plan * fix(web-integration): cache id stable when retry in palywright * fix(web-integration): typo * style(web-integration): lint * fix(web-integration): stable cacheid in tests * fix(web-integration): cache id --------- Co-authored-by: quanruzhuoxiu <quanruzhuoxiu@gmail.com>
2025-05-16 17:16:56 +08:00
不是的。
缓存是加速脚本执行的手段,但它不是确保脚本长期稳定执行的工具。我们注意到,当页面发生变化时,缓存可能会失效(例如当元素 DOM 结构发生变化时。在缓存失效时Midscene 仍然需要调用 AI 服务来重新执行任务。
2024-08-05 16:27:02 +08:00
### 如何手动删除缓存?
2024-08-05 16:27:02 +08:00
你可以删除缓存文件,或者编辑缓存文件的内容。
### 如果我想禁用单个 API 的缓存,怎么办?
你可以使用 `cacheable` 选项来禁用单个 API 的缓存。
具体用法请参考对应 [API](./API.mdx) 的文档。