# Caching

Midscene can cache the AI's planning steps and matched DOM element information, which reduces calls to the AI model and speeds up execution.

**Effect**

With the cache enabled, execution time drops significantly, for example from 39s to 13s.

* **before using cache, 39s**

![](/cache/no-cache-time.png)

* **after using cache, 13s**

![](/cache/use-cache-time.png)

## Instructions

Using the cache involves two key points:

1. Set `MIDSCENE_CACHE=1` as an environment variable.
2. Set `cacheId` to specify the cache file name. It is set automatically in Playwright and YAML modes, but you must set it manually when using the JavaScript SDK.

### Playwright

In Playwright mode, set the `MIDSCENE_CACHE=1` environment variable to enable caching.

The `cacheId` is automatically set to the test file name.

```diff
- playwright test --config=playwright.config.ts
+ MIDSCENE_CACHE=1 playwright test --config=playwright.config.ts
```
### JavaScript agent, like PuppeteerAgent, AgentOverChromeBridge

Enable caching by setting the `MIDSCENE_CACHE=1` environment variable. You must also set `cacheId` to specify the cache identifier.

```diff
- tsx demo.ts
+ MIDSCENE_CACHE=1 tsx demo.ts
```
```javascript
import { PuppeteerAgent } from '@midscene/web/puppeteer';

// `originPage` is a Puppeteer page created elsewhere
const mid = new PuppeteerAgent(originPage, {
  cacheId: 'puppeteer-swag-sab', // specify cache id
});
```
### Yaml
Enable caching by setting the `MIDSCENE_CACHE=1` environment variable. The `cacheId` is automatically set to the YAML file name.

```diff
- npx midscene ./bing-search.yaml
+ # Add cache identifier, cacheId is the yaml filename
+ MIDSCENE_CACHE=1 npx midscene ./bing-search.yaml
```
## Cache strategy
### Cache content

Two types of content are cached:

1. The results of the AI's planning (i.e., the results of the `ai` and `aiAction` methods)
2. The results of the AI's element locating

The results of `aiQuery` and `aiAssert` are never cached, so you can always use them to verify that the AI's task produced the expected result.

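
As a rough illustration of how cached and uncached calls combine in one script, the sketch below chains the three methods mentioned above. The function name `runSearchFlow` and the prompt strings are invented for this example, and `agent` is assumed to be an already-constructed Midscene agent (for instance a `PuppeteerAgent`).

```javascript
// Hypothetical flow: `agent` is assumed to be an existing Midscene agent.
async function runSearchFlow(agent) {
  // Planning and element locating for this step can be served from the cache.
  await agent.aiAction('type "Headphones" in the search box, then press Enter');

  // aiQuery and aiAssert are never cached, so they always check the live page.
  const titles = await agent.aiQuery('string[], titles of the items in the result list');
  await agent.aiAssert('the result list contains at least one item');

  return titles;
}
```

Ending a cached flow with `aiQuery` or `aiAssert` is one way to confirm that the replayed steps still produced the expected page state.
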
### Cache hit conditions

The cache is hit only when all of the following conditions are met:

1. The same `cacheId`
2. The same major and minor version of Midscene
3. The same page URL and screen size

When the cache is used for element-locating tasks, the following conditions must also hold:

1. A DOM element with the same position and size can be found on the page according to the cache file.
2. If you are using a VL model, a DOM element must match the coordinates. Otherwise, you will see a "POSITION NODE" in the report file, which means the result cannot be cached.

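
The first three conditions can be pictured as a simple comparison. The following is only a sketch of that logic, not Midscene's actual implementation: the field names (`cacheId`, `version`, `url`, `width`, `height`) and both helper functions are assumptions made for illustration.

```javascript
// Illustrative only: field names are hypothetical, not Midscene's cache schema.
function majorMinor(version) {
  // '0.12.3' -> '0.12'
  return version.split('.').slice(0, 2).join('.');
}

function canReuseCache(entry, current) {
  return (
    entry.cacheId === current.cacheId &&
    majorMinor(entry.version) === majorMinor(current.version) &&
    entry.url === current.url &&
    entry.width === current.width &&
    entry.height === current.height
  );
}

const recorded = {
  cacheId: 'puppeteer-swag-sab',
  version: '0.12.3',
  url: 'https://example.com/',
  width: 1280,
  height: 800,
};

// A patch-level upgrade keeps the entry usable.
console.log(canReuseCache(recorded, { ...recorded, version: '0.12.9' })); // true
// A different minor version or screen size does not.
console.log(canReuseCache(recorded, { ...recorded, version: '0.13.0' })); // false
console.log(canReuseCache(recorded, { ...recorded, width: 1920 })); // false
```

In practice this means a cache recorded on one viewport size or Midscene minor version should not be expected to hit on another.
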
### If cache is not hit

If the cache is not hit, Midscene calls the AI model again and updates the result in the cache file.

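
This behaviour resembles a read-through cache. The sketch below shows the general pattern under stated assumptions: `planWithCache`, `isStillValid`, and `callModel` are hypothetical names, a `Map` stands in for the cache file, and none of this is Midscene's internal code.

```javascript
// Illustrative read-through cache; `callModel` stands in for a real AI request.
async function planWithCache(cache, prompt, isStillValid, callModel) {
  const cached = cache.get(prompt);
  if (cached !== undefined && isStillValid(cached)) {
    return { result: cached, fromCache: true };
  }
  // Miss (or stale entry): ask the model again and overwrite the stored result.
  const fresh = await callModel(prompt);
  cache.set(prompt, fresh);
  return { result: fresh, fromCache: false };
}
```

Because a miss overwrites the stored entry, a later run against the same page can hit the cache again without any manual cleanup.
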
## Common issues

### Why is the cache missed on CI?

Commit the cache file to the repository (usually in the `./midscene_run/cache` directory), and check the cache hit conditions.

### Does it mean that AI services are no longer needed after using cache?

No. Caching is not a tool for ensuring long-term script stability. The cache can miss when the page changes, for example when an element's position shifts slightly or the DOM structure changes. When a cache miss occurs, AI services are still needed to re-evaluate the task.

### How to manually remove the cache?
You can remove the cache file from the `cache` directory, or edit the contents of the cache file.
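
If you prefer to script the removal, a small Node.js helper along these lines may be enough. It is a minimal sketch: `removeCacheFiles` is a made-up helper, and it assumes that cache file names contain the `cacheId`, which you should verify against your own cache directory before relying on it.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Deletes files in `cacheDir` whose names contain `cacheId`; returns the deleted names.
// Assumption: cache file names include the cacheId. Check your own directory first.
function removeCacheFiles(cacheDir, cacheId) {
  if (!fs.existsSync(cacheDir)) return [];
  const removed = [];
  for (const name of fs.readdirSync(cacheDir)) {
    const file = path.join(cacheDir, name);
    if (name.includes(cacheId) && fs.statSync(file).isFile()) {
      fs.rmSync(file);
      removed.push(name);
    }
  }
  return removed;
}
```

For example, `removeCacheFiles('./midscene_run/cache', 'puppeteer-swag-sab')` would delete only the files matching that identifier and leave other caches in place.
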