import SetupEnv from './common/setup-env.mdx';

# Automate with Scripts in YAML

In most cases, developers write automation just to perform some smoke tests, like checking the appearance of some content, or verifying that the key user path is accessible. Maintaining a large test project is unnecessary in this situation.

Midscene offers a way to do this kind of automation with `.yaml` files, which helps you focus on the script itself instead of the test infrastructure. Any team member can write an automation script without learning any API.

Here is an example of a `.yaml` script. You can probably tell how it works just by reading it.

```yaml
web:
  url: https://www.bing.com

tasks:
  - name: search weather
    flow:
      - ai: search for 'weather today'
      - sleep: 3000

  - name: check result
    flow:
      - aiAssert: the result shows the weather info
```

:::info Demo Project

You can find the demo projects with YAML scripts here:

- [Web](https://github.com/web-infra-dev/midscene-example/tree/main/yaml-scripts-demo)
- [Android](https://github.com/web-infra-dev/midscene-example/tree/main/android/yaml-scripts-demo)

:::

<SetupEnv />

Alternatively, you can store the configuration in a `.env` file located in the directory where you run the command. The Midscene command line tool loads it automatically.

```env filename=.env
OPENAI_API_KEY="sk-abcdefghijklmnopqrstuvwxyz"
```

## Install Command Line Tool

Install `@midscene/cli` globally:

```bash
npm i -g @midscene/cli
# or if you prefer a project-wide installation
npm i @midscene/cli --save-dev
```

Write a YAML file named `bing-search.yaml` to automate a web browser:

```yaml
web:
  url: https://www.bing.com

tasks:
  - name: search weather
    flow:
      - ai: search for 'weather today'
      - sleep: 3000
      - aiAssert: the result shows the weather info
```

Or, to automate an Android device connected via adb:

```yaml
android:
  # launch: https://www.bing.com
  deviceId: s4ey59

tasks:
  - name: search weather
    flow:
      - ai: open browser and navigate to bing.com
      - ai: search for 'weather today'
      - sleep: 3000
      - aiAssert: the result shows the weather info
```

Run this script:

```bash
midscene ./bing-search.yaml
# or if you installed midscene inside the project
npx midscene ./bing-search.yaml
```

The output shows the progress of the run and the report file.

## Command line usage

### Run single `.yaml` file

```bash
midscene /path/to/yaml
```

### Run all `.yaml` files under a folder

```bash
midscene /dir/of/yaml/

# glob is also supported
midscene /dir/**/yaml/
```

## YAML file schema

A `.yaml` file has two parts: `web`/`android` and `tasks`.

The `web`/`android` part defines the basics of a task. Use the `web` parameter (previously named `target`) for web browser automation, and use the `android` parameter for Android device automation. They are mutually exclusive.

### The `web` part

```yaml
web:
  # The URL to visit, required. If `serve` is provided, provide the path to the file to visit
  url: <url>

  # Serve the local path as a static server, optional
  serve: <root-directory>

  # The user agent to use, optional
  userAgent: <ua>

  # number, the viewport width, default is 1280, optional
  viewportWidth: <width>

  # number, the viewport height, default is 960, optional
  viewportHeight: <height>

  # number, the device scale factor (dpr), default is 1, optional
  deviceScaleFactor: <scale>

  # string, the path to the json format cookie file, optional
  cookie: <path-to-cookie-file>

  # object, the strategy to wait for network idle, optional
  waitForNetworkIdle:
    # number, the timeout in milliseconds, default is 2000ms, optional
    timeout: <ms>
    # boolean, whether to continue on network idle error, default is true
    continueOnNetworkIdleError: <boolean>

  # string, the path to save the aiQuery result, optional
  output: <path-to-output-file>

  # boolean, whether to limit popups to the current page, default is true in yaml scripts
  forceSameTabNavigation: <boolean>

  # string, the bridge mode to use, optional, default is false, can be 'newTabWithUrl' or 'currentTab'. See the bridge mode section below for details
  bridgeMode: false | 'newTabWithUrl' | 'currentTab'

  # boolean, whether to close the new tabs after the bridge is disconnected, optional, default is false
  closeNewTabsAfterDisconnect: <boolean>

  # boolean, whether to allow insecure https certs, optional, default is false
  acceptInsecureCerts: <boolean>

  # string, the background knowledge to send to the AI model when calling aiAction, optional
  aiActionContext: <string>
```
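
As an illustration, several of the options above might be combined as follows. This is a minimal sketch, not a recommended setup: the `./dist` directory, the `index.html` entry file, and the `./output/result.json` path are hypothetical, and the viewport and timeout values are arbitrary examples.

```yaml
web:
  # hypothetical local build directory, served as a static site
  serve: ./dist
  # because `serve` is set, `url` is the path of the file to visit
  url: index.html

  # example viewport, overriding the 1280 x 960 default
  viewportWidth: 1440
  viewportHeight: 900

  waitForNetworkIdle:
    # example value: wait up to 5000ms instead of the 2000ms default
    timeout: 5000
    # same as the default: continue when a network idle error occurs
    continueOnNetworkIdleError: true

  # hypothetical path for saving aiQuery results
  output: ./output/result.json
```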

### The `android` part

```yaml
android:
  # The device id to use, optional, default is the first connected device
  deviceId: <device-id>

  # The url to launch, optional, default is the current page
  launch: <url>
```
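
For instance, a script that sets both options could look like the sketch below. The device id `s4ey59` is reused from the earlier example and stands in for whatever id your own device reports (for example in the output of `adb devices`).

```yaml
android:
  # placeholder device id, replace it with your own
  deviceId: s4ey59
  # launch this page instead of starting from the current page
  launch: https://www.bing.com

tasks:
  - name: search weather
    flow:
      - ai: search for 'weather today'
      - sleep: 3000
      - aiAssert: the result shows the weather info
```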

### The `tasks` part

The `tasks` part is an array that lists the tasks to run. Remember to write a `-` before each item to mark it as an array item.

The interfaces of the `flow` part are almost the same as the [API](./API.html), except for some parameter levels.

```yaml
tasks:
  - name: <name>
    continueOnError: <boolean> # optional, default is false
    flow:
      # Auto Planning (.ai)
      # ----------------

      # perform an action, this is the shortcut for aiAction
      - ai: <prompt>

      # this is the same as ai
      - aiAction: <prompt>

      # Instant Action(.aiTap, .aiHover, .aiInput, .aiKeyboardPress, .aiScroll)
      # ----------------

      # tap an element located by prompt
      - aiTap: <prompt>
        deepThink: <boolean> # optional, whether to use deepThink to precisely locate the element

      # hover an element located by prompt
      - aiHover: <prompt>
        deepThink: <boolean> # optional, whether to use deepThink to precisely locate the element

      # input text into an element located by prompt
      - aiInput: <final text content of the input>
        locate: <prompt>
        deepThink: <boolean> # optional, whether to use deepThink to precisely locate the element

      # press a key (like Enter, Tab, Escape, etc.) on an element located by prompt
      - aiKeyboardPress: <key>
        locate: <prompt>
        deepThink: <boolean> # optional, whether to use deepThink to precisely locate the element

      # scroll globally or on an element located by prompt
      - aiScroll:
        direction: 'up' # or 'down' | 'left' | 'right'
        scrollType: 'once' # or 'untilTop' | 'untilBottom' | 'untilLeft' | 'untilRight'
        distance: <number> # optional, distance to scroll in px
        locate: <prompt> # optional, the element to scroll on
        deepThink: <boolean> # optional, whether to use deepThink to precisely locate the element

      # Data Extraction
      # ----------------

      # perform a query, return a json object
      - aiQuery: <prompt> # remember to describe the format of the result in the prompt
        name: <name> # the name of the result, will be used as the key in the output json

      # More APIs
      # ----------------

      # wait for a condition to be met with a timeout (ms, optional, default 30000)
      - aiWaitFor: <prompt>
        timeout: <ms>

      # perform an assertion
      - aiAssert: <prompt>

      # sleep for a number of milliseconds
      - sleep: <ms>

      # evaluate a javascript expression in web page context
      - javascript: <javascript>
        name: <name> # assign a name to the return value, will be used as the key in the output json, optional

  - name: <name>
    flow:
      # ...
```
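
To make the schema more concrete, here is a sketch of a script that mixes instant actions, a wait, a query, and an assertion. The prompts, the 10000ms timeout, the result name `weather`, and the output path are illustrative assumptions rather than required values.

```yaml
web:
  url: https://www.bing.com
  # hypothetical path, the aiQuery result below is saved here
  output: ./output/weather.json

tasks:
  - name: search weather
    flow:
      # type the text into the element described by `locate`
      - aiInput: weather today
        locate: the search input box
      # press Enter on the same element
      - aiKeyboardPress: Enter
        locate: the search input box
      # example timeout: wait up to 10000ms instead of the 30000ms default
      - aiWaitFor: the search results are displayed
        timeout: 10000

  - name: collect result
    # do not stop the run if this task fails
    continueOnError: true
    flow:
      # the prompt describes the expected format, `weather` becomes the key in the output json
      - aiQuery: "{temperature: string, condition: string}, the current weather shown in the result"
        name: weather
      - aiAssert: the result shows the weather info
```

The `aiQuery` prompt is quoted because it contains `{`, `}` and `:` characters, which YAML would otherwise try to parse as a mapping.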

## More features

### Use environment variables in `.yaml` file

You can use environment variables in a `.yaml` file with the `${variable-name}` syntax.

For example, if you have a `.env` file with the following content:

```env filename=.env
topic=weather today
```

You can use the environment variable in the `.yaml` file like this:

```yaml
#...
- ai: type ${topic} in input box
#...
```
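
Put together, a complete script using this variable might look like the sketch below. It assumes the `.env` file shown above sits in the directory where the command is run. The task name is made up for illustration.

```yaml
web:
  url: https://www.bing.com

tasks:
  - name: search by topic
    flow:
      # the placeholder below is filled in from the `topic` environment variable
      - ai: type ${topic} in input box
      - sleep: 3000
```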

### Debug in headed mode

> `web` scenario only

'Headed mode' means the browser window is visible. By default, scripts run in headless mode.

To turn on headed mode, use the `--headed` option. If you also want to keep the browser window open after the script finishes, use the `--keep-window` option. `--keep-window` implies `--headed`.

Headed mode consumes more resources, so we recommend using it locally and only when needed.

```bash
# run in headed mode
midscene /path/to/yaml --headed

# run in headed mode and keep the browser window open after the script finishes
midscene /path/to/yaml --keep-window
```

### Use bridge mode

> `web` scenario only

With bridge mode, you can use YAML scripts to automate the web browser on your desktop. This is particularly useful if you want to reuse cookies, plugins, and page states, or if you want to manually interact with automation scripts.

To use bridge mode, install the Chrome extension first, then use this configuration in the `web` section (previously named `target`):

```diff
web:
  url: https://www.bing.com
+ bridgeMode: newTabWithUrl
```

See [Bridge Mode by Chrome Extension](./bridge-mode-by-chrome-extension) for more details.
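
As a fuller sketch, a bridge mode script could also set `closeNewTabsAfterDisconnect` from the `web` schema. The tasks here are reused from the earlier Bing example, and the comments paraphrase the schema descriptions rather than document exact behavior.

```yaml
web:
  url: https://www.bing.com
  # connect to the desktop browser through the Chrome extension
  bridgeMode: newTabWithUrl
  # close the new tabs after the bridge is disconnected (default is false)
  closeNewTabsAfterDisconnect: true

tasks:
  - name: search weather
    flow:
      - ai: search for 'weather today'
      - sleep: 3000
      - aiAssert: the result shows the weather info
```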

### Run yaml script with javascript

You can also run a YAML script from JavaScript by using the [`runYaml`](./api.html#runyaml) method of the Midscene agent. Only the `tasks` part of the YAML script will be executed.

## FAQ

**How to get cookies in JSON format from Chrome?**

You can use this [chrome extension](https://chromewebstore.google.com/detail/get-cookiestxt-locally/cclelndahbckbenkjhflpdbgdldlbecc) to export cookies in JSON format.
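
The exported file can then be referenced through the `cookie` option of the `web` part. In this sketch, `./cookies.json` is a hypothetical path to wherever you saved the export.

```yaml
web:
  url: https://www.bing.com
  # hypothetical path to the exported JSON cookie file
  cookie: ./cookies.json
```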

## More

You may also be interested in [Prompting Tips](./prompting-tips).