feat: implement .aiAssert, update some docs (#38)

* feat: implement .aiAssert, update some docs

* fix: lint

* fix: ci

* feat: update quick-start
yuyutaotao 2024-08-06 10:00:25 +08:00 committed by GitHub
parent df12c7ea4d
commit 7edc2be46d
35 changed files with 485 additions and 195 deletions

View File

@ -33,7 +33,6 @@ export default defineConfig({
testDir: './e2e',
+ timeout: 90 * 1000,
+ reporter: [["list"], ["@midscene/web/playwright-report"]],
});
```
@ -58,12 +57,12 @@ import { expect } from "@playwright/test";
import { test } from "./fixture";
test.beforeEach(async ({ page }) => {
page.setViewportSize({ width: 400, height: 905 });
page.setViewportSize({ width: 1280, height: 800 });
await page.goto("https://www.ebay.com");
await page.waitForLoadState("networkidle");
});
test("search headphone on ebay", async ({ ai, aiQuery }) => {
test("search headphone on ebay", async ({ ai, aiQuery, aiAssert }) => {
// 👀 type keywords, perform a search
await ai('type "Headphones" in search box, hit Enter');
@ -73,7 +72,10 @@ test("search headphone on ebay", async ({ ai, aiQuery }) => {
);
console.log("headphones in stock", items);
expect(items?.length).toBeGreaterThan(1);
expect(items?.length).toBeGreaterThan(0);
// 👀 assert by AI
await aiAssert("There is a category filter on the left");
});
```
@ -139,19 +141,23 @@ Promise.resolve(
);
console.log("headphones in stock", items);
// 👀 assert by AI
await mid.aiAssert("There is a category filter on the left");
await browser.close();
})()
);
```
:::tip
You may have noticed that the key code here consists of only two lines, both written in plain language.
You may have noticed that the key code here consists of only three lines, all written in plain language.
```typescript
await mid.aiAction('type "Headphones" in search box, hit Enter');
await mid.aiQuery(
'{itemTitle: string, price: Number}[], find item in list and corresponding price',
);
await mid.aiAssert("There is a category filter on the left");
```
:::

View File

@ -105,14 +105,12 @@ const dataB = await mid.aiQuery('string[], task names in the list');
const dataC = await mid.aiQuery('{name: string, age: string}[], Data Record in the table');
```
### `.aiAssert(conditionPrompt: string, errorMsg?: string)` - do an assertion
### `.aiAssert(assertion: string, errorMsg?: string)` - do an assertion
This method will soon be available in Midscene.
`.aiAssert` works just like the normal `assert` method, except that the condition is a prompt string written in natural language. Midscene will call AI to determine if the `conditionPrompt` is true. If not, a detailed reason will be concatenated to the `errorMsg`.
`.aiAssert` works just like the normal `assert` method, except that the condition is a prompt string written in natural language. Midscene will call AI to determine if the `assertion` is true. If the condition is not met, an error will be thrown containing `errorMsg` and a detailed reason generated by AI.
```typescript
// coming soon
await mid.aiAssert('There should be a searchbox on the page');
```
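The failure behaviour described above can be sketched without calling any model. This is a minimal illustration, not the Midscene implementation: `AssertionVerdict`, `Judge`, `formatFailure`, `assertWith` and `stubJudge` are hypothetical names, and the message layout (the custom message or a default, followed by a `Reason:` line) is modelled on the `aiAssert` method added to `PageAgent` in this commit.

```typescript
// Hypothetical names throughout; none of these are part of the Midscene API.
interface AssertionVerdict {
  pass: boolean; // whether the assertion held
  thought: string; // the judge's explanation
}

type Judge = (assertion: string) => Promise<AssertionVerdict>;

// Build the failure message: the caller's errorMsg (or a default), then the reason.
function formatFailure(
  assertion: string,
  verdict: AssertionVerdict,
  errorMsg?: string,
): string {
  const head = errorMsg || `Assertion failed: ${assertion}`;
  return `${head}\nReason: ${verdict.thought}`;
}

// Resolve quietly when the verdict passes, throw when it does not.
async function assertWith(
  judge: Judge,
  assertion: string,
  errorMsg?: string,
): Promise<void> {
  const verdict = await judge(assertion);
  if (!verdict.pass) {
    throw new Error(formatFailure(assertion, verdict, errorMsg));
  }
}

// A stub judge standing in for the AI call (assumption: a real judge would inspect the page).
const stubJudge: Judge = async (assertion) => ({
  pass: assertion.includes('searchbox'),
  thought: 'stubbed verdict, no model was called',
});

// The stub rejects this assertion, so the custom message and the reason are logged.
assertWith(stubJudge, 'There should be a sidebar on the page', 'sidebar missing').catch(
  (e: Error) => console.log(e.message),
);
```

Keeping the message formatting in a pure function makes the failure text easy to check in isolation from the asynchronous judge.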
## Use LangSmith (Optional)

Binary file not shown. Before: 15 KiB.

Binary file not shown. Before: 100 KiB.

Binary file not shown. Before: 185 KiB.

Binary file not shown. Before: 80 KiB.

Binary file not shown. Before: 3.9 MiB. After: 2.5 MiB.

Binary file not shown. Before: 653 KiB. After: 676 KiB.

View File

@ -34,7 +34,6 @@ export default defineConfig({
testDir: './e2e',
+ timeout: 90 * 1000,
+ reporter: [["list"], ["@midscene/web/playwright-report"]],
});
```
@ -64,7 +63,7 @@ test.beforeEach(async ({ page }) => {
await page.waitForLoadState("networkidle");
});
test("search headphone on ebay", async ({ ai, aiQuery }) => {
test("search headphone on ebay", async ({ ai, aiQuery, aiAssert }) => {
// 👀 type keywords, perform a search
// Note: although this is an English page, you can also control it with Chinese instructions
await ai('在搜索框输入 "Headphones" ,敲回车');
@ -76,6 +75,9 @@ test("search headphone on ebay", async ({ ai, aiQuery }) => {
console.log("headphones in stock", items);
expect(items?.length).toBeGreaterThan(0);
// 👀 assert by AI
await aiAssert("界面左侧有类目筛选功能");
});
```
@ -145,6 +147,9 @@ Promise.resolve(
);
console.log("耳机商品信息", items);
// 👀 assert by AI
await mid.aiAssert("界面左侧有类目筛选功能");
await browser.close();
})()
);
@ -152,13 +157,14 @@ Promise.resolve(
:::tip
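You may have noticed that the key code in the file above is only two lines, all written in natural language.
You may have noticed that the key code in the file above is only three lines, all written in natural language.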
```typescript
await mid.aiAction('在搜索框输入 "Headphones" ,敲回车');
await mid.aiQuery(
'{itemTitle: string, price: Number}[], 找到列表里的商品标题和价格',
);
await mid.aiAssert("界面左侧有类目筛选功能");
```
:::

View File

@ -103,11 +103,13 @@ const dataB = await mid.aiQuery('string[],列表中的任务名称');
const dataC = await mid.aiQuery('{name: string, age: string}[], 表格中的数据记录');
```
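### `.aiAssert(conditionPrompt: string, errorMsg?: string)` - do an assertion
### `.aiAssert(assertion: string, errorMsg?: string)` - do an assertion
This method will be available soon.
`.aiAssert` works like a normal assert method, except that the condition parameter `assertion` can be written in natural language. Midscene will call AI to determine whether the condition is true. If the condition is not met, the SDK throws an error and appends the AI-generated reason after `errorMsg`.
`.aiAssert` works like a normal `assert` method, except that the condition parameter `conditionPrompt` can be written in natural language. Midscene will call AI to determine whether the condition is true. If the condition is met, the detailed reason is appended to `errorMsg`.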
```typescript
await mid.aiAssert('界面中应该有个搜索框');
```
## Use LangSmith (Optional)

View File

@ -4,7 +4,8 @@ import { defineConfig } from 'rspress/config';
export default defineConfig({
root: path.join(__dirname, 'docs'),
title: 'Midscene.js',
description: 'Your AI-Driven UI Compass',
description:
'An AI-powered automation SDK that can control the page, perform assertions, and extract data in JSON format using natural language.',
icon: '/midscene-icon.png',
logo: {
light: '/midscene_with_text_light.png',

View File

@ -56,7 +56,7 @@ export class Executor {
}
}
async flush(): Promise<void> {
async flush(): Promise<any> {
if (this.status === 'init' && this.tasks.length > 0) {
console.warn(
'illegal state for executor, status is init but tasks are not empty',
@ -108,7 +108,9 @@ export class Executor {
};
if (task.type === 'Insight') {
assert(
task.subType === 'Locate' || task.subType === 'Query',
task.subType === 'Locate' ||
task.subType === 'Query' ||
task.subType === 'Assert',
`unsupported insight subType: ${task.subType}`,
);
returnValue = await task.executor(param, executorContext);
@ -151,6 +153,10 @@ export class Executor {
if (successfullyCompleted) {
this.status = 'completed';
if (this.tasks.length) {
// return the last output
return this.tasks[this.tasks.length - 1].output;
}
} else {
this.status = 'error';
throw new Error(`executor failed: ${errorMsg}`);
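
The change to `flush()` above makes it resolve with the output of the last task. A minimal sketch of that idea, under stated assumptions: `MiniExecutor` and `MiniTask` are hypothetical stand-ins and omit most of what the real `Executor` tracks (task types, timing, dumps).

```typescript
// Hypothetical miniature of a task queue; not the Executor class from this commit.
interface MiniTask {
  run: () => Promise<unknown>;
  output?: unknown;
}

class MiniExecutor {
  status: 'pending' | 'completed' | 'error' = 'pending';
  tasks: MiniTask[];

  constructor(tasks: MiniTask[] = []) {
    this.tasks = tasks;
  }

  // Run tasks in order and resolve with the last task's output,
  // or undefined when the queue is empty.
  async flush(): Promise<unknown> {
    try {
      for (const task of this.tasks) {
        task.output = await task.run();
      }
    } catch (e) {
      this.status = 'error';
      throw new Error(`executor failed: ${(e as Error).message}`);
    }
    this.status = 'completed';
    return this.tasks.length
      ? this.tasks[this.tasks.length - 1].output
      : undefined;
  }
}
```

Returning the last output lets a caller that appends a single assertion task read its verdict straight from `flush()`, which is how `PageTaskExecutor.assert` in this commit uses it.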

View File

@ -1,4 +1,6 @@
import assert from 'node:assert';
import type {
AIAssertionResponse,
AIElementParseResponse,
AISectionParseResponse,
BaseElement,
@ -7,7 +9,11 @@ import type {
import type { ChatCompletionMessageParam } from 'openai/resources';
import { callToGetJSONObject } from './openai';
import { systemPromptToFindElement } from './prompt/element_inspector';
import { describeUserPage, systemPromptToExtract } from './prompt/util';
import {
describeUserPage,
systemPromptToAssert,
systemPromptToExtract,
} from './prompt/util';
export async function AiInspectElement<
ElementType extends BaseElement = BaseElement,
@ -51,7 +57,6 @@ export async function AiInspectElement<
return {
parseResult,
elementById,
systemPrompt,
};
}
@ -101,6 +106,43 @@ export async function AiExtractElementInfo<
return {
parseResult,
elementById,
systemPrompt,
};
}
export async function AiAssert<
ElementType extends BaseElement = BaseElement,
>(options: {
assertion: string;
context: UIContext<ElementType>;
callAI?: typeof callToGetJSONObject;
}) {
const { assertion, context, callAI = callToGetJSONObject } = options;
assert(assertion, 'assertion should be a string');
const systemPrompt = systemPromptToAssert(assertion);
const { screenshotBase64 } = context;
const { description, elementById } = await describeUserPage(context);
const msgs: ChatCompletionMessageParam[] = [
{ role: 'system', content: systemPrompt },
{
role: 'user',
content: [
{
type: 'image_url',
image_url: {
url: screenshotBase64,
},
},
{
type: 'text',
text: description,
},
],
},
];
const assertResult = await callAI<AIAssertionResponse>(msgs);
return assertResult;
}

View File

@ -136,6 +136,22 @@ Return in the following JSON format:
`;
}
export function systemPromptToAssert(assertion: string) {
return `
${characteristic}
${contextFormatIntro}
Based on the information you get, assert the following:
${assertion}
Return in the following JSON format:
{
thought: string, // string, the thought of the assertion
pass: true, // true or false, whether the assertion is passed
}
`;
}
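
The prompt above asks the model for an object with `thought` and `pass`. A model reply is not guaranteed to follow that shape, so a caller might validate it before trusting `pass`. The sketch below is an assumption for illustration only: `parseAssertionResponse` is a hypothetical helper, it takes the raw reply as a JSON string, and it does not reflect how `callToGetJSONObject` handles responses.

```typescript
// Hypothetical validator; the shape mirrors AIAssertionResponse from this commit.
interface AssertionResponse {
  pass: boolean;
  thought: string;
}

function parseAssertionResponse(raw: string): AssertionResponse {
  const value: unknown = JSON.parse(raw); // throws SyntaxError on invalid JSON
  if (typeof value !== 'object' || value === null) {
    throw new Error('assertion response is not an object');
  }
  const { pass, thought } = value as Record<string, unknown>;
  if (typeof pass !== 'boolean') {
    throw new Error('`pass` should be a boolean');
  }
  if (typeof thought !== 'string') {
    throw new Error('`thought` should be a string');
  }
  return { pass, thought };
}
```

Rejecting a non-boolean `pass` avoids treating a string such as `"false"` as a passing assertion.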
/*
To modify the response format:
1. update the function `describeSectionResponseFormat` here

View File

@ -4,10 +4,12 @@ import {
AiInspectElement,
callToGetJSONObject as callAI,
} from '@/ai-model/index';
import { AiAssert } from '@/ai-model/inspect';
import type {
AIElementParseResponse,
BaseElement,
DumpSubscriber,
InsightAssertionResponse,
InsightExtractParam,
InsightOptions,
InsightTaskInfo,
@ -17,7 +19,6 @@ import type {
} from '@/types';
import {
extractSectionQuery,
// describeUserPage as defaultDescriber,
ifElementTypeResponse,
splitElementResponse,
} from '../ai-model/prompt/util';
@ -93,7 +94,7 @@ export default class Insight<
const context = await this.contextRetrieverFn();
const startTime = Date.now();
const { parseResult, systemPrompt, elementById } = await AiInspectElement({
const { parseResult, elementById } = await AiInspectElement({
callAI,
context,
multi: Boolean(multi),
@ -105,7 +106,6 @@ export default class Insight<
...(this.taskInfo ? this.taskInfo : {}),
durationMs: timeCost,
rawResponse: JSON.stringify(parseResult),
systemPrompt,
};
let errorLog: string | undefined;
@ -212,20 +212,18 @@ export default class Insight<
const context = await this.contextRetrieverFn();
const startTime = Date.now();
const { parseResult, systemPrompt, elementById } =
await AiExtractElementInfo<T>({
context,
dataQuery,
sectionConstraints,
callAI: this.aiVendorFn,
});
const { parseResult, elementById } = await AiExtractElementInfo<T>({
context,
dataQuery,
sectionConstraints,
callAI: this.aiVendorFn,
});
const timeCost = Date.now() - startTime;
const taskInfo: InsightTaskInfo = {
...(this.taskInfo ? this.taskInfo : {}),
durationMs: timeCost,
rawResponse: JSON.stringify(parseResult),
systemPrompt,
};
let errorLog: string | undefined;
@ -313,4 +311,52 @@ export default class Insight<
return mergedData;
}
async assert(assertion: string): Promise<InsightAssertionResponse> {
if (typeof assertion !== 'string') {
throw new Error(
'This is the assert method for Midscene, the first argument should be a string. If you want to use the assert method from Node.js, please import it from the Node.js assert module.',
);
}
const dumpSubscriber = this.onceDumpUpdatedFn;
this.onceDumpUpdatedFn = undefined;
const context = await this.contextRetrieverFn();
const startTime = Date.now();
const assertResult = await AiAssert({
assertion,
callAI: this.aiVendorFn,
context,
});
const timeCost = Date.now() - startTime;
const taskInfo: InsightTaskInfo = {
...(this.taskInfo ? this.taskInfo : {}),
durationMs: timeCost,
rawResponse: JSON.stringify(assertResult),
};
const { thought, pass } = assertResult;
const dumpData: PartialInsightDumpFromSDK = {
type: 'assert',
context,
userQuery: {
assertion,
},
matchedSection: [],
matchedElement: [],
data: null,
taskInfo,
assertionPass: pass,
assertionThought: thought,
error: pass ? undefined : thought,
};
writeInsightDump(dumpData, undefined, dumpSubscriber);
return {
pass,
thought,
};
}
}

View File

@ -66,6 +66,11 @@ export interface AISectionParseResponse<DataShape> {
errors?: string[];
}
export interface AIAssertionResponse {
pass: boolean;
thought: string;
}
/**
* context
*/
@ -110,7 +115,6 @@ export type InsightExtractParam = string | Record<string, string>;
export interface InsightTaskInfo {
durationMs: number;
systemPrompt?: string;
rawResponse?: string;
}
@ -120,17 +124,20 @@ export interface DumpMeta {
}
export interface InsightDump extends DumpMeta {
type: 'locate' | 'extract';
type: 'locate' | 'extract' | 'assert';
logId: string;
context: UIContext;
userQuery: {
element?: string;
dataDemand?: InsightExtractParam;
sections?: Record<string, string>;
assertion?: string;
}; // ?
matchedSection: UISection[];
matchedElement: BaseElement[];
data: any;
assertionPass?: boolean;
assertionThought?: string;
taskInfo: InsightTaskInfo;
error?: string;
}
@ -152,13 +159,15 @@ export interface LiteUISection {
export type ElementById = (id: string) => BaseElement | null;
export type InsightAssertionResponse = AIAssertionResponse;
/**
* planning
*
*/
export interface PlanningAction<ParamType = any> {
thought: string;
thought?: string;
type:
| 'Locate'
| 'Tap'
@ -166,7 +175,8 @@ export interface PlanningAction<ParamType = any> {
| 'Input'
| 'KeyboardPress'
| 'Scroll'
| 'Error';
| 'Error'
| 'Assert';
param: ParamType;
}
@ -189,6 +199,10 @@ export interface PlanningActionParamScroll {
| 'ScrollUp';
}
export interface PlanningActionParamAssert {
assertion: string;
}
/**
* misc
*/
@ -285,7 +299,7 @@ export interface ExecutionDump extends DumpMeta {
}
/*
task - insight-find
task - insight-locate
*/
export interface ExecutionTaskInsightLocateParam {
prompt: string;
@ -295,7 +309,7 @@ export interface ExecutionTaskInsightLocateOutput {
element: BaseElement | null;
}
export interface ExecutionTaskInsightLocateLog {
export interface ExecutionTaskInsightDumpLog {
dump?: InsightDump;
}
@ -303,14 +317,14 @@ export type ExecutionTaskInsightLocateApply = ExecutionTaskApply<
'Insight',
ExecutionTaskInsightLocateParam,
ExecutionTaskInsightLocateOutput,
ExecutionTaskInsightLocateLog
ExecutionTaskInsightDumpLog
>;
export type ExecutionTaskInsightLocate =
ExecutionTask<ExecutionTaskInsightLocateApply>;
/*
task - insight-extract
task - insight-query
*/
export interface ExecutionTaskInsightQueryParam {
dataDemand: InsightExtractParam;
@ -322,13 +336,30 @@ export interface ExecutionTaskInsightQueryOutput {
export type ExecutionTaskInsightQueryApply = ExecutionTaskApply<
'Insight',
ExecutionTaskInsightQueryParam
ExecutionTaskInsightQueryParam,
any,
ExecutionTaskInsightDumpLog
>;
export type ExecutionTaskInsightQuery =
ExecutionTask<ExecutionTaskInsightQueryApply>;
// export type ExecutionTaskInsight = ExecutionTaskInsightLocate; // | ExecutionTaskInsightExtract;
/*
task - assertion
*/
export interface ExecutionTaskInsightAssertionParam {
assertion: string;
}
export type ExecutionTaskInsightAssertionApply = ExecutionTaskApply<
'Insight',
ExecutionTaskInsightAssertionParam,
InsightAssertionResponse,
ExecutionTaskInsightDumpLog
>;
export type ExecutionTaskInsightAssertion =
ExecutionTask<ExecutionTaskInsightAssertionApply>;
/*
task - action (i.e. interact)
@ -346,8 +377,6 @@ export type ExecutionTaskAction = ExecutionTask<ExecutionTaskActionApply>;
task - planning
*/
export type ExectuionTaskPlanningParam = PlanningAIResponse;
export type ExecutionTaskPlanningApply = ExecutionTaskApply<
'Planning',
{ userPrompt: string },

View File

@ -2,7 +2,7 @@
{
"elements": [
{
"id": "b0ca2e8c69",
"id": "3530a9c1eb",
},
],
"error": [],
@ -11,7 +11,7 @@
{
"elements": [
{
"id": "b9807d7de6",
"id": "b5bacc879a",
},
],
"error": [],
@ -20,7 +20,7 @@
{
"elements": [
{
"id": "c5a7702fed",
"id": "7ccd467339",
},
],
"error": [],
@ -29,7 +29,7 @@
{
"elements": [
{
"id": "c84a3afdac",
"id": "eb987bf616",
},
],
"error": [],
@ -38,7 +38,7 @@
{
"elements": [
{
"id": "defa24dedd",
"id": "0f8f471e06",
},
],
"error": [],

View File

@ -43,7 +43,7 @@ repeat(5, (repeatIndex) => {
path.join(__dirname, './test-data/online_order'),
);
const { aiResponse, filterUnStableinf } = await runTestCases(
const { aiResponse, filterUnstableResult } = await runTestCases(
testCases,
async (testCase) => {
const { parseResult } = await AiInspectElement({
@ -62,12 +62,12 @@ repeat(5, (repeatIndex) => {
JSON.stringify(aiResponse, null, 2),
{ encoding: 'utf-8' },
);
expect(filterUnStableinf).toMatchFileSnapshot(
expect(filterUnstableResult).toMatchFileSnapshot(
'./__snapshots__/online_order_inspector.test.ts.snap',
);
},
{
timeout: 99999,
timeout: 90 * 1000,
},
);
});

View File

@ -1,5 +1,6 @@
import path from 'node:path';
import { AiInspectElement } from '@/ai-model';
import { AiAssert } from '@/ai-model/inspect';
import { expect, it } from 'vitest';
import {
getPageTestData,
@ -39,7 +40,7 @@ repeat(2, (repeatIndex) => {
path.join(__dirname, './test-data/todo'),
);
const { aiResponse, filterUnStableinf } = await runTestCases(
const { aiResponse, filterUnstableResult } = await runTestCases(
testTodoCases,
async (testCase) => {
const { parseResult } = await AiInspectElement({
@ -58,12 +59,42 @@ repeat(2, (repeatIndex) => {
JSON.stringify(aiResponse, null, 2),
{ encoding: 'utf-8' },
);
expect(filterUnStableinf).toMatchFileSnapshot(
expect(filterUnstableResult).toMatchFileSnapshot(
'./__snapshots__/todo_inspector.test.ts.snap',
);
},
{
timeout: 99999,
timeout: 90 * 1000,
},
);
});
repeat(2, () => {
it(
'todo: assert',
async () => {
const { context } = await getPageTestData(
path.join(__dirname, './test-data/todo'),
);
const { pass, thought } = await AiAssert({
context,
assertion: 'There are three tasks in the list',
});
expect(pass).toBeTruthy();
expect(thought).toBeTruthy();
const { pass: pass2, thought: thought2 } = await AiAssert({
context,
assertion: 'There is a button to sort the list in a time order',
});
expect(pass2).toBeFalsy();
expect(thought2).toBeTruthy();
},
{
timeout: 90 * 1000,
},
);
});

View File

@ -65,7 +65,7 @@ export async function runTestCases(
}
});
const filterUnStableinf = aiResponse.map((aiInfo) => {
const filterUnstableResult = aiResponse.map((aiInfo) => {
const { elements = [], prompt, error = [] } = aiInfo;
return {
elements: elements.map((element) => {
@ -80,7 +80,7 @@ export async function runTestCases(
return {
aiResponse,
filterUnStableinf,
filterUnstableResult,
};
}
@ -133,3 +133,8 @@ export async function getPageTestData(targetDir: string) {
screenshotBase64: base64Encoded(resizeOutputImgP),
};
}
export async function getPageDataOfTestName(testName: string) {
const targetDir = path.join(__dirname, `test-data/${testName}`);
return await getPageTestData(targetDir);
}

View File

@ -1,79 +1,73 @@
// /* eslint-disable max-lines-per-function */
// import { it, describe, vi, expect } from 'vitest';
// import { plan } from '@/automation/';
// import { getFixture, launch } from 'tests/utils';
// import { parseContextFromPuppeteerBrowser } from '@/puppeteer';
// import { beforeEach } from 'node:test';
// import { Browser } from 'puppeteer';
import { plan } from '@/automation/';
import { getPageDataOfTestName } from 'tests/ai-model/inspector/util';
/* eslint-disable max-lines-per-function */
import { describe, expect, it, vi } from 'vitest';
// vi.setConfig({
// testTimeout: 180 * 1000,
// hookTimeout: 30 * 1000,
// });
vi.setConfig({
testTimeout: 180 * 1000,
hookTimeout: 30 * 1000,
});
// const localPage = `file://${getFixture('simple.html')}`;
// describe('automation - planning', () => {
// let browser: Browser;
// beforeEach(() =>
// async () => {
// await browser?.close();
// },
// );
describe('automation - planning', () => {
it('basic run', async () => {
const { context } = await getPageDataOfTestName('todo');
// it('basic run', async () => {
// browser = await launch('https://www.baidu.com');
// const context = await parseContextFromPuppeteerBrowser(browser);
const { plans } = await plan(
'type "Why is the earth a sphere?", hit Enter',
{
context,
},
);
expect(plans.length).toBe(3);
expect(plans[0].thought).toBeTruthy();
expect(plans[0].type).toBe('Locate');
expect(plans[1].type).toBe('Input');
expect(plans[2].type).toBe('KeyboardPress');
});
// const {plans} = await plan(context, 'type keyword "Why is the earth a sphere?", hit Enter');
// expect(plans.length).toBe(3);
// expect(plans[0].thought).toBeTruthy();
// expect(plans[0].type).toBe('Find');
// expect(plans[1].type).toBe('Input');
// expect(plans[2].type).toBe('KeyboardPress');
// });
it('should raise an error when prompt is irrelevant with page', async () => {
const { context } = await getPageDataOfTestName('todo');
// it('should raise an error when prompt is irrelevant with page', async () => {
// browser = await launch(localPage);
// const context = await parseContextFromPuppeteerBrowser(browser);
await expect(async () => {
await plan(
'Tap the blue T-shirt in left top corner, and click the "add to cart" button',
{
context,
},
);
}).rejects.toThrowError();
});
// expect((async () => {
// await plan(context, 'Tap the blue T-shirt in left top corner, and click the "add to cart" button');
// })).rejects.toThrowError();
// });
it('Error message in Chinese', async () => {
const { context } = await getPageDataOfTestName('todo');
let error: Error | undefined;
try {
await plan('在界面上点击“香蕉奶茶”,然后添加到购物车', {
context,
});
} catch (e: any) {
error = e;
}
// it('Error message in Chinese', async () => {
// browser = await launch(localPage);
// const context = await parseContextFromPuppeteerBrowser(browser);
expect(error).toBeTruthy();
expect(/[a-z]/i.test(error!.message)).toBeFalsy();
});
// let error: Error | undefined;
// try {
// await plan(context, '在界面上点击“香蕉奶茶”,然后添加到购物车');
// } catch(e: any) {
// error = e;
// }
// expect(error).toBeTruthy();
// expect(/a-z/i.test(error!.message)).toBeFalsy();
// });
// it.only('instructions of to-do mvc', async() => {
// browser = await launch('https://todomvc.com/examples/react/dist/');
// const context = await parseContextFromPuppeteerBrowser(browser);
// const instructions = [
// '在任务框 input 输入 今天学习 JS按回车键',
// '在任务框 input 输入 明天学习 Rust按回车键',
// '在任务框 input 输入后天学习 AI按回车键',
// '将鼠标移动到任务列表中的第二项,点击第二项任务右边的删除按钮',
// '点击第二条任务左边的勾选按钮',
// '点击任务列表下面的 completed 状态按钮',
// ];
// for(const instruction of instructions) {
// const {plans} = await plan(context, instruction);
// expect(plans).toBeTruthy();
// console.log(`instruction: ${instruction}\nplans: ${JSON.stringify(plans, undefined, 2)}`);
// }
// });
// });
it('instructions of to-do mvc', async () => {
const { context } = await getPageDataOfTestName('todo');
const instructions = [
'在任务框 input 输入 今天学习 JS按回车键',
'在任务框 input 输入 明天学习 Rust按回车键',
'在任务框 input 输入后天学习 AI按回车键',
'将鼠标移动到任务列表中的第二项,点击第二项任务右边的删除按钮',
'点击第二条任务左边的勾选按钮',
'点击任务列表下面的 completed 状态按钮',
];
for (const instruction of instructions) {
const { plans } = await plan(instruction, { context });
expect(plans).toBeTruthy();
// console.log(`instruction: ${instruction}\nplans: ${JSON.stringify(plans, undefined, 2)}`);
}
});
});

View File

@ -44,39 +44,12 @@ const insightFindTask = (shouldThrow?: boolean) => {
return insightFindTask;
};
// const insightExtractTask = () => {
// let insightDump: InsightDump | undefined;
// const dumpCollector: DumpSubscriber = (dump) => {
// insightDump = dump;
// };
// const insight = fakeInsight('test-executor');
// insight.onceDumpUpdatedFn = dumpCollector;
// const task: any = {
// type: 'Insight-extract',
// param: {
// dataDemand: 'data-demand',
// },
// async executor(param: any) {
// return {
// output: {
// data: await insight.extract(param.dataDemand as any),
// },
// log: {
// dump: insightDump,
// },
// };
// },
// };
// return task;
// }
describe('executor', () => {
it(
'insight - basic run',
async () => {
const insightTask1 = insightFindTask();
const flushResultData = 'abcdef';
const taskParam = {
action: 'tap',
anything: 'acceptable',
@ -87,15 +60,24 @@ describe('executor', () => {
param: taskParam,
executor: tapperFn,
};
const actionTask2: ExecutionTaskActionApply = {
type: 'Action',
param: taskParam,
executor: async () => {
return {
output: flushResultData,
} as any;
},
};
const inputTasks = [insightTask1, actionTask];
const inputTasks = [insightTask1, actionTask, actionTask2];
const executor = new Executor(
'test',
'hello, this is a test',
inputTasks,
);
await executor.flush();
const flushResult = await executor.flush();
const tasks = executor.tasks as ExecutionTaskInsightLocate[];
const { element } = tasks[0].output || {};
expect(element).toBeTruthy();
@ -115,6 +97,8 @@ describe('executor', () => {
const dump = executor.dump();
expect(dump.logTime).toBeTruthy();
expect(flushResult).toBe(flushResultData);
},
{
timeout: 999 * 1000,
@ -150,7 +134,7 @@ describe('executor', () => {
expect(dumpContent1.tasks.length).toBe(2);
// append while running
await Promise.all([
const output = await Promise.all([
initExecutor.flush(),
(async () => {
// sleep 200ms

View File

@ -12,18 +12,16 @@ dotenv.config();
const enableTest = process.env.AITEST;
const aiModelTest =
enableTest !== 'true' ? ['tests/ai-model/**/*.test.ts'] : [];
enableTest === 'true' || enableTest === '1'
? []
: ['tests/ai-model/**/*.test.ts', 'tests/automation/planning.test.ts'];
export default defineConfig({
test: {
// include: ['tests/inspector/*.test.ts'],
include: ['tests/**/*.test.ts'],
// Need to improve the corresponding testing
exclude: [
'tests/insight/*.test.ts',
'tests/automation/planning.test.ts',
...aiModelTest,
],
exclude: ['tests/insight/*.test.ts', ...aiModelTest],
},
resolve: {
alias: {

View File

@ -7,7 +7,7 @@ test.beforeEach(async ({ page }) => {
await page.waitForLoadState('networkidle');
});
test('search headphone on ebay', async ({ ai, aiQuery }) => {
test('search headphone on ebay', async ({ ai, aiQuery, aiAssert }) => {
// 👀 perform a search
await ai('type "Headphones" in search box, hit Enter');
@ -18,5 +18,7 @@ test('search headphone on ebay', async ({ ai, aiQuery }) => {
console.log('headphones in stock', items);
expect(items?.length).toBeGreaterThan(1);
expect(items?.length).toBeGreaterThanOrEqual(1);
await aiAssert('There is a big input box in the page');
});

View File

@ -21,7 +21,7 @@ export default defineConfig({
/* Opt out of parallel tests on CI. */
workers: process.env.CI ? 1 : undefined,
/* Reporter to use. See https://playwright.dev/docs/test-reporters */
reporter: '@midscene/web/playwright-report',
reporter: [['list'], ['@midscene/web/playwright-report']],
/* Shared settings for all the projects below. See https://playwright.dev/docs/api/class-testoptions. */
use: {
/* Base URL to use in actions like `await page.goto('/')`. */

View File

@ -27,13 +27,14 @@ const VIEW_TYPE_SCREENSHOT = 'screenshot';
const VIEW_TYPE_JSON = 'json';
const DetailPanel = (): JSX.Element => {
const dumpContext = useInsightDump((store) => store.data);
const dumpId = useInsightDump((store) => store._loadId);
const blackboardViewAvailable = Boolean(dumpId);
const blackboardViewAvailable = Boolean(dumpContext);
const activeTask = useExecutionDump((store) => store.activeTask);
const [preferredViewType, setViewType] = useState(VIEW_TYPE_BLACKBOARD);
const viewType =
preferredViewType === VIEW_TYPE_BLACKBOARD && !dumpId
preferredViewType === VIEW_TYPE_BLACKBOARD && !blackboardViewAvailable
? VIEW_TYPE_SCREENSHOT
: preferredViewType;
@ -47,7 +48,7 @@ const DetailPanel = (): JSX.Element => {
</div>
);
} else if (viewType === VIEW_TYPE_BLACKBOARD) {
if (dumpId) {
if (blackboardViewAvailable) {
content = <BlackBoard key={`${dumpId}`} />;
} else {
content = <div>invalid view</div>;

View File

@ -103,6 +103,7 @@
font-size: 14px;
margin-top: @side-horizontal-padding;
white-space: break-spaces;
word-wrap: break-word;
margin: 0;
}

View File

@ -6,6 +6,7 @@ import { RadiusSettingOutlined } from '@ant-design/icons';
import type {
BaseElement,
ExecutionTaskAction,
ExecutionTaskInsightAssertion,
ExecutionTaskInsightLocate,
ExecutionTaskInsightQuery,
ExecutionTaskPlanning,
@ -274,7 +275,8 @@ const DetailSide = (): JSX.Element => {
key: 'param',
content: JSON.stringify(
(task as ExecutionTaskInsightLocate)?.param?.prompt ||
(task as ExecutionTaskInsightQuery)?.param?.dataDemand,
(task as ExecutionTaskInsightQuery)?.param?.dataDemand ||
(task as ExecutionTaskInsightAssertion)?.param?.assertion,
),
},
],
@ -363,7 +365,7 @@ const DetailSide = (): JSX.Element => {
// const [showQuery, setShowQuery] = useState(false);
const errorSection = dump?.error ? (
const errorSection = task?.error ? (
<Card
liteMode={true}
title="Error"
@ -371,7 +373,7 @@ const DetailSide = (): JSX.Element => {
onMouseLeave={noop}
content={
<pre className="description-content" style={{ color: '#F00' }}>
{dump.error}
{task.error}
</pre>
}
/>
@ -385,7 +387,27 @@ const DetailSide = (): JSX.Element => {
content={<pre>{JSON.stringify(dump.data, undefined, 2)}</pre>}
/>
) : null;
console.log('dump is', dump);
let assertionCard: JSX.Element | null = null;
if (task?.type === 'Insight' && task.subType === 'Assert') {
assertionCard = (
<Card
liteMode={true}
title="Assert"
onMouseEnter={noop}
onMouseLeave={noop}
content={
<pre className="description-content">
{JSON.stringify(
(task as ExecutionTaskInsightAssertion).output,
undefined,
2,
)}
</pre>
}
/>
);
}
const plans = (task as ExecutionTaskPlanning)?.output?.plans;
let timelineData: TimelineItemProps[] = [];
@ -425,6 +447,7 @@ const DetailSide = (): JSX.Element => {
<div className="item-list item-list-space-up">
{errorSection}
{dataCard}
{assertionCard}
{matchedSectionsEl}
{matchedElementsEl}
<Timeline items={timelineData} />

File diff suppressed because one or more lines are too long

View File

@ -13,7 +13,7 @@ export class PageAgent {
dumpFile?: string;
actionAgent: PageTaskExecutor;
taskExecutor: PageTaskExecutor;
constructor(
page: WebPage,
@ -27,7 +27,7 @@ export class PageAgent {
},
];
this.testId = opts?.testId || String(process.pid);
this.actionAgent = new PageTaskExecutor(this.page, {
this.taskExecutor = new PageTaskExecutor(this.page, {
cache: opts?.cache || { aiTasks: [] },
});
}
@ -48,14 +48,14 @@ export class PageAgent {
async aiAction(taskPrompt: string) {
let error: Error | undefined;
try {
await this.actionAgent.action(taskPrompt);
await this.taskExecutor.action(taskPrompt);
} catch (e: any) {
error = e;
}
// console.log('cache logic', actionAgent.taskCache.generateTaskCache());
if (this.actionAgent.executionDump) {
this.appendDump(this.actionAgent.executionDump);
// this.appendDump(dumpGroupName, actionAgent.executionDump);
// console.log('cache logic', taskExecutor.taskCache.generateTaskCache());
if (this.taskExecutor.executionDump) {
this.appendDump(this.taskExecutor.executionDump);
// this.appendDump(dumpGroupName, taskExecutor.executionDump);
this.writeOutActionDumps();
}
if (error) {
@ -69,12 +69,12 @@ export class PageAgent {
let error: Error | undefined;
let result: any;
try {
result = await this.actionAgent.query(demand);
result = await this.taskExecutor.query(demand);
} catch (e: any) {
error = e;
}
if (this.actionAgent.executionDump) {
this.appendDump(this.actionAgent.executionDump);
if (this.taskExecutor.executionDump) {
this.appendDump(this.taskExecutor.executionDump);
this.writeOutActionDumps();
}
if (error) {
@ -85,6 +85,19 @@ export class PageAgent {
return result;
}
async aiAssert(assertion: string, msg?: string) {
const assertionResult = await this.taskExecutor.assert(assertion);
if (this.taskExecutor.executionDump) {
this.appendDump(this.taskExecutor.executionDump);
this.writeOutActionDumps();
}
if (!assertionResult.pass) {
const errMsg = msg || `Assertion failed: ${assertion}`;
const reasonMsg = `Reason: ${assertionResult.thought}`;
throw new Error(`${errMsg}\n${reasonMsg}`);
}
}
async ai(taskPrompt: string, type = 'action') {
if (type === 'action') {
return this.aiAction(taskPrompt);
@ -92,8 +105,13 @@ export class PageAgent {
if (type === 'query') {
return this.aiQuery(taskPrompt);
}
if (type === 'assert') {
return this.aiAssert(taskPrompt);
}
throw new Error(
`Unknown or Unsupported task type: ${type}, only support 'action' or 'query'`,
`Unknown type: ${type}, only support 'action', 'query', 'assert'`,
);
}
}

View File

@ -11,10 +11,12 @@ import Insight, {
type ExecutionTaskInsightQueryApply,
type ExecutionTaskPlanningApply,
Executor,
type InsightAssertionResponse,
type InsightDump,
type InsightExtractParam,
plan,
type PlanningAction,
type PlanningActionParamAssert,
type PlanningActionParamHover,
type PlanningActionParamInputOrKeyPress,
type PlanningActionParamScroll,
@ -144,6 +146,32 @@ export class PageTaskExecutor {
};
return taskFind;
}
if (plan.type === 'Assert') {
const assertPlan = plan as PlanningAction<PlanningActionParamAssert>;
const taskAssert: ExecutionTaskApply = {
type: 'Insight',
subType: 'Assert',
param: assertPlan.param,
executor: async () => {
let insightDump: InsightDump | undefined;
const dumpCollector: DumpSubscriber = (dump) => {
insightDump = dump;
};
this.insight.onceDumpUpdatedFn = dumpCollector;
const assertion = await this.insight.assert(
assertPlan.param.assertion,
);
return {
output: assertion,
log: {
dump: insightDump,
},
};
},
};
return taskAssert;
}
if (plan.type === 'Input') {
const taskActionInput: ExecutionTaskActionApply<PlanningActionParamInputOrKeyPress> =
{
@ -163,6 +191,7 @@ export class PageTaskExecutor {
};
return taskActionInput;
}
if (plan.type === 'KeyboardPress') {
const taskActionKeyboardPress: ExecutionTaskActionApply<PlanningActionParamInputOrKeyPress> =
{
@ -366,4 +395,24 @@ export class PageTaskExecutor {
}
return data;
}
async assert(assertion: string): Promise<InsightAssertionResponse> {
const description = assertion;
const taskExecutor = new Executor(description);
taskExecutor.description = description;
const assertionPlan: PlanningAction<PlanningActionParamAssert> = {
type: 'Assert',
param: {
assertion,
},
};
const assertTask = await this.convertPlanToExecutable([assertionPlan]);
await taskExecutor.append(this.wrapExecutorWithScreenshot(assertTask[0]));
const assertionResult: InsightAssertionResponse =
await taskExecutor.flush();
this.executionDump = taskExecutor.dump();
return assertionResult;
}
}

@ -26,17 +26,17 @@ const groupAndCaseForTest = (testInfo: TestInfo) => {
return { taskFile, taskTitle };
};
const midSceneAgentKeyId = '_midSceneAgentId';
const midsceneAgentKeyId = '_midsceneAgentId';
export const PlaywrightAiFixture = () => {
const pageAgentMap: Record<string, PageAgent> = {};
const agentForPage = (
page: WebPage,
opts: { testId: string; taskFile: string; taskTitle: string },
) => {
let idForPage = (page as any)[midSceneAgentKeyId];
let idForPage = (page as any)[midsceneAgentKeyId];
if (!idForPage) {
idForPage = randomUUID();
(page as any)[midSceneAgentKeyId] = idForPage;
(page as any)[midsceneAgentKeyId] = idForPage;
const testCase = readTestCache(opts.taskFile, opts.taskTitle) || {
aiTasks: [],
};
@ -69,7 +69,7 @@ export const PlaywrightAiFixture = () => {
return result;
},
);
const taskCacheJson = agent.actionAgent.taskCache.generateTaskCache();
const taskCacheJson = agent.taskExecutor.taskCache.generateTaskCache();
writeTestCache(taskFile, taskTitle, taskCacheJson);
if (agent.dumpFile) {
testInfo.annotations.push({
@ -132,6 +132,31 @@ export const PlaywrightAiFixture = () => {
});
}
},
aiAssert: async (
{ page }: { page: PlaywrightPage },
use: any,
testInfo: TestInfo,
) => {
const { taskFile, taskTitle } = groupAndCaseForTest(testInfo);
const agent = agentForPage(page, {
testId: testInfo.testId,
taskFile,
taskTitle,
});
await use(async (assertion: string, errorMsg?: string) => {
await page.waitForLoadState('networkidle');
await agent.aiAssert(assertion, errorMsg);
});
if (agent.dumpFile) {
testInfo.annotations.push({
type: 'MIDSCENE_AI_ACTION',
description: JSON.stringify({
testId: testInfo.testId,
dumpPath: agent.dumpFile,
}),
});
}
},
};
};
@ -142,4 +167,5 @@ export type PlayWrightAiFixtureType = {
) => Promise<T>;
aiAction: (taskPrompt: string) => ReturnType<PageTaskExecutor['action']>;
aiQuery: <T = any>(demand: any) => Promise<T>;
aiAssert: (assertion: string, errorMsg?: string) => Promise<void>;
};

@ -17,7 +17,7 @@ function logger(...message: any[]) {
}
}
class MidSceneReporter implements Reporter {
class MidsceneReporter implements Reporter {
async onBegin(config: FullConfig, suite: Suite) {
const suites = suite.allTests();
logger(`Starting the run with ${suites.length} tests`);
@ -63,4 +63,4 @@ class MidSceneReporter implements Reporter {
}
}
export default MidSceneReporter;
export default MidsceneReporter;

@ -4,7 +4,7 @@ import { describe, expect, it, vi } from 'vitest';
import { launchPage } from './utils';
vi.setConfig({
testTimeout: 60 * 1000,
testTimeout: 90 * 1000,
});
describe('puppeteer integration', () => {
@ -22,6 +22,8 @@ describe('puppeteer integration', () => {
);
console.log('headphones in stock', items);
expect(items.length).toBeGreaterThanOrEqual(2);
await mid.aiAssert('There is a category filter on the left');
});
it('extract the Github service status', async () => {
@ -32,5 +34,7 @@ describe('puppeteer integration', () => {
'this is a service status page. Extract all status data with this scheme: {[serviceName]: [statusText]}',
);
console.log('Github service status', result);
await mid.aiAssert('food delivery service is in normal state');
});
});

@ -4,7 +4,9 @@ import { defineConfig } from 'vitest/config';
const enableTest = process.env.AITEST;
const aiModelTest =
enableTest !== 'true' ? ['tests/puppeteer/bing.test.ts'] : [];
enableTest !== 'true'
? ['tests/puppeteer/bing.test.ts', 'tests/puppeteer/showcase.test.ts']
: [];
export default defineConfig({
resolve: {