50 Commits

Author SHA1 Message Date
Leyang
7b190ae4ee
fix(web-integration): sanitize content for reportHtmlContent (#745)
* fix(web-integration): sanitize content for reportHtmlContent

* fix(workflow): version

* fix(web-integration): recover after retrieve

* fix(web-integration): test

* chore(core): fix lint

* fix(report): remove ! for content

* chore(core): add test case

* chore(core): add test case

---------

Co-authored-by: yutao <yutao.tao@bytedance.com>
2025-05-22 07:20:32 +08:00
yuyutaotao
08466cac1e
feat(core): add element describer (#750) 2025-05-21 21:05:47 +08:00
Leyang
5a1a3ba18a
feat(web-integration): support disable cache for a single api call (#740)
* feat(web-integration): support disable cache for a single api call

* feat(workflow): version mismatch

* feat(web-integration): cache rename to cacheable

* feat(web-integration): add cacheable option to multiple API methods and update caching documentation

* docs(site): update cacheable option descriptions to reference caching feature documentation

* docs(core): update caching doc

---------

Co-authored-by: yutao <yutao.tao@bytedance.com>
2025-05-21 16:46:20 +08:00
bac9051d2d
feat(evaluation): add screenspot v2 evaluation (#737)
* feat(evaluation): add screenspot v2 evaluation

* style(evaluation): format files array in package.json
2025-05-20 15:52:03 +08:00
Leyang
fb2580616c
fix cache bugs(scroll instant, text node info, cache hit condition) (#732)
* fix(web-integration): cache hit when element.id exist and scroll element instantly

* fix(web-integration): use parent xpath for text node

* fix(web-integration): only scroll into view when element is not completely visible

* fix(shared): distinct text node

* test(web-integration): getElementInfoByXpath

* test(web-integration): rename desc

* test(web-integration): fix

* test(web-integration): snapshot attributes only

* test(web-integration): fix test
2025-05-20 13:19:32 +08:00
yuyutaotao
b261ed7f2a
feat(web): use xpath and yaml as cache (#711)
* feat(web-integration): use xpath for cache instead of id

* feat(web-integration): enhance TaskCache to support xpaths for cache matching and add new test cases

* feat(web-integration): add debug log for unknown page types in TaskCache

* feat(web-integration): update caching logic and cache hit conditions for Plan and Locate tasks

* chore(core): update debug log

* feat(web-integration): update rspress.config and enhance TaskCache structure with new properties

* feat(web-integration): recalculate id when hit cache

* fix(web-integration): update mock implementation in task-cache test to use evaluate method

* feat(web-integration): enhance element caching by adding XPath support and improving cache hit logic

* chore(core): lint

* feat(web-integration): improve XPath handling in web-extractor

* test(web-integration): fix tests

* feat(core, web-integration): add attributes to LocateResultElement and enhance element handling

* fix(core): lint

* feat(web-integration): add midsceneVersion to TaskCache and update cache validation logic

* fix(core): test

* fix(web-integration): update cache validation logic to prevent reading outdated midscene cache files

* feat(web-integration): enhance TaskCache to track used cache items and improve cache retrieval logic

* fix(core): xpath logic (#710)

* feat(core): reuse context for locate

* feat(core): build yamlFlow from aiAction

* feat(core): refine task-cache

* feat(core): update cache

* feat(core): refine task-cache

* feat(core): refine task-cache

* feat(core): remove unused checkElementExistsByXPath

* feat(core): use yaml file as cache

* chore(core): fix lint

* chore(core): print warning for previous cache

* refactor(core): remove quickAnswer references and improve element matching logic

* fix(core): update import path for buildYamlFlowFromPlans

* chore(web-integration): update output image and skip task error test

* fix(web-integration): update test snapshots to handle beta versions

* fix(web-integration): adjust test snapshots for version consistency

* fix(web-integration): track original cache length and adjust matching logic in tests

* fix(web-integration): update test URLs to reflect new target site and enable previously skipped test

* chore(core): update cache docs

* fix(core): test

* feat(core): try to match element from plan

* fix(web-integration): cache id stable when retry in playwright

* fix(web-integration): typo

* style(web-integration): lint

* fix(web-integration): stable cacheid in tests

* fix(web-integration): cache id

---------

Co-authored-by: quanruzhuoxiu <quanruzhuoxiu@gmail.com>
2025-05-16 17:16:56 +08:00
Leyang
b9ff80a0db
implement repeat function for scrolling until actions (#713)
* feat(android): implement repeat function for scrolling until actions

* fix(shared): fix potential error in getAIConfig by ensuring trim is called correctly

* feat(android): update scrolling behavior with adjustable duration and added sleep

* feat(android): refine scrolling durations with new constants for fast and normal scroll
2025-05-14 18:28:10 +08:00
yutao
6224154bdc Merge branch 'main' of https://github.com/neewbee/midscene into neewbee-main 2025-05-09 10:55:12 +08:00
yuyutaotao
c1bc73c78b
feat(android): customize adb path (#684)
* feat(shared): add custom adb path
feat(android): add custom adb path

* feat(android): add docs for custom adb path

---------

Co-authored-by: HBLADEH <1012582116@qq.com>
2025-04-30 17:16:38 +08:00
yuyutaotao
b8f29e8e66
fix(core): use unified config for doubao-ui-tars model (#678) 2025-04-29 21:39:58 +08:00
neewee
5d96f60853 feat(core): support HTTP proxy and reorder dependencies
Added HTTP proxy support with `https-proxy-agent` and cleaned up dependency order.
2025-04-28 10:04:23 +08:00
yuyutaotao
5fb208a08c
feat(core): adapt UI tars 1.5 (#616)
* feat(core): adapt ui-tars 1.5

* chore(core): adapt ui-tars-1.5

* chore(core): fix lint

* fix(core): env building issue

* fix(core): update import for uiTarsModelVersion from shared env

* feat(core): ui-tars hotkey event

* chore(core): move @ui-tars/action-parser to devDependencies

* fix(core): adapting new model
2025-04-28 08:42:43 +08:00
Leyang
ca644d8914
feat(core): allow custom midscene_run dir (#631)
* feat(core): support custom midscene_run dir

* feat(report): add search functionality to PlaywrightCaseSelector component

* refactor(shared): simplify base directory resolution and remove unused environment variable

* feat(shared): integrate shared environment variables across multiple packages

* refactor(shared): update base directory resolution to use dynamic midscene_run directory

* fix(puppeteer): increase screenshot timeout from 3s to 10s for improved reliability
2025-04-24 22:54:52 +08:00
yuyutaotao
ecefd8b0fa
fix(report): reduce context size in report file (#626)
* fix(core): reduce context size in report file

* chore(core): fix lint

* chore(core): resolve conflict

---------

Co-authored-by: zhouxiao.shaw <zhouxiao.shaw@bytedance.com>
2025-04-24 18:28:45 +08:00
Leyang
03a597e022
feat(web-integration): enhance timeout configurations and logging for network idle and navigation (#624)
* feat(web-integration): enhance timeout configurations and logging for network idle and navigation

* fix(web-integration): refine timeout warning messages and remove unnecessary test files

* feat(site): add network timeout customization details and additional parameters for Puppeteer

* fix(site): update default timeout values and enhance customization options for network idle in YAML

* fix(site): remove redundant timeout customization details in FAQ documentation

* fix(web-integration): enhance Playwright agent to support network idle functionality

* docs(playwright): update config docs

* docs(playwright): update config docs

* fix(web-integration): refactor network idle handling in Playwright agent

---------

Co-authored-by: yutao <yutao.tao@bytedance.com>
2025-04-24 10:28:26 +08:00
daf308b1d0
fix(mcp): resolve mcp server log error (#599)
* fix(mcp): resolve mcp server log error

* chore(shared): delete useless code

* chore(workflow): fix lint error
2025-04-19 12:00:39 +08:00
yuyutaotao
ad457a33a8
feat(yaml): allow running javascript in yaml (#555)
* feat: allow running javascript in yaml

* feat: change the output dir

* fix: CI

* docs: update docs about evaluate javascript

* chore: merge main

* chore: merge main
2025-04-18 09:58:51 +08:00
Leyang
b76211bd5d
feat: android playground (#542)
* refactor: android api

* refactor: enhance Android agent to accept options for device connection

* fix: type error

* fix: click after clearInput

* fix: click before clearInput

* feat: android playground

* feat: support npx package name

* feat: android playground joint

* fix: git ignore conflicts

* feat: ensure adb server is running before initializing adb client

* fix: deps consistency

* ci: add android playground

* feat: integrate shared constants and improve server configuration in android playground

* feat: android playground style

* feat: style opt

* feat: add @rsbuild/plugin-svgr dependency and improve URI handling in adb

* feat: remove unused water flow scripts and update comments to English

* feat: download report file

* feat: standalone android playground

* feat: use dynamic import

* feat: migrate CSS to LESS and remove unused styles in chrome extension and report

* feat: enhance Android playground with ScrcpyPlayer ref integration and device management improvements

* feat: optimize styles and layout in Android playground and visualizer components

* chore: add bin back

* chore: update build script to exclude documentation generation

* feat: add not ready message to PlaygroundResult for improved user guidance

* feat: add error handling for screenshot capture in Android page

* docs: update readme

* feat: add PNG validation for screenshot buffer in Android page

* feat: enhance UI components with improved styling and tooltips in ScrcpyPlayer and PromptInput

* docs: update uri parameter description in integrate-with-android documentation and improve uri handling in launch function

* style: update primary color to #2B83FF across multiple components and adjust margin in App.less

* refactor: replace userConfig with globalConfig for environment configuration management and update related functions

* feat: integrate server validation logic in App, AdbDevice, and ScrcpyPlayer components for improved connection handling

* style: enhance player component layout with overflow handling and margin adjustments

* style: refine player component layout with flex adjustments and improved spacing

* feat: add midscene model name display and improve layout in EnvConfig component

* feat: integrate ShinyText component for enhanced loading progress display in PlaygroundResult

* test: add test for isValidPNGImageBuffer

* style: remove background color from App.less and adjust AI config override behavior in env.ts

---------

Co-authored-by: yutao <yutao.tao@bytedance.com>
2025-04-17 17:44:11 +08:00
yuyutaotao
a1b5a54d89
fix(report): do not call mkdir in browser (#577)
* fix: log dir

* fix: log dir

* fix: gitignore config
2025-04-17 15:09:59 +08:00
yuyutaotao
824be26c85
fix: use tmpdir as a fallback for log file (#575)
---------

Co-authored-by: zhouxiao.shaw <zhouxiao.shaw@bytedance.com>
2025-04-17 10:54:19 +08:00
yuyutaotao
c6cd10ebb2
fix: filename (#534) 2025-04-03 16:37:00 +08:00
Zhou Xiao
6468bb0206
refactor(logger): use log file output instead of command line output in Node (#509)
* chore: add element detail info

* chore: add element detail info

* chore: add element detail info

* chore: optimize logger logic

* chore: optimize logger logic
2025-03-31 19:22:39 +08:00
Leyang
934a1e2b5d
use adb instead of appium (#483)
* feat: enable search area for locate

* fix: update evaluation

* fix: locator

* feat: show searchArea in report

* chore: add yaml support for aiTap

* feat: use adb instead of appium

* feat: Adds debugging information and reconstructs input text capabilities

* feat: refactoring Android related functions and adding android modules

* feat: update the image scaling algorithm, adjust the Android page class to support device scaling, and remove test files that are no longer needed

* feat: adjust the Android page class to support device scaling, and remove test files that are no longer needed

* feat: use appium-adb instead of bare command

* fix: update entry for @midscene/android

* feat: optimize the screenshot processing logic, add a backup mechanism when screenshots fail, and update test cases to accommodate new features

* fix: rethrow error

* feat: add Android debug configuration options and update documentation

* chore: fix code style in #483 (#492)

* fix: remove try for error handle by outside

---------

Co-authored-by: yutao <yutao.tao@bytedance.com>
Co-authored-by: linyibing <linyibing@bytedance.com>
Co-authored-by: yuyutaotao <167746126+yuyutaotao@users.noreply.github.com>
2025-03-25 22:45:05 +08:00
yuyutaotao
649aeceb43
feat: enable search area for locate (#473)
* feat: enable search area for locate

* fix: update evaluation

* fix: build error

* fix: ci

* fix: locator

* feat: show searchArea in report

* chore: add yaml support for aiTap

* feat: update status tip

* fix: #473 (#484)

* chore: optimize unit test list

---------

Co-authored-by: zhouxiao.shaw <zhouxiao.shaw@bytedance.com>
2025-03-24 09:50:27 +08:00
yuyutaotao
8e1ba565d0
feat: optimize locator (#456)
---------

Co-authored-by: zhouxiao.shaw <zhouxiao.shaw@bytedance.com>
2025-03-17 19:19:54 +08:00
yuyutaotao
212e4e3725
fix: planning prompt (#448)
* feat: add more case for llm planning

* fix: ai e2e

* chore: use debug to print log

* chore: fix error in gpt mode
2025-03-10 16:50:43 +08:00
Zhou Xiao
d128745e31
fix(esm): resolve cli can't load esm module (#445)
* fix(esm): resolve cli can't load esm module

* chore: resolve deps error

* chore: resolve deps error

* chore: resolve deps error

* chore: resolve deps error

* chore: resolve deps error

* chore: resolve deps error

* chore: resolve deps error

* chore: resolve deps error

* chore: resolve deps error

* chore: resolve deps error

* chore: resolve deps error

* chore: resolve deps error
2025-03-09 21:50:20 +08:00
Zhou Xiao
5d63ef9151
refactor: switch bundle type to bundleless (#437) 2025-03-07 17:20:18 +08:00
yuyutaotao
bbe9874e78
fix: coord offset of qwen model (#407)
---------

Co-authored-by: zhouxiao.shaw <zhouxiao.shaw@bytedance.com>
2025-02-21 10:30:20 +08:00
yuyutaotao
59ce2d0140
feat: locate by coord (#383)
---------

Co-authored-by: zhouxiao.shaw <zhouxiao.shaw@bytedance.com>
2025-02-21 09:56:09 +08:00
yuyutaotao
2f2400dffa
fix: correctly collect elements in absolute container (#373) 2025-02-10 20:51:43 +08:00
yuyutaotao
2a28472fa5
feat: use different color for annotations (#366) 2025-02-10 16:36:12 +08:00
Zhou Xiao
bdff171da6
fix(ui-tars): resolve page down and page up event error (#370) 2025-02-10 16:35:03 +08:00
yuyutaotao
9d5f2fbcac
feat(web-extract): extract web content as a tree (#337)
* feat: extract web content as a tree

* chore: update test data

* chore: update test data

* feat: update answer of evaluation

* chore: update test cases

* chore: remove focusing on cases

* fix: ci

* fix: put rect in html tree

* fix: CI

* fix: AI test

* fix: lint

* fix: CI

* fix: static-page compatibility

* fix: CI

* fix: map by markerId

* fix: llm planning prompt

* chore: update hash length

* chore: ignore writing dump file

* fix: lint

* fix: ci snapshot

* chore: snapshot tree in web extractor

* chore: export tree utils in core

* chore: export tree utils in core

* fix: CI

* fix: update test case and evaluation

* chore: remove unused file

* refactor(extract): modify dependencies (#358)

* refactor(extract): modify dependencies

* chore: modify files config

* chore: add indexId as key for map

---------

Co-authored-by: Zhou Xiao <zhouxiao.shaw@bytedance.com>
2025-02-07 14:55:52 +08:00
Zhou Xiao
9c88186540
feat(ui-tars): enhance the UI-TARS keyboard event handling and optimize parser logic (#330) 2025-01-26 20:34:56 +08:00
yuyutaotao
3aa1b33955
feat: use jpeg as default image format (#301)
---------

Co-authored-by: zhouxiao.shaw <zhouxiao.shaw@bytedance.com>
2025-01-20 20:02:49 +08:00
Zhou xiao
2b18ed55de
feat(ai-model): support vlm (#262)
* feat(ai-model): support plan to target

* chore: modify test

* chore: modify test

* chore: fix env config

* chore: unify the action logic

* chore: optimize type hint

* chore: optimize type hint

* chore: fix type hint

* chore: fix build type error

* chore: support open new tab

* feat: support auto complete element

* chore: add sleep event

* chore: add ai cost time

* chore: optimize prompt

* chore: optimize report prompt

* chore: optimize vlm name

* chore: fix command

* chore: optimize error handle and use check debugger list replace last tab id

* chore: fix chrome debugger attach logic

* chore: fix unit test
2025-01-13 14:32:17 +08:00
Zhou xiao
691eb6ef0a
feat(ai-model): support Image positioning and integrate langchain (#230)
* feat: add point img logic

* feat: migrate prompt to langchain

* chore: delete useless log

* chore: optimize test case

* chore: fix lint error

* chore: delete httpAgent logic

* chore: delete useless fn

* chore: fix some comment

* chore: fix ci error

* chore: delete useless fn

* chore: update prompt

* chore: delete unless language
2025-01-02 21:23:30 +08:00
yuyutaotao
8d83debd13
fix: add some default value for tmp dir #231 (#233) 2025-01-02 10:18:44 +08:00
yuyutaotao
198172dc4e
feat: optimize the speed of screenshot in browser (#144)
* feat: optimize the speed of screenshot in browser

* feat: remove unnecessary context call

* fix: CI
2024-11-05 14:28:16 +08:00
yuyutaotao
9e1eef5cfd
feat: Build a chrome extension for playground (#140) 2024-11-05 11:49:21 +08:00
Zhou xiao
adb9b58879
feat(ai-model): add claude computer ability (#136)
* Add new changes

* Add computer test results and update AI evaluation tests

* chore: Update build outputs and configurations

* feat(ai-model): support claude computer ability use position replace element id

* feat: generate add and commit

* feat: implement computer ability test for Claude

* chore: fix build Lose

* chore: Add and commit changes

* chore: optimize ai position

* chore: optimize ai position

* Add AI evaluation results and update tests

* chore: optimize ai test

* chore: add and commit changes

* chore: optimize ai test content

* chore: fix test case

* chore: fix e2e test
2024-10-31 18:18:31 +08:00
yuyutaotao
c288baa448
feat: make playground working in the browser (#135)
---------

Co-authored-by: zhouxiao.shaw <zhouxiao.shaw@bytedance.com>
2024-10-28 11:04:40 +08:00
Zhou xiao
f312cecbac
refactor(extract): optimize image box selection (#130)
* refactor(extract): optimize image box selection

* chore: Optimize the logic for generating AI test data to avoid producing duplicate data

* Update code and configurations

* Update code and configurations

* chore: fix lint error
2024-10-17 14:58:19 +08:00
yuyutaotao
f9dc0f698e
feat(ai-model): merge ai planning and insight call to accelerate the aiAction (#97)
---------

Co-authored-by: zhouxiao.shaw <zhouxiao.shaw@bytedance.com>

* feat(ai-model): optimize AI model for element inspection

* feat(ai-model): optimize AI model and add quick answer functionality

---------

Co-authored-by: yuyutaotao <167746126+yuyutaotao@users.noreply.github.com>

* feat(ai-model): implement quick answer functionality for element inspection

---------

Co-authored-by: zhouxiao.shaw <zhouxiao.shaw@bytedance.com>
2024-10-12 12:09:25 +08:00
Zhou xiao
10757a8ba1
refactor(ai-model): optimize model evaluation method (#98) 2024-09-29 17:16:07 +08:00
Zhou xiao
eafa5bfa20
feat(cache): The cache is generalized to support puppeteers and mobile terminals (#85)
* feat(cache): The cache is generalized to support Puppeteers and mobile terminals

* chore: update cache test

* chore: update cache test

* chore: update cache test

* docs: update cache doc

* chore: update ai test command

* chore: update ai test command

* chore: update ai test command

* chore: optimize cache logic

* chore: update get dir path logic

* chore: update get dir path logic
2024-09-06 17:19:35 +08:00
Leyang
cfa92b3980
feat(app): supports control of iOS and Android devices through appium (#82)
Co-authored-by: zhouxiao.shaw <zhouxiao.shaw@bytedance.com>
Co-authored-by: linyibing <linyibing@bytedance.com>
2024-09-05 20:05:19 +08:00
yuyutaotao
704a0b8a52
feat(web-extract): extract some <div />s as container (#80) 2024-08-31 08:17:50 +08:00
Zhou xiao
d2a5dbecba
refactor(shared): migrate sharp to jimp and migrate common img logic to shared lib (#74)
* fix(web): fix sharp deps

* chore: optimize sharp deps

* refactor(extract): migrate sharp to jimp

* refactor: migrate img common logic to shared lib

* chore: merge main branch

* chore: merge main branch

* chore: merge main branch

* chore: delete useless code

* chore: optimize code

* chore: optimize ai test branch trigger method

* chore: optimize ai test branch trigger method

* chore: optimize trigger method
2024-08-26 18:50:33 +08:00