Behind the scenes, we utilize the visual grounding capabilities of vl-model to locate target elements on the screen. So, regardless of whether it's a native app, a [Lynx](https://github.com/lynx-family/lynx) page, or a hybrid app with a webview, it makes no difference. Developers can write automation scripts without the burden of worrying about the technology stack of the app.
You can use the playground to experience the Android automation without any code. Please refer to [Quick experience with Android](./quick-experience-with-android) for more details.
After the experience, you can integrate with the Android device by javascript code. Please refer to [Integrate with Android(adb)](./integrate-with-android) for more details.
If you prefer the yaml file for automation scripts, please refer to [Automate with scripts in yaml](./automate-with-scripts-in-yaml).
If you want to use the automation for testing purpose, you can use the javascript with vitest. We have setup a demo project for you to see how it works:
1. Caching feature for element locator is not supported. Since no view-hierarchy is collected, we cannot cache the element identifier and reuse it.
2. LLMs like gpt-4o or deepseek are not supported. Only some known vl models with visual grounding ability are supported for now. If you want to introduce other vl models, please let us know.
3. The performance is not good enough for now. We are still working on it.
4. The vl model may not perform well on `.aiQuery` and `.aiAssert`. We will give a way to switch model for different kinds of tasks.
5. Due to some security restrictions, you may got a blank screenshot for the password input and Midscene will not be able to work for that.
## Credits
We would like to thank the following projects:
- [scrcpy](https://github.com/Genymobile/scrcpy) and [yume-chan](https://github.com/yume-chan) allow us to control Android devices with browser.
- [appium-adb](https://github.com/appium/appium-adb) for the javascript bridge of adb.
- [YADB](https://github.com/ysbing/YADB) for the yadb tool which improves the performance of text input.