Playwright MCP
This package is experimental and not yet ready for production use. It is subject to change and will not respect semver.
Example config
{
"mcpServers": {
"playwright": {
"command": "npx",
"args": [
"@playwright/mcp",
"--headless"
]
}
}
}
Running headed browser (Browser with GUI).
{
"mcpServers": {
"playwright": {
"command": "npx",
"args": [
"@playwright/mcp"
]
}
}
}
Running headed browser on Linux
When running a headed browser on a system without a display, or from the worker processes of an IDE, you can run Playwright in client-server mode. Start the Playwright server from an environment that has DISPLAY set:
npx playwright run-server
Then, in the MCP config, add the following to env, using the endpoint printed by the server above:
{
"mcpServers": {
"playwright": {
"command": "npx",
"args": [
"@playwright/mcp"
],
"env": {
"PLAYWRIGHT_WS_ENDPOINT": "ws://localhost:<port>/"
}
}
}
}
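As a rough illustration of the client-server setup, the following Python sketch builds such a config from an endpoint string and rejects a value where the <port> placeholder was left in. The helper name build_remote_config and the port 3000 are assumptions made for this example and are not part of @playwright/mcp; only the PLAYWRIGHT_WS_ENDPOINT variable and the npx command come from the text above.

```python
import json
from urllib.parse import urlparse


def build_remote_config(ws_endpoint: str) -> dict:
    """Return an MCP config that points Playwright MCP at a running server.

    Hypothetical helper for illustration; not part of @playwright/mcp.
    """
    parsed = urlparse(ws_endpoint)
    if parsed.scheme not in ("ws", "wss") or not parsed.hostname:
        raise ValueError(f"not a WebSocket URL: {ws_endpoint!r}")
    # Reading .port raises ValueError when the port is not numeric,
    # which catches an unreplaced "<port>" placeholder.
    _ = parsed.port
    return {
        "mcpServers": {
            "playwright": {
                "command": "npx",
                "args": ["@playwright/mcp"],
                "env": {"PLAYWRIGHT_WS_ENDPOINT": ws_endpoint},
            }
        }
    }


# Port 3000 is an arbitrary example; use the one printed by run-server.
print(json.dumps(build_remote_config("ws://localhost:3000/"), indent=2))
```

Validating the endpoint before writing the config surfaces a copy-paste mistake early, instead of as a connection failure when the MCP client starts the server.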
Tool Modes
The tools are available in two modes:
- Snapshot Mode (default): Uses accessibility snapshots for better performance and reliability
- Vision Mode: Uses screenshots for visual-based interactions
To use Vision Mode, add the --vision flag when starting the server:
{
"mcpServers": {
"playwright": {
"command": "npx",
"args": [
"@playwright/mcp",
"--vision"
]
}
}
}
Vision Mode works best with computer-use models that can interact with elements in X/Y coordinate space, based on the provided screenshot.
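The two modes differ only in the arguments passed to the server. A minimal Python sketch of that choice, assuming a hypothetical helper build_config and a hypothetical MODE_FLAGS table (neither is part of @playwright/mcp; only the --vision flag comes from the text above):

```python
import json

# Extra command-line flags per tool mode; Snapshot Mode is the default
# and needs no flag.
MODE_FLAGS = {"snapshot": [], "vision": ["--vision"]}


def build_config(mode: str = "snapshot") -> dict:
    """Return an MCP config for the requested tool mode.

    Hypothetical helper for illustration; not part of @playwright/mcp.
    """
    if mode not in MODE_FLAGS:
        raise ValueError(f"unknown mode: {mode!r}")
    args = ["@playwright/mcp", *MODE_FLAGS[mode]]
    return {"mcpServers": {"playwright": {"command": "npx", "args": args}}}


print(json.dumps(build_config("vision"), indent=2))
```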
Snapshot Mode
The Playwright MCP provides a set of tools for browser automation. Here are all available tools:
- browser_navigate
  - Description: Navigate to a URL
  - Parameters:
    - url (string): The URL to navigate to
- browser_go_back
  - Description: Go back to the previous page
  - Parameters: None
- browser_go_forward
  - Description: Go forward to the next page
  - Parameters: None
- browser_click
  - Description: Perform click on a web page
  - Parameters:
    - element (string): Human-readable element description used to obtain permission to interact with the element
    - ref (string): Exact target element reference from the page snapshot
- browser_hover
  - Description: Hover over element on page
  - Parameters:
    - element (string): Human-readable element description used to obtain permission to interact with the element
    - ref (string): Exact target element reference from the page snapshot
- browser_drag
  - Description: Perform drag and drop between two elements
  - Parameters:
    - startElement (string): Human-readable source element description used to obtain permission to interact with the element
    - startRef (string): Exact source element reference from the page snapshot
    - endElement (string): Human-readable target element description used to obtain permission to interact with the element
    - endRef (string): Exact target element reference from the page snapshot
- browser_type
  - Description: Type text into editable element
  - Parameters:
    - element (string): Human-readable element description used to obtain permission to interact with the element
    - ref (string): Exact target element reference from the page snapshot
    - text (string): Text to type into the element
    - submit (boolean): Whether to submit entered text (press Enter after)
- browser_press_key
  - Description: Press a key on the keyboard
  - Parameters:
    - key (string): Name of the key to press or a character to generate, such as ArrowLeft or a
- browser_snapshot
  - Description: Capture accessibility snapshot of the current page (better than screenshot)
  - Parameters: None
- browser_save_as_pdf
  - Description: Save page as PDF
  - Parameters: None
- browser_wait
  - Description: Wait for a specified time in seconds
  - Parameters:
    - time (number): The time to wait in seconds (capped at 10 seconds)
- browser_close
  - Description: Close the page
  - Parameters: None
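To make the parameter lists concrete, here is a hedged Python sketch of the requests an MCP client might send for a navigate, snapshot, click sequence. It assumes the standard MCP JSON-RPC 2.0 "tools/call" request shape; the helper tool_call, the URL, the element description, and the ref value "s1e5" are invented for illustration. Real refs come from the result of browser_snapshot, whose format is not described here.

```python
import json
from itertools import count
from typing import Optional

_request_ids = count(1)  # JSON-RPC ids only need to be unique per session


def tool_call(name: str, arguments: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 "tools/call" request for an MCP server.

    Hypothetical helper for illustration; an MCP client SDK would
    normally construct and send these messages.
    """
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


# Navigate, take a snapshot, then click an element found in that snapshot.
# "s1e5" is a made-up ref; real refs come from the browser_snapshot result.
requests = [
    tool_call("browser_navigate", {"url": "https://example.com"}),
    tool_call("browser_snapshot"),
    tool_call("browser_click", {"element": "Submit button", "ref": "s1e5"}),
]

for request in requests:
    print(json.dumps(request))
```

Note that browser_click takes both element and ref: the description is what a user sees when asked for permission, while the ref identifies the exact target.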
Vision Mode
Vision Mode provides tools for visual-based interactions using screenshots. Here are all available tools:
- browser_navigate
  - Description: Navigate to a URL
  - Parameters:
    - url (string): The URL to navigate to
- browser_go_back
  - Description: Go back to the previous page
  - Parameters: None
- browser_go_forward
  - Description: Go forward to the next page
  - Parameters: None
- browser_screenshot
  - Description: Capture screenshot of the current page
  - Parameters: None
- browser_move_mouse
  - Description: Move mouse to specified coordinates
  - Parameters:
    - x (number): X coordinate
    - y (number): Y coordinate
- browser_click
  - Description: Click at specified coordinates
  - Parameters:
    - x (number): X coordinate to click at
    - y (number): Y coordinate to click at
- browser_drag
  - Description: Perform drag and drop operation
  - Parameters:
    - startX (number): Start X coordinate
    - startY (number): Start Y coordinate
    - endX (number): End X coordinate
    - endY (number): End Y coordinate
- browser_type
  - Description: Type text at specified coordinates
  - Parameters:
    - text (string): Text to type
    - submit (boolean): Whether to submit entered text (press Enter after)
- browser_press_key
  - Description: Press a key on the keyboard
  - Parameters:
    - key (string): Name of the key to press or a character to generate, such as ArrowLeft or a
- browser_save_as_pdf
  - Description: Save page as PDF
  - Parameters: None
- browser_wait
  - Description: Wait for a specified time in seconds
  - Parameters:
    - time (number): The time to wait in seconds (capped at 10 seconds)
- browser_close
  - Description: Close the page
  - Parameters: None
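In Vision Mode the caller supplies coordinates rather than element refs. The following Python sketch shows one way to turn a bounding box located in a screenshot into arguments for browser_click and browser_drag. The box format (left, top, width, height) and the helpers center, click_args, and drag_args are assumptions for this example; only the parameter names x, y, startX, startY, endX, and endY come from the list above.

```python
from typing import Dict, Tuple

# Assumed box format: (left, top, width, height) in screenshot pixels.
Box = Tuple[float, float, float, float]


def center(box: Box) -> Tuple[float, float]:
    """Return the center point of a bounding box."""
    left, top, width, height = box
    return left + width / 2, top + height / 2


def click_args(box: Box) -> Dict[str, float]:
    """Arguments for browser_click targeting the middle of the box."""
    x, y = center(box)
    return {"x": x, "y": y}


def drag_args(source: Box, target: Box) -> Dict[str, float]:
    """Arguments for browser_drag from one box's center to another's."""
    start_x, start_y = center(source)
    end_x, end_y = center(target)
    return {"startX": start_x, "startY": start_y, "endX": end_x, "endY": end_y}


# Example boxes are invented; in practice a model would locate them
# in the image returned by browser_screenshot.
print(click_args((100, 200, 80, 40)))
print(drag_args((0, 0, 10, 10), (100, 50, 20, 20)))
```

Aiming at the center of a box rather than its corner leaves some margin if the model's estimate of the element's position is slightly off.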