Picture this: you’re building an AI assistant that needs to help users with web-based tasks—filling out forms, extracting data from websites, or running automated tests across different environments. Traditional approaches require hardcoding specific selectors and workflows, but what if your AI could dynamically understand and interact with web pages just like a human would?
This is exactly what becomes possible when you combine Model Context Protocol (MCP) with Playwright. In this guide, I’ll show you how to build an MCP server that gives AI applications sophisticated web automation capabilities, transforming static scripts into intelligent, adaptable web interactions.
Why MCP + Playwright is a Game Changer
Web automation has traditionally been brittle and inflexible. You write a script with specific CSS selectors, it works for a while, then the website changes and everything breaks. Even worse, creating custom automations for different AI applications means rebuilding the same web interaction logic over and over.
MCP changes this paradigm by providing a standardized way for AI applications to access web automation capabilities. Instead of each chatbot, IDE extension, or AI agent implementing its own web scraping logic, they can all share a single, powerful MCP server that handles the complexity of browser automation.
Here’s what this combination enables:
- Dynamic web interactions: AI can adapt to page changes and find elements intelligently
- Reusable automation: One MCP server serves multiple AI applications
- Intelligent testing: AI-driven test generation and execution
- Content extraction: Smart data scraping that understands page context
- Form automation: AI that can fill complex forms based on natural language instructions
Understanding the Architecture
Before diving into implementation, let’s understand how MCP and Playwright work together:
- MCP Server (Playwright) → Exposes web automation capabilities through standardized tools
- MCP Client (AI Application) → Requests web actions using natural language or structured commands
- Playwright Browser → Executes the actual web automation tasks
This architecture means you write the browser automation logic once in your MCP server, and any MCP-compatible AI application can use it—from Claude Desktop to custom AI agents.
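To make that flow a bit more concrete, here is a rough sketch of the message an MCP client sends when it wants the server to run a tool. MCP messages are JSON-RPC 2.0, and tool invocations use the `tools/call` method. The `buildToolCall` helper and the `ToolCallRequest` interface are hypothetical names used only for illustration (in practice the SDK builds these messages for you), and `navigate_to_url` is one of the tools we define later in this guide.

```typescript
// Shape of a JSON-RPC 2.0 "tools/call" request as used by MCP.
interface ToolCallRequest {
  jsonrpc: '2.0';
  id: number;
  method: 'tools/call';
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper (not part of the MCP SDK): builds the request a client would send.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return { jsonrpc: '2.0', id, method: 'tools/call', params: { name, arguments: args } };
}

// Ask the server to navigate the Playwright browser to a page.
const request = buildToolCall(1, 'navigate_to_url', { url: 'https://example.com' });
console.log(JSON.stringify(request, null, 2));
```

The server answers with a result whose `content` array holds text or image items, which is exactly what the tool handlers in the next section return.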
Setting Up Your MCP Playwright Server
Let’s build a practical MCP server that provides essential web automation capabilities. I’ll walk you through the complete setup process.
Prerequisites
Before we start, make sure you have:
- Node.js 18+ installed on your system
- Basic knowledge of JavaScript/TypeScript and web automation concepts
- Understanding of MCP fundamentals (check my previous post on Understanding Model Context Protocol)
Project Initialization
First, let’s create a new MCP server project:
$ mkdir mcp-playwright-server
$ cd mcp-playwright-server
$ npm init -y
Install the required dependencies:
$ npm install @modelcontextprotocol/sdk playwright
$ npm install -D @types/node typescript ts-node
$ npx playwright install
Basic Server Structure
Create the main server file src/index.ts:
#!/usr/bin/env node
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import {
  CallToolRequestSchema,
  ListToolsRequestSchema,
} from '@modelcontextprotocol/sdk/types.js';
import { chromium, Browser, Page } from 'playwright';

class PlaywrightMCPServer {
  private server: Server;
  private browser: Browser | null = null;
  private page: Page | null = null;

  constructor() {
    this.server = new Server(
      {
        name: 'playwright-mcp-server',
        version: '1.0.0',
      },
      {
        capabilities: {
          tools: {},
        },
      }
    );
    this.setupToolHandlers();
    this.setupLifecycle();
  }

  private setupLifecycle() {
    // Advertise the available tools; the browser is launched lazily on first use
    this.server.setRequestHandler(ListToolsRequestSchema, async () => {
      if (!this.browser) {
        await this.initializeBrowser();
      }
      return {
        tools: [
          {
            name: 'navigate_to_url',
            description: 'Navigate to a specific URL',
            inputSchema: {
              type: 'object',
              properties: {
                url: {
                  type: 'string',
                  description: 'The URL to navigate to',
                },
              },
              required: ['url'],
            },
          },
          {
            name: 'click_element',
            description: 'Click on an element using CSS selector or text content',
            inputSchema: {
              type: 'object',
              properties: {
                selector: {
                  type: 'string',
                  description: 'CSS selector or text to find the element',
                },
                selectorType: {
                  type: 'string',
                  enum: ['css', 'text'],
                  description: 'Type of selector to use',
                  default: 'css',
                },
              },
              required: ['selector'],
            },
          },
          {
            name: 'fill_input',
            description: 'Fill an input field with text',
            inputSchema: {
              type: 'object',
              properties: {
                selector: {
                  type: 'string',
                  description: 'CSS selector for the input field',
                },
                text: {
                  type: 'string',
                  description: 'Text to fill in the input field',
                },
              },
              required: ['selector', 'text'],
            },
          },
          {
            name: 'extract_text',
            description: 'Extract text content from elements',
            inputSchema: {
              type: 'object',
              properties: {
                selector: {
                  type: 'string',
                  description: 'CSS selector for elements to extract text from',
                },
              },
              required: ['selector'],
            },
          },
          {
            name: 'take_screenshot',
            description: 'Take a screenshot of the current page',
            inputSchema: {
              type: 'object',
              properties: {
                fullPage: {
                  type: 'boolean',
                  description: 'Whether to capture the full scrollable page',
                  default: false,
                },
              },
            },
          },
          {
            name: 'wait_for_element',
            description: 'Wait for an element to appear on the page',
            inputSchema: {
              type: 'object',
              properties: {
                selector: {
                  type: 'string',
                  description: 'CSS selector for the element to wait for',
                },
                timeout: {
                  type: 'number',
                  description: 'Timeout in milliseconds',
                  default: 5000,
                },
              },
              required: ['selector'],
            },
          },
        ],
      };
    });
  }

  private async initializeBrowser() {
    try {
      this.browser = await chromium.launch({ headless: false });
      this.page = await this.browser.newPage();
      // Log to stderr: stdout is reserved for the MCP stdio transport
      console.error('Browser initialized successfully');
    } catch (error) {
      console.error('Failed to initialize browser:', error);
      throw error;
    }
  }

  private setupToolHandlers() {
    this.server.setRequestHandler(CallToolRequestSchema, async (request) => {
      if (!this.page) {
        // Launch lazily in case a client calls a tool before listing tools
        await this.initializeBrowser();
      }
      const page = this.page;
      if (!page) {
        throw new Error('Browser not initialized');
      }
      const { name } = request.params;
      // Arguments arrive untyped; the input schemas above describe their shape
      const args = (request.params.arguments ?? {}) as Record<string, any>;
      try {
        switch (name) {
          case 'navigate_to_url':
            await page.goto(args.url);
            return {
              content: [
                {
                  type: 'text',
                  text: `Successfully navigated to ${args.url}`,
                },
              ],
            };
          case 'click_element':
            if (args.selectorType === 'text') {
              await page.getByText(args.selector).click();
            } else {
              await page.click(args.selector);
            }
            return {
              content: [
                {
                  type: 'text',
                  text: `Successfully clicked element: ${args.selector}`,
                },
              ],
            };
          case 'fill_input':
            await page.fill(args.selector, args.text);
            return {
              content: [
                {
                  type: 'text',
                  text: `Successfully filled input ${args.selector} with: ${args.text}`,
                },
              ],
            };
          case 'extract_text': {
            const elements = await page.$$(args.selector);
            const texts = await Promise.all(
              elements.map((el) => el.textContent())
            );
            return {
              content: [
                {
                  type: 'text',
                  text: `Extracted text: ${JSON.stringify(texts.filter(Boolean))}`,
                },
              ],
            };
          }
          case 'take_screenshot': {
            // Playwright returns a Buffer; MCP image content expects base64 data
            const screenshot = await page.screenshot({
              fullPage: args.fullPage || false,
            });
            return {
              content: [
                {
                  type: 'text',
                  text: 'Screenshot taken successfully',
                },
                {
                  type: 'image',
                  data: screenshot.toString('base64'),
                  mimeType: 'image/png',
                },
              ],
            };
          }
          case 'wait_for_element':
            await page.waitForSelector(args.selector, {
              timeout: args.timeout || 5000,
            });
            return {
              content: [
                {
                  type: 'text',
                  text: `Element ${args.selector} appeared on page`,
                },
              ],
            };
          default:
            throw new Error(`Unknown tool: ${name}`);
        }
      } catch (error) {
        // Under strict mode the caught value is unknown, so narrow it first
        const message = error instanceof Error ? error.message : String(error);
        return {
          content: [
            {
              type: 'text',
              text: `Error executing ${name}: ${message}`,
            },
          ],
          isError: true,
        };
      }
    });
  }

  async run() {
    const transport = new StdioServerTransport();
    await this.server.connect(transport);
    console.error('MCP Playwright server running on stdio');
  }

  async cleanup() {
    if (this.browser) {
      await this.browser.close();
    }
  }
}

// Handle graceful shutdown
const server = new PlaywrightMCPServer();
process.on('SIGINT', async () => {
  await server.cleanup();
  process.exit(0);
});
server.run().catch(console.error);
Making the Server Executable
Update the package.json that npm init generated with the fields below (keep the dependencies and devDependencies sections that npm install added), then create the TypeScript configuration:
{
  "name": "mcp-playwright-server",
  "version": "1.0.0",
  "type": "module",
  "scripts": {
    "build": "tsc",
    "start": "node dist/index.js",
    "dev": "ts-node --esm src/index.ts"
  },
  "bin": {
    "mcp-playwright-server": "./dist/index.js"
  }
}
Create tsconfig.json:
{
  "compilerOptions": {
    "target": "ES2022",
    "module": "ESNext",
    "moduleResolution": "node",
    "esModuleInterop": true,
    "allowSyntheticDefaultImports": true,
    "strict": true,
    "outDir": "./dist",
    "rootDir": "./src",
    "declaration": true,
    "skipLibCheck": true
  },
  "include": ["src/**/*"],
  "exclude": ["node_modules", "dist"]
}
Build the server:
$ npm run build
Integrating with Claude Desktop
Now let’s configure Claude Desktop to use our MCP Playwright server. Add this configuration to your Claude Desktop config file:
On macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
On Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "playwright": {
      "command": "node",
      "args": ["/absolute/path/to/your/mcp-playwright-server/dist/index.js"]
    }
  }
}
Important: Replace /absolute/path/to/your/mcp-playwright-server with the actual path to your project directory.
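If you would rather not type the absolute path by hand, a small script along these lines can print a ready-to-paste config entry. This is only a sketch: `buildClaudeConfig` is a hypothetical helper name, and it assumes you run it from the project root so that `dist/index.js` sits under the current working directory.

```typescript
import * as path from 'node:path';

// Hypothetical helper: builds the mcpServers entry for a given project directory.
function buildClaudeConfig(projectDir: string) {
  // path.resolve always yields an absolute path, which the config needs
  const entry = path.resolve(projectDir, 'dist', 'index.js');
  return {
    mcpServers: {
      playwright: {
        command: 'node',
        args: [entry],
      },
    },
  };
}

// Assumes the script is run from the project root
const config = buildClaudeConfig(process.cwd());
console.log(JSON.stringify(config, null, 2));
```

Paste the printed JSON into your Claude Desktop config file (merging it with any existing mcpServers entries), then restart Claude Desktop so it picks up the new server.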
Real-World Usage Examples
Once your MCP server is running and connected to Claude Desktop, you can start using it for powerful web automation tasks. Here are some practical examples:
Example 1: Automated Form Filling
Human: Help me fill out a contact form on example.com.
Navigate to the site, fill in Name: "John Doe",
Email: "john@example.com", and Message: "Hello from MCP!"
Claude: I'll help you fill out that contact form. Let me navigate to the site and handle the form submission.
[Uses navigate_to_url tool]
[Uses fill_input tool for each field]
[Uses click_element tool to submit]
Example 2: Content Extraction for Research
Human: Go to a news website and extract all article headlines
from the homepage for my daily digest.
Claude: I'll extract the headlines from the news site for you.
[Uses navigate_to_url tool]
[Uses extract_text tool with headline selectors]
[Returns formatted list of headlines]
Example 3: Web Application Testing
Human: Test the login flow on our staging site. Try both
valid and invalid credentials and take screenshots.
Claude: I'll test the login flow systematically and document the results.
[Uses navigate_to_url for login page]
[Uses fill_input and click_element for form interaction]
[Uses take_screenshot to capture results]
[Tests multiple scenarios]
Advanced Features and Best Practices
Error Handling and Resilience
Add robust error handling to your MCP server:
private async executeWithRetry<T>(
  operation: () => Promise<T>,
  maxRetries: number = 3
): Promise<T> {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await operation();
    } catch (error) {
      if (attempt === maxRetries) {
        throw error;
      }
      console.error(`Attempt ${attempt} failed, retrying...`);
      // Linear backoff that still waits even if the page has not been created yet
      await new Promise((resolve) => setTimeout(resolve, 1000 * attempt));
    }
  }
  throw new Error('Max retries exceeded');
}
Smart Element Detection
Enhance your server with intelligent element finding:
private async smartFindElement(selector: string, selectorType: string = 'css') {
  if (!this.page) {
    throw new Error('Browser not initialized');
  }
  const page = this.page;
  if (selectorType === 'text') {
    // Locators are lazy and never throw on creation, so check the match count:
    // try exact match first, then fall back to partial match
    const exact = page.getByText(selector, { exact: true });
    if ((await exact.count()) > 0) {
      return exact.first();
    }
    return page.getByText(selector).first();
  }
  // For CSS selectors, try multiple strategies in order
  const strategies = [
    () => page.locator(selector),
    () => page.locator(`[data-testid="${selector}"]`),
    () => page.locator(`[aria-label="${selector}"]`),
    () => page.getByRole('button', { name: selector }),
  ];
  for (const strategy of strategies) {
    try {
      const locator = strategy();
      if ((await locator.count()) > 0) return locator.first();
    } catch {
      // The input may not be valid selector syntax for this strategy; try the next
      continue;
    }
  }
  throw new Error(`Could not find element with selector: ${selector}`);
}
Performance Optimization
Implement connection pooling and resource management:
class BrowserPool {
  private browsers: Browser[] = [];
  private maxBrowsers = 3;
  private nextIndex = 0;

  async getBrowser(): Promise<Browser> {
    if (this.browsers.length < this.maxBrowsers) {
      const browser = await chromium.launch({ headless: false });
      this.browsers.push(browser);
      return browser;
    }
    // Pool is full: reuse existing browsers in round-robin order
    const browser = this.browsers[this.nextIndex];
    this.nextIndex = (this.nextIndex + 1) % this.browsers.length;
    return browser;
  }

  async cleanup() {
    await Promise.all(this.browsers.map((browser) => browser.close()));
    this.browsers = [];
    this.nextIndex = 0;
  }
}
Security Considerations
When building MCP servers for web automation, security should be a top priority:
URL Validation
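private validateUrl(url: string): boolean {
  try {
    const parsed = new URL(url);
    // Only allow specific protocols
    if (!['http:', 'https:'].includes(parsed.protocol)) {
      return false;
    }
    // Block loopback and private networks in production.
    // Prefix checks only cover literal addresses; a public hostname that
    // resolves to an internal IP is not caught here.
    const hostname = parsed.hostname;
    const isInternal =
      hostname === 'localhost' ||
      hostname === '[::1]' ||
      hostname.startsWith('127.') ||
      hostname.startsWith('10.') ||
      hostname.startsWith('192.168.') ||
      hostname.startsWith('169.254.') ||
      /^172\.(1[6-9]|2\d|3[01])\./.test(hostname);
    if (isInternal) {
      return process.env.NODE_ENV === 'development';
    }
    return true;
  } catch {
    return false;
  }
}
Input Sanitization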
Always sanitize user inputs to prevent injection attacks:
private sanitizeSelector(selector: string): string {
  // Remove potentially dangerous characters
  return selector.replace(/[<>"']/g, '');
}
private sanitizeText(text: string): string {
  // Escape special characters as HTML entities for safe insertion into markup
  const entities: Record<string, string> = {
    '<': '&lt;',
    '>': '&gt;',
    '&': '&amp;',
    '"': '&quot;',
    "'": '&#39;',
  };
  return text.replace(/[<>&"']/g, (char) => entities[char]);
}
Troubleshooting Common Issues
Browser Launch Problems
If your browser fails to launch, try these solutions:
# Install system dependencies on Linux
$ sudo apt-get install -y libnss3 libnspr4 libatk-bridge2.0-0 libdrm2 libxkbcommon0 libxss1 libasound2
# On a server without a display, run the headed browser under a virtual framebuffer
$ Xvfb :99 -screen 0 1024x768x24 &
$ export DISPLAY=:99
Connection Issues
Verify your Claude Desktop configuration:
# Test your MCP server directly
$ node dist/index.js
# Should show: "MCP Playwright server running on stdio"
# Check Claude Desktop logs
$ tail -f ~/Library/Logs/Claude/mcp.log
Memory Management
For long-running automation tasks, implement proper cleanup:
private async cleanupResources() {
  // Close all pages except the main one.
  // In Playwright, pages belong to browser contexts rather than to the browser itself.
  const pages = this.browser?.contexts().flatMap((context) => context.pages()) ?? [];
  for (const page of pages) {
    if (page !== this.page) {
      await page.close();
    }
  }
  // Clear web storage for the current origin periodically
  if (this.page) {
    await this.page.evaluate(() => {
      localStorage.clear();
      sessionStorage.clear();
    });
  }
}
Extending Your MCP Server
Your MCP Playwright server can be extended with additional capabilities:
File Upload Support
{
  name: 'upload_file',
  description: 'Upload a file to an input element',
  inputSchema: {
    type: 'object',
    properties: {
      selector: { type: 'string' },
      filePath: { type: 'string' }
    },
    required: ['selector', 'filePath']
  }
}
API Integration
Combine web automation with API calls for comprehensive testing:
{
  name: 'verify_api_response',
  description: 'Verify that a web action triggers the expected API response',
  inputSchema: {
    type: 'object',
    properties: {
      action: { type: 'string' },
      expectedEndpoint: { type: 'string' }
    }
  }
}
Mobile Device Simulation
Add mobile testing capabilities:
await this.page.setViewportSize({ width: 375, height: 667 });
await this.page.emulateMedia({ media: 'screen', colorScheme: 'dark' });
The Future of AI-Driven Web Automation
The combination of MCP and Playwright represents a significant shift toward more intelligent, adaptable web automation. As AI models become more sophisticated, we can expect to see:
- Visual element recognition: AI that can understand page layouts without CSS selectors
- Dynamic workflow adaptation: Automation that adjusts to unexpected page changes
- Natural language test generation: Creating comprehensive test suites from user stories
- Cross-browser intelligence: AI that optimizes tests for different browser behaviors
Early adoption of MCP for web automation gives you a foundation that will scale with these advancing capabilities.
Conclusion
Building an MCP server for Playwright transforms static web automation into dynamic, AI-driven interactions. Instead of brittle scripts that break when websites change, you get intelligent automation that can adapt and evolve with your needs.
What makes this approach particularly powerful is its reusability—once you’ve built your MCP Playwright server, any AI application can leverage its capabilities. Whether you’re automating form submissions, extracting data for research, or building comprehensive test suites, the combination of MCP’s standardization with Playwright’s robust browser automation creates a powerful foundation for modern web interactions.
The architecture I’ve shown you here is just the beginning. As you use this setup, you’ll discover opportunities to add more sophisticated features like visual regression testing, performance monitoring, and advanced element detection strategies. The key is starting with solid foundations and iterating based on your specific use cases.
If you’re already working with web automation or considering adding AI capabilities to your testing workflows, I’d recommend starting with a simple MCP Playwright server like the one we built today. The investment in learning MCP now will pay dividends as the ecosystem continues to mature and expand.