Building AI-Powered Web Automation with MCP and Playwright

September 12, 2025

Picture this: you’re building an AI assistant that needs to help users with web-based tasks—filling out forms, extracting data from websites, or running automated tests across different environments. Traditional approaches require hardcoding specific selectors and workflows, but what if your AI could dynamically understand and interact with web pages just like a human would?

This is exactly what becomes possible when you combine Model Context Protocol (MCP) with Playwright. In this guide, I’ll show you how to build an MCP server that gives AI applications sophisticated web automation capabilities, transforming static scripts into intelligent, adaptable web interactions.

Why MCP + Playwright is a Game Changer

Web automation has traditionally been brittle and inflexible. You write a script with specific CSS selectors, it works for a while, then the website changes and everything breaks. Even worse, creating custom automations for different AI applications means rebuilding the same web interaction logic over and over.

MCP changes this paradigm by providing a standardized way for AI applications to access web automation capabilities. Instead of each chatbot, IDE extension, or AI agent implementing its own web scraping logic, they can all share a single, powerful MCP server that handles the complexity of browser automation.

Here’s what this combination enables:

Dynamic web interactions: AI can adapt to page changes and find elements intelligently
Reusable automation: One MCP server serves multiple AI applications
Intelligent testing: AI-driven test generation and execution
Content extraction: Smart data scraping that understands page context
Form automation: AI that can fill complex forms based on natural language instructions

Understanding the Architecture

Before diving into implementation, let’s understand how MCP and Playwright work together:

MCP Server (Playwright) → Exposes web automation capabilities through standardized tools MCP Client (AI Application) → Requests web actions using natural language or structured commands Playwright Browser → Executes the actual web automation tasks

This architecture means you write the browser automation logic once in your MCP server, and any MCP-compatible AI application can use it—from Claude Desktop to custom AI agents.

Setting Up Your MCP Playwright Server

Let’s build a practical MCP server that provides essential web automation capabilities. I’ll walk you through the complete setup process.

Prerequisites

Before we start, make sure you have:

Node.js 18+ installed on your system
Basic knowledge of JavaScript/TypeScript and web automation concepts
Understanding of MCP fundamentals (check my previous post on Understanding Model Context Protocol)

Project Initialization

First, let’s create a new MCP server project:

$ mkdir mcp-playwright-server
$ cd mcp-playwright-server
$ npm init -y

Install the required dependencies:

$ npm install @modelcontextprotocol/sdk playwright
$ npm install -D @types/node typescript ts-node
$ npx playwright install

Basic Server Structure

Create the main server file src/index.ts:

#!/usr/bin/env node

import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import {
  CallToolRequestSchema,
  ListToolsRequestSchema,
} from '@modelcontextprotocol/sdk/types.js';
import { chromium, Browser, Page } from 'playwright';

class PlaywrightMCPServer {
  private server: Server;
  private browser: Browser | null = null;
  private page: Page | null = null;

  constructor() {
    this.server = new Server(
      {
        name: 'playwright-mcp-server',
        version: '1.0.0',
      },
      {
        capabilities: {
          tools: {},
        },
      }
    );

    this.setupToolHandlers();
    this.setupLifecycle();
  }

  private setupLifecycle() {
    // Initialize browser when server starts
    this.server.setRequestHandler(ListToolsRequestSchema, async () => {
      if (!this.browser) {
        await this.initializeBrowser();
      }
      return {
        tools: [
          {
            name: 'navigate_to_url',
            description: 'Navigate to a specific URL',
            inputSchema: {
              type: 'object',
              properties: {
                url: {
                  type: 'string',
                  description: 'The URL to navigate to',
                },
              },
              required: ['url'],
            },
          },
          {
            name: 'click_element',
            description: 'Click on an element using CSS selector or text content',
            inputSchema: {
              type: 'object',
              properties: {
                selector: {
                  type: 'string',
                  description: 'CSS selector or text to find the element',
                },
                selectorType: {
                  type: 'string',
                  enum: ['css', 'text'],
                  description: 'Type of selector to use',
                  default: 'css',
                },
              },
              required: ['selector'],
            },
          },
          {
            name: 'fill_input',
            description: 'Fill an input field with text',
            inputSchema: {
              type: 'object',
              properties: {
                selector: {
                  type: 'string',
                  description: 'CSS selector for the input field',
                },
                text: {
                  type: 'string',
                  description: 'Text to fill in the input field',
                },
              },
              required: ['selector', 'text'],
            },
          },
          {
            name: 'extract_text',
            description: 'Extract text content from elements',
            inputSchema: {
              type: 'object',
              properties: {
                selector: {
                  type: 'string',
                  description: 'CSS selector for elements to extract text from',
                },
              },
              required: ['selector'],
            },
          },
          {
            name: 'take_screenshot',
            description: 'Take a screenshot of the current page',
            inputSchema: {
              type: 'object',
              properties: {
                fullPage: {
                  type: 'boolean',
                  description: 'Whether to capture the full scrollable page',
                  default: false,
                },
              },
            },
          },
          {
            name: 'wait_for_element',
            description: 'Wait for an element to appear on the page',
            inputSchema: {
              type: 'object',
              properties: {
                selector: {
                  type: 'string',
                  description: 'CSS selector for the element to wait for',
                },
                timeout: {
                  type: 'number',
                  description: 'Timeout in milliseconds',
                  default: 5000,
                },
              },
              required: ['selector'],
            },
          },
        ],
      };
    });
  }

  private async initializeBrowser() {
    try {
      this.browser = await chromium.launch({ headless: false });
      this.page = await this.browser.newPage();
      console.error('Browser initialized successfully');
    } catch (error) {
      console.error('Failed to initialize browser:', error);
      throw error;
    }
  }

  private setupToolHandlers() {
    this.server.setRequestHandler(CallToolRequestSchema, async (request) => {
      if (!this.page) {
        throw new Error('Browser not initialized');
      }

      const { name, arguments: args } = request.params;

      try {
        switch (name) {
          case 'navigate_to_url':
            await this.page.goto(args.url);
            return {
              content: [
                {
                  type: 'text',
                  text: `Successfully navigated to ${args.url}`,
                },
              ],
            };

          case 'click_element':
            if (args.selectorType === 'text') {
              await this.page.getByText(args.selector).click();
            } else {
              await this.page.click(args.selector);
            }
            return {
              content: [
                {
                  type: 'text',
                  text: `Successfully clicked element: ${args.selector}`,
                },
              ],
            };

          case 'fill_input':
            await this.page.fill(args.selector, args.text);
            return {
              content: [
                {
                  type: 'text',
                  text: `Successfully filled input ${args.selector} with: ${args.text}`,
                },
              ],
            };

          case 'extract_text':
            const elements = await this.page.$$(args.selector);
            const texts = await Promise.all(
              elements.map(el => el.textContent())
            );
            return {
              content: [
                {
                  type: 'text',
                  text: `Extracted text: ${JSON.stringify(texts.filter(Boolean))}`,
                },
              ],
            };

          case 'take_screenshot':
            const screenshot = await this.page.screenshot({
              fullPage: args.fullPage || false,
              encoding: 'base64',
            });
            return {
              content: [
                {
                  type: 'text',
                  text: 'Screenshot taken successfully',
                },
                {
                  type: 'image',
                  data: screenshot,
                  mimeType: 'image/png',
                },
              ],
            };

          case 'wait_for_element':
            await this.page.waitForSelector(args.selector, {
              timeout: args.timeout || 5000,
            });
            return {
              content: [
                {
                  type: 'text',
                  text: `Element ${args.selector} appeared on page`,
                },
              ],
            };

          default:
            throw new Error(`Unknown tool: ${name}`);
        }
      } catch (error) {
        return {
          content: [
            {
              type: 'text',
              text: `Error executing ${name}: ${error.message}`,
            },
          ],
          isError: true,
        };
      }
    });
  }

  async run() {
    const transport = new StdioServerTransport();
    await this.server.connect(transport);
    console.error('MCP Playwright server running on stdio');
  }

  async cleanup() {
    if (this.browser) {
      await this.browser.close();
    }
  }
}

// Handle graceful shutdown
const server = new PlaywrightMCPServer();
process.on('SIGINT', async () => {
  await server.cleanup();
  process.exit(0);
});

server.run().catch(console.error);

Making the Server Executable

Create a package.json script and TypeScript configuration:

{
  "name": "mcp-playwright-server",
  "version": "1.0.0",
  "type": "module",
  "scripts": {
    "build": "tsc",
    "start": "node dist/index.js",
    "dev": "ts-node --esm src/index.ts"
  },
  "bin": {
    "mcp-playwright-server": "./dist/index.js"
  }
}

Create tsconfig.json:

{
  "compilerOptions": {
    "target": "ES2022",
    "module": "ESNext",
    "moduleResolution": "node",
    "esModuleInterop": true,
    "allowSyntheticDefaultImports": true,
    "strict": true,
    "outDir": "./dist",
    "rootDir": "./src",
    "declaration": true,
    "skipLibCheck": true
  },
  "include": ["src/**/*"],
  "exclude": ["node_modules", "dist"]
}

Build the server:

$ npm run build

Integrating with Claude Desktop

Now let’s configure Claude Desktop to use our MCP Playwright server. Add this configuration to your Claude Desktop config file:

On macOS: ~/Library/Application Support/Claude/claude_desktop_config.json On Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "playwright": {
      "command": "node",
      "args": ["/absolute/path/to/your/mcp-playwright-server/dist/index.js"]
    }
  }
}

Important: Replace /absolute/path/to/your/mcp-playwright-server with the actual path to your project directory.

Real-World Usage Examples

Once your MCP server is running and connected to Claude Desktop, you can start using it for powerful web automation tasks. Here are some practical examples:

Example 1: Automated Form Filling

Human: Help me fill out a contact form on example.com. 
Navigate to the site, fill in Name: "John Doe", 
Email: "john@example.com", and Message: "Hello from MCP!"

Claude: I'll help you fill out that contact form. Let me navigate to the site and handle the form submission.

[Uses navigate_to_url tool]
[Uses fill_input tool for each field]
[Uses click_element tool to submit]

Example 2: Content Extraction for Research

Human: Go to a news website and extract all article headlines 
from the homepage for my daily digest.

Claude: I'll extract the headlines from the news site for you.

[Uses navigate_to_url tool]
[Uses extract_text tool with headline selectors]
[Returns formatted list of headlines]

Example 3: Web Application Testing

Human: Test the login flow on our staging site. Try both 
valid and invalid credentials and take screenshots.

Claude: I'll test the login flow systematically and document the results.

[Uses navigate_to_url for login page]
[Uses fill_input and click_element for form interaction]
[Uses take_screenshot to capture results]
[Tests multiple scenarios]

Advanced Features and Best Practices

Error Handling and Resilience

Add robust error handling to your MCP server:

private async executeWithRetry<T>(
  operation: () => Promise<T>,
  maxRetries: number = 3
): Promise<T> {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await operation();
    } catch (error) {
      if (attempt === maxRetries) {
        throw error;
      }
      console.error(`Attempt ${attempt} failed, retrying...`);
      await this.page?.waitForTimeout(1000 * attempt);
    }
  }
  throw new Error('Max retries exceeded');
}

Smart Element Detection

Enhance your server with intelligent element finding:

private async smartFindElement(selector: string, selectorType: string = 'css') {
  if (selectorType === 'text') {
    // Try exact match first, then partial match
    try {
      return await this.page.getByText(selector, { exact: true });
    } catch {
      return await this.page.getByText(selector);
    }
  }
  
  // For CSS selectors, try multiple strategies
  const strategies = [
    () => this.page.$(selector),
    () => this.page.$(`[data-testid="${selector}"]`),
    () => this.page.$(`[aria-label="${selector}"]`),
    () => this.page.getByRole('button', { name: selector }),
  ];
  
  for (const strategy of strategies) {
    try {
      const element = await strategy();
      if (element) return element;
    } catch {
      continue;
    }
  }
  
  throw new Error(`Could not find element with selector: ${selector}`);
}

Performance Optimization

Implement connection pooling and resource management:

class BrowserPool {
  private browsers: Browser[] = [];
  private maxBrowsers = 3;
  
  async getBrowser(): Promise<Browser> {
    if (this.browsers.length < this.maxBrowsers) {
      const browser = await chromium.launch({ headless: false });
      this.browsers.push(browser);
      return browser;
    }
    
    // Return least used browser
    return this.browsers[0];
  }
  
  async cleanup() {
    await Promise.all(this.browsers.map(browser => browser.close()));
    this.browsers = [];
  }
}

Security Considerations

When building MCP servers for web automation, security should be a top priority:

URL Validation

private validateUrl(url: string): boolean {
  try {
    const parsed = new URL(url);
    // Only allow specific protocols
    if (!['http:', 'https:'].includes(parsed.protocol)) {
      return false;
    }
    
    // Block internal networks in production
    const hostname = parsed.hostname;
    if (hostname === 'localhost' || hostname.startsWith('192.168.') || hostname.startsWith('10.')) {
      return process.env.NODE_ENV === 'development';
    }
    
    return true;
  } catch {
    return false;
  }
}

Input Sanitization

Always sanitize user inputs to prevent injection attacks:

private sanitizeSelector(selector: string): string {
  // Remove potentially dangerous characters
  return selector.replace(/[<>\"']/g, '');
}

private sanitizeText(text: string): string {
  // Escape special characters for safe injection
  return text.replace(/[<>&\"']/g, (char) => {
    const entities = {
      '<': '&lt;',
      '>': '&gt;',
      '&': '&amp;',
      '"': '&quot;',
      "'": '&#39;'
    };
    return entities[char];
  });
}

Troubleshooting Common Issues

Browser Launch Problems

If your browser fails to launch, try these solutions:

# Install system dependencies on Linux
$ sudo apt-get install -y libnss3 libnspr4 libatk-bridge2.0-0 libdrm2 libxkbcommon0 libxss1 libasound2

# For headless mode in production
$ export DISPLAY=:99
$ Xvfb :99 -screen 0 1024x768x24 &

Connection Issues

Verify your Claude Desktop configuration:

# Test your MCP server directly
$ node dist/index.js
# Should show: "MCP Playwright server running on stdio"

# Check Claude Desktop logs
$ tail -f ~/Library/Logs/Claude/mcp.log

Memory Management

For long-running automation tasks, implement proper cleanup:

private async cleanupResources() {
  // Close all pages except the main one
  const pages = await this.browser?.pages() || [];
  for (const page of pages.slice(1)) {
    await page.close();
  }
  
  // Clear browser cache periodically
  if (this.page) {
    await this.page.evaluate(() => {
      localStorage.clear();
      sessionStorage.clear();
    });
  }
}

Extending Your MCP Server

Your MCP Playwright server can be extended with additional capabilities:

File Upload Support

{
  name: 'upload_file',
  description: 'Upload a file to an input element',
  inputSchema: {
    type: 'object',
    properties: {
      selector: { type: 'string' },
      filePath: { type: 'string' }
    },
    required: ['selector', 'filePath']
  }
}

API Integration

Combine web automation with API calls for comprehensive testing:

{
  name: 'verify_api_response',
  description: 'Verify that a web action triggers the expected API response',
  inputSchema: {
    type: 'object',
    properties: {
      action: { type: 'string' },
      expectedEndpoint: { type: 'string' }
    }
  }
}

Mobile Device Simulation

Add mobile testing capabilities:

await this.page.setViewportSize({ width: 375, height: 667 });
await this.page.emulateMedia({ media: 'screen', colorScheme: 'dark' });

The Future of AI-Driven Web Automation

The combination of MCP and Playwright represents a significant shift toward more intelligent, adaptable web automation. As AI models become more sophisticated, we can expect to see:

Visual element recognition: AI that can understand page layouts without CSS selectors
Dynamic workflow adaptation: Automation that adjusts to unexpected page changes
Natural language test generation: Creating comprehensive test suites from user stories
Cross-browser intelligence: AI that optimizes tests for different browser behaviors

Early adoption of MCP for web automation gives you a foundation that will scale with these advancing capabilities.

Conclusion

Building an MCP server for Playwright transforms static web automation into dynamic, AI-driven interactions. Instead of brittle scripts that break when websites change, you get intelligent automation that can adapt and evolve with your needs.

What makes this approach particularly powerful is its reusability—once you’ve built your MCP Playwright server, any AI application can leverage its capabilities. Whether you’re automating form submissions, extracting data for research, or building comprehensive test suites, the combination of MCP’s standardization with Playwright’s robust browser automation creates a powerful foundation for modern web interactions.

The architecture I’ve shown you here is just the beginning. As you use this setup, you’ll discover opportunities to add more sophisticated features like visual regression testing, performance monitoring, and advanced element detection strategies. The key is starting with solid foundations and iterating based on your specific use cases.

If you’re already working with web automation or considering adding AI capabilities to your testing workflows, I’d recommend starting with a simple MCP Playwright server like the one we built today. The investment in learning MCP now will pay dividends as the ecosystem continues to mature and expand.