What Is the Puppeteer MCP Server?
The Puppeteer MCP server gives AI assistants the ability to control a real web browser. Instead of just fetching HTML like a simple HTTP client, Puppeteer launches an actual Chromium browser instance that can render JavaScript, interact with dynamic content, fill out forms, take screenshots, and generate PDFs. Through MCP (Model Context Protocol), your AI assistant - whether Claude Desktop or Cursor - can drive this browser using natural language commands.
This is fundamentally different from search-based tools like Brave Search or Exa Search. Those tools return search results or pre-processed text. Puppeteer gives the AI full browser control: it can navigate to any URL, wait for content to load, click buttons, scroll pages, extract structured data, and capture visual snapshots. If you have ever used Puppeteer in a Node.js script, imagine replacing all that code with a single sentence to Claude.
Common use cases include scraping product data from e-commerce sites, monitoring web pages for changes, automating form submissions, generating PDF reports from web dashboards, and taking screenshots for documentation or testing. This guide covers all of them with real prompts you can copy and use immediately.
Setting Up Puppeteer MCP Server
Prerequisites
- Node.js 18+ - verify with node --version
- Claude Desktop or Cursor - latest version
- Chromium - Puppeteer downloads its own Chromium binary, so you typically do not need to install Chrome separately
Claude Desktop Setup
Open your claude_desktop_config.json file (see paths below) and add the Puppeteer server:
| OS | Config Path |
|---|---|
| macOS | ~/Library/Application Support/Claude/claude_desktop_config.json |
| Windows | %APPDATA%\Claude\claude_desktop_config.json |
| Linux | ~/.config/Claude/claude_desktop_config.json |
{
"mcpServers": {
"puppeteer": {
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-puppeteer"]
}
}
}
Save the file and restart Claude Desktop completely (quit and reopen, not just close the window).
Cursor Setup
In Cursor, open Settings > MCP Servers and add a new server with the command npx -y @modelcontextprotocol/server-puppeteer. Cursor will start the server automatically when you open a new AI chat.
Claude Code CLI Setup
claude mcp add puppeteer -- npx -y @modelcontextprotocol/server-puppeteer
Use Case 1: Scrape a Product Page
One of the most common uses for Puppeteer MCP is extracting structured data from web pages. Here is how to scrape product information from an e-commerce site:
# Prompt to Claude:
"Navigate to https://example-store.com/products/widget-pro and extract
the product name, price, description, and all customer review ratings.
Return the data as a JSON object."
Claude will use Puppeteer to navigate to the page, wait for the content to render (including JavaScript-loaded reviews), and extract the data into a clean JSON structure. This works even on sites that load content dynamically with React, Vue, or Angular - unlike simple HTTP scraping tools that only see the initial HTML.
For more complex scraping tasks, you can chain multiple pages:
# Multi-page scraping prompt:
"Go to https://example-store.com/category/electronics, extract all
product links from the first page, then visit each product page and
collect the name, price, and rating. Save everything as a CSV table."
Scraping with Specific HTML Selectors
When you need precision, tell Claude exactly which selectors to target. This is especially useful for sites with complex layouts where the AI might grab the wrong element:
# Target specific CSS selectors:
"Navigate to https://example-store.com/products/widget-pro.
Extract the product title from the h1.product-title element,
the price from span.price-current, the original price from
span.price-original, and all review text from div.review-body p elements.
Return as JSON."
# Extract data from a table:
"Go to https://example.com/specs/laptop-pro and extract the entire
specifications table. The table has class 'spec-table' with th elements
for labels and td elements for values. Return as a JSON object with
each spec name as a key."
# Handle nested elements:
"Navigate to https://news.example.com. Extract each article from the
div.article-card elements. For each card, get the title from h2 > a,
the date from span.published-date, the author from span.author-name,
and the summary from p.article-excerpt. Return an array of objects."
You can also target elements by their data attributes, which is often more reliable than class names that may change between deployments:
# Data attribute selectors:
"Go to https://shop.example.com/category/shoes. Find all elements with
data-product-id attributes. For each product, extract the data-product-id
value, the text content of [data-field='name'], and the text content of
[data-field='price']. Return the results as a JSON array."
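For readers who want to see roughly what such a selector-based prompt amounts to in code, here is a minimal sketch. It is not the MCP server's actual implementation. `ScrapablePage` and `TextElement` are hypothetical stand-ins whose method names mirror Puppeteer's `goto`, `waitForSelector`, `$eval`, and `$$eval`, and the selectors are the illustrative ones from the first prompt in this section.

```typescript
// Hypothetical minimal interfaces; method names mirror Puppeteer's Page API.
interface TextElement {
  textContent: string | null;
}

interface ScrapablePage {
  goto(url: string, options?: { waitUntil?: "load" | "domcontentloaded" }): Promise<unknown>;
  waitForSelector(selector: string, options?: { timeout?: number }): Promise<unknown>;
  $eval<T>(selector: string, fn: (el: TextElement) => T): Promise<T>;
  $$eval<T>(selector: string, fn: (els: TextElement[]) => T): Promise<T>;
}

interface ProductData {
  title: string;
  price: string;
  reviews: string[];
}

async function scrapeProduct(page: ScrapablePage, url: string): Promise<ProductData> {
  await page.goto(url, { waitUntil: "domcontentloaded" });
  // Wait for the price element so JavaScript-rendered content has time to appear.
  await page.waitForSelector("span.price-current", { timeout: 10000 });

  // In a real browser these callbacks run inside the page,
  // so each one is self-contained and references no outer variables.
  const title = await page.$eval("h1.product-title", (el) => (el.textContent || "").trim());
  const price = await page.$eval("span.price-current", (el) => (el.textContent || "").trim());
  const reviews = await page.$$eval("div.review-body p", (els) =>
    els.map((el) => (el.textContent || "").trim())
  );
  return { title, price, reviews };
}
```

Typing the function against a small interface rather than a concrete browser object keeps the sketch runnable with a stub page, which is also a convenient way to test extraction logic without launching Chromium.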
Handling Infinite Scroll and Lazy Loading
Many modern sites load content as you scroll. Tell Claude to handle this explicitly:
# Infinite scroll scraping:
"Navigate to https://feed.example.com/trending. Scroll down 5 times,
waiting 2 seconds between each scroll for new content to load. Then
extract all post titles and their like counts from the loaded content."
# Lazy-loaded images:
"Go to https://gallery.example.com/portfolio. Scroll through the
entire page so all lazy-loaded images render. Then extract all image
URLs from img.portfolio-item elements."
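The scroll-and-wait loop described in these prompts can be sketched as follows. This is an illustration under assumptions, not what the server runs: `EvaluatingPage` is a hypothetical interface modeled on Puppeteer's `page.evaluate`, which accepts a JavaScript expression as a string, and the stop condition (page height no longer growing) is one reasonable heuristic among several.

```typescript
// Hypothetical minimal interface; Puppeteer's page.evaluate also accepts an expression string.
interface EvaluatingPage {
  evaluate(expression: string): Promise<unknown>;
}

const delay = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Scrolls to the bottom repeatedly until the page stops growing or maxScrolls is reached.
// Returns the number of scrolls that revealed new content.
async function scrollUntilStable(
  page: EvaluatingPage,
  maxScrolls = 5,
  pauseMs = 2000
): Promise<number> {
  let lastHeight = 0;
  let scrolls = 0;
  while (scrolls < maxScrolls) {
    // Scroll to the bottom and read the resulting document height in one expression.
    const height = Number(
      await page.evaluate(
        "(window.scrollTo(0, document.body.scrollHeight), document.body.scrollHeight)"
      )
    );
    if (height === lastHeight) break; // nothing new loaded since the previous scroll
    lastHeight = height;
    scrolls += 1;
    await delay(pauseMs); // give lazy-loaded content time to arrive
  }
  return scrolls;
}
```

The defaults of 5 scrolls and a 2-second pause match the first prompt above. Stopping early when the height is unchanged avoids wasting time on pages that run out of content.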
Use Case 2: Take a Screenshot
Puppeteer can capture full-page or element-specific screenshots, which is invaluable for documentation, testing, and monitoring:
# Full page screenshot:
"Take a screenshot of https://example.com and show it to me"
# Specific viewport size:
"Take a screenshot of https://example.com at 1920x1080 resolution"
# Mobile viewport:
"Take a screenshot of https://example.com as it would appear on
an iPhone 14 (390x844)"
Claude returns the screenshot directly in the conversation. This is extremely useful for quickly checking how a website looks without opening a browser, verifying responsive designs, or documenting the current state of a page before making changes.
Screenshot Types Compared
Puppeteer supports three distinct screenshot modes, and each serves a different purpose:
| Screenshot Type | What It Captures | Best For | File Size |
|---|---|---|---|
| Viewport | Only the visible area of the browser window | Above-the-fold checks, hero section verification | Small (200-500 KB) |
| Full page | The entire scrollable page from top to bottom | Full page documentation, design review | Large (1-10 MB) |
| Element | A single DOM element, cropped to its bounding box | Component testing, capturing a specific chart or widget | Tiny (10-200 KB) |
# Element screenshot - capture just the navigation bar:
"Navigate to https://example.com and take a screenshot of only the
nav.main-header element"
# Full page screenshot for documentation:
"Take a full-page screenshot of https://docs.example.com/api-reference,
capturing the entire scrollable content"
# Viewport screenshot at a specific breakpoint:
"Take a viewport-only screenshot of https://example.com at 768x1024
to check the tablet layout"
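As a rough guide to how the three modes differ in code, the sketch below turns a request into a plan. `ScreenshotRequest`, `ScreenshotPlan`, and `planScreenshot` are hypothetical names. The only Puppeteer detail assumed is that `page.screenshot` takes a `fullPage` boolean, that the viewport is set separately (for example with `page.setViewport`), and that an element is captured by calling `screenshot()` on its handle.

```typescript
type ScreenshotRequest =
  | { mode: "viewport"; width: number; height: number }
  | { mode: "fullPage" }
  | { mode: "element"; selector: string };

interface ScreenshotPlan {
  viewport?: { width: number; height: number }; // apply before capturing, e.g. via page.setViewport
  elementSelector?: string; // look this element up and call screenshot() on its handle
  options: { fullPage: boolean }; // passed to the screenshot call
}

function planScreenshot(request: ScreenshotRequest): ScreenshotPlan {
  switch (request.mode) {
    case "viewport":
      return {
        viewport: { width: request.width, height: request.height },
        options: { fullPage: false },
      };
    case "fullPage":
      return { options: { fullPage: true } };
    case "element":
      return { elementSelector: request.selector, options: { fullPage: false } };
  }
}
```

For example, `planScreenshot({ mode: "viewport", width: 768, height: 1024 })` describes the tablet-breakpoint capture from the last prompt above.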
Use Case 3: Generate a PDF
Puppeteer excels at converting web pages to PDF format with precise control over formatting:
# Basic PDF generation:
"Navigate to https://example.com/report/q1-2026 and generate a PDF
of the page"
# With formatting options:
"Generate a PDF of https://example.com/invoice/12345 in A4 format
with landscape orientation and include background graphics"
This is particularly useful for generating printable versions of web reports, invoices, and dashboards. The PDF captures the page exactly as it renders in the browser, including CSS styles, charts, and images.
PDF Generation Options
Puppeteer provides extensive control over PDF output. Here are the most useful options you can request through Claude:
| Option | Values | Use Case |
|---|---|---|
| Format | A4, Letter, Legal, Tabloid | Standard document sizes for printing |
| Orientation | Portrait, Landscape | Wide tables and dashboards need landscape |
| Background graphics | On / Off | Include CSS backgrounds and colors in the PDF |
| Margins | Custom (e.g., 20mm top, 15mm sides) | Control whitespace around content |
| Header / Footer | Custom HTML templates | Add page numbers, dates, or company logos |
| Page range | e.g., 1-3 or 2 | Export only specific pages of a long document |
# PDF with custom margins and page numbers:
"Generate a PDF of https://example.com/report/annual-2025 in A4 portrait
format with 25mm margins on all sides, include background graphics,
and add page numbers at the bottom center"
# PDF of just the first section:
"Navigate to https://docs.example.com/api and generate a PDF of only
pages 1 through 3 in Letter format"
# Dashboard PDF in landscape:
"Generate a landscape PDF of https://dashboard.example.com/overview
with background graphics enabled so the charts render correctly"
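The options in the table map fairly directly onto the options object that Puppeteer's `page.pdf` accepts. The sketch below shows one plausible translation. `PdfRequest` and `buildPdfOptions` are hypothetical names, while the output field names (`format`, `landscape`, `printBackground`, `margin`, `displayHeaderFooter`, `footerTemplate`, `pageRanges`) follow Puppeteer's PDF options as I understand them, so check them against the version you have installed.

```typescript
interface PdfRequest {
  format: "A4" | "Letter" | "Legal" | "Tabloid";
  landscape?: boolean;
  backgrounds?: boolean;
  marginMm?: number; // same margin on all four sides
  pageNumbers?: boolean;
  pageRanges?: string; // e.g. "1-3"
}

interface PdfOptions {
  format: string;
  landscape: boolean;
  printBackground: boolean;
  margin?: { top: string; right: string; bottom: string; left: string };
  displayHeaderFooter?: boolean;
  headerTemplate?: string;
  footerTemplate?: string;
  pageRanges?: string;
}

function buildPdfOptions(request: PdfRequest): PdfOptions {
  const options: PdfOptions = {
    format: request.format,
    landscape: request.landscape === true,
    printBackground: request.backgrounds === true,
  };
  if (request.marginMm !== undefined) {
    const m = request.marginMm + "mm";
    options.margin = { top: m, right: m, bottom: m, left: m };
  }
  if (request.pageNumbers) {
    options.displayHeaderFooter = true;
    options.headerTemplate = "<span></span>"; // keep the header empty
    // The pageNumber and totalPages classes are filled in by the browser at print time.
    options.footerTemplate =
      '<div style="font-size:10px;width:100%;text-align:center;">' +
      '<span class="pageNumber"></span> / <span class="totalPages"></span></div>';
  }
  if (request.pageRanges) options.pageRanges = request.pageRanges;
  return options;
}
```

The first prompt above (A4 portrait, 25mm margins, backgrounds, page numbers) would correspond to `buildPdfOptions({ format: "A4", marginMm: 25, backgrounds: true, pageNumbers: true })`.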
Use Case 4: Fill and Submit a Form
Puppeteer can interact with forms, filling in fields and clicking buttons just like a human user:
# Form automation prompt:
"Go to https://example.com/contact, fill in the name field with
'Test User', the email field with 'test@example.com', and the
message field with 'This is an automated test'. Then click the
Submit button and tell me what the confirmation page says."
This is useful for testing form submissions, automating repetitive data entry, and verifying that forms work correctly after code changes. Claude can handle multi-step forms, dropdowns, and checkboxes. CAPTCHAs are a different matter: expect reCAPTCHA, hCaptcha, and similar challenges to block automation.
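A form-automation prompt like the one above corresponds roughly to the following sequence of browser calls. This is a sketch under assumptions: `FormPage` is a hypothetical interface whose methods mirror Puppeteer's `goto`, `type`, `click`, `waitForNavigation`, and `$eval`, and the field selectors passed in are placeholders you would replace with the real form's selectors.

```typescript
// Hypothetical minimal interface; method names mirror Puppeteer's Page API.
interface FormPage {
  goto(url: string): Promise<unknown>;
  type(selector: string, text: string): Promise<void>;
  click(selector: string): Promise<void>;
  waitForNavigation(options?: { timeout?: number }): Promise<unknown>;
  $eval<T>(selector: string, fn: (el: { textContent: string | null }) => T): Promise<T>;
}

// Fills each field, submits, and returns the text of the page that follows.
async function submitForm(
  page: FormPage,
  url: string,
  fields: Record<string, string>, // selector -> value to type
  submitSelector: string
): Promise<string> {
  await page.goto(url);
  for (const selector of Object.keys(fields)) {
    await page.type(selector, fields[selector]);
  }
  // Start waiting for the navigation before clicking so a fast redirect is not missed.
  await Promise.all([
    page.waitForNavigation({ timeout: 30000 }),
    page.click(submitSelector),
  ]);
  return page.$eval("body", (el) => (el.textContent || "").trim());
}
```

Forms that update the page in place without navigating would need a different wait, such as waiting for a confirmation element to appear.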
Use Case 5: Monitor a Website for Changes
You can use Puppeteer MCP to check a website and report on its current state:
# Monitoring prompt:
"Navigate to https://status.example.com and tell me if there are
any incidents or degraded services listed on the page"
# Price monitoring:
"Go to https://store.example.com/product/12345 and tell me what
the current price is. Is there a sale or discount shown?"
# Content change detection:
"Navigate to https://example.com/changelog and extract the most
recent 3 entries. What are the dates and summaries?"
While Puppeteer MCP does not run continuously in the background, you can use it for on-demand checks whenever you want to know the current state of a web page. For continuous monitoring, you would need to set up a separate scheduled task that triggers Claude periodically.
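Whatever triggers the checks, detecting a change comes down to comparing the values extracted on this run with those saved from the previous one. A minimal sketch of that comparison follows. `FieldChange` and `diffSnapshots` are hypothetical names, and a snapshot is assumed to be a flat map of field names to extracted text.

```typescript
interface FieldChange {
  field: string;
  before: string | undefined; // undefined when the field is new
  after: string | undefined; // undefined when the field disappeared
}

// Returns the fields whose values differ between two snapshots, sorted by field name.
function diffSnapshots(
  previous: Record<string, string>,
  current: Record<string, string>
): FieldChange[] {
  const fields = Object.keys(previous)
    .concat(Object.keys(current).filter((key) => !(key in previous)))
    .sort();
  return fields
    .filter((field) => previous[field] !== current[field])
    .map((field) => ({ field, before: previous[field], after: current[field] }));
}
```

For a price check, the snapshot might hold `price` and `badge` fields, and a non-empty result is the signal to send a notification.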
Scheduling Scraping and Monitoring Jobs
For recurring tasks, combine Puppeteer MCP with automation tools outside of Claude. Here are practical approaches:
- Cron + Claude Code CLI: Write a shell script that runs Claude Code in non-interactive print mode, for example claude -p "Navigate to https://status.example.com and check for incidents", and schedule it with cron. This gives you periodic monitoring without manual intervention.
- Node.js scheduler: Use node-cron or node-schedule in a lightweight Node.js script that programmatically calls the Puppeteer MCP server. Store results in a SQLite database for trend analysis.
- GitHub Actions: Set up a scheduled workflow that runs a Claude Code command to scrape and commit results to a repository. This gives you version-controlled snapshots of web page data over time.
# Example cron job (check every hour):
0 * * * * /usr/local/bin/claude -p "Navigate to https://competitor.example.com/pricing and extract all plan names and prices. Save the result to /tmp/pricing-check.json" >> /var/log/price-monitor.log 2>&1
Dealing with Anti-Bot Detection
Many websites use anti-bot measures that can block or mislead automated browsers. Puppeteer is detectable by default because it sets certain JavaScript properties (like navigator.webdriver = true) and uses recognizable browser fingerprints. Here is how to handle common scenarios:
Common Detection Methods and Workarounds
| Detection Method | What It Checks | Workaround |
|---|---|---|
| WebDriver flag | navigator.webdriver is true | Use headed mode or stealth plugins |
| User agent string | Default headless Chrome user agent | Ask Claude to set a standard browser user agent |
| Rate limiting | Too many requests in a short period | Add delays between page navigations |
| CAPTCHA | reCAPTCHA, hCaptcha, Turnstile | Use headed mode and solve manually, or skip the site |
| JavaScript challenges | Cloudflare JS challenge page | Wait longer for the challenge to resolve automatically |
| IP reputation | Known datacenter or VPN IP ranges | Use residential proxy or your home IP |
For sites protected by Cloudflare or similar services, tell Claude to wait for the challenge to clear before extracting content:
# Handling Cloudflare challenge pages:
"Navigate to https://protected-site.example.com. If you see a
Cloudflare challenge or 'checking your browser' page, wait up to
15 seconds for it to resolve before extracting the page content."
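The "wait up to 15 seconds" instruction amounts to polling the page until it no longer looks like a challenge. A minimal sketch follows. `TitledPage` is a hypothetical interface modeled on Puppeteer's `page.title()`, and the default title pattern is an assumption about how challenge pages commonly present themselves, so adjust it to what you actually observe.

```typescript
// Hypothetical minimal interface; Puppeteer's Page has a title() method with this shape.
interface TitledPage {
  title(): Promise<string>;
}

const pause = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Resolves to true once the page title stops matching the challenge pattern,
// or false if it still matches after roughly timeoutMs.
async function waitForChallengeToClear(
  page: TitledPage,
  timeoutMs = 15000,
  pollMs = 500,
  challengePattern: RegExp = /just a moment|checking your browser/i // assumed wording
): Promise<boolean> {
  const attempts = Math.max(1, Math.ceil(timeoutMs / pollMs));
  for (let i = 0; i < attempts; i++) {
    const currentTitle = await page.title();
    if (!challengePattern.test(currentTitle)) return true;
    await pause(pollMs);
  }
  return false;
}
```

A `false` result is the point at which to stop and report the block rather than keep retrying.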
Important: always respect a website's robots.txt and terms of service. Anti-bot measures exist for a reason, and bypassing them may violate the site's terms or applicable laws. Use these techniques only for legitimate purposes like testing your own sites or accessing publicly available data.
Advanced Configuration
Headless vs Headed Mode
By default, Puppeteer runs in headless mode (no visible browser window). If you need to see what Puppeteer is doing - useful for debugging - you can configure headed mode:
{
"mcpServers": {
"puppeteer": {
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-puppeteer"],
"env": {
"PUPPETEER_HEADLESS": "false"
}
}
}
}
In headed mode, a Chrome window will open and you can watch Claude navigate, click, and type in real time. This is helpful for understanding what the AI is doing and for debugging issues with complex web pages.
Custom Chrome Path
If you want Puppeteer to use your existing Chrome installation instead of downloading Chromium:
{
"mcpServers": {
"puppeteer": {
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-puppeteer"],
"env": {
"PUPPETEER_EXECUTABLE_PATH": "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"
}
}
}
}
Proxy Configuration
To route Puppeteer traffic through a proxy (useful for accessing geo-restricted content or avoiding rate limiting):
{
"mcpServers": {
"puppeteer": {
"command": "npx",
"args": [
"-y",
"@modelcontextprotocol/server-puppeteer",
"--proxy-server=http://proxy.example.com:8080"
]
}
}
}
Error Handling: Timeouts and Navigation Failures
Browser automation is inherently fragile. Pages fail to load, elements disappear, and network connections drop. Understanding common failure modes will save you significant debugging time.
Navigation Timeouts
The default navigation timeout in Puppeteer is 30 seconds. If a page takes longer than that to reach the load event, the operation fails. Common causes include:
- Heavy third-party scripts: Analytics, ad networks, and social widgets can delay page load significantly. Ask Claude to wait for the domcontentloaded event instead of the full load event for faster results.
- Stuck network requests: A single slow resource (large image, unresponsive API) can hold up the entire page. Tell Claude to navigate with a shorter timeout and retry if it fails.
- Redirect chains: Multiple redirects (HTTP to HTTPS, www to non-www, marketing redirects) each add latency. If you know the final URL, navigate directly to it.
# Handle slow pages:
"Navigate to https://slow-site.example.com with a 60-second timeout.
If the page does not load within 60 seconds, take a screenshot of
whatever has loaded so far and report what you can see."
# Wait for specific content instead of full page load:
"Navigate to https://dashboard.example.com. Don't wait for the full
page to load - just wait until the div.main-content element appears,
then extract the data from it."
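In script form, the advice above (wait for domcontentloaded, set an explicit timeout, retry on failure) might look like the sketch below. `NavigablePage` is a hypothetical interface modeled on Puppeteer's `page.goto`, whose options include `waitUntil` and `timeout`. The retry count is an arbitrary choice for illustration.

```typescript
// Hypothetical minimal interface; the option names follow Puppeteer's page.goto.
interface NavigablePage {
  goto(
    url: string,
    options?: { waitUntil?: "load" | "domcontentloaded"; timeout?: number }
  ): Promise<unknown>;
}

// Tries to navigate up to maxAttempts times and returns the attempt number that succeeded.
// Rethrows the last error if every attempt fails.
async function gotoWithRetry(
  page: NavigablePage,
  url: string,
  timeoutMs = 60000,
  maxAttempts = 2
): Promise<number> {
  let lastError: unknown = new Error("no attempts made");
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      // domcontentloaded fires before images and third-party scripts finish loading.
      await page.goto(url, { waitUntil: "domcontentloaded", timeout: timeoutMs });
      return attempt;
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}
```

After a successful navigation, waiting for a specific element such as div.main-content is usually a better readiness signal than any page-level event.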
Element Not Found Errors
When Claude tries to interact with an element that does not exist on the page, the operation fails. This commonly happens when:
- The page layout changed since you last used the prompt
- The element is loaded asynchronously and has not appeared yet
- The element is inside an iframe that Puppeteer cannot see by default
- The selector has a typo or uses a class name that was minified in production
# Robust scraping with fallbacks:
"Navigate to https://store.example.com/product/123. Try to extract
the price from span.price-current. If that element doesn't exist,
try span.product-price or div.price-display. If none of those work,
take a screenshot so I can see the current page layout."
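The fallback strategy in that prompt can be sketched as a loop over candidate selectors. `QueryablePage` is a hypothetical interface modeled on Puppeteer's `page.$` (which resolves to null when nothing matches) and `page.$eval`, and `extractWithFallbacks` is an illustrative name.

```typescript
// Hypothetical minimal interface; method names mirror Puppeteer's Page API.
interface QueryablePage {
  $(selector: string): Promise<unknown>; // resolves to null when no element matches
  $eval<T>(selector: string, fn: (el: { textContent: string | null }) => T): Promise<T>;
}

// Returns the text of the first selector that matches, or null if none do.
async function extractWithFallbacks(
  page: QueryablePage,
  selectors: string[]
): Promise<{ selector: string; text: string } | null> {
  for (const selector of selectors) {
    const handle = await page.$(selector);
    if (handle === null) continue; // try the next candidate
    const text = await page.$eval(selector, (el) => (el.textContent || "").trim());
    return { selector, text };
  }
  return null;
}
```

Returning the selector that matched alongside the text makes it easy to notice when the primary selector has stopped working. A null result is the cue to fall back to a screenshot.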
Network Errors
DNS failures, SSL certificate errors, and connection resets can all prevent navigation. Tell Claude how to handle these gracefully:
# Handle potential network issues:
"Try to navigate to https://example.com/api-status. If the page
fails to load due to a network error, report the specific error
message you received so I can diagnose the issue."
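When Claude reports the raw error text, it helps to know which family the error belongs to. The sketch below groups messages by the Chromium net:: error codes that typically appear in navigation failures. The category names and `classifyNavigationError` are hypothetical, and the list of codes is a small sample rather than an exhaustive one.

```typescript
type NavigationFailure = "dns" | "tls" | "connection" | "timeout" | "unknown";

// Maps an error message to a coarse category based on common Chromium net:: error codes.
function classifyNavigationError(message: string): NavigationFailure {
  if (message.indexOf("net::ERR_NAME_NOT_RESOLVED") !== -1) return "dns";
  if (/net::ERR_(CERT|SSL)_/.test(message)) return "tls";
  if (/net::ERR_CONNECTION_(RESET|REFUSED|CLOSED|TIMED_OUT)/.test(message)) return "connection";
  if (/timeout/i.test(message)) return "timeout";
  return "unknown";
}
```

A "dns" result usually points at a wrong hostname or a resolver problem, "tls" at a certificate issue, and "connection" at the server or an intermediate network device.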
Comparison: Web Scraping MCP Servers
Puppeteer is not the only way to access web content through MCP. Here is how it compares to alternatives:
| Feature | Puppeteer | Playwright | Firecrawl | Brave Search |
|---|---|---|---|---|
| JavaScript rendering | Yes | Yes | Yes (cloud) | No |
| Screenshots | Yes | Yes | Yes | No |
| Form interaction | Yes | Yes | No | No |
| PDF generation | Yes | Yes | No | No |
| Runs locally | Yes | Yes | Cloud API | Cloud API |
| Multi-browser support | Chrome only | Chromium, Firefox, WebKit | N/A | N/A |
| Cost | Free | Free | Paid API | Free tier |
| Best for | Full browser control | Cross-browser testing | Large-scale crawling | Quick web search |
When to choose Puppeteer: You need to interact with a specific page (click, type, scroll), take screenshots, generate PDFs, or scrape JavaScript-rendered content. It runs locally and is completely free.
When NOT to use Puppeteer: If you just need to search the web, use Brave Search. If you need to crawl hundreds of pages at scale, Firecrawl or Exa will be faster and more reliable. If you need cross-browser testing, Playwright supports Firefox and WebKit (the engine behind Safari). Puppeteer is also overkill for fetching simple API responses or static HTML pages.
Troubleshooting
Chromium Download Fails
On first run, Puppeteer downloads a Chromium binary (~170 MB). If this fails due to network issues or corporate firewalls, set PUPPETEER_SKIP_DOWNLOAD=true and provide a custom Chrome path via PUPPETEER_EXECUTABLE_PATH.
Page Times Out
Some pages take a long time to load due to heavy JavaScript or slow network requests. The default timeout is usually 30 seconds. If pages consistently time out, the site may be blocking automated browsers. Try using a different user agent or adding a delay between navigation steps.
spawn ENOENT Error
This means Node.js or npx is not found. See our spawn ENOENT troubleshooting guide for detailed fixes on every operating system.
Blank Screenshots
If Puppeteer returns a blank or white screenshot, the page likely has not finished rendering. This is common with single-page applications that render content after the initial page load event fires. Tell Claude to wait for a specific element to appear before capturing the screenshot. Another cause is CSS animations or transitions that have not completed - adding a small delay (1-2 seconds) after navigation usually resolves this.
Memory Issues with Long Sessions
Chromium uses significant RAM, typically 200-400 MB per instance. If you are running multiple scraping tasks in a single Claude session, memory usage can grow over time as browser tabs accumulate. If you notice slowdowns, start a new Claude conversation to get a fresh Puppeteer instance. On machines with limited RAM (4 GB or less), close other memory-intensive applications while using Puppeteer MCP.
Next Steps
Now that you have Puppeteer MCP running, explore these related resources:
- Learn about the full Puppeteer MCP server capabilities and API.
- Set up advanced web scraping workflows with Puppeteer MCP.
- Compare it with other servers in our MCP server directory.
- Read the MCP Servers for Cursor, VS Code, and Claude setup guide for IDE integration.
