
Fetcher-MCP
MCP server for fetching web page content using the Playwright headless browser.
Advantages
- JavaScript Support: Unlike traditional web scrapers, Fetcher MCP uses Playwright to execute JavaScript, making it capable of handling dynamic web content and modern web applications.
- Intelligent Content Extraction: Built-in Readability algorithm automatically extracts the main content from web pages, removing ads, navigation, and other non-essential elements.
- Flexible Output Format: Supports both HTML and Markdown output formats, making it easy to integrate with various downstream applications.
- Parallel Processing: The fetch_urls tool enables concurrent fetching of multiple URLs, significantly improving efficiency for batch operations.
- Resource Optimization: Automatically blocks unnecessary resources (images, stylesheets, fonts, media) to reduce bandwidth usage and improve performance.
- Robust Error Handling: Comprehensive error handling and logging ensure reliable operation even when dealing with problematic web pages.
- Configurable Parameters: Fine-grained control over timeouts, content extraction, and output formatting to suit different use cases.
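The parallel-processing and error-handling points above can be illustrated with a small sketch. This is not Fetcher MCP's actual implementation: `fetchAll`, `FetchResult`, and the injected `fetchOne` callback are hypothetical names, and the sketch only shows the general pattern of fetching several URLs concurrently while keeping one failed page from discarding the other results.

```typescript
// Hypothetical result shape for one URL; not taken from the Fetcher MCP source.
type FetchResult =
  | { url: string; ok: true; content: string }
  | { url: string; ok: false; error: string };

// Fetch every URL concurrently. `fetchOne` is injected so the sketch does not
// depend on any particular browser or HTTP library.
async function fetchAll(
  urls: string[],
  fetchOne: (url: string) => Promise<string>,
): Promise<FetchResult[]> {
  return Promise.all(
    urls.map(async (url): Promise<FetchResult> => {
      try {
        // All fetches start immediately; each one is awaited independently.
        return { url, ok: true, content: await fetchOne(url) };
      } catch (e) {
        // A failure is recorded per URL instead of rejecting the whole batch.
        return { url, ok: false, error: e instanceof Error ? e.message : String(e) };
      }
    }),
  );
}
```

Results come back in the same order as the input URLs, so a caller can pair each entry with its source page and report failures alongside successes.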
Quick Start
Run directly with npx:
npx -y fetcher-mcp
First-time setup: install the required browser by running the following command in your terminal:
npx playwright install chromium
Debug Mode
Run with the --debug option to show the browser window for debugging:
npx -y fetcher-mcp --debug
Configuration MCP
Configure this MCP server in Claude Desktop:
On macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
On Windows: %APPDATA%/Claude/claude_desktop_config.json
{
  "mcpServers": {
    "fetcher": {
      "command": "npx",
      "args": ["-y", "fetcher-mcp"]
    }
  }
}
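If the config file already lists other servers, the fetcher entry has to be merged into the existing mcpServers object rather than replacing the whole file. A minimal sketch of that merge, assuming the file has already been parsed from JSON; `addFetcherServer`, `ClaudeConfig`, and `McpServerEntry` are hypothetical helper names, not part of Fetcher MCP or Claude Desktop:

```typescript
// Shape of one server entry, as used in the configuration above.
interface McpServerEntry {
  command: string;
  args: string[];
}

// Only mcpServers is modelled; any other top-level keys are passed through.
interface ClaudeConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown;
}

// Return a copy of the config with the fetcher entry added (or replaced),
// leaving every other server and setting untouched.
function addFetcherServer(config: ClaudeConfig): ClaudeConfig {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      fetcher: { command: "npx", args: ["-y", "fetcher-mcp"] },
    },
  };
}

// Example: a config that already contains another (made-up) server.
const existing: ClaudeConfig = {
  mcpServers: { other: { command: "node", args: ["server.js"] } },
};
console.log(JSON.stringify(addFetcherServer(existing), null, 2));
```

The function does not mutate its input, so the original object can be kept as a backup before the merged result is written back to disk.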
Features
- fetch_url - Retrieve web page content from a specified URL
  - Uses Playwright headless browser to parse JavaScript
  - Supports intelligent extraction of main content and conversion to Markdown
  - Supports the following parameters:
    - url: The URL of the web page to fetch (required parameter)
    - timeout: Page loading timeout in milliseconds, default is 30000 (30 seconds)
    - waitUntil: Specifies when navigation is considered complete, options: 'load', 'domcontentloaded', 'networkidle', 'commit', default is 'load'
    - extractContent: Whether to intelligently extract the main content, default is true
    - maxLength: Maximum length of returned content (in characters), default is no limit
    - returnHtml: Whether to return HTML content instead of Markdown, default is false
    - waitForNavigation: Whether to wait for additional navigation after initial page load (useful for sites with anti-bot verification), default is false
    - navigationTimeout: Maximum time to wait for additional navigation in milliseconds, default is 10000 (10 seconds)
    - disableMedia: Whether to disable media resources (images, stylesheets, fonts, media), default is true
    - debug: Whether to enable debug mode (showing browser window), overrides the --debug command line flag if specified
- fetch_urls - Batch retrieve web page content from multiple URLs in parallel
  - Uses multi-tab parallel fetching for improved performance
  - Returns combined results with clear separation between webpages
  - Supports the following parameters:
    - urls: Array of URLs to fetch (required parameter)
    - Other parameters are the same as fetch_url
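The parameter lists above can be written down as types, which makes the documented defaults easy to see in one place. The sketch below is an illustration, not the server's own code: `FetchUrlArgs`, `FetchUrlsArgs`, `withDefaults`, and `toolCallRequest` are hypothetical names. The request shape follows the MCP `tools/call` JSON-RPC method; the server applies its own defaults, so a client normally sends only the values it wants to change.

```typescript
type WaitUntil = "load" | "domcontentloaded" | "networkidle" | "commit";

// Arguments of the fetch_url tool as documented above.
interface FetchUrlArgs {
  url: string;
  timeout?: number;            // milliseconds
  waitUntil?: WaitUntil;
  extractContent?: boolean;
  maxLength?: number;          // characters; unlimited when omitted
  returnHtml?: boolean;
  waitForNavigation?: boolean;
  navigationTimeout?: number;  // milliseconds
  disableMedia?: boolean;
  debug?: boolean;             // no default; falls back to the --debug flag
}

// fetch_urls takes the same options, with `urls` in place of `url`.
type FetchUrlsArgs = Omit<FetchUrlArgs, "url"> & { urls: string[] };

// Documented defaults; maxLength and debug have no default value.
const DEFAULTS = {
  timeout: 30000,
  waitUntil: "load" as WaitUntil,
  extractContent: true,
  returnHtml: false,
  waitForNavigation: false,
  navigationTimeout: 10000,
  disableMedia: true,
};

// Show the effective settings for a call: caller values win over defaults.
function withDefaults(args: FetchUrlArgs) {
  return { ...DEFAULTS, ...args };
}

// Build a JSON-RPC 2.0 "tools/call" request for either tool.
function toolCallRequest(
  id: number,
  name: "fetch_url" | "fetch_urls",
  args: FetchUrlArgs | FetchUrlsArgs,
) {
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: { name, arguments: args },
  };
}

const single = toolCallRequest(1, "fetch_url", {
  url: "https://example.com",
  returnHtml: true,
});
const batch = toolCallRequest(2, "fetch_urls", {
  urls: ["https://example.com/a", "https://example.com/b"],
  maxLength: 5000,
});
console.log(JSON.stringify([single, batch], null, 2));
```

In practice an MCP client such as Claude Desktop builds these requests itself; the sketch is mainly useful for seeing which argument names and value types each tool accepts.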
Tips
Handling Special Website Scenarios
Dealing with Anti-Crawler Mechanisms
- Wait for Complete Loading: For websites using CAPTCHA, redirects, or other verification mechanisms, include in your prompt:
  Please wait for the page to fully load
  This will use the waitForNavigation: true parameter.
- Increase Timeout Duration: For websites that load slowly:
  Please set the page loading timeout to 60 seconds
  This adjusts both timeout and navigationTimeout parameters accordingly.
Content Retrieval Adjustments
- Preserve Original HTML Structure: When content extraction might fail:
  Please preserve the original HTML content
  Sets extractContent: false and returnHtml: true.
- Fetch Complete Page Content: When extracted content is too limited:
  Please fetch the complete webpage content instead of just the main content
  Sets extractContent: false.
- Return Content as HTML: When HTML format is needed instead of default Markdown:
  Please return the content in HTML format
  Sets returnHtml: true.
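The prompts in the two sections above each correspond to a small set of tool arguments. The mapping can be sketched as follows; `Scenario` and `argsForScenario` are hypothetical names used only for illustration. The 60000 ms values for the slow-site case are an assumption: the text only says both timeouts are adjusted "accordingly", and the model decides the actual numbers.

```typescript
type Scenario =
  | "anti-bot"     // "Please wait for the page to fully load"
  | "slow-site"    // "Please set the page loading timeout to 60 seconds"
  | "raw-html"     // "Please preserve the original HTML content"
  | "full-page"    // "Please fetch the complete webpage content ..."
  | "html-output"; // "Please return the content in HTML format"

// Return the fetch_url arguments each prompt is described as producing.
function argsForScenario(url: string, scenario: Scenario): Record<string, unknown> {
  switch (scenario) {
    case "anti-bot":
      return { url, waitForNavigation: true };
    case "slow-site":
      // Assumed values: 60 seconds applied to both timeouts.
      return { url, timeout: 60000, navigationTimeout: 60000 };
    case "raw-html":
      return { url, extractContent: false, returnHtml: true };
    case "full-page":
      return { url, extractContent: false };
    case "html-output":
      return { url, returnHtml: true };
  }
}

console.log(argsForScenario("https://example.com", "raw-html"));
```

Arguments that are not listed for a scenario keep their documented defaults, so "full-page" still returns Markdown while "raw-html" returns the unextracted HTML.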
Debugging and Authentication
Enabling Debug Mode
- Dynamic Debug Activation: To display the browser window during a specific fetch operation:
  Please enable debug mode for this fetch operation
  This sets debug: true even if the server was started without the --debug flag.
Using Custom Cookies for Authentication
- Manual Login: To log in using your own credentials:
  Please run in debug mode so I can manually log in to the website
  Sets debug: true or uses the --debug flag, keeping the browser window open for manual login.
- Interacting with Debug Browser: When debug mode is enabled:
  - The browser window remains open
  - You can manually log into the website using your credentials
  - After login is complete, content will be fetched with your authenticated session
- Enable Debug for Specific Requests: Even if the server is already running, you can enable debug mode for a specific request:
  Please enable debug mode for this authentication step
  Sets debug: true for this specific request only, opening the browser window for manual login.
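The precedence between the per-request debug parameter and the --debug flag can be sketched in a few lines. This is an assumption about how such an override would typically be resolved, based only on the statement that debug "overrides the --debug command line flag if specified"; `resolveDebug` and `launchOptionsFor` are hypothetical names, and `headless` is the Playwright launch option that controls whether a browser window is shown.

```typescript
// A per-request value, when present, wins over the server-wide CLI flag.
function resolveDebug(requestDebug: boolean | undefined, cliDebugFlag: boolean): boolean {
  return requestDebug ?? cliDebugFlag;
}

// Debug mode means a visible browser window, i.e. a non-headless launch.
function launchOptionsFor(requestDebug: boolean | undefined, cliDebugFlag: boolean) {
  return { headless: !resolveDebug(requestDebug, cliDebugFlag) };
}

// Server started without --debug, request asks for debug: window is shown.
console.log(launchOptionsFor(true, false));
// Server started with --debug, request does not mention debug: window is shown.
console.log(launchOptionsFor(undefined, true));
```

Under this reading, an explicit debug: false on a request would also hide the window on a server started with --debug, since any specified value takes precedence over the flag.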
Development
Install Dependencies
npm install
Install Playwright Browser
Install the browsers needed for Playwright:
npm run install-browser
Build the Server
npm run build
Debugging
Use MCP Inspector for debugging:
npm run inspector
You can also enable visible browser mode for debugging:
node build/index.js --debug
Related Projects
- g-search-mcp: A powerful MCP server for Google search that enables parallel searching with multiple keywords simultaneously. Perfect for batch search operations and data collection.
License
Licensed under the MIT License