2025-04-14

mcp-server-spider: A spider MCP server

Overview

A Model Context Protocol server for Spider crawler interaction and automation. This server provides tools to crawl and scrape web pages.

Please note that mcp-server-spider is currently in early development. There may be bugs, and features may be added in the future.

Tools

  1. crawl
    • Crawls the given url and returns the list of URLs that were found
    • Input:
      • url: The url to crawl
      • headers: Additional headers passed along with crawl requests
      • user_agent: User agent to use for the crawl requests
      • depth: The depth of link traversal
      • blacklist: A list of regular expressions; URLs matching any of them are excluded from the crawling process
      • whitelist: A list of regular expressions; URLs matching any of them are allowed in the crawling process
      • respect_robots_txt: Whether to respect robots.txt file
      • accept_invalid_certs: Whether to accept invalid certificates
    • Returns: List of URLs found
  2. scrape
    • Scrapes the given url and returns a list of JSON objects that contain the url, links and content of each page discovered
    • Input: Same as crawl
    • Returns: A list of JSON objects (as a string) that contain the url, links and content of each page discovered
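
As a rough illustration of the inputs and outputs listed above, the sketch below builds an MCP tools/call request for the crawl tool and parses a scrape-style result string. The argument types (headers as a mapping, depth as an integer, the two flags as booleans), the example values, and the result keys url, links and content are assumptions drawn from the descriptions above, not a confirmed schema. The helper names are hypothetical.

```python
import json


def build_tool_call(name, arguments, request_id=1):
    """Wrap tool arguments in an MCP JSON-RPC 2.0 tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def parse_scrape_result(text):
    """Decode the JSON string that scrape is described as returning."""
    pages = json.loads(text)
    if not isinstance(pages, list):
        raise ValueError("expected a JSON list of page objects")
    return pages


# Assumed argument types and illustrative values only.
crawl_args = {
    "url": "https://example.com",
    "headers": {"Accept-Language": "en"},
    "user_agent": "example-bot/0.1",
    "depth": 2,
    "blacklist": [r"/login", r"\.pdf$"],
    "whitelist": [r"^https://example\.com/"],
    "respect_robots_txt": True,
    "accept_invalid_certs": False,
}

request = build_tool_call("crawl", crawl_args)
print(json.dumps(request, indent=2))

# A made-up result string in the shape the scrape tool is described as returning.
sample_result = json.dumps(
    [
        {
            "url": "https://example.com",
            "links": ["https://example.com/about"],
            "content": "<html>...</html>",
        }
    ]
)
pages = parse_scrape_result(sample_result)
for page in pages:
    print(page["url"], len(page["links"]))
```

Most MCP client libraries build the tools/call envelope for you, so in practice only the arguments dictionary is usually written by hand.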

Installation

Using uv (recommended)

When using uv no specific installation is needed. We will use uvx to directly run mcp-server-spider.

Using pip

Alternatively you can install mcp-server-spider via pip:

pip install mcp-server-spider

After installation, you can run it as a script using:

python -m mcp_server_spider
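
The two launch methods above can be sketched as entries in an MCP client configuration. The mcpServers key follows a convention used by several MCP clients, and the server label "spider" and the helper name server_entry are hypothetical. Check your client's documentation for the exact file location and layout.

```python
import json


def server_entry(use_uv=True):
    """Return a launch entry for mcp-server-spider (hypothetical helper)."""
    if use_uv:
        # uvx runs the package directly, so no prior installation is needed.
        return {"command": "uvx", "args": ["mcp-server-spider"]}
    # Assumes the package was installed beforehand with pip.
    return {"command": "python", "args": ["-m", "mcp_server_spider"]}


# "mcpServers" and the "spider" label are assumptions about the client format.
config = {"mcpServers": {"spider": server_entry(use_uv=True)}}
print(json.dumps(config, indent=2))
```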
