
Website to PDF MCP
MCP server that fetches websites and converts them to PDF, with support for link traversal
Website to PDF/Markdown MCP Server
This MCP server fetches websites (including those behind authentication) and converts them to PDF or Markdown documents. It can also traverse links on a webpage and include them in the generated documents or return the discovered URLs.
Features
- Convert a single webpage to PDF
- Convert a webpage to Markdown format
- Traverse links on a webpage and convert multiple pages to a single PDF or Markdown file
- Support for authentication via username and password
- Configurable maximum page limit for link traversal
- Traverse website links and return URLs without conversion
Setup
- Clone this repository
- Install dependencies:
npm install
- Copy the sample environment file:
cp .env.example .env
- Start the server:
npm start
API Endpoints
Convert Website to PDF
POST /api/convert
Request Body:
{
"url": "https://example.com",
"username": "optional-username",
"password": "optional-password",
"traverseLinks": true,
"maxPages": 10
}
Parameters:
- url: (Required) The URL to convert to PDF
- username: (Optional) Username for authentication
- password: (Optional) Password for authentication
- traverseLinks: (Optional) Whether to traverse links on the page (default: false)
- maxPages: (Optional) Maximum number of pages to process when traversing links (default: 10)
Response:
The response will be the PDF document with appropriate content-type headers:
Content-Type: application/pdf
Content-Disposition: attachment; filename="example_com.pdf"
The binary PDF content is returned directly in the response body.
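As a rough illustration of the request and response described above, here is a minimal Node.js (18+) client sketch. The base URL is an assumption (port 3000 is taken from the ngrok example in this README), the JSON content type is assumed from the request body shown, and the helper names `buildConvertRequest` and `convertToPdf` are hypothetical, not part of the server.

```javascript
// Sketch of a Node.js (18+) client for POST /api/convert.
const { writeFile } = require("node:fs/promises");

// Assumption: the server listens locally on port 3000.
const BASE_URL = "http://localhost:3000";

// Build the fetch options for a conversion request, omitting unset credentials.
function buildConvertRequest({ url, username, password, traverseLinks = false, maxPages = 10 }) {
  if (!url) throw new Error("url is required");
  const body = { url, traverseLinks, maxPages };
  if (username) body.username = username;
  if (password) body.password = password;
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  };
}

// Send the request and write the returned PDF bytes to disk.
async function convertToPdf(params, outPath) {
  const res = await fetch(`${BASE_URL}/api/convert`, buildConvertRequest(params));
  if (!res.ok) throw new Error(`Conversion failed with HTTP ${res.status}`);
  const bytes = Buffer.from(await res.arrayBuffer());
  await writeFile(outPath, bytes);
  return bytes.length; // number of bytes written
}

// Usage (requires the server to be running):
// convertToPdf({ url: "https://example.com" }, "example_com.pdf").then(console.log);
```

Reading the body with `arrayBuffer()` rather than `text()` keeps the binary PDF intact.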
Convert Website to Markdown
POST /api/to-markdown
Request Body:
{
"url": "https://example.com",
"username": "optional-username",
"password": "optional-password",
"traverseLinks": true,
"maxPages": 10
}
Parameters:
- url: (Required) The URL to convert to Markdown
- username: (Optional) Username for authentication
- password: (Optional) Password for authentication
- traverseLinks: (Optional) Whether to traverse links on the page (default: false)
- maxPages: (Optional) Maximum number of pages to process when traversing links (default: 10)
Response:
The response will be the Markdown document with appropriate content-type headers:
Content-Type: text/markdown
Content-Disposition: attachment; filename="example_com.md"
The Markdown content is returned directly in the response body.
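A client may want to save the document under the name suggested by the `Content-Disposition` header. The sketch below shows one way to do that in Node.js (18+). The helper names and the local base URL are assumptions, and the parser only handles the simple `filename="..."` form shown above, not the extended `filename*=` encoding.

```javascript
const { writeFile } = require("node:fs/promises");
const path = require("node:path");

// Extract the filename from a Content-Disposition header value.
// Returns the fallback when the header is missing or has no filename.
function filenameFromDisposition(header, fallback = "download") {
  if (!header) return fallback;
  const match = /filename="?([^";]+)"?/i.exec(header);
  return match ? match[1].trim() : fallback;
}

// Fetch a page as Markdown and save it under the server-suggested name.
// Assumption: the server is reachable at baseUrl.
async function saveMarkdown(url, baseUrl = "http://localhost:3000") {
  const res = await fetch(`${baseUrl}/api/to-markdown`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ url }),
  });
  if (!res.ok) throw new Error(`Conversion failed with HTTP ${res.status}`);
  const suggested = filenameFromDisposition(res.headers.get("content-disposition"), "page.md");
  // basename() discards any directory components in the suggested name.
  const name = path.basename(suggested);
  await writeFile(name, await res.text(), "utf8");
  return name;
}

// Usage (requires the server to be running):
// saveMarkdown("https://example.com").then((name) => console.log(`saved ${name}`));
```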
Traverse Website and Return URLs
POST /api/traverse
Request Body:
{
"url": "https://example.com",
"username": "optional-username",
"password": "optional-password",
"maxPages": 10
}
Parameters:
- url: (Required) The URL to start traversal from
- username: (Optional) Username for authentication
- password: (Optional) Password for authentication
- maxPages: (Optional) Maximum number of pages to traverse (default: 10)
Response:
{
"success": true,
"message": "Website traversed successfully (found 8 URLs)",
"urls": [
"https://example.com",
"https://example.com/page1",
"https://example.com/page2",
...
]
}
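This README does not describe how traversal works internally. As one plausible reading of `maxPages`, the sketch below performs a breadth-first crawl that stops once the page limit is reached. It assumes traversal stays on the starting origin, and `getLinks` is a hypothetical callback that resolves to the links found on a given page; neither detail is confirmed by the source.

```javascript
// Breadth-first link traversal with a page limit.
// getLinks(url) is a hypothetical async callback returning the hrefs found on that page.
async function traverse(startUrl, getLinks, maxPages = 10) {
  const start = new URL(startUrl);
  start.hash = "";
  const origin = start.origin;
  const visited = new Set();
  const queue = [start.href];

  while (queue.length > 0 && visited.size < maxPages) {
    const current = queue.shift();
    if (visited.has(current)) continue; // the same URL may be queued twice
    visited.add(current);

    for (const href of await getLinks(current)) {
      let next;
      try {
        next = new URL(href, current); // resolves relative links against the current page
      } catch {
        continue; // skip malformed hrefs
      }
      next.hash = ""; // treat in-page anchors as the same page
      // Assumption: only follow links on the starting origin.
      if (next.origin === origin && !visited.has(next.href)) {
        queue.push(next.href);
      }
    }
  }
  return [...visited];
}

// Usage with an in-memory link graph instead of real pages:
// const graph = { "https://example.com/": ["/page1", "/page2"] };
// traverse("https://example.com", async (u) => graph[u] || [], 10).then(console.log);
```

Because the queue is processed in first-in, first-out order, pages closest to the start URL are collected before deeper ones when the limit cuts the crawl short.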
Customization
You can customize the PDF and Markdown generation by modifying the relevant functions in src/index.js:
PDF Generation
The websiteToPdf function supports:
- Custom page formats
- Background rendering
- Page margins
- And more through Puppeteer's options
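The options listed above correspond to settings accepted by Puppeteer's `page.pdf()`. The sketch below shows how defaults and per-call overrides might be merged. The `renderPdf` helper and the default values are illustrative assumptions, not the actual contents of `websiteToPdf`.

```javascript
// Hypothetical defaults; format, printBackground, and margin are page.pdf() options.
const DEFAULT_PDF_OPTIONS = {
  format: "A4",
  printBackground: true,
  margin: { top: "20mm", right: "15mm", bottom: "20mm", left: "15mm" },
};

// Render an already-loaded Puppeteer page to PDF, merging caller overrides.
// Margins are merged per side so a caller can change just one of them.
async function renderPdf(page, overrides = {}) {
  const options = {
    ...DEFAULT_PDF_OPTIONS,
    ...overrides,
    margin: { ...DEFAULT_PDF_OPTIONS.margin, ...overrides.margin },
  };
  return page.pdf(options);
}

// Usage (requires `npm install puppeteer`):
// const puppeteer = require("puppeteer");
// const browser = await puppeteer.launch();
// const page = await browser.newPage();
// await page.goto("https://example.com", { waitUntil: "networkidle0" });
// const pdf = await renderPdf(page, { format: "Letter" });
// await browser.close();
```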
Markdown Generation
The websiteToMarkdown function uses the Turndown library, which offers:
- Custom rules for conversion
- Ability to preserve certain HTML elements
- Options for handling code blocks, headings, and lists
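To make these Turndown features concrete, here is a small configuration sketch. It uses Turndown's documented constructor options and its `keep`, `remove`, and `addRule` methods. The specific choices (ATX headings, fenced code blocks, a strikethrough rule) and the `configureTurndown` helper are assumptions for illustration, not what `websiteToMarkdown` necessarily does.

```javascript
// Constructor options controlling headings, code blocks, and list markers.
const TURNDOWN_OPTIONS = {
  headingStyle: "atx",      // "# Heading" instead of underlined headings
  codeBlockStyle: "fenced", // fenced code blocks instead of indented ones
  bulletListMarker: "-",
};

// Replacement function for the custom rule: wrap struck-through text in ~~.
function strikethrough(content) {
  return `~~${content}~~`;
}

// Apply the customizations to a TurndownService instance and return it.
function configureTurndown(service) {
  service.keep(["iframe"]);            // preserve these elements as raw HTML
  service.remove(["script", "style"]); // drop these elements entirely
  service.addRule("strikethrough", {
    filter: ["del", "s"],
    replacement: strikethrough,
  });
  return service;
}

// Usage (requires `npm install turndown`):
// const TurndownService = require("turndown");
// const service = configureTurndown(new TurndownService(TURNDOWN_OPTIONS));
// console.log(service.turndown("<h1>Title</h1><p><del>old</del> new</p>"));
```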
Authentication Handling
The default implementation assumes a simple username/password form. You may need to customize the authentication logic based on the specific websites you're targeting.
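A simple form login of this kind might look like the following sketch, which uses Puppeteer's `page.waitForSelector`, `page.type`, `page.click`, and `page.waitForNavigation`. The CSS selectors and the `login` helper are hypothetical; the selectors in particular will differ from site to site.

```javascript
// Hypothetical default selectors; adjust them to the target site's login form.
const DEFAULT_SELECTORS = {
  username: 'input[name="username"]',
  password: 'input[type="password"]',
  submit: 'button[type="submit"]',
};

// Fill in and submit a username/password form on an already-loaded Puppeteer page.
// Returns false without touching the page when no credentials are supplied.
async function login(page, { username, password }, selectors = {}) {
  if (!username || !password) return false;
  const sel = { ...DEFAULT_SELECTORS, ...selectors };

  await page.waitForSelector(sel.username);
  await page.type(sel.username, username);
  await page.type(sel.password, password);

  // Start waiting for the navigation before clicking so the event is not missed.
  await Promise.all([
    page.waitForNavigation({ waitUntil: "networkidle0" }),
    page.click(sel.submit),
  ]);
  return true;
}

// Usage, given a Puppeteer page that has already loaded the login form:
// await login(page, { username: "myuser", password: "mypass" }, { submit: "#login-button" });
```

Sites that use multi-step logins, single sign-on, or CAPTCHAs would need more than this.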
Using as a Claude MCP
This server is configured as an MCP (Model Context Protocol) server that can be used directly with Claude. To use it:
Self-Hosting Setup
- Host this server on a platform like Heroku, Vercel, or your own infrastructure
- Make sure the server is publicly accessible via HTTPS
- Add an icon.png file to your repository
Installing in Claude
- Open Claude in your browser and navigate to the Plugins section
- Click "Create a plugin"
- Enter the URL where your MCP server is hosted
- Claude will discover the API endpoints and create the plugin interface
- Save and enable the plugin
Usage in Claude
Once installed, you can use the MCP directly in your conversations with Claude:
- "Convert example.com to a PDF"
- "Convert example.com to Markdown"
- "Get all the URLs from example.com"
- "Convert the website with authentication using username 'myuser' and password 'mypass'"
The plugin provides three main functions:
- Converting websites to PDF
- Converting websites to Markdown
- Traversing websites and returning discovered URLs
Local Development
For local development, you can use tools like ngrok to expose your local server to the internet:
npm start
# In a separate terminal
ngrok http 3000
Then use the ngrok URL when setting up the MCP in Claude.