
Website to PDF MCP
MCP server that scrapes websites and converts them to PDF, with link traversal support
Website to PDF/Markdown MCP Server
This MCP server fetches websites (including those behind authentication) and converts them to PDF or Markdown documents. It can also traverse links on a webpage and include them in the generated documents or return the discovered URLs.
Features
- Convert a single webpage to PDF
- Convert a webpage to Markdown format
- Traverse links on a webpage and convert multiple pages to a single PDF or Markdown file
- Support for authentication via username and password
- Configurable maximum page limit for link traversal
- Traverse website links and return URLs without conversion
Setup
- Clone this repository
- Install dependencies:
npm install
- Copy the sample environment file:
cp .env.example .env
- Start the server:
npm start
API Endpoints
Convert Website to PDF
POST /api/convert
Request Body:
{
"url": "https://example.com",
"username": "optional-username",
"password": "optional-password",
"traverseLinks": true,
"maxPages": 10
}
Parameters:
- url: (Required) The URL to convert to PDF
- username: (Optional) Username for authentication
- password: (Optional) Password for authentication
- traverseLinks: (Optional) Whether to traverse links on the page (default: false)
- maxPages: (Optional) Maximum number of pages to process when traversing links (default: 10)
Response:
The response will be the PDF document with appropriate content-type headers:
Content-Type: application/pdf
Content-Disposition: attachment; filename="example_com.pdf"
The binary PDF content is returned directly in the response body.
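A minimal client sketch for this endpoint, assuming the server is reachable at a base URL such as http://localhost:3000 (the port used in the ngrok example) and that a global fetch is available (Node 18+). The helper names buildConvertRequest and convertToPdf are hypothetical and not part of the server; only the path and body fields come from the description above.

```javascript
// Hypothetical client helpers for POST /api/convert.
function buildConvertRequest(baseUrl, { url, username, password, traverseLinks = false, maxPages = 10 }) {
  if (!url) throw new Error("url is required");
  const body = { url, traverseLinks, maxPages };
  // Send credentials only when both are provided.
  if (username && password) {
    body.username = username;
    body.password = password;
  }
  return {
    endpoint: new URL("/api/convert", baseUrl).toString(),
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

// fetchImpl is injectable so the function can be exercised without a running server.
async function convertToPdf(baseUrl, options, fetchImpl = fetch) {
  const { endpoint, init } = buildConvertRequest(baseUrl, options);
  const response = await fetchImpl(endpoint, init);
  if (!response.ok) throw new Error(`Conversion failed with HTTP ${response.status}`);
  // The PDF arrives as raw bytes in the response body.
  return new Uint8Array(await response.arrayBuffer());
}
```

The returned bytes can then be written to disk, for example with fs.writeFile from node:fs/promises.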
Convert Website to Markdown
POST /api/to-markdown
Request Body:
{
"url": "https://example.com",
"username": "optional-username",
"password": "optional-password",
"traverseLinks": true,
"maxPages": 10
}
Parameters:
- url: (Required) The URL to convert to Markdown
- username: (Optional) Username for authentication
- password: (Optional) Password for authentication
- traverseLinks: (Optional) Whether to traverse links on the page (default: false)
- maxPages: (Optional) Maximum number of pages to process when traversing links (default: 10)
Response:
The response will be the Markdown document with appropriate content-type headers:
Content-Type: text/markdown
Content-Disposition: attachment; filename="example_com.md"
The Markdown content is returned directly in the response body.
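As a rough illustration, a client might read the suggested filename from the Content-Disposition header and the Markdown from the body. The sketch below assumes a global fetch (Node 18+); filenameFromContentDisposition and convertToMarkdown are hypothetical names, and the header parsing is deliberately simple (it does not handle the extended filename* form).

```javascript
// Hypothetical helper: extract the filename from a Content-Disposition header.
function filenameFromContentDisposition(header, fallback = "download.md") {
  if (!header) return fallback;
  const match = /filename="?([^";]+)"?/i.exec(header);
  return match ? match[1].trim() : fallback;
}

// Hypothetical client for POST /api/to-markdown; fetchImpl is injectable for testing.
async function convertToMarkdown(baseUrl, body, fetchImpl = fetch) {
  const response = await fetchImpl(new URL("/api/to-markdown", baseUrl), {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  if (!response.ok) throw new Error(`Conversion failed with HTTP ${response.status}`);
  return {
    filename: filenameFromContentDisposition(response.headers.get("content-disposition")),
    markdown: await response.text(),
  };
}
```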
Traverse Website and Return URLs
POST /api/traverse
Request Body:
{
"url": "https://example.com",
"username": "optional-username",
"password": "optional-password",
"maxPages": 10
}
Parameters:
- url: (Required) The URL to start traversal from
- username: (Optional) Username for authentication
- password: (Optional) Password for authentication
- maxPages: (Optional) Maximum number of pages to traverse (default: 10)
Response:
{
"success": true,
"message": "Website traversed successfully (found 8 URLs)",
"urls": [
"https://example.com",
"https://example.com/page1",
"https://example.com/page2",
...
]
}
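To illustrate how maxPages can bound a traversal, here is a breadth-first sketch. It is not the server's actual implementation: getLinks is a hypothetical callback that returns the hrefs found on a page, and restricting the crawl to the starting origin is an assumption made for this example.

```javascript
// Breadth-first traversal capped at maxPages visited URLs.
// getLinks(url) is a hypothetical async callback returning an array of hrefs.
async function traverse(startUrl, getLinks, maxPages = 10) {
  const start = new URL(startUrl);
  const visited = new Set();
  const queue = [start.href];
  while (queue.length > 0 && visited.size < maxPages) {
    const current = queue.shift();
    if (visited.has(current)) continue;
    visited.add(current);
    for (const href of await getLinks(current)) {
      let next;
      try {
        next = new URL(href, current); // resolve relative links
      } catch {
        continue; // skip malformed links
      }
      next.hash = ""; // treat fragment variants as the same page
      // Assumption: stay on the starting origin.
      if (next.origin === start.origin && !visited.has(next.href)) queue.push(next.href);
    }
  }
  return [...visited];
}
```

The returned array corresponds to the urls field in the response shown above.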
Customization
You can customize the PDF and Markdown generation by modifying the relevant functions in src/index.js:
PDF Generation
The websiteToPdf function supports:
- Custom page formats
- Background rendering
- Page margins
- And more through Puppeteer's options
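The options listed above map onto the object accepted by Puppeteer's page.pdf(). The sketch below shows one way to merge overrides with defaults; buildPdfOptions and the default values are assumptions for illustration and may differ from what websiteToPdf actually uses.

```javascript
// Assumed defaults; format, printBackground and margin are standard page.pdf() options.
const defaultPdfOptions = {
  format: "A4",
  printBackground: true,
  margin: { top: "20mm", right: "15mm", bottom: "20mm", left: "15mm" },
};

// Hypothetical helper: merge caller overrides, keeping unspecified margins at their defaults.
function buildPdfOptions(overrides = {}) {
  return {
    ...defaultPdfOptions,
    ...overrides,
    margin: { ...defaultPdfOptions.margin, ...(overrides.margin || {}) },
  };
}
```

With a Puppeteer page in hand, the result would be passed as `await page.pdf(buildPdfOptions({ format: "Letter" }))`.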
Markdown Generation
The websiteToMarkdown function uses the Turndown library, which offers:
- Custom rules for conversion
- Ability to preserve certain HTML elements
- Options for handling code blocks, headings, and lists
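A short sketch of these Turndown features, using its documented options (headingStyle, codeBlockStyle, bulletListMarker) and methods (keep, addRule). The specific values and the configureTurndown helper are illustrative assumptions, not the settings used by websiteToMarkdown.

```javascript
// Options for the TurndownService constructor (values chosen for illustration).
const turndownOptions = {
  headingStyle: "atx",       // "# Heading" instead of underlined headings
  codeBlockStyle: "fenced",  // fenced code blocks instead of indented ones
  bulletListMarker: "-",
};

// Hypothetical helper: applies customizations to an existing TurndownService instance.
function configureTurndown(service) {
  // Preserve selected HTML elements as raw HTML in the output.
  service.keep(["iframe"]);
  // Custom rule: render struck-through text with tildes.
  service.addRule("strikethrough", {
    filter: ["del", "s", "strike"],
    replacement: (content) => `~${content}~`,
  });
  return service;
}
```

With the turndown package installed, this would be used as `configureTurndown(new TurndownService(turndownOptions)).turndown(html)`.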
Authentication Handling
The default implementation assumes a simple username/password form. You may need to customize the authentication logic based on the specific websites you're targeting.
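For reference, a form-based login with Puppeteer typically looks like the sketch below. The login function, the loginUrl parameter, and the default selectors are hypothetical; real sites will usually need different selectors or extra steps such as multi-step forms.

```javascript
// Hypothetical login helper; `page` is a Puppeteer Page.
// The default selectors are guesses and must be adapted per site.
async function login(page, { loginUrl, username, password, selectors = {} }) {
  const {
    usernameField = 'input[name="username"]',
    passwordField = 'input[type="password"]',
    submitButton = 'button[type="submit"]',
  } = selectors;

  await page.goto(loginUrl, { waitUntil: "networkidle2" });
  await page.type(usernameField, username);
  await page.type(passwordField, password);
  // Start waiting for navigation before clicking so the navigation is not missed.
  await Promise.all([
    page.waitForNavigation({ waitUntil: "networkidle2" }),
    page.click(submitButton),
  ]);
}
```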
Using as a Claude MCP
This server is set up as an MCP (Model Context Protocol) server that can be used directly with Claude. To use it:
Self-Hosting Setup
- Host this server on a platform like Heroku, Vercel, or your own infrastructure
- Make sure the server is publicly accessible via HTTPS
- Add an icon.png file to your repository
Installing in Claude
- Open Claude in your browser and navigate to the Plugins section
- Click "Create a plugin"
- Enter the URL where your MCP server is hosted
- Claude will discover the API endpoints and create the plugin interface
- Save and enable the plugin
Usage in Claude
Once installed, you can use the MCP directly in your conversations with Claude:
- "Convert example.com to a PDF"
- "Convert example.com to Markdown"
- "Get all the URLs from example.com"
- "Convert the website with authentication using username 'myuser' and password 'mypass'"
The plugin provides three main functions:
- Converting websites to PDF
- Converting websites to Markdown
- Traversing websites and returning discovered URLs
Local Development
For local development, you can use tools like ngrok to expose your local server to the internet:
npm start
# In a separate terminal
ngrok http 3000
Then use the ngrok URL when setting up the MCP in Claude.