File Converter MCP

2025-04-06

A Python-based MCP (Model Context Protocol) server powered by FastMCP that uses Pandoc for flexible document conversion between formats (Markdown, DOCX, PDF, HTML, etc.), designed for use with AI agents such as Claude. Includes a Docker configuration for easy deployment.

Pandoc MCP Server

License: MIT

A Python-based MCP (Model Context Protocol) server that provides powerful document conversion capabilities via Pandoc. This server allows AI agents (like Claude via LangChain/LangGraph) to request file conversions between various formats such as Markdown, DOCX, HTML, PDF, EPUB, and many more.

This project uses:

  • FastMCP: A Python library for easily creating MCP servers.
  • pypandoc: A Python wrapper around the Pandoc command-line tool.
  • Pandoc: The universal document converter.
  • (Optional) Docker: For containerized deployment, bundling all dependencies (Python, Pandoc, LaTeX).

Features

  • Exposes a single MCP tool: convert_document.
  • Supports a wide range of input and output formats handled by Pandoc.
  • Allows specifying input format (if auto-detection fails) and output format.
  • Supports passing extra command-line arguments to Pandoc for advanced control (e.g., Table of Contents, PDF margins, standalone files).
  • Includes Docker configuration (Dockerfile) for creating a self-contained server environment including Pandoc and necessary LaTeX components for PDF generation.
  • Designed for integration with MCP clients, particularly LangChain/LangGraph agents.
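As an illustration of the extra-arguments feature, the sketch below assembles a list that could be passed as extra_args. The helper name build_extra_args and its parameters are hypothetical and not part of this project; only the Pandoc flags themselves (--toc, -V geometry:margin=..., --standalone) come from the examples in this README.

```python
from typing import List, Optional


def build_extra_args(
    toc: bool = False,
    margin: Optional[str] = None,
    standalone: bool = False,
) -> List[str]:
    """Assemble Pandoc flags for the extra_args argument (hypothetical helper)."""
    args: List[str] = []
    if toc:
        args.append("--toc")  # add a table of contents
    if margin is not None:
        # -V sets a template variable; geometry:margin controls PDF margins
        args.extend(["-V", f"geometry:margin={margin}"])
    if standalone:
        args.append("--standalone")  # emit a complete document, not a fragment
    return args


print(build_extra_args(toc=True, margin="1.5cm"))
# → ['--toc', '-V', 'geometry:margin=1.5cm']
```

Keeping each flag and its value as separate list items mirrors how the arguments would appear on the Pandoc command line.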

Exposed MCP Tool

convert_document

Converts a document from one format to another using Pandoc.

Arguments:

  • input_file_path (str, required): The path accessible by the server to the input document file. If running in Docker with a volume mount, this should be the path inside the container (e.g., /data/my_doc.docx).
  • output_file_path (str, required): The path accessible by the server where the converted output file should be saved. If running in Docker, this should be the path inside the container (e.g., /data/my_output.pdf). The directory will be created if it doesn't exist within the server's accessible filesystem.
  • to_format (str, required): The target format for the conversion (e.g., 'markdown', 'docx', 'pdf', 'html', 'rst', 'epub'). See Pandoc documentation for a full list (--list-output-formats).
  • from_format (str, optional): The format of the input file. If None, pandoc will try to guess from the file extension. Specify if the extension is ambiguous or missing (e.g., 'md', 'docx', 'html'). Defaults to None.
  • extra_args (List[str], optional): A list of additional command-line arguments to pass directly to pandoc (e.g., ['--toc'], ['-V', 'geometry:margin=1.5cm'], ['--standalone']). Defaults to None.

Returns:

  • (str): A message indicating success (e.g., "Successfully converted document to '/data/my_output.pdf'") or an error message (e.g., "Error: Input file not found...", "Error during conversion: Pandoc died...").
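The following is a minimal sketch of how a tool with this signature could be written on top of pypandoc. It is not the project's actual code: registration with FastMCP is left out, the exact wording of the error messages is assumed, and pypandoc is imported lazily so that the input check runs even where pypandoc is not installed.

```python
import os
from typing import List, Optional


def convert_document(
    input_file_path: str,
    output_file_path: str,
    to_format: str,
    from_format: Optional[str] = None,
    extra_args: Optional[List[str]] = None,
) -> str:
    """Convert a document with Pandoc and return a status message (sketch)."""
    if not os.path.isfile(input_file_path):
        # Message wording is an assumption; the README only shows its start.
        return f"Error: Input file not found at '{input_file_path}'"

    # Lazy import: only needed once there is a file to convert.
    import pypandoc

    # Create the output directory if it does not exist yet.
    output_dir = os.path.dirname(output_file_path)
    if output_dir:
        os.makedirs(output_dir, exist_ok=True)

    try:
        pypandoc.convert_file(
            input_file_path,
            to_format,
            format=from_format,          # None lets pypandoc guess from the extension
            extra_args=extra_args or [],
            outputfile=output_file_path,
        )
    except (RuntimeError, OSError) as exc:
        # pypandoc raises RuntimeError when Pandoc fails, OSError when it is missing.
        return f"Error during conversion: {exc}"
    return f"Successfully converted document to '{output_file_path}'"


print(convert_document("/no/such/dir/input.md", "/no/such/dir/out.pdf", "pdf"))
```

Returning a message string instead of raising keeps the result readable for the calling agent, which matches the return value described above.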

Setup and Running

You can run this server either locally (requires manual installation of dependencies) or using the provided Docker configuration (recommended for ease of use and deployment).

Installing via Smithery

To install Pandoc Document Converter for Claude Desktop automatically via Smithery:

npx -y @smithery/cli install @MaitreyaM/file-converter-mcp --client claude

Option 1: Running with Docker (Recommended)

This method bundles Python, Pandoc, LaTeX, and required libraries into a container. You only need Docker Desktop installed locally.

  1. Install Docker: Download and install Docker Desktop for your operating system. Start Docker Desktop.
  2. Clone Repository: Get the project files:
    git clone https://github.com/your-username/pandoc-mcp-server.git # Replace with your repo URL
    cd pandoc-mcp-server
    
  3. Build the Docker Image: This command builds the image using the Dockerfile. It installs Pandoc, a capable TeX Live distribution (for PDF support), and Python dependencies inside the image. This step might take several minutes the first time.
    docker build -t pandoc-converter-server .
    
  4. Run the Container: This starts the server inside the container.
    • Choose a directory on your host machine to share with the container for input/output files (e.g., the current project directory).
    • Run the container, mapping the host directory to /data inside the container and mapping port 8000. Replace /path/to/your/local/project with the actual absolute path to the project directory on your machine.
    # Example using the current directory (.) as the host path:
    docker run -it --rm -p 8000:8000 -v "$(pwd)":/data pandoc-converter-server
    
    # Or using an absolute path (replace):
    # docker run -it --rm -p 8000:8000 -v "/path/to/your/local/project":/data pandoc-converter-server
    
    • -it: Runs interactively (shows logs, allows Ctrl+C).
    • --rm: Removes the container when stopped.
    • -p 8000:8000: Maps port 8000 on your host to port 8000 in the container.
    • -v "$(pwd)":/data: Mounts the current working directory on your host to /data inside the container. Files placed in your local project directory will appear in /data inside the container, and files saved to /data by the server will appear in your local project directory.
    • pandoc-converter-server: The name of the image you built.
  5. Server is Running: You should see logs indicating that the server has started and is listening for SSE connections on http://0.0.0.0:8000. It is now ready to accept connections from your MCP client (such as a LangChain agent).
  6. Connecting from Client: Configure your MCP client (e.g., MultiServerMCPClient) to connect to http://127.0.0.1:8000/sse with transport: "sse".
  7. Using the Tool: When interacting with your agent/client, refer to files using their path inside the container, prefixed with /data/. For example: convert /data/my_input.docx to pdf at /data/my_output.pdf. The output file will appear in your local project directory due to the volume mapping.
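The client configuration from step 6 can be sketched as a plain dictionary. The helper sse_connection and the server label "pandoc" are hypothetical names chosen for this example; the URL and transport value are the ones given above. Passing the mapping to MultiServerMCPClient is shown only in comments, since it assumes the langchain-mcp-adapters package is installed.

```python
from typing import Dict


def sse_connection(host: str = "127.0.0.1", port: int = 8000) -> Dict[str, str]:
    """Build the connection entry for one SSE-based MCP server (hypothetical helper)."""
    return {"url": f"http://{host}:{port}/sse", "transport": "sse"}


# "pandoc" is an arbitrary label for this server.
servers = {"pandoc": sse_connection()}
print(servers)

# With langchain-mcp-adapters installed, the mapping could then be passed on:
#   from langchain_mcp_adapters.client import MultiServerMCPClient
#   client = MultiServerMCPClient(servers)
```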

Option 2: Running Locally (Manual Dependency Installation)

This requires you to install Python, Pandoc, and a LaTeX distribution directly onto your host machine.

  1. Install Python: Ensure you have Python >= 3.10 installed.
  2. Install Pandoc: Install the Pandoc command-line tool for your OS. Follow instructions at pandoc.org/installing.html. Verify by running pandoc --version in a new terminal.
  3. Install LaTeX: For PDF generation, install a TeX distribution.
    • macOS: brew install --cask mactex-no-gui (Recommended via Homebrew)
    • Debian/Ubuntu: sudo apt-get update && sudo apt-get install texlive-latex-base texlive-fonts-recommended texlive-latex-extra texlive-fonts-extra (or texlive-full for everything, but large).
    • Windows: Install MiKTeX or TeX Live. Ensure the bin directory containing pdflatex.exe is added to your system's PATH.
    • Verify by running pdflatex --version in a new terminal.
  4. Clone Repository:
    git clone https://github.com/your-username/pandoc-mcp-server.git # Replace with your repo URL
    cd pandoc-mcp-server
    
  5. Create Virtual Environment (Recommended):
    python -m venv venv
    source venv/bin/activate # Linux/macOS
    # venv\Scripts\activate # Windows
    
    (Or use Conda: conda create --name pandoc-env python=3.11 && conda activate pandoc-env)
  6. Install Python Dependencies:
    pip install -r requirements.txt
    
  7. Run the Server:
    python pandoc_mcp_server.py
    
  8. Server is Running: It will listen on http://127.0.0.1:8000/sse.
  9. Connecting from Client: Configure your MCP client to connect to http://127.0.0.1:8000/sse.
  10. Using the Tool: Refer to files using their regular paths on your local machine (e.g., convert my_input.docx to pdf at my_output.pdf, assuming files are in the same directory, or use absolute paths).
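Before starting the server locally, the prerequisites from steps 1 to 3 can be checked with a short script. This is a sketch using only the Python standard library; the function names are hypothetical, and the check only confirms that pandoc and pdflatex are on the PATH, not that they work.

```python
import shutil
import sys
from typing import Iterable, List, Tuple


def missing_tools(names: Iterable[str]) -> List[str]:
    """Return the tools from `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


def python_is_supported(minimum: Tuple[int, int] = (3, 10)) -> bool:
    """Check the running interpreter against the minimum version."""
    return sys.version_info[:2] >= minimum


print("Python version OK:", python_is_supported())
print("Missing tools:", missing_tools(["pandoc", "pdflatex"]))
```

An empty list for the missing tools suggests that both command-line programs are installed and visible to the shell that will run the server.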

Example Agent Interaction (Running Server in Docker)

Assuming the server container is running with the volume mount:

You: convert /data/report.md to pdf

Agent: Thinking...
[Agent calls convert_document tool with input_file_path='/data/report.md', output_file_path='/data/report.pdf', to_format='pdf']
Agent: Successfully converted document to '/data/report.pdf'
[The bot may then attempt to upload report.pdf from the local project directory]
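Because the tool reports a path inside the container, a client that wants to open the result on the host has to translate it back through the volume mount. The sketch below shows one way to do that, assuming POSIX-style paths on the host; the function name container_to_host and the host directory /home/me/project are hypothetical.

```python
from pathlib import PurePosixPath


def container_to_host(
    container_path: str,
    host_root: str,
    mount_point: str = "/data",
) -> str:
    """Translate a path under the container mount into the matching host path."""
    # relative_to raises ValueError if the path is not under the mount point.
    relative = PurePosixPath(container_path).relative_to(mount_point)
    return str(PurePosixPath(host_root) / relative)


print(container_to_host("/data/report.pdf", "/home/me/project"))
# → /home/me/project/report.pdf
```

On Windows hosts the same idea applies, but the host side would need Windows path handling instead of PurePosixPath.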

Files

  • pandoc_mcp_server.py: The main Python script for the MCP server.
  • Dockerfile: Instructions for building the Docker container image.
  • requirements.txt: Python dependencies needed inside the Docker container (or local venv).
  • .gitignore: Specifies intentionally untracked files for Git.
  • README.md: This file.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request or open an Issue.
