MCP cover image
See in Github
2025-04-13

Ein MCP -Server, der Bilderkennung 👀 Funktionen mithilfe anthropischer und OpenAI -Vision -APIs bietet

1

Github Watches

3

Github Forks

10

Github Stars

MCP Image Recognition Server

An MCP server that provides image recognition capabilities using Anthropic and OpenAI vision APIs. Version 0.1.2.

Features

  • Image description using Anthropic Claude Vision or OpenAI GPT-4 Vision
  • Support for multiple image formats (JPEG, PNG, GIF, WebP)
  • Configurable primary and fallback providers
  • Base64 and file-based image input support
  • Optional text extraction using Tesseract OCR

Requirements

  • Python 3.8 or higher
  • Tesseract OCR (optional) - Required for text extraction feature
    • Windows: Download and install from UB-Mannheim/tesseract
    • Linux: sudo apt-get install tesseract-ocr
    • macOS: brew install tesseract

Installation

  1. Clone the repository:
git clone https://github.com/mario-andreschak/mcp-image-recognition.git
cd mcp-image-recognition
  1. Create and configure your environment file:
cp .env.example .env
# Edit .env with your API keys and preferences
  1. Build the project:
build.bat

Usage

Running the Server

Spawn the server using python:

python -m image_recognition_server.server

Start the server using batch instead:

run.bat server

Start the server in development mode with the MCP Inspector:

run.bat debug

Available Tools

  1. describe_image

    • Input: Base64-encoded image data and MIME type
    • Output: Detailed description of the image
  2. describe_image_from_file

    • Input: Path to an image file
    • Output: Detailed description of the image

Environment Configuration

  • ANTHROPIC_API_KEY: Your Anthropic API key.
  • OPENAI_API_KEY: Your OpenAI API key.
  • VISION_PROVIDER: Primary vision provider (anthropic or openai).
  • FALLBACK_PROVIDER: Optional fallback provider.
  • LOG_LEVEL: Logging level (DEBUG, INFO, WARNING, ERROR).
  • ENABLE_OCR: Enable Tesseract OCR text extraction (true or false).
  • TESSERACT_CMD: Optional custom path to Tesseract executable.
  • OPENAI_MODEL: OpenAI Model (default: gpt-4o-mini). Can use OpenRouter format for other models (e.g., anthropic/claude-3.5-sonnet:beta).
  • OPENAI_BASE_URL: Optional custom base URL for the OpenAI API. Set to https://openrouter.ai/api/v1 for OpenRouter.
  • OPENAI_TIMEOUT: Optional custom timeout (in seconds) for the OpenAI API.

Using OpenRouter

OpenRouter allows you to access various models using the OpenAI API format. To use OpenRouter, follow these steps:

  1. Obtain an OpenAI API key from OpenRouter.
  2. Set OPENAI_API_KEY in your .env file to your OpenRouter API key.
  3. Set OPENAI_BASE_URL to https://openrouter.ai/api/v1.
  4. Set OPENAI_MODEL to the desired model using the OpenRouter format (e.g., anthropic/claude-3.5-sonnet:beta).
  5. Set VISION_PROVIDER to openai.

Default Models

  • Anthropic: claude-3.5-sonnet-beta
  • OpenAI: gpt-4o-mini
  • OpenRouter: Use the anthropic/claude-3.5-sonnet:beta format in OPENAI_MODEL.

Development

Running Tests

Run all tests:

run.bat test

Run specific test suite:

run.bat test server
run.bat test anthropic
run.bat test openai

Docker Support

Build the Docker image:

docker build -t mcp-image-recognition .

Run the container:

docker run -it --env-file .env mcp-image-recognition

License

MIT License - see LICENSE file for details.

Release History

  • 0.1.2 (2025-02-20): Improved OCR error handling and added comprehensive test coverage for OCR functionality
  • 0.1.1 (2025-02-19): Added Tesseract OCR support for text extraction from images (optional feature)
  • 0.1.0 (2025-02-19): Initial release with Anthropic and OpenAI vision support

相关推荐

  • NiKole Maxwell
  • I craft unique cereal names, stories, and ridiculously cute Cereal Baby images.

  • Bora Yalcin
  • Evaluator for marketplace product descriptions, checks for relevancy and keyword stuffing.

  • https://jgadvisorycpa.com
  • This GPT assists in finding a top-rated business CPA - local or virtual. We account for their qualifications, experience, testimonials and reviews. Business operators provide a short description of your business, services wanted, and city or state.

  • https://suefel.com
  • Latest advice and best practices for custom GPT development.

  • Yusuf Emre Yeşilyurt
  • I find academic articles and books for research and literature reviews.

  • https://maiplestudio.com
  • Find Exhibitors, Speakers and more

  • Carlos Ferrin
  • Encuentra películas y series en plataformas de streaming.

  • Joshua Armstrong
  • Confidential guide on numerology and astrology, based of GG33 Public information

  • Emmet Halm
  • Converts Figma frames into front-end code for various mobile frameworks.

  • Elijah Ng Shi Yi
  • Advanced software engineer GPT that excels through nailing the basics.

  • lumpenspace
  • Take an adjectivised noun, and create images making it progressively more adjective!

  • apappascs
  • Entdecken Sie die umfassendste und aktuellste Sammlung von MCP-Servern auf dem Markt. Dieses Repository dient als zentraler Hub und bietet einen umfangreichen Katalog von Open-Source- und Proprietary MCP-Servern mit Funktionen, Dokumentationslinks und Mitwirkenden.

  • OffchainLabs
  • GO -Umsetzung des Ethereum -Beweises des Anteils

  • huahuayu
  • Ein einheitliches API-Gateway zur Integration mehrerer Ethercan-ähnlicher Blockchain-Explorer-APIs mit Modellkontextprotokoll (MCP) für AI-Assistenten.

  • modelcontextprotocol
  • Modellkontext -Protokollserver

    Reviews

    1 (1)
    Avatar
    user_TlUb9VSz
    2025-04-15

    I am thoroughly impressed with the Youtube app by nabid-pf available on MCP. The integration is seamless, making it easy to access and enjoy all my favorite videos right from the server. The user interface is intuitive, and the streaming quality is excellent. Highly recommend checking this out! You won't be disappointed: https://mcp.so/server/youtube/nabid-pf