MCP Image Recognition Server
An MCP server that provides image recognition capabilities using Anthropic and OpenAI vision APIs. Version 0.1.2.
Features
- Image description using Anthropic Claude Vision or OpenAI GPT-4 Vision
- Support for multiple image formats (JPEG, PNG, GIF, WebP)
- Configurable primary and fallback providers
- Base64 and file-based image input support
- Optional text extraction using Tesseract OCR
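The primary/fallback behavior listed above could be sketched roughly as follows. This is only an illustration, not the server's actual code: the `Provider` signature, `describe_with_fallback`, and the two demo providers are hypothetical names introduced here.

```python
from typing import Callable, Optional

# Hypothetical provider signature: takes raw image bytes, returns a description.
Provider = Callable[[bytes], str]


def describe_with_fallback(
    image: bytes,
    primary: Provider,
    fallback: Optional[Provider] = None,
) -> str:
    """Try the primary provider; on any error, try the fallback if one is set."""
    try:
        return primary(image)
    except Exception as primary_error:
        if fallback is None:
            # No fallback configured: re-raise the original error unchanged.
            raise
        try:
            return fallback(image)
        except Exception as fallback_error:
            # Surface both failures so the caller can see why each provider failed.
            raise RuntimeError(
                f"primary failed ({primary_error}); fallback failed ({fallback_error})"
            ) from fallback_error


def failing_provider(image: bytes) -> str:
    """Demo stand-in for a provider whose API call fails."""
    raise ConnectionError("provider unavailable")


def working_provider(image: bytes) -> str:
    """Demo stand-in for a provider that succeeds."""
    return f"description of {len(image)} bytes"


result = describe_with_fallback(b"\x89PNG", failing_provider, working_provider)
print(result)
```

Passing providers in as plain callables keeps the retry logic independent of any particular vision API client.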
Requirements
- Python 3.8 or higher
- Tesseract OCR (optional): required only for the text extraction feature
  - Windows: Download and install from UB-Mannheim/tesseract
  - Linux:
    sudo apt-get install tesseract-ocr
  - macOS:
    brew install tesseract
Installation
- Clone the repository:
git clone https://github.com/mario-andreschak/mcp-image-recognition.git
cd mcp-image-recognition
- Create and configure your environment file:
cp .env.example .env
# Edit .env with your API keys and preferences
- Build the project:
build.bat
Usage
Running the Server
Start the server using Python:
python -m image_recognition_server.server
Alternatively, start the server using the batch script:
run.bat server
Start the server in development mode with the MCP Inspector:
run.bat debug
Available Tools
- describe_image
  - Input: Base64-encoded image data and MIME type
  - Output: Detailed description of the image
- describe_image_from_file
  - Input: Path to an image file
  - Output: Detailed description of the image
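As a rough illustration of the describe_image input, the sketch below turns image bytes into Base64 text plus a MIME type. The argument keys `image` and `mime_type`, and the helper name `build_describe_image_arguments`, are assumptions made for this example; check the server's tool schema for the real parameter names.

```python
import base64
from pathlib import Path

# Extensions for the formats this README lists as supported.
EXTENSION_TO_MIME = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def build_describe_image_arguments(data: bytes, filename: str) -> dict:
    """Return Base64 image data and MIME type for the given image bytes.

    The keys "image" and "mime_type" are assumed names, not confirmed here.
    """
    suffix = Path(filename).suffix.lower()
    mime_type = EXTENSION_TO_MIME.get(suffix)
    if mime_type is None:
        raise ValueError(f"unsupported image extension: {suffix!r}")
    return {
        "image": base64.b64encode(data).decode("ascii"),
        "mime_type": mime_type,
    }


# The first six bytes of a GIF file are enough to demonstrate the encoding.
args = build_describe_image_arguments(b"GIF89a", "tiny.gif")
print(args)
```

For a real file, `Path(path).read_bytes()` would supply the `data` argument.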
Environment Configuration
- ANTHROPIC_API_KEY: Your Anthropic API key.
- OPENAI_API_KEY: Your OpenAI API key.
- VISION_PROVIDER: Primary vision provider (anthropic or openai).
- FALLBACK_PROVIDER: Optional fallback provider.
- LOG_LEVEL: Logging level (DEBUG, INFO, WARNING, ERROR).
- ENABLE_OCR: Enable Tesseract OCR text extraction (true or false).
- TESSERACT_CMD: Optional custom path to the Tesseract executable.
- OPENAI_MODEL: OpenAI model (default: gpt-4o-mini). Can use the OpenRouter format for other models (e.g., anthropic/claude-3.5-sonnet:beta).
- OPENAI_BASE_URL: Optional custom base URL for the OpenAI API. Set to https://openrouter.ai/api/v1 for OpenRouter.
- OPENAI_TIMEOUT: Optional custom timeout (in seconds) for the OpenAI API.
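One way these variables might be read and validated is sketched below. It is not the server's actual loader: the `Settings` class and `load_settings` function are hypothetical, and every default except the documented OPENAI_MODEL value (`gpt-4o-mini`) is an assumption chosen for the example.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

VALID_PROVIDERS = {"anthropic", "openai"}


@dataclass
class Settings:
    vision_provider: str
    fallback_provider: Optional[str]
    enable_ocr: bool
    openai_model: str
    openai_base_url: Optional[str]
    openai_timeout: Optional[float]
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from an environment mapping, validating providers and keys."""
    # Assumed default: the README does not state a default primary provider.
    provider = env.get("VISION_PROVIDER", "anthropic").strip().lower()
    if provider not in VALID_PROVIDERS:
        raise ValueError(f"VISION_PROVIDER must be one of {sorted(VALID_PROVIDERS)}")

    fallback = env.get("FALLBACK_PROVIDER", "").strip().lower() or None
    if fallback is not None and fallback not in VALID_PROVIDERS:
        raise ValueError(f"FALLBACK_PROVIDER must be one of {sorted(VALID_PROVIDERS)}")

    # Each configured provider needs its matching API key.
    for name in filter(None, (provider, fallback)):
        key_name = f"{name.upper()}_API_KEY"
        if not env.get(key_name):
            raise ValueError(f"{key_name} is required when {name} is configured")

    timeout = env.get("OPENAI_TIMEOUT")
    return Settings(
        vision_provider=provider,
        fallback_provider=fallback,
        # Assumed default: OCR stays off unless explicitly enabled.
        enable_ocr=env.get("ENABLE_OCR", "false").strip().lower() == "true",
        openai_model=env.get("OPENAI_MODEL", "gpt-4o-mini"),
        openai_base_url=env.get("OPENAI_BASE_URL") or None,
        openai_timeout=float(timeout) if timeout else None,
        # Assumed default log level.
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )


settings = load_settings(
    {
        "VISION_PROVIDER": "openai",
        "OPENAI_API_KEY": "sk-example",
        "ENABLE_OCR": "true",
        "OPENAI_TIMEOUT": "30",
    }
)
print(settings)
```

Accepting any mapping, rather than reading `os.environ` directly, makes the validation easy to exercise without touching the real environment.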
Using OpenRouter
OpenRouter allows you to access various models using the OpenAI API format. To use OpenRouter, follow these steps:
- Obtain an API key from OpenRouter.
- Set OPENAI_API_KEY in your .env file to your OpenRouter API key.
- Set OPENAI_BASE_URL to https://openrouter.ai/api/v1.
- Set OPENAI_MODEL to the desired model using the OpenRouter format (e.g., anthropic/claude-3.5-sonnet:beta).
- Set VISION_PROVIDER to openai.
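The steps above amount to setting four variables. As a small sketch, the helpers below (hypothetical names `openrouter_env` and `render_dotenv`, with a placeholder key) collect those values and print them as .env lines:

```python
from typing import Dict


def openrouter_env(api_key: str, model: str) -> Dict[str, str]:
    """Return the four variables the OpenRouter steps set."""
    return {
        "OPENAI_API_KEY": api_key,
        "OPENAI_BASE_URL": "https://openrouter.ai/api/v1",
        "OPENAI_MODEL": model,
        "VISION_PROVIDER": "openai",
    }


def render_dotenv(values: Dict[str, str]) -> str:
    """Render a mapping as NAME=value lines, one per variable."""
    return "\n".join(f"{name}={value}" for name, value in values.items()) + "\n"


# "your-openrouter-key" is a placeholder, not a real credential.
dotenv_text = render_dotenv(
    openrouter_env("your-openrouter-key", "anthropic/claude-3.5-sonnet:beta")
)
print(dotenv_text, end="")
```

The printed lines can be pasted into the .env file created during installation.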
Default Models
- Anthropic: claude-3.5-sonnet-beta
- OpenAI: gpt-4o-mini
- OpenRouter: Use the anthropic/claude-3.5-sonnet:beta format in OPENAI_MODEL.
Development
Running Tests
Run all tests:
run.bat test
Run specific test suite:
run.bat test server
run.bat test anthropic
run.bat test openai
Docker Support
Build the Docker image:
docker build -t mcp-image-recognition .
Run the container:
docker run -it --env-file .env mcp-image-recognition
License
MIT License - see LICENSE file for details.
Release History
- 0.1.2 (2025-02-20): Improved OCR error handling and added comprehensive test coverage for OCR functionality
- 0.1.1 (2025-02-19): Added Tesseract OCR support for text extraction from images (optional feature)
- 0.1.0 (2025-02-19): Initial release with Anthropic and OpenAI vision support