MCP cover image
See in Github
2025-01-06

MCP Server pour la visualise de jeu de données FACE HUGGING

1

Github Watches

7

Github Forks

12

Github Stars

Dataset Viewer MCP Server

An MCP server for interacting with the Hugging Face Dataset Viewer API, providing capabilities to browse and analyze datasets hosted on the Hugging Face Hub.

Features

Resources

  • Uses dataset:// URI scheme for accessing Hugging Face datasets
  • Supports dataset configurations and splits
  • Provides paginated access to dataset contents
  • Handles authentication for private datasets
  • Supports searching and filtering dataset contents
  • Provides dataset statistics and analysis

Tools

The server provides the following tools:

  1. validate

    • Check if a dataset exists and is accessible
    • Parameters:
      • dataset: Dataset identifier (e.g. 'stanfordnlp/imdb')
      • auth_token (optional): For private datasets
  2. get_info

    • Get detailed information about a dataset
    • Parameters:
      • dataset: Dataset identifier
      • auth_token (optional): For private datasets
  3. get_rows

    • Get paginated contents of a dataset
    • Parameters:
      • dataset: Dataset identifier
      • config: Configuration name
      • split: Split name
      • page (optional): Page number (0-based)
      • auth_token (optional): For private datasets
  4. get_first_rows

    • Get first rows from a dataset split
    • Parameters:
      • dataset: Dataset identifier
      • config: Configuration name
      • split: Split name
      • auth_token (optional): For private datasets
  5. get_statistics

    • Get statistics about a dataset split
    • Parameters:
      • dataset: Dataset identifier
      • config: Configuration name
      • split: Split name
      • auth_token (optional): For private datasets
  6. search_dataset

    • Search for text within a dataset
    • Parameters:
      • dataset: Dataset identifier
      • config: Configuration name
      • split: Split name
      • query: Text to search for
      • auth_token (optional): For private datasets
  7. filter

    • Filter rows using SQL-like conditions
    • Parameters:
      • dataset: Dataset identifier
      • config: Configuration name
      • split: Split name
      • where: SQL WHERE clause (e.g. "score > 0.5")
      • orderby (optional): SQL ORDER BY clause
      • page (optional): Page number (0-based)
      • auth_token (optional): For private datasets
  8. get_parquet

    • Download entire dataset in Parquet format
    • Parameters:
      • dataset: Dataset identifier
      • auth_token (optional): For private datasets

Installation

Prerequisites

  • Python 3.12 or higher
  • uv - Fast Python package installer and resolver

Setup

  1. Clone the repository:
git clone https://github.com/privetin/dataset-viewer.git
cd dataset-viewer
  1. Create a virtual environment and install:
# Create virtual environment
uv venv

# Activate virtual environment
# On Unix:
source .venv/bin/activate
# On Windows:
.venv\Scripts\activate

# Install in development mode
uv add -e .

Configuration

Environment Variables

  • HUGGINGFACE_TOKEN: Your Hugging Face API token for accessing private datasets

Claude Desktop Integration

Add the following to your Claude Desktop config file:

On Windows: %APPDATA%\Claude\claude_desktop_config.json

On MacOS: ~/Library/Application Support/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "dataset-viewer": {
      "command": "uv",
      "args": [
        "run",
        "dataset-viewer"
      ]
    }
  }
}

Usage Examples

  1. Validate a dataset:
{
  "dataset": "stanfordnlp/imdb"
}
  1. Get dataset information:
{
  "dataset": "stanfordnlp/imdb"
}
  1. Search dataset contents:
{
  "dataset": "stanfordnlp/imdb",
  "config": "plain_text",
  "split": "train",
  "query": "great movie"
}
  1. Filter and sort rows:
{
  "dataset": "stanfordnlp/imdb",
  "config": "plain_text",
  "split": "train",
  "where": "label = 'positive'",
  "orderby": "text DESC",
  "page": 0
}
  1. Get dataset statistics:
{
  "dataset": "stanfordnlp/imdb",
  "config": "plain_text",
  "split": "train"
}

License

MIT License - see LICENSE for details

相关推荐

  • NiKole Maxwell
  • I craft unique cereal names, stories, and ridiculously cute Cereal Baby images.

  • https://suefel.com
  • Latest advice and best practices for custom GPT development.

  • Yusuf Emre Yeşilyurt
  • I find academic articles and books for research and literature reviews.

  • https://maiplestudio.com
  • Find Exhibitors, Speakers and more

  • Carlos Ferrin
  • Encuentra películas y series en plataformas de streaming.

  • Joshua Armstrong
  • Confidential guide on numerology and astrology, based of GG33 Public information

  • Contraband Interactive
  • Emulating Dr. Jordan B. Peterson's style in providing life advice and insights.

  • https://jgadvisorycpa.com
  • This GPT assists in finding a top-rated business CPA - local or virtual. We account for their qualifications, experience, testimonials and reviews. Business operators provide a short description of your business, services wanted, and city or state.

  • rustassistant.com
  • Your go-to expert in the Rust ecosystem, specializing in precise code interpretation, up-to-date crate version checking, and in-depth source code analysis. I offer accurate, context-aware insights for all your Rust programming questions.

  • Elijah Ng Shi Yi
  • Advanced software engineer GPT that excels through nailing the basics.

  • Emmet Halm
  • Converts Figma frames into front-end code for various mobile frameworks.

  • apappascs
  • Découvrez la collection la plus complète et la plus à jour de serveurs MCP sur le marché. Ce référentiel sert de centre centralisé, offrant un vaste catalogue de serveurs MCP open-source et propriétaires, avec des fonctionnalités, des liens de documentation et des contributeurs.

  • modelcontextprotocol
  • Serveurs de protocole de contexte modèle

  • Mintplex-Labs
  • L'application tout-en-un desktop et Docker AI avec chiffon intégré, agents AI, constructeur d'agent sans code, compatibilité MCP, etc.

  • ShrimpingIt
  • Manipulation basée sur Micropython I2C de l'exposition GPIO de la série MCP, dérivée d'Adafruit_MCP230XX

  • OffchainLabs
  • Aller la mise en œuvre de la preuve de la participation Ethereum

  • n8n-io
  • Plateforme d'automatisation de workflow à code équitable avec des capacités d'IA natives. Combinez le bâtiment visuel avec du code personnalisé, de l'auto-hôte ou du cloud, 400+ intégrations.

  • huahuayu
  • Une passerelle API unifiée pour intégrer plusieurs API d'explorateur de blockchain de type étherscan avec la prise en charge du protocole de contexte modèle (MCP) pour les assistants d'IA.

    Reviews

    4 (1)
    Avatar
    user_KNZZm1lq
    2025-04-15

    AgenticProductSearching by Gen-AI-Developer is a game-changer! The seamless integration and user-friendly interface are impressive. The product has significantly enhanced my search efficiency and accuracy. I highly recommend it to anyone seeking an effective solution for product searches. The performance is robust and reliable. Check it out: https://mcp.so/server/AgenticProductSearching/Gen-AI-Developer