Dataset Viewer MCP Server
An MCP server for interacting with the Hugging Face Dataset Viewer API, providing capabilities to browse and analyze datasets hosted on the Hugging Face Hub.
Features
Resources
- Uses the dataset:// URI scheme for accessing Hugging Face datasets
- Supports dataset configurations and splits
- Provides paginated access to dataset contents
- Handles authentication for private datasets
- Supports searching and filtering dataset contents
- Provides dataset statistics and analysis
Tools
The server provides the following tools:
- validate - Check if a dataset exists and is accessible
  - Parameters:
    - dataset: Dataset identifier (e.g. 'stanfordnlp/imdb')
    - auth_token (optional): For private datasets
- get_info - Get detailed information about a dataset
  - Parameters:
    - dataset: Dataset identifier
    - auth_token (optional): For private datasets
- get_rows - Get paginated contents of a dataset
  - Parameters:
    - dataset: Dataset identifier
    - config: Configuration name
    - split: Split name
    - page (optional): Page number (0-based)
    - auth_token (optional): For private datasets
- get_first_rows - Get first rows from a dataset split
  - Parameters:
    - dataset: Dataset identifier
    - config: Configuration name
    - split: Split name
    - auth_token (optional): For private datasets
- get_statistics - Get statistics about a dataset split
  - Parameters:
    - dataset: Dataset identifier
    - config: Configuration name
    - split: Split name
    - auth_token (optional): For private datasets
- search_dataset - Search for text within a dataset
  - Parameters:
    - dataset: Dataset identifier
    - config: Configuration name
    - split: Split name
    - query: Text to search for
    - auth_token (optional): For private datasets
- filter - Filter rows using SQL-like conditions
  - Parameters:
    - dataset: Dataset identifier
    - config: Configuration name
    - split: Split name
    - where: SQL WHERE clause (e.g. "score > 0.5")
    - orderby (optional): SQL ORDER BY clause
    - page (optional): Page number (0-based)
    - auth_token (optional): For private datasets
- get_parquet - Download entire dataset in Parquet format
  - Parameters:
    - dataset: Dataset identifier
    - auth_token (optional): For private datasets
Installation
Prerequisites
- Python 3.12 or higher
- uv - Fast Python package installer and resolver
Setup
- Clone the repository:
git clone https://github.com/privetin/dataset-viewer.git
cd dataset-viewer
- Create a virtual environment and install:
# Create virtual environment
uv venv
# Activate virtual environment
# On Unix:
source .venv/bin/activate
# On Windows:
.venv\Scripts\activate
# Install in development mode
uv add -e .
Configuration
Environment Variables
- HUGGINGFACE_TOKEN: Your Hugging Face API token for accessing private datasets
Claude Desktop Integration
Add the following to your Claude Desktop config file:
On Windows: %APPDATA%\Claude\claude_desktop_config.json
On macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
{
"mcpServers": {
"dataset-viewer": {
"command": "uv",
"args": [
"run",
"dataset-viewer"
]
}
}
}
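If the config file already lists other MCP servers, the entry has to be merged in, not pasted over the existing content. The following is a minimal sketch of such a merge; the `add_server` helper is hypothetical, and it assumes the file is either absent or contains valid JSON.

```python
import json
from pathlib import Path


def add_server(config_path: Path) -> dict:
    """Merge the dataset-viewer entry into a Claude Desktop config file."""
    # Start from the existing config if there is one (assumed to be valid JSON).
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    # Keep any servers that are already registered.
    servers = config.setdefault("mcpServers", {})
    servers["dataset-viewer"] = {"command": "uv", "args": ["run", "dataset-viewer"]}
    config_path.write_text(json.dumps(config, indent=2))
    return config
```

On macOS, for example, this could be called as `add_server(Path.home() / "Library/Application Support/Claude/claude_desktop_config.json")`. Back up the file first, since the sketch rewrites it in place.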
Usage Examples
- Validate a dataset:
{
"dataset": "stanfordnlp/imdb"
}
- Get dataset information:
{
"dataset": "stanfordnlp/imdb"
}
- Search dataset contents:
{
"dataset": "stanfordnlp/imdb",
"config": "plain_text",
"split": "train",
"query": "great movie"
}
- Filter and sort rows:
{
"dataset": "stanfordnlp/imdb",
"config": "plain_text",
"split": "train",
"where": "label = 1",
"orderby": "text DESC",
"page": 0
}
- Get dataset statistics:
{
"dataset": "stanfordnlp/imdb",
"config": "plain_text",
"split": "train"
}
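The JSON objects in these examples are the arguments passed to a tool, not complete requests. An MCP client such as Claude Desktop normally wraps them for you; the sketch below shows roughly what that wrapping looks like as a JSON-RPC `tools/call` request. The `tool_call` helper and the id counter are illustrative assumptions, not part of this server.

```python
import json
from itertools import count

# JSON-RPC request ids should be unique within a session.
_ids = count(1)


def tool_call(name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request for the given tool."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(request)


if __name__ == "__main__":
    print(tool_call("search_dataset", {
        "dataset": "stanfordnlp/imdb",
        "config": "plain_text",
        "split": "train",
        "query": "great movie",
    }))
```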
License
MIT License - see LICENSE for details