2025-04-14

YouTube Transcript MCP

A Python-based MCP (Model Context Protocol) server that extracts transcripts from YouTube videos using subtitle-based transcription, with audio-based transcription as a fallback. It lets AI assistants obtain a video's transcript through a standardized interface.

What It Does

This MCP server provides a dual-approach transcription system:

Primary Method: Subtitle-based Transcription

  • Extracts available subtitles from YouTube videos
  • Supports multiple language preferences
  • Uses the youtube_transcript_api for efficient subtitle extraction

Fallback Method: Audio-based Transcription

If subtitles are unavailable, or if audio-based transcription is explicitly requested, the system will:

  • Download the video's audio track using pytubefix
  • Convert the audio to a suitable format using pydub
  • Process the audio in chunks using Google Speech Recognition
  • Generate a transcript from the spoken content
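
The chunked processing step can be pictured as slicing the audio into fixed-length windows. The sketch below only computes the window boundaries in milliseconds, which is the unit pydub uses when slicing an AudioSegment. The function name chunk_bounds and the 30-second default are assumptions made for illustration and are not taken from the project.

```python
from typing import List, Tuple


def chunk_bounds(total_ms: int, chunk_ms: int = 30_000) -> List[Tuple[int, int]]:
    """Return (start, end) millisecond offsets covering the whole audio track.

    Hypothetical helper: the name and the default chunk length are assumptions.
    """
    if total_ms < 0:
        raise ValueError("total_ms must be non-negative")
    if chunk_ms <= 0:
        raise ValueError("chunk_ms must be positive")
    # The final window is clipped so it never runs past the end of the audio.
    return [
        (start, min(start + chunk_ms, total_ms))
        for start in range(0, total_ms, chunk_ms)
    ]


if __name__ == "__main__":
    # A 75-second track split into 30-second windows.
    print(chunk_bounds(75_000))
```

With pydub, each pair could be applied as audio[start:end], and the resulting segment handed to the recognizer, with the partial transcripts joined in order.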

Project Architecture

Component Interaction

  1. YouTubeTranscriptManager (Main Controller)

    • Coordinates the transcription process
    • Manages fallback between subtitle and audio methods
    • Handles error reporting
  2. YouTubeTranscriptExtractor (Subtitle Handler)

    • Processes YouTube URLs
    • Extracts video IDs
    • Manages subtitle retrieval
  3. YouTubeAudioManager (Audio Handler)

    • Downloads audio content
    • Manages audio processing
    • Handles speech recognition
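
The interaction between the three components might look roughly like the sketch below: a video ID is extracted from the URL, subtitles are tried first, and audio transcription is used when subtitles fail or when audio is forced. This is not the project's code. The function names extract_video_id and get_transcript, the set of URL shapes handled, and the two injected callables standing in for the subtitle and audio handlers are all assumptions.

```python
from typing import Callable, Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Pull the video ID out of common YouTube URL shapes, or return None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links: https://youtu.be/<id>
        return parsed.path.lstrip("/").split("/")[0] or None
    if host == "youtube.com" or host.endswith(".youtube.com"):
        if parsed.path == "/watch":
            # Standard links: https://www.youtube.com/watch?v=<id>
            return parse_qs(parsed.query).get("v", [None])[0]
        parts = parsed.path.strip("/").split("/")
        if len(parts) >= 2 and parts[0] in ("embed", "shorts", "live"):
            return parts[1] or None
    return None


def get_transcript(
    url: str,
    fetch_subtitles: Callable[[str], str],   # stand-in for the subtitle handler
    transcribe_audio: Callable[[str], str],  # stand-in for the audio handler
    use_audio: bool = False,
) -> str:
    """Try subtitles first, then fall back to audio transcription."""
    video_id = extract_video_id(url)
    if video_id is None:
        raise ValueError(f"Not a recognizable YouTube URL: {url}")
    if not use_audio:
        try:
            return fetch_subtitles(video_id)
        except Exception:
            # In this sketch, any subtitle failure triggers the audio fallback.
            pass
    return transcribe_audio(url)
```

Passing the handlers in as callables keeps the fallback logic testable without network access; the real manager presumably calls its extractor and audio classes directly.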

Quick Start

Prerequisites

  • Python 3.8 or higher
  • uv package manager or npm package manager
  • FFmpeg: Required for audio processing (used by pydub)

Dependencies

  • youtube_transcript_api: For subtitle extraction
  • pytubefix: For downloading YouTube audio
  • SpeechRecognition: For audio transcription
  • pydub: For audio processing (requires FFmpeg)
  • mcp-python: For MCP server implementation

Installation

  1. Clone this repository:
git clone [your-repo-url]
cd youtube_transcript_mcp
  2. Install dependencies:
uv venv
uv pip install -r requirements.txt
  3. Install FFmpeg:
    • Windows: Download FFmpeg from https://ffmpeg.org/download.html, extract it, and add the bin folder to your system's PATH.
    • macOS: Use Homebrew: brew install ffmpeg
    • Linux: Use your package manager, e.g., sudo apt install ffmpeg (Debian/Ubuntu)
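
After installing, it can help to confirm that FFmpeg is actually reachable from Python, since pydub looks for it on PATH. The following is a small check using only the standard library; the helper name find_ffmpeg is an assumption and is not part of the project.

```python
import shutil
from typing import Optional


def find_ffmpeg() -> Optional[str]:
    """Return the full path of the ffmpeg executable, or None if it is not on PATH."""
    return shutil.which("ffmpeg")


if __name__ == "__main__":
    path = find_ffmpeg()
    if path is None:
        print("FFmpeg was not found on PATH; audio-based transcription will likely fail.")
    else:
        print(f"FFmpeg found at {path}")
```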

Running the Server

Run the server using:

uv run youtube_transcript_manager.py

Configuration with Claude for Desktop

To use this server with Claude for Desktop, add the following to your Claude configuration file (on macOS, ~/Library/Application Support/Claude/claude_desktop_config.json):

{
    "mcpServers": {
        "youtube_transcript": {
            "command": "uv",
            "args": [
                "--directory",
                "PATH_TO_YOUR_PROJECT_FOLDER",
                "run",
                "youtube_transcript_manager.py"
            ]
        }
    }
}

Replace PATH_TO_YOUR_PROJECT_FOLDER with the absolute path to your project directory.
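
To avoid typing the absolute path by hand, the configuration entry can be generated. The sketch below builds the same JSON structure with the project directory resolved to an absolute path. The helper name build_server_entry is an assumption, and the output still has to be merged into the configuration file manually.

```python
import json
from pathlib import Path


def build_server_entry(project_dir: str) -> dict:
    """Build the mcpServers entry with project_dir resolved to an absolute path."""
    absolute = str(Path(project_dir).expanduser().resolve())
    return {
        "mcpServers": {
            "youtube_transcript": {
                "command": "uv",
                "args": ["--directory", absolute, "run", "youtube_transcript_manager.py"],
            }
        }
    }


if __name__ == "__main__":
    # "." assumes the script is run from inside the project folder.
    print(json.dumps(build_server_entry("."), indent=4))
```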

MCP Server Usage

The server exposes a single tool, get_youtube_transcript, with the following parameters:

from typing import List, Optional

async def get_youtube_transcript(
    url: str,                               # YouTube video URL
    languages: Optional[List[str]] = None,  # Preferred subtitle languages
    use_audio: bool = False                 # Force audio-based transcription
) -> str:                                   # Returns the transcript text
    ...  # Implementation omitted

Example Usage

  1. To get a transcript using available subtitles:
transcript = await get_youtube_transcript("https://www.youtube.com/watch?v=VIDEO_ID")
  2. To get a transcript in specific languages:
transcript = await get_youtube_transcript(
    "https://www.youtube.com/watch?v=VIDEO_ID",
    languages=["en", "es"]
)
  3. To force audio-based transcription:
transcript = await get_youtube_transcript(
    "https://www.youtube.com/watch?v=VIDEO_ID",
    use_audio=True
)
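
The calls above use await, so in a standalone script they need to run inside a coroutine. The sketch below shows one way to do that with asyncio.run. The get_youtube_transcript defined here is a stand-in stub with the same signature so that the snippet runs on its own; in practice the function would presumably come from the project itself, which is an assumption about how the module is laid out.

```python
import asyncio
from typing import List, Optional


async def get_youtube_transcript(
    url: str,
    languages: Optional[List[str]] = None,
    use_audio: bool = False,
) -> str:
    """Stand-in stub with the same signature as the real tool (not the project's code)."""
    method = "audio" if use_audio else "subtitles"
    return f"[stub transcript via {method} for {url}]"


async def main() -> str:
    # Replace VIDEO_ID with a real video ID when using the actual tool.
    return await get_youtube_transcript(
        "https://www.youtube.com/watch?v=VIDEO_ID",
        languages=["en", "es"],
    )


if __name__ == "__main__":
    print(asyncio.run(main()))
```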

    Reviews

    1.5 (2)
    user_080hK8N1
    2025-04-24

    The youtube_transcript_mcp by yuncheng-wu has been a fantastic tool for me. It effortlessly extracts transcripts from YouTube videos, making content analysis a breeze. The seamless integration and user-friendly interface are remarkable. Highly recommend for anyone who needs accurate and quick transcripts from YouTube!

    user_aVUuRFbf
    2025-04-24

    I'm a huge fan of youtube_transcript_mcp by yuncheng-wu! This tool is incredibly efficient for extracting transcripts from YouTube videos, making it easier to get the content you need quickly. It’s user-friendly and works like a charm. Highly recommended for anyone who frequently relies on YouTube content for study or work!