Whisper MCP: Give Your AI Assistant "Ears" with Local Speech-to-Text

Platform	Backend Engine	Acceleration
macOS	whisper.cpp	Metal / CoreML
Linux	whisper.cpp / faster-whisper	CUDA
Windows	faster-whisper	CUDA / CPU

Model	Size	Accuracy	Best For
large-v3-turbo	~1.6 GB	Highest	Default recommended
large-v3-turbo-q8_0	~874 MB	High	Balance of speed & quality
large-v3-turbo-q5_0	~574 MB	Good	Speed-first scenarios

🤔 What is MCP? Why Does It Matter?

🚀 Core Features of Whisper MCP

1. Local-First, Privacy-First

2. Dual Backend Architecture, Cross-Platform Support

3. Automatic Hardware Acceleration

4. High-Quality Transcription Model

5. Multiple Audio Format Support

6. Timestamps and Segmented Output

7. Smart Splitting for Long Audio

🛠️ MCP Tools Overview

`transcribe` — Transcribe Audio Files

`transcribe_with_split` — Long Audio Segmented Transcription

`get_model_info` — View Current Model Information

`check_health` — Service Health Check

📦 Quick Start

macOS One-Click Launch

Windows Global Command

Configure Claude Desktop

💡 Usage Examples

Example 1: Transcribe Local Audio

Example 2: Generate Subtitle Files

Example 3: Transcribe Foreign Language Content

🔧 Advanced Configuration

🎯 Use Cases

1. Meeting Notes Organization

2. Podcast and Video Subtitles

3. Interview Content Processing

4. Personal Voice Notes

5. Learning Material Processing

🔮 Future Roadmap

🤝 Contributing

📄 License

🤔 What is MCP? Why Does It Matter?

🚀 Core Features of Whisper MCP

1. Local-First, Privacy-First

2. Dual Backend Architecture, Cross-Platform Support

3. Automatic Hardware Acceleration

4. High-Quality Transcription Model

5. Multiple Audio Format Support

6. Timestamps and Segmented Output

7. Smart Splitting for Long Audio

🛠️ MCP Tools Overview

transcribe — Transcribe Audio Files

transcribe_with_split — Long Audio Segmented Transcription

get_model_info — View Current Model Information

check_health — Service Health Check

📦 Quick Start

macOS One-Click Launch

Windows Global Command

Configure Claude Desktop

💡 Usage Examples

Example 1: Transcribe Local Audio

Example 2: Generate Subtitle Files

Example 3: Transcribe Foreign Language Content

🔧 Advanced Configuration

🎯 Use Cases

1. Meeting Notes Organization

2. Podcast and Video Subtitles

3. Interview Content Processing

4. Personal Voice Notes

5. Learning Material Processing

🔮 Future Roadmap

🤝 Contributing

📄 License

`transcribe` — Transcribe Audio Files

`transcribe_with_split` — Long Audio Segmented Transcription

`get_model_info` — View Current Model Information

`check_health` — Service Health Check