AI Skills for Whisper
Discover 46+ speech-to-text skills
Browse AI Skills for Whisper
sickn33 / daily
Provides a comprehensive reference for building real-time voice and multimodal AI applications using Daily, enabling seamless integration of AI services.
sickn33 / audio-transcriber
Automates audio-to-text transcription, generating professional Markdown documentation and summaries for meetings and lectures.
nicepkg / transcribe-and-analyze
Transcribes audio and video from URLs using WhisperKit and analyzes transcripts with AI upon request.
aiskillstore / video-processor
Processes video files with audio extraction, format conversion, and transcription using FFmpeg and OpenAI's Whisper model.
Dokhacgiakhoa / voice-ai-engine-development
Architects real-time Voice AI agents with low-latency communication, utilizing advanced speech processing and AI technologies.
Microck / Video Processor
Processes video files with audio extraction, format conversion, and transcription using FFmpeg and OpenAI's Whisper model.
aiskillstore / audio-transcriber
Transforms audio recordings into structured Markdown documentation with intelligent summaries and speaker identification.
majiayu000 / audio-transcribe
Transcribes audio and video to text using Whisper, supporting word-level timestamps for accurate subtitle generation.
majiayu000 / gastrohem-media-processor
Automates the processing of audio and image files from WhatsApp, providing transcription and OCR capabilities for efficient media management.
GeorgeDoors888 / bilibili-transcript
Transcribes Bilibili videos to text with high accuracy, providing detailed summaries and formatted transcripts in multiple languages.
majiayu000 / create-movie
Facilitates comprehensive movie creation through a structured workflow, utilizing AI tools for research, scripting, and assembly.
GeorgeDoors888 / expression-coach
Enhances personal expression skills through voice practice, real-time feedback, and data analysis for effective communication.
mattnigh / Video Processor
Processes video files with audio extraction, format conversion, and transcription using FFmpeg and OpenAI's Whisper model.
mattnigh / gastrohem-media-processor
Automates the processing of audio and image files from WhatsApp, providing transcription and OCR capabilities for efficient media management.
majiayu000 / faion-multimodal-ai
Facilitates multimodal AI applications including image/video generation and speech synthesis for diverse use cases.
alsk1992 / voice
Enables voice recognition and control for trading applications, enhancing user interaction through wake words and speech commands.
Activer007 / Video Processor
Processes video files for audio extraction, format conversion, and transcription using FFmpeg and OpenAI's Whisper model.
majiayu000 / groq-inference
Enables ultra-fast LLM inference using the Groq API for real-time applications in chat, vision, and audio processing.
diegosouzapw / audio-transcriber
Transforms audio recordings into structured Markdown documentation with intelligent summaries and speaker identification.
MudassarAbrar / audio-transcriber
Transforms audio recordings into structured Markdown documentation with intelligent summaries and speaker identification.