whisper free download

Showing 79 open source projects for "whisper"

1

Whisper

Robust Speech Recognition via Large-Scale Weak Supervision

OpenAI Whisper is a general-purpose speech recognition model trained on a large dataset of diverse audio. It is a Transformer sequence-to-sequence model trained jointly on several speech processing tasks: multilingual speech recognition, speech translation, spoken language identification, and voice activity detection.

Downloads: 69 This Week

Last Update: 2025-06-26
See Project
2

Faster Whisper

Faster Whisper transcription with CTranslate2

Faster Whisper is an optimized implementation of the Whisper speech recognition model designed to deliver significantly faster inference while maintaining comparable accuracy. It leverages efficient inference engines and optimized computation strategies to reduce latency and resource consumption. The system is particularly useful for real-time or large-scale transcription tasks where performance is critical.

Downloads: 14 This Week

Last Update: 2026-04-06
See Project
3

Whisper-WebUI

A Web UI for easy subtitle generation using Whisper models

Whisper WebUI is an open-source browser-based interface that simplifies the use of Whisper speech recognition models by providing an intuitive graphical environment for transcription, translation, and subtitle generation. Built with Gradio, it allows users to upload audio or video files, process them locally, and generate accurate text outputs without relying on command-line tools.

Downloads: 12 This Week

Last Update: 2026-03-18
See Project
4

whisper-timestamped

Multilingual Automatic Speech Recognition with word-level timestamps

Multilingual Automatic Speech Recognition with word-level timestamps and confidence. Whisper is a set of multi-lingual, robust speech recognition models trained by OpenAI that achieve state-of-the-art results in many languages. Whisper models were trained to predict approximate timestamps on speech segments (most of the time with 1-second accuracy), but they cannot originally predict word timestamps. This repository proposes an implementation to predict word timestamps and provide a more accurate estimation of speech segments when transcribing with Whisper models. ...

Downloads: 6 This Week

Last Update: 2025-09-09
See Project
5

whisper.cpp

Port of OpenAI's Whisper model in C/C++

whisper.cpp is a lightweight C/C++ reimplementation of OpenAI’s Whisper automatic speech recognition (ASR) model, designed for efficient, standalone transcription without external dependencies. The entire high-level implementation of the model is contained in whisper.h and whisper.cpp; the rest of the code is part of the ggml machine learning library. The quick-start command downloads the base.en model converted to a custom ggml format and runs inference on all .wav samples in the samples folder. whisper.cpp also supports integer quantization of the Whisper ggml models. ...

Downloads: 358 This Week

Last Update: 2026-03-19
See Project
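
The integer quantization mentioned in the whisper.cpp entry can be illustrated with a small sketch. This is a generic block-wise symmetric 8-bit scheme, not ggml's actual storage format; the function names and the block size of 32 are assumptions chosen for illustration.

```python
from typing import List, Tuple


def quantize_block(values: List[float]) -> Tuple[float, List[int]]:
    """Quantize one block of weights to signed 8-bit codes plus one scale."""
    max_abs = max((abs(v) for v in values), default=0.0)
    if max_abs == 0.0:
        # An all-zero block needs no scale.
        return 0.0, [0] * len(values)
    scale = max_abs / 127.0
    # Clamp to the symmetric int8 range after rounding.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return scale, codes


def dequantize_block(scale: float, codes: List[int]) -> List[float]:
    """Recover approximate float weights from a quantized block."""
    return [c * scale for c in codes]


def quantize(weights: List[float], block_size: int = 32) -> List[Tuple[float, List[int]]]:
    """Split weights into fixed-size blocks (hypothetical size) and quantize each."""
    return [
        quantize_block(weights[i:i + block_size])
        for i in range(0, len(weights), block_size)
    ]
```

Each block stores one float scale and one small integer per weight, which is where the memory saving comes from; the reconstruction error per weight is bounded by half the block's scale.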
6

Insanely Fast Whisper

An opinionated CLI to transcribe Audio files w/ Whisper on-device

Insanely Fast Whisper is a high-performance command-line tool designed to dramatically accelerate speech-to-text transcription using OpenAI’s Whisper models on local hardware. It leverages modern optimizations such as batch processing, mixed precision, and advanced attention mechanisms like Flash Attention to significantly reduce inference time while maintaining high transcription accuracy.

Downloads: 1 This Week

Last Update: 2026-03-26
See Project
7

WhisperLive

A nearly-live implementation of OpenAI's Whisper

WhisperLive is a “nearly live” implementation of OpenAI’s Whisper model focused on real-time transcription. It runs as a server–client system in which the server hosts a Whisper backend and clients stream audio to be transcribed with very low delay. The project supports multiple inference backends, including Faster-Whisper, NVIDIA TensorRT, and OpenVINO, allowing you to target GPUs and different CPU architectures efficiently.

Downloads: 13 This Week

Last Update: 2026-03-17
See Project
8

Voice-Pro

Comprehensive Gradio WebUI for audio processing

Voice-Pro is a Gradio WebUI for transcription, translation, and text-to-speech. It can be installed with one click: the installer creates a virtual environment using Miniconda that runs completely separately from the Windows system (fully portable). It supports real-time transcription and translation, as well as batch mode.

1 Review

Downloads: 36 This Week

Last Update: 2025-12-05
See Project
9

Go OpenAI

OpenAI ChatGPT, GPT-3, GPT-4, DALL·E, Whisper API wrapper for Go

This library provides Go clients for the OpenAI API, wrapping ChatGPT, GPT-3, GPT-4, DALL·E, and Whisper.

Downloads: 0 This Week

Last Update: 2025-08-29
See Project
10

WhisperKit

On-device Speech Recognition for Apple Silicon

WhisperKit is a Swift package that integrates OpenAI's popular Whisper speech recognition model with Apple's CoreML framework for efficient, local inference on Apple devices. Whisper has pulled forward a future in which fast, free, and virtually error-free translation and transcription are ubiquitous. It has inspired numerous developers to improve and deploy it with minimal friction and maximum performance.

Downloads: 4 This Week

Last Update: 2026-04-01
See Project
11

WhisperSpeech

An Open Source text-to-speech system built by inverting Whisper

WhisperSpeech is an open-source text-to-speech system created by “inverting” OpenAI’s Whisper, reusing its strengths as a semantic audio model to generate speech instead of only transcribing it. The project aims to be for speech what Stable Diffusion is for images: powerful, hackable, and safe for commercial use, with code under Apache-2.0/MIT and models trained only on properly licensed data. Its architecture follows a token-based, multi-stage pipeline inspired by AudioLM and SPEAR-TTS: Whisper is used to produce semantic tokens, EnCodec compresses the waveform into acoustic tokens, and Vocos reconstructs high-fidelity audio from those tokens. ...

Downloads: 2 This Week

Last Update: 2025-11-28
See Project
12

Meetily

Privacy first, AI meeting assistant with 4x faster Parakeet/Whisper

...It’s built for organizations that want meeting intelligence without sending recordings or transcripts to third-party cloud services, which helps address compliance and data sovereignty requirements. The app supports live transcription with local model options (including Whisper- and Parakeet-based workflows) and presents the transcript as the meeting happens, making it useful both for note-taking and accessibility. After or during the session, it can produce structured, AI-generated summaries, and it’s designed to be flexible about where that summarization comes from, supporting local providers as well as external endpoints when allowed by policy.

Downloads: 24 This Week

Last Update: 2026-02-11
See Project
13

WhisperJAV

Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD

...WhisperJAV introduces a specialized pipeline that separates text generation from timestamp alignment, allowing the system to generate transcripts and then align them with audio using forced alignment techniques. The framework supports several speech recognition models, including Qwen-based ASR systems and fine-tuned Whisper models trained on domain-specific dialogue.

Downloads: 19 This Week

Last Update: 2026-04-09
See Project
14

swords for whisper

Downloads: 4 This Week

Last Update: 2025-12-15
See Project
15

Whisper-Studio

Another whisper wrapper, built fully in C++, with some neat features.

A native, lightweight C++ application for OpenAI's Whisper, with a few new things like transcribing audio in real time, identifying speakers, auto-pasting transcriptions, and a few other things. It's not the prettiest app, I suck at design, but it gets the job done.

Downloads: 1 This Week

Last Update: 2026-02-07
See Project
16

Handy STT

A free, open source, and extensible speech-to-text application

...Handy allows users to start transcription instantly using a configurable keyboard shortcut—press to record, release to transcribe—and automatically pastes the resulting text into any active text field. Its backend leverages OpenAI’s Whisper models for GPU-accelerated speech recognition and Parakeet V3 for efficient CPU-only transcription with automatic language detection. To further refine accuracy and responsiveness, Handy integrates Silero’s Voice Activity Detection (VAD) for silence filtering, ensuring only speech segments are processed.

Downloads: 67 This Week

Last Update: 2026-04-02
See Project
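
The silence filtering described in the Handy entry can be sketched with a simple energy threshold. Silero VAD itself is a neural model, so this is only a stand-in that shows the idea of passing speech-only regions to the recognizer; the function names, the frame size of 160 samples, and the threshold of 0.02 are all assumptions.

```python
import math
from typing import List, Sequence, Tuple


def frame_rms(frame: Sequence[float]) -> float:
    """Root-mean-square energy of one frame of samples in [-1.0, 1.0]."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def speech_regions(
    samples: Sequence[float],
    frame_size: int = 160,    # hypothetical: 10 ms at 16 kHz
    threshold: float = 0.02,  # hypothetical energy cutoff
) -> List[Tuple[int, int]]:
    """Return (start, end) sample offsets of runs of frames above the threshold."""
    regions: List[Tuple[int, int]] = []
    start = None
    for offset in range(0, len(samples), frame_size):
        is_speech = frame_rms(samples[offset:offset + frame_size]) >= threshold
        if is_speech and start is None:
            start = offset                    # a speech run begins
        elif not is_speech and start is not None:
            regions.append((start, offset))   # the run ended at this frame
            start = None
    if start is not None:
        regions.append((start, len(samples)))  # audio ended during speech
    return regions
```

Only the returned regions would be sent on for transcription, so long silences cost no recognition time.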
17

WhisperX

Automatic Speech Recognition with Word-level Timestamps

WhisperX is an advanced speech recognition system built on top of OpenAI’s Whisper model, designed to improve transcription accuracy and timing precision for long-form audio. It addresses key limitations of standard Whisper implementations by introducing voice activity detection and forced alignment techniques to produce word-level timestamps. The system enables batched inference, significantly increasing transcription speed while maintaining high accuracy.

Downloads: 15 This Week

Last Update: 2026-04-06
See Project
18

Hyprnote

Local-first AI Notepad for Private Meetings

Hyprnote is an open-source, privacy-first AI notepad app designed for taking notes during meetings: it transcribes audio (microphone and system) and generates context-rich summaries using on-device AI models like Whisper and HyprLLM, all without any data leaving your machine. Listens to your meetings while you write. Crafts smart summaries based on your quick notes. Runs completely offline using open-source models like Whisper or HyprLLM. Use approved third-party APIs like Gemini, Claude, or Azure-hosted GPT.

Downloads: 20 This Week

Last Update: 2 days ago
See Project
19

Buzz

Transcribe and translate audio offline on your personal computer

Buzz transcribes and translates audio to text offline using OpenAI's Whisper. Import audio and video files into Buzz and export them as TXT, SRT, or VTT files. Buzz supports Whisper, Whisper.cpp, Faster Whisper, Whisper-compatible models from the Hugging Face repository, and the OpenAI Whisper API.

Get Linux versions from:
- https://flathub.org/apps/io.github.chidiwilliams.Buzz
- https://snapcraft.io/buzz

Home page of Buzz: https://github.com/chidiwilliams/buzz

Note for Windows: The app is not signed, so you will get a warning when you install it. ...

1 Review

Downloads: 4,937 This Week

Last Update: 2026-03-14
See Project
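
To make the SRT export mentioned in the Buzz entry concrete, here is a minimal sketch of turning timed segments into SRT text. It is not Buzz's own code; the function names are hypothetical, and the input is assumed to be a list of dicts with "start", "end" (in seconds), and "text" keys, the shape Whisper-style transcription results commonly use.

```python
from typing import Dict, List, Union

Segment = Dict[str, Union[float, str]]


def format_timestamp(seconds: float) -> str:
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: List[Segment]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(float(seg["start"]))
        end = format_timestamp(float(seg["end"]))
        text = str(seg["text"]).strip()
        cues.append(f"{index}\n{start} --> {end}\n{text}\n")
    return "\n".join(cues)
```

VTT differs mainly in its "WEBVTT" header line and in using a period rather than a comma before the milliseconds, so the same structure could be adapted for it.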
20

HeartMuLa

A Family of Open Sourced Music Foundation Models

...The project also includes HeartCodec, a music codec optimized for high reconstruction fidelity, enabling efficient tokenization and reconstruction workflows that are critical for training and generation pipelines. For text extraction from audio, it provides HeartTranscriptor, a Whisper-based model tuned specifically for lyrics transcription, which helps bridge generated or recorded audio back into structured text. It also introduces HeartCLAP, which aligns audio and text into a shared embedding space.

Downloads: 15 This Week

Last Update: 2026-04-10
See Project
21

Scriberr

Self-hosted AI audio transcription

...Unlike cloud-based transcription services, Scriberr runs entirely on the user’s machine, ensuring that sensitive recordings are never sent to third-party servers and remain fully under user control. It leverages modern speech recognition models such as Whisper and other advanced architectures to deliver precise transcripts with word-level timing and speaker identification. The application includes a polished user interface that simplifies the management of recordings, transcripts, and annotations, making it suitable for both casual users and professionals handling large volumes of audio. ...

Downloads: 11 This Week

Last Update: 2026-03-19
See Project
22

Speech Note

Speech Note Linux app. Note taking, reading and translating

...All processing is done locally, which means audio, text, and translations never leave the device, emphasizing strong privacy guarantees. The application supports multiple STT engines such as Coqui STT (DeepSpeech fork), Vosk, whisper.cpp, Faster Whisper, and april-asr, giving users flexibility in accuracy, speed, and hardware requirements. For text-to-speech, it can plug into a wide range of engines including espeak-ng, MBROLA, Piper, RHVoice, Coqui TTS, Mimic 3, WhisperSpeech, Kokoro, Parler-TTS, F5-TTS, and even classic S.A.M., making it highly customizable in terms of voices and languages.

Downloads: 23 This Week

Last Update: 4 days ago
See Project
23

AutoCut

Cut videos with a text editor

...This approach transforms video editing into a textual editing task, greatly lowering the barrier to editing for users who find traditional video editors complex or unintuitive. AutoCut supports multiple transcription backends, including Whisper and faster-whisper modes, allowing users to choose based on speed or accuracy needs. After editing the transcript text, the corresponding video clips are merged into the final output, and the tool also produces matching subtitle files. Its command-line interface can be integrated into scripts, making it suitable for automated workflows or batch processing.

Downloads: 2 This Week

Last Update: 2026-02-06
See Project
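
The AutoCut entry describes keeping only the transcript lines a user leaves in place and merging the matching clips. One way that step might look is sketched below, under the assumption that each retained line carries a (start, end) time in seconds; the function name and the max_gap parameter are hypothetical and not taken from AutoCut.

```python
from typing import Iterable, List, Tuple

Interval = Tuple[float, float]


def merge_kept_intervals(kept: Iterable[Interval], max_gap: float = 0.0) -> List[Interval]:
    """Merge kept (start, end) spans that touch or lie within max_gap seconds."""
    merged: List[Interval] = []
    for start, end in sorted(kept):
        if merged and start - merged[-1][1] <= max_gap:
            # Extend the previous clip instead of starting a new one.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged
```

The resulting list is the cut list: each interval becomes one clip, and the clips are concatenated in order to form the edited video.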
24

VideoCaptioner

AI-powered tool for generating, optimizing, and translating subtitles

...It integrates speech recognition, language processing, and translation technologies to automatically generate and refine subtitles from video or audio sources. VideoCaptioner uses speech-to-text engines such as Whisper variants to transcribe spoken content and convert it into subtitle text with accurate timestamps. After transcription, large language models are used to intelligently restructure subtitles into natural sentences, correct wording, and improve readability for viewers. It can also translate subtitles into other languages while preserving the original timing, making it suitable for multilingual video publishing and accessibility. ...

Downloads: 10 This Week

Last Update: 2026-03-28
See Project
25

Note67

A private, local meeting notes assistant

...Built with a cross-platform architecture using Rust (via Tauri) for backend logic and a TypeScript/React frontend, it prioritizes privacy by performing audio transcription locally with Whisper models and generating summaries with locally-hosted AI, eliminating the need to send sensitive meeting content to external servers. Users can record meetings directly from their microphone, view live transcriptions, filter by speaker, and export structured summaries, making it useful for professionals who need searchable, organized records of discussions. ...

Downloads: 4 This Week

Last Update: 1 day ago
See Project