Hi everyone!
A few days ago I released Whishper, a new version of a project I’ve been working on for about a year now.
It’s a self-hosted audio transcription suite: you can transcribe audio to text, generate subtitles, translate subtitles, and edit them, all from one UI and 100% locally (it even works offline).
I hope you like it, check out the website for self-hosting instructions: https://whishper.net
Does this need to connect to OpenAI, or does it function fully independently? It’s for offline use.
How does it compare to https://github.com/guillaumekln/faster-whisper?
I’ve been using Faster Whisper for a while locally, and it’s worked out better than raw Whisper and benchmarks really well. Just curious if there are any reasons to switch.
Whishper uses faster-whisper in the backend.
Simply put, it is a complete UI for Faster-Whisper with extra features like transcription translation, editing, download options, etc…
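For anyone curious what that looks like under the hood, this is roughly the call Whishper’s backend builds on (a minimal sketch of faster-whisper’s Python API; the model size and file name here are just placeholders):

```python
from faster_whisper import WhisperModel

# Load a CTranslate2-converted Whisper model; int8 keeps it CPU-friendly.
model = WhisperModel("small", device="cpu", compute_type="int8")

# transcribe() returns a lazy generator of timestamped segments.
segments, info = model.transcribe("audio.mp3", beam_size=5)
print(f"Detected language: {info.language} (p={info.language_probability:.2f})")
for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
```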
How does Whisper do at transcribing technical material, like for lawyers, doctors, engineers, and whatnot? Or speakers with heavy accents?
Whisper models have a very good WER (word error rate) for languages like Spanish, English, French… and the English-only models improve accuracy further. Check out this page in the docs:
https://whishper.net/reference/models/#languages-and-accuracy
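If your audio is English, switching to one of the .en checkpoints is a one-line change with faster-whisper (a quick sketch; "small.en" and the file name are just examples):

```python
from faster_whisper import WhisperModel

# English-only checkpoints ("tiny.en", "base.en", "small.en", "medium.en")
# typically score a lower WER on English audio than their multilingual twins.
model = WhisperModel("small.en", device="cpu", compute_type="int8")
segments, _ = model.transcribe("interview.wav")
print(" ".join(segment.text.strip() for segment in segments))
```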
Congratulations on the launch and thanks for making this open-source! Not sure if this supports searching through all transcriptions yet, but that’s what I’d find really helpful. E.g. search for a keyword in all podcast episodes.
Oh, awesome! Does it do speaker detection? That’s been one of my main gripes with Whisper.
Unfortunately, not yet. Whisper itself isn’t able to do that. There are currently few viable solutions for integration (I’m looking at this one), but all the solutions I know of need a GPU for this.
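To give an idea of what an integration might look like, here is a rough sketch that merges faster-whisper segments with a pyannote.audio diarization pipeline. This is not what Whishper ships: the model names, the Hugging Face token, and the midpoint-overlap merge are all assumptions, and the pyannote pipeline realistically wants a GPU:

```python
from faster_whisper import WhisperModel
from pyannote.audio import Pipeline

AUDIO = "meeting.wav"

# The pyannote pipeline is gated: you have to accept its license on
# Hugging Face and pass a token. It is also slow without CUDA.
diarization = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1", use_auth_token="hf_..."
)(AUDIO)

model = WhisperModel("small", device="cuda", compute_type="float16")
segments, _ = model.transcribe(AUDIO)

# Naive merge: tag each transcript segment with the speaker whose
# diarization turn contains the segment's midpoint.
for seg in segments:
    mid = (seg.start + seg.end) / 2
    speaker = next(
        (label for turn, _, label in diarization.itertracks(yield_label=True)
         if turn.start <= mid <= turn.end),
        "UNKNOWN",
    )
    print(f"{speaker}: {seg.text.strip()}")
```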