Moonshine, the new state of the art for speech to text
Can you imagine using a keyboard where it took a key press two seconds to show up on screen? That’s the typical latency for most voice interfaces, so it’s no wonder they’ve failed to catch on for most people. Pete Warden writes about open sourcing Moonshine, a new speech to text model that returns results faster and more efficiently than the current state of the art, OpenAI’s Whisper, while matching or exceeding its accuracy.
The paper has the full details, but the key improvements are an architecture that offers an overall 1.7x speed boost compared to Whisper, and a flexibly-sized input window.
To understand what that means in practice, you can check out the Torre translator (video below). The speed of Moonshine provides almost instant translations as people are talking, making for a conversation that’s much more natural than existing solutions.
Have an amazing project to share? The Electronics Show and Tell is every Wednesday at 7:30pm ET! To join, head over to YouTube and check out the show’s live chat and our Discord!
Python for Microcontrollers – Adafruit Daily — Python on Microcontrollers Newsletter: A New Arduino MicroPython Package Manager, How-Tos and Much More! #CircuitPython #Python #micropython @ThePSF @Raspberry_Pi
EYE on NPI – Adafruit Daily — EYE on NPI Maxim’s Himalaya uSLIC Step-Down Power Module #EyeOnNPI @maximintegrated @digikey