We are building a specialized speech recognition system focused on understanding German dialects. Current speech-to-text solutions from major providers like OpenAI and Google struggle with regional dialects, so we’re training our own AI model to recognize and transcribe them accurately.
To train this model effectively, we need voice samples from native speakers across the German-speaking regions. We’ve built both an Android app and a web platform to make it easy for speakers to contribute recordings.
See the announcement in German for more details.
Users can visit our platform and record themselves speaking naturally in their local dialect. The recordings are automatically processed, labeled with location data, and used to train our speech recognition model. Each contribution helps improve accuracy across German dialects.
All submissions are reviewed for quality and accuracy before they are used in model training. We are focusing on Bavarian dialects first, but we welcome contributions from speakers of any German dialect.
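As a rough illustration of the flow described above (record, label with location, review, then train), here is a minimal Python sketch. It is not the project's actual code: the `Recording` class, its fields, and the helper functions are hypothetical names chosen for this example, and the sample data is made up.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Recording:
    """One contributed voice sample (hypothetical schema)."""
    audio_path: str   # assumed location of the uploaded audio file
    region: str       # location label attached during processing
    approved: bool    # result of the review step


def select_training_set(recordings):
    """Keep only recordings that passed review."""
    return [r for r in recordings if r.approved]


def count_by_region(recordings):
    """Count recordings per dialect region to see where data is thin."""
    return Counter(r.region for r in recordings)


# Made-up sample data for illustration only.
samples = [
    Recording("rec_001.wav", "Bavaria", approved=True),
    Recording("rec_002.wav", "Bavaria", approved=True),
    Recording("rec_003.wav", "Saxony", approved=False),
]

training = select_training_set(samples)
coverage = count_by_region(training)
print(coverage)
```

Counting approved recordings per region is one simple way to see which dialect areas still need more contributions.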
The project aims to preserve and recognize the linguistic diversity of German-speaking regions while advancing AI technology that works well for all speakers, not just those speaking standard German.
Learn more and contribute at the Dialektsammler project page.