Prompting Whisper for Improved Verbatim Transcription and End-to-end Miscue Detection

*Equal Contributors

Identifying mistakes (i.e., miscues) made while reading aloud is commonly approached post-hoc by comparing automatic speech recognition (ASR) transcriptions to the target reading text. However, post-hoc methods perform poorly when ASR inaccurately transcribes verbatim speech. To improve on current methods for reading error annotation, we propose a novel end-to-end architecture that incorporates the target reading text via prompting and is trained for both improved verbatim transcription and direct miscue detection. Our contributions include: first, demonstrating that incorporating reading text through prompting benefits verbatim transcription performance over fine-tuning, and second, showing that it is feasible to augment speech recognition tasks for end-to-end miscue detection. We conducted two case studies, children's read-aloud and adult atypical speech, and found that our proposed strategies improve verbatim transcription and miscue detection compared to the current state of the art.
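To make the post-hoc baseline concrete, the following is a minimal sketch of comparing an ASR transcription to the target reading text at the word level. It is an illustration only, not the authors' implementation: the function name `label_miscues` and the three label names are assumptions, and the alignment uses Python's standard `difflib` rather than any tool named in the paper.

```python
from difflib import SequenceMatcher


def label_miscues(target_text, asr_transcript):
    """Align target words with transcribed words and label each difference.

    Hypothetical helper for illustration; returns a list of
    (label, target_words, spoken_words) tuples.
    """
    target = target_text.lower().split()
    spoken = asr_transcript.lower().split()
    # autojunk=False keeps frequent words (e.g. "the") from being ignored
    matcher = SequenceMatcher(a=target, b=spoken, autojunk=False)
    labels = {
        "replace": "substitution",  # reader said something else
        "delete": "omission",       # target word was not spoken
        "insert": "insertion",      # extra word was spoken
    }
    miscues = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        miscues.append((labels[tag], target[i1:i2], spoken[j1:j2]))
    return miscues


if __name__ == "__main__":
    for miscue in label_miscues("the cat sat on the mat", "the cat sit on mat"):
        print(miscue)
```

Because every label here is derived from the transcript, any word the recognizer silently "corrects" toward the target text hides a miscue, which is the weakness of post-hoc comparison that the abstract describes.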
