Automatic speech recognition with Nvidia NeMo

Overview

Combine the power of Label Studio with Nvidia’s NeMo to enhance work done by researchers and practitioners in automatic speech recognition (ASR), text-to-speech synthesis (TTS), large language models (LLMs), and natural language processing (NLP).

With the community-created integration, you can generate audio pre-annotations and automatically transcribe a selected speech region in Label Studio.
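As a rough illustration of what a pre-annotation for one selected speech region could look like, the sketch below wraps a transcript in a Label Studio-style prediction dictionary. It is not the integration's actual code: `build_prediction` and `fake_transcribe` are hypothetical helpers, the tag names `transcription` and `audio` are assumptions about the labeling configuration, and the transcriber is a stand-in for a real ASR call such as a NeMo model.

```python
from typing import Callable, Dict, List


def build_prediction(
    start: float,
    end: float,
    transcribe: Callable[[float, float], str],
    from_name: str = "transcription",   # assumed name of a TextArea control tag
    to_name: str = "audio",             # assumed name of the Audio object tag
    model_version: str = "asr-sketch",  # hypothetical version label
) -> Dict:
    """Wrap the transcript of one selected speech region as a pre-annotation."""
    if end <= start:
        raise ValueError("region end must be greater than region start")
    # Ask the (placeholder) ASR function for the text of this region only.
    text = transcribe(start, end).strip()
    result: List[Dict] = [
        {
            "from_name": from_name,
            "to_name": to_name,
            "type": "textarea",
            # Region boundaries in seconds plus the transcribed text.
            "value": {"start": start, "end": end, "text": [text]},
        }
    ]
    return {"model_version": model_version, "result": result}


def fake_transcribe(start: float, end: float) -> str:
    # Stand-in for a real ASR model; always returns the same string.
    return " hello world "


prediction = build_prediction(1.5, 3.0, fake_transcribe)
```

In a real setup, the placeholder transcriber would be replaced by a call into the ASR model, and the tag names would need to match the project's labeling configuration.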

Benefits

Using NeMo for pre-annotation in Label Studio has the following benefits:

  • Reliability: NeMo can provide highly accurate speech-to-text transcriptions.
  • Speed: Using NeMo for transcriptions speeds up the labeling process by offloading most transcription tasks to an ML model, freeing annotators to work on more difficult transcriptions.

Related Integrations

  • Tesseract: Automated bounding box OCR
  • YOLO: YOLO data annotation format
  • LangChain: Evaluate LLM Output Quality
  • TensorFlow: Open source deep learning framework