Audio Data Annotation

The document outlines audio annotation guidelines, detailing the process of labeling audio files for machine learning applications. It includes best practices, definitions, and specific instructions for annotating filler words, overlapping conversations, inaudible segments, and code switching. The guidelines emphasize clarity, consistency, and quality control throughout the annotation process.

Uploaded by

nicojaywambugu
Copyright
© All Rights Reserved

AUDIO ANNOTATION

Audio Annotation Guidelines

JANUARY 24, 2025
LOCAL DEVELOPMENT RESEARCH INSTITUTE (LDRI)

TABLE OF CONTENTS
AUDIO ANNOTATION GUIDELINES
AUDIO ANNOTATION
DEFINITIONS
GETTING STARTED WITH LABEL STUDIO
PROCESS OF AUDIO ANNOTATION
BEST PRACTICES AUDIO ANNOTATION
1. Clarity and Consistency
2. Avoid Bias
3. Quality Control
4. Collaborative Reviews
ANNOTATING FILLER WORDS
1. Understand Filler Words
2. Establish Guidelines for Inclusion or Exclusion
3. Steps for Annotating Filler Words
4. Review and Edit
ANNOTATING OVERLAPPING CONVERSATIONS
1. Understanding Overlapping Speech
2. Use Appropriate Notation
3. Segmenting Overlaps
4. Categorize Overlaps
5. Annotate with Context
6. Review and Edit Annotations
ANNOTATING INAUDIBLE SEGMENTS
CODE SWITCHING
FILLER WORDS
EXAMPLES
1. Filler Words
2. Overlapping
3. Inaudible
4. Code Switching

AUDIO ANNOTATION GUIDELINES

AUDIO ANNOTATION

Audio annotation is the process of adding labels or metadata to audio files to
make them more analyzable and useful for training machine learning models.

This manual provides a step-by-step guide for annotating audio data using Label
Studio, with instructions tailored to transcription and translation.

DEFINITIONS

1. Transcription

Converting spoken words into text.

2. Translation

Translation in Natural Language Processing is the process of converting text or
speech from one language to another using advanced computational techniques.

3. Annotation

The process of labeling data in the form of text, images, or audio so that models
can more easily interpret a given data source and recognize certain formats,
objects, information, or patterns in the future.

4. Audio annotation

Audio annotation is the process of adding meaningful labels or metadata to audio
data, enabling machines to better understand and analyze the content. This process
is essential for various applications, including speech recognition, emotion
identification, and sound classification.

5. Clean Verbatim Transcription

A clean verbatim transcription is also known as edited transcription or intelligent
transcription. This type of transcription does not change or paraphrase the
speaker’s meaning, but it leaves out unnecessary words. Elements that do not add
value to the content, such as filler words and stammering, are omitted. The
ultimate goal of this mode of transcription is to achieve a balance between
readability and completeness.

GETTING STARTED WITH LABEL STUDIO

1. Install Label Studio:
o Follow the installation guide provided on the Label Studio website.
o Ensure you have the necessary system requirements and dependencies
installed.
2. Set Up a Project:
o Open Label Studio and click on Create Project.
o Enter a name and description for your project.
o Select the data type as Audio.
3. Import Audio Files:
o Upload audio files in supported formats (e.g., WAV, MP3).
o You can upload files directly or connect to cloud storage for bulk
imports.
4. Define Annotation Configurations:
o Use Label Studio’s XML Configuration Editor to define the labels
and tools needed for your task.
o Examples of configurations for specific tasks are provided below.
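
As an illustration of step 4, a labeling configuration for transcription and translation might look like the sketch below. It is kept in a Python string so that it can be checked for well-formedness before it is pasted into the configuration editor. The tag names (View, Audio, Labels, Label, TextArea) come from Label Studio's tag set; the `name` values, the `$audio` task field, and the label list are assumptions chosen for illustration, not this project's actual settings.

```python
import xml.etree.ElementTree as ET

# Hypothetical labeling configuration: names and labels are illustrative only.
LABEL_CONFIG = """
<View>
  <Audio name="audio" value="$audio"/>
  <Labels name="segments" toName="audio">
    <Label value="Speech"/>
    <Label value="Inaudible"/>
  </Labels>
  <TextArea name="transcription" toName="audio" rows="4" editable="true" required="true"/>
  <TextArea name="translation" toName="audio" rows="4" editable="true" required="true"/>
</View>
"""


def control_names(config: str) -> list[str]:
    """Parse the config and return the `name` of every tag that declares one."""
    root = ET.fromstring(config.strip())
    return [el.attrib["name"] for el in root.iter() if "name" in el.attrib]


print(control_names(LABEL_CONFIG))
```

If the string is not well-formed XML, `ET.fromstring` raises a `ParseError`, which catches unbalanced tags before the configuration reaches the editor.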

PROCESS OF AUDIO ANNOTATION

1. Open the Label Studio dashboard:
o Use your email to log in to the Label Studio dashboard.
o Remember your password, as you’ll need it for future logins.

2. Select the appropriate project to work on:
o Projects are named and color-coded for easy identification.
o Verify that you’re working on the correct project.
o Any changes to project details will be communicated through the
official communication channels.

3. Listening:
o Press the Play button to listen to the audio.
o You can adjust the playback speed or rewind as needed.
o Listen carefully to ensure accuracy in transcription and translation.

4. Labelling:
In the provided text fields:
o Transcribe the audio content accurately.
o Translate the transcription into the required target language.

5. Follow Annotation Guidelines:
To ensure consistency and accuracy, adhere to the following guidelines:
● Write exactly what is spoken (verbatim).
● Use standard punctuation for better readability.
● For unclear or inaudible words, use:
o [inaudible] for completely unintelligible speech.
o [unclear] for partially unclear words.
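
A small check can help catch misspelled tags before a task is submitted. The sketch below is an assumption about how such a check might be written; it is not part of Label Studio. It knows only the two tags listed in this step, so a project that also uses the bracket notation for filler words, overlaps, or code switching would need to extend `ALLOWED_TAGS`.

```python
import re

# Tags named in the guideline above; extend this set for other bracket notations.
ALLOWED_TAGS = {"[inaudible]", "[unclear]"}


def unknown_tags(transcript: str) -> list[str]:
    """Return bracketed tags that are not in ALLOWED_TAGS, in order of appearance."""
    found = re.findall(r"\[[^\[\]]*\]", transcript)
    return [tag for tag in found if tag.lower() not in ALLOWED_TAGS]


print(unknown_tags("We planted [inaudable] before the rains [unclear]"))
```

An empty list means every bracketed tag in the transcript matches one of the allowed spellings.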

6. Review Annotations:
Double-check your work to ensure that:
o The transcription and translation are accurate.
o The required fields are completed.
Make corrections if necessary before submitting.
7. Submit the Task:
o Click on Submit to mark the task as complete.

BEST PRACTICES AUDIO ANNOTATION

1. CLARITY AND CONSISTENCY

● Ensure your annotations are clear and follow consistent labeling practices.
● Adhere strictly to the provided annotation guidelines to maintain
uniformity across tasks.

2. AVOID BIAS

● Annotate based solely on the content of the audio and its characteristics.
● Do not make subjective assumptions or infer meanings beyond what
is explicitly stated in the audio.

3. QUALITY CONTROL

● Regularly review your annotations to ensure they are accurate and
complete.
● Pay close attention to details like punctuation, verbatim transcription, and the
correct use of tags (e.g., [inaudible]).

4. COLLABORATIVE REVIEWS

● In team settings, participate in cross-checks and peer reviews to maintain
high-quality annotations.
● If you encounter uncertain sections or are unsure about the content:
o Do not annotate.
o Seek clarification through the designated communication channels
before proceeding.

ANNOTATING FILLER WORDS


Annotating filler words in transcripts involves careful consideration of when to
include or exclude these words based on their contribution to the overall meaning
and readability of the text. Here’s a structured approach to annotating filler words
effectively:

1. UNDERSTAND FILLER WORDS

● Filler words include expressions like "um," "uh," "like," "you know," "so,"
"mmh," and "eeh." They are often used in speech to fill pauses but do not add
substantive meaning.

2. ESTABLISH GUIDELINES FOR INCLUSION OR EXCLUSION


Include Filler Words:
● When they contribute to the speaker's tone or emotional state (e.g.,
"that’s interesting!").
● In contexts where the exact phrasing is important for analysis, such
as sociolinguistic studies.
Exclude Filler Words:
● When they do not add value to the sentence and can disrupt readability
(e.g., “I, um, think that’s a good idea” can be cleaned to “I think that’s a
good idea”).
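
The exclusion rule can be sketched as a small text-cleaning helper. This is an illustration under assumptions, not a prescribed tool: the `FILLERS` list is a hypothetical subset, and "like" and "so" are deliberately left out because they are often meaningful words that need a human decision.

```python
import re

# Fillers that are reasonably safe to drop automatically (assumed subset).
# "like" and "so" are excluded on purpose: they are often real words.
FILLERS = ["you know", "um", "uh", "mmh", "eeh"]

_FILLER_RE = re.compile(
    r",?\s*\b(?:" + "|".join(re.escape(f) for f in FILLERS) + r")\b,?",
    flags=re.IGNORECASE,
)


def clean_verbatim(text: str) -> str:
    """Remove listed fillers plus the commas around them, then tidy spacing."""
    without_fillers = _FILLER_RE.sub("", text)
    return re.sub(r"\s+", " ", without_fillers).strip()


print(clean_verbatim("I, um, think that's a good idea"))
```

Run on the example above, the helper returns "I think that's a good idea". Whether a filler should be dropped at all still depends on the inclusion rule, so automated cleaning of this kind is best treated as a first pass that an annotator reviews.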

3. STEPS FOR ANNOTATING FILLER WORDS


1. Listen Carefully: Play back the audio and identify filler words as you
transcribe.
2. Use Consistent Formatting: Use the allowed format for indicating filler
words (square brackets []) and apply it consistently throughout the
transcript. Example:
○ Original: “I was, um, thinking about going, you know, to the store.”
○ Annotated: “I was [um] thinking about going [you know] to the store.”
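
The bracket format in the example can be reproduced with a short helper, which may be useful for checking consistency across a batch of transcripts. This is a sketch under assumptions: the `FILLERS` list is an illustrative subset, and the helper is meant for raw text that has not been bracketed yet.

```python
import re

FILLERS = ["you know", "um", "uh", "mmh", "eeh"]  # illustrative subset

_FILLER_RE = re.compile(
    r",?\s*\b(" + "|".join(re.escape(f) for f in FILLERS) + r")\b,?",
    flags=re.IGNORECASE,
)


def bracket_fillers(text: str) -> str:
    """Wrap each listed filler in square brackets and drop its surrounding commas.

    Intended for raw text only: running it twice would bracket fillers again.
    """
    return _FILLER_RE.sub(lambda m: f" [{m.group(1)}]", text).strip()


print(bracket_fillers("I was, um, thinking about going, you know, to the store."))
```

For the original sentence above, the helper produces the same annotated form shown in the example.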

4. REVIEW AND EDIT


● After completing the initial transcription, review it for accuracy and
consistency. Ensure that filler words are annotated according to the
established guidelines.
● Consider having a second reviewer check for adherence to guidelines and
overall readability.

ANNOTATING OVERLAPPING CONVERSATIONS


To annotate overlapping conversations effectively, especially when transcribing
audio with multiple speakers, consider the following structured approach:

1. UNDERSTANDING OVERLAPPING SPEECH


Overlapping speech occurs when two or more speakers talk simultaneously. This can
include interruptions, back-channeling, or coordinated speech. Recognizing the type
of overlap is crucial for accurate annotation.

2. USE APPROPRIATE NOTATION


Use square brackets to indicate overlapping speech.

Example:

Agronomist: "The best [Speaker B: "when should we grow"] maize variety is duma."

3. SEGMENTING OVERLAPS
When transcribing overlapping speech, segment the audio into manageable parts:

● Identify Overlap Points: Clearly mark where the overlap begins and
ends using timestamps.
● Sequential Representation: For clarity, you can represent
overlapping speech in a sequential manner, indicating which speaker is
contributing at each point.
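
When each speaker's turns have start and end timestamps, the overlap points can be located programmatically. The sketch below assumes a simple `Segment` record with times in seconds; the structure and the sample timestamps are hypothetical and are not a Label Studio export format.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    speaker: str
    start: float  # seconds from the beginning of the audio
    end: float    # seconds from the beginning of the audio


def find_overlaps(segments):
    """Return (speaker_a, speaker_b, start, end) for each pair of segments
    from different speakers that overlap in time."""
    overlaps = []
    ordered = sorted(segments, key=lambda s: s.start)
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b.start >= a.end:
                break  # later segments start even later, so none can overlap a
            if a.speaker != b.speaker:
                overlaps.append((a.speaker, b.speaker, b.start, min(a.end, b.end)))
    return overlaps


# Made-up timestamps for illustration.
turns = [
    Segment("Agronomist", 0.0, 4.0),
    Segment("Speaker B", 1.5, 2.5),
    Segment("Agronomist", 5.0, 7.0),
]
print(find_overlaps(turns))
```

Each returned tuple gives the two speakers and the time window in which both are talking, which is the span to mark as an overlap.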

4. CATEGORIZE OVERLAPS
Establish categories for different types of overlaps:

● Back-channeling: Indicating understanding or agreement (e.g., "uh-huh,"
"yeah").
● Turn Stealing: When one speaker interrupts another.
● Anticipated Turn Taking: When a speaker prepares to take their
turn but overlaps momentarily.

5. ANNOTATE WITH CONTEXT


Provide context for each overlap:

● Include notes about the interaction dynamics (e.g., was the overlap
cooperative or interruptive?).
● Use tags or labels to indicate the nature of the overlap (e.g., "bck" for
back-channel, "tst" for turn stealing).
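
One way to keep overlap tags consistent is to define them once and reuse them. In the sketch below, "bck" and "tst" are the tags given above; the "att" tag for anticipated turn taking, the `OverlapNote` record, and the bracketed output format are assumptions made for illustration only.

```python
from dataclasses import dataclass
from enum import Enum


class OverlapType(Enum):
    BACK_CHANNEL = "bck"      # tag given in the guideline
    TURN_STEALING = "tst"     # tag given in the guideline
    ANTICIPATED_TURN = "att"  # assumed tag; the guideline does not name one


@dataclass
class OverlapNote:
    start: float              # seconds
    end: float                # seconds
    kind: OverlapType
    note: str = ""            # free-text context, e.g. cooperative or interruptive

    def tag(self) -> str:
        """Render the overlap as a short bracketed tag (hypothetical format)."""
        return f"[{self.kind.value} {self.start:.1f}-{self.end:.1f}]"


example = OverlapNote(1.5, 2.5, OverlapType.BACK_CHANNEL, note="cooperative")
print(example.tag(), example.note)
```

Using an enumeration means a mistyped tag fails immediately instead of ending up in the annotations.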

6. REVIEW AND EDIT ANNOTATIONS


After completing the initial annotation:

● Conduct a quality check to ensure accuracy and consistency.
● Consider having multiple annotators review the overlaps to reach a
consensus on labeling.

ANNOTATING INAUDIBLE SEGMENTS

Label Selection: When annotators encounter a segment that is inaudible, they
should select the "Inaudible" label from the labeling options.

Timestamping: Ensure that each segment is timestamped correctly so that the
inaudible portions can be easily identified in the audio file.
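
Timestamps for inaudible segments can be sanity-checked before review. The sketch below assumes each inaudible region is a `(start, end)` pair in seconds and that the total audio duration is known; both the representation and the function name are hypothetical.

```python
def validate_inaudible(regions, audio_duration):
    """Return a list of problems found in (start, end) regions marked Inaudible."""
    problems = []
    for start, end in regions:
        if start < 0 or end > audio_duration:
            problems.append(f"{start}-{end}: outside the audio (0-{audio_duration})")
        elif start >= end:
            problems.append(f"{start}-{end}: start must be before end")
    return problems


# Made-up regions for a 10-second clip.
print(validate_inaudible([(3.0, 5.5), (8.0, 12.0)], audio_duration=10.0))
```

An empty list means every region lies inside the audio and has a positive length.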

CODE SWITCHING

Code switching is the switching from one language to another within a conversation.

To annotate code switching, place [cs] as a prefix before the words that are not in
the primary language and [cs] as a suffix after the last such word.

Example

No wĩ ta rĩu ona Nairobi ĩyo ũnjĩĩre, [cs]“Faith, nitumie shamba yako, nipigie simu
shamba yako ikiwa na mahindi ama na maharagwe unitumie,”[cs] nĩ ngũhota
gũtũma na wone ũrĩ o na
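
Because the [cs] marker is used both to open and to close a code-switched span, markers must always come in pairs. The sketch below, a hypothetical helper and not part of any tool, extracts the marked spans and reports unbalanced markers. The sample string is an abridged version of the example above.

```python
def code_switched_spans(transcript: str) -> list[str]:
    """Return the text between each opening [cs] and its closing [cs]."""
    parts = transcript.split("[cs]")
    # n markers split the text into n + 1 parts, so an even part count
    # means an odd number of markers, i.e. one span was never closed.
    if len(parts) % 2 == 0:
        raise ValueError("unbalanced [cs] markers: every opening [cs] needs a closing [cs]")
    return [part.strip() for part in parts[1::2]]


sample = "ũnjĩĩre, [cs]Faith, nitumie shamba yako[cs] nĩ ngũhota"
print(code_switched_spans(sample))
```

The spans returned are the portions not in the primary language, which makes it easy to review them separately.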

Adapted words

Create a catalogue of adapted words that are derived from English or Kiswahili.

Transcribe the borrowed word following the phonological rules of the target
language.

FILLER WORDS

Mmh - encouraging, agreeing, thinking

Uh - shock, surprise

Ehe - encouraging

Eeh - thinking

EXAMPLES

1. FILLER WORDS

2. OVERLAPPING

3. INAUDIBLE

4. CODE SWITCHING
