ai
Assistants
1. Problem Identification
2. AI Approach Used
1. Data Collection:
a. Public Datasets: Examples include LibriSpeech (English
audiobooks), Common Voice (multilingual crowd-
sourced data), and TIMIT (phoneme recognition).
b. Proprietary Datasets: Companies like Google and
Amazon collect voice data from user interactions with
their virtual assistants.
c. Diverse Data: To ensure inclusivity, datasets must
include multiple languages, accents, genders, and age
groups.
2. Data Preprocessing:
a. Audio Processing: Raw audio is converted into
spectrograms or mel-frequency cepstral coefficients
(MFCCs), which represent the signal's frequency content
over time in a format suitable for machine learning.
b. Text Normalization: Transcripts are cleaned and
standardized (e.g., removing punctuation, lowercasing,
and expanding abbreviations).
c. Noise Augmentation: Background noise is artificially
added to training data to improve the system's
robustness in noisy environments.
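The text normalization and noise augmentation steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: the function names, the tiny abbreviation table, and the choice of Gaussian white noise at a target signal-to-noise ratio (SNR) are all assumptions made for the example. Real systems typically mix in recorded background noise and use far larger normalization rules.

```python
import math
import random
import re
import string

# Hypothetical abbreviation table; a real system would use a much larger one.
ABBREVIATIONS = {"dr.": "doctor", "st.": "street", "mr.": "mister"}


def normalize_transcript(text):
    """Lowercase, expand known abbreviations, strip punctuation."""
    words = [ABBREVIATIONS.get(w, w) for w in text.lower().split()]
    # Note: this also removes apostrophes, so "don't" becomes "dont".
    cleaned = " ".join(words).translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", cleaned).strip()


def add_noise(signal, snr_db, rng=None):
    """Mix Gaussian white noise into `signal` at roughly `snr_db` dB SNR.

    `signal` is a non-empty list of float samples.
    """
    rng = rng or random.Random(0)  # fixed seed so the sketch is reproducible
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


print(normalize_transcript("Dr. Smith lives on Main St. today!"))
# prints: doctor smith lives on main street today

# One second of a 440 Hz tone at an assumed 16 kHz sample rate.
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noisy = add_noise(clean, snr_db=20)
```

Lower `snr_db` values produce noisier training examples. Sampling the SNR from a range for each utterance is a common way to vary the difficulty.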
3. Annotation:
a. Human annotators transcribe audio files to create
labeled datasets for supervised learning. This step is
time-consuming but essential for training accurate
models.
4. Data Splitting:
a. Data is divided into training, validation, and test sets. The
training set is used to train the model, the validation set
is used to tune hyperparameters, and the test set is used to
evaluate performance.
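The split described above can be sketched as follows. The 80/10/10 proportions, the function name, and the file names are assumptions chosen for illustration; the source does not specify ratios. For speech data it is often also worth splitting by speaker, so that the same voice does not appear in both training and test sets. This sketch does not do that.

```python
import random


def split_dataset(items, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle `items` and return (train, val, test) lists."""
    items = list(items)
    random.Random(seed).shuffle(items)  # seeded for a reproducible split
    n_val = int(len(items) * val_frac)
    n_test = int(len(items) * test_frac)
    test = items[:n_test]
    val = items[n_test:n_test + n_val]
    train = items[n_test + n_val:]
    return train, val, test


# Hypothetical utterance file names.
utterances = [f"utt_{i:03d}.wav" for i in range(100)]
train, val, test = split_dataset(utterances)
print(len(train), len(val), len(test))  # prints: 80 10 10
```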
4. Impact Assessment
1. Positive Impacts:
a. Enhanced User Experience: Virtual assistants provide a
natural and intuitive way to interact with devices,
improving user satisfaction.
b. Accessibility: Speech recognition enables individuals
with disabilities (e.g., visually impaired users) to access
technology more easily.
c. Productivity: Automating tasks like transcription,
scheduling, and information retrieval saves time and
effort.
d. Multilingual Support: Breaking language barriers
enables global communication and collaboration.
2. Negative Impacts:
a. Bias in Recognition: Systems trained on unrepresentative
data may perform poorly for underrepresented accents,
languages, genders, or age groups, leading to inequitable
outcomes.
b. Privacy Risks: Voice data collection raises concerns
about surveillance and misuse.
c. Job Displacement: Automation of tasks like transcription
may reduce demand for human workers.
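One way to make the recognition-bias concern above measurable is to compare error rates across speaker groups. The sketch below computes word error rate (WER) per group, using a word-level edit distance. WER is a standard speech recognition metric, but its use here, the group labels, and the sample transcripts are assumptions for illustration and do not come from the source.

```python
from collections import defaultdict


def word_errors(reference, hypothesis):
    """Word-level Levenshtein distance between two transcripts."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,               # deletion
                            curr[j - 1] + 1,           # insertion
                            prev[j - 1] + (r != h)))   # substitution or match
        prev = curr
    return prev[-1]


def wer_by_group(samples):
    """Map each group to its WER, given (group, reference, hypothesis) tuples."""
    errors, words = defaultdict(int), defaultdict(int)
    for group, ref, hyp in samples:
        errors[group] += word_errors(ref, hyp)
        words[group] += len(ref.split())
    return {g: errors[g] / words[g] for g in words if words[g]}


# Hypothetical evaluation data with made-up group labels.
samples = [
    ("group_a", "turn on the light", "turn on the light"),
    ("group_b", "turn on the light", "turn of the night"),
]
rates = wer_by_group(samples)
print(rates)  # prints: {'group_a': 0.0, 'group_b': 0.5}
```

A large gap between groups on a balanced test set would suggest the training data underrepresents some speakers.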
6. Diagrams