CS4622 Machine Learning PROJECT

This document outlines a machine learning project involving speech recognition tasks including speaker, age, gender, and accent classification. The project has two phases: individual modeling using wav2vec features from different layers, with models evaluated via Kaggle competitions; and a group paper combining all layers and improving the joint model, including literature review, explanations, and findings. Participants are evaluated on their individual Kaggle ranks and code quality, and the group paper will undergo blind peer review. Deadlines for the competitions and paper are September 24th and October 8th, respectively.

Uploaded by

Raveen Shamentha

CS4622 - Machine Learning

Project - Speaker, age, gender and accent recognition using wav2vec base
Dr. R. T. Uthayasanker
August 29, 2023

Project Description
Dataset: The features are created from the AudioMNIST dataset. For further details about the
dataset, see this link: Link.

The structure below is a general classification model for speech-related tasks.

Figure 1: Overview of speech-related task classification model

This project has two phases:

1. Phase 1: Individual task - Classification model development and Kaggle competition
2. Phase 2: Group submission - 6-page research paper

Phase 1: Individual task

Wav2vec base is commonly used as a feature extraction model. The wav2vec base model has 12
transformer layers. For this project, features are extracted from the last 6 transformer layers
(layers 7 to 12). A separate Kaggle competition is created for each layer. Each train.csv file
provided in a competition contains that layer's features and the corresponding speaker, age,
gender, and accent labels.

• Label 1 - Speaker

• Label 2 - Age

• Label 3 - Gender

• Label 4 - Accent
Two (2) Kaggle competitions will be allocated to each person. Your task is to build classifier
models that predict each of the 4 labels individually, using the features in both the training and
validation CSV files provided in the competitions. [Find your competition links here: Link]

E.g.
• 1st competition - Layer X

1. Speaker recognition classifier model using layer X
2. Age recognition classifier model using layer X
3. Gender recognition classifier model using layer X
4. Accent recognition classifier model using layer X

• 2nd competition - Layer Y

5. Speaker recognition classifier model using layer Y
6. Age recognition classifier model using layer Y
7. Gender recognition classifier model using layer Y
8. Accent recognition classifier model using layer Y
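
The per-label setup listed above can be sketched roughly as follows. This is only an illustration under stated assumptions: the column names (`feature_1`, `label_1`, ...), the tiny inline CSV, and its values are made up, and the nearest-centroid classifier is a stand-in written with the Python standard library, not a recommended model. The point is the structure: split each row into features and labels, then train one independent model per label.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column names; the real train.csv header may differ.
LABEL_COLUMNS = ["label_1", "label_2", "label_3", "label_4"]  # speaker, age, gender, accent

# Made-up stand-in for train.csv (two features instead of the real layer features).
TRAIN_CSV = """feature_1,feature_2,label_1,label_2,label_3,label_4
0.1,0.2,1,25,0,6
0.2,0.1,1,25,0,6
0.9,0.8,2,30,1,2
0.8,0.9,2,30,1,2
"""


def load_rows(text):
    """Split each CSV row into a feature vector and a dict of label values."""
    features, labels = [], []
    for row in csv.DictReader(io.StringIO(text)):
        features.append([float(v) for k, v in row.items() if k not in LABEL_COLUMNS])
        labels.append({k: row[k] for k in LABEL_COLUMNS})
    return features, labels


def fit_centroids(features, targets):
    """Nearest-centroid 'training': average the feature vectors of each class."""
    grouped = defaultdict(list)
    for x, y in zip(features, targets):
        grouped[y].append(x)
    return {y: [sum(col) / len(xs) for col in zip(*xs)] for y, xs in grouped.items()}


def predict(centroids, x):
    """Return the class whose centroid is closest (squared Euclidean distance)."""
    return min(centroids, key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])))


features, labels = load_rows(TRAIN_CSV)

# One independent model per label column, all trained on the same features.
models = {
    name: fit_centroids(features, [row[name] for row in labels])
    for name in LABEL_COLUMNS
}
predictions = {name: predict(models[name], [0.85, 0.85]) for name in LABEL_COLUMNS}
print(predictions)
```

With two allocated layers, the same loop would simply run once per competition, giving the 8 models in the example.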

Use data pre-processing, feature engineering, hyper-parameter tuning, dimensionality reduction,
cross-validation, and other techniques to improve classifier accuracy. Upload the notebook and
the predicted labels, as a solutions.csv file, to the Kaggle competition platform created for this
project. (More details are provided in the description and rules sections of the Kaggle competition.)
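
As one illustration of the cross-validation step, the sketch below estimates accuracy with k-fold cross-validation using only the Python standard library. The helper names (`k_fold_indices`, `cross_val_accuracy`), the majority-class baseline, and the toy data are assumptions made for this example; in practice a library implementation would normally be used instead.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices once, then deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_val_accuracy(fit, predict, X, y, k=5, seed=0):
    """Average held-out accuracy over k folds."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        # Fit on the training folds only, then score on the held-out fold.
        model = fit([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / len(scores)


# Hypothetical baseline model: always predict the most frequent training label.
def fit_majority(X, y):
    return max(set(y), key=y.count)


def predict_majority(model, x):
    return model


# Made-up data: 10 samples, 8 of class "a" and 2 of class "b".
X = [[float(i)] for i in range(10)]
y = ["a"] * 8 + ["b"] * 2
score = cross_val_accuracy(fit_majority, predict_majority, X, y, k=5)
print(score)
```

A baseline like this gives a floor to compare real classifiers against: a model that does not beat the majority-class score has learned nothing useful from the features.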

Phase 2: Group task

Group formation: Each group has a maximum of 3 people. Only 2 groups can have 2 people.

The other members of your group will have worked on the other two pairs of layers. As a group,
your task is to combine all 6 layers, improve the prediction model, and write a 6-page conference
paper in IEEE format (Link).
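
The handout does not prescribe how the 6 layers should be combined. One simple possibility, sketched here as an assumption, is to concatenate the feature vectors of all layers sample by sample. The function name `concatenate_layers` and the toy values are hypothetical, and the sketch assumes that every layer lists the same samples in the same order.

```python
def concatenate_layers(layer_features):
    """Join per-layer feature vectors sample by sample.

    layer_features maps a layer number to a list of feature vectors, one per
    sample. Samples are assumed to appear in the same order in every layer.
    """
    layers = sorted(layer_features)
    sample_counts = {len(layer_features[layer]) for layer in layers}
    if len(sample_counts) != 1:
        raise ValueError("every layer must provide the same number of samples")

    combined = []
    for per_sample in zip(*(layer_features[layer] for layer in layers)):
        row = []
        for vector in per_sample:
            row.extend(vector)
        combined.append(row)
    return combined


# Made-up data: 2 samples and 2 features per layer, for layers 7 to 12.
toy = {
    layer: [[layer * 10 + 1, layer * 10 + 2], [layer * 10 + 3, layer * 10 + 4]]
    for layer in range(7, 13)
}
combined = concatenate_layers(toy)
print(len(combined), len(combined[0]))  # prints: 2 12
```

Concatenation multiplies the feature count by the number of layers, so dimensionality reduction or feature selection may matter more for the joint model than for the single-layer ones.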

For the conference paper, carry out a literature review, apply explainable AI techniques, and
interpret the final model. Include your findings from this project and any novel ideas from your
feature engineering and model development stages. Your paper should be uploaded to EasyChair.
The link will be provided later.
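
As one example of an explainability technique, the sketch below computes permutation importance: the drop in accuracy when a single feature column is shuffled. It is a minimal standard-library illustration, not the required method; the toy model and data are made up, and a library routine such as scikit-learn's `sklearn.inspection.permutation_importance` would normally be preferred.

```python
import random


def accuracy(predict, X, y):
    """Fraction of samples for which the prediction matches the target."""
    return sum(predict(x) == t for x, t in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Mean accuracy drop when one feature column is shuffled at a time."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the data with only column j replaced by its shuffled copy.
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(predict, shuffled, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical model that only looks at feature 0 and ignores feature 1.
def toy_predict(x):
    return 1 if x[0] > 0.5 else 0


# Made-up data whose labels are produced by the toy model itself.
X = [[i / 10, (i * 7 % 10) / 10] for i in range(10)]
y = [toy_predict(x) for x in X]
importances = permutation_importance(toy_predict, X, y)
print(importances)
```

Because the toy model never reads feature 1, shuffling that column cannot change any prediction, so its importance is exactly zero. Features with a large drop are the ones the model relies on.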

The expected content of your conference paper can be found in this link, for your reference.

Evaluation
• Individual task:

– Classifier Model building - 40 marks

∗ Explainability (Interpreting the label predictions and any cross-relations with labels) - 20 marks
∗ Good practice of ML (right evaluation strategy, ensemble methods, feature engineering, etc.) - 10 marks
∗ Git repository (properly documented) - 5 marks
∗ Coding standard - 5 marks
– Kaggle Competition Rank - 20 marks

• Group submission: Conference paper - 40 marks

• We will evaluate your individual task using the ranks from the Kaggle competitions. In
addition, the submitted code (notebook / Git repository) will be evaluated against the criteria
listed above.

• We will evaluate your group conference paper through a blind review process that considers
its quality, findings, interpretations, novelty, etc.

DEADLINES!!
Kaggle competition: 24th September 2023
Paper submission: 8th October 2023
