
Subject name and Code: AIML (22589) Academic year: 2024-25

Course Name: AO 5I Semester: Fifth

Music Recommendation Based on Emotion in Python


MICRO PROJECT REPORT
Submitted in November 2024

Sr. No.   Roll No.   Full name of student       Enrollment No.   Seat No. (Sem-V)
1         12         ROHAN GANGARAM KOKATARE    23112090100

Under the Guidance of Prof. Shraddha Chaudhari in 3 Years Diploma Programme in
Engineering & Technology of Maharashtra State Board of Technical Education
ISO 9001:2008 (ISO/IEC 27001:2013)
SHIVAJIRAO S. JONDHLE POLYTECHNIC, ASANGAON

MAHARASHTRA STATE BOARD OF TECHNICAL
EDUCATION, MUMBAI
CERTIFICATE

This is to certify that Mr. ROHAN GANGARAM KOKATARE,
Roll No- 03 of Fifth Semester of Automation & Robotics Diploma Programme
in Engineering & Technology at Shivaji Rao S. Jondhle Polytechnic Asangaon
(EAST) Shahapur 421601, has completed the Micro Project satisfactorily in the
subject AIML (22589) in the academic year 2024-2025 as prescribed in the
curriculum.

Place: ASANGAON Enrollment No. : 23112090100


Date: / / 2024 Exam seat No:

Project Guide Head of the Department Principal

INDEX

Sr. No.   Title
1.        Introduction
2.        Aim of the Micro-Project
3.        Resources Required
4.        Skill Developed / Learning out of this Micro-Project
5.        Advantages & Disadvantages
6.        Code
7.        Output
8.        Conclusion
9.        References

PART-A
Music Recommendation Based on Emotion in Python
1.0 Brief Introduction
The Music Recommendation System aims to suggest songs based on the
content of a given song, using content-based filtering. This approach
relies on the characteristics of songs, such as their title, artist, genre, and
lyrics, to recommend similar tracks.
The system works by:
1. Dataset: A dataset containing song metadata (e.g., title, artist, genre,
and lyrics) is used.
2. Feature Engineering: The song attributes are combined into a single
text feature.
3. TF-IDF Vectorization: The combined text is transformed into numerical
data using TF-IDF (Term Frequency-Inverse Document Frequency),
which highlights unique and important words.
4. Cosine Similarity: The cosine similarity between songs is computed to
determine how similar they are.
5. Recommendation: Based on a given song, the system identifies the
top 3 most similar songs by comparing their cosine similarity scores.
This method produces recommendations without requiring any user history,
which makes it suitable for new users. Future improvements could involve
incorporating audio features, collaborative filtering, or real-world
datasets to improve the accuracy of recommendations.
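For reference, the two quantities used in steps 3 and 4 can be written out. This is a sketch that assumes the default settings of scikit-learn's TfidfVectorizer (smoothed IDF, rows scaled to unit length), which is the class used in the code of this report; other TF-IDF variants define the IDF term differently. Here n is the number of songs, df(t) is the number of songs whose text contains term t, and tf(t, d) is the count of t in song d.

```latex
\mathrm{idf}(t) = \ln\frac{1 + n}{1 + \mathrm{df}(t)} + 1,
\qquad
\mathrm{tfidf}(t, d) = \mathrm{tf}(t, d)\,\mathrm{idf}(t)

\cos(\mathbf{a}, \mathbf{b}) = \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert}
```

Because each TF-IDF row is scaled to unit length under these defaults, the cosine similarity of two songs reduces to the dot product of their rows.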
2.0 Aim of the Micro-Project
1. Content-Based Recommendation: Build a music recommendation
system that suggests songs based on their similarity in attributes (song
title, artist, genre, and lyrics).
2. TF-IDF & Cosine Similarity: Utilize TF-IDF vectorization to convert text
features into numerical data and cosine similarity to measure song
similarity.
3. Song Discovery: Help users discover songs with similar characteristics
to a given song, enhancing their music experience.
4. Scalability: Serve as a foundation for more advanced recommendation
systems by incorporating additional features like audio attributes or
collaborative filtering.

3.0 Resources Required

Sr. No.   Name of Resource / Material   Specification                                      Quantity   Remark
1         Hardware: Computer System     Computer (Pentium 12th gen), RAM 16GB, SSD 512GB   1
2         Operating System              Windows 11                                         1
3         Software                      VS CODE                                            1

PART B

4.0 Skill Developed / learning out of this Micro-Project


Data Preprocessing:
• Combining multiple features into a single text feature.
• Handling text data and filtering stop words.
Text Vectorization:
• Using TF-IDF to convert text into numerical data for machine learning.
Similarity Measures:
• Applying cosine similarity to calculate the similarity between songs.
• Building a similarity matrix for item comparisons.
Recommendation System:
• Implementing content-based filtering for song recommendations.
• Developing a function to provide song suggestions based on similarity
scores.
Error Handling:
• Handling invalid input (e.g., song not found) gracefully.
Data Visualization:
• Visualizing recommendation results using bar charts with matplotlib
and seaborn.
Real-World Application:
• Understanding the foundation of content-based recommendation
systems used in industry (e.g., Spotify, Netflix).
Programming Skills:
• Working with Pandas for data manipulation and analysis.
• Understanding basic machine learning concepts like similarity
measures.
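
The error-handling skill listed above (a song that is not found) can be illustrated with a minimal sketch. The helper name find_song_index and the case-insensitive matching are assumptions made for illustration; they are not part of the project code.

```python
def find_song_index(songs, song_name):
    """Return the position of song_name in songs, or None if it is absent.

    Matching ignores case and surrounding spaces (an illustrative choice).
    """
    wanted = song_name.strip().lower()
    for position, title in enumerate(songs):
        if title.strip().lower() == wanted:
            return position
    return None


songs = ['Song A', 'Song B', 'Song C']
for query in ['song a', 'Song Z']:
    position = find_song_index(songs, query)
    if position is None:
        # Report the problem instead of letting an IndexError escape
        print(f"'{query}' was not found in the dataset.")
    else:
        print(f"'{query}' found at index {position}.")
```

Returning None (or raising a descriptive error) lets the caller decide how to respond, rather than failing on an empty lookup result.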

5.0 Advantages and Disadvantages

Advantages

1. Simplicity: Content-based recommendation systems are easy to
implement and understand, relying on song features such as genre,
artist, and lyrics rather than complex user behavior or interactions.
2. Explainability: The recommendations are based on specific attributes
of songs (e.g., lyrics, genre), making the system's decision-making
process transparent. Users can easily understand why a song was
recommended.
3. Cold Start Problem: Content-based systems excel in handling the "cold
start problem," as they can recommend new or less popular songs
based solely on their features, without needing historical user data or
interactions.
4. No User Interaction Needed: Since the system relies on song content
rather than user activity, there is no need for tracking user preferences,
ratings, or listening history, making it easier to deploy.
5. Customizable Features: The system can be adjusted to include different
song features (e.g., audio characteristics, user-defined tags) to improve
recommendation quality and tailor it to specific user needs or contexts.
6. Personalized Recommendations for Users with Known Preferences:
Content-based systems can be particularly effective when the user has
specific, well-defined tastes or preferences. By analyzing attributes like
genre, artist, or lyrics, the system can recommend songs that align
closely with the user’s past preferences, leading to highly relevant
recommendations even for users with limited interaction data. This can
be useful in music streaming platforms where users have distinct tastes
but may not have interacted enough for collaborative filtering to be
effective.

Disadvantages

1. Limited Recommendations: The system tends to recommend songs
that are very similar to the ones the user has interacted with. This can
lead to over-specialization, where users are only recommended songs
from the same genre or with similar characteristics, limiting discovery
of new or diverse content.
2. Lack of Contextual Understanding: Content-based systems struggle to
understand the context in which a user might prefer a particular song.
For example, a user may want energetic music for a workout but the
system may only recommend songs based on content similarity,
ignoring factors like mood or activity context.
3. Dependence on Feature Quality: The effectiveness of the system
depends heavily on the quality and richness of the features. If the
dataset lacks meaningful features like detailed lyrics or specific audio
features, the recommendations will be poor. For instance, using only
basic genres or simplistic tags may not capture the true diversity of
music tastes.
4. Scalability Issues: As the dataset grows, calculating cosine similarity for
all pairs of songs can become computationally expensive, making the
system less efficient for large datasets. This can slow down the
recommendation process unless optimization techniques are applied.
5. Limited Personalization: Content-based filtering does not take into
account individual user preferences beyond the song content. It does
not adapt to user behavior or preferences over time. For example, two
users who like the same genre might receive the same
recommendations, even if one prefers faster or more energetic songs.
6. Text-Bias: The system's reliance on textual features (like lyrics and
genre) might bias the recommendations towards songs with more
extensive or richer descriptions. It may overlook important audio-
related features such as tempo, energy, and mood, which can offer a
deeper understanding of musical similarity.
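
The scalability concern in point 4 can be made concrete with a small sketch. Instead of building the full n x n similarity matrix, it scores only the query song against the others and keeps the k best, so each request costs one pass over the data. The function names and the toy vectors are assumptions for illustration, not the project's implementation.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def top_k_similar(vectors, query_index, k=3):
    """Return [(index, score), ...] for the k vectors most similar to the query."""
    query = vectors[query_index]
    scored = (
        (cosine(query, vec), i)
        for i, vec in enumerate(vectors)
        if i != query_index  # never recommend the query itself
    )
    return [(i, score) for score, i in heapq.nlargest(k, scored)]


# Hypothetical feature vectors standing in for TF-IDF rows
vectors = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 0.0, 0.9], [0.5, 0.5, 0.0]]
print(top_k_similar(vectors, query_index=0, k=2))
```

For very large catalogues, approximate nearest-neighbour search is a common further step, but that is outside the scope of this project.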

7.0 Outputs of the Micro-Projects

Code
import pandas as pd
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import matplotlib.pyplot as plt
import seaborn as sb

# Step 1: Load the dataset
# For this example, we'll use a mock dataset (replace with actual music data if available).
# Example dataset: song titles, artists, genres, and lyrics.
data = pd.DataFrame({
    'Song': ['Song A', 'Song B', 'Song C', 'Song D', 'Song E'],
    'Artist': ['Artist X', 'Artist Y', 'Artist Z', 'Artist X', 'Artist Y'],
    'Genre': ['Pop', 'Rock', 'Pop', 'Jazz', 'Rock'],
    'Lyrics': ['Love and peace', 'Fight the system', 'Peace and love',
               'Jazz and blues', 'Rock on']
})

# Step 2: Preprocess the data and create a 'combined' feature
# We combine Song, Artist, Genre, and Lyrics into one text feature
data['combined'] = (
    data['Song'] + ' ' + data['Artist'] + ' ' + data['Genre'] + ' ' + data['Lyrics']
)

# Step 3: Convert the combined text feature into numerical data using TF-IDF
vectorizer = TfidfVectorizer(stop_words='english')
tfidf_matrix = vectorizer.fit_transform(data['combined'])

# Step 4: Calculate the cosine similarity matrix between songs
cosine_sim = cosine_similarity(tfidf_matrix, tfidf_matrix)

# Step 5: Function to recommend songs based on a given song
def recommend_song(song_name):
    # Look up the row of the input song; fail with a clear message if it is missing
    matches = data.index[data['Song'] == song_name].tolist()
    if not matches:
        raise ValueError(f"Song '{song_name}' not found in the dataset.")
    idx = matches[0]
    sim_scores = list(enumerate(cosine_sim[idx]))

    # Sort the songs by similarity score in descending order
    sim_scores = sorted(sim_scores, key=lambda x: x[1], reverse=True)

    # Get the top 3 most similar songs, excluding the input song by index
    # (slicing off the first entry would be wrong if another song ties at 1.0)
    sim_scores = [s for s in sim_scores if s[0] != idx][:3]

    # Get the song indices and corresponding similarity scores
    song_indices = [i[0] for i in sim_scores]
    song_scores = [i[1] for i in sim_scores]

    # Get the names of the recommended songs
    recommended_songs = data['Song'].iloc[song_indices].tolist()

    return recommended_songs, song_scores

# Step 6: Test the recommendation function
song_to_recommend = 'Song A'
recommended_songs, scores = recommend_song(song_to_recommend)

print(f"Songs recommended based on '{song_to_recommend}':")
for i, song in enumerate(recommended_songs):
    print(f"{i+1}. {song} (Similarity Score: {scores[i]:.2f})")

# Visualizing the similarity scores
plt.figure(figsize=(8, 6))
sb.barplot(x=recommended_songs, y=scores, palette='Blues_d')
plt.title(f"Top 3 Song Recommendations for '{song_to_recommend}'")
plt.xlabel('Song')
plt.ylabel('Cosine Similarity Score')
plt.show()
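
A note on the mock dataset: the sketch below imitates, in plain Python, how the text of each song is likely tokenized before TF-IDF is applied. It assumes scikit-learn's documented default token pattern (lowercased tokens of two or more alphanumeric characters) and uses a three-word stand-in for the English stop-word list, so it is an approximation rather than the vectorizer itself. Under these assumptions the single-letter parts of the titles and artist names ('A', 'X', and so on) are dropped, and Song A and Song C reduce to the same set of words, so a similarity score of 1.00 between them would be expected.

```python
import re

# Assumption: mirrors the default token pattern documented for TfidfVectorizer
TOKEN_PATTERN = re.compile(r"(?u)\b\w\w+\b")
# Assumption: tiny stand-in for the full English stop-word list
STOP_WORDS = {'and', 'the', 'on'}


def tokens(text):
    """Lowercase the text, split it into tokens, and drop stop words."""
    return [t for t in TOKEN_PATTERN.findall(text.lower()) if t not in STOP_WORDS]


song_a = 'Song A Artist X Pop Love and peace'
song_c = 'Song C Artist Z Pop Peace and love'

print(sorted(tokens(song_a)))
print(sorted(tokens(song_c)))
print(sorted(tokens(song_a)) == sorted(tokens(song_c)))
```

With real data, longer titles, artist names, and lyrics would make such exact ties much less likely.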

8.0 OUTPUT:

CONCLUSION

The content-based music recommendation system presented in the code
suggests songs based on their textual features, such as song title, artist, genre,
and lyrics. By calculating cosine similarity between songs, it recommends
tracks with similar attributes, making it effective when user interaction data is
unavailable. This approach is transparent, as users can easily understand why
certain songs are recommended based on their content.
While it excels at recommending new or less popular songs (solving the "cold
start" problem), the system can become too specialized, offering
recommendations that are too similar to the input song and limiting diversity.
Additionally, its reliance on textual features means that the quality of the
recommendations depends on the richness of those features, and it doesn't
account for user preferences or behaviors, making it less personalized
compared to collaborative filtering.
Scalability can also be an issue, as the computational cost of calculating cosine
similarity grows with the dataset size. Despite these limitations, the system
provides a strong foundation for content-based recommendations. To enhance
it, incorporating audio features or using a hybrid model that combines content-
based and collaborative filtering would improve both diversity and
personalization.
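
The hybrid model mentioned above can be sketched as a weighted blend of two scores. Everything in this example is hypothetical: the function name, the weight alpha, and the score values are placeholders for illustration, and the project itself does not compute collaborative scores.

```python
def hybrid_score(content_score, collaborative_score, alpha=0.5):
    """Blend two scores; alpha is the weight given to the content-based score."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * content_score + (1.0 - alpha) * collaborative_score


# Hypothetical (content_score, collaborative_score) pairs for three candidates
candidates = {
    'Song B': (0.10, 0.80),
    'Song C': (0.90, 0.20),
    'Song D': (0.30, 0.30),
}

# Rank candidates by their blended score, highest first
ranked = sorted(
    candidates,
    key=lambda name: hybrid_score(*candidates[name], alpha=0.5),
    reverse=True,
)
print(ranked)
```

Raising alpha favours songs that resemble the input song, while lowering it favours songs that similar listeners enjoyed.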

REFERENCES
https://www.kaggle.com/
https://towardsdatascience.com/
https://medium.com/
