Wine Final Projects

This document presents a marketing strategy case study for Vivino, focusing on identifying key physicochemical properties of French Bordeaux wines to classify wine quality using machine learning models. The Random Forest model outperformed the Naive Bayes model, achieving an accuracy of 82.9% compared to 80.83%, with both models identifying alcohol, sulphates, and volatile acidity as critical factors for high-quality wines. Recommendations for wine producers include focusing on alcohol content, increasing sulphates for better preservation, and reducing volatile acidity to enhance wine quality.


Marketing Research and Engineering : MAX503 FALL 2024

Key Factors of Wine Quality: A Marketing Strategy Case Study for Vivino

Lopa Detroja
Kitsana Sudsaneh

Table of Contents

1 Introduction and Executive Summary
2 Data Preparation (Cleaning and Preparing)
3 Exploratory Data Analysis
4 Naive Bayes Model
5 Random Forest Model
6 Comparative Analysis
7 Business Implications and Recommendation
8 Appendix


Introduction and Executive Summary

Goals
Identify key physicochemical properties from Vivino’s dataset of 1,599 French Bordeaux wines to classify wine quality.
Use machine learning models to predict and understand the factors defining a "good" wine.

Methods
Used Naive Bayes (a probabilistic model) and Random Forest (an ensemble decision tree method) for classification.

Key Findings
Random Forest outperformed Naive Bayes in accuracy and feature analysis.
Top 3 features:
Alcohol: Correlated with higher quality.
Sulphates: Enhanced flavor and preservation.
Volatile Acidity: Lower levels indicate better quality.

Data Preparation (Cleaning and Preparing)

Data Overview
1,599 observations of French Bordeaux wines.
11 physicochemical properties and one quality score:
fixed acidity
volatile acidity
citric acid
residual sugar
chlorides
free sulfur dioxide
total sulfur dioxide
density
pH
sulphates
alcohol

Data Preparation
Dataset Loaded: 1,599 rows and 12 columns.
Target Variable Created:
"Good" wine: Quality > 6 (217 samples).
"Bad" wine: Quality ≤ 6 (1,382 samples).
Train-Test Split:
Training set (70%): 1,119 samples (956 "Bad," 163 "Good").
Testing set (30%): 480 samples (426 "Bad," 54 "Good").
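
The labeling rule and the 70/30 split can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the seed, and the use of a stratified split are assumptions, and the quality scores are placeholders chosen only to reproduce the reported class sizes. A stratified split yields slightly different per-class counts than the 956/163 and 426/54 reported above.

```python
import random
from collections import Counter


def label_quality(score, threshold=6):
    """Return "Good" for scores above the threshold, otherwise "Bad"."""
    return "Good" if score > threshold else "Bad"


def stratified_split(labels, train_fraction=0.7, seed=42):
    """Split row indices into train/test sets, keeping class proportions.

    Stratification is an assumption; the slides do not say how the split was drawn.
    """
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(train_fraction * len(indices))
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return sorted(train), sorted(test)


# Placeholder scores: 1,382 wines at or below 6 and 217 wines above 6.
scores = [5] * 1382 + [7] * 217
labels = [label_quality(s) for s in scores]
train_idx, test_idx = stratified_split(labels)

print(Counter(labels))
print(len(train_idx), len(test_idx))
```

With these placeholder scores the split sizes match the 1,119 training and 480 testing samples reported on the slide.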

Exploratory Data Analysis

Positive Correlation:
A strong positive correlation indicates that two variables rise together: as the amount of free sulfur dioxide increases, the total sulfur dioxide also tends to increase.
Example: Total Sulfur Dioxide and Free Sulfur Dioxide (r = 0.67)

Negative Correlation:
A strong negative correlation indicates that one variable falls as the other rises: as the fixed acidity of the wine increases, the pH value decreases. This is expected because pH measures acidity inversely.
Example: pH and Fixed Acidity (r = -0.68)
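
The r values above are Pearson correlation coefficients. As a minimal sketch of how such a value is computed, the snippet below implements the formula directly. The five data points are hypothetical and are not taken from the Vivino dataset, so the printed value will not equal the reported -0.68.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical measurements, for illustration only.
fixed_acidity = [6.0, 7.4, 8.1, 9.9, 11.2]
ph = [3.60, 3.51, 3.35, 3.20, 3.16]

r = pearson_r(fixed_acidity, ph)
print(round(r, 2))  # negative: pH falls as fixed acidity rises
```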

Naive Bayes Model

Training Model Details
Dataset: The dataset consists of 1,599 observations of French Bordeaux wines with 11 physicochemical variables. We focus on predicting wine quality, where the target variable quality.label is categorized as "Good" (quality > 6) or "Bad" (quality ≤ 6).
Model Type: Naive Bayes, a probabilistic classifier based on Bayes’ Theorem, applied to predict whether a wine is "Good" or "Bad."
Split Dataset: Training Dataset: 70% of the data (1,119 samples), Testing Dataset: 30% of the data (480 samples).
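
To make the idea of a Naive Bayes classifier concrete, here is a minimal Gaussian variant written from scratch. It is a sketch under assumptions: the slides do not say which Naive Bayes variant or library was used, the function names are illustrative, and the six training rows (alcohol, volatile acidity) are hypothetical.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows, labels):
    """Estimate a prior plus per-feature mean and variance for each class."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {}
    for label, members in grouped.items():
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        # A tiny floor keeps the variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(rows), means, variances)
    return model


def predict(model, row):
    """Pick the class with the highest log prior plus summed Gaussian log-likelihoods."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for x, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical rows: [alcohol, volatile acidity].
rows = [[12.0, 0.35], [11.5, 0.40], [12.4, 0.30],
        [9.5, 0.60], [10.0, 0.55], [9.8, 0.65]]
labels = ["Good", "Good", "Good", "Bad", "Bad", "Bad"]

model = fit_gaussian_nb(rows, labels)
print(predict(model, [11.8, 0.38]))
print(predict(model, [9.7, 0.62]))
```

The "naive" part is the inner loop: each feature contributes its own log-likelihood term, which treats the features as independent given the class.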

Naive Bayes Model

Training Model Details
Built a Naive Bayes model for classification to predict wine quality.
Split the dataset: 70% training data (1,119 samples) and 30% testing data (480 samples).
Accuracy: 80.83%

Before Handling Imbalance (Original Model)
Class Error: Error rates of 63.21% and 36.79% are reported for the two classes, with "Bad" wines described as the lower-error class and "Good" wines as the higher-error class.
Model Bias: The model had a slight bias towards the majority class ("Bad").
Cluster Plot: Significant overlap between "Good" and "Bad" classes, reflecting poor separation for the "Good" wines.

Naive Bayes Model

Issue with Imbalanced Classes
The dataset is imbalanced with more "Bad" wines than "Good" wines, affecting prediction accuracy for the "Good" wines.

After Handling Imbalance (Balanced Sampling)
Balanced Sampling: Ensures the model focuses equally on both classes, improving the accuracy for the "Good" class.
Class Error: Slightly reduced error for both classes.
Adjusted Rand Index (ARI): 0.21, indicating a modest agreement between the predicted and actual classes.
Cluster Plot: Improved separation between the "Good" and "Bad" classes after balancing the dataset.
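
One common way to balance classes is to downsample the majority class to the size of the minority class. The slides do not state which balancing method was used, so the sketch below is an assumption, as are the function name and seed. It uses the reported training-set class counts (956 "Bad," 163 "Good").

```python
import random


def balanced_sample(labels, seed=0):
    """Return row indices with every class downsampled to the minority-class size."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    target = min(len(indices) for indices in by_class.values())
    sample = []
    for indices in by_class.values():
        sample.extend(rng.sample(indices, target))  # draw without replacement
    rng.shuffle(sample)
    return sample


# Class counts taken from the reported training set.
train_labels = ["Bad"] * 956 + ["Good"] * 163
idx = balanced_sample(train_labels)
counts = {
    label: sum(train_labels[i] == label for i in idx)
    for label in ("Bad", "Good")
}
print(counts)  # {'Bad': 163, 'Good': 163}
```

Downsampling discards majority-class rows, so it trades some information about "Bad" wines for equal attention to "Good" wines.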

Naive Bayes Model

Performance Metrics

Accuracy: 80.83%
Naive Bayes provides solid prediction performance with an accuracy of 80.83%, while Random Forest slightly outperforms it at 82.9%.

Adjusted Rand Index: 0.21
The Adjusted Rand Index (ARI) indicates a modest agreement between predicted and actual labels and suggests room for improvement, especially for predicting the "Good" class.

Confusion Matrix: Error rate 20.37% (Good)
Among the "Bad" wines in the test data, Naive Bayes correctly classified 359 "Bad" wines and misclassified 67 as "Good". Among the "Good" wines, 29 were classified as "Bad".

Top Features: Naive Bayes identifies Alcohol, Residual Sugar, and Sulphates as the most important features in predicting wine quality.

Naive Bayes Model: Summary Function

Top 3 Properties

Alcohol:
Predicted: 11.54 for "Good" wines vs. 10.20 for "Bad" wines.
Actual: 11.63 for "Good" wines vs. 10.32 for "Bad" wines.
Observation: Higher alcohol content is strongly associated with "Good" wines.

Sulphates:
Predicted: 0.73 for "Good" wines vs. 0.64 for "Bad" wines.
Actual: 0.74 for "Good" wines vs. 0.65 for "Bad" wines.
Observation: "Good" wines consistently have higher sulphate levels, which enhance flavor and preservation.

Volatile Acidity:
Predicted: 0.36 for "Good" wines vs. 0.57 for "Bad" wines.
Actual: 0.43 for "Good" wines vs. 0.54 for "Bad" wines.
Observation: Lower volatile acidity improves the taste, making wines more likely to be classified as "Good."
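
A summary of this kind compares the mean of each property within the "Good" and "Bad" groups, once grouped by actual label and once by predicted label. The sketch below shows that grouping for one property. The function name and the six alcohol values and labels are hypothetical, so the means differ from the figures reported above.

```python
from statistics import mean


def class_means(values, labels):
    """Mean of a property within each class, rounded to two decimals."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    return {label: round(mean(vals), 2) for label, vals in groups.items()}


# Hypothetical alcohol values with actual and predicted labels.
alcohol = [11.8, 11.4, 10.1, 10.5, 12.0, 9.9]
actual = ["Good", "Good", "Bad", "Bad", "Good", "Bad"]
predicted = ["Good", "Bad", "Bad", "Bad", "Good", "Bad"]

by_actual = class_means(alcohol, actual)
by_predicted = class_means(alcohol, predicted)
print("Actual:", by_actual)
print("Predicted:", by_predicted)
```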

Random Forest Model

Training Model Details
Built a Random Forest model with 3,000 decision trees for classification.
Split dataset: 70% training data (1,119 samples) and 30% testing data (480 samples).

Before Handling Imbalance (Original Model)
Class Error: "Bad" wines had low error (2.65%), while "Good" wines had high error (46.01%).
Model Bias: The model performed significantly better for the majority class ("Bad") compared to the minority class ("Good").
Cluster Plot: Significant overlap between "Good" and "Bad" clusters, highlighting poor separation for "Good" wines.

Random Forest Model

Issue with Imbalanced Classes
The dataset is imbalanced, with more "Bad" wines (1,382) than "Good" wines (217). The model prioritizes "Bad" wines, leading to poor performance on "Good" wines.
Balanced sampling ensures the model focuses equally on "Good" wines, improving their classification accuracy.

After Handling Imbalance (Balanced Sampling)
Class Error: "Bad": Slightly higher error (13.18%). "Good": Reduced error (20.25%), improving minority class predictions.
Cluster Plot: Shows clearer separation and better classification for "Good" wines.
Balanced Focus: Model now performs better for "Good" wines, addressing the earlier bias toward "Bad" wines.

Random Forest Model

Performance Metrics

Accuracy: 82.9%
Out of the total predictions, 82.9% matched the actual wine quality labels.

Adjusted Rand Index: 0.32
Moderate clustering agreement reflects the model’s limitation for minority class “Good” predictions.

Confusion Matrix:
Error rate 16.67% (Bad): Among the "Bad" wines in the test data, 16.67% were incorrectly classified as "Good" wines.
Error rate 20.37% (Good): Among the "Good" wines in the test data, 20.37% were incorrectly classified as "Bad" wines.

The Random Forest model has proven effective at distinguishing between "Good" and "Bad" wines with an accuracy of 82.9%. This indicates that the model can reliably predict the quality of most wines in the dataset.
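
The reported Random Forest figures can be cross-checked from a confusion matrix. The counts below are inferred, not taken from the slides: 16.67% of the 426 "Bad" test wines is 71 errors, and 20.37% of the 54 "Good" test wines is 11 errors. The Adjusted Rand Index function follows the standard pair-counting definition; its name and layout are illustrative.

```python
from math import comb


def adjusted_rand_index(table):
    """ARI from a contingency table (rows: actual classes, columns: predicted classes)."""
    n = sum(sum(row) for row in table)
    sum_cells = sum(comb(v, 2) for row in table for v in row)
    sum_rows = sum(comb(sum(row), 2) for row in table)
    sum_cols = sum(comb(sum(col), 2) for col in zip(*table))
    expected = sum_rows * sum_cols / comb(n, 2)
    maximum = (sum_rows + sum_cols) / 2
    return (sum_cells - expected) / (maximum - expected)


# Inferred counts: rows are actual Bad, actual Good; columns are predicted Bad, predicted Good.
table = [[355, 71],
         [11, 43]]

total = sum(sum(row) for row in table)
accuracy = (table[0][0] + table[1][1]) / total
bad_error = table[0][1] / sum(table[0])
good_error = table[1][0] / sum(table[1])
ari = adjusted_rand_index(table)

print(round(accuracy, 3))    # 0.829
print(round(bad_error, 4))   # 0.1667
print(round(good_error, 4))  # 0.2037
print(round(ari, 2))         # 0.32
```

Under these inferred counts, the accuracy, both class error rates, and the ARI all agree with the values in the table above.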

Random Forest Model: Summary Function

Top 3 Properties (compared across Predicted Quality and Actual Quality)

Alcohol:
Higher for “Good” wines than “Bad” in both predicted and actual quality.
Good wines have significantly higher alcohol levels, making it the most important factor.

Sulphates:
Higher for “Good” wines than “Bad” in both predicted and actual quality.
Sulphates enhance flavor and preservation, strongly influencing wine quality.

Volatile Acidity:
Lower for “Good” wines than “Bad” in both predicted and actual quality.
Lower volatile acidity improves taste, making this a critical factor for high-quality wines.

Random Forest Model: Variable Importance

Top 3 Properties

The heatmap clearly shows that "Good" wines have higher alcohol and sulphates and lower volatile acidity compared to "Bad" wines.

The feature importance plots (Mean Decrease Accuracy and Mean Decrease Gini) confirm the rankings of alcohol, sulphates, and volatile acidity as the most influential factors.
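
Mean Decrease Gini is built from the drop in Gini impurity that each split achieves. A Random Forest accumulates these drops over every split on a feature and averages across trees. The sketch below computes the drop for a single threshold split. The function names, alcohol values, labels, and threshold are hypothetical and serve only to illustrate the calculation.

```python
def gini(labels):
    """Two-class Gini impurity: 2 * p * (1 - p), where p is the share of "Good"."""
    n = len(labels)
    if n == 0:
        return 0.0
    p_good = sum(label == "Good" for label in labels) / n
    return 2 * p_good * (1 - p_good)


def gini_decrease(values, labels, threshold):
    """Impurity of the parent node minus the size-weighted impurity of its two children."""
    left = [l for v, l in zip(values, labels) if v <= threshold]
    right = [l for v, l in zip(values, labels) if v > threshold]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted


# Hypothetical data: alcohol separates the two classes cleanly at 11.0.
alcohol = [9.5, 9.8, 10.0, 10.4, 11.6, 12.0]
labels = ["Bad", "Bad", "Bad", "Bad", "Good", "Good"]

clean_split = gini_decrease(alcohol, labels, threshold=11.0)
poor_split = gini_decrease(alcohol, labels, threshold=9.6)
print(round(clean_split, 3), round(poor_split, 3))
```

A feature that often yields large decreases, as alcohol does in this toy example, ends up near the top of the Gini importance plot.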

Comparative Analysis

Accuracy: Naive Bayes 80.83%, Random Forest 82.9%
Naive Bayes correctly predicted 80.83% of wine labels, slightly lower than Random Forest.

Adjusted Rand Index: Naive Bayes 0.21, Random Forest 0.32
Naive Bayes shows modest agreement with actual labels, with room for improvement for "Good" class predictions.

Confusion Matrix: Naive Bayes error rate 20.37% (Good), Random Forest error rate 20.37% (Good)
Both models show similar misclassification rates for "Good" wines, but Naive Bayes is slightly less accurate overall.

Top 3 Features: Naive Bayes: Alcohol, Sulphates, Volatile Acidity. Random Forest: Alcohol, Sulphates, Volatile Acidity.
Both models agree on Alcohol, Sulphates, and Volatile Acidity as key features.

Key Takeaway
Random Forest outperforms Naive Bayes in terms of accuracy and Adjusted Rand Index.
Random Forest also identifies important physicochemical properties, offering deeper insights for marketing and product decisions.

Business Implications and Recommendation

Implications

Model Effectiveness:
Naive Bayes Model: With an accuracy of 80.83%, Naive Bayes performs well in classifying wine quality based on physicochemical features, offering a straightforward and efficient model for wine quality classification.
Random Forest Model: While slightly more accurate (82.9%), it is more complex. Naive Bayes remains competitive in accuracy with minimal complexity and provides value in simpler scenarios.

Impact of Class Imbalance:
The Naive Bayes model remained usable on the imbalanced dataset, although the skew between “Bad” and “Good” wines affected its accuracy on "Good" wines.
The Random Forest model’s handling of class imbalance improves prediction accuracy for "Good" wines, while Naive Bayes provides a simpler alternative for classifying the minority class.

Feature Importance:
Top Features Identified: Alcohol, Sulphates, and Volatile Acidity were identified as the most important features in predicting wine quality.
These features highlight key areas winemakers can focus on to improve product quality, aligning production with quality characteristics demanded in the market.

Recommendations

For Wine Producers
Focus on Alcohol Content: Since alcohol significantly influences wine quality, experimenting with different fermentation processes to adjust alcohol levels could help producers achieve the optimal balance for quality wine.
Increase Sulphates for Better Preservation: Ensuring adequate sulphate content can enhance flavor, extend the shelf life, and align with consumer demand for higher-quality wines.
Reduce Volatile Acidity: Maintaining low levels of volatile acidity will not only improve taste but also enhance the appeal of wines in the high-quality segment.

For Marketing and Product Decisions
Targeting "Good" Wines: The insights from the models can guide marketing efforts by emphasizing the importance of alcohol, sulphates, and volatile acidity as key selling points for premium wine products.
Use Model Insights for R&D: The Naive Bayes model can inform the development of wine by identifying key physicochemical traits that correlate with high-quality wines. This insight can be used to enhance product development and continuous improvement.
Data-Driven Marketing Campaigns: Wine brands can utilize these insights in marketing campaigns by highlighting scientifically-backed quality factors, appealing to consumers who value quality consistency.

Appendix

For more detailed information on the code, please access the full document by scanning the provided QR code with your mobile device's camera or a QR code scanning application.
