DA Report Format
SUBMITTED BY
Keerthana D 1MS14CS053
Manoj J Shet 1MS14CS064
Prasad Hegde 1MS14CS086
Ajeya S H 1MS14CS146
SUPERVISED BY
Faculty
Parkavi.A
Department of Computer Science and Engineering
Ramaiah Institute of Technology
(Autonomous Institute, Affiliated to VTU)
Bangalore – 54
CERTIFICATE
We declare that the entire content embodied in this B.E. 7th Semester report is original and has not been copied from elsewhere.
Department of Computer Science and Engineering
Ramaiah Institute of Technology
(Autonomous Institute, Affiliated to VTU)
Bangalore – 54
Evaluation Sheet
Sl. No  USN         Name          Content and         Speaking    Teamwork  Neatness and  Effectiveness and  Total
                                  Demonstration (15)  Skills (2)  (2)       care (2)      Productivity (4)   Marks (25)
1       1MS14CS053  Keerthana D
2       1MS14CS064  Manoj J Shet
3       1MS14CS086  Prasad Hegde
4       1MS14CS146  Ajeya S H
Evaluated By
Name: Parkavi.A
Designation: Assistant Professor
Department: Computer Science & Engineering, RIT
Signature:
HOD, CSE
Table of Contents
1. Abstract
2. Introduction
3. Literature Survey
4. Algorithm
5. Implementation
6. Results
7. Conclusion
8. References
1. Abstract
A student's marks in various subjects usually reflect their interests and specializations: one tends to do well in the fields one is good at. A single domain can be covered by several subjects, and the average of the marks in those subjects can be considered a valid measure of the student's expertise in that domain. The best of these averages is taken as the student's specialization. We achieve this by applying classification algorithms; we have considered Naïve Bayes and SVM (Support Vector Machines) to evaluate specializations based on a set of training data.
2. Introduction
Data analytics, also known as analysis of data or data analysis, is a process of inspecting,
cleansing, transforming, and modeling data with the goal of discovering useful information,
suggesting conclusions, and supporting decision-making. Data analysis has multiple facets and
approaches, encompassing diverse techniques under a variety of names, in different business,
science, and social science domains. It is done with the aid of specialized systems and software.
One of the most popular and widely used data analytics tools is R, an open
source programming language and software environment for statistical computing and graphics
that is supported by the R Foundation for Statistical Computing. The R language is widely used
among statisticians and data miners for developing statistical software and data analysis. R
provides a wide variety of statistical techniques (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, etc.) and graphical techniques, and is highly extensible.
A student takes up various courses during graduation. These courses can be classified into broad categories: for example, C++, Java and Python fall under Programming, while Data Communication and Computer Networks come under Networking. One tends to do well and score higher in courses one is interested in than in those one is not. The average of the grade points in each category can therefore be considered a valid measure of the level of expertise in that domain. Each student has their own interests, and their specialization is the category with the best average.
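As a minimal sketch of the per-category averaging described above (the course names, marks and category mapping here are hypothetical, not taken from the project data), base R can compute the specialization directly:

```r
# Hypothetical marks for one student; the course-to-category mapping is illustrative
marks <- c(Cpp = 85, Java = 90, Python = 88, DataComm = 70, Networks = 72)
category <- c(Cpp = "Programming", Java = "Programming", Python = "Programming",
              DataComm = "Networking", Networks = "Networking")

# Average the marks per category, then pick the category with the best average
avgs <- tapply(marks, category[names(marks)], mean)
specialization <- names(avgs)[which.max(avgs)]
print(avgs)
print(specialization)  # "Programming"
```

The project applies the same idea per student across all rows of the marks table before handing the labels to the classifiers.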
In this project we consider the marks of students in various courses and try to find each individual's specialization. This is initially done by mathematically computing the category averages and finding the greatest of them. This data is then fed as training data into the chosen classification algorithms, Naïve Bayes and SVM, and the rest of the computation is done by the classifiers.
In machine learning, support vector machines are supervised learning models with associated learning algorithms that analyze data for classification and regression analysis. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other, making it a non-probabilistic binary linear classifier.
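The two-category setting can be illustrated with the e1071 package used later in the implementation; the data set and kernel choice below are for illustration only, not the project's data:

```r
library(e1071)

# Two-class subset of iris: setosa vs versicolor, which are linearly separable
iris2 <- droplevels(subset(iris, Species != "virginica"))

# Fit a linear SVM and check accuracy on the training data
model <- svm(Species ~ ., data = iris2, kernel = "linear")
pred  <- predict(model, iris2)
accuracy <- mean(pred == iris2$Species)
print(accuracy)
```

On this separable subset the training accuracy should be essentially perfect, which is what makes it a convenient toy example of the binary case.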
3. Literature Survey
Zhongheng Zhang [1] describes the working of the Naive Bayes classifier and a few implementations of it in R. The Naive Bayes classifier applies Bayes' theorem, combining prior knowledge with current evidence to predict a posterior probability. A model is built from a training data set, and this model is then used to predict class probabilities for test data.
The paper discusses two libraries that implement the Naive Bayes algorithm. The first is the e1071 package, whose naiveBayes() function returns an object that can be used for further prediction. The second is the caret package, which uses the train() function for training before probabilities are predicted.
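A minimal sketch of the e1071 workflow described in [1], using the standard iris data set rather than the project's data:

```r
library(e1071)

# naiveBayes() builds a model from the training data; predict() then
# returns the most probable class for each new observation
model <- naiveBayes(Species ~ ., data = iris)
pred  <- predict(model, iris)
print(mean(pred == iris$Species))  # high accuracy on this easy data set
```

The same two calls, with the project's marks data frame in place of iris, are the core of the Naive Bayes branch of the implementation.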
Anand Shanker Tewari, Tasif Sultan, Ansari Asim and Gopal Barman [2] propose a book recommendation technique based on opinion mining and the Naïve Bayes classifier to recommend top-ranked books to buyers. The paper also considers an important factor, the price of the book, during recommendation, and presents an efficient tabular method for recommending books, especially when the buyer is visiting the website for the first time. The system runs a text crawler over book reviews to extract all the popular negative and positive adjective keywords, which are then used together with price in the recommendation process.
Shih-Chung Hsu, I-Chieh Chen and Chung-Lin Huang [3] present an image classification method consisting of salient region (SR) detection, local feature extraction, and a Naive Bayes classifier based on pairwise local observations (NBPLO). Based on the discriminative pairwise local observations, a structured object model for Naive Bayes image classification is developed. The paper discusses feature extraction, bag-of-features (BoF) and other techniques that support Naïve Bayes classification. Unlike pyramid matching pursuit, this method also outperforms the conventional BoF method; however, there is still room for improvement and some problems remain to be solved.
Durgesh K. Srivastava and Lekha Bhambhu [4] deal with the working of the Support Vector Machine (SVM) for data classification, its performance, and the steps that increase SVM classification accuracy. SVM is a supervised learning method whose special property is to simultaneously minimize the empirical classification error and maximize the geometric margin. The separating hyperplane with the largest margin determines the efficiency of classification. The paper also deals with kernel function selection and model selection.
Model selection refers to the tuning of parameters that affect generalization error. A comparative study of results on different data sets with different kernel functions (linear, polynomial, sigmoid, etc.) is also carried out. The paper further discusses rough sets, a comparatively new tool for dealing with uncertain and incomplete knowledge.
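A comparative study in the spirit of [4] can be sketched in a few lines of R; the data set, default hyperparameters and training-set accuracy metric here are illustrative choices, not those of the paper:

```r
library(e1071)

# Fit one SVM per kernel and compare accuracy on the training data
kernels <- c("linear", "polynomial", "radial", "sigmoid")
acc <- sapply(kernels, function(k) {
  m <- svm(Species ~ ., data = iris, kernel = k)
  mean(predict(m, iris) == iris$Species)
})
print(acc)
```

A proper study would of course compare held-out (cross-validated) accuracy and tune the kernel parameters, as [4] discusses under model selection.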
Alexandros Karatzoglou, David Meyer and Kurt Hornik [5] first briefly explain what an SVM is, along with concepts such as regression, classification and novelty detection. They also highlight more than ten kernel functions that can be used for classification.
Other existing SVM implementations are mentioned, such as libsvm, SVMlight, SVMTorch and the MATLAB SVM Toolbox, followed by details of data sets such as Iris, Spam, Vowel and DNA. The paper briefly describes several R interfaces: ksvm in the kernlab package, which provides basic kernel functionality; svmlight in the klaR package, which includes utility functions for classification and visualization; and the svm and svmpath functions. A comparative study of the different SVM implementations on the different data sets is then carried out.
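As a small sketch of the kernlab interface surveyed in [5] (again on iris, purely for illustration):

```r
library(kernlab)

# ksvm() with a radial basis ("rbfdot") kernel, one of the many kernels
# kernlab exposes; predict() returns the predicted class labels
model <- ksvm(Species ~ ., data = iris, kernel = "rbfdot")
pred  <- predict(model, iris)
print(mean(pred == iris$Species))
```

Swapping the kernel argument (e.g. "vanilladot" for linear, "polydot" for polynomial) is all that is needed to reproduce the kind of comparison the paper performs.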
Nikhil Bajaj, Niko J. Murrell, Julie G. Whitney, Jan P. Allebach and George T.-C. Chiu [6] introduce a method for integrating expert-defined allowable confusions into SVM systems, with an example implementation in a least squares support vector machine (LS-SVM). The approach was tested on an industrial data set collected from a multi-sensor sorting application, where expert knowledge of allowable and acceptable confusion is available. A confusion-matrix-augmented performance metric was shown to have the potential to improve the combined performance of an LS-SVM based multi-class classifier when expert knowledge of acceptable misclassification or confusion is available and an appropriate performance measurement method is formulated.
4. Algorithm
5. Implementation
# (Listing resumes mid-loop: the branches for the first four categories, and
#  the enclosing for-loops over students i and courses j, are analogous.)
{
count[5]<-count[5]+1
summ[5]<-summ[5]+strtoi(marks[i,j])
average[5]<-summ[5]/count[5]
}
else if(marks[1,j]=="Core")
{
count[6]<-count[6]+1
summ[6]<-summ[6]+strtoi(marks[i,j])
average[6]<-summ[6]/count[6]
}
else if(marks[1,j]=="Compiler")
{
count[7]<-count[7]+1
summ[7]<-summ[7]+strtoi(marks[i,j])
average[7]<-summ[7]/count[7]
}
}
max<-0
index<-99
s<-""
for(k in 1:7)
{
if(average[k]>=max)
{
max<-average[k]
index<-k
s<-subject[k]
}
}
l<-c(l,as.numeric(index))
l1<-c(l1,s)
}
# Attach the computed domain to each student, as a numeric index and a factor label
marks <- transform(marks, Domain = as.numeric(l))
marks <- transform(marks, Domain1 = as.factor(l1))
# Drop the two header rows and the first column
marks <- marks[-1, ]
marks <- marks[-1, ]
marks <- marks[, -1]
# Random 80/20 train-test split
data <- sample(2, nrow(marks), replace = TRUE, prob = c(0.80, 0.20))
marks_train <- marks[data == 1, ]
marks_test <- marks[data == 2, ]
#~~~~~~~~~~~~~~~~~~~~~~~~~~SVM START~~~~~~~~~~~~~~~~~~~~~~~~~~~~
svm_pred <- predict(svm_model, testSvm$Domain)
aa <- svm_pred
# Map the predicted class letters to the numeric domain indices via a lookup
# vector (equivalent to the original element-by-element if-else chain)
domain_codes <- c(N = 1, P = 2, T = 3, R = 4, B = 5, C = 6, M = 7)
svm_pred <- as.numeric(domain_codes[as.character(svm_pred)])
#~~~~~~~~~~~~~~~~~~~~~~~~~~SVM END~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#~~~~~~~~~~~~~~~~~~~NAIVE BAYES START~~~~~~~~~~~~~~~~~~~~~~~~~~
# Same letter-to-index lookup as for the SVM predictions
domain_codes <- c(N = 1, P = 2, T = 3, R = 4, B = 5, C = 6, M = 7)
naive_pred <- as.numeric(domain_codes[as.character(naive_pred)])
# ~~~~~~~~~~~~~~~~~~~~~~NAIVE BAYES END~~~~~~~~~~~~~~~~~~~~~~~
marks_test <- transform(marks_test,SVM=as.factor(svm_pred))
marks_test <- transform(marks_test,NB=as.factor(naive_pred))
result<-subset(marks_test,select = -c(Domain1))
result <- rename(result,c("V2" = "USN"))
# Map colours inside aes() so ggplot2 draws the legend itself; base-graphics
# calls such as plot.new() and legend() do not apply to a ggplot object
p <- ggplot(result) +
  geom_point(aes(USN, SVM, colour = "SVM")) +
  geom_point(aes(USN, NB, colour = "NB")) +
  geom_point(aes(USN, Domain, colour = "Domain"))
print(p)
6. Results
The result is a plot of the specialization identified for each student in the test data. It varies on each run because the train/test split is drawn randomly.
7. Conclusion
We have achieved the objective of identifying a student's specialization by applying the Naïve Bayes and SVM classification algorithms. We can see from the results that the outputs of these two classifiers deviate from the expected output. The accuracy of SVM is found to be better than that of Naïve Bayes. The accuracy is expected to improve as the size of the training data set increases.
8. References
Literature Survey:
Web References:
• https://www.r-project.org/about.html
• http://rischanlab.github.io/SVM.html
• https://www.kaggle.com/chinki/naive-bayes-classification-for-iris-dataset
• https://www.tutorialspoint.com/r/
• https://www.w3schools.in/r/
• https://www.rstudio.com/
• https://en.wikipedia.org/wiki/R_(programming_language)
• http://dataaspirant.com/2017/02/06/naive-bayes-classifier-machine-learning/
• https://www.analyticsvidhya.com/blog/2017/09/understaing-support-vector-machine-example-code/