Pattern Recognition Assignment: Hari Narayan N.U B110490EE EEE A Batch

This document contains the results of several machine learning algorithms applied to the Haberman survival dataset:
1. A Naive Bayes classifier was trained on 153 instances with 4 attributes, achieving 77.78% accuracy on the training set and 74.51% on the 153-instance test set.
2. k-Nearest Neighbors with k=7 achieved 78.43% accuracy on 153 test instances.
3. Simple k-Means clustering with k=4 grouped 306 instances into 4 clusters based on the three non-class attributes. In the classes-to-clusters evaluation, 70.92% of instances were incorrectly clustered.

Uploaded by

HARINARAYANNU
PATTERN RECOGNITION ASSIGNMENT

Submitted by

HARI NARAYAN N.U

B110490EE
EEE A batch
TRAINING DATA
=== Run information ===

Scheme:       weka.classifiers.bayes.NaiveBayes
Relation:     haberman-weka.filters.unsupervised.instance.RemovePercentage-P50.0-V

Instances: 153
Attributes: 4
Age_of_patient_at_time_of_operation
Patients_year_of_operation
Number_of_positive_axillary_nodes_detected
Survival_status
Test mode:evaluate on training data

=== Classifier model (full training set) ===

Naive Bayes Classifier

                                               Class
Attribute                                          1        2
                                              (0.74)   (0.26)
=============================================================
Age_of_patient_at_time_of_operation
  mean                                       43.0259  45.3968
  std. dev.                                   6.1537   4.7468
  weight sum                                     114       39
  precision                                   1.0476   1.0476

Patients_year_of_operation
  58                                            12.0      6.0
  59                                            12.0      7.0
  60                                            17.0      2.0
  61                                            13.0      1.0
  62                                            10.0      4.0
  63                                            15.0      5.0
  64                                            11.0      7.0
  65                                            10.0      4.0
  66                                            10.0      5.0
  67                                             9.0      4.0
  68                                             3.0      1.0
  69                                             4.0      5.0
  [total]                                      126.0     51.0

Number_of_positive_axillary_nodes_detected
  mean                                        2.7161   7.3333
  std. dev.                                   4.8927  10.2232
  weight sum                                     114       39
  precision                                   2.3636   2.3636

Time taken to build model: 0 seconds
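
To make the model table easier to read, the sketch below shows how a Naive Bayes classifier could combine these numbers into a class probability for one patient. It is a simplified illustration, not Weka's exact computation: it uses a plain normal density for the two numeric attributes, treats the year as a nominal attribute using the printed counts, and ignores the "precision" values. The function names and the example patients are hypothetical.

```python
import math

# Parameters copied from the printed model (class 1, class 2).
PRIORS = {1: 0.74, 2: 0.26}
AGE = {1: (43.0259, 6.1537), 2: (45.3968, 4.7468)}     # (mean, std. dev.)
NODES = {1: (2.7161, 4.8927), 2: (7.3333, 10.2232)}    # (mean, std. dev.)
# Counts for two of the listed operation years, plus the [total] row.
YEAR_COUNTS = {1: {58: 12.0, 64: 11.0}, 2: {58: 6.0, 64: 7.0}}
YEAR_TOTAL = {1: 126.0, 2: 51.0}


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std. dev."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posterior(age, year, nodes):
    """Return {class: probability} for one patient (hypothetical helper)."""
    joint = {}
    for c in PRIORS:
        joint[c] = (
            PRIORS[c]
            * gaussian_pdf(age, *AGE[c])
            * YEAR_COUNTS[c][year] / YEAR_TOTAL[c]
            * gaussian_pdf(nodes, *NODES[c])
        )
    total = sum(joint.values())
    return {c: p / total for c, p in joint.items()}


if __name__ == "__main__":
    # Two made-up patients that differ only in the number of positive nodes.
    print(posterior(age=45, year=64, nodes=0))
    print(posterior(age=45, year=64, nodes=20))
```

With these parameters, a patient with no positive nodes leans towards class 1, while a patient with 20 positive nodes leans strongly towards class 2, which reflects the much larger mean and spread of the node count for class 2.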

=== Evaluation on training set ===


=== Summary ===

Correctly Classified Instances         119               77.7778 %
Incorrectly Classified Instances        34               22.2222 %
Kappa statistic                          0.2817
Mean absolute error                      0.2806
Root mean squared error                  0.403
Relative absolute error                 73.5784 %
Root relative squared error             92.4707 %
Total Number of Instances              153

=== Detailed Accuracy By Class ===

               TP Rate   FP Rate   Precision   Recall  F-Measure   ROC Area  Class
                 0.947     0.718      0.794     0.947     0.864      0.791    1
                 0.282     0.053      0.647     0.282     0.393      0.791    2
Weighted Avg.    0.778     0.548      0.757     0.778     0.744      0.791

=== Confusion Matrix ===

   a   b   <-- classified as
 108   6 |   a = 1
  28  11 |   b = 2
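
The accuracy and kappa statistic in the training summary can be recomputed from the confusion matrix alone. The following is a minimal sketch using the standard definition of Cohen's kappa; the function name is hypothetical and the code is not taken from Weka.

```python
def accuracy_and_kappa(matrix):
    """Accuracy and Cohen's kappa for a square confusion matrix.

    Rows are actual classes, columns are predicted classes.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: fraction of instances on the diagonal.
    observed = sum(matrix[i][i] for i in range(n)) / total
    # Chance agreement: sum over classes of (row total * column total).
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(n)
    ) / (total * total)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


if __name__ == "__main__":
    train_matrix = [[108, 6], [28, 11]]  # training-set confusion matrix
    acc, kappa = accuracy_and_kappa(train_matrix)
    print(f"accuracy = {acc:.4%}, kappa = {kappa:.4f}")
    # prints: accuracy = 77.7778%, kappa = 0.2817
```

The low kappa relative to the accuracy reflects the class imbalance: always predicting class 1 would already be correct for 114 of the 153 training instances.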

TEST DATA

=== Run information ===

Scheme:       weka.classifiers.bayes.NaiveBayes
Relation:     haberman-weka.filters.unsupervised.instance.RemovePercentage-P50.0-V

Instances: 153
Attributes: 4
Age_of_patient_at_time_of_operation
Patients_year_of_operation
Number_of_positive_axillary_nodes_detected
Survival_status
Test mode:user supplied test set: size unknown (reading incrementally)

=== Classifier model (full training set) ===

Naive Bayes Classifier

                                               Class
Attribute                                          1        2
                                              (0.74)   (0.26)
=============================================================
Age_of_patient_at_time_of_operation
  mean                                       43.0259  45.3968
  std. dev.                                   6.1537   4.7468
  weight sum                                     114       39
  precision                                   1.0476   1.0476

Patients_year_of_operation
  58                                            12.0      6.0
  59                                            12.0      7.0
  60                                            17.0      2.0
  61                                            13.0      1.0
  62                                            10.0      4.0
  63                                            15.0      5.0
  64                                            11.0      7.0
  65                                            10.0      4.0
  66                                            10.0      5.0
  67                                             9.0      4.0
  68                                             3.0      1.0
  69                                             4.0      5.0
  [total]                                      126.0     51.0

Number_of_positive_axillary_nodes_detected
  mean                                        2.7161   7.3333
  std. dev.                                   4.8927  10.2232
  weight sum                                     114       39
  precision                                   2.3636   2.3636

Time taken to build model: 0 seconds

=== Evaluation on test set ===

=== Summary ===

Correctly Classified Instances         114               74.5098 %
Incorrectly Classified Instances        39               25.4902 %
Kappa statistic                          0.2148
Mean absolute error                      0.306
Root mean squared error                  0.4831
Relative absolute error                 78.2813 %
Root relative squared error            108.1765 %
Total Number of Instances              153

=== Detailed Accuracy By Class ===

               TP Rate   FP Rate   Precision   Recall  F-Measure   ROC Area  Class
                 0.937     0.762      0.765     0.937     0.842      0.591    1
                 0.238     0.063      0.588     0.238     0.339      0.591    2
Weighted Avg.    0.745     0.57       0.716     0.745     0.704      0.591

=== Confusion Matrix ===

   a   b   <-- classified as
 104   7 |   a = 1
  32  10 |   b = 2
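
The per-class rows and the weighted average of the test-set table follow from the confusion matrix. The sketch below recomputes them using the usual definitions (TP rate equals recall, and the weighted average weights each class by its number of actual instances); the function names are hypothetical and ROC area is left out because it cannot be derived from the matrix.

```python
def per_class_metrics(matrix, labels):
    """Per-class TP rate, FP rate, precision and F-measure.

    Rows of `matrix` are actual classes, columns are predicted classes.
    """
    total = sum(sum(row) for row in matrix)
    rows = []
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        actual = sum(matrix[i])                    # instances of this class
        predicted = sum(row[i] for row in matrix)  # predictions of this class
        tp_rate = tp / actual
        fp_rate = (predicted - tp) / (total - actual)
        precision = tp / predicted
        f_measure = 2 * precision * tp_rate / (precision + tp_rate)
        rows.append({
            "class": label, "support": actual, "tp_rate": tp_rate,
            "fp_rate": fp_rate, "precision": precision, "f_measure": f_measure,
        })
    return rows


def weighted_average(rows, key):
    """Average of one metric, weighted by the number of actual instances."""
    total = sum(r["support"] for r in rows)
    return sum(r[key] * r["support"] for r in rows) / total


if __name__ == "__main__":
    test_matrix = [[104, 7], [32, 10]]  # test-set confusion matrix
    rows = per_class_metrics(test_matrix, labels=[1, 2])
    for r in rows:
        print(r["class"], *(f"{r[k]:.3f}" for k in
                            ("tp_rate", "fp_rate", "precision", "f_measure")))
    print("weighted", *(f"{weighted_average(rows, k):.3f}" for k in
                        ("tp_rate", "fp_rate", "precision", "f_measure")))
```

The TP rate of only 0.238 for class 2 shows that most class 2 patients in the test set are predicted as class 1, even though the overall accuracy is close to 75%.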

NEAREST NEIGHBOUR CLASSIFICATION

=== Run information ===

Scheme:       weka.classifiers.lazy.IBk -K 7 -W 0 -A "weka.core.neighboursearch.LinearNNSearch -A \"weka.core.EuclideanDistance -R first-last\""
Relation:     haberman

Instances: 306
Attributes: 4
Age_of_patient_at_time_of_operation
Patients_year_of_operation
Number_of_positive_axillary_nodes_detected
Survival_status
Test mode:user supplied test set: size unknown (reading incrementally)

=== Classifier model (full training set) ===

IB1 instance-based classifier


using 7 nearest neighbour(s) for classification

Time taken to build model: 0 seconds
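
The idea behind this classifier can be sketched in a few lines: find the 7 training instances closest to the query in Euclidean distance and take a majority vote over their labels. This is only an illustration of the technique. The training rows below are made-up (age, year, nodes) values, not records from the Haberman dataset, the function name is hypothetical, and the sketch uses raw attribute values without attempting to reproduce any preprocessing Weka may apply.

```python
import math
from collections import Counter


def knn_predict(train, query, k=7):
    """Predict a label by majority vote among the k nearest neighbours.

    `train` is a list of (features, label) pairs; `query` is a feature tuple.
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Made-up rows: (age, year of operation, positive nodes) -> survival status.
TOY_TRAIN = [
    ((30, 64, 1), 1), ((31, 65, 0), 1), ((33, 60, 0), 1),
    ((34, 59, 2), 1), ((35, 63, 1), 1), ((36, 61, 0), 1),
    ((60, 65, 20), 2), ((62, 66, 22), 2), ((63, 60, 15), 2),
    ((65, 58, 25), 2), ((66, 64, 18), 2),
]

if __name__ == "__main__":
    print(knn_predict(TOY_TRAIN, (32, 62, 1)))    # close to the class 1 rows
    print(knn_predict(TOY_TRAIN, (61, 63, 19)))   # close to the class 2 rows
```

Because nothing is learned ahead of time, "building the model" amounts to storing the training instances, which is consistent with the reported build time of 0 seconds.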

=== Evaluation on test set ===


=== Summary ===

Correctly Classified Instances         120               78.4314 %
Incorrectly Classified Instances        33               21.5686 %
Kappa statistic                          0.3589
Mean absolute error                      0.2999
Root mean squared error                  0.3919
Relative absolute error                 75.9937 %
Root relative squared error             87.808  %
Total Number of Instances              153

=== Detailed Accuracy By Class ===

               TP Rate   FP Rate   Precision   Recall  F-Measure   ROC Area  Class
                 0.946     0.643      0.795     0.946     0.864      0.801    1
                 0.357     0.054      0.714     0.357     0.476      0.801    2
Weighted Avg.    0.784     0.481      0.773     0.784     0.758      0.801

=== Confusion Matrix ===

   a   b   <-- classified as
 105   6 |   a = 1
  27  15 |   b = 2

K MEAN CLUSTERING

=== Run information ===

Scheme:       weka.clusterers.SimpleKMeans -N 4 -A "weka.core.EuclideanDistance -R first-last" -I 500 -S 10
Relation:     haberman

Instances: 306
Attributes: 4
Age_of_patient_at_time_of_operation
Patients_year_of_operation
Number_of_positive_axillary_nodes_detected
Ignored:
Survival_status
Test mode:Classes to clusters evaluation on training data
=== Model and evaluation on training set ===

kMeans
======

Number of iterations: 6
Within cluster sum of squared errors: 197.29360453534517
Missing values globally replaced with mean/mode

Cluster centroids:
                                                          Cluster#
Attribute                                     Full Data        0        1        2        3
                                                  (306)     (52)     (89)     (87)     (78)
===========================================================================================
Age_of_patient_at_time_of_operation             52.4575  56.3462   59.618  43.8506  51.2949
Patients_year_of_operation                      58, 67, 58, 63, 64 (column alignment lost in extraction)
Number_of_positive_axillary_nodes_detected       4.0261  10.1731   2.3034   4.1494   1.7564

Time taken to build model (full training data) : 0.02 seconds

=== Model and evaluation on training set ===

Clustered Instances

0       52 ( 17%)
1       89 ( 29%)
2       87 ( 28%)
3       78 ( 25%)

Class attribute: Survival_status

Classes to Clusters:

  0  1  2  3  <-- assigned to cluster
 31 67 67 60 | 1
 21 22 20 18 | 2

Cluster 0 <-- No class
Cluster 1 <-- 2
Cluster 2 <-- 1
Cluster 3 <-- No class

Incorrectly clustered instances :    217.0     70.915 %
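
The cluster-to-class assignment and the error count can be reproduced from the classes-to-clusters table. The sketch below assumes that each class may be assigned to at most one cluster and that the assignment with the fewest errors is chosen; this assumption matches the numbers reported here, but it is not taken from Weka's source code, and the function name is hypothetical.

```python
from itertools import permutations

# Rows are classes 1 and 2, columns are clusters 0..3 (classes-to-clusters table).
COUNTS = {1: [31, 67, 67, 60], 2: [21, 22, 20, 18]}


def best_mapping(counts):
    """Try every way of giving each class its own cluster; keep the best.

    Returns ({cluster: class}, number of correctly clustered instances).
    Assumes there are at least as many clusters as classes.
    """
    classes = list(counts)
    n_clusters = len(next(iter(counts.values())))
    best, best_correct = None, -1
    for clusters in permutations(range(n_clusters), len(classes)):
        correct = sum(counts[c][k] for c, k in zip(classes, clusters))
        if correct > best_correct:
            best, best_correct = dict(zip(clusters, classes)), correct
    return best, best_correct


if __name__ == "__main__":
    mapping, correct = best_mapping(COUNTS)
    total = sum(sum(row) for row in COUNTS.values())
    errors = total - correct
    print(mapping)                                 # cluster -> class
    print(errors, f"{100 * errors / total:.3f} %")
```

Under this assumption, cluster 2 is matched to class 1 (67 instances) and cluster 1 to class 2 (22 instances), leaving clusters 0 and 3 without a class. Only 89 of the 306 instances count as correct, which gives the 217 incorrectly clustered instances. The high error is largely a consequence of asking for 4 clusters when the class attribute has only 2 values.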

Sample Clusters

2.PROBLEM 1

2.PROBLEM 2
