R Assignment Classification of Ocean Microbes

1) The document uses decision trees, random forests, and an SVM to classify ocean microbes into five populations (crypto, nano, pico, synecho, ultra) from measured characteristics such as size and chlorophyll content. 2) The random forest reaches the highest reported accuracy (0.9203) and the SVM is nearly identical (0.9197); the decision tree's accuracy is computed but not printed. 3) Random forest feature importance (mean decrease in Gini) ranks pe and chl_small highest, followed by chl_big and fsc_small.

Uploaded by

api-302999229

R Assignment: Classification of Ocean Microbes
Chia
Load data into workspace.
library("tree")
library("ggplot2")
seaflow <- read.csv("seaflow_21min.csv", header = TRUE)
Plot pe against chl_small
##     file_id           time          cell_id            d1
##  Min.   :203.0   Min.   : 12.0   Min.   :    0   Min.   : 1328
##  1st Qu.:204.0   1st Qu.:174.0   1st Qu.: 7486   1st Qu.: 7296
##  Median :206.0   Median :362.0   Median :14995   Median :17728
##  Mean   :206.2   Mean   :341.5   Mean   :15008   Mean   :17039
##  3rd Qu.:208.0   3rd Qu.:503.0   3rd Qu.:22401   3rd Qu.:24512
##  Max.   :209.0   Max.   :643.0   Max.   :32081   Max.   :54048
##        d2          fsc_small        fsc_perp        fsc_big
##  Min.   :   32   Min.   :10005   Min.   :    0   Min.   :32384
##  1st Qu.: 9584   1st Qu.:31341   1st Qu.:13496   1st Qu.:32400
##  Median :18512   Median :35483   Median :18069   Median :32400
##  Mean   :17437   Mean   :34919   Mean   :17646   Mean   :32405
##  3rd Qu.:24656   3rd Qu.:39184   3rd Qu.:22243   3rd Qu.:32416
##  Max.   :54688   Max.   :65424   Max.   :63456   Max.   :32464
##        pe          chl_small        chl_big           pop
##  Min.   :    0   Min.   : 3485   Min.   :    0   crypto :  102
##  1st Qu.: 1635   1st Qu.:22525   1st Qu.: 2800   nano   :12698
##  Median : 2421   Median :30512   Median : 7744   pico   :20860
##  Mean   : 5325   Mean   :30164   Mean   : 8328   synecho:18146
##  3rd Qu.: 5854   3rd Qu.:38299   3rd Qu.:12880   ultra  :20537
##  Max.   :58675   Max.   :64832   Max.   :57184

Partitions

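library("caret")
## Loading required package: lattice
# Draw half of the row indices for training (times = 1, p = 0.5).
train_index <- createDataPartition(1:nrow(seaflow), times = 1, p = 0.5, list = FALSE)
trainset <- seaflow[train_index, ]
# The test set is the complement of the training rows.
testset <- seaflow[-train_index, ]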
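A held-out evaluation is only meaningful if the training and test rows are disjoint. The base R sketch below illustrates that idea on a made-up data frame; `toy`, `idx`, `train`, and `test` are illustrative names and values, not part of the assignment.

```r
# Toy stand-in for the seaflow data frame (values are made up).
set.seed(42)
toy <- data.frame(id = 1:10, pe = runif(10))

# Sample half of the row indices for training.
idx <- sample(nrow(toy), size = nrow(toy) / 2)

train <- toy[idx, ]
test  <- toy[-idx, ]   # negative index: every row NOT in idx

# The two sets share no rows and together cover the data.
shared  <- length(intersect(train$id, test$id))
covered <- nrow(train) + nrow(test) == nrow(toy)
print(shared)
print(covered)
```

Negative indexing is what guarantees the complement here; indexing both sets with the same positive vector would evaluate the model on the rows it was trained on.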
Decision Tree
library("rpart")
fol <- formula(pop ~ fsc_small + fsc_perp + fsc_big + pe + chl_big + chl_small)
model_dt <- rpart(fol, method = "class", data = trainset)
print(model_dt)
## n= 36172
##
## node), split, n, loss, yval, (yprob)
##       * denotes terminal node
##
##  1) root 36172 25756 pico (0.0014 0.17 0.29 0.25 0.28)
##    2) pe< 5001.5 26312 15954 pico (0 0.22 0.39 0 0.39)
##      4) chl_small< 32273.5 11501 2055 pico (0 0.00035 0.82 0 0.18) *
##      5) chl_small>=32273.5 14811 6703 ultra (0 0.39 0.062 0 0.55)
##       10) chl_small>=41054.5 5382 771 nano (0 0.86 0 0 0.14) *
##       11) chl_small< 41054.5 9429 2092 ultra (0 0.13 0.097 0 0.78) *
##    3) pe>=5001.5 9860 794 synecho (0.0052 0.054 0.0059 0.92 0.015)
##      6) chl_small>=37876 668 145 nano (0.076 0.78 0 0.073 0.067) *
##      7) chl_small< 37876 9192 175 synecho (0 0.0013 0.0063 0.98 0.011) *
# predict() returns one row of class probabilities per test cell
predict_dt <- predict(model_dt, newdata = testset)
# pair each row name with the name of its most probable class
prepop_dt <- cbind(rownames(predict_dt), colnames(predict_dt)[apply(predict_dt, 1, which.max)])
accuracy_dt <- sum(prepop_dt[, 2] == seaflow[prepop_dt[, 1], "pop"]) / dim(prepop_dt)[1]
## Loading required package: grid
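For a classification tree, rpart's `predict()` returns a matrix of class probabilities by default, which is why the code above takes the column with the largest value in each row. The sketch below shows the same argmax-and-compare step on a made-up probability matrix; `probs`, `truth`, and all values are illustrative assumptions.

```r
# Made-up class probabilities for four cells (rows) and three classes (columns).
probs <- matrix(
  c(0.82, 0.18, 0.00,
    0.10, 0.78, 0.12,
    0.05, 0.15, 0.80,
    0.40, 0.35, 0.25),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("1", "2", "3", "4"), c("pico", "ultra", "nano"))
)
truth <- c("pico", "ultra", "nano", "ultra")  # hypothetical true labels

# Argmax over each row, mapped back to the class name.
pred <- colnames(probs)[apply(probs, 1, which.max)]

# Accuracy is the fraction of rows where prediction and truth agree.
accuracy <- mean(pred == truth)
print(pred)
print(accuracy)
```

As an alternative, `predict(model_dt, newdata = testset, type = "class")` should return the predicted labels directly, which would make the `cbind()` and `which.max` step unnecessary.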

Random Forest
library("randomForest")
## randomForest 4.6-12
## Type rfNews() to see new features/changes/bug fixes.
model_rf <- randomForest(fol, data = trainset)
# model_rf <- randomForest(fol, data = trainset, importance = TRUE, keep.forest = TRUE)
# testing
predict_rf <- predict(model_rf, newdata = testset)
accuracy_rf <- sum(predict_rf == seaflow[labels(predict_rf), "pop"]) / length(predict_rf)
print(accuracy_rf)
## [1] 0.9202676
# Gini coef
# "The higher the number, the more the gini impurity score decreases by branching on this variable, indicating that the variable is more important"
importance(model_rf)

##           MeanDecreaseGini
## fsc_small        2705.5014
## fsc_perp         2098.5750
## fsc_big           206.2317
## pe               8802.4334
## chl_big          4923.9151
## chl_small        8113.0261
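To make the Gini comment concrete, the sketch below computes the impurity decrease for the first split of the decision tree printed earlier (pe < 5001.5), using the rounded node sizes and class proportions from that printout. `gini` is a helper defined here for illustration only. MeanDecreaseGini aggregates node-impurity decreases over every split on a variable across the forest, so the value below is not on the same scale as the table above.

```r
# Gini impurity of a node given its class proportions.
gini <- function(p) 1 - sum(p^2)

# Rounded values read off the printed rpart tree (root and its two children).
n_root  <- 36172
p_root  <- c(0.0014, 0.17, 0.29, 0.25, 0.28)
n_left  <- 26312   # pe <  5001.5
p_left  <- c(0, 0.22, 0.39, 0, 0.39)
n_right <- 9860    # pe >= 5001.5
p_right <- c(0.0052, 0.054, 0.0059, 0.92, 0.015)

# Impurity after the split is the size-weighted mean of the child impurities.
after <- (n_left * gini(p_left) + n_right * gini(p_right)) / n_root
decrease <- gini(p_root) - after
print(round(decrease, 2))  # 0.23
```

Most of the decrease comes from the right-hand child, which is about 92% synecho after the split.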
SVM
library("e1071")
# SVM
# model <- svm(fol, data = trainingdata)
model_svm <- svm(fol, data = trainset)
# testing
predict_svm <- predict(model_svm, newdata = testset)
accuracy_svm <- sum(predict_svm == seaflow[labels(predict_svm), "pop"]) / length(predict_svm)
print(accuracy_svm)
## [1] 0.919687
Confusion Matrices
Decision tree (it never predicts crypto, so that row is absent):
##          true
## pred      crypto  nano  pico synecho ultra
##   nano        51  5204     1      38   870
##   pico         0     1  9491       0  1978
##   synecho      0     9    34    9042    94
##   ultra        0  1154   918       0  7286
Random forest:
##          true
## pred      crypto  nano  pico synecho ultra
##   crypto      51     2     0       1     0
##   nano         0  5568     0       1   331
##   pico         0     0 10081       0  1383
##   synecho      0     3     3    9077     4
##   ultra        0   795   360       1  8510
SVM:
##          true
## pred      crypto  nano  pico synecho ultra
##   crypto      49     1     0       2     0
##   nano         1  5650     0       1   369
##   pico         0     0 10036      28  1367
##   synecho      1     5    55    9047     8
##   ultra        0   712   353       2  8484
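The code that produced these matrices is not shown in the document. A plausible sketch, assuming they came from base R's `table()` with the `pred` and `true` dimension names seen in the output, is given below; the label vectors are made up.

```r
# Hypothetical predicted and true labels for six cells.
lv <- c("nano", "pico", "ultra")
pred <- factor(c("nano", "pico", "pico", "ultra", "nano", "ultra"), levels = lv)
true <- factor(c("nano", "pico", "ultra", "ultra", "nano", "pico"), levels = lv)

# Rows are predictions, columns are true classes, as in the output above.
cm <- table(pred = pred, true = true)
print(cm)

# Correct predictions sit on the diagonal.
accuracy <- sum(diag(cm)) / sum(cm)
print(accuracy)
```

With the fitted models, the equivalent call would presumably look like `table(pred = predict_rf, true = testset$pop)`.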

## Confusion Matrix and Statistics
##
##           Reference
## Prediction crypto  nano  pico synecho ultra
##    crypto      49     1     0       2     0
##    nano         1  5650     0       1   369
##    pico         0     0 10036      28  1367
##    synecho      1     5    55    9047     8
##    ultra        0   712   353       2  8484
##
## Overall Statistics
##
##                Accuracy : 0.9197
##                  95% CI : (0.9168, 0.9225)
##     No Information Rate : 0.2887
##     P-Value [Acc > NIR] : < 2.2e-16
##
##                   Kappa : 0.8917
##  Mcnemar's Test P-Value : NA
##
## Statistics by Class:
##
##                      Class: crypto Class: nano Class: pico Class: synecho
## Sensitivity               0.960784      0.8872      0.9609         0.9964
## Specificity               0.999917      0.9876      0.9458         0.9975
## Pos Pred Value            0.942308      0.9384      0.8780         0.9924
## Neg Pred Value            0.999945      0.9762      0.9835         0.9988
## Prevalence                0.001410      0.1761      0.2887         0.2510
## Detection Rate            0.001355      0.1562      0.2775         0.2501
## Detection Prevalence      0.001438      0.1665      0.3160         0.2520
## Balanced Accuracy         0.980351      0.9374      0.9534         0.9969
##                      Class: ultra
## Sensitivity                0.8295
## Specificity                0.9589
## Pos Pred Value             0.8883
## Neg Pred Value             0.9345
## Prevalence                 0.2828
## Detection Rate             0.2346
## Detection Prevalence       0.2641
## Balanced Accuracy          0.8942
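The overall accuracy and the per-class sensitivity and positive predictive value reported above can be re-derived from the confusion matrix alone. The sketch below re-enters the counts by hand, so treat them as a transcription, and uses base R only; the variable names are illustrative.

```r
classes <- c("crypto", "nano", "pico", "synecho", "ultra")

# Rows: predicted class; columns: reference (true) class.
cm <- matrix(
  c(49,    1,     0,    2,    0,
     1, 5650,     0,    1,  369,
     0,    0, 10036,   28, 1367,
     1,    5,    55, 9047,    8,
     0,  712,   353,    2, 8484),
  nrow = 5, byrow = TRUE,
  dimnames = list(Prediction = classes, Reference = classes)
)

accuracy    <- sum(diag(cm)) / sum(cm)   # correct / total
sensitivity <- diag(cm) / colSums(cm)    # recall per true class
ppv         <- diag(cm) / rowSums(cm)    # precision per predicted class

print(round(accuracy, 4))  # 0.9197
print(round(sensitivity, 4))
print(round(ppv, 4))
```

The ultra class has the lowest sensitivity because 1367 true ultra cells are predicted as pico, the largest single off-diagonal count.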
Discrete Variable?
