Bigdata Programs&Solutions

1. The document describes several R programs for analyzing big data. It includes programs to create vectors and data frames, perform operations on datasets like mtcars, and write data to CSV files.
2. The solutions section provides code snippets to implement the programs, such as creating vectors with repeating values, turning strings into factors, plotting variables, and reading/writing to CSV files.
3. Examples analyze the mtcars dataset, creating vectors from repeated characters, joining data frames, and selecting rows based on column conditions. Functions like rep(), factor(), write.csv() and read.csv() are used.

Uploaded by

Anitha Mc

Big Data Programs

Part A

(List of R programs)

1. Create vectors for the following:

a. (4, 6, 3, 4, 6, 3, ..., 4, 6, 3), where there are 10 occurrences of 4.
b. Use the function paste to create the following character vector of length 30:
("fn1", "fn2", ..., "fn30"). In this case, there is no space between fn and the number.

2. a. Turn the vector of character items "Control", "Control", "Control", "Ear Removal", "Ear Removal",
"Ear Removal", "Ear Removal", "Fake Ear Removal", "Fake Ear Removal", "Fake Ear Removal", "Fake
Ear Removal" into a factor variable and create a table from it to show the number of entries in each
treatment.
b. Create a character vector that contains 25 "a", 15 "b", and 58 "c" instances. What is
the length of this vector? Create a table from the entries.

3. a. Create three different variables, one of numeric type and two character vectors.
Use these to create a data frame of students (USN, Name, Marks).
b. Add a new numeric column (Age) to the existing data frame. Provide a summary of the data.
c. Display the list of students whose Age is less than 20 and whose Marks are greater than 25.

4. Write a program to create a CSV file for storing employee data, containing the fields
(EmpID, EmpName, DOJ, EmpCode, Dept, Desig).
a. Read the suitable number of employee details from the user.
b. Create a dataframe of Employee
c. Store the dataframe in the csv file
d. Check the difference between csv and csv2 file
e. Read the data from csv and Display the contents
f. Append a new row into the csv file

5. Dataset example
a. List the data sets available in your system using a suitable command.
b. Select the "mtcars" data set; find and display the number of rows and columns in that data
set.
c. Are there more automatic (0) or manual (1) transmission-type cars in the
dataset? Hint: the 9th column indicates the transmission type.
d. Get a scatter plot of 'hp' vs 'weight' (column 'wt').
e. Change 'am', 'cyl' and 'vs' to integer and store the new dataset as 'newmtc'.
f. Extract the cases where the number of cylinders is less than 5.

6. Consider the "airquality" dataset

a. Display the dimensions of the dataset.
b. Display the class of each field in the dataset.
c. Test for missing values.
d. Recode the missing values as the mean of the column values.
e. Exclude the missing values.

Solutions

1. Create vectors for the following

a. tmp <- c(4, 6, 3) # Create the base vector
rep(tmp, 10) # Repeat the vector 10 times, giving 10 occurrences of 4
b. paste("fn", 1:30, sep = "") # Join "fn" and each number with no separator
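
An equivalent way to write both answers is sketched below. The names `v` and `fn` are chosen only for illustration, and `paste0()` stands in for `paste(..., sep = "")`.

```r
# Part a: repeat the pattern 4, 6, 3 ten times (30 elements, ten 4s)
v <- rep(c(4, 6, 3), times = 10)
length(v)     # 30
sum(v == 4)   # 10

# Part b: paste0() concatenates its arguments with no separator
fn <- paste0("fn", 1:30)
head(fn, 3)   # "fn1" "fn2" "fn3"
```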

2. a. # Create the vector of strings

x <- c("Control", "Control", "Control", "Ear Removal", "Ear Removal",
"Ear Removal", "Ear Removal", "Fake Ear Removal", "Fake Ear Removal",
"Fake Ear Removal", "Fake Ear Removal")

# Display the vector

> x
[1] "Control" "Control" "Control" "Ear Removal"
[5] "Ear Removal" "Ear Removal" "Ear Removal" "Fake Ear Removal"
[9] "Fake Ear Removal" "Fake Ear Removal" "Fake Ear Removal"

# Construct a factor from the vector

> xfact <- factor(x)

# Display the factor

> xfact
[1] Control Control Control Ear Removal
[5] Ear Removal Ear Removal Ear Removal Fake Ear Removal
[9] Fake Ear Removal Fake Ear Removal Fake Ear Removal
Levels: Control Ear Removal Fake Ear Removal
> nlevels(xfact)
[1] 3

# Create a table showing the number of entries in each treatment

> table(xfact)
xfact
         Control      Ear Removal Fake Ear Removal 
               3                4                4 

2. b.

# Create the vector

> x <- c(rep("a", 25), rep("b", 15), rep("c", 58))
> x
[1] "a" "a" "a" "a" "a" "a" "a" "a" "a" "a" "a" "a" "a" "a" "a" "a" "a" "a" "a" "a"
[21] "a" "a" "a" "a" "a" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b"
[41] "c" "c" "c" "c" "c" "c" "c" "c" "c" "c" "c" "c" "c" "c" "c" "c" "c" "c" "c" "c"
[61] "c" "c" "c" "c" "c" "c" "c" "c" "c" "c" "c" "c" "c" "c" "c" "c" "c" "c" "c" "c"
[81] "c" "c" "c" "c" "c" "c" "c" "c" "c" "c" "c" "c" "c" "c" "c" "c" "c" "c"

# Find the length of the vector

> length(x)
[1] 98

# Construct a table from the entries

> table(x)
x
 a  b  c 
25 15 58 
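
As a compact alternative, `rep()` also accepts a vector for `times`, so the three `rep()` calls can be folded into one. The sketch below assumes nothing beyond base R; `counts` is an illustrative name.

```r
# rep() with a vector 'times' repeats each element the matching number of times
x <- rep(c("a", "b", "c"), times = c(25, 15, 58))
length(x)              # 98

counts <- table(x)     # frequency of each letter
counts[["c"]]          # 58

# The same counts as a two-column data frame (columns x and Freq)
as.data.frame(counts)
```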

3. a. Create three different variables, one of numeric type and two character vectors.
Use these to create a data frame of students (USN, Name, Marks).
b. Add a new numeric column (Age) to the existing data frame. Provide a summary of the data.
c. Display the list of students whose Age is less than 20 and whose Marks are greater than 25.

n <- as.integer(readline(prompt = "Enter no of students: ")) # Read number of students

# Declare the vectors of length n
USN <- vector(mode = "character", length = n)
Name <- vector(mode = "character", length = n)
Marks <- vector(mode = "numeric", length = n)

# Read the elements of the vectors

print("Enter USN")
for (i in seq_len(n))
  USN[i] <- readline()
print("Enter Name")
for (i in seq_len(n))
  Name[i] <- readline()
print("Enter Marks")
for (i in seq_len(n))
  Marks[i] <- as.numeric(readline())

# Construct the data frame from the vectors

student <- data.frame(USN, Name, Marks)
print("The student details are as follows")
print(student) # Display the data frame

print("Enter Age") # Read the vector of Age

Age <- vector(mode = "integer", length = n)
for (i in seq_len(n))
  Age[i] <- as.integer(readline()) # Convert the input so Age is compared as a number
student <- cbind(student, Age) # Append the vector to the data frame

print(student)
print(summary(student)) # Summary of the data (part b)

# Print students with Age < 20 and Marks > 25 (part c)
print(student[student$Age < 20 & student$Marks > 25, ])
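
For checking the logic without typing input, the same steps can be sketched with fixed data. All USN, Name, Marks and Age values below are made up for illustration.

```r
# Hypothetical sample records (not from the original exercise)
USN   <- c("U01", "U02", "U03", "U04")
Name  <- c("Asha", "Bala", "Chitra", "Dev")
Marks <- c(30, 22, 28, 35)

student <- data.frame(USN, Name, Marks, stringsAsFactors = FALSE)
student$Age <- c(19L, 18L, 21L, 19L)   # part b: add a numeric column

summary(student)                        # part b: summary of every column

# part c: logical indexing keeps rows where both conditions hold
selected <- student[student$Age < 20 & student$Marks > 25, ]
print(selected$Name)                    # "Asha" "Dev"
```

Logical indexing evaluates both conditions for every row at once, so no explicit loop over rows is needed.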

4. Write a program to create a CSV file for storing employee data, containing the fields
(EmpID, EmpName, DOJ, EmpCode, Dept, Desig).
a. Read the suitable number of employee details from the user.
b. Create a dataframe of Employee
c. Store the dataframe in the csv file
d. Read the data from csv and Display the contents
e. Append a new row into the csv file

a. n <- as.integer(readline(prompt = "Enter no of Employee: "))

EmpId <- vector(mode = "character", length = n)
EmpName <- vector(mode = "character", length = n)
DOJ <- vector(mode = "character", length = n)
EmpCode <- vector(mode = "numeric", length = n)
Desig <- vector(mode = "character", length = n)
Dept <- vector(mode = "character", length = n)

print("Enter EmpId")
for (i in seq_len(n))
  EmpId[i] <- readline()
print("Enter EmployeeName")
for (i in seq_len(n))
  EmpName[i] <- readline()
print("Enter DOJ")
for (i in seq_len(n))
  DOJ[i] <- readline()
print("Enter EmployeeCode")
for (i in seq_len(n))
  EmpCode[i] <- as.integer(readline())
print("Enter Designation")
for (i in seq_len(n))
  Desig[i] <- readline()
print("Enter Dept")
for (i in seq_len(n))
  Dept[i] <- readline()

b.
Emp <- data.frame(EmpId, EmpName, EmpCode, Desig, Dept, DOJ)

print("The Employee details are as follows")

print(Emp)

c.
write.csv(Emp, "C:/Users/ARCHANA/Documents/Empfile.csv")

d.
# Read back the file written in step c and display its contents
readEmp <- read.csv("C:/Users/ARCHANA/Documents/Empfile.csv")
print(readEmp)

e.
print("Enter a new row")
u <- readline(prompt = "EmpId: ")
n <- readline(prompt = "EmpName: ")
m <- readline(prompt = "EmpCode: ")
A <- readline(prompt = "Desig: ")
s <- readline(prompt = "Dept: ")
t <- readline(prompt = "DOJ: ")

x <- data.frame(u, n, m, A, s, t)

# Append the row without a header; row.names = TRUE keeps the columns aligned
# with the row-name column that write.csv wrote in step c
write.table(x, "C:/Users/ARCHANA/Documents/Empfile.csv", col.names = FALSE, append = TRUE,
            row.names = TRUE, quote = FALSE, sep = ",")
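
The problem list also asks for the difference between csv and csv2 files, which the solution does not show. A minimal sketch follows, assuming a made-up two-row table written to temporary files; the `Salary` column and all values are hypothetical and are included only to make the decimal mark visible.

```r
# Hypothetical employee table (values invented for illustration)
emp <- data.frame(EmpId = c("E1", "E2"),
                  EmpName = c("Ravi", "Meena"),
                  Salary = c(1200.5, 980.25),
                  stringsAsFactors = FALSE)

f1 <- tempfile(fileext = ".csv")
f2 <- tempfile(fileext = ".csv")
write.csv(emp, f1, row.names = FALSE)    # comma separator, "." decimal mark
write.csv2(emp, f2, row.names = FALSE)   # semicolon separator, "," decimal mark

readLines(f1)[2]   # "E1","Ravi",1200.5
readLines(f2)[2]   # "E1";"Ravi";1200,5

# Append one row to the comma-separated file without repeating the header
new_row <- data.frame(EmpId = "E3", EmpName = "Kiran", Salary = 1500)
write.table(new_row, f1, sep = ",", col.names = FALSE,
            row.names = FALSE, append = TRUE)

emp_back <- read.csv(f1)
nrow(emp_back)     # 3
```

Files written by `write.csv2()` should be read back with `read.csv2()`, which expects the same semicolon and decimal-comma conventions.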

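5. Dataset example

a. data() # List the data sets available
b. dim(mtcars) # Number of rows and columns
c. x <- data.frame(mtcars)
rownum <- nrow(x)
automatic <- 0
manual <- 0
# Column 9 (am) is 0 for automatic and 1 for manual
for (i in 1:rownum)
  if (x[i, 9] == 0) automatic <- automatic + 1 else manual <- manual + 1
if (automatic > manual) {
  print("There are more automatic transmission type cars")
} else {
  print("There are more manual transmission type cars")
}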
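
The same comparison can be made without a loop. The sketch below uses `table()` on the `am` column; the labels "automatic" and "manual" follow the coding given in the problem statement (0 and 1), and `counts` is an illustrative name.

```r
# am is 0 for automatic and 1 for manual transmission
counts <- table(factor(mtcars$am, levels = c(0, 1),
                       labels = c("automatic", "manual")))
print(counts)

# Name of the more frequent transmission type
names(counts)[which.max(counts)]   # "automatic" (19 automatic vs 13 manual)
```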

d. # Scatter plot of horsepower vs weight

HorsePower <- x[, 4] # hp
Weight <- x[, 6] # wt
scatter.smooth(HorsePower, Weight, span = 2/3, degree = 1, family = c("symmetric", "gaussian"))

# Plot histogram of miles per gallon

Mpg <- x[, 1]
hist(Mpg, breaks = 12, col = "lightblue", border = "pink")

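e. # Convert cyl, vs and am to integer and store the result as newmtc
newmtc <- mtcars
newmtc[, 2] <- as.integer(newmtc[, 2]) # cyl
newmtc[, 8] <- as.integer(newmtc[, 8]) # vs
newmtc[, 9] <- as.integer(newmtc[, 9]) # am

f. mtcars[mtcars$cyl < 5, ] # Rows where the number of cylinders is less than 5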
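
Parts e and f can also be written with column names rather than positions, which is one way to avoid off-by-one column mistakes. This is a sketch; `cols` and `small` are illustrative names.

```r
# Convert the three columns by name instead of by position
newmtc <- mtcars
cols <- c("cyl", "vs", "am")
newmtc[cols] <- lapply(newmtc[cols], as.integer)
sapply(newmtc[cols], class)   # all "integer"

# subset() is an alternative to bracket indexing for part f
small <- subset(newmtc, cyl < 5)
nrow(small)                   # 11 four-cylinder cars
unique(small$cyl)             # 4
```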

6. Consider the "airquality" dataset

a. df <- airquality
dim(df)
b. sapply(df, class)
c. # Test for missing values
print("The missing values per column are as follows")
x <- colSums(is.na(df))
print(x)
which(is.na(df)) # Positions of the missing values
sum(is.na(df)) # Total number of missing values
d. # Recode the missing values as the column mean
df1 <- as.data.frame(df)
for (i in 1:4)
  df1[, i] <- ifelse(is.na(df[, i]), mean(df[, i], na.rm = TRUE), df[, i])
e. # Exclude the missing values

df2 <- na.omit(df)
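
To confirm that the recoding and exclusion steps behave as intended, a short check can be sketched as follows. It loops over all columns by name rather than the first four by position; `imputed` and `complete` are illustrative names.

```r
df <- airquality

na_per_col <- colSums(is.na(df))   # Ozone and Solar.R are the only columns with NAs
print(na_per_col)

# Replace each NA with the mean of its own column
imputed <- df
for (col in names(imputed)) {
  miss <- is.na(imputed[[col]])
  imputed[[col]][miss] <- mean(imputed[[col]], na.rm = TRUE)
}
sum(is.na(imputed))                # 0

# Alternatively, drop incomplete rows
complete <- df[complete.cases(df), ]
nrow(complete)                     # 111 of the original 153 rows
```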
