R-Programming Lab Manual

The R Programming Lab Manual for B.Tech 5th AIML includes a series of experiments focused on data manipulation, visualization, and statistical analysis using R. Students will learn to import and clean data, perform data wrangling, create visualizations, and conduct statistical analyses including hypothesis testing. Each experiment provides practical examples and code snippets to facilitate hands-on learning.

Uploaded by

yeeshandas


B.TECH 5th AIML

R Programming Lab Manual

Columns: Sr. No. / Name of Experiment / Date of Exp. / Date of Submission / Remark

1. Importing and cleaning data: In this experiment, students will learn how to import data from a variety of sources, such as CSV files, Excel files, and databases. They will also learn how to clean data by removing missing values, outliers, and duplicate rows.

2. Data wrangling: In this experiment, students will learn how to transform data by changing the data types, merging data sets, and creating new variables. They will also learn how to explore data by using statistical methods such as descriptive statistics and hypothesis testing.

3. Data visualization: In this experiment, students will learn how to create effective data visualizations using R. They will learn how to choose the right type of plot for the data, how to customize plots, and how to save plots.

4. Statistical analysis: In this experiment, students will learn how to conduct descriptive and inferential statistical analysis using R. They will learn how to calculate descriptive statistics, such as mean, median, and standard deviation. They will also learn how to conduct hypothesis testing to determine if there is a statistically significant difference between two groups.

5. Machine learning: In this experiment, students will learn how to apply machine learning algorithms to solve real-world problems. They will learn how to train and evaluate machine learning models, and how to use machine learning models to make predictions.

6. Design an experiment to determine the effect of different types of fertilizer on plant growth. This experiment allows students to explore the factors that affect plant growth.

7. This experiment allows students to explore the relationship between food and energy.

8. Design an experiment to determine the effect of different types of light on the growth of plants. This experiment allows students to explore the role of light in plant growth.

9. Design an experiment to determine the effect of different types of soil on the growth of plants. This experiment allows students to explore the role of soil in plant growth.

Computer Science & Engineering Department

RSR Rungta College of Engineering & Technology Bhilai


Experiment 1:

Importing and cleaning data

In this experiment, students will learn how to import data from a variety of sources, such
as CSV files, Excel files, and databases. They will also learn how to clean data by removing
missing values, outliers, and duplicate rows.

Importing Data from CSV Files

CSV files are commonly used for storing data, and they can be easily imported into R using
read_csv() from the readr package or the base R function read.csv().

Importing the 'readr' library for CSV import

library(readr)

Importing a CSV file

data_csv <- read_csv("path_to_file.csv")

Displaying the first few rows of the dataset

head(data_csv)

Alternatively, you can use the base R function read.csv():

data_csv <- read.csv("path_to_file.csv")

Display the first few rows of the dataset

head(data_csv)
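To try the import step without an existing file, the sketch below writes a small data frame to a temporary CSV and reads it back with base R. The column names, the values, and the na.strings choice are assumptions made for illustration; they are not part of the manual's dataset.

```r
# Hypothetical example data; the columns are assumptions for illustration
students <- data.frame(
  id = 1:4,
  name = c("Asha", "Ravi", "Meena", "John"),
  score = c(85, NA, 90, 88)
)

# Write to a temporary file so the example leaves nothing behind
csv_path <- tempfile(fileext = ".csv")
write.csv(students, csv_path, row.names = FALSE)

# Read it back; treat empty strings and "NA" as missing values
data_csv <- read.csv(csv_path, na.strings = c("", "NA"), stringsAsFactors = FALSE)

str(data_csv)         # inspect the column types
sum(is.na(data_csv))  # count the missing cells
```

Checking str() right after an import is a cheap way to catch numeric columns that were read as text.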

Importing Data from Excel Files

To import Excel files in R, you will need the readxl or openxlsx package.


Importing the 'readxl' library

library(readxl)

Importing an Excel file

data_excel <- read_excel("path_to_file.xlsx", sheet = 1)

Displaying the first few rows of the dataset

head(data_excel)

Alternatively, you can use the openxlsx package for more advanced Excel file manipulation:

library(openxlsx)

Importing data from an Excel file

data_excel <- read.xlsx("path_to_file.xlsx", sheet = 1)

Displaying the first few rows of the dataset

head(data_excel)

Importing Data from a Database (e.g., MySQL, SQLite)

To import data from a database, you can use the DBI and RMySQL (or RSQLite for SQLite
databases) packages.

Installing and loading necessary libraries

install.packages("DBI")

install.packages("RMySQL")


library(DBI)

library(RMySQL)

Connecting to a MySQL database

con <- dbConnect(RMySQL::MySQL(), dbname = "your_database_name", host = "localhost",

user = "your_username", password = "your_password")

Querying data from a table

data_db <- dbGetQuery(con, "SELECT * FROM your_table_name")

Display the first few rows of the dataset

head(data_db)

Close the connection

dbDisconnect(con)
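The same DBI calls work with SQLite, which needs no server. The following is a minimal sketch, assuming the RSQLite package is installed; the table name scores and its contents are hypothetical and exist only for this illustration.

```r
library(DBI)

# In-memory SQLite database: nothing is written to disk (requires RSQLite)
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical table used only for this illustration
scores <- data.frame(id = 1:4, score = c(85, 90, 87, 88))
dbWriteTable(con, "scores", scores)

# Parameterised query: the ? placeholder is filled from params
data_db <- dbGetQuery(con, "SELECT id, score FROM scores WHERE score > ?",
                      params = list(86))
print(data_db)

# Always release the connection when finished
dbDisconnect(con)
```

Passing values through params rather than pasting them into the SQL string avoids quoting mistakes.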

Cleaning the Data

Handling Missing Values

Handling missing values is crucial to ensure that the analysis is not biased or incomplete. There
are various strategies for dealing with missing values, such as removing or imputing them.

Checking for missing values in the dataset

sum(is.na(data_csv))


Option 1: Remove rows with any missing values

data_no_missing <- na.omit(data_csv)

Option 2: Impute missing values (e.g., using the mean or median)

data_imputed <- data_csv

data_imputed$column_name[is.na(data_imputed$column_name)] <-
mean(data_imputed$column_name, na.rm = TRUE)

Alternatively, for median imputation:

data_imputed$column_name[is.na(data_imputed$column_name)] <-
median(data_imputed$column_name, na.rm = TRUE)
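The difference between mean and median imputation is easiest to see on a small column that contains an extreme value. This is a sketch on made-up numbers; the data frame df and its score column are assumptions, not the manual's data.

```r
# Hypothetical scores with two missing values and one extreme value
df <- data.frame(score = c(50, 60, NA, 70, 200, NA))

mean_value   <- mean(df$score, na.rm = TRUE)    # (50 + 60 + 70 + 200) / 4 = 95
median_value <- median(df$score, na.rm = TRUE)  # midpoint of 60 and 70 = 65

# Keep the original column and add imputed copies for comparison
df$score_mean   <- ifelse(is.na(df$score), mean_value, df$score)
df$score_median <- ifelse(is.na(df$score), median_value, df$score)

print(df)
```

Here the single value 200 pulls the mean up to 95, while the median stays at 65, which is why median imputation is often preferred for skewed columns.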

Removing Duplicate Rows :

Checking for duplicate rows

duplicates <- duplicated(data_csv)

sum(duplicates)  # This will show the number of duplicated rows

Removing duplicate rows

data_no_duplicates <- data_csv[!duplicated(data_csv), ]

Detecting and Handling Outliers :

Calculating the IQR

Q1 <- quantile(data_csv$column_name, 0.25, na.rm = TRUE)

Q3 <- quantile(data_csv$column_name, 0.75, na.rm = TRUE)


IQR <- Q3 - Q1

Defining the lower and upper bounds for outliers

lower_bound <- Q1 - 1.5 * IQR

upper_bound <- Q3 + 1.5 * IQR

Identifying outliers

outliers <- data_csv$column_name < lower_bound | data_csv$column_name > upper_bound

sum(outliers)  # Number of outliers

Removing outliers

data_no_outliers <- data_csv[!outliers, ]
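The 1.5 * IQR rule can be checked by hand on a short vector. The values below are hypothetical; quantile() is used with its default interpolation method.

```r
# Hypothetical measurements with one obvious extreme value
values <- c(10, 12, 11, 13, 12, 100)

q1 <- quantile(values, 0.25, names = FALSE)  # 11.25
q3 <- quantile(values, 0.75, names = FALSE)  # 12.75
iqr_value <- q3 - q1                         # 1.5

lower_bound <- q1 - 1.5 * iqr_value          # 9
upper_bound <- q3 + 1.5 * iqr_value          # 15

# Flag values outside the fences
is_outlier <- values < lower_bound | values > upper_bound
values[is_outlier]   # 100
values[!is_outlier]  # the remaining five values
```

The variable is named iqr_value here so that it does not mask the built-in IQR() function.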

Saving the Cleaned Data :

Saving the cleaned data to a CSV file

write.csv(data_no_duplicates, "cleaned_data.csv", row.names = FALSE)

Saving the cleaned data to an Excel file

library(openxlsx)

write.xlsx(data_no_duplicates, "cleaned_data.xlsx")


Experiment 2:

Data wrangling
In this experiment, students will learn how to transform data by changing the data types,
merging data sets, and creating new variables. They will also learn how to explore data by
using statistical methods such as descriptive statistics and hypothesis testing.

Data Transformation

1. Changing Data Types

Sometimes, the data types of your variables might need to be changed for effective analysis. In
R, you can use functions like as.numeric(), as.character(), and as.factor() to change data types.

Example dataset

data <- data.frame(

ID = c(1, 2, 3, 4),

Date = c('2024-01-01', '2024-02-01', '2024-03-01', '2024-04-01'),

Score = c('85', '90', '87', '88')

)

Changing 'Score' from character to numeric

data$Score <- as.numeric(data$Score)

Changing 'Date' from character to Date type

data$Date <- as.Date(data$Date)


Changing 'ID' to factor

data$ID <- as.factor(data$ID)

Viewing the data types

str(data)
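Two conversion pitfalls are worth seeing once: text that is not a number silently becomes NA, and converting a factor directly to numeric returns its internal level codes. The vectors below are hypothetical examples.

```r
# Non-numeric text becomes NA (with a warning) instead of raising an error
raw_scores <- c("85", "90", "n/a", "88")
scores <- suppressWarnings(as.numeric(raw_scores))
scores                # 85 90 NA 88
which(is.na(scores))  # position 3 failed to convert

# A factor stores integer codes, so convert via character to recover the labels
f <- factor(c("10", "20", "10"))
as.numeric(f)                # 1 2 1 (level codes, usually not what is wanted)
as.numeric(as.character(f))  # 10 20 10
```

Counting NA values before and after a conversion shows how many entries failed to parse.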

2. Merging Datasets :

Example dataframes to merge

df1 <- data.frame(ID = c(1, 2, 3, 4), Name = c("Alice", "Bob", "Charlie", "David"))

df2 <- data.frame(ID = c(1, 2, 3, 5), Score = c(85, 90, 87, 88))

Merging data on the 'ID' column (inner join, the default)

merged_data <- merge(df1, df2, by = "ID", all = FALSE)  # all = FALSE means inner join

Viewing the merged data

print(merged_data)
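The other join types are selected with the all.x, all.y, and all arguments of merge(). The sketch below reuses the two data frames defined above; the result names are chosen only for this illustration.

```r
df1 <- data.frame(ID = c(1, 2, 3, 4), Name = c("Alice", "Bob", "Charlie", "David"))
df2 <- data.frame(ID = c(1, 2, 3, 5), Score = c(85, 90, 87, 88))

m_inner <- merge(df1, df2, by = "ID")                # IDs 1, 2, 3
m_left  <- merge(df1, df2, by = "ID", all.x = TRUE)  # adds ID 4 with Score NA
m_right <- merge(df1, df2, by = "ID", all.y = TRUE)  # adds ID 5 with Name NA
m_full  <- merge(df1, df2, by = "ID", all = TRUE)    # IDs 1 to 5

# Row counts per join type: 3, 4, 4, 5
sapply(list(inner = m_inner, left = m_left, right = m_right, full = m_full), nrow)
```

Comparing row counts before and after a merge is a quick check that no records were dropped or duplicated unexpectedly.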

3. Creating New Variables :

Creating a new variable 'TotalScore' by adding 10 to each Score

data$TotalScore <- data$Score + 10  # Adding 10 to each Score


Creating a new categorical variable based on conditions

data$Performance <- ifelse(data$Score > 90, "High", "Low")

Viewing the updated dataset

head(data)

Data Exploration with Statistical Methods :

1. Descriptive Statistics

Descriptive statistics help summarize the main characteristics of a dataset. In R, you can use
functions like summary(), mean(), median(), sd(), and table() to explore data.

Summary of the data

summary(data)

Calculating mean and standard deviation of 'Score'

mean_score <- mean(data$Score)

sd_score <- sd(data$Score)

Median of 'Score'

median_score <- median(data$Score)

Frequency table of 'Performance'

table(data$Performance)


2. Visualizing Data (Descriptive Exploration) :

Basic histogram of 'Score'

hist(data$Score, main = "Histogram of Scores", xlab = "Score", col = "lightblue", border =

"black")

Boxplot of 'Score' to detect outliers

boxplot(data$Score, main = "Boxplot of Scores", ylab = "Score", col = "lightgreen")

Bar plot for 'Performance' category

barplot(table(data$Performance), main = "Performance Distribution", col = c("blue", "red"))

If you are using the ggplot2 package for visualization:

Install and load ggplot2 package

install.packages("ggplot2")

library(ggplot2)

Scatter plot of Score vs TotalScore

ggplot(data, aes(x = Score, y = TotalScore)) +

geom_point() +

ggtitle("Score vs TotalScore") +

xlab("Score") +

ylab("Total Score")


Experiment 3:

Data visualization
In this experiment, students will learn how to create effective data visualizations using R.
They will learn how to choose the right type of plot for the data, how to customize plots,
and how to save plots.

Step 1: Installing and Loading Required Libraries

To get started with data visualization in R, we'll use two primary tools:

• Base R plotting functions (e.g., plot(), hist(), boxplot()), which need no installation.

• ggplot2: A powerful and flexible package for creating visually appealing plots.

Install ggplot2 if not already installed

install.packages("ggplot2")

Load ggplot2 library

library(ggplot2)

Creating Basic Plots in R

1. Histogram (for Distribution of a Single Variable)

Creating a histogram using Base R

data <- c(85, 90, 87, 88, 92, 95, 91, 89, 88, 86)

Basic histogram in Base R

hist(data, main = "Histogram of Scores", xlab = "Scores", col = "lightblue", border = "black")


Histogram with ggplot2

ggplot(data = data.frame(Scores = data), aes(x = Scores)) +

geom_histogram(binwidth = 2, fill = "lightblue", color = "black", alpha = 0.7) +

ggtitle("Histogram of Scores") +

xlab("Scores") +

ylab("Frequency")

Box Plot (for Distribution and Outliers) :

Creating a box plot using Base R

boxplot(data, main = "Boxplot of Scores", ylab = "Scores", col = "lightgreen")

Box plot using ggplot2

ggplot(data = data.frame(Scores = data), aes(y = Scores)) +

geom_boxplot(fill = "lightgreen", color = "black") +

ggtitle("Boxplot of Scores") +

ylab("Scores")

Customizing Plots :

1. Customizing Base R Plots

Customizing a histogram


hist(data, main = "Customized Histogram of Scores", xlab = "Scores", col = "lightblue",

border = "black", breaks = 5)

Adding gridlines and titles

x <- 1:10  # example data, assumed here because the manual does not define x and y

y <- x^2

plot(x, y, main = "Customized Scatter Plot", xlab = "X Values", ylab = "Y Values", pch = 19,
col = "blue")

grid()

Saving Plots :

Plots can be saved to a file in various formats such as PNG, JPEG, or PDF using the ggsave()
function or base R functions like png(), jpeg(), or pdf().

Saving as PNG

png("scatter_plot.png")

plot(x, y, main = "Scatter Plot", xlab = "X", ylab = "Y", pch = 19, col = "blue")

dev.off()  # Don't forget to turn off the device

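Saving as PDF

time <- 1:10  # example data, assumed here because the manual does not define time and value

value <- time * 2

pdf("line_plot.pdf")

plot(time, value, type = "o", main = "Line Plot Example", xlab = "Time", ylab = "Value", col = "blue")

dev.off()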
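For ggplot2 plots, ggsave() handles the device opening and closing in one call. This is a minimal sketch, assuming ggplot2 is installed; it reuses the score values from the histogram example, and the file name and dimensions are arbitrary choices.

```r
library(ggplot2)

# Score values from the histogram example, wrapped in a data frame
scores <- data.frame(Scores = c(85, 90, 87, 88, 92, 95, 91, 89, 88, 86))

p <- ggplot(scores, aes(x = Scores)) +
  geom_histogram(binwidth = 2, fill = "lightblue", color = "black") +
  ggtitle("Histogram of Scores")

# The file extension selects the format; width and height are in inches by default
out_file <- file.path(tempdir(), "histogram_scores.pdf")
ggsave(out_file, plot = p, width = 6, height = 4)

file.exists(out_file)
```

Passing the plot object explicitly through the plot argument avoids relying on whichever plot was displayed last.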


Experiment 4:

Statistical analysis
In this experiment, students will learn how to conduct descriptive and inferential statistical
analysis using R. They will learn how to calculate descriptive statistics, such as mean,
median, and standard deviation. They will also learn how to conduct hypothesis testing to
determine if there is a statistically significant difference between two groups.

Calculating Descriptive Statistics

Descriptive statistics include measures of central tendency (mean, median), dispersion (standard
deviation, variance), and shape (skewness, kurtosis).

Example data

data <- c(23, 45, 56, 67, 45, 23, 56, 78, 90, 34, 56, 45)

Mean

mean_data <- mean(data)

cat("Mean:", mean_data, "\n")

Median

median_data <- median(data)

cat("Median:", median_data, "\n")

Standard Deviation

sd_data <- sd(data)

cat("Standard Deviation:", sd_data, "\n")


Variance

variance_data <- var(data)

cat("Variance:", variance_data, "\n")

Minimum and Maximum values

min_data <- min(data)

max_data <- max(data)

cat("Min:", min_data, "Max:", max_data, "\n")

Summary (gives min, 1st quartile, median, mean, 3rd quartile, max)

summary_data <- summary(data)

print(summary_data)  # print() keeps the Min., 1st Qu., Median, Mean, 3rd Qu., Max. labels
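Base R has no built-in skewness or kurtosis function, so the shape measures mentioned above can be computed from central moments. The sketch below uses the simple moment-based (population) definitions on the same example data; packages such as e1071 provide skewness() and kurtosis() with other estimator variants.

```r
data <- c(23, 45, 56, 67, 45, 23, 56, 78, 90, 34, 56, 45)

centered <- data - mean(data)
m2 <- mean(centered^2)  # second central moment (population variance)
m3 <- mean(centered^3)  # third central moment
m4 <- mean(centered^4)  # fourth central moment

skewness <- m3 / m2^(3 / 2)  # 0 for a perfectly symmetric sample
kurtosis <- m4 / m2^2        # about 3 for normally distributed data

cat("Skewness:", skewness, "\n")
cat("Kurtosis:", kurtosis, "\n")
```

A positive skewness indicates a longer right tail; a negative value indicates a longer left tail.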


Inferential Statistics :

One-Sample t-Test


A one-sample t-test is used to determine if the sample mean is significantly different from a
known value (typically the population mean).

One-sample t-test to test if the mean is different from 50

t_test_one_sample <- t.test(data, mu = 50)

cat("One-Sample t-Test Results:\n")

print(t_test_one_sample)
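To compare two groups, as described in the introduction to this experiment, t.test() accepts two vectors. The following sketch uses hypothetical scores for two independent groups and the conventional 5% significance level.

```r
# Hypothetical scores for two independent groups
group_a <- c(78, 85, 90, 72, 88, 95, 81, 79)
group_b <- c(70, 65, 80, 75, 68, 72, 77, 69)

# Welch's t-test is the default (var.equal = FALSE)
t_test_two_sample <- t.test(group_a, group_b)
print(t_test_two_sample)

# Conventional decision rule at the 5% level
if (t_test_two_sample$p.value < 0.05) {
  cat("Reject H0: the group means differ\n")
} else {
  cat("Fail to reject H0\n")
}
```

Welch's version does not assume equal variances in the two groups; set var.equal = TRUE only when that assumption is justified.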

Chi-Square Test for Independence

A chi-square test is used to determine whether there is an association between two categorical
variables.

Contingency table for gender and smoking status

smoking_data <- data.frame(

Gender = c("Male", "Female"),

Non_Smoker = c(40, 60),

Smoker = c(10, 20)

)

Perform the Chi-Square test

chisq_test <- chisq.test(smoking_data[, -1])

cat("Chi-Square Test Results:\n")

print(chisq_test)
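The test compares the observed counts with the counts expected under independence (row total times column total, divided by the grand total). The sketch below stores the same counts in a labelled matrix and inspects those expected values; the object name smoking_table is chosen for this illustration.

```r
# Same counts as above; matrix() fills column by column
smoking_table <- matrix(
  c(40, 60, 10, 20), nrow = 2,
  dimnames = list(Gender = c("Male", "Female"),
                  Status = c("Non_Smoker", "Smoker"))
)

chisq_result <- chisq.test(smoking_table)

chisq_result$expected  # e.g. Male / Non_Smoker = 50 * 100 / 130
chisq_result$p.value
```

For 2 x 2 tables chisq.test() applies Yates' continuity correction by default; pass correct = FALSE to obtain the uncorrected statistic.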


Experiment 5:

Machine learning

In this experiment, students will learn how to apply machine learning algorithms to solve
real-world problems. They will learn how to train and evaluate machine learning models,
and how to use machine learning models to make predictions.

1. Setting Up the Environment

Install necessary libraries:

install.packages(c("caret", "randomForest", "e1071", "ggplot2"))

library(caret)

library(randomForest)

library(e1071)

library(ggplot2)

2. Understanding the Data

Students will begin by loading a dataset and performing basic exploration.

Example: Using the `iris` dataset:

data(iris)

str(iris)  # Check the structure of the data

summary(iris)  # Summary statistics of the dataset

3. Data Preprocessing

Clean the data by checking for missing values and normalizing or scaling if necessary.

sum(is.na(iris))  # Check for missing values
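Scaling can be done with the base R function scale(), which centres each column on its mean and divides by its standard deviation. The sketch below standardises the four numeric iris columns; the object names are assumptions for illustration.

```r
data(iris)

# Standardise the four numeric feature columns: (x - mean) / sd
features <- iris[, 1:4]
scaled <- scale(features)

round(colMeans(scaled), 10)  # all approximately 0
apply(scaled, 2, sd)         # all 1

# Reattach the class label to obtain a modelling-ready data frame
iris_scaled <- data.frame(scaled, Species = iris$Species)
head(iris_scaled)
```

Tree-based models such as random forests are insensitive to feature scale, so this step matters mainly for distance-based or gradient-based methods. In a real workflow the means and standard deviations would normally be computed on the training set only.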

4. Splitting the Data

Split the dataset into training and testing sets (typically 80% training and 20% testing).

set.seed(123)

trainIndex <- createDataPartition(iris$Species, p = 0.8, list = FALSE)

trainData <- iris[trainIndex, ]

testData <- iris[-trainIndex, ]

5. Training a Model

Example: Using the `randomForest` model to train the data.

model_rf <- randomForest(Species ~ ., data = trainData)

print(model_rf)  # Print model summary

6. Evaluating the Model

predictions <- predict(model_rf, newdata = testData)

confusionMatrix(predictions, testData$Species)
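The accuracy reported by confusionMatrix() is the share of predictions on the diagonal of the confusion table. The sketch below reproduces that calculation in base R on six hypothetical labels, so it runs without a trained model.

```r
# Hypothetical true and predicted labels (not output from the model above)
actual <- factor(c("setosa", "setosa", "versicolor",
                   "versicolor", "virginica", "virginica"))
predicted <- factor(c("setosa", "setosa", "versicolor",
                      "virginica", "virginica", "virginica"),
                    levels = levels(actual))

# Rows are predictions, columns are the true classes
conf_table <- table(Predicted = predicted, Actual = actual)
print(conf_table)

# Correct predictions lie on the diagonal: 5 of 6 here
accuracy <- sum(diag(conf_table)) / sum(conf_table)
cat("Accuracy:", accuracy, "\n")
```

Giving both factors the same levels keeps the table square even when a class is never predicted.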

7. Making Predictions

new_data <- data.frame(Sepal.Length = 5.1, Sepal.Width = 3.5, Petal.Length = 1.4,

Petal.Width = 0.2)

prediction <- predict(model_rf, new_data)


print(prediction)

8. Model Tuning (Optional)

tune_rf <- train(Species ~ ., data = trainData, method = "rf",

trControl = trainControl(method = "cv", number = 10))

print(tune_rf)

9. Visualizing the Results

varImpPlot(model_rf)  # Plot variable importance


Experiment 6:

Design an experiment to determine the effect of different types of fertilizer on plant
growth. This experiment allows students to explore the factors that affect plant growth.

Step 1: Set up the Environment and Simulate Data

Load necessary libraries

library(ggplot2)
library(dplyr)

Set seed for reproducibility

set.seed(123)

Create a dataset for plant growth simulation (4 weeks of data)

weeks <- rep(1:4, times = 3)  # 4 weeks, repeated for 3 fertilizer types
fertilizer_type <- rep(c("Organic", "Inorganic", "Control"), each = 4)  # Fertilizer types
growth_data <- data.frame(Week = weeks,
Fertilizer = fertilizer_type,
Height = numeric(12),
Leaves = numeric(12))

Simulate plant height and leaf number based on fertilizer type

growth_data$Height <- ifelse(growth_data$Fertilizer == "Organic",
rnorm(12, mean = 20 + growth_data$Week * 5, sd = 2),
ifelse(growth_data$Fertilizer == "Inorganic",
rnorm(12, mean = 25 + growth_data$Week * 6, sd = 2),
rnorm(12, mean = 15 + growth_data$Week * 3, sd = 2)))

growth_data$Leaves <- ifelse(growth_data$Fertilizer == "Organic",

rnorm(12, mean = 10 + growth_data$Week * 3, sd = 1),
ifelse(growth_data$Fertilizer == "Inorganic",
rnorm(12, mean = 12 + growth_data$Week * 4, sd = 1),
rnorm(12, mean = 8 + growth_data$Week * 2, sd = 1)))

View simulated data

head(growth_data)


Step 2: Visualize the Growth Data

ggplot(growth_data, aes(x = Week, y = Height, color = Fertilizer)) +
geom_line() +
geom_point() +
labs(title = "Plant Height Over Time by Fertilizer Type", x = "Week", y = "Plant Height (cm)") +
theme_minimal()

Visualize number of leaves by fertilizer type over time

ggplot(growth_data, aes(x = Week, y = Leaves, color = Fertilizer)) +
geom_line() +
geom_point() +
labs(title = "Number of Leaves Over Time by Fertilizer Type", x = "Week", y = "Number of Leaves") +
theme_minimal()

Step 3: Statistical Analysis (ANOVA Test)

ANOVA for Plant Height

anova_height <- aov(Height ~ Fertilizer + Week + Fertilizer:Week, data = growth_data)
summary(anova_height)

ANOVA for Number of Leaves

anova_leaves <- aov(Leaves ~ Fertilizer + Week + Fertilizer:Week, data = growth_data)
summary(anova_leaves)

Step 4: Post-Hoc Test (If ANOVA is significant)

Post-Hoc Test for Plant Height

tukey_height <- TukeyHSD(anova_height)
print(tukey_height)  # print() shows the pairwise differences; summary() only lists the object's structure

Post-Hoc Test for Number of Leaves

tukey_leaves <- TukeyHSD(anova_leaves)
print(tukey_leaves)
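TukeyHSD() compares the levels of factors only, so a numeric Week column is skipped with a warning. One possible variant, sketched below under that assumption, treats Week as a blocking factor in an additive model and restricts the comparison to Fertilizer through the which argument. The simulated means mirror the ones used in Step 1; the object names are hypothetical.

```r
set.seed(123)

# One plant per fertilizer and week, with Week stored as a factor
growth <- data.frame(
  Week = factor(rep(1:4, times = 3)),
  Fertilizer = factor(rep(c("Organic", "Inorganic", "Control"), each = 4))
)

# Starting height and weekly growth per fertilizer (same values as in Step 1)
base  <- c(Organic = 20, Inorganic = 25, Control = 15)
slope <- c(Organic = 5, Inorganic = 6, Control = 3)

week_num <- as.integer(as.character(growth$Week))
fert <- as.character(growth$Fertilizer)
growth$Height <- rnorm(12, mean = base[fert] + slope[fert] * week_num, sd = 2)

# Additive model: the interaction is left out so residual degrees of freedom remain
fit <- aov(Height ~ Fertilizer + Week, data = growth)
print(summary(fit))

# Pairwise fertilizer comparisons with adjusted p-values
tukey <- TukeyHSD(fit, which = "Fertilizer")
print(tukey)
```

With 12 observations, adding the Fertilizer:Week interaction to a model where both are factors would use every degree of freedom, which is why the sketch keeps the model additive.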


Experiment 7 :

This experiment allows students to explore the relationship between food and energy.

Step 1: Simulating Data for Food Types and Energy Levels

Load necessary libraries

library(ggplot2)
library(dplyr)

Set seed for reproducibility

set.seed(123)

Define food types and their caloric values per 100g (in kcal)
food_data <- data.frame(
Food = c('Carbohydrates', 'Proteins', 'Fats', 'Fruits'),
Calories = c(250, 200, 300, 100),  # Approximate calories for 100g portion
EnergyBefore = c(5, 6, 5, 7),  # Energy level before consumption (scale 1-10)
EnergyAfter = c(7, 7, 6, 8),  # Energy level after consumption (scale 1-10)
DurationEnergy = c(3, 2.5, 2, 3)  # Duration of energy in hours
)

View the simulated data

print(food_data)

Step 2: Visualizing Energy Levels Before and After Eating

Boxplot of energy after eating

ggplot(food_data, aes(x = Food, y = EnergyAfter, fill = Food)) +
geom_boxplot() +
labs(title = "Energy After Eating Different Foods", y = "Energy Level (1-10)", x = "Food Type") +
theme_minimal()

Boxplot of energy duration (how long energy lasts)

ggplot(food_data, aes(x = Food, y = DurationEnergy, fill = Food)) +
geom_boxplot() +
labs(title = "Duration of Energy After Eating Different Foods", y = "Duration of Energy (hours)", x = "Food Type") +
theme_minimal()

Step 3: Statistical Analysis

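ANOVA for Energy Levels After Eating. Note that food_data holds a single observation per food type, so the model has no residual degrees of freedom and aov() cannot report an F statistic or p-value; replicate measurements per food type are needed for a meaningful test.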

anova_energy <- aov(EnergyAfter ~ Food, data = food_data)
summary(anova_energy)

ANOVA for Duration of Energy

anova_duration <- aov(DurationEnergy ~ Food, data = food_data)
summary(anova_duration)

Step 4: Post-Hoc Analysis (Tukey's HSD)

Tukey's HSD test for post-hoc analysis

tukey_energy <- TukeyHSD(anova_energy)
print(tukey_energy)  # print() shows the pairwise differences; summary() only lists the object's structure

Tukey's HSD test for energy duration

tukey_duration <- TukeyHSD(anova_duration)
print(tukey_duration)
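ANOVA needs more than one observation per group to estimate within-group variation. The sketch below simulates replicated ratings, assuming five hypothetical participants per food type, group means taken from the EnergyAfter column above, and an arbitrary standard deviation of 0.5.

```r
set.seed(123)

foods <- c("Carbohydrates", "Proteins", "Fats", "Fruits")
mean_after <- c(Carbohydrates = 7, Proteins = 7, Fats = 6, Fruits = 8)
n_per_food <- 5  # assumed number of participants per food type

# One row per participant and food type
energy <- data.frame(Food = factor(rep(foods, each = n_per_food)))
energy$EnergyAfter <- rnorm(nrow(energy),
                            mean = mean_after[as.character(energy$Food)],
                            sd = 0.5)

# One-way ANOVA now has 20 - 4 = 16 residual degrees of freedom
fit <- aov(EnergyAfter ~ Food, data = energy)
print(summary(fit))

# Pairwise comparisons between food types
print(TukeyHSD(fit))
```

The simulated values are unbounded, so a real study on a 1 to 10 scale would use recorded ratings rather than draws from a normal distribution.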

Experiment 8:


Design an experiment to determine the effect of different types of light on the growth of
plants. This experiment allows students to explore the role of light in plant growth.

Step 1: Set up the Environment and Simulate Data

Load necessary libraries

library(ggplot2)

library(dplyr)

Set seed for reproducibility

set.seed(123)

Define the light conditions and simulate plant growth data over 4 weeks

weeks <- rep(1:4, times = 3)  # 4 weeks repeated for each light condition

light_condition <- rep(c("Sunlight", "LED", "Fluorescent"), each = 4)  # Light conditions

Simulate plant growth data: height and number of leaves over time

growth_data <- data.frame(Week = weeks,

Light = light_condition,

Height = numeric(12),  # Plant height in cm

Leaves = numeric(12))  # Number of leaves

Simulate plant height and leaf number based on light condition


growth_data$Height <- ifelse(growth_data$Light == "Sunlight",

rnorm(12, mean = 10 + growth_data$Week * 2, sd = 1),

ifelse(growth_data$Light == "LED",

rnorm(12, mean = 9 + growth_data$Week * 1.8, sd = 1),

rnorm(12, mean = 8 + growth_data$Week * 1.5, sd = 1)))

growth_data$Leaves <- ifelse(growth_data$Light == "Sunlight",

rnorm(12, mean = 5 + growth_data$Week * 1, sd = 1),

ifelse(growth_data$Light == "LED",

rnorm(12, mean = 4 + growth_data$Week * 0.8, sd = 1),

rnorm(12, mean = 3 + growth_data$Week * 0.6, sd = 1)))

View simulated data

head(growth_data)

Step 2: Visualize the Data.

Line plot for plant height over time by light condition

ggplot(growth_data, aes(x = Week, y = Height, color = Light)) +

geom_line() +

geom_point() +

labs(title = "Plant Height Over Time by Light Condition", x = "Week", y = "Plant Height (cm)") +

theme_minimal()

Line plot for number of leaves over time by light condition


ggplot(growth_data, aes(x = Week, y = Leaves, color = Light)) +

geom_line() +

geom_point() +

labs(title = "Number of Leaves Over Time by Light Condition", x = "Week", y = "Number of Leaves") +

theme_minimal()

Step 3: Statistical Analysis (ANOVA)

ANOVA for Plant Height

anova_height <- aov(Height ~ Light + Week + Light:Week, data = growth_data)

summary(anova_height)

ANOVA for Number of Leaves

anova_leaves <- aov(Leaves ~ Light + Week + Light:Week, data = growth_data)

summary(anova_leaves)

Step 4: Post-Hoc Test (If ANOVA is significant)

Post-Hoc Test for Plant Height

tukey_height <- TukeyHSD(anova_height)

print(tukey_height)  # print() shows the pairwise differences; summary() only lists the object's structure

Post-Hoc Test for Number of Leaves

tukey_leaves <- TukeyHSD(anova_leaves)

print(tukey_leaves)

Output:

ANOVA for Plant Height:

summary(anova_height)

Example:

            Df   Sum Sq   Mean Sq   F value   Pr(>F)
Light        2    2.456     1.228      5.43    0.015
Week         3    3.872     1.290      6.17    0.004

Tukey's HSD test for Plant Height:

print(tukey_height)

Example:

                        diff     lwr    upr   p adj
Sunlight-LED            0.45   -0.21   1.11   0.32
Sunlight-Fluorescent    1.15    0.72   1.58   0.001 *


Experiment 9 :
Design an experiment to determine the effect of different types of soil on the growth of
plants. This experiment allows students to explore the role of soil in plant growth.

Step 1: Set up the Environment and Simulate Data

Load necessary libraries

library(ggplot2)

library(dplyr)

Set seed for reproducibility

set.seed(123)

Define the soil types and simulate plant growth data over 4 weeks

weeks <- rep(1:4, times = 3)  # 4 weeks repeated for each soil condition

soil_type <- rep(c("Loamy", "Sandy", "Clay"), each = 4)  # Soil types

Simulate plant growth data: height and number of leaves over time

growth_data <- data.frame(Week = weeks,

Soil = soil_type,

Height = numeric(12),  # Plant height in cm

Leaves = numeric(12))  # Number of leaves

Simulate plant height and leaf number based on soil type

growth_data$Height <- ifelse(growth_data$Soil == "Loamy",

rnorm(12, mean = 10 + growth_data$Week * 2, sd = 1),


ifelse(growth_data$Soil == "Sandy",

rnorm(12, mean = 8 + growth_data$Week * 1.5, sd = 1),

rnorm(12, mean = 7 + growth_data$Week * 1.2, sd = 1)))

growth_data$Leaves <- ifelse(growth_data$Soil == "Loamy",

rnorm(12, mean = 5 + growth_data$Week * 1, sd = 1),

ifelse(growth_data$Soil == "Sandy",

rnorm(12, mean = 4 + growth_data$Week * 0.8, sd = 1),

rnorm(12, mean = 3 + growth_data$Week * 0.6, sd = 1)))

View simulated data

head(growth_data)

Step 2: Visualize the Data

Line plot for plant height over time by soil type

ggplot(growth_data, aes(x = Week, y = Height, color = Soil)) +

geom_line() +

geom_point() +

labs(title = "Plant Height Over Time by Soil Type", x = "Week", y = "Plant Height (cm)") +

theme_minimal()

Line plot for number of leaves over time by soil type

ggplot(growth_data, aes(x = Week, y = Leaves, color = Soil)) +


geom_line() +

geom_point() +

labs(title = "Number of Leaves Over Time by Soil Type", x = "Week", y = "Number of Leaves") +

theme_minimal()

Step 3: Statistical Analysis (ANOVA)

ANOVA for Plant Height

anova_height <- aov(Height ~ Soil + Week + Soil:Week, data = growth_data)

summary(anova_height)

ANOVA for Number of Leaves

anova_leaves <- aov(Leaves ~ Soil + Week + Soil:Week, data = growth_data)

summary(anova_leaves)

Step 4: Post-Hoc Test (If ANOVA is significant)

Post-Hoc Test for Plant Height

tukey_height <- TukeyHSD(anova_height)

print(tukey_height)  # print() shows the pairwise differences; summary() only lists the object's structure

Post-Hoc Test for Number of Leaves

tukey_leaves <- TukeyHSD(anova_leaves)

print(tukey_leaves)

ML File
12 pages
Section 03
No ratings yet
Section 03
20 pages
Statistics 2022
No ratings yet
Statistics 2022
14 pages
Unit 4
No ratings yet
Unit 4
27 pages
Dav Exps - Merged - Merged
No ratings yet
Dav Exps - Merged - Merged
99 pages
Sullivan 2021
No ratings yet
Sullivan 2021
14 pages
DSF Gourav-2
No ratings yet
DSF Gourav-2
30 pages
DADS301 MBA Sem 3programming in DS
No ratings yet
DADS301 MBA Sem 3programming in DS
10 pages
Galgotias College of Engineering & Technology: Inroduction To Data Analytics and Visualization Lab File (KDS-551)
No ratings yet
Galgotias College of Engineering & Technology: Inroduction To Data Analytics and Visualization Lab File (KDS-551)
47 pages
S24 Stats10 Lab1-1
No ratings yet
S24 Stats10 Lab1-1
8 pages
BFC 34303 Civil Engineering Statistics SEMESTER I 2024/2025
No ratings yet
BFC 34303 Civil Engineering Statistics SEMESTER I 2024/2025
9 pages
FM MB Aluminum 6063 Billet Premiums Germany Italy
No ratings yet
FM MB Aluminum 6063 Billet Premiums Germany Italy
10 pages
Data Normalization
No ratings yet
Data Normalization
6 pages
Da Session 4
No ratings yet
Da Session 4
75 pages
Data Anlytics Using R Notes
No ratings yet
Data Anlytics Using R Notes
14 pages
My R Report
No ratings yet
My R Report
52 pages
Using Big Data Analytics To Create A Predictive Model For Joint Strike Fighter
No ratings yet
Using Big Data Analytics To Create A Predictive Model For Joint Strike Fighter
7 pages
Wa0002.
No ratings yet
Wa0002.
22 pages
Deep Learning MCQA
No ratings yet
Deep Learning MCQA
20 pages
Out of Specification Investigation
No ratings yet
Out of Specification Investigation
3 pages
5 Sem Lab Manual R Programming BCA-BSC
No ratings yet
5 Sem Lab Manual R Programming BCA-BSC
16 pages
Module 4 - Study Material - Overview of Predictive Analytics
No ratings yet
Module 4 - Study Material - Overview of Predictive Analytics
15 pages
Turing Machine
No ratings yet
Turing Machine
12 pages
Lab Manual FOR CSE 355/ Data Science Professional Certification Name
No ratings yet
Lab Manual FOR CSE 355/ Data Science Professional Certification Name
20 pages
(2004) Wahl - Uncertainty of Predictions of Embankment Dam Breach Parameters PDF
No ratings yet
(2004) Wahl - Uncertainty of Predictions of Embankment Dam Breach Parameters PDF
9 pages
The Relationship Between Out of Stocks and Total Settlement in Coca Cola Official Distributor at Betro-Surabaya
No ratings yet
The Relationship Between Out of Stocks and Total Settlement in Coca Cola Official Distributor at Betro-Surabaya
4 pages
EMERGENCY
No ratings yet
EMERGENCY
19 pages
Lab Manual DS 7 To 10
No ratings yet
Lab Manual DS 7 To 10
4 pages
ProgrammingForDS14 Rbasics
No ratings yet
ProgrammingForDS14 Rbasics
32 pages
Question Paper 1 Answers (R) by Siddu
No ratings yet
Question Paper 1 Answers (R) by Siddu
17 pages
18 3 24 Upto Week 6 A B Latest 1
No ratings yet
18 3 24 Upto Week 6 A B Latest 1
25 pages
Data Science Lab Manual
No ratings yet
Data Science Lab Manual
40 pages
2 Undefined
No ratings yet
2 Undefined
86 pages
Lecture 1
No ratings yet
Lecture 1
35 pages
MTech R Notes
No ratings yet
MTech R Notes
14 pages
INF30036 DataTypes Lecture2-1
No ratings yet
INF30036 DataTypes Lecture2-1
42 pages
DS Lab
No ratings yet
DS Lab
31 pages
Bdo Co1 Session 4
No ratings yet
Bdo Co1 Session 4
43 pages
Lab1 411 Eman Yahya 7773225
No ratings yet
Lab1 411 Eman Yahya 7773225
16 pages
Glocal University: Practical File of R Programming
100% (1)
Glocal University: Practical File of R Programming
32 pages
Big Data File in R
No ratings yet
Big Data File in R
23 pages
Data Science Minor Syllabus-Sem-04
No ratings yet
Data Science Minor Syllabus-Sem-04
4 pages
CH 3
No ratings yet
CH 3
33 pages
Intro To Data Science Lecture 4
No ratings yet
Intro To Data Science Lecture 4
13 pages
Analysis Using Statistical: Introduction & Data Exploration
No ratings yet
Analysis Using Statistical: Introduction & Data Exploration
23 pages
B. Tech. 1st & 2nd Semester (AICTE Scheme)
No ratings yet
B. Tech. 1st & 2nd Semester (AICTE Scheme)
1 page
Final Cost Practical
No ratings yet
Final Cost Practical
29 pages
Unit - I: Topic - 1
No ratings yet
Unit - I: Topic - 1
13 pages
Module 1: Unit - 1.1: Introduction To Analytics or R Programming
No ratings yet
Module 1: Unit - 1.1: Introduction To Analytics or R Programming
26 pages
RSTUDIO
No ratings yet
RSTUDIO
44 pages
DATA MINING AND MACHINE LEARNING. PREDICTIVE TECHNIQUES: REGRESSION, GENERALIZED LINEAR MODELS, SUPPORT VECTOR MACHINE AND NEURAL NETWORKS
From Everand
DATA MINING AND MACHINE LEARNING. PREDICTIVE TECHNIQUES: REGRESSION, GENERALIZED LINEAR MODELS, SUPPORT VECTOR MACHINE AND NEURAL NETWORKS
César Pérez López
No ratings yet