R Programming Lab
Experiment 1
RStudio is an integrated development environment (IDE) for R that provides a user-friendly interface to write, run, and debug R scripts. It allows
users to interact with R efficiently, manage projects, and perform data analysis. This experiment introduces beginners to the RStudio interface and
helps them become familiar with navigation and basic functionalities of the platform.
● Help and Viewer Pane: Access documentation, tutorials, and R Markdown files.
Objective
● To explore various R features and functionalities.
Learning Outcomes
Preparation:
● Familiarize yourself with the Console, Script Editor, Environment Pane, and Plots Pane.
Exploring R Features:
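print("Hello, R!")  # print a message in the Console
x <- c(1, 2, 3, 4, 5)  # create a numeric vector
mean(x)  # average of the values in x
setwd("C:/Users/YourName/Documents/R_Projects")  # set the working directory; replace with your own path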
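Because setwd() stops with an error when the target folder does not exist, it can help to check the path first. The following is a minimal sketch, assuming a hypothetical folder named R_Projects created under a temporary directory purely for illustration; substitute your own project path.

```r
# Hypothetical project folder, placed under a temporary directory for illustration
project_dir <- file.path(tempdir(), "R_Projects")

# Create the folder only if it does not exist yet
if (!dir.exists(project_dir)) {
  dir.create(project_dir, recursive = TRUE)
}

old_dir <- getwd()   # remember the current working directory
setwd(project_dir)   # switch to the project folder
getwd()              # confirm the change
setwd(old_dir)       # restore the original working directory
```

In RStudio the same change can be made from the Session menu, but a scripted version keeps the step reproducible.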
Create variables and datasets, then observe them in the Environment Pane:
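data <- data.frame(id = 1:3, score = c(85, 90, 78))  # illustrative sample data (hypothetical values)
View(data)  # open the data frame in the RStudio data viewer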
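install.packages("ggplot2")  # download and install from CRAN (needed only once)
library(ggplot2)  # load the package in the current session
?mean  # open the help page for mean()
help("lm")  # same as ?lm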
Expected Outcome
Participants will have a foundational understanding of the RStudio environment, enabling them to navigate, configure, and explore R.
**************
Experiment 2
R packages extend the functionality of R by providing additional functions, datasets, and tools for various tasks such as data
visualization, statistical analysis, and machine learning. The Comprehensive R Archive Network (CRAN) hosts thousands of R packages.
This experiment introduces beginners to installing, loading, updating, and managing R packages within RStudio.
● Installation and Loading: Install and activate packages for use in R scripts.
● Updating and Removing Packages: Update packages to get current functions and security fixes, and remove packages that are no longer needed.
Objective
Learning Outcomes
Preparation:
Installing Packages from CRAN:
install.packages("devtools")
library(devtools)
installed.packages()  # list every installed package with its version and library path
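The full output of installed.packages() is a wide character matrix that can be hard to read in the Console. A minimal sketch of trimming it down to a small table is shown here; the object names pkg_matrix and pkg_table are hypothetical.

```r
# installed.packages() returns a character matrix with one row per package
pkg_matrix <- installed.packages()
rownames(pkg_matrix) <- NULL  # drop row names so the conversion below is straightforward

# Keep only the columns that are usually of interest
pkg_table <- as.data.frame(
  pkg_matrix[, c("Package", "Version", "LibPath")],
  stringsAsFactors = FALSE
)

head(pkg_table)  # first few installed packages
nrow(pkg_table)  # number of installed packages
```

The resulting data frame can also be opened with View(pkg_table) in RStudio.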
Check if a package is installed:
"ggplot2" %in% rownames(installed.packages())  # TRUE only if a package named exactly ggplot2 is installed
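The installation check is often combined with installation itself, so that a script installs a package only when it is missing. The sketch below is one possible way to do this; the helper names is_installed and install_if_missing are hypothetical and not part of base R.

```r
# Hypothetical helper: TRUE if a package with exactly this name is installed
is_installed <- function(pkg) {
  pkg %in% rownames(installed.packages())
}

# Hypothetical helper: install a package only when it is missing, then load it
install_if_missing <- function(pkg) {
  if (!is_installed(pkg)) {
    install.packages(pkg)
  }
  library(pkg, character.only = TRUE)  # character.only lets us pass the name as a string
}

is_installed("stats")  # stats ships with every R installation

# Uncomment to try it; this needs an internet connection if ggplot2 is missing
# install_if_missing("ggplot2")
```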
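packageVersion("ggplot2")  # version of an installed package
update.packages(ask = FALSE)  # update all installed packages without prompting
remove.packages("tidyverse")  # uninstall a package
help(package = "ggplot2")  # index of the package's documentation
?ggplot  # help page for one function (the package must be loaded first)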
Get example code for a function:
example("lm")
Expected Outcome
Participants will have a solid understanding of R package management, including installation, loading, updating, removal, and
documentation usage, enabling them to work efficiently in data science and statistical computing.
**************
Experiment 3
Data frames are one of the most commonly used data structures in R for storing and analyzing tabular data. R provides built-in
functions for creating, manipulating, and summarizing data frames. Additionally, R supports various file formats such as CSV, Excel, and JSON.
This experiment introduces students to data frames and shows how to read and write data files and perform basic data manipulation in R.
● Tabular Data Representation: Stores data in rows and columns similar to Excel or SQL tables.
● Data Import and Export: Supports CSV, Excel, JSON, and native R formats.
● Integration with Tidyverse: Enhances data manipulation with dplyr and readr packages.
Objective
Learning Outcomes
By the end of this experiment, participants will be able to:
Preparation:
install.packages(c("tidyverse", "readxl", "jsonlite", "writexl"))  # packages used in this experiment
● Download or create sample CSV, Excel, and JSON files for testing.
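# Illustrative sample data (hypothetical values); replace with your own data frame
students <- data.frame(Name = c("Asha", "Ben", "Chen"), Age = c(20, 22, 21), Marks = c(85, 72, 90))
str(students)  # structure: column names and types
summary(students)  # summary statistics for each column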
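data_csv <- read.csv("students.csv")  # hypothetical file name; use the path to your CSV file
head(data_csv)
library(readxl)
data_excel <- read_excel("students.xlsx")  # hypothetical file name
head(data_excel)
library(jsonlite)
data_json <- fromJSON("students.json")  # hypothetical file name
print(data_json)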
library(writexl)
dir.create("output", showWarnings = FALSE)  # make sure the output folder exists
write_xlsx(students, "output/students.xlsx")
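CSV files can be written and read back with base R alone. The following is a minimal sketch of such a round trip; the students data frame holds hypothetical values and the file is written to a temporary location so that nothing in your project is overwritten.

```r
# Illustrative sample data (hypothetical values)
students <- data.frame(
  Name  = c("Asha", "Ben", "Chen"),
  Age   = c(20L, 22L, 21L),
  Marks = c(85.5, 72.0, 90.25),
  stringsAsFactors = FALSE
)

# Write to a temporary CSV file; row.names = FALSE avoids an extra index column
csv_path <- tempfile(fileext = ".csv")
write.csv(students, csv_path, row.names = FALSE)

# Read the file back and compare it with the original data frame
students_back <- read.csv(csv_path, stringsAsFactors = FALSE)
str(students_back)
all.equal(students, students_back)
```

Comparing the re-imported data with the original is a quick way to confirm that column types survived the export.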
1. Selecting Columns:
2. Filtering Rows:
3. Sorting Data:
print(students_sorted)
print(students)
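The selecting, filtering, and sorting steps can be sketched in base R as follows. This is an illustration under stated assumptions rather than the exact code of the lab: the students data frame and its columns (Name, Age, Marks) are hypothetical, as is the threshold of 70.

```r
# Illustrative sample data (hypothetical values)
students <- data.frame(
  Name  = c("Asha", "Ben", "Chen", "Dina"),
  Age   = c(20, 22, 21, 23),
  Marks = c(85, 72, 90, 64)
)

# 1. Selecting columns: keep only Name and Marks
students_selected <- students[, c("Name", "Marks")]

# 2. Filtering rows: keep students with Marks above 70
students_filtered <- students[students$Marks > 70, ]

# 3. Sorting data: order by Marks, highest first
students_sorted <- students[order(students$Marks, decreasing = TRUE), ]

print(students_selected)
print(students_filtered)
print(students_sorted)
```

With the tidyverse loaded, the dplyr functions select(), filter(), and arrange() perform the same three operations.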
Expected Outcome
Participants will have a strong foundation in data handling and manipulation in R, enabling them to efficiently import, process, and export data.
**************
Experiment 4
Data visualization is a critical aspect of data analysis, as it helps in identifying patterns, trends, and outliers in the data. ggplot2 is
one of the most powerful and flexible R packages for creating visualizations. Based on the Grammar of Graphics, it allows users to
create complex, multi-layered visualizations by combining different components such as data, aesthetics, geometries, and statistics.
This experiment introduces students to the basics of creating visualizations using ggplot2 and explores different types of charts.
Key Features of ggplot2
● Customization: Offers extensive customization for themes, colors, labels, and axes.
● Statistical Summaries: Adds statistical summaries (e.g., regression lines, confidence intervals) through layers such as geom_smooth().
● Wide Range of Visualizations: Includes scatter plots, bar charts, histograms, box plots, and more.
● Compatibility with Tidyverse: Integrates well with other tidyverse packages like dplyr and tidyr.
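The layered approach behind these features can be sketched by building one plot step by step. The example below assumes ggplot2 is installed; the choice of the wt and mpg columns of mtcars and the label text are illustrative only.

```r
library(ggplot2)

# Step 1: data and aesthetic mappings (which columns go on which axis)
p <- ggplot(mtcars, aes(x = wt, y = mpg))

# Step 2: a geometry layer that draws one point per car
p <- p + geom_point()

# Step 3: a statistical layer, here a linear regression line with a confidence band
p <- p + geom_smooth(method = "lm", formula = y ~ x)

# Step 4: labels and a built-in theme
p <- p +
  labs(title = "Fuel efficiency and weight",
       x = "Weight (1000 lbs)",
       y = "Miles per gallon") +
  theme_minimal()

p  # printing the object draws the plot
```

Each step returns a complete plot object, so the plot can be printed after any step to see what that layer contributes.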
Objective
● To create various types of visualizations such as scatter plots, bar charts, and histograms.
Learning Outcomes
● Use ggplot2 to create various plots such as scatter plots, bar charts, and histograms.
Preparation:
install.packages("ggplot2")  # needed only once
library(ggplot2)  # load ggplot2 in each new session
Use the mtcars dataset (a built-in dataset in R) to create a scatter plot:
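data(mtcars)
ggplot(mtcars, aes(x = wt, y = mpg)) +  # wt and mpg are an illustrative choice of variables
  geom_point() +
  labs(x = "Weight (1000 lbs)", y = "Miles per Gallon")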
Create a bar chart to visualize the frequency of car cylinders (cyl) in the mtcars dataset:
ggplot(mtcars, aes(x = factor(cyl))) +  # treat cyl as a category
  geom_bar(fill = "skyblue") +
  xlab("Number of Cylinders") +
  ylab("Frequency")
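The heights of the bars in such a chart are simple counts of rows per cylinder value, so they can be checked without plotting. A minimal sketch using base R (the name cyl_counts is hypothetical):

```r
data(mtcars)

# Count how many cars fall into each cylinder group
cyl_counts <- table(mtcars$cyl)
cyl_counts

# The same counts expressed as proportions of all cars
prop.table(cyl_counts)
```

Comparing these counts with the bar chart is a quick sanity check that the plot shows what is intended.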
Creating a Histogram:
ggplot(mtcars, aes(x = mpg)) + geom_histogram(binwidth = 2) +  # mpg and binwidth are illustrative choices
  ylab("Frequency")
Create a box plot to visualize the distribution of mpg for different cylinder values:
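ggplot(mtcars, aes(x = factor(cyl), y = mpg)) + geom_boxplot() +  # one box per cylinder group
  xlab("Number of Cylinders") + ylab("Miles per Gallon")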
Use built-in themes and customize labels for better presentation:
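ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point() +  # illustrative base plot
  theme_minimal() + labs(title = "Weight vs. MPG", x = "Weight (1000 lbs)", y = "Miles per Gallon")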
Expected Outcome
Participants will be able to create and customize a wide range of visualizations using ggplot2. By the end of the experiment, students
will have the skills to create meaningful plots for analyzing and interpreting data effectively.
**************
Experiment 5
Statistical analysis is essential for making informed decisions based on data. R provides a wide range of statistical functions to
perform data analysis, such as calculating descriptive statistics, performing hypothesis testing, and running regression models.
Hypothesis testing is a fundamental concept that allows us to test assumptions and make inferences about data.
In this experiment, students will explore descriptive statistics (mean, median, variance, etc.) and perform t-tests and ANOVA to
assess the significance of different variables. Students will also learn how to perform chi-square tests and work with confidence intervals.
● Descriptive Statistics: Summary measures like mean, median, variance, and standard deviation.
● Confidence Intervals: Calculating the range of values that likely contains the population parameter.
● p-Values and Significance: Understanding the significance of statistical results.
Objective
Learning Outcomes
Preparation:
● Load the mtcars dataset (built into R, so no download is needed) or use your own dataset for practice.
Descriptive Statistics:
Calculate the mean, median, variance, standard deviation, and summary statistics:
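data(mtcars)
mean(mtcars$mpg) # Mean
median(mtcars$mpg) # Median
var(mtcars$mpg) # Variance
sd(mtcars$mpg) # Standard deviation
summary(mtcars$mpg) # Summary statistics (minimum, quartiles, mean, maximum)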
Perform a one-sample t-test to compare the mean of miles per gallon (mpg) to a known value (e.g., 20):
t_test_result <- t.test(mtcars$mpg, mu = 20)  # null hypothesis: the mean mpg equals 20
t_test_result
● Interpret the p-value and decide whether to reject or fail to reject the null hypothesis.
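The decision rule can be written out explicitly. The sketch below assumes a significance level of 0.05, which is a common convention rather than something fixed by this experiment; the names alpha, p_value, and decision are hypothetical.

```r
data(mtcars)

alpha <- 0.05  # assumed significance level

# One-sample t-test of the null hypothesis that the mean mpg equals 20
t_test_result <- t.test(mtcars$mpg, mu = 20)
p_value <- t_test_result$p.value

# Reject the null hypothesis only when the p-value falls below alpha
if (p_value < alpha) {
  decision <- "reject the null hypothesis"
} else {
  decision <- "fail to reject the null hypothesis"
}

cat("p-value:", round(p_value, 4), "\n")
cat("Decision:", decision, "\n")
```

For mtcars the sample mean of mpg is very close to 20, so the test does not provide evidence against the null hypothesis.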
Compare the mpg of cars with 4 and 6 cylinders using an independent t-test:
t_test_result_2 <- t.test(mpg ~ cyl, data = subset(mtcars, cyl %in% c(4, 6)))  # Welch t-test by default
t_test_result_2
● Look at the p-value to determine if there is a significant difference between the groups.
Perform ANOVA to analyze the differences in mpg across multiple cylinder groups:
aov_result <- aov(mpg ~ factor(cyl), data = mtcars)  # treat cyl as a grouping factor
summary(aov_result)
● Check the F-statistic and p-value to assess the significance of the differences.
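The F-statistic and p-value can also be pulled out of the ANOVA table as numbers, which is useful when they feed into later code. A minimal sketch follows; the names aov_table, f_stat, and p_value are hypothetical.

```r
data(mtcars)

# One-way ANOVA of mpg across cylinder groups
aov_result <- aov(mpg ~ factor(cyl), data = mtcars)

# summary() returns a list; its first element is the ANOVA table
aov_table <- summary(aov_result)[[1]]

# The first row belongs to the factor(cyl) term, the second to the residuals
f_stat  <- aov_table[["F value"]][1]
p_value <- aov_table[["Pr(>F)"]][1]

cat("F-statistic:", round(f_stat, 2), "\n")
cat("p-value:", signif(p_value, 3), "\n")
```

A small p-value here indicates that at least one cylinder group has a different mean mpg; it does not say which groups differ.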
Chi-Square Test:
Perform a chi-square test to analyze the association between two categorical variables, such as cylinder and gear:
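chi_square_result <- chisq.test(table(mtcars$cyl, mtcars$gear))  # may warn that some expected counts are small
chi_square_result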
● Evaluate the chi-square statistic and p-value to check for independence between the variables.
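The chi-square approximation is less reliable when expected cell counts are small, which is the case for the cylinder and gear table in mtcars. The sketch below inspects the expected counts and, as one commonly used alternative that is not part of the original steps, runs Fisher's exact test on the same table.

```r
data(mtcars)

# Contingency table of cylinders (rows) by gears (columns)
cyl_gear <- table(mtcars$cyl, mtcars$gear)

# The warning about small expected counts is suppressed here because we check them ourselves
chi_square_result <- suppressWarnings(chisq.test(cyl_gear))

# Expected counts under independence; values below 5 are a common warning sign
chi_square_result$expected
any(chi_square_result$expected < 5)

# Fisher's exact test does not rely on the large-sample approximation
fisher_result <- fisher.test(cyl_gear)
fisher_result$p.value
```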
Calculate the 95% confidence interval for the mean mpg of cars in the dataset:
conf_int <- t.test(mtcars$mpg)$conf.int  # 95% is the default confidence level
conf_int
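The interval reported by t.test() can be reproduced by hand as the sample mean plus or minus the critical t-value times the standard error. A minimal sketch of that calculation is shown here; the variable names are hypothetical.

```r
data(mtcars)

x  <- mtcars$mpg
n  <- length(x)
se <- sd(x) / sqrt(n)               # standard error of the mean
t_crit <- qt(0.975, df = n - 1)     # critical value for a two-sided 95% interval

# Manual interval: mean minus and plus the margin of error
manual_ci <- c(mean(x) - t_crit * se, mean(x) + t_crit * se)

# Interval computed by t.test() for comparison
builtin_ci <- as.numeric(t.test(x)$conf.int)

manual_ci
builtin_ci
all.equal(manual_ci, builtin_ci)
```

Working through the formula once makes it clearer why the interval narrows as the sample size grows or the standard deviation shrinks.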
Expected Outcome
Participants will have a foundational understanding of how to conduct statistical analysis and hypothesis testing using R. They
will be able to apply t-tests, ANOVA, and chi-square tests to draw conclusions from data. Additionally, participants will be able to interpret p-values and confidence intervals.
**********