CSV Files in R

Uploaded by

Vandana Monappa


Function   Syntax                                           Works On                   Output
apply()    apply(X, MARGIN, FUN, ...)                       Matrix / Array             Vector / Array
lapply()   lapply(X, FUN, ...)                              List / Vector              List
sapply()   sapply(X, FUN, ..., simplify=TRUE)               List / Vector              Vector / Matrix
mapply()   mapply(FUN, ..., MoreArgs=NULL, SIMPLIFY=TRUE)   Multiple vectors / lists   Vector / List
tapply()   tapply(X, INDEX, FUN, ..., simplify=TRUE)        Vector + Factor (groups)   Array / List

A vector contains the numbers from 1 to 5. Write a program in R to apply the square
function to each element using lapply().

# Input vector
v <- 1:5

# Apply square function using lapply

result <- lapply(v, function(x) x^2)

print(result)
Given a 3x3 matrix of numbers, calculate the sum of each row using
the apply() function.

# Create a 3x3 matrix

m <- matrix(1:9, nrow = 3, byrow = TRUE)
print(m)

# Apply sum function across rows (MARGIN = 1 → rows)
row_sums <- apply(m, 1, sum)

print(row_sums)
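
As a brief sketch of the other direction, the same matrix can be summed by column with MARGIN = 2. The name col_sums is introduced here only for illustration; rowSums() and colSums() are base R shortcuts that should give the same totals as apply().

```r
# Same 3x3 matrix as in the example above
m <- matrix(1:9, nrow = 3, byrow = TRUE)

# MARGIN = 2 applies the function to each column
col_sums <- apply(m, 2, sum)
print(col_sums)    # 12 15 18

# Built-in shortcuts give the same totals as apply()
print(rowSums(m))  # 6 15 24
print(colSums(m))  # 12 15 18
```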
Given a vector of numbers from 1 to 5, find the square of each element
using sapply().

v <- 1:5

# Use sapply to square each number

result <- sapply(v, function(x) x^2)

print(result)
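
The difference between lapply() and sapply() in the table is easiest to see side by side. The short sketch below runs the same squaring function through both; the names as_list and as_vector are illustrative only.

```r
v <- 1:5

# lapply() always returns a list, one element per input
as_list <- lapply(v, function(x) x^2)

# sapply() simplifies the result to a vector when it can
as_vector <- sapply(v, function(x) x^2)

print(class(as_list))    # "list"
print(class(as_vector))  # "numeric"

# unlist() flattens the lapply() result into the same vector
print(identical(unlist(as_list), as_vector))  # TRUE
```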
# tapply()
ages <- c(21, 23, 25, 30, 40, 45)
gender <- c("F", "M", "F", "M", "F", "M")
tapply(ages, gender, mean)
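
The table also lists mapply(), which has no example in these slides. A minimal sketch, with made-up vectors x and y and a made-up multiplier k, might look like this:

```r
# mapply() walks over several vectors in parallel
x <- 1:3
y <- 4:6

# Adds x[1] + y[1], x[2] + y[2], x[3] + y[3]
sums <- mapply(function(a, b) a + b, x, y)
print(sums)    # 5 7 9

# MoreArgs passes values that are shared by every call
scaled <- mapply(function(a, b, k) (a + b) * k, x, y,
                 MoreArgs = list(k = 10))
print(scaled)  # 50 70 90
```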
Getting and Setting the Working Directory

•Before working with CSV files, it is important to know and set the working directory where your CSV files are stored.

print(getwd())
# [1] "C:/Users/VANDANA/OneDrive/Desktop/2025 odd/prog"

setwd("/Example_Path/")

print(getwd())
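
One way to avoid path surprises is to build the file path from the working directory and check that the file is there before reading it. This is only a sketch; csv_path is an illustrative name, and sample.csv is assumed to be the file you want.

```r
# Build a path relative to the current working directory
csv_path <- file.path(getwd(), "sample.csv")
print(csv_path)

# Check that the file exists before trying to read it
if (file.exists(csv_path)) {
  message("Found: ", csv_path)
} else {
  message("sample.csv is not in the working directory")
}
```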
Sample CSV File Example
Consider the following sample CSV data saved as sample.csv:
id,name,department,salary,projects
1,A,IT,60754,4
2,B,Tech,59640,2
3,C,Marketing,69040,8
4,D,Marketing,65043,5
5,E,Tech,59943,2
6,F,IT,65000,5
7,G,HR,69000,7
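
To make the later steps reproducible, one option is to create sample.csv from R itself. The sketch below writes the rows listed above into R's temporary folder; tempdir() and the name sample_path are assumptions chosen so the example does not touch your own folders.

```r
# Sample data exactly as listed above
csv_text <- "id,name,department,salary,projects
1,A,IT,60754,4
2,B,Tech,59640,2
3,C,Marketing,69040,8
4,D,Marketing,65043,5
5,E,Tech,59943,2
6,F,IT,65000,5
7,G,HR,69000,7"

# Write the text to a file in the temporary folder (assumed location)
sample_path <- file.path(tempdir(), "sample.csv")
writeLines(csv_text, sample_path)

# Read it back as a data frame and inspect its structure
csv_data <- read.csv(sample_path)
str(csv_data)
```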
Reading CSV Files into R

•We can load a CSV file into R as a data frame using the read.csv() function.
•ncol() and nrow() return the number of columns and rows in the data frame, respectively.

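# Read the CSV file into a data frame
csv_data <- read.csv(file = 'C:\\Users\\GFG19565\\Downloads\\sample.csv')
print(csv_data)

# Number of columns and rows
print(ncol(csv_data))
print(nrow(csv_data))

# Minimum number of projects handled by any employee
min_pro <- min(csv_data$projects)
print(min_pro)

# Names and salaries of employees earning more than 60000
result <- csv_data[csv_data$salary > 60000, c("name", "salary")]
print(result)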
Calculate average salary per department

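# Mean salary for each department
result <- tapply(csv_data$salary, csv_data$department, mean)

# Convert the named result into a data frame
result_df <- data.frame(Department = names(result), AverageSalary = as.vector(result))

# Write the summary to a CSV file
write.csv(result_df, "Mean_salary.csv", row.names = FALSE)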

2. Calculate total number of projects handled per department and write to CSV
•The tapply() function is used to compute the total number of projects handled in each department.
•The result is converted into a data frame for better structure and then written to a CSV file named department_project_totals.csv.

# Total projects for each department
total_projects <- tapply(csv_data$projects, csv_data$department, sum)

# as.vector() drops the names so the column holds plain numbers
projects_df <- data.frame(Department = names(total_projects), TotalProjects = as.vector(total_projects))

write.csv(projects_df, "department_project_totals.csv", row.names = FALSE)
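
As a check on both summaries, the sketch below runs them on the sample data from earlier, loaded inline with read.csv(text = ...) so no file is needed. The combined data frame summary_df is an illustrative name, not part of the original exercise.

```r
# Sample data from the earlier slide, read from a string instead of a file
csv_data <- read.csv(text = "id,name,department,salary,projects
1,A,IT,60754,4
2,B,Tech,59640,2
3,C,Marketing,69040,8
4,D,Marketing,65043,5
5,E,Tech,59943,2
6,F,IT,65000,5
7,G,HR,69000,7")

# Both calls group by department, so the results share the same order
avg_salary <- tapply(csv_data$salary, csv_data$department, mean)
total_projects <- tapply(csv_data$projects, csv_data$department, sum)

summary_df <- data.frame(
  Department = names(avg_salary),
  AverageSalary = as.vector(avg_salary),
  TotalProjects = as.vector(total_projects)
)
print(summary_df)
# HR: 69000.0 and 7; IT: 62877.0 and 9;
# Marketing: 67041.5 and 13; Tech: 59791.5 and 4
```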

Calculate

CTR (%) = (Clicks ÷ Impressions) × 100

Engagement Score = Likes + Shares + Comments
Engagement Rate (%) = (Engagement Score ÷ Reach) × 100
Sentiment Distribution = Count of Positive, Neutral, Negative posts
CPC (₹) = Total Ad Cost ÷ Clicks (only for Paid = Yes)

•ifelse(condition, value_if_true, value_if_false) → vectorized conditional function in R
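
A small sketch of how ifelse() works element by element, using made-up clicks and cost vectors. It mirrors the CPC rule above in miniature: divide only where there is at least one click, otherwise return NA.

```r
# Made-up values for illustration
clicks <- c(10, 0, 25)
cost <- c(50, 20, 100)

# The condition is tested for every element separately
cpc <- ifelse(clicks > 0, cost / clicks, NA)
print(cpc)  # 5 NA 4
```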
# Read CSV
df <- read.csv("data.csv", stringsAsFactors = FALSE)

# Step 1: CTR (Click Through Rate) = (Clicks / Impressions) * 100

df$CTR_pct <- (df$Clicks / df$Impressions) * 100

# Step 2: Engagement Score = Likes + Shares + Comments

df$Engt_Score <- df$Likes + df$Shares + df$Comments

# Step 3: Engagement Rate (%) = (Engagement Score / Reach) * 100

df$Engt_Rate_pct <- (df$Engt_Score / df$Reach) * 100
# Step 4: CPC (Cost per Click) = Ad_Cost / Clicks (only for Paid posts)
df$CPC <- ifelse(df$Paid == "Yes" & df$Clicks > 0, df$Ad_Cost / df$Clicks, NA)

# Step 5: Round off values

df$CTR_pct <- round(df$CTR_pct, 2)
df$Engt_Rate_pct <- round(df$Engt_Rate_pct, 2)
df$CPC <- round(df$CPC, 2)

# Final Output
print(df)
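
The steps above do not cover the Sentiment Distribution metric. One possible sketch, assuming the data has a column named Sentiment holding Positive, Neutral, or Negative (the column name and the values below are assumptions, not taken from data.csv), uses table() to count the posts in each class.

```r
# Hypothetical posts; the Sentiment column name is an assumption
df <- data.frame(
  Post_ID = 1:5,
  Sentiment = c("Positive", "Neutral", "Negative", "Positive", "Positive")
)

# Count of posts in each sentiment class
sentiment_counts <- table(df$Sentiment)
print(sentiment_counts)

# The same distribution as percentages
sentiment_pct <- round(prop.table(sentiment_counts) * 100, 2)
print(sentiment_pct)
```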
Problem Statement
You are given a dataset containing Post_ID and Comment_Text from
social media.
Your task is to:
Perform sentiment analysis on each comment.
Categorize each comment as Positive, Negative, or Neutral.
Calculate the percentage of positive engagement per post.

Post_ID Comment_Text
101 Love this post! Very inspiring ❤️
101 Not useful at all, waste of time.
101 Nice effort, keep going.
102 This update is terrible 😡
102 I enjoyed reading this, very helpful.
103 Okay post, nothing special.
# Install once (if not already installed)
install.packages("tidyverse") # For data manipulation
install.packages("tidytext") # For text mining
install.packages("textdata") # For sentiment lexicons

# Load libraries
library(tidyverse)
library(tidytext)
library(textdata)
1. tidyverse
•The tidyverse is a collection of R packages designed for data analytics.
•All packages in the tidyverse share a common philosophy: tidy data (where each column is a variable, each row is an observation, and each cell is a single value).
•When you load library(tidyverse), it automatically loads several core packages for data manipulation (filter, select, group, summarize, etc.) and visualization.
2. tidytext

•The tidytext package is for text mining using tidy data principles.
•It transforms unstructured text into structured formats.
•Key functions:
•unnest_tokens() → Breaks text into tokens (words, bigrams, sentences).
•get_sentiments() → Fetches sentiment lexicons.

3. textdata
•The textdata package provides datasets and lexicons commonly used in text analysis and NLP.
•Instead of downloading sentiment dictionaries manually, you can fetch them directly from R through textdata.
•Examples of resources it provides:
•Sentiment lexicons:
•AFINN (numerical score −5 to +5)
•Bing (positive/negative)
The AFINN lexicon is a list of English words rated for valence with an integer
between -5 (very negative) and +5 (very positive). It’s widely used in sentiment
analysis.

Word Score
love +3
happy +3
good +3
nice +3
excellent +5
win +4
wow +3
bad -3

Example sentence: “I love this product, it is excellent!”

love = +3, excellent = +5 → total sentiment = +8 (positive)
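
The arithmetic above can be sketched in base R without downloading a lexicon. The scores below are copied from the small table on this slide and have not been checked against the full AFINN list; afinn_small and the other names are illustrative.

```r
# Scores copied from the table above (not the full AFINN lexicon)
afinn_small <- c(love = 3, happy = 3, good = 3, nice = 3,
                 excellent = 5, win = 4, wow = 3, bad = -3)

sentence <- "I love this product, it is excellent!"

# Lower-case the text, strip punctuation, and split on whitespace
words <- strsplit(gsub("[[:punct:]]", "", tolower(sentence)), "\\s+")[[1]]

# Keep only the words found in the lexicon and add up their scores
matched <- words[words %in% names(afinn_small)]
total <- sum(afinn_small[matched])

print(matched)  # "love" "excellent"
print(total)    # 8
```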
What is Bing Lexicon?
It is a dictionary-based sentiment lexicon that classifies words as either:
Positive 👍
Negative 👎
👉 It does not assign numeric strength (like AFINN does) or multiple emotions (like
NRC).
It’s a binary classification: just positive or negative.
Step 2: Prepare Dataset
Imagine we have a CSV file comments.csv like this:

Post_ID Comment_Text
101 Love this post! Very inspiring ❤️
101 Not useful at all, waste of time.
101 Nice effort, keep going.
102 This update is terrible 😡
102 I enjoyed reading this, very helpful.
103 Okay post, nothing special.
comments <- read.csv("comments.csv", stringsAsFactors = FALSE)
print(comments)

🔹 Step 3: Load Sentiment Dictionary

We use the Bing lexicon, which has a list of positive & negative words.

bing <- get_sentiments("bing")

head(bing)
tidytext
🔹 Step 4: Break Comments into Words
We split each comment into individual words (called tokenization).

# The first argument of unnest_tokens() is named tbl, not data
comments_words <- unnest_tokens(tbl = comments,
                                output = word,
                                input = Comment_Text)
print(comments_words)
🔹 Step 5: Match Words with Sentiments
Now, we join our comment words with the Bing sentiment dictionary.

sentiment_data <- inner_join(comments_words, bing, by = "word")

print(sentiment_data)

Left table
ID  Name
1   Asha
2   Ravi
3   Meera
4   Kiran

Right table
ID  Score
2   88
3   92
4   76
5   85

Inner join on ID (matching IDs only)
ID  Name   Score
2   Ravi   88
3   Meera  92
4   Kiran  76

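
The ID / Name / Score illustration can be reproduced with base R's merge(), which performs an inner join by default. This is a stand-in used to show the idea; the slides themselves use inner_join(), and the data frame names below are made up.

```r
# The two input tables from the illustration
students <- data.frame(ID = 1:4,
                       Name = c("Asha", "Ravi", "Meera", "Kiran"))
scores <- data.frame(ID = 2:5,
                     Score = c(88, 92, 76, 85))

# merge() keeps only the IDs present in both tables
joined <- merge(students, scores, by = "ID")
print(joined)
# IDs 2, 3 and 4 remain: Ravi 88, Meera 92, Kiran 76
```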
comments_words
word
happy
sad
exam
joy
failure

bing (subset)
word     sentiment
happy    positive
joy      positive
sad      negative
failure  negative

inner_join(comments_words, bing, by = "word") →

word     sentiment
happy    positive
sad      negative
joy      positive
failure  negative
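
The problem statement also asks for a category per comment and the percentage of positive engagement per post, which the steps shown so far do not reach. The sketch below is one possible way to finish in base R. It is not the tidytext pipeline: a tiny hand-made word list (lexicon) stands in for the Bing lexicon, the emoji are left out of the comment text, and every name introduced here is an assumption. With the real Bing lexicon the categories may come out differently.

```r
# Comments from the table above (emoji omitted)
comments <- data.frame(
  Post_ID = c(101, 101, 101, 102, 102, 103),
  Comment_Text = c("Love this post! Very inspiring",
                   "Not useful at all, waste of time.",
                   "Nice effort, keep going.",
                   "This update is terrible",
                   "I enjoyed reading this, very helpful.",
                   "Okay post, nothing special."),
  stringsAsFactors = FALSE
)

# Hypothetical mini-lexicon: +1 for positive words, -1 for negative words
lexicon <- c(love = 1, inspiring = 1, nice = 1, enjoyed = 1, helpful = 1,
             waste = -1, terrible = -1)

# Net score of one comment: sum of the scores of its matched words
score_comment <- function(text) {
  words <- strsplit(gsub("[[:punct:]]", "", tolower(text)), "\\s+")[[1]]
  sum(lexicon[words[words %in% names(lexicon)]])
}

comments$Score <- sapply(comments$Comment_Text, score_comment,
                         USE.NAMES = FALSE)

# Categorize each comment from the sign of its score
comments$Sentiment <- ifelse(comments$Score > 0, "Positive",
                      ifelse(comments$Score < 0, "Negative", "Neutral"))

# Share of positive comments per post, as a percentage
pct_positive <- tapply(comments$Sentiment == "Positive",
                       comments$Post_ID, mean) * 100

print(comments[, c("Post_ID", "Sentiment")])
print(round(pct_positive, 2))
# With this mini-lexicon: 101 -> 66.67, 102 -> 50, 103 -> 0
```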
