Assignment 2 297
ASSIGNMENT 2
BHUVANA SREE
VU22CSEN0300297
* Explore new dataset
* Load/Import dataset
* Understand the data (size, dimensions, type)
* Perform Data Preprocessing
* Descriptive Statistics
* Analysis
* Visualization summary
* Detect the outliers
* Conclude the findings - Insights
Introduction
This report presents an in-depth analysis of a dataset focused on employee performance within a
fictional company. The dataset consists of 100 records, capturing key attributes such as Employee ID,
Department, Salary, Years of Experience, Job Satisfaction, Performance Score, and Age. The objective
of this analysis is to explore these attributes to identify trends, relationships, and potential outliers
that may influence overall employee performance.
We will begin by understanding the structure and composition of the dataset, followed by necessary
data preprocessing steps to ensure data quality. Descriptive statistics will provide insights into the
central tendencies and variability of the key metrics. We will then visualize important aspects of the
data to illustrate trends and distributions, followed by a thorough investigation of outliers in the
dataset.
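set.seed(789)
# Create an employee performance dataset
employee_data <- data.frame(
  Employee_ID = 1:100
  # The remaining columns (Department, Salary, Years of Experience, Job Satisfaction,
  # Performance Score, Age) are not shown in this listing
)
head(employee_data)
dim(employee_data) # Dimensions
missing_data <- sum(is.na(employee_data)) # assumed definition; not shown in this listing
if (missing_data > 0) {
  message("Missing values found: ", missing_data)
} else {
  message("No missing values found")
}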
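Because the listing shows only the first column definition, the following is a minimal sketch of how the full dataset could be simulated and inspected. The seed and the seven attributes come from the report; the department names, the identifier spellings `Department` and `Years_of_Experience`, and every distribution and parameter are assumptions made for illustration, not the values used in the original analysis.

```r
set.seed(789)  # same seed as the report; everything sampled below is an assumption

n <- 100
employee_data <- data.frame(
  Employee_ID = 1:n,
  Department = sample(c("HR", "IT", "Sales", "Finance"), n, replace = TRUE),  # hypothetical names
  Salary = round(rnorm(n, mean = 60000, sd = 12000)),                        # hypothetical scale
  Years_of_Experience = sample(0:30, n, replace = TRUE),
  Job_Satisfaction = sample(1:5, n, replace = TRUE),
  Performance_Score = round(runif(n, min = 1, max = 10), 1),
  Age = sample(22:60, n, replace = TRUE)
)

head(employee_data)  # first six rows
dim(employee_data)   # number of rows and columns
str(employee_data)   # type of each column

# Preprocessing: count missing cells and drop incomplete rows if any exist
missing_data <- sum(is.na(employee_data))
if (missing_data > 0) {
  employee_data <- na.omit(employee_data)
  message("Dropped rows containing ", missing_data, " missing value(s)")
} else {
  message("No missing values found")
}
```

Since simulated data of this kind contains no missing values, the `else` branch is the one expected to run; the `if` branch is kept so the same check works on real data.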
5. Descriptive Statistics
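library(dplyr)
desc_stats <- employee_data %>%
  summarise(
    Mean_Salary = mean(Salary),
    Median_Salary = median(Salary),
    Mean_Job_Satisfaction = mean(Job_Satisfaction),
    Mean_Performance_Score = mean(Performance_Score),
    # Age bounds as two scalar columns (range() would return two values)
    Min_Age = min(Age),
    Max_Age = max(Age)
  )
print(desc_stats)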
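When summarising a column such as Age, note that base R's `range()` returns a vector of length two (minimum, then maximum) rather than a single number, so reporting the two bounds as separate columns keeps the summary to one row. A small sketch with made-up ages (the values and the names `ages`, `age_range`, and `age_summary` are hypothetical):

```r
ages <- c(25, 31, 47, 38, 52)  # made-up values for illustration only

age_range <- range(ages)       # length-2 vector: minimum, then maximum

# One-row summary with each bound in its own column
age_summary <- data.frame(
  Min_Age = min(ages),
  Max_Age = max(ages),
  Mean_Age = mean(ages),
  Median_Age = median(ages)
)

print(age_range)
print(age_summary)
```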
a. Distribution of Salary
# Distribution of Salary
print(salary_plot)
geom_boxplot(fill = "lightgreen") +
print(job_satisfaction_plot)
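The plotting code above is only partly visible, so the following is a hedged sketch of what the two plots might look like with ggplot2. The object names `salary_plot` and `job_satisfaction_plot` and the `geom_boxplot(fill = "lightgreen")` layer come from the listing; the histogram geometry, the grouping by department, the labels, and the stand-in data are assumptions.

```r
library(ggplot2)

set.seed(789)
# Hypothetical stand-in data; the report's actual column values are not shown
plot_data <- data.frame(
  Department = sample(c("HR", "IT", "Sales", "Finance"), 100, replace = TRUE),
  Salary = round(rnorm(100, mean = 60000, sd = 12000)),
  Job_Satisfaction = sample(1:5, 100, replace = TRUE)
)

# Distribution of Salary (a histogram is assumed here)
salary_plot <- ggplot(plot_data, aes(x = Salary)) +
  geom_histogram(bins = 20, fill = "lightblue", color = "black") +
  labs(title = "Distribution of Salary", x = "Salary", y = "Count")

# Job satisfaction per department (the grouping variable is assumed)
job_satisfaction_plot <- ggplot(plot_data, aes(x = Department, y = Job_Satisfaction)) +
  geom_boxplot(fill = "lightgreen") +
  labs(title = "Job Satisfaction by Department",
       x = "Department", y = "Job Satisfaction")

print(salary_plot)
print(job_satisfaction_plot)
```

Storing each plot in a variable before printing, as the listing does, makes it easy to save the same object later with `ggsave()`.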
7. Detect Outliers
IQR <- Q3 - Q1
# Print outliers
print(outliers)
# Visualize Outliers
geom_boxplot() +
print(outlier_plot)
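The `IQR <- Q3 - Q1` step above suggests the common 1.5 × IQR rule, under which a value is flagged as an outlier when it lies below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR. The sketch below illustrates that rule in base R. The helper `find_iqr_outliers`, the multiplier argument `k`, and the sample salaries are hypothetical, and the report does not state which column was actually tested.

```r
# Flag values outside [Q1 - k * IQR, Q3 + k * IQR]
find_iqr_outliers <- function(x, k = 1.5) {
  q1 <- quantile(x, 0.25, names = FALSE)
  q3 <- quantile(x, 0.75, names = FALSE)
  iqr_value <- q3 - q1  # named iqr_value to avoid confusion with stats::IQR()
  x < (q1 - k * iqr_value) | x > (q3 + k * iqr_value)
}

# Made-up salaries for illustration; the last value is deliberately extreme
salaries <- c(54000, 58000, 61000, 60000, 57000, 63000, 59000, 150000)

outliers <- salaries[find_iqr_outliers(salaries)]
print(outliers)
```

Returning a logical vector rather than the outlying values themselves lets the same helper be used to filter rows of a data frame, for example `employee_data[find_iqr_outliers(employee_data$Salary), ]`.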
Conclusion
In this structured analysis of the employee performance dataset, we walked through each phase, from data
creation and preprocessing to descriptive statistics, visualization, and outlier detection. The insights gained
can help organizations understand salary distributions and the relationship between job satisfaction and
performance, and flag outliers that may warrant further investigation. These findings can ultimately guide
management practices aimed at improving employee engagement and productivity.