Predictive Analytics
Group Assignment 2
Group ID: 191233
Submitted by:
Names Roll No
Divyesh Jain 191222
Hemil Joshi 191228
Namit Maheshwari 191233
Omkar Khandekar 191234
Shashank Saxena 191247
Submitted To:
Prof. Chetan Jhaveri
Batch: MBA – FT (2019-2021)
Institute of Management, Nirma University
Date of Submission: 4th October, 2020
List of Packages Used
S. No. Package Name
1 caret
2 caTools
3 cowplot
4 datarium
5 DMwR
6 ggplot2
7 lift
8 lmtest
9 MASS
10 nortest
11 olsrr
12 ROCR
Pre-modelling Process
S. No. Function Description
1 any(is.na()) Checks whether the dataset contains any missing values
2 as.factor() Converts data into a factor
3 as.integer() Converts a value to an integer
4 as.numeric() Converts a value to numeric
5 attach() Attaches the data to the R search path
6 boxplot() Displays a boxplot of the data
7 cbind() Combines vectors, matrices or data frames column-wise
8 class() Returns the class of an object
9 cor() Computes the correlation between variables
10 cov() Computes the covariance between variables
11 data() Loads a built-in or package dataset
12 dim() Gives the dimensions (rows and columns) of the dataset
13 getwd() Returns the current working directory
14 head() Displays the first 6 rows of the dataset (by default)
15 hist() Creates a histogram of a variable
16 ifelse() Vectorised conditional statement
17 install.packages() Installs a package into the R library
18 IQR() Computes the interquartile range
19 is.na() Flags which values are missing
20 library() Loads an installed package
21 ls() Lists the objects in the current environment
22 matrix() Creates a matrix with m × n dimensions
23 mean() Computes the mean of a variable
24 names() Returns the column names of the dataset
25 nrow() Returns the number of rows
26 pairs() Draws a scatterplot matrix of the variables
27 plot() Draws a scatter plot of two variables
28 rbind() Combines vectors, matrices or data frames row-wise
29 read.csv() Reads a CSV file into a data frame
30 sd() Computes the standard deviation
31 setwd() Sets a new working directory for R
32 str() Displays the overall structure of the dataset
33 subset() Creates a subset of the dataset that meets a condition
34 sum(is.na()) Counts the total number of missing values
35 summary() Provides summary statistics (minimum, quartiles, median, mean, maximum) for each column
36 tail() Displays the last 6 rows of the dataset (by default)
37 var() Computes the variance
38 View() Opens the dataset in a spreadsheet-style viewer
39 which(is.na()) Gives the positions of missing values
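
A minimal sketch of how several of the pre-modelling functions above fit together. The `scores` data frame and its columns are made up for illustration and are not the assignment data; mean imputation is just one simple choice for handling a missing value.

```r
# Illustrative only: `scores` is a hypothetical data frame, not the assignment data.
scores <- data.frame(
  hours  = c(2, 4, NA, 8, 10),
  marks  = c(35, 50, 58, 80, 92),
  passed = c("no", "yes", "yes", "yes", "yes")
)

str(scores)        # structure: column types and first values
dim(scores)        # number of rows and columns
head(scores, 3)    # first 3 rows

any(is.na(scores))                          # TRUE if any value is missing
n_missing   <- sum(is.na(scores))           # total number of missing values
missing_pos <- which(is.na(scores$hours))   # positions of missing values in one column

scores$passed <- as.factor(scores$passed)   # convert text labels to a factor

# Replace the missing value with the column mean (an assumed, simple imputation)
scores$hours <- ifelse(is.na(scores$hours),
                       mean(scores$hours, na.rm = TRUE),
                       scores$hours)

summary(scores)                                     # per-column summary statistics
hours_marks_cor <- cor(scores$hours, scores$marks)  # correlation between two variables
high <- subset(scores, marks > 50)                  # rows with marks above 50
```

Only base R functions are used here, so no packages need to be installed to try it.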
Modelling Process
S. No. Function Description
1 abline() Adds a vertical, horizontal or regression line to a plot
2 ad.test() [nortest] Anderson-Darling test for normality
3 anova() Produces the ANOVA table of a model
4 bptest() [lmtest] Breusch-Pagan test for constant residual variance
5 confint() Computes confidence intervals for model parameters (95% by default)
6 [datarium] Data package for visualisation and statistical analysis
7 durbinWatsonTest() [car package] Checks for autocorrelation in the residuals
8 exp() Computes the exponential of a value
9 ggplot() [ggplot2] Visualises the data graphically
10 lm() Fits a linear regression model by least squares
11 lrtest() [lmtest] Likelihood ratio test comparing two or more models for goodness of fit
12 ncvTest() [car package] Test for non-constant error variance
13 qqline() Adds a reference line to a Q-Q plot
14 qqnorm() Draws a normal Q-Q plot
15 qqPlot() [car package] Quantile comparison plot
16 regr.eval() [DMwR] Calculates a set of regression evaluation statistics
17 sample.split() [caTools] Splits the dataset (for example, into training and test sets)
18 stepAIC() [MASS] Selects the best-fitting model stepwise by AIC
19 var() [stats] Computes variance-covariance matrices
20 varImp() [caret] Ranks variables by importance
21 vif() [car package] Checks for multicollinearity (variance inflation factors)
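
A short sketch of a modelling workflow built from the base R functions in the table above. It assumes R's built-in `mtcars` dataset and the hypothetical model names `fit_small` and `fit_full`; neither comes from the assignment. The package-specific tests (for example `bptest()`, `ncvTest()` and `vif()`) are left out so the sketch runs without extra installations.

```r
# Illustrative only: uses the built-in mtcars data, not the assignment data.
data(mtcars)

# Fit two nested least-squares regression models
fit_small <- lm(mpg ~ wt, data = mtcars)
fit_full  <- lm(mpg ~ wt + hp, data = mtcars)

summary(fit_full)                        # coefficients, R-squared, F-statistic
ci  <- confint(fit_full)                 # 95% confidence intervals by default
cmp <- anova(fit_small, fit_full)        # F-test comparing the nested models

# Normal Q-Q coordinates of the residuals; plot.it = FALSE skips drawing,
# so the sketch also runs where no graphics device is available
res <- residuals(fit_full)
qq  <- qqnorm(res, plot.it = FALSE)

# exp() back-transforms a log-scale quantity, e.g. from a log-response model
fit_log <- lm(log(mpg) ~ wt, data = mtcars)
wt_effect <- exp(coef(fit_log)["wt"])    # multiplicative change in mpg per unit of wt
```

In an interactive session, calling `qqnorm(res)` followed by `qqline(res)` draws the plot with its reference line.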
Post Modelling Process
S. No. Function Description
1 confusionMatrix() [caret] Table describing the classification performance of the model
2 geom_point() [cowplot, ggplot2] Adds a layer of points to a plot; points can be coloured by a variable
3 grep() Pattern matching
4 performance() [ROCR] Creates 2-D parameterised performance curves (for example, ROC)
5 plotLift() [lift] Draws the lift chart comparing predicted and actual outcomes in logistic regression
6 predict() Predicts values from a fitted regression model
7 table() Creates a contingency table of counts
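
A minimal sketch of the post-modelling step using only `predict()`, `ifelse()` and `table()`. The logistic model fitted with `glm()`, the built-in `mtcars` data and the 0.5 cut-off are all assumptions made for illustration; `confusionMatrix()` from caret reports a similar table along with further statistics.

```r
# Illustrative only: uses the built-in mtcars data, not the assignment data.
data(mtcars)

# Logistic regression: probability of a manual transmission (am = 1) given weight
fit <- glm(am ~ wt, data = mtcars, family = binomial)

prob <- predict(fit, type = "response")   # predicted probabilities
pred <- ifelse(prob > 0.5, 1, 0)          # classify at an assumed cut-off of 0.5

# Cross-tabulate predicted against actual classes
conf_mat <- table(Predicted = pred, Actual = mtcars$am)
print(conf_mat)

# Share of observations classified correctly
accuracy <- mean(pred == mtcars$am)
```

The cut-off is a modelling choice; raising or lowering it trades one type of misclassification against the other.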