STAT 210

Applied Statistics and Data Analysis

Problem List 2 - Solution
(due on week 3)
Spring 2024

Exercise 1
This is an exercise on the use of plot and its arguments. You will use the data in the file prob1.csv.
(1) Read the data in the file prob1.csv into a data frame named data1. Use str to explore the structure of
data1. If the variable class is of mode chr, change it to a factor.
data1 <- read.csv('prob1.csv', header = TRUE)
str(data1)

## 'data.frame': 100 obs. of 5 variables:

## $ var1 : num 3.18 4.28 3.2 3.33 3.67 ...
## $ var2 : num 5.312 7.678 10.07 -0.496 -2.169 ...
## $ var3 : num 6.56 12.33 6.89 7.3 9.4 ...
## $ var4 : num 3.199 4.838 6.415 0.412 0.716 ...
## $ class: chr "A" "A" "A" "A" ...
## Change class to a factor:
data1$class <- factor(data1$class)
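
The exercise asks for the conversion only when class is stored as character. A minimal sketch of that conditional step follows; the data frame toy is a hypothetical stand-in, since the contents of prob1.csv are not reproduced here.

```r
# Hypothetical stand-in for data1; the real values live in prob1.csv.
toy <- data.frame(var1 = c(3.2, 4.1, 2.9, 3.7),
                  class = c("A", "A", "B", "B"),
                  stringsAsFactors = FALSE)

# Convert only when the column is still character, as the exercise asks.
if (is.character(toy$class)) {
  toy$class <- factor(toy$class)
}

levels(toy$class)   # level names, in alphabetical order
table(toy$class)    # number of observations per level
```

Checking with is.character first keeps the step safe to rerun: a column that is already a factor is left untouched.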

(2) Divide the plotting window into four sectors and draw boxplots for the four numerical variables according
to class. Comment on what you observe.
par(mfrow = c(2,2))
plot(var1 ~ class, data = data1, col = 'wheat')
plot(var2 ~ class, data = data1, col = 'wheat')
plot(var3 ~ class, data = data1, col = 'wheat')
plot(var4 ~ class, data = data1, col = 'wheat')

[Figure: boxplots of var1, var2, var3, and var4 by class (A, B), arranged in a 2 x 2 grid.]
par(mfrow = c(1,1))

For var1 we see that the values cover approximately the same range and there seem to be no important
differences between the two classes. On the other hand, for var2 the differences are marked. The boxes
(representing the central 50% of the data) are disjoint, and the range of values for class B is much narrower
than for class A. For var3 the values for class A are lower than for class B but the central boxes overlap
considerably. Finally, for var4 the situation is similar to that of var1.
(3) Divide the plotting window into four sectors; on the left column, you will use var1 and on the right
column, var2, while on top, you will use data from class A, and the bottom corresponds to class B.
Plot histograms of relative frequencies in the four windows according to the previous description. Use
the variable name as a label for the x-axis and add a title to each plot. Since you want to compare the
distribution of the variables according to class, the scales on the x-axis should be the same for plots on
the same column. Comment on what you observe.
par(mfcol = c(2,2))
hist(data1$var1[data1$class=='A'], breaks = 10, xlim = c(2, 5), xlab = 'var 1',
     main = 'Var1 for class A', col = 'azure2')
hist(data1$var1[data1$class=='B'], breaks = 8, xlim = c(2, 5), xlab = 'var 1',
     main = 'Var1 for class B', col = 'azure2')
hist(data1$var2[data1$class=='A'], breaks = 10, xlim = c(-20, 20), xlab = 'var 2',
     main = 'Var2 for class A', col = 'azure2')
hist(data1$var2[data1$class=='B'], breaks = 8, xlim = c(-20, 20), xlab = 'var 2',
     main = 'Var2 for class B', col = 'azure2')

[Figure: histograms in a 2 x 2 grid. Left column: 'Var1 for class A' (top) and 'Var1 for class B' (bottom), x-axis 'var 1' from 2 to 5. Right column: 'Var2 for class A' (top) and 'Var2 for class B' (bottom), x-axis 'var 2' from -20 to 20. The y-axis is labelled Frequency in all panels.]
par(mfrow = c(1,1))

We confirm our observations from (2): the distributions for var1 in classes A and B are very similar, while
for var2 these distributions are very different. var2 for class B only has negative values, roughly between -15
and 0, while for class A the values go from -20 to 20.
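
The histograms above show counts on the y-axis, which is the default of hist. The sketch below shows one possible way to put two groups on a relative scale with shared breaks. The vectors x_a and x_b are simulated stand-ins, not the values in prob1.csv, and the choice of 10 bins is arbitrary.

```r
set.seed(1)
# Simulated stand-ins for one variable in two classes (not the prob1.csv data).
x_a <- rnorm(50, mean = 0,  sd = 8)
x_b <- rnorm(50, mean = -7, sd = 3)

# Common breaks that cover both samples, so bins line up across panels.
lims <- range(c(x_a, x_b))
brks <- seq(floor(lims[1]), ceiling(lims[2]), length.out = 11)

par(mfrow = c(2, 1))
# freq = FALSE plots densities: bar areas (not heights) sum to 1.
h_a <- hist(x_a, breaks = brks, freq = FALSE, xlab = 'x', main = 'Class A')
h_b <- hist(x_b, breaks = brks, freq = FALSE, xlab = 'x', main = 'Class B')
par(mfrow = c(1, 1))

# Relative frequency of each bin = count / sample size.
rel_a <- h_a$counts / sum(h_a$counts)
```

Sharing the breaks, and not only xlim, makes bar heights directly comparable between the two panels.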
(4) Using the function plot, create a matrix of plots for the four numerical variables in prob1. Use a solid
square as the plotting symbol and color by the values of class. Comment on what you observe. Which
variables seem to be related?
plot(data1[,1:4], col = data1$class, pch = 15)

[Figure: scatterplot matrix of var1, var2, var3, and var4, with points colored by class.]

We see that var1 and var3 seem to be linearly related for both classes, having similar slopes. var2 and var4
also seem to have a linear relation, but in this case the slopes for the two classes are very different. The
remaining pairs of variables do not seem to be related.
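
A visual impression of "same slope" or "different slope" can be checked numerically within each class. The sketch below does this on simulated data in which the two classes have different slopes by construction; the data frame toy and its columns are illustrative assumptions, not the variables in prob1.csv.

```r
set.seed(2)
# Simulated stand-in: y is linear in x, with a different slope in each class.
n   <- 50
cls <- factor(rep(c('A', 'B'), each = n))
x   <- rnorm(2 * n)
y   <- ifelse(cls == 'A', 2 * x, 0.3 * x) + rnorm(2 * n, sd = 0.2)
toy <- data.frame(x, y, cls)

# Correlation of x and y within each class.
sapply(split(toy, toy$cls), function(d) cor(d$x, d$y))

# Least-squares slope of y on x within each class.
slopes <- sapply(split(toy, toy$cls), function(d) coef(lm(y ~ x, data = d))[['x']])
slopes
```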

Exercise 2
For this exercise, we will use the data set Auto in the ISLR package, which has information regarding fuel
consumption (miles per gallon, mpg) and other variables for 392 different car models.
(1) Use the functions str and help to explore this data set.
library(ISLR)
data(Auto)
str(Auto)

## 'data.frame': 392 obs. of 9 variables:

## $ year : num 70 70 70 70 70 70 70 70 70 70 ...
## $ origin : num 1 1 1 1 1 1 1 1 1 1 ...
## $ name : Factor w/ 304 levels "amc ambassador brougham",..: 49 36 231 14 161 141 54 223 241 2
# help(Auto)

The data set has information on mpg and 8 other variables for 392 vehicles. The variables are numeric except
name, which is a factor. The help page (not shown) gives detailed information about the variables in the data set.
(2) If you use the command unique(Auto$cylinders), you will get the different values for the number of
cylinders in the data set. We will be only interested in cars with 4, 6, or 8 cylinders. Using the function
subset, create a new data frame named Auto_new that only includes cars with the selected number of
cylinders.
unique(Auto$cylinders)

## [1] 8 4 6 3 5
We see that there are cars with 3, 4, 5, 6, and 8 cylinders. We use subset to extract the data according to
the question:
Auto_new <- subset(Auto, cylinders %in% c(4, 6, 8))
str(Auto_new)

## 'data.frame': 385 obs. of 9 variables:

## $ mpg : num 18 15 18 16 17 15 14 14 14 15 ...
## $ cylinders : num 8 8 8 8 8 8 8 8 8 8 ...
## $ displacement: num 307 350 318 304 302 429 454 440 455 390 ...
## $ horsepower : num 130 165 150 150 140 198 220 215 225 190 ...
## $ weight : num 3504 3693 3436 3433 3449 ...
## $ acceleration: num 12 11.5 11 12 10.5 10 9 8.5 10 8.5 ...
## $ year : num 70 70 70 70 70 70 70 70 70 70 ...
## $ origin : num 1 1 1 1 1 1 1 1 1 1 ...
## $ name : Factor w/ 304 levels "amc ambassador brougham",..: 49 36 231 14 161 141 54 223 241 2
With the restriction on the number of cylinders, we have lost seven cars in the data set.
(3) Using the data frame you created in (2), plot mpg as a function of year, and color the points according to the
value of cylinders. Use a solid triangle as plotting symbol and add a legend on the top left corner.
Comment on the plot.
plot(mpg ~ year, data = Auto_new, col = cylinders, pch = 17)
legend('topleft',legend = c(4,6,8),pch = 17, col = c(4,6,8), title = 'cylinders')

[Figure: scatter plot of mpg against year (70 to 82), with triangles colored by cylinders (4, 6, 8) and a legend in the top left corner.]

Blue triangles, corresponding to cars with 4 cylinders, usually have the higher values, indicating better
efficiency. Grey triangles, representing cars with 8 cylinders, are usually at the bottom. There are no grey
triangles for years 80 - 82. The plot shows an increasing trend, indicating improving fuel efficiency.
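
In the call above, col = cylinders relies on the numeric values 4, 6, and 8 being read as indices into the current color palette. A possible alternative maps each group to a named color, so that the plot and the legend share one lookup table. The data frame toy and the color names below are illustrative assumptions, not the Auto data.

```r
# Toy stand-in for a few rows of Auto_new (values are made up).
toy <- data.frame(year      = c(70, 72, 75, 78, 80, 82),
                  mpg       = c(15, 24, 18, 30, 21, 36),
                  cylinders = c(8, 4, 6, 4, 6, 4))

# One lookup table, indexed by the cylinder count as a character string.
cyl_cols <- c('4' = 'blue', '6' = 'red', '8' = 'grey40')
pt_cols  <- cyl_cols[as.character(toy$cylinders)]

plot(mpg ~ year, data = toy, col = pt_cols, pch = 17)
legend('topleft', legend = names(cyl_cols), col = cyl_cols, pch = 17,
       title = 'cylinders')
```

With a named lookup, the colors no longer depend on which palette is active in the R session.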
(4) In some countries, fuel consumption is measured in liters of fuel per 100 kilometers. Using the fact that
one mile = 1.61 kilometers, and one gallon = 3.785 liters, create a new variable in Auto_new called fc
(for fuel consumption), that has fuel consumption measured in liters per 100 kilometers.
km_lt <- Auto_new$mpg*1.61/3.785
Auto_new$fc <- 100/km_lt
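
Combining the two steps above gives fc = 100 * 3.785 / (1.61 * mpg), roughly 235.09 / mpg. As a sketch, the same conversion can be wrapped in a small function; the name mpg_to_l100km is hypothetical and the constants are the ones quoted in the exercise.

```r
# Constants quoted in the exercise.
km_per_mile       <- 1.61
liters_per_gallon <- 3.785

# Hypothetical helper: miles per gallon -> liters per 100 km.
mpg_to_l100km <- function(mpg) {
  km_per_liter <- mpg * km_per_mile / liters_per_gallon
  100 / km_per_liter
}

mpg_to_l100km(c(10, 20, 40))
# The conversion is reciprocal: doubling mpg halves the consumption.
```

Because the relation is reciprocal, a linear trend in mpg becomes a curved trend in fc, and vice versa.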

(5) Plot fc against displacement and color the dots by cylinders. Use a solid point as plotting symbol
and add a legend and title to the plot.
plot(fc ~ displacement, data = Auto_new, pch = 16, col = cylinders,
     main = 'Fuel consumption against displacement')
legend('topleft',legend = c(4,6,8),pch = 16, col = c(4,6,8), title = 'cylinders')

[Figure: scatter plot of fc against displacement (about 100 to 400), with solid points colored by cylinders (4, 6, 8) and a legend in the top left corner.]

We see an increasing trend in fuel consumption as the engine displacement increases, but the variability does
not seem constant.
In this plot, blue dots, corresponding to cars with four cylinders, are at the lower left corner, corresponding
to smaller engines (lower displacement) and lower fuel consumption. In the central part of the plot, we
have red dots corresponding to 6-cylinder cars. They have larger displacement and higher fuel
consumption. The variability seems similar to that of the blue dots. Finally, the grey dots occupy mostly the
upper right region of the graph, with larger displacement and higher fuel consumption. They also show more
variability than the other two groups.

Exercise 3
Histograms
For this exercise we are going to use simulated data from a mixture of normal distributions. In this population,
45% of the points come from a normal distribution with mean 13 and standard deviation 0.75, and 55% come
from a normal distribution with mean 16 and standard deviation 1.

$$0.45\, N(13,\ 0.75^2) + 0.55\, N(16,\ 1)$$

The code below plots the density for this distribution.

points.x <- seq(10,20,length=1000)
points.dens <- 0.45*(dnorm(points.x, mean=13, sd = 0.75)) +
0.55*(dnorm(points.x, mean = 16, sd = 1))
plot(points.x,points.dens,type='l',xlab='values',ylab='density',lwd = 2,
col = 'navyblue', main = 'Mixture Distribution')
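
As a sanity check on the density above, the sketch below writes it as a function, integrates it numerically, and locates its highest point on a grid. The name mix_dens and the integration limits 5 and 25 are assumptions made for illustration; the limits are chosen wide enough to hold essentially all of the probability mass.

```r
# Mixture density as a function (same weights and parameters as above).
mix_dens <- function(x) {
  0.45 * dnorm(x, mean = 13, sd = 0.75) + 0.55 * dnorm(x, mean = 16, sd = 1)
}

# Total probability over a wide interval: should be numerically close to 1.
total <- integrate(mix_dens, lower = 5, upper = 25)$value
total

# Mixture mean: weighted average of the component means.
mix_mean <- 0.45 * 13 + 0.55 * 16
mix_mean

# Location of the highest point of the density on a fine grid.
grid <- seq(10, 20, length.out = 1001)
top  <- grid[which.max(mix_dens(grid))]
top
```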

[Figure: 'Mixture Distribution', density plotted against values from 10 to 20.]

The following commands draw a sample of size 500 from this mixture and print the range of values for the
simulated data. The sample is stored in the vector mix.sample.
n <- 500; set.seed(4567)
unif.sample <- runif(n) <= 0.45
mix.sample <- unif.sample *rnorm(n, mean=13, sd = 0.75) +
(1-unif.sample)*rnorm(n, mean=16, sd = 1)
(rng <- range(mix.sample))

## [1] 10.84217 18.75402
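
The sampling code above draws from both components for every observation and keeps one of the two draws. An equivalent idea, sketched here under different variable names and a different seed (so the numbers will not match mix.sample), first picks a component for each observation and then draws once with the matching mean and standard deviation.

```r
set.seed(10)
n <- 500

# Step 1: choose a component for each observation (TRUE = first component).
from_first <- runif(n) <= 0.45

# Step 2: draw each value from the normal distribution of its component.
mu    <- ifelse(from_first, 13, 16)
sigma <- ifelse(from_first, 0.75, 1)
alt.sample <- rnorm(n, mean = mu, sd = sigma)

mean(from_first)   # proportion from the first component, expected near 0.45
mean(alt.sample)   # expected near the mixture mean 0.45 * 13 + 0.55 * 16 = 14.65
```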

We will use this sample to draw histograms with the function truehist in the MASS package. Look up the
help for truehist. It is also a good idea to explore the use of the function hist in base R by
repeating this exercise using hist.
1. Divide the plotting window into 4 using the function par with argument mfrow. Select four disjoint
subsets of data of length 25 and draw histograms for them. Set the bin width to 0.5 in all plots. Make
sure that the scales are the same for all plots. Are these plots similar to the density in the previous
slide?
library(MASS)
par(mfrow=c(2,2))
truehist(mix.sample[1:25], xlab = 'values', h = 0.5,
xlim = c(floor(rng[1]),ceiling(rng[2])),
ylim = c(0,0.6))
truehist(mix.sample[101:125], xlab = 'values', h = 0.5,
xlim = c(floor(rng[1]),ceiling(rng[2])),
ylim = c(0,0.6))
truehist(mix.sample[201:225], xlab = 'values', h = 0.5,
xlim = c(floor(rng[1]),ceiling(rng[2])),
ylim = c(0,0.6))
truehist(mix.sample[301:325], xlab = 'values', h = 0.5,
xlim = c(floor(rng[1]),ceiling(rng[2])),
ylim = c(0,0.6))

[Figure: four histograms of 25 points each, in a 2 x 2 grid, with x-axis 'values' and a common y-axis from 0 to 0.6.]

Not really. In the second and third plots the bimodality is not clear. The second plot looks like a right-skewed
distribution. In the third plot the data look more uniformly distributed, with data missing in some intervals.
2. Divide the plotting window into 4 using the function par with argument mfrow. Draw successive
histograms of relative frequency for the first 25, 50, 100, and 500 points in mix.sample. Set the bin
width to 0.5 in all plots. Make sure that the scales are the same for all plots. Are these plots similar to
the density in the previous slide?
par(mfrow=c(2,2))
truehist(mix.sample[1:25], xlab = 'values', h = 0.5,
         xlim = c(floor(rng[1]),ceiling(rng[2])), ylim = c(0,0.6))
truehist(mix.sample[1:50], xlab = 'values', h = 0.5,
         xlim = c(floor(rng[1]),ceiling(rng[2])), ylim = c(0,0.6))
truehist(mix.sample[1:100], xlab = 'values', h = 0.5,
         xlim = c(floor(rng[1]),ceiling(rng[2])), ylim = c(0,0.6))
truehist(mix.sample[1:500], xlab = 'values', h = 0.5,
         xlim = c(floor(rng[1]),ceiling(rng[2])), ylim = c(0,0.6))

[Figure: four histograms for increasing sample sizes, in a 2 x 2 grid, with x-axis 'values'.]

As the sample size increases, the plots look more and more like the population density. They show clearly the
bimodal nature of the data and the proportion between the two modes is approximately correct.
3. Using again the function par with argument mfrow, set the graphical window to a single graph. Draw a
histogram of relative frequency using all the points in mix.sample. Choose the number of bins (nbins)
using the Scott rule.
par(mfrow=c(1,1))
truehist(mix.sample, xlab = 'values',
xlim = c(floor(rng[1]),ceiling(rng[2])),
main = 'Histogram of simulated data', nbins = 'Scott',
ylim = c(0,0.25))
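
For reference, the sketch below computes the Scott bin width and bin count by hand and compares the count with nclass.scott from grDevices. The sample x is a simulated stand-in rather than mix.sample. A plotting function may round the resulting breaks to convenient values, so the number of bars actually drawn can differ from this count.

```r
set.seed(3)
# Simulated stand-in sample (not mix.sample).
x <- c(rnorm(225, mean = 13, sd = 0.75), rnorm(275, mean = 16, sd = 1))

# Scott's bin width as implemented in nclass.scott: 3.5 * sd * n^(-1/3).
h_scott <- 3.5 * sd(x) * length(x)^(-1/3)

# Number of bins needed to cover the range with that width.
n_bins <- ceiling(diff(range(x)) / h_scott)
n_bins

# Same count from the built-in helper in grDevices.
nclass.scott(x)
```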

[Figure: 'Histogram of simulated data', x-axis 'values', y-axis from 0 to 0.25.]
4. Using the function lines with argument density(mix.sample), add an estimate of the density for
this sample. Color the line in blue. Add also a graph of the theoretical density in red (look back to the
previous page to see how this density was plotted before and make the necessary changes). Comment
on what you observe.
par(mfrow=c(1,1))
truehist(mix.sample, xlab = 'values',
xlim = c(floor(rng[1]),ceiling(rng[2])),
main = 'Histogram of simulated data', nbins = 'FD',
ylim = c(0,0.25))
lines(density(mix.sample),col = 'blue', lwd=2)
lines(points.x,points.dens,type='l',col = 2, lwd=2)

[Figure: 'Histogram of simulated data' with the estimated density (blue) and the theoretical density (red) overlaid, x-axis 'values'.]
We see that the estimated density and the histogram are reasonably close to the population density.

Exercise 4
In this exercise we look at quantile plots. In all cases we will consider samples simulated from the normal
distribution. We explore the effect of size, mean, and variance, and also use qqplot to compare samples.
1. Divide the graphical window into four regions using par and mfrow. Generate four samples from
the standard normal distribution of size 10 and draw normal quantile plots. Add lines with qqline.
Comment on what you observe.
2. Repeat for sample sizes 20, 50, and 100. Comment on what you observe.
3. Draw samples of size 50 from normal distributions with means -6, -2, 2, and 6, all with variance 1 and
draw the corresponding quantile plots. To be able to compare the four graphs, find a suitable common
scale for the axes for all plots. Comment on the similarities and differences between the plots.
4. Draw samples of size 50 from normal distributions with mean 1 and standard deviations 0.5, 2, 4, and
6, and draw the corresponding quantile plots. To be able to compare the four graphs, find a suitable
common scale for the axes for all plots. Comment on the similarities and differences between the plots.
5. Draw two samples of size 10 from the standard normal distribution and compare them using qqplot.
Repeat a total of four times. Plot the four graphs on the same window. Comment on what you see.
1. Divide the graphical window into four regions using par and mfrow. Generate four samples from
the standard normal distribution of size 10 and draw normal quantile plots. Add lines with qqline.
Comment on what you observe.
par(mfrow=c(2,2))
for(i in 1:4) {samp1 <- rnorm(10); qqnorm(samp1); qqline(samp1)}

[Figure: four normal Q-Q plots (sample size 10) in a 2 x 2 grid, Sample Quantiles against Theoretical Quantiles, each with its qqline.]

In plot 1 the six points in the middle are aligned but the other four points do not show a good fit. Something
similar occurs with plot 3. Plots 2 and 4 show a good alignment of the points.
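
To make explicit what a normal quantile plot displays, the sketch below rebuilds the plotted coordinates by hand for one simulated sample of size 10: the sorted observations are paired with standard normal quantiles evaluated at the plotting positions given by ppoints. The object names qq, theo, and manual are illustrative.

```r
set.seed(4)
x <- rnorm(10)

# Coordinates that qqnorm would plot, without drawing anything.
qq <- qqnorm(x, plot.it = FALSE)

# Theoretical quantiles: standard normal quantiles at the plotting positions.
theo <- qnorm(ppoints(length(x)))

# Pair the i-th smallest observation with the i-th theoretical quantile.
manual <- data.frame(theoretical = theo, sample = sort(x))
manual
```

With only 10 points, each extreme observation is a single draw from the tail, which is why the ends of the plots above can stray from the line even though the data are normal.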

2. Repeat for sample sizes 20, 50, and 100. Comment on what you observe.
par(mfrow=c(2,2))
for(i in 1:4) {samp1 <- rnorm(20); qqnorm(samp1); qqline(samp1)}

[Figure: four normal Q-Q plots (sample size 20) in a 2 x 2 grid, Sample Quantiles against Theoretical Quantiles, each with its qqline.]

For sample size 20 the fit is reasonable, but there are still some points that deviate markedly from the line.

par(mfrow=c(2,2))
for(i in 1:4) {samp1 <- rnorm(50); qqnorm(samp1); qqline(samp1)}

[Figure: four normal Q-Q plots (sample size 50) in a 2 x 2 grid, Sample Quantiles against Theoretical Quantiles, each with its qqline.]

For sample size 50 the fit is better. In three out of four plots the fit is very good. In plot 3 the minimum
value deviates from the rest.

par(mfrow=c(2,2))
for(i in 1:4) {samp1 <- rnorm(100); qqnorm(samp1); qqline(samp1)}

[Figure: four normal Q-Q plots (sample size 100) in a 2 x 2 grid, Sample Quantiles against Theoretical Quantiles, each with its qqline.]

Now the fit is very good in all cases. We see that, as the sample size grows, the fit improves.
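
One rough way to quantify this improvement is the correlation between theoretical and sample quantiles, averaged over many simulated samples of each size. The sketch below is only an illustration: the helper qq_cor, the 200 replicates, and the seed are arbitrary choices.

```r
set.seed(5)

# Correlation between theoretical and sample quantiles for one simulated sample.
qq_cor <- function(n) {
  qq <- qqnorm(rnorm(n), plot.it = FALSE)
  cor(qq$x, qq$y)
}

# Average correlation over 200 replicates for each sample size.
sizes   <- c(10, 20, 50, 100)
avg_cor <- sapply(sizes, function(n) mean(replicate(200, qq_cor(n))))
names(avg_cor) <- sizes
avg_cor
```

Values closer to 1 indicate points lying closer to a straight line.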

3. Draw samples of size 50 from normal distributions with means -6, -2, 2, and 6, all with variance 1 and
draw the corresponding quantile plots. To be able to compare the four graphs, find a suitable common
scale for the axes for all plots. Comment on the similarities and differences between the plots.
par(mfrow=c(2,2))
for (i in c(-6,-2,2,6)) {
dat <- rnorm(50,i);qqnorm(dat,ylim=c(-10,10));qqline(dat)}

[Figure: four normal Q-Q plots in a 2 x 2 grid, one per mean, drawn on a common y-axis scale, each with its qqline.]

In the plots we see that the slope of the lines remains constant, but the lines shift upwards as the mean
increases.
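
A short derivation, included here as a sketch, shows why only the height of the line changes. If X follows a normal distribution with mean \mu and standard deviation \sigma, and \Phi^{-1} denotes the standard normal quantile function, then the quantile function of X is

```latex
Q_X(p) = \mu + \sigma \, \Phi^{-1}(p), \qquad 0 < p < 1 .
```

Plotting Q_X(p) against \Phi^{-1}(p) therefore gives a straight line with intercept \mu and slope \sigma. With \sigma = 1 fixed, changing \mu moves the line up or down without changing its slope.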

par(mfrow=c(2,2))
for (i in c(-6,-2,2,6)) {
dat <- rnorm(50,i);qqnorm(dat,ylim=c(-10,10));qqline(dat)
abline(v=0,col='red'); abline(h=mean(dat),col = 'red')}

[Figure: the same four normal Q-Q plots, with a red vertical line at 0 and a red horizontal line at the sample mean.]

4. Draw samples of size 50 from normal distributions with mean 1 and standard deviations 0.5, 2, 4, and
6, and draw the corresponding quantile plots. To be able to compare the four graphs, find a suitable
common scale for the axes for all plots. Comment on the similarities and differences between the plots.
par(mfrow=c(2,2))
for (i in c(0.5,2,4,6)) {
dat <- rnorm(50,1,i);qqnorm(dat,ylim=c(-18,20));qqline(dat)}

[Figure: four normal Q-Q plots in a 2 x 2 grid, one per standard deviation, drawn on a common y-axis scale, each with its qqline.]

In these plots we see that the height of the central points remains constant, but the slope of the lines increases
as the variance increases.
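
The line drawn by qqline passes, by default, through the first and third quartiles of the sample. The sketch below computes its intercept and slope by hand for simulated samples with a common mean and increasing standard deviation. The helper qq_line is hypothetical, and a large sample size (5000) is used only to make the pattern stable.

```r
set.seed(6)

# Intercept and slope of the line qqline draws by default:
# it passes through the first and third quartiles.
qq_line <- function(y) {
  yq <- quantile(y, c(0.25, 0.75), names = FALSE)
  zq <- qnorm(c(0.25, 0.75))
  slope <- diff(yq) / diff(zq)
  c(intercept = yq[1] - slope * zq[1], slope = slope)
}

# Same mean, increasing standard deviation.
fits <- sapply(c(0.5, 2, 4, 6), function(s) qq_line(rnorm(5000, mean = 1, sd = s)))
colnames(fits) <- c('sd=0.5', 'sd=2', 'sd=4', 'sd=6')
round(fits, 2)
```

The intercepts stay close to the common mean while the slopes track the standard deviations, which matches the pattern seen in the plots.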

5. Draw two samples of size 10 from the standard normal distribution and compare them using qqplot.
Repeat a total of four times. Plot the four graphs on the same window. Comment on what you see.
par(mfrow=c(2,2))
for (i in 1:4) {
dat <- rnorm(20);qqplot(dat[1:10],dat[11:20], ylim=c(-2.5,2.5),
xlab = 'Sample 1', ylab = 'Sample 2')}
[Figure: four Q-Q plots of Sample 2 against Sample 1 in a 2 x 2 grid.]

We see that the first two plots do not show adequate alignment, and we would probably conclude that the
two samples come from different distributions. The last two plots show points that are reasonably aligned,
and we would conclude that in this case they come from a common distribution function.
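
Since all samples here are drawn from the same distribution, the poor alignment in some panels reflects sampling variability at size 10 rather than a real difference. To show what is being compared, the sketch below extracts the coordinates that qqplot uses for two samples of equal size, which are simply the two sorted samples paired by rank. The names s1, s2, and qq are illustrative.

```r
set.seed(7)
dat <- rnorm(20)
s1 <- dat[1:10]
s2 <- dat[11:20]

# Coordinates that qqplot would draw, without plotting.
qq <- qqplot(s1, s2, plot.it = FALSE)

# For equal sample sizes these are the two sorted samples, paired by rank.
cbind(sample1 = sort(s1), sample2 = sort(s2))
```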
