See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/365266143

Data Analysis with SPSS PPT


Presentation · November 2022

Thanavathi C.
Karapettai Nadar Girls Higher Secondary School
All content following this page was uploaded by Thanavathi C. on 10 November 2022.



Dr.C.THANAVATHI
LEARNING OBJECTIVES
1. Understand basic concepts of biostatistics and the computer software SPSS.
2. Select appropriate statistical tests for particular types of data.
3. Recognize and interpret the output from statistical analyses.
4. Report statistical output in a concise and appropriate manner.
BASIC TERMINOLOGY
Statistics, Biostatistics, Variable, Measurement
Scale, Data, Medical Data, type of data, Data
Analysis
VARIABLE, SCALE, DATA
A variable is a characteristic which varies, and a scale is a device on which observations are taken.
Data is a set of observations/measurements of a specific variable, taken from an experiment/survey or an external source using some appropriate measurement scale.
Statistics and Bio-statistics

Statistics is generally understood as the subject dealing with numbers and data. More broadly, it involves activities such as collection of data from a survey or experiment, summarization or management of data, presentation of results in a convincing format, analysis of data, and drawing valid inferences from findings. Bio-statistics is the science which helps us in managing medical data through the application of statistical methods, techniques and tools; that is, a collection of statistical procedures particularly well suited to the analysis of healthcare-related data.
What is medical data?
Data related to patient care, or numerical information regarding patients' clinical characteristics, mortality rate, survival rate, disease distribution, prevalence of disease, efficacy of treatment, and other such information, is called medical data.
NATURE OF DATA

Data is the value you get from observing (measuring, counting, assessing etc.) in an experiment or survey. Data is either categorical or metric. Categorical data is further divided into nominal and ordinal, whereas metric data is divided into discrete and continuous (quantitative) data.
Nominal data
The data is divided into classes or categories with no meaningful order. Examples: blood type, sex, causes of
disease, urban/rural, alive/dead, infected/not infected, hair color,
smoking status.
Ordinal data
The data is also divided into classes or categories, but these can be put in a meaningful
order. Examples: satisfaction level (very satisfied, satisfied, neutral,
unsatisfied, very unsatisfied); pain (mild, moderate, severe); socioeconomic
status (poor, middle, rich); grade of breast cancer; outcome (better, same, worse).
Discrete data
When data is taken from some counting process, for example number of
patients in different wards, number of nurses, number of hospitals in different
cities.
Continuous or quantitative data
When data is taken from some measuring process, for example, height,
weight, Temperature, uric acid, blood glucose and serum level.
Primary Scales of Measurement

Scale | Basic Characteristics | Common Examples | Marketing Examples | Permissible Statistics (Descriptive) | Permissible Statistics (Inferential)
Nominal | Numbers identify & classify objects | Social Security nos., numbering of football players | Brand nos., store types | Percentages, mode | Chi-square, binomial test
Ordinal | Nos. indicate the relative positions of objects but not the magnitude of differences between them | Quality rankings, rankings of teams in a tournament | Preference rankings, market position, social class | Percentile, median | Rank-order correlation, Friedman ANOVA
Interval | Differences between objects can be compared | Temperature (Fahrenheit) | Attitudes, opinions, index | Range, mean, standard deviation | Product-moment correlation
Ratio | Zero point is fixed, ratios of scale values can be compared | Length, weight | Age, sales, income, costs | Geometric mean, harmonic mean | Coefficient of variation
Nominal Scale
The numbers serve only as labels or tags for identifying and classifying objects.
When used for identification, there is a strict one-to-one correspondence between the numbers and the objects.
The numbers do not reflect the amount of the characteristic possessed by the objects.
The only permissible operation on the numbers in a nominal scale is counting.
Examples: Social Security numbers, hockey players' numbers. In marketing research: respondents, brands, attributes, stores and other objects.
ORDINAL SCALE
A ranking scale in which numbers are assigned
to objects to indicate the relative extent to
which the objects possess some characteristic.
Can determine whether an object has more or
less of a characteristic than some other object,
but not how much more or less. Any series of
numbers can be assigned that preserves the
ordered relationships between the objects. The
numbers therefore show the relative position of
objects, not the magnitude of difference between them. In addition
to the counting operation allowable for
nominal scale data, ordinal scales permit the
use of statistics based on percentile, quartile,
median. Ordinal scales possess description and order, but not
distance or origin.
INTERVAL SCALE
Numerically equal distances on the scale represent equal values in the characteristic being measured.
It permits comparison of the differences between objects. The difference between 1 & 2 is the same as between 2 & 3.
The location of the zero point is not fixed. Both the zero point and the units of measurement are arbitrary.
Examples: everyday temperature scale; attitudinal data obtained on rating scales.
Interval scales do not possess origin characteristics (zero and exact measurement).
RATIO SCALE
The highest scale: it allows us to identify objects, rank order objects, and compare intervals or differences.
It is also meaningful to compute ratios of scale values.
Possesses all the properties of the nominal, ordinal, and interval scales. It has an absolute zero point.
Height, weight, age, money. Sales, costs, market share and number of customers are variables measured on a ratio scale.
All statistical techniques can be applied to ratio data.
Statistical Variables:
Different classes of information are known as the
variables of a dataset, e.g:
• Age
• Weight
• Height
• Gender
• Marital status
• Annual income
• Variables which are experimentally
manipulated by an investigator are called
independent variables.
• Variables which are measured are called
dependent variables.
• All other factors which may affect the
dependent variable are called confounding,
extraneous or secondary variables - unless
these are the same for each group being tested
comparisons will be unreliable.
• Quantitative data measures either how much
or how many of something, i.e. a set of
observations where any single observation is a
number that represents an amount or a count.
• Qualitative data provide labels, or names, for
categories of like items, i.e. a set of
observations where any single observation is a
word or code that represents a class or
category.
Data can be divided by level of measurement. Qualitative data are nominal or ordinal; quantitative data are interval or ratio:
• Nominal variables: Variables with no inherent order or
ranking sequence, e.g. numbers used as names (group 1,
group 2...), gender, etc.
• Ordinal variables: Variables with an ordered series, e.g.
"greatly dislike, moderately dislike, indifferent, moderately
like, greatly like". Numbers assigned to such variables
indicate rank order only - the "distance" between the
numbers has no meaning
• Interval variables: Equally spaced variables, e.g.
temperature. The difference between a temperature of 66
degrees and 67 degrees is taken to be the same as the
difference between 76 degrees and 77 degrees. Interval
variables do not have a true zero, e.g. 88 degrees is not
necessarily double the temperature of 44 degrees.
• Ratio variables: Variables spaced at equal intervals with a true
zero point, e.g. age.
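The point about interval variables having no true zero can be checked with a little arithmetic. The sketch below is only an illustration (the function name is made up for this example): it converts the two Fahrenheit readings from the bullet above to Celsius and shows that the "double" ratio disappears, while equal differences stay equal.

```python
def fahrenheit_to_celsius(f):
    """Convert a Fahrenheit reading to Celsius."""
    return (f - 32) * 5 / 9

low_f, high_f = 44.0, 88.0
low_c = fahrenheit_to_celsius(low_f)
high_c = fahrenheit_to_celsius(high_f)

# The ratio depends on the arbitrary zero point, so it is not meaningful.
ratio_f = high_f / low_f   # 2.0 on the Fahrenheit scale
ratio_c = high_c / low_c   # roughly 4.67 on the Celsius scale

# Differences are meaningful: a one-degree step is the same size anywhere.
step_66_67 = fahrenheit_to_celsius(67) - fahrenheit_to_celsius(66)
step_76_77 = fahrenheit_to_celsius(77) - fahrenheit_to_celsius(76)

print(ratio_f, round(ratio_c, 2), round(step_66_67, 4), round(step_76_77, 4))
```

For a ratio variable such as age, the same check would give the same ratio in any unit (years or months), because zero means "none" in both.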
Kinds of data analysis
• Descriptive, (univariate & bivariate
correlational) & inferential (bivariate &
multivariate)
• The differences between descriptive and
inferential statistics have to do with the nature
of the problem that the researcher is trying to
solve.
Descriptive
• Descriptive statistics = Data are summarized in
terms of how they clump together (central
tendency) and how they vary (dispersion).
• Univariate descriptive: 1 variable
• Bivariate descriptive statistics (correlational
statistics) = Describe the direction &
magnitude of the relationship between 2
variables.
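As a rough illustration of univariate and bivariate description, the sketch below computes a mean, a standard deviation and a Pearson correlation using only the Python standard library. The height and weight values are invented for the example and do not come from any data file in this presentation.

```python
import statistics
from math import sqrt

# Hypothetical sample: height in inches, weight in pounds.
heights = [60, 62, 65, 68, 70]
weights = [110, 120, 140, 155, 170]

# Univariate description: central tendency and spread of one variable.
mean_height = statistics.mean(heights)
sd_height = statistics.stdev(heights)  # sample standard deviation

def pearson_r(x, y):
    """Direction and magnitude of the linear relationship between x and y."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

r = pearson_r(heights, weights)
print(mean_height, round(sd_height, 2), round(r, 3))
```

A positive r means the two variables rise together; the closer it is to 1 or -1, the stronger the linear relationship.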
Inferential statistics

• Test hypotheses using probability sampling in which the population parameter and sampling error can be accurately estimated.
– Inferential bivariate statistics = to test relationship between 2 variables
– Inferential multivariate statistics = to test relationships among 3 or more variables.
Parametric vs. nonparametric

• The difference between parametric and nonparametric statistics has to do with the kind of data available for analysis.
• Parametric: used when the data are at interval or ratio level, so that a population parameter can be estimated.
Most common are the t-test & ANOVA, which measure differences between group means.
• Nonparametric: used when the level of data is nominal or ordinal and the normality of the distribution cannot be assumed.
Most common are chi-square, which tests the association between 2 nominal variables, and Spearman's r, which measures the relationship between 2 ranked variables.

Parametric tests include:


• t-test
• ANOVA
• Regression
• Correlation
Nonparametric methods include:
• Chi-squared test
• Wilcoxon signed-rank test
• Mann-Whitney-Wilcoxon test
• Spearman rank correlation coefficient
Type of Data
Goal | Measurement (from Gaussian Population) | Rank, Score, or Measurement (from Non-Gaussian Population) | Binomial (Two Possible Outcomes) | Survival Time
Describe one group | Mean, SD | Median, interquartile range | Proportion | Kaplan Meier survival curve
Compare one group to a hypothetical value | One-sample t test | Wilcoxon test | Chi-square or Binomial test | -
Compare two unpaired groups | Unpaired t test | Mann-Whitney test | Fisher's test (chi-square for large samples) | Log-rank test or Mantel-Haenszel
Compare two paired groups | Paired t test | Wilcoxon test | McNemar's test | Conditional proportional hazards regression
Compare three or more unmatched groups | One-way ANOVA | Kruskal-Wallis test | Chi-square test | Cox proportional hazard regression
Compare three or more matched groups | Repeated-measures ANOVA | Friedman test | Cochran's Q | Conditional proportional hazards regression
Quantify association between two variables | Pearson correlation | Spearman correlation | Contingency coefficients | -
Predict value from another measured variable | Simple linear regression or Nonlinear regression | Nonparametric regression | Simple logistic regression | Cox proportional hazard regression
Predict value from several measured or binomial variables | Multiple linear regression or Multiple nonlinear regression | - | Multiple logistic regression | Cox proportional hazard regression
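One way to use the table above is as a lookup: state the goal and the type of data, and read off the test. The sketch below encodes a few rows only; the dictionary name, the short keys "gaussian" and "rank", and the function name are assumptions made for this illustration.

```python
# (goal, type of data) -> test, copied from a few rows of the table.
# "gaussian" = measurement from a Gaussian population,
# "rank"     = rank, score, or measurement from a non-Gaussian population.
TEST_TABLE = {
    ("compare two unpaired groups", "gaussian"): "Unpaired t test",
    ("compare two unpaired groups", "rank"): "Mann-Whitney test",
    ("compare two paired groups", "gaussian"): "Paired t test",
    ("compare two paired groups", "rank"): "Wilcoxon test",
    ("compare three or more unmatched groups", "gaussian"): "One-way ANOVA",
    ("compare three or more unmatched groups", "rank"): "Kruskal-Wallis test",
    ("quantify association between two variables", "gaussian"): "Pearson correlation",
    ("quantify association between two variables", "rank"): "Spearman correlation",
}

def choose_test(goal, data_type):
    """Return the test listed for this goal and data type, if the sketch covers it."""
    key = (goal.strip().lower(), data_type.strip().lower())
    return TEST_TABLE.get(key, "not covered by this sketch")

print(choose_test("Compare two unpaired groups", "rank"))
```

The lookup only mirrors the table; checking whether the data really meet a test's assumptions is still the analyst's job.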
Data Analysis
After successfully collecting accurate and reliable data from the source by using the appropriate method, the next step is to extract the pertinent and useful information buried in the data for further manipulation and interpretation.
The process of performing certain calculations and evaluation in order to extract relevant information from data is called data analysis.
Cont……
Data analysis may take several steps to reach certain conclusions. Simple data can be organized very easily, while complex data requires proper processing. The word “processing” means recasting and dealing with the data to make it ready for analysis.
Steps in data analysis
• Questionnaire checking/Data preparation
• Coding
• Cleaning data
• Applying most appropriate tools for analysis
QUESTIONNAIRE CHECKING
A questionnaire returned from the field may be unacceptable for several reasons.
Parts of the questionnaire may be incomplete.
The pattern of responses may indicate that the respondent did not understand or follow the instructions.
The responses show little variance.
One or more pages are missing.
The questionnaire is received after the pre-established cutoff date.
The questionnaire is answered by someone who does not qualify for participation.
DATA PREPARATION
Preparation of data file
It is important to convert raw data into usable data for analysis (coding where it is needed): simply transfer the information from the questionnaire to a computer database.
The analysis and results will surely depend on the quality of the data.
There are possibilities of errors in handling instruments, raw data, transcribing, data entry, and assigning codes, values and value labels.
Data need to be cleaned to fulfill the analysis conditions.
CODING
Coding means assigning a code, usually a number, to each possible response to each question.
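Coding can be pictured as a lookup from each possible answer to a number. The sketch below uses the codes of two variables from the "bp" data set described later in this presentation (sex: 0 = female, 1 = male; smoke: 0 = nonsmoker, 1 = smoker); the dictionary layout and function name are assumptions for illustration.

```python
# Codebook: one entry per question, mapping each possible response to a code.
CODEBOOK = {
    "sex": {"female": 0, "male": 1},
    "smoke": {"nonsmoker": 0, "smoker": 1},
}

def code_response(variable, answer):
    """Return the numeric code for an answer, or None if it is not in the codebook."""
    return CODEBOOK[variable].get(answer.strip().lower())

# One questionnaire, as written by the respondent.
raw = {"sex": "Female", "smoke": "smoker"}
coded = {variable: code_response(variable, answer) for variable, answer in raw.items()}
print(coded)
```

Returning None for an unexpected answer keeps such responses visible for the cleaning step instead of silently assigning them a code.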
Data cleaning
One of the first steps in analyzing data is to “clean” it of any obvious data entry errors:
•Outliers? (really high or low numbers)
Example: Age = 110 (really 10 or 11?)
•Value entered that doesn’t exist for variable?
Example: 2 entered where 1=male, 0=female
•Missing values?
Did the person not give an answer? Was the answer accidentally not entered into the database?
Cont……
•May be able to set defined limits when entering data
Prevents entering a 2 when only 1, 0, or missing are acceptable values
•Univariate data analysis is a useful way to check the quality of the data
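The three checks above (outliers, codes that do not exist, missing values) can be sketched as a small screening routine. The records, the plausible age range of 0 to 100 and the function name below are all assumptions made for this illustration.

```python
# Hypothetical records; None stands for a value that was never entered.
records = [
    {"id": 1, "age": 34, "sex": 1},
    {"id": 2, "age": 110, "sex": 0},   # suspicious age
    {"id": 3, "age": 27, "sex": 2},    # 2 is not a defined code for sex
    {"id": 4, "age": None, "sex": 1},  # missing age
]

def find_problems(rows, age_range=(0, 100), valid_sex=(0, 1)):
    """List (id, variable, problem) for every value that needs a second look."""
    problems = []
    for row in rows:
        age, sex = row["age"], row["sex"]
        if age is None:
            problems.append((row["id"], "age", "missing"))
        elif not age_range[0] <= age <= age_range[1]:
            problems.append((row["id"], "age", "out of range"))
        if sex is None:
            problems.append((row["id"], "sex", "missing"))
        elif sex not in valid_sex:
            problems.append((row["id"], "sex", "invalid code"))
    return problems

for problem in find_problems(records):
    print(problem)
```

The routine only flags values; deciding whether Age = 110 is really 10 or 11 still requires going back to the questionnaire.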
SPSS
SPSS is a statistical package for data analysis. It is very popular software in the social & medical sciences because of its friendly usage.
Launching SPSS

Before starting this session, you should know how to run a program in the Windows operating system. Click the Start button at the lower left of your screen and, among the programs listed, select SPSS 16.0 to launch the program.
When SPSS starts, an opening window appears: click Cancel if you want to enter data in a new file, or click OK to open an existing file. A window known as the Data Editor will then open in Variable View.
SPSS WINDOWS
There are a number of different types of windows in SPSS. The window in which you are currently working is called
the active window. Some of the frequently used windows are:
Data Editor Window: It displays the contents of the data file. This is the window that opens
automatically when you start an SPSS session. In this window, you can create new data files or modify existing ones.
When you open more than one data file, each data file has a separate Data Editor Window. The Data Editor Window
provides two views of the data:

Data View: It displays the data values. Each variable is a column. Each row is a case.

Variable View: It displays a table consisting of variable names and their attributes. You can modify the properties of
each variable or add new variables or delete existing variables in the Variable View Window.

[Figure: Data View window and Variable View window]

Viewer Window: It displays statistical results, tables, and charts. This window opens automatically the first time you
run a procedure that generates output
MORE ABOUT WINDOWS
PULL-DOWN MENUS
Many tasks in SPSS are performed by selecting appropriate "pull-down" menus. Each window in SPSS has its own
menu bar with appropriate menu selections and toolbars. The Analyze and Graphs menus are available in all
windows. Here are some Data Editor Window menus and their uses:
File Menu: From the file menu you can open several different existing files or a database file such as
an excel file or read in a text file. You can also save any changes to the current file.

Edit Menu: from the Edit menu, you can cut, copy, paste, insert variables, insert cases, or use find in
the Data Editor window.

Data Menu: The data menu allows you to define variable properties, sort cases, merge files, split files,
select cases and use a variable to weight cases.

Transform Menu: The transform menu is where you will find the options to do some computations on
variables, to create new variables from existing ones or recode old variables.

Analyze Menu: The analyze menu is where all statistical analysis takes place. From descriptive statistics to
regression analysis to nonparametric tests
Graphs Menu: The graph menu is where you can create high resolution plots and graphs to be edited in
the chart editor window or you can create interactive graphs.

Utilities Menu: The utilities menu is used to display information on the contents of SPSS data files or to
run scripts.

Add-Ons Menu: From the add-ons menu you can run other packages like conjoint, classification trees, or
Neural Networks. Also there are programmability extensions that allow you to integrate programs like R
and Python into SPSS. But you should keep in mind that if you want to run any of the add-ons listed here
you will have to purchase them separately.

Window: From the window menu you can change the active window. The window with a check mark is the
active one. In this case it is the data editor window.

Help: The help menu allows you to get help on topics in SPSS or to ask the statistics coach some basic
questions.

TOOLBARS
Each window in SPSS has its own toolbars that provide access to common tasks. Some windows have
more than one. When you put the mouse pointer on a tool, there is a brief description of what the tool
does. You can show, move or hide a toolbar.
STATUS BARS
The status bar is at the bottom of each SPSS window and provides the following information:
Command Status: gives information about a procedure that is running.
Filter Status: Filter On shows when a subset of cases in the data is used for analysis.
Weight Status: Weight On indicates that a weight variable is being used in the analysis.
Split File Status: Split File On indicates that the file has been split into separate groups for analysis.
DIALOG BOXES
Many menu selections will open dialog boxes. In these dialog boxes, you select variables and options for analysis. The main dialog box in any statistical procedure has the following parts:
Source variable list: A list of variables (of the types allowed by the procedure) from the working data file.
Target variable lists: One or more lists of variables needed for the analysis.
Command push buttons: Buttons that run the procedure or open a subdialog box to make additional specifications. Some of the push buttons are:
OK: Click this button to run the procedure.
Paste: Click this button to generate command syntax from your selections. The command syntax is pasted into a syntax window, where it can be modified for future analysis. The resulting code is commonly known as an SPSS program.
Reset: Deselects any selections, and resets all specifications in the dialog box and any subdialog boxes to the default status.
Cancel: Cancels any change in the dialog box settings since the last time it was opened, and closes the dialog box.
Help: Provides help about the current dialog box.
Name
The name of each SPSS variable in a given file must be unique; it must start with
a letter; it may have up to 8 characters (including letters, numbers, and the
underscore _). Note that certain key words are reserved and may not be used as
variable names, e.g., "compute", "sum", and so forth. To change an existing
name, click in the cell containing the name, highlight the part you want to
change, and type in the replacement. To create a new variable name, click in the
first empty row under the name column and type a new (unique) variable name.

Notice that we can use "cat_dog" but not "cat-dog" and not "cat dog". The hyphen
gets interpreted as subtraction (cat minus dog) by SPSS, and the space confuses
SPSS as to how many variables are being named.
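The naming rules stated above can be written as a small checker. This is a sketch of the rules exactly as given on this slide (unique, starts with a letter, at most 8 characters, only letters, digits and underscore, not a reserved word); the reserved list holds just the two examples quoted here, the function name is made up, and other SPSS releases may apply different limits.

```python
import re

# Only the two reserved words quoted on the slide; not a complete list.
RESERVED = {"compute", "sum"}

# A letter first, then up to 7 more letters, digits or underscores (8 in total).
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,7}")

def is_valid_name(name, existing=()):
    """Check a proposed variable name against the rules listed above."""
    if not NAME_PATTERN.fullmatch(name):
        return False
    if name.lower() in RESERVED:
        return False
    # Names must be unique within the file (compared without regard to case).
    return name.lower() not in {e.lower() for e in existing}

for candidate in ["cat_dog", "cat-dog", "cat dog", "compute", "systolic_bp"]:
    print(candidate, is_valid_name(candidate))
```

Here "cat_dog" passes, while "cat-dog" and "cat dog" fail on the character rule, "compute" on the reserved-word rule and "systolic_bp" on length.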
TYPE
THE TWO BASIC TYPES OF VARIABLES THAT YOU WILL USE ARE NUMERIC AND STRING. NUMERIC VARIABLES MAY ONLY HAVE NUMBERS ASSIGNED. STRING VARIABLES MAY CONTAIN LETTERS OR NUMBERS, BUT EVEN IF A STRING VARIABLE HAPPENS TO CONTAIN ONLY NUMBERS, NUMERIC OPERATIONS ON THAT VARIABLE WILL NOT BE ALLOWED (E.G., FINDING THE MEAN, VARIANCE, STANDARD DEVIATION, ETC...). TO CHANGE A VARIABLE TYPE, CLICK IN THAT CELL ON THE GREY BOX WITH ...
Decimals
The decimal of a variable is the number of decimal places that SPSS will display. If more decimals have
been entered (or computed by SPSS), the additional information will be retained internally but not
displayed on screen. For whole numbers, you would reduce the number of decimals to zero. You can
change the number of decimal places by clicking in the decimals cell for the desired variable and
typing a new number or you can use the arrow keys at the edge of the cell

Label
The label of a variable is a string of text to identify in more detail what a variable represents.
Unlike the name, the label is limited to 255 characters and may contain spaces and
punctuation. For instance, if there is a variable for each question on a questionnaire, you would
type the question as the variable label. To change or edit a variable label, simply click anywhere
within the cell
Values
Although the variable label goes a long way to explaining what the variable represents, for categorical
data (discrete data of both nominal and ordinal levels of measurement), we often need to know which
numbers represent which categories. To indicate how these numbers are assigned, one can add labels to
specific values by clicking on the ... box in the values cell

Clicking here opens up the Value Labels dialogue box.

To assign the value 1.0 to cats and 2.0 to dogs, type 1.0 in the Value box and cats in the Value Label box, then click the Add button (and likewise for dogs);
the following box will appear.
Clicking on this box will bring up the variable type menu:

If you select a numeric variable, you can then click in the width box or
the decimal box to change the default values of 8 characters reserved
to displaying numbers with 2 decimal places. For whole numbers, you
can drop the decimals down to 0.
If you select a string variable, you can tell SPSS how much "room" to
leave in memory for each value, indicating the number of characters
to be allowed for data entry in this string variable.
When you are satisfied with the definitions of each value, click on the OK button
The real beauty of value labels can be seen in the Data View by clicking on the "toe
tag" icon in the toolbar, which switches between the numeric values
and their labels.
A view of different variables with their descriptions
Missing
When you click the button in the Missing cell, SPSS will display the Missing Values dialog box.

We sometimes want to signal to SPSS that data should be treated as missing, even though there is some
other numerical code recorded instead of the data actually being missing (in which case SPSS displays a
single period; this is also called SYSTEM MISSING data). In this example, after clicking on the ... button in
the Missing cell, I declared "9", "99", and "999" all to be treated by SPSS as missing (i.e., these values will be
ignored).
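The effect of declaring codes as missing can be sketched as follows: values equal to 9, 99 or 999 (the codes from the example above) are dropped before a statistic is computed, just like values that were never entered. The variable names, the function name and the sample values are assumptions for illustration.

```python
import statistics

# Codes declared as missing in the example above.
MISSING_CODES = {9, 99, 999}

def valid_values(values, missing_codes=MISSING_CODES):
    """Keep only real observations: drop blanks (None) and declared missing codes."""
    return [v for v in values if v is not None and v not in missing_codes]

# Hypothetical answers on a 1-5 rating item; 99 and 999 mean "no answer".
raw = [3, 5, 99, 4, None, 999, 2]
clean = valid_values(raw)

print(clean, statistics.mean(clean))
# Without the declaration the mean would be badly distorted by the codes.
print(statistics.mean(v for v in raw if v is not None))
```

Missing codes should be chosen so that they can never be a legitimate value of the variable.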
Columns
The columns property tells SPSS how wide the column should be for each variable. Don't confuse this one
with width, which indicates how many digits of the number will be displayed. The column size indicates how
much space is allocated rather than the degree to which it is filled.
Align
The alignment property indicates whether the information in the Data View should be left-justified, right-justified, or centered.
Measure
The Measure property indicates the level of measurement. Since SPSS does not differentiate between
interval and ratio levels of measurement, both of these quantitative variable types are lumped together
as "scale". Nominal and ordinal levels of measurement, however, are differentiated
ENTERING DATA SET INTO SPSS
Example
Suppose we have a data set with different variables that we need to enter in SPSS. Below are the set of variables and the data set; this file is named “bp”.
Data Set: Professor Christopher conducted a study on subjects; the variable descriptions are as follows.
Variable | Description
Sjcode | Subject code
Sex | Subject sex (0 = female, 1 = male)
Age | Subject age
Height | Height in inches
Weight | Weight in pounds
Race | Subject race (1 = Amer, 2 = Asian, 3 = black, 4 = Hispanic, 5 = white, 9 = none of above)
Med | Taking prescription medication (0 = No, 1 = Yes)
Smoke | Does subject smoke? (0 = Nonsmoker, 1 = Smoker)
SBPCP | Systolic blood pressure with cold pressor
DBPCP | Diastolic blood pressure with cold pressor
HRCP | Heart rate with cold pressor
SBPMA | Systolic blood pressure while doing mental arithmetic
DBPMA | Diastolic blood pressure while doing mental arithmetic
HRMA | Heart rate while doing mental arithmetic
SBPREST | Systolic blood pressure at rest
DBPREST | Diastolic blood pressure at rest
PH | Parental hypertension (0 = No, 1 = Yes)
MEDPH | Parent(s) on EH meds (0 = No, 1 = Yes)
Entering data into data editor
In this lesson our only goal is to learn how to enter, save, and edit data (the data sheet given above). The first step in
entering the data into the Data Editor is to define all the variables. Creating a variable requires us to name it,
specify the type of data (nominal, ordinal, scale) and assign labels to the variable and its data values if needed.
•Move the cursor to the bottom of the Data Editor and click the tab named Variable View; a different grid appears.

•Move the cursor into the first empty cell in row 1 (under Name), type sjcode, then press Enter.
•When the cursor moves to the Type column, a small grey button marked with three dots
will appear. Click on it to see the Variable Type dialog box; Numeric is the default variable type, so click OK.
Note that the Measure column (far right column) should be set to Scale, because you chose Numeric as the variable
type. In SPSS, each variable can carry a descriptive label to help identify its meaning. To add a label, the
procedure is:
•Move the cursor into the label column and type Subject Code.

This completes the definition of the first variable.

•Now let's create a variable to represent sex: move to the first column of row 2 and name the variable
sex.
•Sex is a categorical (qualitative) variable, and we are going to represent it numerically for
data analysis purposes (numeric codes are easier to analyze in SPSS). Since Numeric is the
default in the Type column, we skip it and go to Width, setting the width as per our requirement; in the
Decimals column reduce the value from 2 to 0.
•Label this variable as Subject sex.
•Now we can assign text labels to our coded values (as discussed previously). In the Values column
click the grey box with three dots. A box will open as below.
Type “0” in the Value box and type Female in the Value Label box.
Then click Add.

Now type 1 in Value and Male in Label, click Add,

and then click OK. In a similar way we add all the variables; the Variable View window will be seen as below.
Now switch to Data View by clicking the appropriate tab at the lower left of the screen.

Move the cursor to the first cell below sjcode, type 3, and then press Enter.
In the next cell type 4. When you have completed the subject codes, move to the top cell
under sex, type “0” for female and “1” for male, and go on. When you are all done,
the Data Editor should look as below.

On clicking the third button (named Value Labels) at the far left, you will see the screen as below.
Saving the data file
It is wise to save all your work in a disk file. To save a file, click on the File menu, choose Save As…, then next to File name type BP, then click Save.
Editing the data file/value

To edit any value, just open the data file, click the Edit menu, and
select the case or variable which is required for editing.

Quitting SPSS
When you have completed your work, it is important to exit the program properly. Go
to the File menu, then click on Exit. Generally you will see a message asking if you wish to
save changes. Since we saved everything earlier, click No.
File management
Here we discuss issues like transform, select, split, compute new variables, re-coding of data, merging files, transpose, sorting, and weighted cases.
Sorting data
This tool allows you to rearrange the data.
Open the file, go to the Data menu, choose Sort Cases, select the variable, then click OK.
Replacing missing values
If some values are missing in the data/variables, they can be replaced by different methods. If the variable is categorical, the value is replaced by the researcher on the basis of his/her personal experience; if the variable is continuous, SPSS will help through the Replace Missing Values command.
Open the file, and investigate any missing values using the Sort command.
Cont………
Then go to the Transform menu and use the Replace Missing Values option.
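For a continuous variable, one common replacement method is to substitute the mean of the observed values (the "series mean"). The sketch below illustrates that single method on invented blood pressure readings; it is an assumption that this is the method wanted here, and the function name is made up.

```python
def replace_missing_with_mean(values):
    """Return a copy of the series with every missing value (None) set to the series mean."""
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)  # nothing to base a replacement on
    series_mean = sum(observed) / len(observed)
    return [series_mean if v is None else v for v in values]

# Hypothetical systolic readings with one value missing.
sbp = [120, None, 130, 140]
print(replace_missing_with_mean(sbp))
```

Mean replacement keeps the sample size but shrinks the variability of the variable, so it should be reported whenever it is used.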
Creating Variables
Sometimes a new variable is needed on the basis of a current/existing variable or set of variables. The procedure is as follows:
Menu, then Transform, then Compute Variable…
Insert the target variable and write the desired operation in the numeric expression, like square, log etc.
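Activity
Open the file “student”, convert weight into kg and height into metres, then find the BMI of the students. Use 1 kg = 2.20462 lb and 1 m = 39.3701 inch, and find
$\text{BMI} = \dfrac{\text{weight (kg)}}{\text{height (m)}^2}$
Compare this BMI with
$\text{BMI} = \dfrac{\text{weight (lb)}}{\text{height (inch)}^2} \times 703$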
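The two BMI formulas in this activity can be compared numerically. The sketch below uses the conversion factors given above; the function names and the example person (150 lb, 65 inches) are invented for illustration.

```python
LB_PER_KG = 2.20462   # 1 kg = 2.20462 lb
IN_PER_M = 39.3701    # 1 m = 39.3701 inch

def bmi_metric(weight_lb, height_in):
    """Convert to kg and metres first, then BMI = kg / m^2."""
    weight_kg = weight_lb / LB_PER_KG
    height_m = height_in / IN_PER_M
    return weight_kg / height_m ** 2

def bmi_imperial(weight_lb, height_in):
    """Shortcut formula: BMI = lb / inch^2 x 703."""
    return weight_lb / height_in ** 2 * 703

a = bmi_metric(150, 65)
b = bmi_imperial(150, 65)
print(round(a, 2), round(b, 2), round(abs(a - b), 4))
```

The two results agree to about two decimal places because 703 is a rounded form of 39.3701² / 2.20462.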
Re-coding
If the researcher wants to re-code the data, or wants to make numerical data into groups, then we use the re-code tool. Open the data file. From the menus choose:
Transform | Recode | Into Different Variables...
The following Recode into Different Variables dialog box appears.
Select the variable you want to recode. For this example select AAA, and click the right arrow button (►) to move the variable into the Input Variable > Output Variable box; the following sign appears in this box: AAA > ?
In the Output Variable group, enter an output variable name (e.g. AA1) in the Name box; you may label it as Stillbirth Rate Category [optional] for the new variable, and click Change.
Up to now, the dialog box looks as under:
Click the Old and New Values... button; the following dialog box appears, in which you specify how to recode values.

In the Old Value group, select the 5th choice, then put 24 in the Lowest through box. In the Value box under the New Value group input 1.
Click Add. Similarly, for a closed class interval like 25-29, select the 4th choice in the Old Value group, then put 25 through 29, and in the Value box under New Value input 2, then click Add. Repeat this process (selecting the 4th choice in each case) until you have input 5 in the New Value box. Now for the highest open class, select the 6th choice in the Old Value group, then put 45 in the through highest box. In the Value box under the New Value group input 6, then click Add.
The final shape looks as under.
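The recoding rules entered in the dialog can be pictured as a list of ranges. In the sketch below, the first, second and last rules come from the steps above (lowest through 24 becomes 1, 25 through 29 becomes 2, 45 through highest becomes 6); the three middle classes are an assumption, filled in with the same width of 5. The names are made up for illustration.

```python
# (low, high, new value); None means "open ended".
RULES = [
    (None, 24, 1),   # lowest through 24
    (25, 29, 2),
    (30, 34, 3),     # assumed: not spelled out on the slide
    (35, 39, 4),     # assumed
    (40, 44, 5),     # assumed
    (45, None, 6),   # 45 through highest
]

def recode(value, rules=RULES):
    """Return the new category for a value, or None if no rule covers it."""
    for low, high, new_value in rules:
        if (low is None or value >= low) and (high is None or value <= high):
            return new_value
    return None

aaa = [12, 24, 25, 29, 32, 44, 45, 80]
aa1 = [recode(v) for v in aaa]
print(aa1)
```

A value that falls between two rules, such as 24.5, is left without a category here, which is a reminder to check that the classes cover every value in the data.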

Click Continue and then OK. The XYZ - SPSS Data Editor, containing two variables, viz. AAA and AA1, looks as under, one in Variable View and the other in Data View.
Specify Value Labels
Make the Data Editor the active window.
If the Data View is displayed, double-click the variable name at the top of the column in
the Data View or click the Variable View tab. Click the button in the Values cell for the
variable that you want to define. For each value, enter the value and a label (the one
as seen below). Click Add to enter the value label; at last click OK.
Activity
For the above activity make groupings of BMI as
Underweight < 18.5
Normal 18.5 - 22.9
Overweight > 22.9
Also make an output of the groups.
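The grouping asked for in this activity can be sketched as a small function followed by a count of each group. The cut-offs are the ones listed above; the BMI values and names are invented for illustration.

```python
from collections import Counter

def bmi_group(bmi):
    """Classify a BMI value using the cut-offs given in the activity."""
    if bmi < 18.5:
        return "Underweight"
    if bmi <= 22.9:
        return "Normal"
    return "Overweight"

# Hypothetical BMI values for six students.
bmis = [17.2, 18.5, 21.0, 22.9, 23.4, 27.8]
groups = [bmi_group(b) for b in bmis]

# "Output of groups": how many students fall in each category.
print(Counter(groups))
```

In SPSS the same result comes from recoding BMI into a new variable and then requesting its frequency table.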
Select cases
This tool is used to analyse data for a specific group or sub-group, like the mean of respondents whose weight is above 85 Kg.
Open file, select Data at the menu bar, select Select Cases, click on If and write your condition for selection; for example, select male in the BP file as gender=1
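As a rough illustration of what a selection condition such as gender=1 does, the sketch below filters a small list of records in Python. The records, the field names, and the coding 1 = male are assumptions made for the example only.

```python
# Made-up respondents; gender is assumed to be coded 1 = male, 2 = female.
respondents = [
    {"id": 1, "gender": 1, "weight": 90},
    {"id": 2, "gender": 2, "weight": 70},
    {"id": 3, "gender": 1, "weight": 80},
    {"id": 4, "gender": 2, "weight": 88},
]

# Equivalent of the condition gender=1: keep only male cases.
males = [r for r in respondents if r["gender"] == 1]

# Equivalent of a condition on weight: respondents above 85 Kg.
heavy = [r for r in respondents if r["weight"] > 85]
mean_weight_heavy = sum(r["weight"] for r in heavy) / len(heavy)

print([r["id"] for r in males], mean_weight_heavy)
```

Any later analysis then runs on the filtered list only, which mirrors how SPSS ignores the unselected cases until the filter is switched off.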
Activity
Select male cases in “bp” file, also female cases whose age is more than 50 years
Merging file
Two files may be merged either by variables or by cases. Suppose we have 1000 respondents, each with six variables. If two data entry operators are completing this task, they can do it in two ways:
(1) divide the cases between them
(2) divide the number of variables between them
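The two ways of merging can be pictured with a small Python sketch. The records, the id key, and the variable names are hypothetical; the point is only the difference between adding cases (stacking rows) and adding variables (matching on a key).

```python
# Way 1: each operator entered different cases with the same variables.
operator_a = [{"id": 1, "age": 30, "sex": 1}, {"id": 2, "age": 41, "sex": 2}]
operator_b = [{"id": 3, "age": 55, "sex": 1}]
merged_by_cases = operator_a + operator_b          # stack the rows

# Way 2: each operator entered different variables for the same cases.
first_half = {1: {"age": 30, "sex": 1}, 2: {"age": 41, "sex": 2}}
second_half = {1: {"weight": 70, "height": 170}, 2: {"weight": 64, "height": 158}}
merged_by_variables = {
    case_id: {**first_half[case_id], **second_half[case_id]}
    for case_id in first_half
    if case_id in second_half                      # match on the case id
}

print(len(merged_by_cases), merged_by_variables[1])
```

When merging by variables, both files need a common key and the same set of cases; otherwise some respondents end up with missing values.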
Split file
A file can be split into two or three categories; go to menu Data, then select Split File and then perform the operation
Data analysis
BASIC STRATEGY
The following strategy is adopted to analyze the data
• Description, counting, proportion
• Prediction, relationship, association
• Comparing, estimation (95% confidence interval)
DATA ANALYSIS MAY BE DESCRIPTIVE OR INFERENTIAL
DESCRIPTIVE CONTAINS MEAN, MEDIAN, MODE, SD, REGRESSION, CORRELATION.
ON THE OTHER HAND CONFIDENCE INTERVAL, TESTING OF HYPOTHESIS, P-VALUE, ANOVA RELATE TO INFERENTIAL
UNI-VARIATE DESCRIPTIVE ANALYSIS
Graphical Method
For nominal & ordinal data we use Bar or pie chart
For continuous data we use histogram
Numerical method
For nominal & ordinal data we use Frequency/proportions
For continuous data we use Mean, Standard deviation
Summary Guide
                   Scale                 Nominal                    Ordinal
Displaying data    Histogram, Box-plot   Bar chart, Pie chart       Bar chart, Pie chart
Summarizing data   Mean, Median, SD      Frequency table,           Frequency table,
                                         Percentages, Proportion    Percentages, Proportion
GRAPHS FOR CATEGORICAL DATA
MAKING BAR/PIE CHART
Open the file, then from the pull-down menu click on Graphs → Legacy Dialogs, then click Bar/Pie chart, select variable, then click OK
DATA SUMMARY
Open the file, then from the pull-down menu click on Analyze → Descriptive Statistics → Frequencies, select variable
Click OK, output window will appear
GRAPH FOR CONTINUOUS DATA
MAKING HISTOGRAM
Open the file, then from the pull-down menu click on Graphs → Legacy Dialogs, then click Histogram, select variable, click OK
DATA SUMMARY
Open the file, then from the pull-down menu click on Analyze → Descriptive Statistics → Descriptives, select variable
Click OK, output window will appear
FOR ALL DESCRIPTIVE STATISTICS AND 95% CONFIDENCE INTERVAL
Open the file, then from the pull-down menu click on Analyze → Descriptive Statistics → Explore, select variable
Click OK, output window will appear
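To see what lies behind the mean, standard deviation and 95% confidence interval in the output, here is a small hand computation in Python. The five weights are made up, and the critical value 2.776 is the two-sided 5% point of the t distribution with 4 degrees of freedom taken from a standard t-table; for another sample size a different table value is needed.

```python
import math
import statistics

weights = [60, 65, 70, 75, 80]          # made-up sample of n = 5 weights (Kg)

n = len(weights)
mean = statistics.mean(weights)
sd = statistics.stdev(weights)          # sample SD (divides by n - 1)
se = sd / math.sqrt(n)                  # standard error of the mean

T_CRIT_DF4 = 2.776                      # t-table value for 95% CI, df = n - 1 = 4
lower = mean - T_CRIT_DF4 * se
upper = mean + T_CRIT_DF4 * se

print(f"mean={mean:.2f} sd={sd:.2f} 95% CI=({lower:.2f}, {upper:.2f})")
```

The interval is read as a range of plausible values for the population mean, not as a range that contains 95% of the individual observations.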
Summary Guide for appropriate analysis for two variables
Type of variables          Graphical display   Relationship
Categorical-categorical    Multiple bar        Contingency table
Categorical-Scale          Box-plot            Descriptive statistics for each group
Scale-scale                Scatter plot        Correlation
GRAPH FOR CATEGORICAL DATA
MULTIPLE BAR CHART
Open the file, then from the pull-down menu click on Graphs → Legacy Dialogs, then click Bar chart, select one variable to category axis and one to cluster, then click OK
CONTINGENCY TABLE
Open the file, then from the pull-down menu click on Analyze → Descriptive Statistics → Crosstabs, select variables, one to row and one to column.
Click Cells and click on Total for cell proportion; for chi-square click on Statistics.
Click OK, output window will appear
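The chi-square statistic reported with a contingency table can be reproduced by hand. The sketch below is a plain Pearson chi-square in Python on a made-up 2x2 table; it does not apply any continuity correction, and the counts are assumptions for illustration.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n   # count expected under independence
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Rows: male, female; columns: smoker, non-smoker (made-up counts).
table = [[20, 30],
         [30, 20]]
stat, df = chi_square(table)
print(stat, df)   # compare with 3.841, the 5% critical value for df = 1
```

A statistic above the tabulated critical value suggests that the row and column variables are not independent.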
GRAPH FOR CONTINUOUS DATA
SCATTER PLOT
Open file, on the pull-down menu click on Graphs → Legacy Dialogs → Scatter plot, enter variables to x-axis and y-axis, then click OK
CORRELATION COEFFICIENT
Open the file, then from the pull-down menu click on Analyze → Correlate, select variables, click OK; output window will appear
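For readers who want to see what the correlation coefficient measures, the following Python sketch computes Pearson's r from its definition. The two short lists are made-up data, not values from any of the course files.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]          # made-up values
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 4))
```

Values near +1 or -1 indicate a strong linear relationship; values near 0 indicate little or no linear relationship.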
SUMMARY ONE CATEGORICAL ONE CONTINUOUS VARIABLE
When we have one categorical and one continuous variable, then for descriptive analysis we will use the Explore command and for the graph we use the Box-plot; suppose we have gender and weight of respondents
DESCRIPTIVE STATISTICS
Open file, go to Analyze, then select Descriptive Statistics, then Explore; a window will open. Select the continuous variable and paste it to the Dependent List and the categorical variable to the Factor List, then click OK
BOX PLOT
Open file, click on Graphs, then click Legacy Dialogs, then Boxplot, then click Simple, then Define; now put the continuous variable to Variable and the categorical variable (sex, SES) to Category Axis and click OK
REGRESSION ANALYSIS
Prediction of one variable on the basis of another variable or set of variables (be sure all variables are continuous), for example prediction of BP when the age of a person is 55 years. The mathematical equation is as
$Y = a + bX$, with $Y$ = BP and $X$ = Age
Where a and b are coefficients of the equation
CONT…..
Open the file → Analyze → Regression → Linear, put dependent variable and independent variable in respected box → OK
REGRESSION LINE
$Y(\text{BP}) = 129.61 + 0.075\,(\text{Age})$
This is the regression line using results of previous slide.
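How the coefficients a and b of a simple regression line are obtained, and how the line is then used for prediction, can be sketched in a few lines of Python. The ages and BP readings below are made up and will not reproduce the coefficients of the course data file.

```python
def fit_line(x, y):
    """Least-squares intercept a and slope b for y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

age = [30, 40, 50, 60, 70]            # made-up ages (years)
bp = [120, 124, 130, 134, 142]        # made-up BP readings

a, b = fit_line(age, bp)
predicted_bp_at_55 = a + b * 55       # prediction for a person aged 55
print(round(a, 2), round(b, 2), round(predicted_bp_at_55, 2))
```

The slope b is read as the average change in BP per additional year of age; predictions are only reasonable inside the age range covered by the data.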
MEASURE OF RISK
When we have exposure and outcome (2x2), the Odds Ratio (OR) is the measure in the cross-tab command; when we open cross-tab, click on Statistics, then click on Risk and Continue
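The odds ratio for a 2x2 exposure by outcome table is a short calculation, shown here in Python together with the relative risk for comparison. The cell counts are made up for illustration.

```python
# Made-up 2x2 table:
#                 outcome yes   outcome no
# exposed              a = 30       b = 70
# not exposed          c = 10       d = 90
a, b, c, d = 30, 70, 10, 90

odds_ratio = (a * d) / (b * c)                    # cross-product ratio
relative_risk = (a / (a + b)) / (c / (c + d))     # risk in exposed / risk in unexposed

print(round(odds_ratio, 3), round(relative_risk, 3))
```

An OR above 1 suggests higher odds of the outcome among the exposed; the confidence interval reported alongside it shows whether 1 is a plausible value.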
Activity
Open file “states”; for variable “bac”, what percentage of states use the 0.8 standard?
Open file “Aids”, determine the shape of the distribution of Aids cases reported in 1994.
Open file “students”, make side-by-side histograms of height for comparison in male and female. Make a cross-tab (contingency table) of gender and eye-color, also compare blue color in male and female. Make a scatter plot between height and weight and interpret the graph. Compute descriptive statistics of the variable amount paid for hair cut.
Cont……
Open file “college”, focus on two variables, in-state tuition and out-state tuition, and show which varies more (calculate coefficient of variation). Construct a Box-plot for math score in public and private schools and comment on the plot. On the average, in which subject (mathsat, verbsat) is the score larger?
Cont….
Open file “GSS94”, answer the questions
Did females tend to watch more or less TV per day than males (calculate descriptive statistics)?
If the respondents are afraid to walk alone in the neighborhood, compare mean age of those who said “yes” or “no”.
Make a contingency table for sex and Race.
Make a cross-tab of variables marital status and marnomar and find the probability of a person who is married
Cont….
Open file “bodyfat”, calculate correlation between neck and chest circumference, also fit a regression line of chest circumference on neck circumference.
Investigate the variables “Fatperc”, “weight”, “age”, “neck” about their normality using appropriate test and graph.
Cont….
Open file “sleep”; using appropriate descriptive and graphical techniques, how would you establish the relationship between the amount of sleep a species requires and the mean weight of the species? Also interpret the results. Make a frequency distribution of the variable amount of sleep taking appropriate intervals. Construct 95% confidence intervals for total sleep and life span.
Open file “colleges”, construct a 95% confidence interval for mean room and board charges; what does it mean?
TESTING OF HYPOTHESIS
Here we will discuss
• One sample t-test
• Two sample t-test (independent groups, dependent groups)
• One way ANOVA (F-test)
ONE SAMPLE T-TEST
Open data file “bodyfat”, test the hypothesis that the population mean body fat is 23 against the alternative that it is not equal to 23.
Analyze → Compare Means → One-Sample T Test, select variable body fat and enter 23 as test value; results are as under
INTERPRETATION OF RESULTS
Here the sample mean is 19.15, the t-statistic is -7.3 and the p-value is 0.000, which suggests rejecting the null hypothesis, and it is concluded that the population mean body fat is not 23
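The t-statistic in a one-sample test is the distance of the sample mean from the hypothesised value, measured in standard errors. The Python sketch below computes it for five made-up body fat readings against the test value 23; it does not reproduce the figures from the “bodyfat” file.

```python
import math
import statistics

def one_sample_t(sample, test_value):
    """t statistic and degrees of freedom for H0: population mean = test_value."""
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)   # standard error of the mean
    t = (statistics.mean(sample) - test_value) / se
    return t, n - 1

body_fat = [18, 20, 19, 21, 17]      # made-up readings
t, df = one_sample_t(body_fat, 23)
print(round(t, 4), df)               # compare |t| with 2.776 (5% two-sided, df = 4)
```

When the absolute value of t exceeds the table value for the given degrees of freedom, the p-value is below 0.05 and the null hypothesis is rejected.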
TWO (INDEPENDENT) SAMPLE T-TEST
Sometimes we focus on comparing means of a variable of interest in two different samples. For example, whether the height of boys is different from girls' height. Open file “students” and compare height of boys and girls
Open file → Analyze → Compare Means → Independent-Samples T Test, click, then a window will open; select height as test variable and gender as grouping variable. Define the grouping variable by putting the values of male and female, then click OK
T value
P-value
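The T value in the output can be understood through the pooled two-sample formula, sketched here in Python. This is the equal-variances version only, and the heights are made-up numbers rather than the “students” data.

```python
import math
import statistics

def pooled_t(group1, group2):
    """Equal-variance two-sample t statistic and its degrees of freedom."""
    n1, n2 = len(group1), len(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(group1) - statistics.mean(group2)) / se
    return t, n1 + n2 - 2

boys = [170, 172, 174, 176, 178]     # made-up heights (cm)
girls = [160, 162, 164, 166, 168]
t, df = pooled_t(boys, girls)
print(round(t, 3), df)               # compare |t| with 2.306 (5% two-sided, df = 8)
```

If the two groups have clearly different variances, the separate-variances row of the output is the one to read instead of the pooled result.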
PAIRED T-TEST (DEPENDENT SAMPLES)
Sometimes observations are taken before and after some treatment on the same respondents. For example, BP is measured before and after medicine. This type of sample is called a paired sample. Open file “swimmer2”; we wish to see any difference in the students' freestyle at two points
Open file → Analyze → Compare Means → Paired-Samples T Test, click, then a window will open; select the two 100 meter freestyle variables, click OK
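A paired t-test is simply a one-sample t-test on the within-person differences. The sketch below shows this in Python with made-up before and after times for five swimmers; the values are not taken from “swimmer2”.

```python
import math
import statistics

before = [60, 62, 64, 66, 68]        # made-up first measurement
after = [58, 61, 61, 64, 66]         # made-up second measurement, same swimmers

differences = [b - a for b, a in zip(before, after)]
n = len(differences)
mean_diff = statistics.mean(differences)
se_diff = statistics.stdev(differences) / math.sqrt(n)

t = mean_diff / se_diff              # H0: mean difference = 0
df = n - 1
print(round(t, 4), df)               # compare |t| with 2.776 (5% two-sided, df = 4)
```

Because each person serves as their own control, the test is based on n differences, not on 2n separate observations.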
ONE-WAY ANOVA
For more than two independent groups we use one-way ANOVA. Suppose we are interested to know whether an out-campus job affects the students' GPA. Open file student and test GPA with the grouping variable work category. The null hypothesis is that GPA is the same for all working categories. If the null hypothesis is rejected then we use a post hoc test (LSD)
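The F statistic of a one-way ANOVA compares the variation between group means with the variation inside the groups. A minimal Python sketch follows; the three groups of GPA-like scores are made up, and the post hoc (LSD) step is not included.

```python
def one_way_anova(groups):
    """F statistic with between-group and within-group degrees of freedom."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Made-up scores for three work categories.
groups = [[2, 3, 4], [3, 4, 5], [5, 6, 7]]
f, df1, df2 = one_way_anova(groups)
print(round(f, 3), df1, df2)         # compare with 5.14, the 5% critical F for (2, 6)
```

A significant F only says that at least one group mean differs; the post hoc comparison is what identifies which pairs of groups differ.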
PROCEDURE
Open file → Analyze → Compare Means → One-way ANOVA; the Dependent List variable is GPA, the Factor variable is workcat; click Options, under Statistics select Descriptive; then click on Post Hoc, a window will open, select LSD, click OK
Post hoc test
Activity
Open file “GSS94” and test the null hypothesis that the adults in the United States watch an average of three hours of TV daily. Test the hypothesis that males spend 3 hours watching TV (use the Select Cases command)
Is there a statistically significant difference in the amount of time men and women spend watching TV? Is there a statistically significant difference in the amount of time married and divorced people spend watching TV?
Cont….
Open file “students”, test the hypothesis: do commuters and residents earn significantly different mean grades? Do car owners have significantly fewer accidents on average than non-owners? Interpret your results using the 95% confidence interval and p-value.
Open file “BP”, test the hypothesis: do subjects with a parental history of hypertension have significantly higher resting Systolic and Diastolic BP than subjects with no parental history?
Open file “GSS94”, does the amount of television viewing vary by respondent’s race? (ANOVA)
Open file “BP”, is systolic BP (sbpma) related to a person’s sex, parental hypertension (ph) or some combination of these factors?
Open file “group”, is the subject’s perception of a co-worker related to gender, group size or a combination of these two factors?
Open file “bodyfat”, consider a man whose chest measurement is 95 cm, abdomen is 85 cm, and whose weight is 158 pounds; use the regression equation to estimate this man’s body fat percentage (use multiple regression). Also write the regression equation and interpret the results.
Develop the multiple regression line to estimate body fat percentage on the basis of the following variables: Age, weight, abdomen circumference, chest circumference, thigh circumference, wrist circumference, using matrix plot/correlation matrix/p-value.
Open file “salem”, test whether variables proparri and accuser are independent (use chi-square test)
Open file “students”, test whether smokers tend to drink more beer than nonsmokers (select parametric or non-parametric test, t test or Mann-Whitney U test)
Medical Data Analysis
  Univariate
    Categorical Data
      Descriptive Analysis: Graphs (Bar, Pie Charts); Frequency (f), Percentage (%), Proportion
      Inferential Analysis: Chi-square (χ2) test, Z-test
    Continuous Data
      Descriptive Analysis: Histogram; Mean ± S.D
      Inferential Analysis: Z-test (n>30), t-test (n<30)
  Multivariate
    Categorical Data
      Descriptive Analysis: Multiple Bar Charts, Contingency Table
      Inferential Analysis: Association (χ2, OR, RR); Prediction (Logistic Regression)
    Continuous Data
      Descriptive Analysis: Scatter Plot, Box Plot; Relationship (Correlation, Regression)
      Inferential Analysis: t-test, ANOVA, Multiple Regression